UnicodeReadStream
Description
This is an adapter class used for bridging <UnicodeView>s with <ReadStream>s.
By default, this stream answers graphemes, which are user-perceived characters. A grapheme is represented in VAST by a <Grapheme> object. A <UnicodeString> in VAST should be thought of as a <Collection> of <Grapheme>s.
If you need more technical parsing precision or closer line-ending compatibility, you can put this stream into Unicode scalar mode by calling #switchToUnicodeScalarMode. A Unicode scalar is represented in VAST by a <UnicodeScalar> object. A <UnicodeScalar> can represent any Unicode code point except those in the surrogate range reserved for UTF-16 encoding.
If you are working with pure Unicode, then consider using views rather than this adapter class.
@see the class category Views on a <UnicodeString> for more details.
Instance State
view: <UnicodeView> internal view to bridge
CLDT-API
This adapter redefines the necessary <ReadStream> (and superclass) APIs to allow for efficient streaming of a <UnicodeString>. In most cases, this means delegating to the internal view, which tends to implement operations more efficiently for variable-width collections than <ReadStream> does.
Modes
This stream can operate in different modes, which define how the elements of the stream are interpreted. The default mode is graphemes; you can switch to a different mode using the APIs in the Modes category. Switching modes always resets the stream to the beginning.
For example, to process a string object as a stream of <UnicodeScalar>s, you could do the following:
| stream |
 
stream := 'Smalltalk' asUnicodeString readStream.
 
"Process stream as unicode scalars"
stream switchToUnicodeScalarMode.
self assert: [stream next = $S asUnicodeScalar].
self assert: [(stream next: 8) = ('Smalltalk' asUnicodeString unicodeScalars copyFrom: 2) contents]
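Because switching modes resets the stream, elements consumed before the switch are read again afterwards. A minimal sketch of that behavior, using only messages that appear elsewhere in this reference:

```smalltalk
| stream |
stream := 'Smalltalk' asUnicodeString readStream.

"Consume five graphemes, then switch modes"
stream next: 5.
stream switchToUnicodeScalarMode.

"The stream is back at the beginning and now answers unicode scalars"
self assert: [stream next = $S asUnicodeScalar]
```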
Class Methods
None
Instance Methods
atEnd
   Answer a Boolean which is true if the receiver cannot
   access any more objects, and false otherwise.
   
   Example:
    self assert: [UnicodeString new readStream atEnd].
    self assert: ['Smalltalk' asUnicodeString readStream atEnd not].
   
   Answers:
    <Boolean>
isEmpty
   Answer true if the contents of the view are empty.
   This is relative to the complete contents and is not
   impacted by the current position.

   Example:
    self assert: [UnicodeString new readStream isEmpty].
    self assert: ['Smalltalk' asUnicodeString readStream isEmpty not]
    
   Answers:
    <Boolean>
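
   The difference between #isEmpty and #atEnd can be sketched as follows.
   This is an illustrative example that relies only on the behavior
   described above: a fully consumed stream is at its end, but it is not
   empty, because #isEmpty looks at the complete contents.

   ```smalltalk
   | stream |
   stream := 'Smalltalk' asUnicodeString readStream.
   stream setToEnd.

   "Nothing is left to read..."
   self assert: [stream atEnd].
   "...but the underlying contents are not empty"
   self assert: [stream isEmpty not]
   ```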
lineDelimiter
   Return the receiver's line delimiter.
   
   Answers:
    <Object>
lineDelimiter:
   Set the receiver's line delimiter to be delimiter, and
   answer the receiver.

   Example:
    | stream |
    stream := ('Small' , String lf , 'talk' , String cr , 'er') asUnicodeString readStream.
    self assert: [(stream lineDelimiter: Grapheme cr; nextLine) = ('Small' , String lf , 'talk')].
   
   Arguments:
    delimiter - Grapheme mode:
                <Grapheme> grapheme delim
                <UnicodeScalar> scalar delim
                <UnicodeString> graphemes
                <Array> of <implementors of #asGrapheme>
                Compat: <String | Character>
              Unicode Scalar mode:
                <UnicodeScalar> scalar delim
                <Grapheme> grapheme delim
                <UnicodeString> graphemes
                <Array> of <implementors of #asUnicodeScalar>
                Compat: <String | Character>
   Answers:
    <UnicodeReadStream> self
next
   Answer the next Object accessible by the receiver.
   Change the state of the receiver so that the
   returned object is no longer accessible.

   Example:
    self assert: [('Smalltalk' asUnicodeString readStream next; next; next) = $a asGrapheme].
    self assert: [('Smalltalk' asUnicodeString readStream switchToUnicodeScalarMode; next; next; next) = $a asUnicodeScalar].
    
   Answers:
    <Object> view object
next:
   Answer a collection containing the next @anInteger elements from the view.
   If @anInteger < 1, an empty collection is answered.

   Example:
    self assert: [('Smalltalk' asUnicodeString readStream next: 5) = 'Small'].
    self assert: [| stream |
      stream := 'Smalltalk' asUnicodeString readStream.
      (stream switchToUnicodeScalarMode; next: 5) = 'Small' unicodeScalars contents]
    
   Arguments:
    anInteger - <Integer>
   Answers:
    <Object> instance of view collection class
   Raises:
    <Exception> ExCLDTIndexOutOfRange
next:into:startingAt:
   Answer @anIndexedCollection with the next @anInteger number of items from
   the receiver, stored starting at position @initialPosition.

   If the receiver's state is such that there are fewer than anInteger
   elements between its current position and the end of the stream,
   the operation will fail, and the receiver will be left in a state
   such that it answers true to the atEnd message.

   Example:
    | col |
    col := Array new: 5.
    'Smalltalk' asUnicodeString readStream next: 5 into: col startingAt: 1.
    self assert: [col = 'Small' asUnicodeString asArray]
    
   Arguments:
    anInteger - <Integer>
    anIndexedCollection - <Collection>
    initialPosition - <Integer>
   Answers:  
    <Collection> - anIndexedCollection
nextLine
   Answer the elements between the current position and the next lineDelimiter.

   Example:
    | stream |
    stream := ('Small' , String lf , 'talk' , String cr , 'er' , String crlf , 's') asUnicodeString readStream.
    self assert: [stream nextLine = 'Small'].
    self assert: [stream nextLine = 'talk'].
    self assert: [stream nextLine = 'er'].
    self assert: [stream nextLine = 's'].
    stream switchToUnicodeScalarMode.
    self assert: [stream nextLine = 'Small' unicodeScalars contents].
    self assert: [stream nextLine = 'talk' unicodeScalars contents].
    self assert: [stream nextLine = 'er' unicodeScalars contents].
    self assert: [stream nextLine = 's' unicodeScalars contents].
    self assert: [stream atEnd].
    
   Answers:
    <Object> view-dependent
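
   A common use of #nextLine is reading every line until the stream is
   exhausted. The loop below is a minimal sketch of that pattern; the
   variable name lines and the use of OrderedCollection are illustrative
   choices, not part of this class.

   ```smalltalk
   | stream lines |
   stream := ('Small' , String lf , 'talk') asUnicodeString readStream.
   lines := OrderedCollection new.

   "Collect one line per iteration until nothing is left to read"
   [stream atEnd] whileFalse: [lines add: stream nextLine].

   self assert: [lines size = 2].
   self assert: [lines first = 'Small'].
   self assert: [lines last = 'talk']
   ```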
peek
   Answer the Object that is next accessible by the receiver
   without changing the state of the receiver, so the answered
   object remains accessible.
   Answer nil if the view is atEnd.
   
   Example:
    self assert: [('' asUnicodeString readStream peek) isNil].
    self assert: [('Smalltalk' asUnicodeString readStream peek) = $S asGrapheme].
    self assert: [('Smalltalk' asUnicodeString readStream switchToUnicodeScalarMode; peek) = $S asUnicodeScalar].    
    
   Answers:
    <Object> or nil if at end
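
   A short sketch contrasting #peek with #next, assuming #peek follows
   the usual <ReadStream> convention of not advancing the position:

   ```smalltalk
   | stream |
   stream := 'Smalltalk' asUnicodeString readStream.

   "Peeking twice answers the same element"
   self assert: [stream peek = $S asGrapheme].
   self assert: [stream peek = $S asGrapheme].

   "The element is still there for #next to consume"
   self assert: [stream next = $S asGrapheme]
   ```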
position:
   Set the receiver's position reference to argument anInteger.
   Answer self.

   Example:
    | stream pos |
    stream := 'abcde' asUnicodeString readStream.
    pos := stream setToEnd; position.
    self assert: [(stream reset; next: 3) = 'abc'].
    stream position: pos.
    self assert: [stream position = pos]
    
   Arguments:
    anInteger - <Integer>
setToEnd
   Set the position of the receiver to be the size of the
   underlying contents, so that the receiver is at the end.
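
   A minimal sketch of the effect, using only messages that appear
   elsewhere in this reference (#atEnd, #upToEnd, #reset):

   ```smalltalk
   | stream |
   stream := 'abcde' asUnicodeString readStream.
   stream setToEnd.

   "Nothing is left to read after moving to the end"
   self assert: [stream atEnd].
   self assert: [stream upToEnd isEmpty].

   "Resetting makes the full contents readable again"
   stream reset.
   self assert: [stream upToEnd = 'abcde']
   ```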
size
   Answer the number of elements in the view.
   
   Example:
    self assert: [('Smalltalk' , String crlf) asUnicodeString 
      readStream size = 10].
    self assert: [('Smalltalk' , String crlf) asUnicodeString 
      readStream switchToUnicodeScalarMode size = 11].
    
   Answers:
    <Integer>
skip:
   Increment the receiver's current reference position by anInteger.
   Fail if anInteger is not a kind of Integer.

   Example:
    self assert: [('abcde' asUnicodeString readStream skip: 2; upToEnd) = 'cde']
    
   Arguments:
    anInteger - <Integer>
   Raises:
    <Exception> ExCLDTIndexOutOfRange
skipTo:
   Read and discard elements up to and including the next occurrence of @anObject.
   If @anObject is not found, the receiver is left at the end of the stream.

   Example:
    self assert: [('abcde' asUnicodeString readStream skipTo: $c; upToEnd) = 'de'].
    self assert: [('abcde' asUnicodeString readStream skipTo: $z; upToEnd) = '']
    
   Arguments:
    anObject - <Object>
   Answers:
    <Boolean> true if found, false otherwise
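
   The examples above use a cascade, so they show the stream state but
   not the answered Boolean. The sketch below checks the answer itself
   for both the found and the not-found case:

   ```smalltalk
   | stream |
   stream := 'abcde' asUnicodeString readStream.

   "Found: answers true and leaves the stream just past $c"
   self assert: [stream skipTo: $c].
   self assert: [stream upToEnd = 'de'].

   "Not found: answers false and leaves the stream at the end"
   stream reset.
   self assert: [(stream skipTo: $z) not].
   self assert: [stream atEnd]
   ```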
skipToAll:
   Attempt to read and discard elements up to and including the next occurrence
   of the sequence @aSequentialCollection.
   Answer true if the whole sequence was found, else answer false.
   
   Note:
    If aSequentialCollection is an EsString, then we attempt to convert it to a UnicodeString.

   Example:
    self assert: ['abcde' asUnicodeString readStream skipToAll: 'bc'].
    self assert: [('abcde' asUnicodeString readStream skipToAll: 'bc'; upToEnd) = 'de'].
    self assert: [('abcde' asUnicodeString readStream skipToAll: 'zzz') not].
    self assert: [('abcde' asUnicodeString readStream skipToAll: 'zzz'; upToEnd) = ''].
    
   Arguments:
    aSequentialCollection - <SequenceableCollection>
   Answers:
    <Boolean>
skipToAny:
   Read and discard elements beyond the next occurrence
   of an element that exists in @aSequentialCollection or if none,
   to the end of stream.
   
   Answer true if an element in @aSequentialCollection
   occurred, else answer false.
   
   Note:
    If aSequentialCollection is an EsString, then we attempt to convert it to a UnicodeString.

   Example:
    self assert: ['abcde' asUnicodeString readStream skipToAny: 'bd'].
    self assert: [('abcde' asUnicodeString readStream skipToAny: 'bd'; upToEnd) = 'cde'].
    self assert: [('abcde' asUnicodeString readStream skipToAny: 'zzz') not].
    self assert: [('abcde' asUnicodeString readStream skipToAny: 'zzz'; upToEnd) = ''].
    
   Arguments:
    aSequentialCollection - <SequenceableCollection>
   Answers:
    <Boolean>
switchToGraphemeMode
   Switch the mode to graphemes.
   This will reset the stream.
   
   Calls like #next will answer <Grapheme> objects.
   Calls like #next:/#contents will answer <UnicodeString> objects
   
   Example:
    self assert: [UnicodeString crlf readStream switchToGraphemeMode size = 1].
    self assert: [UnicodeString crlf readStream switchToUnicodeScalarMode size = 2]
switchToUnicodeScalarMode
   Switch the mode to unicode scalars.
   This will reset the stream.

   Calls like #next will answer <UnicodeScalar> objects.
   Calls like #next:/#contents will answer <Array> of <UnicodeScalar>s
   
   Example:
    self assert: [UnicodeString crlf readStream switchToGraphemeMode size = 1].
    self assert: [UnicodeString crlf readStream switchToUnicodeScalarMode size = 2].
upTo:
   Answer a collection of all of the objects in the view
   beginning from the current position up to, but not including,
   @anObject. If @anObject is not found, all remaining objects
   are answered.

   Example:
    self assert: [('abcde' asUnicodeString readStream upTo: $c) = 'ab'].
    self assert: [('abcde' asUnicodeString readStream upTo: $z) = 'abcde']
    
   Arguments:
    anObject - <Object>
   Answers:
    <Object> instance of view collection class
upToAll:
   Answer a collection of all of the objects in the view beginning from the current position up to,
   but not including, the sequence @aSequenceableCollection. If the sequence is not found,
   all remaining objects are answered.
   
   Note:
    If aSequenceableCollection is an EsString, then we attempt to convert it to a UnicodeString.

   Example:
    self assert: [('abcde' asUnicodeString readStream upToAll: 'bc') = 'a'].
    self assert: [('abcde' asUnicodeString readStream upToAll: 'bc'; upToEnd) = 'de'].
    self assert: [('abcde' asUnicodeString readStream upToAll: 'zzz') = 'abcde'].
    self assert: [('abcde' asUnicodeString readStream upToAll: 'zzz'; upToEnd) isEmpty].
    
   Arguments:
    aSequenceableCollection - <SequenceableCollection>
   Answers:
    <Object> instance of view collection class
upToAny:
   Answer a collection of all of the objects in the view up to, but not including, the next occurrence
   of any element that exists in @aSequenceableCollection. If no such element is found before the end
   of the view is encountered, a collection of the objects read is answered.

   Note:
    If aSequenceableCollection is an EsString, then we attempt to convert it to a UnicodeString.

   Example:
    self assert: [('abcde' asUnicodeString readStream upToAny: 'bd') = 'a'].
    self assert: [('abcde' asUnicodeString readStream upToAny: 'bd'; upToEnd) = 'cde'].
    self assert: [('abcde' asUnicodeString readStream upToAny: 'zzz') = 'abcde'].
    self assert: [('abcde' asUnicodeString readStream upToAny: 'zzz'; upToEnd) isEmpty].
    
   Arguments:
    aSequenceableCollection - <SequenceableCollection>
   Answers:
    <Object> instance of view collection class
upToEnd
   Answer a collection containing all remaining elements from the current position to the end of the view.
   If there are no more elements available to be read, then an empty collection is answered.

   Example:
    self assert: ['abcde' asUnicodeString readStream upToEnd = 'abcde'].
    self assert: [('abcde' asUnicodeString readStream next: 2; upToEnd) = 'cde'].
    self assert: ['' asUnicodeString readStream upToEnd = '']
    
   Answers:
    <Object> instance of view collection class
Last modified date: 01/18/2023