Utf16View
Description
A Utf16View provides a bidirectional view over the Unicode component as a collection of UTF-16 code units in platform-endian format.
UTF-16 is a 16-bit variable-length encoding that encodes each Unicode scalar value with one or two 16-bit code units. Scalar values below 2^16 are encoded in a single 16-bit code unit, while scalar values at or above 2^16 (U+10000 and up) require two 16-bit code units. Such a pair of code units is called a surrogate pair and consists of a leading surrogate followed by a trailing surrogate.
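The surrogate pair arithmetic described above can be sketched with plain Integer operations. This is a minimal illustration, not part of the Utf16View API: the variable names (scalar, offset, leading, trailing) are chosen here for the example, and the constants are those of the UTF-16 encoding form. It uses U+1F30D (EARTH GLOBE EUROPE-AFRICA), the same character as in the method examples further down.

```smalltalk
| scalar offset leading trailing |
"U+1F30D lies above the 16-bit range, so it needs a surrogate pair."
scalar := 16r1F30D.
"Subtract 2^16, which leaves a 20-bit offset."
offset := scalar - 16r10000.
"The high 10 bits of the offset select the leading surrogate (16rD800..16rDBFF)."
leading := 16rD800 + (offset bitShift: -10).
"The low 10 bits of the offset select the trailing surrogate (16rDC00..16rDFFF)."
trailing := 16rDC00 + (offset bitAnd: 16r3FF).
self assert: [leading = 55356].
self assert: [trailing = 57101].
"Decoding reverses the steps and recovers the original scalar value."
self assert: [((leading - 16rD800) bitShift: 10) + (trailing - 16rDC00) + 16r10000 = scalar]
```

The two resulting code units, 55356 and 57101, are the values that appear in the atEnd and atStart examples.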
#next/#previous - Answers a 16-bit <Integer>.
#contents - Answers a <Utf16>.
Examples
| view |
view := Utf16View on: 'abc'.
self assert: [view next = $a value].
self assert: [view next = $b value].
self assert: [view next = $c value].
self assert: [view atEnd].
self assert: [view previous = $c value].
self assert: [view previous = $b value].
self assert: [view previous = $a value].
self assert: [view atStart].
self assert: [view contents asArray = { $a value. $b value. $c value}]
Instance State
trailingSurrogate: <Integer | UndefinedObject> VM-managed decoder state. Not nil when the view is positioned on a surrogate pair.
startAtTrailingSurrogate: <Boolean> VM-managed. True if the view's first element is the trailing surrogate of a surrogate pair.
stopAfterLeadingSurrogate: <Boolean> VM-managed. True if the view's last element is the leading surrogate of a surrogate pair.
Class Methods
None
Instance Methods
atEnd
   Answer a Boolean which is true if the receiver cannot
   access any more objects, and false otherwise.

   Notes:
    An override is required because a UTF-16 view can end on the leading surrogate
    of a surrogate pair.  The view's surrogate-related state must therefore be
    considered before it can be determined that the view is positioned at the end.
    
   Example:
    | view sliceView |
    "U+1F30D EARTH GLOBE EUROPE-AFRICA, given below as its four UTF-8 bytes"
    view := #[240 159 140 141] utf16.
    self assert: [view contents = (Utf16 with: 55356 with: 57101)].
    
    sliceView := view copyFrom: 1 to: 1.
    self assert: [sliceView contents = (Utf16 with: 55356)].
    self assert: [sliceView atEnd not].
    self assert: [sliceView next = 55356].
    self assert: [sliceView atEnd].

    sliceView := view copyFrom: 2.
    self assert: [sliceView contents = (Utf16 with: 57101)].
    self assert: [sliceView atEnd not].
    self assert: [sliceView next = 57101].
    self assert: [sliceView atEnd]
    
   Answers:
    <Boolean>
atStart
   Answer a Boolean which is true if the receiver is positioned at its
   start and cannot access any previous objects, and false otherwise.
   
   Notes:
    An override is required because a UTF-16 view can start on the trailing surrogate
    of a surrogate pair.  The view's surrogate-related state must therefore be
    considered before it can be determined that the view is positioned at the start.

   Example:
    | view sliceView |
    "U+1F30D EARTH GLOBE EUROPE-AFRICA, given below as its four UTF-8 bytes"
    view := #[240 159 140 141] utf16.
    self assert: [view contents = (Utf16 with: 55356 with: 57101)].
    
    sliceView := view copyFrom: 1 to: 1.
    self assert: [sliceView contents = (Utf16 with: 55356)].
    sliceView setToEnd.
    self assert: [sliceView atStart not].
    self assert: [sliceView previous = 55356].
    self assert: [sliceView atStart].

    sliceView := view copyFrom: 2.
    self assert: [sliceView contents = (Utf16 with: 57101)].
    sliceView setToEnd.
    self assert: [sliceView atStart not].
    self assert: [sliceView previous = 57101].
    self assert: [sliceView atStart]
    
   Answers:
    <Boolean>
Last modified date: 01/18/2023