String literal objects are initialized with the sequence of code unit values corresponding to the string-literal's sequence of s-chars (originally from non-raw string literals) and r-chars (originally from raw string literals), plus a terminating U+0000 null character, in order as follows:
The sequence of characters denoted by each contiguous sequence of basic-s-chars, r-chars, simple-escape-sequences ([lex.ccon]), and universal-character-names ([lex.charset]) is encoded to a code unit sequence using the string-literal's associated character encoding.
If a character lacks representation in the associated character encoding, then the program is ill-formed.
[Note 6: No character lacks representation in any Unicode encoding form. — end note]
When encoding a stateful character encoding, implementations should encode the first such sequence beginning with the initial encoding state and encode subsequent sequences beginning with the final encoding state of the prior sequence.
[Note 7: The encoded code unit sequence can differ from the sequence of code units that would be obtained by encoding each character independently. — end note]
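The following is an illustrative sketch, not normative text, of how the rules above determine the number of code units in a string literal object. The helper `code_units` is a hypothetical name introduced only for this example. The checks rely on `u8`, `u`, and `U` literals being encoded as UTF-8, UTF-16, and UTF-32 respectively, and on every character of the basic literal character set occupying a single code unit.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the standard): the number of array
// elements in a string literal object, including the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N; }

// A terminating U+0000 null character is always appended.
static_assert(code_units("abc") == 4, "three characters plus the null");

// Non-raw literal: \n is one simple-escape-sequence, hence one character.
static_assert(code_units("a\n") == 3, "a, new-line, null");
// Raw literal: the backslash and the n are two separate r-chars.
static_assert(code_units(R"(a\n)") == 4, "a, backslash, n, null");

// U+00E9 needs two UTF-8 code units but one UTF-16 or UTF-32 code unit.
static_assert(code_units(u8"\u00E9") == 3, "UTF-8: 2 code units + null");
static_assert(code_units(u"\u00E9") == 2, "UTF-16: 1 code unit + null");
static_assert(code_units(U"\u00E9") == 2, "UTF-32: 1 code unit + null");

// U+1F600 lies outside the BMP: four UTF-8 code units, a UTF-16
// surrogate pair, and a single UTF-32 code unit.
static_assert(code_units(u8"\U0001F600") == 5, "UTF-8: 4 code units + null");
static_assert(code_units(u"\U0001F600") == 3, "UTF-16: 2 code units + null");
static_assert(code_units(U"\U0001F600") == 2, "UTF-32: 1 code unit + null");
```

The ordinary and wide literal encodings are implementation-defined, so the sketch makes no claim about the size of, for example, `"\u00E9"`.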
Each numeric-escape-sequence ([lex.ccon]) that specifies an integer value v contributes a single code unit with a value as follows:
If v does not exceed the range of representable values of the string-literal's array element type, then the value is v.
Otherwise, if the string-literal's encoding-prefix is absent or L, and v does not exceed the range of representable values of the corresponding unsigned type for the underlying type of the string-literal's array element type, then the value is the unique value of the string-literal's array element type T that is congruent to v modulo 2^N, where N is the width of T.
Otherwise, the program is ill-formed.
When encoding a stateful character encoding, these sequences should have no effect on encoding state.
Each conditional-escape-sequence ([lex.ccon]) contributes an implementation-defined code unit sequence. When encoding a stateful character encoding, it is implementation-defined what effect these sequences have on encoding state.
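As a non-normative sketch, the three cases for numeric-escape-sequences can be exercised with compile-time checks. The helper `bits_of` is a hypothetical name introduced only for this example. It converts to `unsigned char` so that the check holds whether plain `char` is signed or unsigned on the implementation at hand.

```cpp
// Hypothetical helper (not part of the standard): view a char code unit
// as an unsigned value, independent of the signedness of plain char.
constexpr unsigned bits_of(char c) {
    return static_cast<unsigned char>(c);
}

// Case 1: v is representable in the array element type, so the code unit is v.
static_assert("\x41"[0] == 0x41, "hexadecimal escape, value 0x41");
static_assert("\101"[0] == 0x41, "octal escape, 101 octal is 0x41");
static_assert(u"\xFFFF"[0] == 0xFFFF, "0xFFFF fits in char16_t");

// Case 2: no encoding-prefix, and v fits in unsigned char. If plain char
// is signed, 0xFF is out of range for char, and the stored value is the
// one congruent to 0xFF modulo 2^N (that is, -1 for an 8-bit char).
// If plain char is unsigned, case 1 applies. The bits match either way.
static_assert(bits_of("\xFF"[0]) == 0xFFu, "bit pattern is 0xFF");

// Each numeric escape yields exactly one code unit and is not transcoded:
// two escapes give two code units plus the terminating null.
static_assert(sizeof(u8"\xC3\xA9") == 3, "two code units plus null");

// Case 3 (ill-formed, so left as a comment): 0x100 does not fit the
// element type of a u8 literal, and the modulo rule covers only literals
// with no encoding-prefix or with L.
// constexpr auto bad = u8"\x100";
```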