5 Lexical conventions [lex]

5.13 Literals [lex.literal]

5.13.3 Character literals [lex.ccon]

encoding-prefix: one of
u8  u  U  L
basic-c-char:
any member of the basic source character set except the single-quote ', backslash \, or new-line character
simple-escape-sequence-char: one of
' " ? \ a b f n r t v
conditional-escape-sequence-char:
any member of the basic source character set that is not an octal-digit, a simple-escape-sequence-char, or the characters u, U, or x
A non-encodable character literal is a character-literal whose c-char-sequence consists of a single c-char that is not a numeric-escape-sequence and that specifies a character that either lacks representation in the literal's associated character encoding or that cannot be encoded as a single code unit.
A multicharacter literal is a character-literal whose c-char-sequence consists of more than one c-char.
The encoding-prefix of a non-encodable character literal or a multicharacter literal shall be absent or L.
Such character-literals are conditionally-supported.
The kind of a character-literal, its type, and its associated character encoding are determined by its encoding-prefix and its c-char-sequence as defined by Table 7.
The special cases for non-encodable character literals and multicharacter literals take precedence over their respective base kinds.
[Note 1:
The associated character encoding for ordinary and wide character literals determines encodability, but does not determine the value of non-encodable ordinary or wide character literals or ordinary or wide multicharacter literals.
The examples in Table 7 for non-encodable ordinary and wide character literals assume that the specified character lacks representation in the execution character set or execution wide-character set, respectively, or that encoding it would require more than one code unit.
— end note]
Table 7: Character literals [tab:lex.ccon.literal]
Encoding
Kind
Type
Associated char-
Example
prefix
acter encoding
none
char
encoding of
'v'
non-encodable ordinary character literal
int
the execution
'\U0001F525'
ordinary multicharacter literal
int
character set
'abcd'
L
wchar_­t
encoding of
L'w'
non-encodable wide character literal
wchar_­t
the execution
L'\U0001F32A'
wide multicharacter literal
wchar_­t
wide-character set
L'abcd'
u8
char8_­t
UTF-8
u8'x'
u
char16_­t
UTF-16
u'y'
U
char32_­t
UTF-32
U'z'
In translation phase 4, the value of a character-literal is determined using the range of representable values of the character-literal's type in translation phase 7.
A non-encodable character literal or a multicharacter literal has an implementation-defined value.
The value of any other kind of character-literal is determined as follows:
The character specified by a simple-escape-sequence is specified in Table 8.
[Note 3:
Using an escape sequence for a question mark is supported for compatibility with ISO C++ 2014 and ISO C.
— end note]
Table 8: Simple escape sequences [tab:lex.ccon.esc]
new-line
NL(LF)
\n
horizontal tab
HT
\t
vertical tab
VT
\v
backspace
BS
\b
carriage return
CR
\r
form feed
FF
\f
alert
BEL
\a
backslash
\
\\
question mark
?
\?
single quote
'
\'
double quote
"
\"