A preprocessing token is the minimal lexical element of the language in translation
phases 3 through 6. In this document,
glyphs are used to identify elements of the basic character set ([lex.charset]).
The categories of preprocessing token are: header names,
placeholder tokens produced by preprocessing import and module directives
(import-keyword, module-keyword, and export-keyword),
identifiers, preprocessing numbers, character literals (including user-defined character
literals), string literals (including user-defined string literals), preprocessing
operators and punctuators, and single non-whitespace characters that do not lexically
match the other preprocessing token categories. If a U+0027 apostrophe or a
U+0022 quotation mark character matches the last category, the behavior is undefined.
If any character not in the basic character set matches the last category,
the program is ill-formed. Preprocessing tokens can be separated by
whitespace; this consists of comments ([lex.comment]), or whitespace characters
(U+0020 space, U+0009 character tabulation, new-line, U+000b line tabulation, and
U+000c form feed), or both. As described in [cpp], in certain
circumstances during translation phase 4, whitespace (or the absence
thereof) serves as more than preprocessing token separation. Whitespace
can appear within a preprocessing token only as part of a header name or
between the quotation characters in a character literal or string literal.
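
The role of whitespace described above can be observed with the stringizing
operator. The following is a minimal sketch, not normative text: the macro names
STR, STR_IMPL, ID_FN, and ID_OBJ and the variable names are hypothetical and
chosen only for illustration, and the block assumes a C++17 or later
implementation. It shows that any run of whitespace or comments merely separates
preprocessing tokens, that whitespace inside a string literal belongs to that
single token, and that in a #define directive the presence or absence of
whitespace before the left parenthesis changes the meaning of the directive.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helpers: STR expands its argument, then STR_IMPL stringizes it.
#define STR_IMPL(x) #x
#define STR(x) STR_IMPL(x)

// No whitespace before '(': a function-like macro with one parameter.
#define ID_FN(x) x
// Whitespace before '(': an object-like macro whose replacement list is "(x) x".
#define ID_OBJ (x) x

// Runs of whitespace between preprocessing tokens collapse to one separator.
constexpr std::string_view spaced = STR(a   +   b);
// Each comment is replaced by one space character, so it separates tokens too.
constexpr std::string_view commented = STR(a/* note */+/* note */b);
// Whitespace between the quotation marks is part of the string literal token.
constexpr std::string_view literal = STR("a   b");
// Translation phase 4: whitespace decides which kind of macro is defined.
constexpr std::string_view fn_like = STR(ID_FN(3));
constexpr std::string_view obj_like = STR(ID_OBJ);

static_assert(spaced == "a + b");
static_assert(commented == "a + b");
static_assert(literal == "\"a   b\"");
static_assert(fn_like == "3");
static_assert(obj_like == "(x) x");
```

The checks are written as static_assert so that the block verifies itself when
compiled. The last two assertions illustrate the remark about translation
phase 4: the two directives differ only in one space, yet one defines a macro
that takes an argument and the other does not.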