3644. std::format does not define "integer presentation type"

Section: 28.5.2.2 [format.string.std] Status: New Submitter: Charlie Barto Opened: 2021-11-23 Last modified: 2022-11-01

Priority: 2

Discussion:

28.5.2.2 [format.string.std] specifies the behavior of several format specifiers in terms of "integer presentation types"; for example 28.5.2.2 [format.string.std]/4 states:

"The sign option is only valid for arithmetic types other than charT and bool or when an integer presentation type is specified".

Unfortunately, the standard never actually defines the term "integer presentation type". The closest it comes is in 28.5.2.2 [format.string.std]/19 and [tab:format.type.int], but that wording explicitly excludes charT and bool. [tab:format.type.char] and [tab:format.type.bool] then refer back to [tab:format.type.int].

I can come up with many interpretations of what should happen when 'c' is used with charT or bool, but the following table shows what MSVC does right now ("throws" is equivalent to "does not compile" after P2216 in all of these cases, although not in general for 'c'):

Argument type  Specifiers  Throws?
bool           #           Yes
bool           #c          No
bool           +           Yes
bool           +c          Yes
bool           ^           No
bool           ^c          No
bool           0           Yes
bool           0c          Yes
bool           c           No
charT          #           Yes
charT          #c          Yes
charT          +           Yes
charT          +c          Yes
charT          ^           No
charT          ^c          No
charT          0           Yes
charT          0c          Yes

As you can see, we don't interpret 'c' as an "integer type specifier", except when it is explicitly specified for bool together with #. I think this is because, for #, the standard states

"This option is valid for arithmetic types other than charT and bool or when an integer presentation type is specified, and not otherwise",

and [tab:format.type.bool] puts 'c' in the same category as all the other "integer type specifiers", whereas [tab:format.type.char] separates it out into the char-specific types. If this issue's proposed resolution is adopted our behavior would become non-conforming (arguably it already is) and "#c" with bools would become invalid.

[2021-11-29; Tim comments]

This issue touches the same wording area as LWG 3586 does.

[2022-01-30; Reflector poll]

Set priority to 2 after reflector poll.

[2022-11-01; Jonathan comments]

LWG 3648 removed 'c' as a valid presentation type for bool. The last change in the resolution below (and the drafting note) can be dropped.

LWG 3586 could be resolved as part of this issue by using "this is the default unless formatting a floating-point type or using an integer presentation type" for '<' and by using "this is the default when formatting a floating-point type or using an integer presentation type" for '>'.

Proposed resolution:

This wording is relative to N4901.

  1. Modify 28.5.2.2 [format.string.std] as indicated:

    -6- The # option causes the alternate form to be used for the conversion. This option is [added: only] valid for arithmetic types other than charT and bool or when an integer presentation type is specified[removed: , and not otherwise]. For integral types, the alternate form inserts the base prefix (if any) specified in Table 65 into the output after the sign character (possibly space) if there is one, or before the output of to_chars otherwise. For floating-point types, the alternate form causes the result of the conversion of finite values to always contain a decimal-point character, even if no digits follow it. Normally, a decimal-point character appears in the result of these conversions only if a digit follows it. In addition, for g and G conversions, trailing zeros are not removed from the result.

    […]

    [Drafting note: This modification is a simple cleanup given the other changes further below, to bring the wording for # in line with the wording for the other modifiers, in the interest of preventing confusion.]

    […]

    -16- The type determines how the data should be presented.

    -?- An integer presentation type is one of the following type specifiers in Table [tab:format.type.integer_presentation], or none, if none is defined to have the same behavior as one of the type specifiers in Table [tab:format.type.integer_presentation].

    Table ? — Meaning of type options for integer representations [tab:format.type.integer_presentation]
    Type  Meaning
    b     to_chars(first, last, value, 2); the base prefix is 0b.
    B     The same as b, except that the base prefix is 0B.
    d     to_chars(first, last, value).
    o     to_chars(first, last, value, 8); the base prefix is 0 if value is nonzero and is empty otherwise.
    x     to_chars(first, last, value, 16); the base prefix is 0x.
    X     The same as x, except that it uses uppercase letters for digits above 9 and the base prefix is 0X.

    [Drafting note: This is the same as [tab:format.type.int] with "none" and 'c' removed]

    -17- The available string presentation types are specified in Table 64 ([tab:format.type.string]).

    […]
    Table 65 — Meaning of type options for integer types [tab:format.type.int]
    Type                      Meaning
    [added] b, B, d, o, x, X  As specified in Table [tab:format.type.integer_presentation]
    [removed] b               to_chars(first, last, value, 2); the base prefix is 0b.
    [removed] B               The same as b, except that the base prefix is 0B.
    c                         Copies the character static_cast<charT>(value) to the output. Throws format_error if value is not in the range of representable values for charT.
    [removed] d               to_chars(first, last, value).
    [removed] o               to_chars(first, last, value, 8); the base prefix is 0 if value is nonzero and is empty otherwise.
    [removed] x               to_chars(first, last, value, 16); the base prefix is 0x.
    [removed] X               The same as x, except that it uses uppercase letters for digits above 9 and the base prefix is 0X.
    none                      The same as d. [Note 8: If the formatting argument type is charT or bool, the default is instead c or s, respectively. — end note]

    Table 66 — Meaning of type options for charT [tab:format.type.char]
    Type              Meaning
    none, c           Copies the character to the output.
    b, B, d, o, x, X  As specified in Table [removed: [tab:format.type.int]][added: [tab:format.type.integer_presentation]].

    Table 67 — Meaning of type options for bool [tab:format.type.bool]
    Type                             Meaning
    none, s                          Copies textual representation, either true or false, to the output.
    b, B, [removed: c,] d, o, x, X   As specified in Table [removed: [tab:format.type.int]][added: [tab:format.type.integer_presentation]] for the value static_cast<unsigned char>(value).
    [added] c                        Copies the character static_cast<unsigned char>(value) to the output.

    [Drafting note: allowing the 'c' specifier for bool is pretty bizarre behavior, but that's very clearly what the standard says now, so I'm preserving it. I would suggest keeping discussion of changing that behavior to a separate issue or defect report (the reworking of the tables in this issue makes addressing that easier anyway).

    The inconsistency with respect to using static_cast<unsigned char> here and static_cast<charT> in [tab:format.type.int] is pre-existing and should be addressed in a separate issue if needed. ]
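
One way to read the proposed definition is as a predicate on the argument category and the type specifier. The sketch below is only an illustration of that reading, not wording or an implementation from any library: `ArgKind` and `is_integer_presentation_type` are hypothetical names, and '\0' is used here to stand for "none". Under this reading, 'c' is never an integer presentation type, and "none" counts only where it is defined to behave like d, that is, for integer types other than charT and bool.

```cpp
// Hypothetical categories of formatting argument used in this sketch.
enum class ArgKind { Integer, Char, Bool };

// Returns true if `type` is an integer presentation type for `kind` under
// the proposed definition. '\0' stands for "no type specifier" (none).
constexpr bool is_integer_presentation_type(ArgKind kind, char type) {
    switch (type) {
        case 'b': case 'B': case 'd': case 'o': case 'x': case 'X':
            return true;  // listed in [tab:format.type.integer_presentation]
        case '\0':
            // none is the same as d only for integer types; for charT it
            // means c, and for bool it means s.
            return kind == ArgKind::Integer;
        default:
            return false;  // includes 'c' and 's'
    }
}

static_assert(is_integer_presentation_type(ArgKind::Bool, 'd'));
static_assert(!is_integer_presentation_type(ArgKind::Bool, 'c'));
static_assert(!is_integer_presentation_type(ArgKind::Char, '\0'));
static_assert(is_integer_presentation_type(ArgKind::Integer, '\0'));
```

With this predicate, the rule for the sign, # and 0 options reduces to: the option is valid if the argument is an arithmetic type other than charT and bool, or if the predicate returns true.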