2392. "character type" is used but not defined

Section: 3.34 [defns.ntcts], 30.3.1.2.1 [locale.category], 31.2.3 [iostreams.limits.pos], 31.7.6.3.1 [ostream.formatted.reqmts], 31.7.6.3.4 [ostream.inserters.character] Status: WP Submitter: Jeffrey Yasskin Opened: 2014-06-01 Last modified: 2023-11-22 16:02:17 UTC

Priority: 3

Discussion:

The term "character type" is used in 3.34 [defns.ntcts], 30.3.1.2.1 [locale.category], 31.2.3 [iostreams.limits.pos], 31.7.6.3.1 [ostream.formatted.reqmts], and 31.7.6.3.4 [ostream.inserters.character], but the core language only defines "narrow character types" (6.8.2 [basic.fundamental]).

"wide-character type" is used in 99 [depr.locale.stdcvt], but the core language only defines a "wide-character set" and "wide-character literal".

[2023-06-14; Varna; Daniel comments and provides wording]

Given that the resolution of P2314 introduced into 6.8.2 [basic.fundamental] p11 a definition of "character type":

The types char, wchar_t, char8_t, char16_t, char32_t are collectively called character types.

one might feel tempted to have most parts of this issue resolved here, but I think that this actually is a red herring.
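
As an illustration only, the core-language definition quoted above can be written down as a compile-time check. This is a minimal sketch: `is_core_character_type` is a hypothetical helper, not a standard trait, and the `char8_t` case is guarded by a feature-test macro on the assumption that the code may also be compiled before C++20.

```cpp
#include <type_traits>

// Hypothetical helper (not part of the standard library): true only for the
// types that the quoted sentence lists as "character types".
template <class T> struct is_core_character_type : std::false_type {};
template <> struct is_core_character_type<char> : std::true_type {};
template <> struct is_core_character_type<wchar_t> : std::true_type {};
template <> struct is_core_character_type<char16_t> : std::true_type {};
template <> struct is_core_character_type<char32_t> : std::true_type {};
#ifdef __cpp_char8_t
template <> struct is_core_character_type<char8_t> : std::true_type {};
#endif

static_assert(is_core_character_type<char>::value, "char is listed");
static_assert(is_core_character_type<char32_t>::value, "char32_t is listed");
// signed char and unsigned char do not appear in the quoted list.
static_assert(!is_core_character_type<signed char>::value, "not listed");
static_assert(!is_core_character_type<unsigned char>::value, "not listed");
static_assert(!is_core_character_type<int>::value, "not listed");
```

The list is closed: a user-defined or implementation-defined type that an iostream can be instantiated on is never a "character type" in this sense, which is why the term fits poorly for most of the library wording discussed here.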

First, as Jonathan already pointed out, for two places, 31.7.6.3.1 [ostream.formatted.reqmts] and 31.7.6.3.4 [ostream.inserters.character], this clearly doesn't work; instead, it seems that we should replace "character type of the stream" there by "char_type of the stream".

To me "char_type of the stream" sounds a bit odd (we usually refer to char_type in terms of a qualified name such as X::char_type instead unless we are specifying a member of some X, where we can omit the qualification) and in the suggested wording below I'm taking advantage of the already defined term "character container type" (3.10 [defns.character.container]) instead, which seems to fit its intended purpose here.
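
The qualified-name form mentioned above can be sketched as follows. The alias `stream_char_t` is a hypothetical name introduced only for this illustration; `char_type` itself is the existing member typedef of the stream classes.

```cpp
#include <ostream>
#include <sstream>
#include <type_traits>

// X::char_type names the character container type the stream was
// instantiated with.
static_assert(std::is_same<std::ostream::char_type, char>::value,
              "ostream is basic_ostream<char>");
static_assert(std::is_same<std::wostream::char_type, wchar_t>::value,
              "wostream is basic_ostream<wchar_t>");

// Hypothetical convenience alias for "the character container type of the
// stream type Stream".
template <class Stream>
using stream_char_t = typename Stream::char_type;

static_assert(std::is_same<stream_char_t<std::wostringstream>, wchar_t>::value,
              "derived stream classes expose the same member");
```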

Second, on further inspection it turns out that only one usage of the term "character type" seems intended to refer to the actual core language meaning (see the unchanged wording for 30.4.3.3.3 [facet.num.put.virtuals] in the proposed wording below); all other places quite clearly must refer to the above-mentioned "character container type".

For the problem related to the missing definition of "wide-character type" (used two times in 99 [depr.locale.stdcvt]) I would like to suggest a less general and less inventive approach, because the term only occurs in an already deprecated component specification: my suggestion is to simply get rid of that term by identifying Elem with one of wchar_t, char16_t, or char32_t. (This result is identical to identifying "wide-character type" with a "character type that is not a narrow character type (6.8.2 [basic.fundamental])", but this seemingly more general definition doesn't provide a real advantage.)
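
The parenthetical claim can be checked mechanically. The following is a sketch under stated assumptions: all helper names (`is_one_of`, `is_character`, `is_narrow_character`, `is_wide_by_exclusion`, `is_listed_elem`, `readings_agree`, `utf8_char`) are hypothetical, the type lists are transcribed from 6.8.2 [basic.fundamental], and `char8_t` falls back to `char` when the compiler predates C++20 so that the sketch still compiles.

```cpp
#include <type_traits>

#ifdef __cpp_char8_t
using utf8_char = char8_t;
#else
using utf8_char = char;  // assumption: pre-C++20 fallback, only for this sketch
#endif

// Hypothetical helper: true if T is the same type as one of Ts...
template <class T, class... Ts>
struct is_one_of : std::false_type {};
template <class T, class U, class... Ts>
struct is_one_of<T, U, Ts...>
    : std::integral_constant<bool, std::is_same<T, U>::value ||
                                       is_one_of<T, Ts...>::value> {};

// Character types and narrow character types as listed in [basic.fundamental].
template <class T>
using is_character = is_one_of<T, char, wchar_t, utf8_char, char16_t, char32_t>;
template <class T>
using is_narrow_character =
    is_one_of<T, char, signed char, unsigned char, utf8_char>;

// The "more general" reading: a character type that is not narrow.
template <class T>
struct is_wide_by_exclusion
    : std::integral_constant<bool, is_character<T>::value &&
                                       !is_narrow_character<T>::value> {};

// The explicit list suggested for Elem.
template <class T>
using is_listed_elem = is_one_of<T, wchar_t, char16_t, char32_t>;

// Both readings give the same answer for T.
template <class T>
constexpr bool readings_agree() {
    return is_wide_by_exclusion<T>::value == is_listed_elem<T>::value;
}

static_assert(readings_agree<char>() && readings_agree<signed char>() &&
                  readings_agree<unsigned char>() && readings_agree<utf8_char>() &&
                  readings_agree<wchar_t>() && readings_agree<char16_t>() &&
                  readings_agree<char32_t>() && readings_agree<int>(),
              "the two readings coincide on these types");
```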

[Varna 2023-06-14; Move to Ready]

[2023-06-25; Daniel comments]

During the Varna LWG discussions of this issue it was pointed out that the wording change applied to [depr.locale.stdcvt.req] bullet (1.1) could now exclude the previously allowed use of narrow character types as a "wide-character" type, e.g. with a Maxcode value of 255. First, I don't think that the revised wording really forbids this. Second, the originating proposal N2401 doesn't indicate what the actual intent was, and it seems questionable to assign this issue to LEWG given that the relevant wording is part of deprecated components, especially given their current position expressed here to eliminate the specification of the affected components as suggested by P2871.

[2023-11-11 Approved at November 2023 meeting in Kona. Status changed: Voting → WP.]

Proposed resolution:

This wording is relative to N4950.

[Drafting note: All usages of "character type" in 22.14 [format] seem to be without problems.]

  1. Modify 30.3.1.2.1 [locale.category] as indicated:

    [Drafting note: The more general interpretation of "character container type", instead of "character type" in the core language sense, seems safe here. It seems reasonable that an implementation allows more than the core language character types but could still impose additional constraints on them. Even if an implementation never intends to support anything beyond char and wchar_t, the wording below is harmless. One alternative here could be to use the even more general term "char-like types" from 23.1 [strings.general], but I'm unconvinced that this buys us much.]

    -6- […] A template parameter with name C represents the set of types containing char, wchar_t, and any other implementation-defined character container types (3.10 [defns.character.container]) that meet the requirements for a character on which any of the iostream components can be instantiated. […]

  2. Keep Stage 1 of 30.4.3.3.3 [facet.num.put.virtuals], following p4, unchanged:

    [Drafting note: The wording here seems to refer to the pure core language wording meaning of a character type.]

    […] For conversion from an integral type other than a character type, the function determines the integral conversion specifier as indicated in Table 110.

  3. Modify 31.2.3 [iostreams.limits.pos] as indicated:

    [Drafting note: Similar to 30.3.1.2.1 [locale.category] above, the more general interpretation of "character container type" instead of character type by the meaning of the core language seems safe here.]

    -3- In the classes of Clause 31, a template parameter with name charT represents a member of the set of types containing char, wchar_t, and any other implementation-defined character container types (3.10 [defns.character.container]) that meet the requirements for a character on which any of the iostream components can be instantiated.

  4. Modify 31.7.6.3.1 [ostream.formatted.reqmts] as indicated:

    -3- If a formatted output function of a stream os determines padding, it does so as follows. Given a charT character sequence seq where charT is the character container type of the stream, […]

  5. Modify 31.7.6.3.4 [ostream.inserters.character] as indicated:

    template<class charT, class traits>
      basic_ostream<charT, traits>& operator<<(basic_ostream<charT, traits>& out, charT c);
    template<class charT, class traits>
      basic_ostream<charT, traits>& operator<<(basic_ostream<charT, traits>& out, char c);
    // specialization
    template<class traits>
      basic_ostream<char, traits>& operator<<(basic_ostream<char, traits>& out, char c);
    // signed and unsigned
    template<class traits>
      basic_ostream<char, traits>& operator<<(basic_ostream<char, traits>& out, signed char c);
    template<class traits>
      basic_ostream<char, traits>& operator<<(basic_ostream<char, traits>& out, unsigned char c);
    

    -1- Effects: Behaves as a formatted output function (31.7.6.3.1 [ostream.formatted.reqmts]) of out. Constructs a character sequence seq. If c has type char and the character container type of the stream is not char, then seq consists of out.widen(c); otherwise seq consists of c. Determines padding for seq as described in 31.7.6.3.1 [ostream.formatted.reqmts]. Inserts seq into out. Calls os.width(0).

  6. Modify [depr.locale.stdcvt.req] as indicated:

    1. (1.1) — Elem is one of wchar_t, char16_t, or char32_t.

    2. (1.2) — Maxcode is the largest value of Elem converted to unsigned long that the facet will read or write without reporting a conversion error.

    3. […]
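
The observable behaviour described by the wording for 31.7.6.3.1 [ostream.formatted.reqmts] and 31.7.6.3.4 [ostream.inserters.character] can be illustrated with a short sketch. The function names below are hypothetical and exist only for this example; the stream classes and manipulators are the standard ones. It shows padding of a one-character sequence, the width reset after insertion, and the widening of a char inserted into a stream whose character container type is wchar_t.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Narrow stream: the character container type is char, so seq consists of c.
// With the default (right) adjustment, fill characters are placed before seq.
inline std::string padded_narrow() {
    std::ostringstream out;
    out << std::setfill('*') << std::setw(4) << 'x';  // inserts "***x"
    out << 'y';  // width was reset to 0, so no padding is added here
    return out.str();
}

// Wide stream: c has type char but the character container type is wchar_t,
// so seq consists of out.widen(c). With left adjustment, fill follows seq.
inline std::wstring padded_wide() {
    std::wostringstream out;
    out << std::setfill(L'.') << std::left << std::setw(3) << 'a';
    return out.str();
}

// The character inserter resets the field width once seq has been inserted.
inline void check_width_reset() {
    std::ostringstream out;
    out << std::setw(4) << 'x';
    assert(out.width() == 0);
}
```

Here `padded_narrow()` yields `"***xy"` and `padded_wide()` yields `L"a.."`. In both cases the padding is expressed in units of the stream's own character container type, which is the point of replacing "character type of the stream" in the wording above.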