427. Stage 2 and rationale of DR 221

Section: 28.3.4.3.2.3 [facet.num.get.virtuals] Status: C++11 Submitter: Martin Sebor Opened: 2003-09-18 Last modified: 2016-01-28

Priority: Not Prioritized

Discussion:

The requirements specified in Stage 2 and reiterated in the rationale of DR 221 (and echoed again in DR 303) specify that num_get<charT>::do_get() compares characters on the stream against the widened elements of "012...abc...ABCX+-".

An implementation is required to allow programs to instantiate the num_get template on any charT that satisfies the requirements on a user-defined character type. These requirements do not include the ability of the character type to be equality comparable (the char_traits template must be used to perform tests for equality). Hence, the num_get template cannot be implemented to support any arbitrary character type. The num_get template must either make the assumption that the character type is equality-comparable (as some popular implementations do), or it may use char_traits<charT> to do the comparisons (some other popular implementations do that). This diversity of approaches makes it difficult to write portable programs that attempt to instantiate the num_get template on user-defined types.
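
The gap described above can be illustrated with a minimal sketch. The names my_char, my_traits, and equality_example are hypothetical and appear neither in the standard nor in this issue. The character type deliberately provides no operator==, so only the traits class can compare two characters:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical user-defined character type: deliberately has no operator==.
struct my_char {
    unsigned short code;
};

// Hypothetical traits class supplying the comparison the type itself lacks.
struct my_traits {
    using char_type = my_char;

    static bool eq(const char_type& a, const char_type& b) noexcept {
        return a.code == b.code;
    }

    // Linear search in the style of char_traits<charT>::find: returns a
    // pointer to the first match in [s, s + n), or nullptr if there is none.
    static const char_type* find(const char_type* s, std::size_t n,
                                 const char_type& a) noexcept {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], a)) return s + i;
        return nullptr;
    }
};

// Usage sketch: the lookup never applies == to my_char.
inline void equality_example() {
    const my_char atoms[] = {{'0'}, {'1'}, {'2'}};
    const my_char ct{'1'};

    // bool same = (ct == atoms[1]);   // ill-formed: my_char has no operator==
    assert(my_traits::eq(ct, atoms[1]));
    assert(my_traits::find(atoms, 3, ct) == atoms + 1);
}
```

An implementation of num_get that writes ct == atoms[i] would fail to compile for such a type, while one that goes through a traits class would accept it.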

[Kona: the heart of the problem is that we're theoretically supposed to use traits classes for all fundamental character operations like assignment and comparison, but facets don't have traits parameters. This is a fundamental design flaw and it appears all over the place, not just in this one place. It's not clear what the correct solution is, but a thorough review of facets and traits is in order. The LWG considered and rejected the possibility of changing numeric facets to use narrowing instead of widening. This may be a good idea for other reasons (see issue 459), but it doesn't solve the problem raised by this issue. Whether we use widen or narrow, the num_get facet still has no idea which traits class the user wants to use for the comparison, because only streams, not facets, are passed traits classes. The standard does not require that two different traits classes with the same char_type must necessarily have the same behavior.]
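
As a sketch of the last point, two traits classes with the same char_type can disagree on equality. The names ci_traits, facet_style_eq, and traits_example are hypothetical. The helper facet_style_eq stands in for a facet, which is parameterized on the character type only and so can at best fall back on std::char_traits<charT>:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical traits class: same char_type as std::char_traits<char>,
// but eq() and lt() ignore case.
struct ci_traits : std::char_traits<char> {
    static bool eq(char a, char b) noexcept {
        return std::tolower(static_cast<unsigned char>(a)) ==
               std::tolower(static_cast<unsigned char>(b));
    }
    static bool lt(char a, char b) noexcept {
        return std::tolower(static_cast<unsigned char>(a)) <
               std::tolower(static_cast<unsigned char>(b));
    }
};

// Like num_get<charT>, this helper sees only the character type, so the
// only traits class it can name is std::char_traits<charT>.
template <class charT>
bool facet_style_eq(charT a, charT b) {
    return std::char_traits<charT>::eq(a, b);
}

// Usage sketch: the stream's traits and the facet-style comparison disagree.
inline void traits_example() {
    assert(ci_traits::eq('x', 'X'));    // what a stream using ci_traits sees
    assert(!facet_style_eq('x', 'X'));  // what a traits-unaware facet sees
}
```

A stream instantiated with ci_traits would treat 'x' and 'X' as equal, but a facet has no way to learn that.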

Informally, one possibility: require that some of the basic character operations, such as eq, lt, and assign, must behave the same way for all traits classes with the same char_type. If we accept that limitation on traits classes, then the facet could reasonably be required to use char_traits<charT>.

[ 2009-07 Frankfurt ]

There was general agreement that the standard only needs to specify the behavior when the character type is char or wchar_t.

Beman: we don't need to worry about C++1x because there is a non-zero possibility that we would have a replacement facility for iostreams that would solve these problems.

We need to change the following sentence in [locale.category], paragraph 6 to specify that C is char and wchar_t:

"A template formal parameter with name C represents the set of all possible specializations on a parameter that satisfies the requirements for a character on which any member of the iostream components can be instantiated."

We also need to specify in clause 27 that the basic character operations, such as eq, lt, and assign, use std::char_traits.

Daniel volunteered to provide wording.

[ 2009-09-19 Daniel provided wording. ]

[ 2009-10 Santa Cruz: ]

Leave as Open. Alisdair and/or Tom will provide wording based on discussions. We want to clearly state that streams and locales work just on char and wchar_t (except where otherwise specified).

[ 2010-02-06 Tom updated the proposed wording. ]

[ The original proposed wording is preserved here: ]

  1. Change 28.3.3.1.2.1 [locale.category]/6:

    [..] A template formal parameter with name C represents the set of all possible specializations on a char or wchar_t parameter that satisfies the requirements for a character on which any of the iostream components can be instantiated. [..]

  2. Add the following sentence to the end of 28.3.4.3 [category.numeric]/2:

    [..] These specializations refer to [..], and also for the ctype<> facet to perform character classification. Implementations are encouraged but not required to use the char_traits<charT> functions for all comparisons and assignments of characters of type charT that do not belong to the set of required specializations.

  3. Change 28.3.4.3.2.3 [facet.num.get.virtuals]/3:

    Stage 2: If in==end then stage 2 terminates. Otherwise a charT is taken from in and local variables are initialized as if by

    char_type ct = *in;
    using tr = char_traits<char_type>;
    const char_type* pos = tr::find(atoms, sizeof(src) - 1, ct);
    char c = src[pos ? pos - atoms : sizeof(src) - 1];
    if (tr::eq(ct, use_facet<numpunct<charT> >(loc).decimal_point()))
        c = '.';
    bool discard =
        tr::eq(ct, use_facet<numpunct<charT> >(loc).thousands_sep())
        && use_facet<numpunct<charT> >(loc).grouping().length() != 0;

    where the values src and atoms are defined as if by: [..]

    [Remark of the author: I considered replacing the initialization "char_type ct = *in;" with the sequence "char_type ct; tr::assign(ct, *in);", but decided against it, because it is a copy-initialization context, not an assignment]

  4. Add the following sentence to the end of 28.3.4.6 [category.time]/1:

    [..] Their members use [..] , to determine formatting details. Implementations are encouraged but not required to use the char_traits<charT> functions for all comparisons and assignments of characters of type charT that do not belong to the set of required specializations.

  5. Change 28.3.4.6.2.2 [locale.time.get.members]/8 bullet 4:

    • For the next element c of fmt char_traits<char_type>::eq(c, use_facet<ctype<char_type>>(f.getloc()).widen('%')) == true, [..]

  6. Add the following sentence to the end of 28.3.4.7 [category.monetary]/2:

    Their members use [..] to determine formatting details. Implementations are encouraged but not required to use the char_traits<charT> functions for all comparisons and assignments of characters of type charT that do not belong to the set of required specializations.

  7. Change 28.3.4.7.2.3 [locale.money.get.virtuals]/4:

    [..] The value units is produced as if by:

    for (int i = 0; i < n; ++i)
      buf2[i] = src[char_traits<charT>::find(atoms, atoms+sizeof(src), buf1[i]) - atoms];
    buf2[n] = 0;
    sscanf(buf2, "%Lf", &units);
    
  8. Change 28.3.4.7.3.3 [locale.money.put.virtuals]/1:

    [..] for character buffers buf1 and buf2. If for the first character c in digits or buf2 char_traits<charT>::eq(c, ct.widen('-')) == true, [..]

  9. Add a footnote to the first sentence of 31.7.5.3.2 [istream.formatted.arithmetic]/1:

    As in the case of the inserters, these extractors depend on the locale's num_get<> (22.4.2.1) object to perform parsing the input stream data.(footnote) [..]

    footnote) If the traits of the input stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.

  10. Add a footnote to the second sentence of 31.7.6.3.2 [ostream.inserters.arithmetic]/1:

    Effects: The classes num_get<> and num_put<> handle locale-dependent numeric formatting and parsing. These inserter functions use the imbued locale value to perform numeric formatting.(footnote) [..]

    footnote) If the traits of the output stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.

  11. Add a footnote after the first sentence of 31.7.8 [ext.manip]/4:

    Returns: An object of unspecified type such that if in is an object of type basic_istream<charT, traits> then the expression in >> get_money(mon, intl) behaves as if it called f(in, mon, intl), where the function f is defined as:(footnote) [..]

    footnote) If the traits of the input stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.

  12. Add a footnote after the first sentence of 31.7.8 [ext.manip]/5:

    Returns: An object of unspecified type such that if out is an object of type basic_ostream<charT, traits> then the expression out << put_money(mon, intl) behaves as a formatted input function that calls f(out, mon, intl), where the function f is defined as:(footnote) [..]

    footnote) If the traits of the output stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.

  13. Add a footnote after the first sentence of 31.7.8 [ext.manip]/8:

    Returns: An object of unspecified type such that if in is an object of type basic_istream<charT, traits> then the expression in >> get_time(tmb, fmt) behaves as if it called f(in, tmb, fmt), where the function f is defined as:(footnote) [..]

    footnote) If the traits of the input stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.

  14. Add a footnote after the first sentence of 31.7.8 [ext.manip]/10:

    Returns: An object of unspecified type such that if out is an object of type basic_ostream<charT, traits> then the expression out << put_time(tmb, fmt) behaves as if it called f(out, tmb, fmt), where the function f is defined as:(footnote) [..]

    footnote) If the traits of the output stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.
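
The Stage 2 lookup proposed in item 3 of the list above can be tried out with a small sketch. The function names stage2_lookup, stage2_discard, and stage2_example are hypothetical. The src string is assumed to be the one from the C++11 description of Stage 2, which the item elides as "[..]". The sketch only classifies a single character and does not implement the rest of do_get:

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>

// Sketch of the Stage 2 character lookup for one input character ct.
// Returns the narrow character Stage 2 would work with; '\0' means that
// ct matched none of the atoms and is not the decimal point.
template <class charT>
char stage2_lookup(charT ct, const std::locale& loc) {
    // Assumption: src as given in the C++11 text for Stage 2.
    static const char src[] = "0123456789abcdefxABCDEFX+-";
    charT atoms[sizeof(src)];
    std::use_facet<std::ctype<charT>>(loc).widen(src, src + sizeof(src), atoms);

    using tr = std::char_traits<charT>;
    // char_traits::find takes a length and yields a null pointer on failure.
    const charT* pos = tr::find(atoms, sizeof(src) - 1, ct);
    char c = src[pos ? pos - atoms : sizeof(src) - 1];
    if (tr::eq(ct, std::use_facet<std::numpunct<charT>>(loc).decimal_point()))
        c = '.';
    return c;
}

// Sketch of the discard flag: a thousands separator is dropped only when
// the locale actually specifies a grouping.
template <class charT>
bool stage2_discard(charT ct, const std::locale& loc) {
    const auto& np = std::use_facet<std::numpunct<charT>>(loc);
    return std::char_traits<charT>::eq(ct, np.thousands_sep())
        && np.grouping().length() != 0;
}

// Usage sketch with the classic "C" locale.
inline void stage2_example() {
    const std::locale& c_loc = std::locale::classic();
    assert(stage2_lookup('7', c_loc) == '7');
    assert(stage2_lookup(L'x', c_loc) == 'x');
    assert(stage2_lookup('.', c_loc) == '.');
    assert(stage2_lookup('?', c_loc) == '\0');
    assert(!stage2_discard(',', c_loc));  // classic grouping is empty
}
```

Because char_traits<charT>::find reports failure with a null pointer rather than an end iterator, the index expression needs the explicit pos ? ... : ... fallback that the proposed wording introduces.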

[ 2010 Pittsburgh: ]

Moved to Ready with only two of the bullets. The original wording is preserved here:

  1. Change 28.3.3.1.2.1 [locale.category]/6:

    [..] A template formal parameter with name C represents the set of types containing char, wchar_t, and any other implementation-defined character type that satisfies the requirements for a character on which any of the iostream components can be instantiated. [..]

  2. Add the following sentence to the end of 28.3.4.3 [category.numeric]/2:

    [..] These specializations refer to [..], and also for the ctype<> facet to perform character classification. [Note: Implementations are encouraged but not required to use the char_traits<charT> functions for all comparisons and assignments of characters of type charT that do not belong to the set of required specializations - end note].

  3. Change 28.3.4.3.2.3 [facet.num.get.virtuals]/3:

    Stage 2: If in==end then stage 2 terminates. Otherwise a charT is taken from in and local variables are initialized as if by

    char_type ct = *in;
    using tr = char_traits<char_type>;
    const char_type* pos = tr::find(atoms, sizeof(src) - 1, ct);
    char c = src[pos ? pos - atoms : sizeof(src) - 1];
    if (tr::eq(ct, use_facet<numpunct<charT> >(loc).decimal_point()))
        c = '.';
    bool discard =
        tr::eq(ct, use_facet<numpunct<charT> >(loc).thousands_sep())
        && use_facet<numpunct<charT> >(loc).grouping().length() != 0;

    where the values src and atoms are defined as if by: [..]

    [Remark of the author: I considered replacing the initialization "char_type ct = *in;" with the sequence "char_type ct; tr::assign(ct, *in);", but decided against it, because it is a copy-initialization context, not an assignment]

  4. Add the following sentence to the end of 28.3.4.6 [category.time]/1:

    [..] Their members use [..] , to determine formatting details. [Note: Implementations are encouraged but not required to use the char_traits<charT> functions for all comparisons and assignments of characters of type charT that do not belong to the set of required specializations - end note].

  5. Change 28.3.4.6.2.2 [locale.time.get.members]/8 bullet 4:

    • For the next element c of fmt char_traits<char_type>::eq(c, use_facet<ctype<char_type>>(f.getloc()).widen('%')) == true, [..]

  6. Add the following sentence to the end of 28.3.4.7 [category.monetary]/2:

    Their members use [..] to determine formatting details. [Note: Implementations are encouraged but not required to use the char_traits<charT> functions for all comparisons and assignments of characters of type charT that do not belong to the set of required specializations - end note].

  7. Change 28.3.4.7.2.3 [locale.money.get.virtuals]/4:

    [..] The value units is produced as if by:

    for (int i = 0; i < n; ++i)
      buf2[i] = src[char_traits<charT>::find(atoms, atoms+sizeof(src), buf1[i]) - atoms];
    buf2[n] = 0;
    sscanf(buf2, "%Lf", &units);
    
  8. Change 28.3.4.7.3.3 [locale.money.put.virtuals]/1:

    [..] for character buffers buf1 and buf2. If for the first character c in digits or buf2 char_traits<charT>::eq(c, ct.widen('-')) == true, [..]

  9. Add a new paragraph after the first paragraph of 31.2.3 [iostreams.limits.pos]/1:

    In the classes of clause 27, a template formal parameter with name charT represents one of the set of types containing char, wchar_t, and any other implementation-defined character type that satisfies the requirements for a character on which any of the iostream components can be instantiated.

  10. Add a footnote to the first sentence of 31.7.5.3.2 [istream.formatted.arithmetic]/1:

    As in the case of the inserters, these extractors depend on the locale's num_get<> (22.4.2.1) object to perform parsing the input stream data.(footnote) [..]

    footnote) If the traits of the input stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.

  11. Add a footnote to the second sentence of 31.7.6.3.2 [ostream.inserters.arithmetic]/1:

    Effects: The classes num_get<> and num_put<> handle locale-dependent numeric formatting and parsing. These inserter functions use the imbued locale value to perform numeric formatting.(footnote) [..]

    footnote) If the traits of the output stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.

  12. Add a footnote after the first sentence of 31.7.8 [ext.manip]/4:

    Returns: An object of unspecified type such that if in is an object of type basic_istream<charT, traits> then the expression in >> get_money(mon, intl) behaves as if it called f(in, mon, intl), where the function f is defined as:(footnote) [..]

    footnote) If the traits of the input stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.

  13. Add a footnote after the first sentence of 31.7.8 [ext.manip]/5:

    Returns: An object of unspecified type such that if out is an object of type basic_ostream<charT, traits> then the expression out << put_money(mon, intl) behaves as a formatted input function that calls f(out, mon, intl), where the function f is defined as:(footnote) [..]

    footnote) If the traits of the output stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.

  14. Add a footnote after the first sentence of 31.7.8 [ext.manip]/8:

    Returns: An object of unspecified type such that if in is an object of type basic_istream<charT, traits> then the expression in >> get_time(tmb, fmt) behaves as if it called f(in, tmb, fmt), where the function f is defined as:(footnote) [..]

    footnote) If the traits of the input stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.

  15. Add a footnote after the first sentence of 31.7.8 [ext.manip]/10:

    Returns: An object of unspecified type such that if out is an object of type basic_ostream<charT, traits> then the expression out << put_time(tmb, fmt) behaves as if it called f(out, tmb, fmt), where the function f is defined as:(footnote) [..]

    footnote) If the traits of the output stream has different semantics for lt(), eq(), and assign() than char_traits<char_type>, this may give surprising results.
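
The dependence of the arithmetic extractors on the locale's num_get<> object, noted in the footnotes above, can be observed with the required char specialization. In this sketch the facet comma_decimal and the functions parse_with_comma and extractor_example are hypothetical names. Only the decimal point is customized, and grouping is left empty, so no character is discarded as a thousands separator:

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical numpunct facet that uses ',' as the decimal point.
struct comma_decimal : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
};

// Parses text as a double through operator>>, which delegates to the
// num_get<char> facet of the stream's locale.
// Returns true and stores the value in out on success.
inline bool parse_with_comma(const std::string& text, double& out) {
    std::istringstream in(text);
    // The locale takes ownership of the facet created here.
    in.imbue(std::locale(std::locale::classic(), new comma_decimal));
    return static_cast<bool>(in >> out);
}

// Usage sketch: Stage 2 maps the locale's decimal point to '.'.
inline void extractor_example() {
    double value = 0.0;
    assert(parse_with_comma("3,5", value));
    assert(value == 3.5);
}
```

Here the stream and the facet both use std::char_traits<char>, so the two agree on what counts as the decimal point. The footnotes warn about the case where the stream's traits class compares characters differently.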

Proposed resolution:

  1. Change 28.3.3.1.2.1 [locale.category]/6:

    [..] A template formal parameter with name C represents the set of types containing char, wchar_t, and any other implementation-defined character type that satisfies the requirements for a character on which any of the iostream components can be instantiated. [..]

  2. Add a new paragraph after the first paragraph of 31.2.3 [iostreams.limits.pos]/1:

    In the classes of clause 27, a template formal parameter with name charT represents one of the set of types containing char, wchar_t, and any other implementation-defined character type that satisfies the requirements for a character on which any of the iostream components can be instantiated.