Section: 28.3.4.3.2.3 [facet.num.get.virtuals] Status: C++23 Submitter: Marshall Clow Opened: 2014-04-30 Last modified: 2023-11-22
Priority: 2
Discussion:
In 28.3.4.3.2.3 [facet.num.get.virtuals] we have:
Stage 3: The sequence of chars accumulated in stage 2 (the field) is converted to a numeric value by the rules of one of the functions declared in the header <cstdlib>:
For a signed integer value, the function strtoll.
For an unsigned integer value, the function strtoull.
For a floating-point value, the function strtold.
This implies that for many cases, this routine should return true:
bool is_same(const char* p)
{
  std::string str{p};
  double val1 = std::strtod(str.c_str(), nullptr);
  std::stringstream ss(str);
  double val2;
  ss >> val2;
  return std::isinf(val1) == std::isinf(val2) &&                 // either they're both infinity
         std::isnan(val1) == std::isnan(val2) &&                 // or they're both NaN
         (std::isinf(val1) || std::isnan(val1) || val1 == val2); // or they're equal
}
and this is indeed true, for many strings:
assert(is_same("0"));
assert(is_same("1.0"));
assert(is_same("-1.0"));
assert(is_same("100.123"));
assert(is_same("1234.456e89"));
but not for others:
assert(is_same("0xABp-4")); // hex float
assert(is_same("inf"));
assert(is_same("+inf"));
assert(is_same("-inf"));
assert(is_same("nan"));
assert(is_same("+nan"));
assert(is_same("-nan"));
assert(is_same("infinity"));
assert(is_same("+infinity"));
assert(is_same("-infinity"));
These are all strings that are correctly parsed by std::strtod, but not by the stream extraction operators.
They contain characters that are deemed invalid in stage 2 of parsing.
If we're going to say that we're converting by the rules of strtold, then we should accept all the things that strtold accepts.
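The discrepancy described above can be probed with a small sketch like the following. The helper names via_strtod and via_stream are hypothetical and are not part of the issue; the sketch assumes the default "C" locale and a C99-conforming strtod. Whether the stream path accepts hex floats, inf, or nan depends on the standard library implementation, so the sketch reports success or failure rather than assuming a particular result.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper: parse with strtod and succeed only if the whole
// string was consumed.
bool via_strtod(const std::string& s, double& out) {
    char* end = nullptr;
    const double v = std::strtod(s.c_str(), &end);
    if (end == s.c_str() || *end != '\0') return false;
    out = v;
    return true;
}

// Hypothetical helper: parse with operator>> and succeed only if the
// extraction worked and no characters were left unread.
bool via_stream(const std::string& s, double& out) {
    std::istringstream ss(s);
    double v = 0;
    if (!(ss >> v)) return false;
    if (ss.peek() != std::char_traits<char>::eof()) return false;
    out = v;
    return true;
}
```

For a string such as "0xABp-4", via_strtod is expected to succeed, while the outcome of via_stream varies between implementations, which is the inconsistency this issue describes.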
[2016-04, Issues Telecon]
People are much more interested in round-tripping hex floats than handling inf and nan. Priority changed to P2.
Marshall says he'll try to write some wording, noting that this is a very closely specified part of the standard, and has remained unchanged for a long time. Also, there will need to be a sample implementation.
[2016-08, Chicago]
Zhihao provides wording
The src array in Stage 2 does narrowing only. The actual input validation is delegated to strtold (independent from the parsing in Stage 3, which is again delegated to strtold) by saying:
[...] If it is not discarded, then a check is made to determine if c is allowed as the next character of an input field of the conversion specifier returned by Stage 1.
So a conforming C++11 num_get is supposed to magically accept a hexfloat without an exponent, such as 0x3.AB, because we refer to C99, and the fix to this issue should be just expanding the src array.
Support for Infs and NaNs is not proposed because of the complexity of nan(n-chars).
[2016-08, Chicago]
Tues PM: Move to Open
[2016-09-08, Zhihao Yuan comments and updates proposed wording]
Examples added.
[2018-08-23 Batavia Issues processing]
Needs an Annex C entry. Tim to write Annex C.
Previous resolution [SUPERSEDED]:
This wording is relative to N4606.
Change 28.3.4.3.2.3 [facet.num.get.virtuals]/3 Stage 2 as indicated:
static const char src[] = "0123456789abcdefpxABCDEFPX+-";
Append the following examples to 28.3.4.3.2.3 [facet.num.get.virtuals]/3 Stage 2 as indicated:
[Example:
Given an input sequence of "0x1a.bp+07p",
if Stage 1 returns %d, "0" is accumulated;
if Stage 1 returns %i, "0x1a" are accumulated;
if Stage 1 returns %g, "0x1a.bp+07" are accumulated.
In all cases, leaving the rest in the input.
— end example]
[2021-05-18 Tim updates wording]
Based on the git history, libc++ appears to have always included p and P in src.
[2021-09-20; Reflector poll]
Set status to Tentatively Ready after eight votes in favour during reflector poll.
[2021-10-14 Approved at October 2021 virtual plenary. Status changed: Voting → WP.]
Proposed resolution:
This wording is relative to N4885.
Change 28.3.4.3.2.3 [facet.num.get.virtuals]/3 Stage 2 as indicated:
— Stage 2:
If in == end then stage 2 terminates. Otherwise a charT is taken from in and local variables are initialized as if by
char_type ct = *in;
char c = src[find(atoms, atoms + sizeof(src) - 1, ct) - atoms];
if (ct == use_facet<numpunct<charT>>(loc).decimal_point())
  c = '.';
bool discard =
  ct == use_facet<numpunct<charT>>(loc).thousands_sep()
  && use_facet<numpunct<charT>>(loc).grouping().length() != 0;
where the values src and atoms are defined as if by:
static const char src[] = "0123456789abcdefpxABCDEFPX+-";
char_type atoms[sizeof(src)];
use_facet<ctype<charT>>(loc).widen(src, src + sizeof(src), atoms);
for this value of loc.
If discard is true, then if '.' has not yet been accumulated, then the position of the character is remembered, but the character is otherwise ignored. Otherwise, if '.' has already been accumulated, the character is discarded and Stage 2 terminates. If it is not discarded, then a check is made to determine if c is allowed as the next character of an input field of the conversion specifier returned by Stage 1. If so, it is accumulated.
If the character is either discarded or accumulated then in is advanced by ++in and processing returns to the beginning of stage 2.
[Example:
Given an input sequence of "0x1a.bp+07p",
if the conversion specifier returned by Stage 1 is %d, "0" is accumulated;
if the conversion specifier returned by Stage 1 is %i, "0x1a" are accumulated;
if the conversion specifier returned by Stage 1 is %g, "0x1a.bp+07" are accumulated.
In all cases, the remainder is left in the input.
— end example]
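As a non-normative illustration, the Stage 2 accumulation in the example above might be sketched as follows. This is a simplified model, not any library's implementation: it assumes plain char input in the "C" locale (so no widening and no thousands separators), and the names viable_prefix and accumulate are hypothetical. The prefix check is hand-written for %d, %i and %g only and does not validate octal digit ranges for %i.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

bool is_dec(char ch) { return std::isdigit(static_cast<unsigned char>(ch)) != 0; }
bool is_hex(char ch) { return std::isxdigit(static_cast<unsigned char>(ch)) != 0; }

// Hypothetical helper: is `s` a viable prefix of an input field for the
// conversion specifier `spec` ('d', 'i' or 'g')?
bool viable_prefix(const std::string& s, char spec) {
    std::size_t i = 0;
    const std::size_t n = s.size();
    if (i < n && (s[i] == '+' || s[i] == '-')) ++i;

    // %i and %g may begin with a 0x / 0X prefix; %d may not.
    bool hex = false;
    if (spec != 'd' && i + 1 < n && s[i] == '0' &&
        (s[i + 1] == 'x' || s[i + 1] == 'X')) {
        hex = true;
        i += 2;
    }
    bool (*digit)(char) = hex ? is_hex : is_dec;
    while (i < n && digit(s[i])) ++i;

    if (spec == 'g') {
        // Optional fractional part.
        if (i < n && s[i] == '.') {
            ++i;
            while (i < n && digit(s[i])) ++i;
        }
        // Optional exponent: p/P for hex floats, e/E for decimal floats.
        const bool has_exp = i < n && (hex ? (s[i] == 'p' || s[i] == 'P')
                                           : (s[i] == 'e' || s[i] == 'E'));
        if (has_exp) {
            ++i;
            if (i < n && (s[i] == '+' || s[i] == '-')) ++i;
            while (i < n && is_dec(s[i])) ++i;
        }
    }
    return i == n;  // every character fit the grammar so far
}

// Simplified Stage 2: accumulate characters while they are in src (or are
// the decimal point) and are allowed as the next character of the field.
std::string accumulate(const std::string& input, char spec) {
    assert(spec == 'd' || spec == 'i' || spec == 'g');
    static const char src[] = "0123456789abcdefpxABCDEFPX+-";
    const char* const src_end = src + sizeof(src) - 1;
    std::string field;
    for (char ct : input) {
        char c = *std::find(src, src_end, ct);  // '\0' when ct is not in src
        if (ct == '.') c = '.';
        if (c == '\0' || !viable_prefix(field + c, spec)) break;
        field += c;
    }
    return field;
}
```

Under these assumptions, calling accumulate("0x1a.bp+07p", spec) with 'd', 'i' and 'g' yields "0", "0x1a" and "0x1a.bp+07" respectively, matching the example.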
Add the following new subclause to C.6 [diff.cpp03]:
C.4.? [locale]: localization library [diff.cpp03.locale]
Affected subclause: 28.3.4.3.2.3 [facet.num.get.virtuals]
Change: The num_get facet recognizes hexadecimal floating point values.
Rationale: Required by new feature.
Effect on original feature: Valid C++2003 code may have different behavior in this revision of C++.