3081. Floating point from_chars API does not distinguish between overflow and underflow

Section: 28.2.3 [charconv.from.chars] Status: Open Submitter: Greg Falcon Opened: 2018-03-12 Last modified: 2023-03-29

Priority: 2

Discussion:

strtod() distinguishes between overflow and underflow by returning a value that is either very large or very small. Floating point from_chars does not currently offer any way for callers to distinguish these two cases.
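As an illustration of the strtod() behavior described above, here is a minimal sketch of how a caller might separate the two cases. The names `StrtodRange` and `classify_strtod` are hypothetical and not part of any library. The sketch assumes the implementation sets errno to ERANGE on underflow, which the C standard leaves implementation-defined.

```cpp
#include <cerrno>
#include <cmath>
#include <cstdlib>

// Hypothetical helper type for this sketch only.
enum class StrtodRange { ok, overflow, underflow };

// Convert text with strtod() and report which kind of range error, if any, occurred.
StrtodRange classify_strtod(const char* text, double& out) {
    errno = 0;
    char* end = nullptr;
    out = std::strtod(text, &end);
    if (errno != ERANGE)
        return StrtodRange::ok;
    // On overflow strtod() returns plus or minus HUGE_VAL; any other
    // ERANGE result is treated as underflow (a very small magnitude).
    return std::fabs(out) == HUGE_VAL ? StrtodRange::overflow
                                      : StrtodRange::underflow;
}
```

The returned magnitude is the only channel that separates the two cases here, which is the functionality the issue asks from_chars to preserve.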

It would be beneficial if users could migrate from strtod() to from_chars without loss of functionality.

I recommend that floating point from_chars use value as an overflow-vs-underflow reporting channel, in the same manner as strtod().

My proposed wording gives from_chars the same wide latitude that strtod() enjoys for handling underflow. A high-quality implementation would likely set ec == result_out_of_range for underflow only when the nearest representable float to the parsed value is a zero and the parsed mantissa was nonzero. In this case value would be set to (an appropriately-signed) zero. It is worth considering giving from_chars this more predictable behavior, if library writers feel they can provide this guarantee for all platforms. (I have a proof-of-concept integer-based implementation for IEEE doubles with this property.)

[2018-06 Rapperswil Wednesday issues processing]

Marshall to provide updated wording and propose Tentatively Ready on the reflector.

Priority set to 2

[2018-08-23 Batavia Issues processing]

Status to Open; Marshall to reword

[2023-03-29; Jonathan adds further discussion]

There are conflicting interpretations of "not in the range representable" for floating-point types. One view is that 1e-10000 and 1e+10000 are outside the representable range for a 64-bit double-precision floating-point type (which has min/max binary exponents of -1022 and 1023). Another view is that the representable range for floating-point types is [-inf,+inf], which means that there are values that cannot be accurately represented, but there are no values "not in the range representable". And 1e-10000 is clearly within the range [0,numeric_limits<double>::max()], even if we don't use infinity as the upper bound of the range. Under the second interpretation, the result will be ±0.0 for underflow and ±inf for overflow, but ec will not be set.

The current proposed resolution does address this, by making it clear that value should be set to a very small or very large value (with appropriate sign), but that ec should also be set. The use of the word "overflow" for the integer overloads is a problem though, because the result cannot "overflow" an unsigned integer type, but can certainly be outside its range.
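To see which of the interpretations described above a given implementation follows, a small probe along these lines may help. It is only a sketch: `ParseOutcome` and `parse_double` are hypothetical names, the sentinel is an arbitrary starting value used to detect whether value was modified, and the code assumes a standard library that provides the floating-point from_chars overloads.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical result type for this sketch only.
struct ParseOutcome {
    double value;       // still equal to the sentinel if value was left unmodified
    std::errc ec;       // errc{} on success, result_out_of_range if reported
    bool consumed_all;  // true if ptr reached the end of the input
};

// Run from_chars on text, starting from a caller-chosen sentinel value.
ParseOutcome parse_double(std::string_view text, double sentinel) {
    double value = sentinel;
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    return {value, ec, ptr == last};
}
```

Calling `parse_double("1e+10000", 42.0)` and `parse_double("1e-10000", 42.0)` and inspecting `ec` and `value` shows whether the implementation reports result_out_of_range, and whether it leaves the sentinel in place or writes a very large or very small result.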

Proposed resolution:

This wording is relative to N4727.

  1. Edit 28.2.3 [charconv.from.chars] as indicated:

    […] Otherwise, the characters matching the pattern are interpreted as a representation of a value of the type of value. The member ptr of the return value points to the first character not matching the pattern, or has the value last if all characters match. If the parsed value is not in the range representable by the type of value, value is unmodified and the member ec of the return value is equal to errc::result_out_of_range. Otherwise, value is set to the parsed value, after rounding according to round_to_nearest (17.3.4 [round.style]), and the member ec is value-initialized.

    from_chars_result from_chars(const char* first, const char* last,
                                 see below& value, int base = 10);
    

    -2- Requires: base has a value between 2 and 36 (inclusive).

    -3- Effects: The pattern is the expected form of the subject sequence in the "C" locale for the given nonzero base, as described for strtol, except that no "0x" or "0X" prefix shall appear if the value of base is 16, and except that '-' is the only sign that may appear, and only if value has a signed type. On overflow, value is unmodified.

    […]

    from_chars_result from_chars(const char* first, const char* last, float& value,
                                 chars_format fmt = chars_format::general);
    from_chars_result from_chars(const char* first, const char* last, double& value,
                                 chars_format fmt = chars_format::general);
    from_chars_result from_chars(const char* first, const char* last, long double& value,
                                 chars_format fmt = chars_format::general);
    

    -6- Requires: fmt has the value of one of the enumerators of chars_format.

    -7- Effects: The pattern is the expected form of the subject sequence in the "C" locale, as described for strtod, except that

    1. (7.1) […]

    2. (7.2) […]

    3. (7.3) […]

    4. (7.4) […]

    In any case, the resulting value is one of at most two floating-point values closest to the value of the string matching the pattern. On overflow, value is set to plus or minus std::numeric_limits<T>::max() of the appropriate type. On underflow, value is set to a value with magnitude no greater than std::numeric_limits<T>::min().

    […]
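Under the wording proposed above, a caller could recover the overflow-vs-underflow distinction from value once ec reports result_out_of_range. The following is a sketch of that check, not behavior any current implementation is known to guarantee. `RangeKind` and `classify_out_of_range` are hypothetical names, and the function relies only on the proposed guarantees (magnitude no greater than numeric_limits<T>::min() on underflow, plus or minus numeric_limits<T>::max() on overflow).

```cpp
#include <cmath>
#include <limits>
#include <system_error>

// Hypothetical classification type for this sketch only.
enum class RangeKind { in_range, overflow, underflow };

// Interpret the (value, ec) pair that from_chars would produce under the
// proposed wording. T is expected to be float, double, or long double.
template <class T>
RangeKind classify_out_of_range(T value, std::errc ec) {
    if (ec != std::errc::result_out_of_range)
        return RangeKind::in_range;
    // Proposed wording: underflow leaves a magnitude no greater than min();
    // overflow leaves plus or minus max().
    return std::fabs(value) <= std::numeric_limits<T>::min()
               ? RangeKind::underflow
               : RangeKind::overflow;
}
```

A caller would pass the value and ec obtained from from_chars. With the unmodified-value behavior in the current wording this check is not meaningful, because value would still hold whatever the caller stored before the call.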