Issue 23: Num_get overflow result

23. Num_get overflow result

Section: 28.3.4.3.2.3 [facet.num.get.virtuals] Status: CD1 Submitter: Nathan Myers Opened: 1998-08-06 Last modified: 2016-01-28

Priority: Not Prioritized

Discussion:

The current description of numeric input does not account for the possibility of overflow. This is an implicit result of changing the description to rely on the definition of scanf() (which fails to report overflow), and conflicts with the documented behavior of traditional and current implementations.

Users expect, when reading a character sequence that results in a value unrepresentable in the specified type, to have an error reported. The standard as written does not permit this.

Further comments from Dietmar:

I don't feel comfortable with the proposed resolution to issue 23: It kind of simplifies the issue too much. Here is what is going on:

Currently, the behavior of numeric overflow is rather counterintuitive and hard to trace, so I will describe it briefly:

According to 28.3.4.3.2.3 [facet.num.get.virtuals] paragraph 11, failbit is set if scanf() would return an input error; otherwise a value is converted according to the rules of scanf().
scanf() is defined in terms of fscanf().
fscanf() returns an input failure if during conversion no character matching the conversion specification could be extracted before reaching EOF. This is the only reason for fscanf() to fail due to an input error and clearly does not apply to the case of overflow.
Thus, the conversion is performed according to the rules of fscanf(), which basically say that strtod(), strtol(), etc. are to be used for the conversion.
The strtod(), strtol(), etc. functions consume as many matching characters as there are and on overflow continue to consume matching characters but also return a value identical to the maximum (or minimum for signed types if there was a leading minus) value of the corresponding type and set errno to ERANGE.
Thus, according to the current wording in the standard, overflows can be detected! All that needs to be done is to clear errno before attempting a conversion and to check it after reading an element. With the current wording, it can also be detected whether the overflow was due to a positive or negative number for signed types.
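
The errno-based detection described above can be sketched as follows. This is only an illustration of the strtol() behavior Dietmar refers to; `ParseResult` and `parse_long` are hypothetical names introduced here, and base 10 is assumed.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical result type: the converted value plus an overflow flag.
struct ParseResult {
    long value;
    bool overflow;
};

// Hypothetical helper: converts text with strtol() and reports whether
// strtol() signalled a range error through errno.
ParseResult parse_long(const char* text) {
    assert(text != nullptr);
    errno = 0;                               // clear errno before converting
    char* end = nullptr;
    const long v = std::strtol(text, &end, 10);
    // On overflow strtol() returns LONG_MAX or LONG_MIN and sets ERANGE,
    // so the sign of the returned value tells the two cases apart.
    return ParseResult{v, errno == ERANGE};
}
```

Calling `parse_long("42")` yields the value 42 with `overflow` false, while a 30-digit input yields `LONG_MAX` (or `LONG_MIN` with a leading minus) with `overflow` true.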

Further discussion from Redmond:

The basic problem is that we've defined our behavior, including our error-reporting behavior, in terms of C90. However, C90's method of reporting overflow in scanf is not technically an "input error". The strto_* functions are more precise.

There was general consensus that failbit should be set upon overflow. We considered three options based on this:

1. Set failbit upon conversion error (including overflow), and don't store any value.
2. Set failbit upon conversion error, and also set errno to indicate the precise nature of the error.
3. Set failbit upon conversion error. If the error was due to overflow, store +-numeric_limits<T>::max() as an overflow indication.

Straw poll: (1) 5; (2) 0; (3) 8.

Discussed at Lillehammer. General outline of what we want the solution to look like: we want to say that overflow is an error, and provide a way to distinguish overflow from other kinds of errors. Choose candidate field the same way scanf does, but don't describe the rest of the process in terms of format. If a finite input field is too large (positive or negative) to be represented as a finite value, then set failbit and assign the nearest representable value. Bill will provide wording.
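
From the user's side, the direction outlined at Lillehammer would look roughly like the sketch below. It assumes a standard library that implements this resolution; `ReadResult` and `read_long` are hypothetical names used only for illustration.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Hypothetical result type: the value left in the variable after
// extraction, and whether the stream reported failure.
struct ReadResult {
    long value;
    bool failed;
};

// Hypothetical helper: extracts one long from text with operator>>.
ReadResult read_long(const std::string& text) {
    std::istringstream in(text);
    long value = -1;          // sentinel, overwritten by the extraction
    in >> value;
    // Under the resolution, overflow sets failbit and stores the nearest
    // representable value instead of leaving the variable untouched.
    return ReadResult{value, in.fail()};
}
```

With such an implementation, an input field that is too large leaves `std::numeric_limits<long>::max()` in `value` with `failed` set, which lets a caller distinguish overflow from a field that is not a number at all.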

Discussed at Toronto: N2327 is in alignment with the direction we wanted to go with in Lillehammer. Bill to work on.

Proposed resolution:

Change 28.3.4.3.2.3 [facet.num.get.virtuals], end of p3:

Stage 3: ~~The result of stage 2 processing can be one of~~ The sequence of chars accumulated in stage 2 (the field) is converted to a numeric value by the rules of one of the functions declared in the header <cstdlib>:

~~A sequence of chars has been accumulated in stage 2 that is converted (according to the rules of scanf) to a value of the type of val. This value is stored in val and ios_base::goodbit is stored in err.~~ For a signed integer value, the function strtoll.

~~The sequence of chars accumulated in stage 2 would have caused scanf to report an input failure. ios_base::failbit is assigned to err.~~ For an unsigned integer value, the function strtoull.

For a floating-point value, the function strtold.

The numeric value to be stored can be one of:

zero, if the conversion function fails to convert the entire field. ios_base::failbit is assigned to err.

the most positive representable value, if the field represents a value too large positive to be represented in val. ios_base::failbit is assigned to err.

the most negative representable value (zero for unsigned integer), if the field represents a value too large negative to be represented in val. ios_base::failbit is assigned to err.

the converted value, otherwise.

The resultant numeric value is stored in val.
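
As a rough illustration of the Stage 3 rules above for a signed integer type, the following sketch maps the outcome of strtoll() to the stored value and to err. `stage3_signed` is a hypothetical helper, not library code; it assumes base 10 (a real facet would derive the base from the stream's flags) and leaves err untouched on success.

```cpp
#include <cerrno>
#include <cstdlib>
#include <ios>
#include <limits>
#include <string>

// Hypothetical helper: converts the accumulated field to a signed integer
// type T following the four cases listed in the proposed wording.
template <typename T>
T stage3_signed(const std::string& field, std::ios_base::iostate& err) {
    errno = 0;
    char* end = nullptr;
    const long long v = std::strtoll(field.c_str(), &end, 10);

    // The conversion function failed to convert the entire field: zero.
    if (field.empty() || end != field.c_str() + field.size()) {
        err = std::ios_base::failbit;
        return 0;
    }
    // Too large positive: the most positive representable value.
    if ((errno == ERANGE && v > 0) || v > std::numeric_limits<T>::max()) {
        err = std::ios_base::failbit;
        return std::numeric_limits<T>::max();
    }
    // Too large negative: the most negative representable value.
    if ((errno == ERANGE && v < 0) || v < std::numeric_limits<T>::min()) {
        err = std::ios_base::failbit;
        return std::numeric_limits<T>::min();
    }
    // Otherwise: the converted value.
    return static_cast<T>(v);
}
```

For example, `stage3_signed<short>("70000", err)` returns the largest `short` and assigns failbit to `err`, while `stage3_signed<short>("123", err)` returns 123 and leaves `err` as the caller initialized it.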

Change 28.3.4.3.2.3 [facet.num.get.virtuals], p6-p7:

iter_type do_get(iter_type in, iter_type end, ios_base& str, 
                 ios_base::iostate& err, bool& val) const;
-6- Effects: If (str.flags()&ios_base::boolalpha)==0 then input proceeds as it would for a long except that if a value is being stored into val, the value is determined according to the following: If the value to be stored is 0 then false is stored. If the value is 1 then true is stored. Otherwise ~~err|=ios_base::failbit is performed and no value~~ true is stored~~.~~ and ios_base::failbit is assigned to err.

-7- Otherwise target sequences are determined "as if" by calling the members falsename() and truename() of the facet obtained by use_facet<numpunct<charT> >(str.getloc()). Successive characters in the range [in,end) (see 23.1.1) are obtained and matched against corresponding positions in the target sequences only as necessary to identify a unique match. The input iterator in is compared to end only when necessary to obtain a character. If ~~and only if~~ a target sequence is uniquely matched, val is set to the corresponding value. Otherwise false is stored and ios_base::failbit is assigned to err.
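
The effect of the revised p6 and p7 on bool extraction can be observed with a small sketch like the one below, assuming a standard library that implements this resolution and the default "C" locale names "true" and "false". `BoolRead` and `read_bool` are hypothetical names used only for illustration.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical result type: the bool left in the variable after
// extraction, and whether the stream reported failure.
struct BoolRead {
    bool value;
    bool failed;
};

// Hypothetical helper: extracts one bool from text, with or without
// the boolalpha flag set on the stream.
BoolRead read_bool(const std::string& text, bool alpha) {
    std::istringstream in(text);
    if (alpha) {
        in >> std::boolalpha;
    }
    // Start from the opposite of what a failed read is expected to store
    // (true without boolalpha, false with it) so the store is observable.
    bool value = alpha;
    in >> value;
    return BoolRead{value, in.fail()};
}
```

Under the proposed wording, `read_bool("2", false)` stores true and reports failure, and `read_bool("maybe", true)` stores false and reports failure; `read_bool("1", false)` and `read_bool("true", true)` both succeed with true.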