Section: 28.3.4.2.6 [locale.codecvt.byname] Status: NAD Submitter: Gregory Bumgardner Opened: 2001-01-25 Last modified: 2016-01-28
Priority: Not Prioritized
Discussion:
The effects of codecvt<>::do_length() are described in 22.2.1.5.2, paragraph 10. As implied by that paragraph, and clarified in issue 75, codecvt<>::do_length() must process the source data and update the stateT argument just as if the data had been processed by codecvt<>::in().
However, the standard does not specify how do_length() would report a translation failure, should the source sequence contain untranslatable or illegal character sequences.
The other conversion methods return an "error" result value to indicate that an untranslatable character has been encountered, but do_length() already has a return value (the number of source characters that have been processed by the method).
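The gap described above can be illustrated with a minimal sketch. It assumes a hypothetical facet, AsciiOnlyCodecvt, that accepts 7-bit ASCII and treats any byte >= 0x80 as invalid; the facet and the helpers convert_with_in(), measure_with_length() and demo_ambiguous_length() are invented for this example and are not part of the standard library. With in(), the caller gets both an "error" result and the position of the offending byte. With length(), the caller gets only a count, and the same count can also mean that max was reached.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>

// Hypothetical facet for illustration: 7-bit ASCII only; bytes >= 0x80 are invalid.
class AsciiOnlyCodecvt : public std::codecvt<wchar_t, char, std::mbstate_t> {
protected:
    result do_in(state_type&, const char* from, const char* from_end,
                 const char*& from_next, wchar_t* to, wchar_t* to_end,
                 wchar_t*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_next != to_end) {
            unsigned char c = static_cast<unsigned char>(*from_next);
            if (c >= 0x80) return error;  // from_next marks the offending byte
            *to_next++ = static_cast<wchar_t>(c);
            ++from_next;
        }
        return from_next == from_end ? ok : partial;
    }

    int do_length(state_type&, const char* from, const char* from_end,
                  std::size_t max) const override {
        const char* p = from;
        // Stops silently at an invalid byte, at from_end, or after max characters.
        while (p != from_end && max > 0 && static_cast<unsigned char>(*p) < 0x80) {
            ++p;
            --max;
        }
        return static_cast<int>(p - from);
    }
};

struct Outcome {
    std::codecvt_base::result result;
    std::ptrdiff_t consumed;
};

// Converts with in(): the failure is visible in the result code.
Outcome convert_with_in(const char* first, const char* last) {
    AsciiOnlyCodecvt cvt;
    std::mbstate_t state = std::mbstate_t();
    wchar_t buf[64];
    const char* from_next = first;
    wchar_t* to_next = buf;
    std::codecvt_base::result r =
        cvt.in(state, first, last, from_next, buf, buf + 64, to_next);
    return {r, from_next - first};
}

// Measures with length(): only a count comes back.
int measure_with_length(const char* first, const char* last, std::size_t max) {
    AsciiOnlyCodecvt cvt;
    std::mbstate_t state = std::mbstate_t();
    return cvt.length(state, first, last, max);
}

void demo_ambiguous_length() {
    const char bad[] = "abc\x80" "def";  // 7 bytes plus NUL; byte at index 3 is invalid
    const char good[] = "abcdef";
    assert(convert_with_in(bad, bad + 7).result == std::codecvt_base::error);
    assert(convert_with_in(bad, bad + 7).consumed == 3);
    assert(measure_with_length(bad, bad + 7, 64) == 3);   // stopped at the invalid byte
    assert(measure_with_length(good, good + 6, 3) == 3);  // stopped because max was reached
}
```

In this sketch the two length() calls return the same value for different reasons, which is the ambiguity the submitter describes.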
Proposed resolution:
This issue cannot be resolved without modifying the interface. An exception cannot be used, as there would be no way to determine how many characters have been processed and the state object would be left in an indeterminate state.
A source-compatible solution involves adding a fifth argument to length() and do_length() that could be used to return the position of the offending character sequence. This argument would have a default value that would allow it to be ignored:
int length(stateT& state, const externT* from, const externT* from_end, size_t max, const externT** from_next = 0);
virtual int do_length(stateT& state, const externT* from, const externT* from_end, size_t max, const externT** from_next);
An exception could then be used to report any translation error, and the from_next argument, if supplied, could be used to retrieve the location of the offending character sequence.
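The shape of this proposal can be approximated today without changing the facet interface. The following is only a sketch under stated assumptions: length_with_position() is a hypothetical free function (not a standard member), it is built on in() with a scratch buffer, it reports the stopping position through an optional from_next pointer, and it throws std::runtime_error on a translation error. AsciiOnlyCodecvt is again an invented facet that rejects bytes >= 0x80, included only so the behaviour is deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>

typedef std::codecvt<wchar_t, char, std::mbstate_t> Facet;

// Hypothetical facet for illustration: 7-bit ASCII only; bytes >= 0x80 are invalid.
class AsciiOnlyCodecvt : public Facet {
protected:
    result do_in(state_type&, const char* from, const char* from_end,
                 const char*& from_next, wchar_t* to, wchar_t* to_end,
                 wchar_t*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_next != to_end) {
            unsigned char c = static_cast<unsigned char>(*from_next);
            if (c >= 0x80) return error;
            *to_next++ = static_cast<wchar_t>(c);
            ++from_next;
        }
        return from_next == from_end ? ok : partial;
    }
};

// Hypothetical helper mirroring the proposed five-argument length().
// Returns the number of external characters consumed; on a translation error it
// stores the offending position in *from_next (if supplied) and throws.
int length_with_position(const Facet& cvt, std::mbstate_t& state,
                         const char* from, const char* from_end, std::size_t max,
                         const char** from_next = nullptr) {
    wchar_t scratch[32];
    const char* pos = from;
    while (pos != from_end && max > 0) {
        std::size_t room = max < 32 ? max : 32;  // never produce more than max characters
        const char* next = pos;
        wchar_t* to_next = scratch;
        std::codecvt_base::result r =
            cvt.in(state, pos, from_end, next, scratch, scratch + room, to_next);
        max -= static_cast<std::size_t>(to_next - scratch);
        bool stalled = (next == pos);  // no progress, e.g. an incomplete trailing sequence
        pos = next;
        if (r == std::codecvt_base::error) {
            if (from_next) *from_next = pos;
            throw std::runtime_error("untranslatable external character sequence");
        }
        if (r == std::codecvt_base::noconv || stalled) break;
    }
    if (from_next) *from_next = pos;
    return static_cast<int>(pos - from);
}

void demo_length_with_position() {
    AsciiOnlyCodecvt cvt;
    const char bad[] = "abc\x80" "def";  // byte at index 3 is invalid
    std::mbstate_t state = std::mbstate_t();
    const char* where = nullptr;
    bool threw = false;
    try {
        length_with_position(cvt, state, bad, bad + 7, 64, &where);
    } catch (const std::runtime_error&) {
        threw = true;
    }
    assert(threw);
    assert(where == bad + 3);  // the caller can still locate the offending sequence
}
```

Because the position is written before the exception is thrown, the caller keeps the information that a bare exception would lose, which is the point of the extra argument in the proposal.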
Rationale:
The standard is already clear: the return value is the number of "valid complete characters". If do_length() encounters an invalid sequence of external characters, it stops.