wstring_convert provides no indication of incomplete input or output
Section: 99 [depr.conversions.string] Status: NAD Submitter: PowerGamer Opened: 2017-01-08 Last modified: 2017-06-05
Priority: 3
Discussion:
Example:
// Input UTF-16 string is incomplete - only first half of
// UTF-16 surrogate pair L"\xD843\xDEF9":
wchar_t in_utf16[] = L"\xD843";
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> cvt;
auto out_utf8 = cvt.to_bytes(in_utf16);
// No error.
There is no indication that the input was incomplete (the value returned by cvt.state() is not documented and so cannot be examined by the user for that purpose). As such, the user will not know that more input data should be provided in an additional call to cvt.to_bytes().
The conversion leaves "\xF0" in out_utf8. Again, std::wstring_convert provides no indication that the output produced is incomplete.
IMO this makes std::wstring_convert in its current state completely useless (it cannot be relied upon to either produce a complete and valid UTF sequence or throw an error in all situations).
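Since the facility itself does not report truncation, a caller would have to inspect the result. The following is only a sketch of such a check, not anything the standard library provides: ends_with_truncated_utf8 is a hypothetical helper name, and it looks only at the tail of the string rather than validating the whole sequence. The byte "\xF0" is the lead byte of the four-byte UTF-8 encoding F0 A0 BB B9 of U+20EF9, the code point that the surrogate pair D843 DEF9 represents.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the standard library): returns true if
// the string ends in the middle of a multi-byte UTF-8 sequence. It inspects
// only the tail and does not validate the rest of the string.
bool ends_with_truncated_utf8(const std::string& s) {
    // Walk back over at most three trailing continuation bytes (10xxxxxx).
    std::size_t i = s.size();
    std::size_t continuation = 0;
    while (i > 0 && continuation < 3 &&
           (static_cast<unsigned char>(s[i - 1]) & 0xC0) == 0x80) {
        --i;
        ++continuation;
    }
    if (i == 0) {
        return false;  // empty, or no lead byte at all (invalid, not truncated)
    }

    // Number of continuation bytes announced by the lead byte.
    const unsigned char lead = static_cast<unsigned char>(s[i - 1]);
    std::size_t expected = 0;
    if ((lead & 0x80) == 0x00) {
        expected = 0;  // single-byte (ASCII) character
    } else if ((lead & 0xE0) == 0xC0) {
        expected = 1;
    } else if ((lead & 0xF0) == 0xE0) {
        expected = 2;
    } else if ((lead & 0xF8) == 0xF0) {
        expected = 3;
    } else {
        return false;  // not a lead byte: a validity problem, not truncation
    }
    return continuation < expected;
}
```

With this helper, the "\xF0" result from the example would be flagged as truncated, while the complete sequence "\xF0\xA0\xBB\xB9" would not.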
Imagine a file has UTF-16 encoded text. You want to read all the data from the file at once and convert it into UTF-8 using std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>>.
Now, if the file contains completely invalid UTF-16 (for example, forbidden or incorrectly encoded Unicode code points), you will get an exception from std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>>.
But if the file contains incomplete (but in all other regards valid) UTF-16 (for example, the file ends with only the first half of a valid surrogate pair), you will get neither an exception from std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> nor any indication that the input provided to it was incomplete.
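For the truncated-file scenario above, one possible guard is to check the input before converting it. This is a minimal sketch under stated assumptions: the data is held as UTF-16 code units in a std::u16string, and ends_with_unpaired_high_surrogate is a hypothetical helper name, not a standard library function. It only detects a lone high surrogate at the very end of the input.

```cpp
#include <string>

// Hypothetical helper (not part of the standard library): returns true if
// the UTF-16 string ends with a high surrogate (0xD800..0xDBFF), which is
// only the first half of a surrogate pair.
bool ends_with_unpaired_high_surrogate(const std::u16string& s) {
    if (s.empty()) {
        return false;
    }
    const char16_t last = s.back();
    return last >= 0xD800 && last <= 0xDBFF;
}
```

A caller could run this check on the file contents and treat a true result as "more input required" or as an error, before passing the data to any converter.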
[2017-01-27 Telecon]
Priority 3; send to LEWG
[2017-02 in Kona, LEWG recommends NAD]
[2017-06-02 Issues Telecon]
This facility has a number of known problems, including poor error handling. The feature has been deprecated, and the plan is to replace it with better facilities with a better API.
Resolve as NAD
Proposed resolution: