Section: 28.3.4.2.5 [locale.codecvt] Status: NAD Submitter: Martin Sebor Opened: 2002-08-30 Last modified: 2016-01-28
Priority: Not Prioritized
Discussion:
It seems that the descriptions of codecvt do_in() and do_out() leave sufficient room for interpretation so that two implementations of codecvt may not work correctly with the same filebuf. Specifically, the following seems less than adequately specified:
Finally, the conditions described at the end of 28.3.4.2.5.3 [locale.codecvt.virtuals], p4 don't seem to be possible:
"A return value of partial, if (from_next == from_end), indicates that either the destination sequence has not absorbed all the available destination elements, or that additional source elements are needed before another destination element can be produced."
If the value is partial, it's not clear to me that (from_next == from_end) could ever hold if there isn't enough room in the destination buffer. In order for (from_next == from_end) to hold, all characters in that range must have been successfully converted (according to 28.3.4.2.5.3 [locale.codecvt.virtuals], p2), and since there are no further source characters to convert, no more room in the destination buffer can be needed.
It's also not clear to me that (from_next == from_end) could ever hold if additional source elements are needed to produce another destination character (not element, as incorrectly stated in the text). partial is returned if "not all source characters have been converted" according to Table 53, which also implies that (from_next == from_end) does NOT hold.
Could it be that the intended qualifying condition was actually (from_next != from_end), i.e., that the sentence was supposed to read
"A return value of partial, if (from_next != from_end),..."
which would make perfect sense, since, as far as I understand it, partial can only occur if (from_next != from_end)?
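For illustration only, the following sketch calls in() on the classic locale's codecvt<wchar_t, char, mbstate_t> facet with a destination buffer that may be deliberately too small. ChunkResult and convert_chunk are hypothetical names introduced here; they are not part of the standard. On an implementation that reports partial when the destination fills up, from_next stops short of from_end, which is the situation argued for above. Since the exact result is what this issue calls underspecified, treat it as an experiment, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical helper type: what a single call to in() reported.
struct ChunkResult {
    std::codecvt_base::result status;
    std::size_t consumed;  // from_next - from
    std::size_t produced;  // to_next - to
};

// Hypothetical helper: convert as much of `src` as fits into
// `dst_capacity` wide characters with one call to in().
inline ChunkResult convert_chunk(const std::string& src,
                                 std::size_t dst_capacity,
                                 std::wstring& out) {
    using Facet = std::codecvt<wchar_t, char, std::mbstate_t>;
    const Facet& cvt = std::use_facet<Facet>(std::locale::classic());

    std::mbstate_t state{};
    std::wstring dst(dst_capacity, L'\0');

    const char* from = src.data();
    const char* from_end = from + src.size();
    const char* from_next = from;
    wchar_t* to = &dst[0];
    wchar_t* to_end = to + dst.size();
    wchar_t* to_next = to;

    std::codecvt_base::result status =
        cvt.in(state, from, from_end, from_next, to, to_end, to_next);

    // Sanity checks on the pointers the facet handed back.
    assert(from <= from_next && from_next <= from_end);
    assert(to <= to_next && to_next <= to_end);

    out.assign(to, to_next);
    return {status,
            static_cast<std::size_t>(from_next - from),
            static_cast<std::size_t>(to_next - to)};
}
```

Calling convert_chunk("hello", 2, out) and comparing consumed against the source length shows whether a given implementation pairs partial with (from_next != from_end) when the destination is exhausted.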
[Lillehammer: Defer for the moment, but this really needs to be fixed. Right now, the description of codecvt is too vague for it to be a useful contract between providers and clients of codecvt facets. (Note that both vendors and users can be both providers and clients of codecvt facets.) The major philosophical issue is whether the standard should only describe mappings that take a single wide character to multiple narrow characters (and vice versa), or whether it should describe fully general N-to-M conversions. When the original standard was written, only the former was contemplated, but today, in light of the popularity of UTF-8 and UTF-16, that doesn't seem sufficient for C++0x. Bill supports general N-to-M conversions; we need to make sure Martin and Howard agree.]
[ 2009-07 Frankfurt ]
codecvt is meant to be a 1-to-N or N-to-1 conversion. It does not work well for N-to-M conversions. wbuffer_convert now exists, and handles N-to-M cases. Also, there is a new specialization of codecvt that permits UTF-16 <-> UTF-8 conversions.
NAD without prejudice. Will reopen if proposed resolution is supplied.
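As a rough illustration of the N-to-M case mentioned in the Frankfurt note, the sketch below feeds a four-byte UTF-8 sequence to the codecvt<char16_t, char, mbstate_t> specialization, which is expected to yield two UTF-16 code units (a surrogate pair). utf8_to_utf16 is a hypothetical helper name, not a standard function. The destination is sized generously so the question of partial does not arise. Note that this specialization was later deprecated in C++20, so a compiler may emit a deprecation warning.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units with a
// single call to in() on the classic locale's facet.
inline std::codecvt_base::result utf8_to_utf16(const std::string& src,
                                               std::u16string& out) {
    using Facet = std::codecvt<char16_t, char, std::mbstate_t>;
    const Facet& cvt = std::use_facet<Facet>(std::locale::classic());

    std::mbstate_t state{};
    // UTF-16 never needs more code units than the UTF-8 input has bytes,
    // so this capacity is always sufficient.
    std::u16string dst(src.size(), u'\0');

    const char* from = src.data();
    const char* from_end = from + src.size();
    const char* from_next = from;
    char16_t* to = &dst[0];
    char16_t* to_end = to + dst.size();
    char16_t* to_next = to;

    std::codecvt_base::result status =
        cvt.in(state, from, from_end, from_next, to, to_end, to_next);

    // Sanity check on the output pointer the facet handed back.
    assert(to <= to_next && to_next <= to_end);

    out.assign(to, to_next);
    return status;
}
```

Passing the UTF-8 encoding of U+1F600 ("\xF0\x9F\x98\x80") gives a case where four source elements map to two destination elements, which is neither 1-to-N nor N-to-1.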
Proposed resolution: