28 Localization library [localization]

28.4 Standard locale categories [locale.categories]

28.4.2 The ctype category [category.ctype]

28.4.2.5 Class template codecvt [locale.codecvt]

28.4.2.5.1 General [locale.codecvt.general]

namespace std {
  class codecvt_base {
  public:
    enum result { ok, partial, error, noconv };
  };

  template<class internT, class externT, class stateT>
    class codecvt : public locale::facet, public codecvt_base {
    public:
      using intern_type = internT;
      using extern_type = externT;
      using state_type  = stateT;

      explicit codecvt(size_t refs = 0);

      result out(
        stateT& state,
        const internT* from, const internT* from_end, const internT*& from_next,
              externT*   to,       externT*   to_end,       externT*&   to_next) const;

      result unshift(
        stateT& state,
              externT*   to,       externT*   to_end,       externT*&   to_next) const;

      result in(
        stateT& state,
        const externT* from, const externT* from_end, const externT*& from_next,
              internT*   to,       internT*   to_end,       internT*&   to_next) const;

      int encoding() const noexcept;
      bool always_noconv() const noexcept;
      int length(stateT&, const externT* from, const externT* end, size_t max) const;
      int max_length() const noexcept;

      static locale::id id;

    protected:
      ~codecvt();
      virtual result do_out(
        stateT& state,
        const internT* from, const internT* from_end, const internT*& from_next,
              externT* to,         externT*   to_end,       externT*&   to_next) const;
      virtual result do_in(
        stateT& state,
        const externT* from, const externT* from_end, const externT*& from_next,
              internT* to,         internT*   to_end,       internT*&   to_next) const;
      virtual result do_unshift(
        stateT& state,
              externT* to,         externT*   to_end,       externT*&   to_next) const;

      virtual int do_encoding() const noexcept;
      virtual bool do_always_noconv() const noexcept;
      virtual int do_length(stateT&, const externT* from, const externT* end, size_t max) const;
      virtual int do_max_length() const noexcept;
    };
}
The class codecvt<internT, externT, stateT> is for use when converting from one character encoding to another, such as from wide characters to multibyte characters or between wide character encodings such as UTF-32 and EUC.
The stateT argument selects the pair of character encodings being mapped between.
The specializations required in Table 102 ([locale.category]) convert the implementation-defined native character set.
codecvt<char, char, mbstate_t> implements a degenerate conversion; it does not convert at all.
The specialization codecvt<char16_t, char8_t, mbstate_t> converts between the UTF-16 and UTF-8 encoding forms, and the specialization codecvt<char32_t, char8_t, mbstate_t> converts between the UTF-32 and UTF-8 encoding forms.
codecvt<wchar_t, char, mbstate_t> converts between the native character sets for ordinary and wide characters.
Specializations on mbstate_t perform conversion between encodings known to the library implementer.
Other encodings can be converted by specializing on a program-defined stateT type.
Objects of type stateT can contain any state that is useful to communicate to or from the specialized do_in or do_out members.
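The degenerate codecvt<char, char, mbstate_t> case can be observed directly. The following is a minimal sketch, not normative text; the names noconv_report and probe_char_codecvt are hypothetical helpers introduced only for illustration. It obtains the facet from the classic locale, because the codecvt destructor is protected, and records what in() reports.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>

// Hypothetical helper type: what we observe from the char/char facet.
struct noconv_report {
    bool always_noconv;               // value of always_noconv()
    std::codecvt_base::result result; // value returned by in()
    bool to_next_unmoved;             // true if to_next == to after the call
    bool output_untouched;            // true if the destination was not written
};

// Hypothetical helper: calls in() on the facet held by the classic locale.
inline noconv_report probe_char_codecvt() {
    using facet_t = std::codecvt<char, char, std::mbstate_t>;
    const facet_t& f = std::use_facet<facet_t>(std::locale::classic());

    const char src[] = "abc";
    char dst[4] = {'?', '?', '?', '?'};
    std::mbstate_t state{};
    const char* from_next = nullptr;
    char* to_next = nullptr;

    const std::codecvt_base::result r =
        f.in(state, src, src + 3, from_next, dst, dst + 4, to_next);
    return {f.always_noconv(), r, to_next == dst, dst[0] == '?'};
}
```

A caller that sees noconv is expected to use the source range as the converted sequence, since nothing was written to the destination.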

28.4.2.5.2 Members [locale.codecvt.members]

result out(stateT& state, const internT* from, const internT* from_end, const internT*& from_next, externT* to, externT* to_end, externT*& to_next) const;
Returns: do_out(state, from, from_end, from_next, to, to_end, to_next).
result unshift(stateT& state, externT* to, externT* to_end, externT*& to_next) const;
Returns: do_unshift(state, to, to_end, to_next).
result in(stateT& state, const externT* from, const externT* from_end, const externT*& from_next, internT* to, internT* to_end, internT*& to_next) const;
Returns: do_in(state, from, from_end, from_next, to, to_end, to_next).
int encoding() const noexcept;
Returns: do_encoding().
bool always_noconv() const noexcept;
Returns: do_always_noconv().
int length(stateT& state, const externT* from, const externT* from_end, size_t max) const;
Returns: do_length(state, from, from_end, max).
int max_length() const noexcept;
Returns: do_max_length().
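Each public member above simply forwards to the protected virtual function of the matching name. The sketch below illustrates that forwarding under stated assumptions: probe_codecvt, public_max_length, and public_encoding are hypothetical names, and the values 3 and 0 are arbitrary markers that do not describe a real encoding.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>

// Hypothetical facet used only to observe the forwarding. The returned
// values are arbitrary markers, not properties of a real encoding.
struct probe_codecvt : std::codecvt<char, char, std::mbstate_t> {
protected:
    int do_max_length() const noexcept override { return 3; }
    int do_encoding() const noexcept override { return 0; }
};

// Calls the public, non-virtual max_length() through a base-class reference.
inline int public_max_length(const std::codecvt<char, char, std::mbstate_t>& f) {
    return f.max_length();
}

// Calls the public, non-virtual encoding() through a base-class reference.
inline int public_encoding(const std::codecvt<char, char, std::mbstate_t>& f) {
    return f.encoding();
}
```

Passing a probe_codecvt object to either helper yields the overridden value, which is the behavior the Returns: clauses describe.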

28.4.2.5.3 Virtual functions [locale.codecvt.virtuals]

result do_out(stateT& state, const internT* from, const internT* from_end, const internT*& from_next, externT* to, externT* to_end, externT*& to_next) const;
result do_in(stateT& state, const externT* from, const externT* from_end, const externT*& from_next, internT* to, internT* to_end, internT*& to_next) const;
Preconditions: (from <= from_end && to <= to_end) is well-defined and true; state is initialized, if at the beginning of a sequence, or else is equal to the result of converting the preceding characters in the sequence.
Effects: Translates characters in the source range [from, from_end), placing the results in sequential positions starting at destination to.
Converts no more than (from_end - from) source elements, and stores no more than (to_end - to) destination elements.
Stops if it encounters a character it cannot convert.
It always leaves the from_next and to_next pointers pointing one beyond the last element successfully converted.
If it returns noconv, internT and externT are the same type, and the converted sequence is identical to the input sequence [from, from_next).
In that case, to_next is set equal to to, the value of state is unchanged, and there are no changes to the values in [to, to_end).
A codecvt facet that is used by basic_filebuf ([file.streams]) shall have the property that if do_out(state, from, from_end, from_next, to, to_end, to_next) would return ok, where from != from_end, then do_out(state, from, from + 1, from_next, to, to_end, to_next) shall also return ok, and that if do_in(state, from, from_end, from_next, to, to_end, to_next) would return ok, where to != to_end, then do_in(state, from, from_end, from_next, to, to + 1, to_next) shall also return ok. [Footnote 269]
[Note 1:
As a result of operations on state, it can return ok or partial and set from_next == from and to_next != to.
— end note]
Remarks: Its operations on state are unspecified.
[Note 2:
This argument can be used, for example, to maintain shift state, to specify conversion options (such as count only), or to identify a cache of seek offsets.
— end note]
Returns: An enumeration value, as summarized in Table 104.
Table 104: do_in/do_out result values [tab:locale.codecvt.inout]
Value     Meaning
ok        completed the conversion
partial   not all source characters converted
error     encountered a character in [from, from_end) that it could not convert
noconv    internT and externT are the same type, and input sequence is identical to converted sequence
A return value of partial, if (from_next == from_end), indicates that either the destination sequence has not absorbed all the available destination elements, or that additional source elements are needed before another destination element can be produced.
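The pointer and result protocol of do_out can be illustrated with a small program-defined facet. This is a sketch under stated assumptions, not an encoding the library provides: upper_codecvt, out_report, and convert_out are hypothetical names, and the "external encoding" is simply the internal text with ASCII lowercase letters upper-cased. It stops when the destination is full and reports partial in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical facet: external form = internal form with a-z upper-cased.
// Assumes an ASCII-compatible execution character set.
class upper_codecvt : public std::codecvt<char, char, std::mbstate_t> {
protected:
    result do_out(std::mbstate_t&, const char* from, const char* from_end,
                  const char*& from_next, char* to, char* to_end,
                  char*& to_next) const override {
        from_next = from;
        to_next = to;
        // Convert one element at a time until the source is exhausted
        // or the destination is full.
        while (from_next != from_end && to_next != to_end) {
            const char c = *from_next++;
            *to_next++ = (c >= 'a' && c <= 'z')
                             ? static_cast<char>(c - 'a' + 'A')
                             : c;
        }
        // ok: every source element was converted; partial: some remain.
        return from_next == from_end ? ok : partial;
    }
    bool do_always_noconv() const noexcept override { return false; }
};

// Hypothetical helper type describing one call to out().
struct out_report {
    std::codecvt_base::result result;
    std::size_t consumed;  // from_next - from
    std::size_t produced;  // to_next - to
    std::string text;      // the elements actually stored
};

// Hypothetical helper: converts src into a buffer of `capacity` elements.
inline out_report convert_out(const std::string& src, std::size_t capacity) {
    upper_codecvt f;
    std::mbstate_t state{};
    std::string dst(capacity, '\0');
    const char* from_next = nullptr;
    char* to_next = nullptr;
    const std::codecvt_base::result r =
        f.out(state, src.data(), src.data() + src.size(), from_next,
              &dst[0], &dst[0] + dst.size(), to_next);
    const std::size_t produced = static_cast<std::size_t>(to_next - &dst[0]);
    return {r, static_cast<std::size_t>(from_next - src.data()), produced,
            dst.substr(0, produced)};
}
```

With a four-element source and a two-element destination, the call reports partial with both pointers advanced by two; a caller would typically flush the destination and call out() again starting at from_next.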
result do_unshift(stateT& state, externT* to, externT* to_end, externT*& to_next) const;
Preconditions: (to <= to_end) is well-defined and true; state is initialized, if at the beginning of a sequence, or else is equal to the result of converting the preceding characters in the sequence.
Effects: Places characters starting at to that should be appended to terminate a sequence when the current stateT is given by state. [Footnote 270]
Stores no more than (to_end - to) destination elements, and leaves the to_next pointer pointing one beyond the last element successfully stored.
Returns: An enumeration value, as summarized in Table 105.
Table 105: do_unshift result values [tab:locale.codecvt.unshift]
Value     Meaning
ok        completed the sequence
partial   space for more than to_end - to destination elements was needed to terminate a sequence given the value of state
error     an unspecified error has occurred
noconv    no termination is needed for this state_type
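For a stateless encoding there is nothing to append, which can be checked with the char/char facet. The sketch below is illustrative only; unshift_report and probe_unshift are hypothetical names. The text above does not state which value this specialization returns, so the code records the result rather than assuming it. Common implementations are expected to report noconv here, but that is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>

// Hypothetical helper type describing one call to unshift().
struct unshift_report {
    std::codecvt_base::result result;
    std::size_t stored;  // to_next - to
};

// Hypothetical helper: asks the char/char facet of the classic locale
// for a terminating sequence from an initial conversion state.
inline unshift_report probe_unshift() {
    using facet_t = std::codecvt<char, char, std::mbstate_t>;
    const facet_t& f = std::use_facet<facet_t>(std::locale::classic());

    char buf[8] = {};
    std::mbstate_t state{};
    char* to_next = nullptr;
    const std::codecvt_base::result r = f.unshift(state, buf, buf + 8, to_next);
    return {r, static_cast<std::size_t>(to_next - buf)};
}
```

A stream buffer would typically call unshift before a seek or close, writing out [to, to_next) when the result is ok and nothing when it is noconv.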
int do_encoding() const noexcept;
Returns: -1 if the encoding of the externT sequence is state-dependent; else the constant number of externT characters needed to produce an internal character; or 0 if this number is not a constant. [Footnote 271]
bool do_always_noconv() const noexcept;
Returns: true if do_in() and do_out() return noconv for all valid argument values.
codecvt<char, char, mbstate_t> returns true.
int do_length(stateT& state, const externT* from, const externT* from_end, size_t max) const;
Preconditions: (from <= from_end) is well-defined and true; state is initialized, if at the beginning of a sequence, or else is equal to the result of converting the preceding characters in the sequence.
Effects: The effect on the state argument is as if it called do_in(state, from, from_end, from, to, to+max, to) for to pointing to a buffer of at least max elements.
Returns: (from_next-from) where from_next is the largest value in the range [from, from_end] such that the sequence of values in the range [from, from_next) represents max or fewer valid complete characters of type internT.
The specialization codecvt<char, char, mbstate_t> returns the lesser of max and (from_end-from).
int do_max_length() const noexcept;
Returns: The maximum value that do_length(state, from, from_end, 1) can return for any valid range [from, from_end) and stateT value state.
The specialization codecvt<char, char, mbstate_t>::do_max_length() returns 1.
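The values stated for the codecvt<char, char, mbstate_t> specialization can be checked with a short sketch. The helper names probe_length and probe_max_length are hypothetical and introduced only for illustration; the facet is taken from the classic locale.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>

using char_codecvt = std::codecvt<char, char, std::mbstate_t>;

// Hypothetical helper: length() over the first n elements of text,
// limited to at most max internal characters.
inline int probe_length(const char* text, std::size_t n, std::size_t max) {
    const char_codecvt& f =
        std::use_facet<char_codecvt>(std::locale::classic());
    std::mbstate_t state{};
    return f.length(state, text, text + n, max);
}

// Hypothetical helper: max_length() of the same facet.
inline int probe_max_length() {
    const char_codecvt& f =
        std::use_facet<char_codecvt>(std::locale::classic());
    return f.max_length();
}
```

For a five-element source, a limit of 3 yields 3 and a limit of 10 yields 5, matching "the lesser of max and (from_end-from)".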
Footnote 269: Informally, this means that basic_filebuf assumes that the mapping from internal to external characters is 1 to N: that a codecvt facet that is used by basic_filebuf can translate characters one internal character at a time.
Footnote 270: Typically these will be characters to return the state to stateT().
Footnote 271: If encoding() yields -1, then more than max_length() externT elements can be consumed when producing a single internT character, and additional externT elements can appear at the end of a sequence after those that yield the final internT character.