28 Localization library [localization]

28.4 Standard locale categories [locale.categories]

28.4.1 The ctype category [category.ctype]

28.4.1.4 Class template codecvt [locale.codecvt]

namespace std {
  class codecvt_base {
  public:
    enum result { ok, partial, error, noconv };
  };

  template<class internT, class externT, class stateT>
    class codecvt : public locale::facet, public codecvt_base {
    public:
      using intern_type = internT;
      using extern_type = externT;
      using state_type  = stateT;

      explicit codecvt(size_t refs = 0);

      result out(
          stateT& state,
          const internT* from, const internT* from_end, const internT*& from_next,
                externT*   to,       externT*   to_end,       externT*&   to_next) const;

      result unshift(
          stateT& state,
                externT*    to,      externT*   to_end,       externT*&   to_next) const;

      result in(
          stateT& state,
          const externT* from, const externT* from_end, const externT*& from_next,
                internT*   to,       internT*   to_end,       internT*&   to_next) const;

      int encoding() const noexcept;
      bool always_noconv() const noexcept;
      int length(stateT&, const externT* from, const externT* end, size_t max) const;
      int max_length() const noexcept;

      static locale::id id;

    protected:
      ~codecvt();
      virtual result do_out(
          stateT& state,
          const internT* from, const internT* from_end, const internT*& from_next,
                externT* to,         externT*   to_end,       externT*&   to_next) const;
      virtual result do_in(
          stateT& state,
          const externT* from, const externT* from_end, const externT*& from_next,
                internT* to,         internT*   to_end,       internT*&   to_next) const;
      virtual result do_unshift(
          stateT& state,
                externT* to,         externT*   to_end,       externT*&   to_next) const;

      virtual int do_encoding() const noexcept;
      virtual bool do_always_noconv() const noexcept;
      virtual int do_length(stateT&, const externT* from, const externT* end, size_t max) const;
      virtual int do_max_length() const noexcept;
    };
}
The class codecvt<internT, externT, stateT> is for use when converting from one character encoding to another, such as from wide characters to multibyte characters or between wide character encodings such as UTF-32 and EUC.
The stateT argument selects the pair of character encodings being mapped between.
The specializations required in Table 102 ([locale.category]) convert the implementation-defined native character set.
codecvt<char, char, mbstate_t> implements a degenerate conversion; it does not convert at all.
The specialization codecvt<char16_t, char8_t, mbstate_t> converts between the UTF-16 and UTF-8 encoding forms, and the specialization codecvt<char32_t, char8_t, mbstate_t> converts between the UTF-32 and UTF-8 encoding forms.
codecvt<wchar_t, char, mbstate_t> converts between the native character sets for ordinary and wide characters.
Specializations on mbstate_t perform conversion between encodings known to the library implementer.
Other encodings can be converted by specializing on a program-defined stateT type.
Objects of type stateT can contain any state that is useful to communicate to or from the specialized do_in or do_out members.
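As an illustration of the codecvt<wchar_t, char, mbstate_t> specialization, the following sketch obtains the facet from a locale with use_facet and calls in() to widen a narrow string. The helper name to_wide is hypothetical, and the sketch assumes the classic "C" locale, in which only ASCII input should be expected to convert.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the library): widens a narrow string
// with the codecvt<wchar_t, char, mbstate_t> facet of `loc`.
std::wstring to_wide(const std::string& narrow,
                     const std::locale& loc = std::locale::classic()) {
    using facet_type = std::codecvt<wchar_t, char, std::mbstate_t>;
    const facet_type& cvt = std::use_facet<facet_type>(loc);

    std::mbstate_t state{};                   // initial conversion state
    std::wstring wide(narrow.size(), L'\0');  // assumes at most one wchar_t per char
    const char* from = narrow.data();
    const char* from_next = from;
    wchar_t* to = &wide[0];
    wchar_t* to_next = to;

    facet_type::result r = cvt.in(state, from, from + narrow.size(), from_next,
                                  to, to + wide.size(), to_next);
    if (r == std::codecvt_base::error)
        throw std::runtime_error("to_wide: unconvertible input");

    assert(to_next <= to + wide.size());      // never stores past to_end
    wide.resize(static_cast<std::size_t>(to_next - to));  // keep what was converted
    return wide;
}
```

A call such as to_wide("abc") is expected to yield L"abc". If in() reports partial, the sketch returns only the characters converted so far.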

28.4.1.4.1 Members [locale.codecvt.members]

result out(stateT& state, const internT* from, const internT* from_end, const internT*& from_next, externT* to, externT* to_end, externT*& to_next) const;
Returns: do_out(state, from, from_end, from_next, to, to_end, to_next).
result unshift(stateT& state, externT* to, externT* to_end, externT*& to_next) const;
Returns: do_unshift(state, to, to_end, to_next).
result in(stateT& state, const externT* from, const externT* from_end, const externT*& from_next, internT* to, internT* to_end, internT*& to_next) const;
Returns: do_in(state, from, from_end, from_next, to, to_end, to_next).
int encoding() const noexcept;
Returns: do_encoding().
bool always_noconv() const noexcept;
Returns: do_always_noconv().
int length(stateT& state, const externT* from, const externT* from_end, size_t max) const;
Returns: do_length(state, from, from_end, max).
int max_length() const noexcept;
Returns: do_max_length().
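Because each public member simply returns the result of the corresponding protected virtual, a derived facet customizes behavior by overriding the do_ functions only. The sketch below makes that forwarding visible; the class counting_codecvt and the function count_do_out_calls are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>

// Hypothetical facet for illustration: counts how often do_out is reached.
class counting_codecvt : public std::codecvt<char, char, std::mbstate_t> {
    using base = std::codecvt<char, char, std::mbstate_t>;

public:
    explicit counting_codecvt(std::size_t refs = 0) : base(refs) {}
    mutable int out_calls = 0;

protected:
    result do_out(std::mbstate_t& state,
                  const char* from, const char* from_end, const char*& from_next,
                  char* to, char* to_end, char*& to_next) const override {
        ++out_calls;
        return base::do_out(state, from, from_end, from_next, to, to_end, to_next);
    }
};

// Calls the public out() member twice and reports how many times the
// do_out override ran.
int count_do_out_calls() {
    counting_codecvt cvt(1);  // refs = 1: the object is owned by this scope
    assert(cvt.out_calls == 0);

    std::mbstate_t state{};
    const char src[] = {'a', 'b'};
    char dst[2];
    const char* from_next = nullptr;
    char* to_next = nullptr;
    cvt.out(state, src, src + 2, from_next, dst, dst + 2, to_next);
    cvt.out(state, src, src + 2, from_next, dst, dst + 2, to_next);
    return cvt.out_calls;
}
```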

28.4.1.4.2 Virtual functions [locale.codecvt.virtuals]

result do_out(stateT& state, const internT* from, const internT* from_end, const internT*& from_next, externT* to, externT* to_end, externT*& to_next) const;
result do_in(stateT& state, const externT* from, const externT* from_end, const externT*& from_next, internT* to, internT* to_end, internT*& to_next) const;
Preconditions: (from <= from_end && to <= to_end) is well-defined and true; state is initialized, if at the beginning of a sequence, or else is equal to the result of converting the preceding characters in the sequence.
Effects: Translates characters in the source range [from, from_end), placing the results in sequential positions starting at destination to.
Converts no more than (from_end - from) source elements, and stores no more than (to_end - to) destination elements.
Stops if it encounters a character it cannot convert.
It always leaves the from_next and to_next pointers pointing one beyond the last element successfully converted.
If it returns noconv, internT and externT are the same type and the converted sequence is identical to the input sequence [from, from_next).
In that case, to_next is set equal to to, the value of state is unchanged, and there are no changes to the values in [to, to_end).
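The noconv case can be observed with the degenerate codecvt<char, char, mbstate_t> facet. The following sketch (the function name out_through_identity_facet is hypothetical) calls out() and checks that the destination range is left untouched, so a caller is expected to use the source range directly.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>

// Hypothetical probe: calls out() on codecvt<char, char, mbstate_t> and
// returns the reported result.
std::codecvt_base::result out_through_identity_facet() {
    using facet_type = std::codecvt<char, char, std::mbstate_t>;
    const facet_type& cvt = std::use_facet<facet_type>(std::locale::classic());

    std::mbstate_t state{};
    const char src[] = {'a', 'b', 'c'};
    char dst[3] = {'x', 'y', 'z'};
    const char* from_next = nullptr;
    char* to_next = nullptr;

    facet_type::result r =
        cvt.out(state, src, src + 3, from_next, dst, dst + 3, to_next);

    if (r == std::codecvt_base::noconv) {
        assert(to_next == dst);  // to_next is set equal to to
        assert(dst[0] == 'x' && dst[1] == 'y' && dst[2] == 'z');  // [to, to_end) unchanged
    }
    return r;
}
```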
A codecvt facet that is used by basic_filebuf ([file.streams]) shall have the property that if
do_out(state, from, from_end, from_next, to, to_end, to_next)
would return ok, where from != from_end, then
do_out(state, from, from + 1, from_next, to, to_end, to_next)
shall also return ok, and that if
do_in(state, from, from_end, from_next, to, to_end, to_next)
would return ok, where to != to_end, then
do_in(state, from, from_end, from_next, to, to + 1, to_next)
shall also return ok.[263]
[Note: As a result of operations on state, it can return ok or partial and set from_next == from and to_next != to. -- end note]
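The one-character-at-a-time property required for use with basic_filebuf can be illustrated with a facet whose do_out maps each internal character to exactly one external character. This is a minimal sketch: upper_out_codecvt and upper_one_at_a_time are hypothetical names, and the upper-casing rule is chosen only to make the conversion visible.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical facet for illustration: writes upper-case characters on
// output. Each internal char maps to exactly one external char.
class upper_out_codecvt : public std::codecvt<char, char, std::mbstate_t> {
public:
    explicit upper_out_codecvt(std::size_t refs = 0)
        : std::codecvt<char, char, std::mbstate_t>(refs) {}

protected:
    result do_out(std::mbstate_t&,
                  const char* from, const char* from_end, const char*& from_next,
                  char* to, char* to_end, char*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_next != to_end) {
            unsigned char c = static_cast<unsigned char>(*from_next++);
            *to_next++ = static_cast<char>(std::toupper(c));
        }
        return from_next == from_end ? ok : partial;
    }
    bool do_always_noconv() const noexcept override { return false; }
};

// Converts `text` by passing [p, p + 1) to out() for every character.
// Returns an empty string if any single-character call does not return ok.
std::string upper_one_at_a_time(const std::string& text) {
    upper_out_codecvt cvt(1);  // refs = 1: the object is owned by this scope
    std::mbstate_t state{};
    std::string upper;
    for (const char* p = text.data(); p != text.data() + text.size(); ++p) {
        char buf[1];
        const char* from_next = nullptr;
        char* to_next = nullptr;
        if (cvt.out(state, p, p + 1, from_next, buf, buf + 1, to_next)
                != std::codecvt_base::ok)
            return std::string();
        assert(to_next == buf + 1);  // exactly one external char per internal char
        upper.append(buf, static_cast<std::size_t>(to_next - buf));
    }
    return upper;
}
```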
Remarks: Its operations on state are unspecified.
[Note: This argument can be used, for example, to maintain shift state, to specify conversion options (such as count only), or to identify a cache of seek offsets. -- end note]
Returns: An enumeration value, as summarized in Table 104.
Table 104: do_in/do_out result values   [tab:locale.codecvt.inout]
Value      Meaning
ok         completed the conversion
partial    not all source characters converted
error      encountered a character in [from, from_end) that it could not convert
noconv     internT and externT are the same type, and input sequence is identical to converted sequence
A return value of partial, if (from_next == from_end), indicates that either the destination sequence has not absorbed all the available destination elements, or that additional source elements are needed before another destination element can be produced.
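A caller usually reacts to ok and partial in the same way: consume what was produced, then resume from from_next. The sketch below drives in() through a deliberately small destination buffer to show such a loop. The helper name to_wide_chunked and the buffer size of 2 are assumptions made for illustration, and the classic "C" locale is assumed, so only ASCII input should be expected to convert.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: widens `narrow` through a tiny destination buffer,
// so that in() has to be called repeatedly.
std::wstring to_wide_chunked(const std::string& narrow) {
    using facet_type = std::codecvt<wchar_t, char, std::mbstate_t>;
    const facet_type& cvt = std::use_facet<facet_type>(std::locale::classic());

    std::mbstate_t state{};
    std::wstring wide;
    wchar_t buf[2];  // small on purpose
    const char* from = narrow.data();
    const char* const from_end = from + narrow.size();

    while (from != from_end) {
        const char* from_next = from;
        wchar_t* to_next = buf;
        facet_type::result r =
            cvt.in(state, from, from_end, from_next, buf, buf + 2, to_next);
        if (r == std::codecvt_base::error)
            throw std::runtime_error("to_wide_chunked: unconvertible input");

        assert(to_next <= buf + 2);  // never stores past to_end
        wide.append(buf, static_cast<std::size_t>(to_next - buf));
        if (from_next == from && to_next == buf)
            break;         // no progress: the remaining input is incomplete
        from = from_next;  // ok or partial: resume after the consumed elements
    }
    return wide;
}
```

The no-progress check prevents an endless loop when the input ends in the middle of a multibyte character.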
result do_unshift(stateT& state, externT* to, externT* to_end, externT*& to_next) const;
Preconditions: (to <= to_end) is well-defined and true; state is initialized, if at the beginning of a sequence, or else is equal to the result of converting the preceding characters in the sequence.
Effects: Places characters starting at to that should be appended to terminate a sequence when the current stateT is given by state.[264]
Stores no more than (to_end - to) destination elements, and leaves the to_next pointer pointing one beyond the last element successfully stored.
Returns: An enumeration value, as summarized in Table 105.
Table 105: do_unshift result values   [tab:locale.codecvt.unshift]
Value      Meaning
ok         completed the sequence
partial    space for more than to_end - to destination elements was needed to terminate a sequence given the value of state
error      an unspecified error has occurred
noconv     no termination is needed for this state_type
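A writer that has finished calling out() can ask the facet for any terminating characters as sketched below. The helper finish_sequence is a hypothetical name; it assumes a facet whose extern_type is char and a 16-element scratch buffer, which is an arbitrary size chosen for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical helper: asks `cvt` for the characters that terminate a
// sequence in the given state and appends them to `out`.
// Assumes Facet::extern_type is char. Returns the result of unshift().
template <class Facet>
std::codecvt_base::result finish_sequence(const Facet& cvt,
                                          typename Facet::state_type& state,
                                          std::string& out) {
    char buf[16];  // arbitrary size for this sketch; partial means it was too small
    char* to_next = buf;
    std::codecvt_base::result r =
        cvt.unshift(state, buf, buf + sizeof buf, to_next);

    assert(to_next >= buf && to_next <= buf + sizeof buf);
    if (r == std::codecvt_base::ok || r == std::codecvt_base::partial)
        out.append(buf, static_cast<std::size_t>(to_next - buf));
    return r;  // noconv: nothing had to be appended
}
```

For codecvt<char, char, mbstate_t> the call is expected to report noconv and leave the output string unchanged.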
int do_encoding() const noexcept;
Returns: -1 if the encoding of the externT sequence is state-dependent; else the constant number of externT characters needed to produce an internal character; or 0 if this number is not a constant.[265]
bool do_always_noconv() const noexcept;
Returns: true if do_in() and do_out() return noconv for all valid argument values.
codecvt<char, char, mbstate_t> returns true.
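Callers can query encoding() and always_noconv() up front to decide whether a conversion step is needed at all. The following sketch reads both values for the char-to-char facet of a locale; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>

// Hypothetical summary of two facet properties.
struct char_codecvt_info {
    bool always_noconv;  // true: in() and out() never convert
    int encoding;        // -1, 0, or a constant externT count per internal character
};

char_codecvt_info query_char_codecvt(const std::locale& loc) {
    using facet_type = std::codecvt<char, char, std::mbstate_t>;
    assert(std::has_facet<facet_type>(loc));  // a required facet of every locale
    const facet_type& cvt = std::use_facet<facet_type>(loc);
    return {cvt.always_noconv(), cvt.encoding()};
}
```

When always_noconv is true, a stream buffer can copy characters directly instead of calling in() or out().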
int do_length(stateT& state, const externT* from, const externT* from_end, size_t max) const;
Preconditions: (from <= from_end) is well-defined and true; state is initialized, if at the beginning of a sequence, or else is equal to the result of converting the preceding characters in the sequence.
Effects: The effect on the state argument is as if it called do_in(state, from, from_end, from, to, to+max, to) for to pointing to a buffer of at least max elements.
Returns: (from_next-from) where from_next is the largest value in the range [from, from_end] such that the sequence of values in the range [from, from_next) represents max or fewer valid complete characters of type internT.
The specialization codecvt<char, char, mbstate_t> returns the lesser of max and (from_end-from).
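For the char-to-char specialization the result of length() is simply the smaller of max and the size of the source range, which the following sketch demonstrates. The helper name external_length is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical helper: number of external chars of `bytes` that would be
// consumed to produce at most `max` internal characters.
int external_length(const std::string& bytes, std::size_t max) {
    using facet_type = std::codecvt<char, char, std::mbstate_t>;
    const facet_type& cvt = std::use_facet<facet_type>(std::locale::classic());

    std::mbstate_t state{};
    int n = cvt.length(state, bytes.data(), bytes.data() + bytes.size(), max);

    // from_next lies in [from, from_end], so the count is bounded by the input size.
    assert(n >= 0 && static_cast<std::size_t>(n) <= bytes.size());
    return n;
}
```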
int do_max_length() const noexcept;
Returns: The maximum value that do_length(state, from, from_end, 1) can return for any valid range [from, from_end) and stateT value state.
The specialization codecvt<char, char, mbstate_t>::do_max_length() returns 1.
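One common use of max_length() is to size an external buffer before converting a known number of internal characters. The sketch below assumes that encoding() is not -1 and that no terminating characters from unshift() need to be accounted for; the helper name external_capacity_for is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>

// Hypothetical helper: an upper bound on the external elements that
// correspond to `internal_count` internal characters, assuming a
// state-independent encoding and no terminating sequence.
template <class Facet>
std::size_t external_capacity_for(const Facet& cvt, std::size_t internal_count) {
    const int per_char = cvt.max_length();
    assert(per_char >= 0);
    return internal_count * static_cast<std::size_t>(per_char);
}
```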
[263] Informally, this means that basic_filebuf assumes that the mapping from internal to external characters is 1 to N: a codecvt facet that is used by basic_filebuf must be able to translate characters one internal character at a time.
[264] Typically these will be characters to return the state to stateT().
[265] If encoding() yields -1, then more than max_length() externT elements may be consumed when producing a single internT character, and additional externT elements may appear at the end of a sequence after those that yield the final internT character.