[locale.stdcvt]

22.5 Standard code conversion facets [locale.stdcvt]

The header <codecvt> provides code conversion facets for various character encodings.

Header <codecvt> synopsis

namespace std {
  enum codecvt_mode {
    consume_header = 4,
    generate_header = 2,
    little_endian = 1
  };

  template<class Elem, unsigned long Maxcode = 0x10ffff,
    codecvt_mode Mode = (codecvt_mode)0>
  class codecvt_utf8
    : public codecvt<Elem, char, mbstate_t> {
  public:
    explicit codecvt_utf8(size_t refs = 0);
    ~codecvt_utf8();
  };

  template<class Elem, unsigned long Maxcode = 0x10ffff,
    codecvt_mode Mode = (codecvt_mode)0>
  class codecvt_utf16
    : public codecvt<Elem, char, mbstate_t> {
  public:
    explicit codecvt_utf16(size_t refs = 0);
    ~codecvt_utf16();
  };

  template<class Elem, unsigned long Maxcode = 0x10ffff,
    codecvt_mode Mode = (codecvt_mode)0>
  class codecvt_utf8_utf16
    : public codecvt<Elem, char, mbstate_t> {
  public:
    explicit codecvt_utf8_utf16(size_t refs = 0);
    ~codecvt_utf8_utf16();
  };
}

For each of the three code conversion facets codecvt_utf8, codecvt_utf16, and codecvt_utf8_utf16:

(3.1) Elem is the wide-character type, such as wchar_t, char16_t, or char32_t.
(3.2) Maxcode is the largest wide-character code that the facet will read or write without reporting a conversion error.
(3.3) If (Mode & consume_header), the facet shall consume an initial header sequence, if present, when reading a multibyte sequence to determine the endianness of the subsequent multibyte sequence to be read.
(3.4) If (Mode & generate_header), the facet shall generate an initial header sequence when writing a multibyte sequence to advertise the endianness of the subsequent multibyte sequence to be written.
(3.5) If (Mode & little_endian), the facet shall generate a multibyte sequence in little-endian order, as opposed to the default big-endian order.
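
The Maxcode and Mode parameters described above can be sketched as follows. This is a minimal illustration, not normative text: the helper names to_utf16le_with_header and fits_in_ascii and the constant le_with_header are hypothetical, and driving the facets through std::wstring_convert is an assumption made for brevity. Note that <codecvt> and std::wstring_convert are deprecated as of C++17, so a compiler may emit warnings.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

// Mode is a bitmask: flags are OR-ed together and cast back to codecvt_mode.
constexpr std::codecvt_mode le_with_header =
    static_cast<std::codecvt_mode>(std::generate_header | std::little_endian);

// Hypothetical helper: encode UCS4 text as little-endian UTF-16 bytes,
// preceded by the header sequence that advertises that byte order.
std::string to_utf16le_with_header(const std::u32string& text) {
    std::wstring_convert<
        std::codecvt_utf16<char32_t, 0x10ffff, le_with_header>, char32_t> conv;
    std::string bytes = conv.to_bytes(text);
    assert(bytes.size() % 2 == 0);  // UTF-16 output consists of whole 16-bit units
    return bytes;
}

// Hypothetical helper: with Maxcode = 0x7f, any code above U+007F is reported
// as a conversion error, which wstring_convert surfaces as std::range_error.
bool fits_in_ascii(const std::u32string& text) {
    std::wstring_convert<std::codecvt_utf8<char32_t, 0x7f>, char32_t> conv;
    try {
        conv.to_bytes(text);
        return true;
    } catch (const std::range_error&) {
        return false;
    }
}
```

For the single character U+0041, to_utf16le_with_header is expected to yield the four bytes FF FE 41 00: the little-endian header followed by the code unit in little-endian order.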

For the facet codecvt_utf8:

(4.1) The facet shall convert between UTF-8 multibyte sequences and UCS2 or UCS4 (depending on the size of Elem) within the program.
(4.2) Endianness shall not affect how multibyte sequences are read or written.
(4.3) The multibyte sequences may be written as either a text or a binary file.
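
A round trip through codecvt_utf8 might look like the sketch below. It assumes Elem = char32_t (so the in-program form is UCS4) and uses std::wstring_convert as the driver; the helper names utf8_to_ucs4 and ucs4_to_utf8 are hypothetical.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: decode a UTF-8 multibyte sequence into UCS4 codes.
std::u32string utf8_to_ucs4(const std::string& bytes) {
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    return conv.from_bytes(bytes);
}

// Hypothetical helper: encode UCS4 codes as a UTF-8 multibyte sequence.
std::string ucs4_to_utf8(const std::u32string& text) {
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    std::string bytes = conv.to_bytes(text);
    // With the default Maxcode and no header, each code needs at most 4 bytes.
    assert(bytes.size() <= 4 * text.size());
    return bytes;
}
```

Because UTF-8 is a byte-oriented encoding, no Mode flag for byte order is needed here: U+20AC always encodes as the three bytes E2 82 AC.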

For the facet codecvt_utf16:

(5.1) The facet shall convert between UTF-16 multibyte sequences and UCS2 or UCS4 (depending on the size of Elem) within the program.
(5.2) Multibyte sequences shall be read or written according to the Mode flag, as set out above.
(5.3) The multibyte sequences may be written only as a binary file. Attempting to write to a text file produces undefined behavior.
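
The effect of Mode on codecvt_utf16 can be illustrated with the sketch below. It assumes Elem = char32_t and std::wstring_convert as the driver; the helper names to_utf16be, to_utf16le, and from_utf16_with_optional_header are hypothetical.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: default Mode, so bytes are written in big-endian order.
std::string to_utf16be(const std::u32string& text) {
    std::wstring_convert<std::codecvt_utf16<char32_t>, char32_t> conv;
    std::string bytes = conv.to_bytes(text);
    assert(bytes.size() % 2 == 0);  // whole 16-bit units only
    return bytes;
}

// Hypothetical helper: little_endian selects little-endian byte order.
std::string to_utf16le(const std::u32string& text) {
    std::wstring_convert<
        std::codecvt_utf16<char32_t, 0x10ffff, std::little_endian>,
        char32_t> conv;
    std::string bytes = conv.to_bytes(text);
    assert(bytes.size() % 2 == 0);  // whole 16-bit units only
    return bytes;
}

// Hypothetical helper: consume_header makes the facet consume an initial
// header sequence, if one is present, before decoding the remaining bytes.
std::u32string from_utf16_with_optional_header(const std::string& bytes) {
    std::wstring_convert<
        std::codecvt_utf16<char32_t, 0x10ffff, std::consume_header>,
        char32_t> conv;
    return conv.from_bytes(bytes);
}
```

For U+20AC, to_utf16be is expected to produce the bytes 20 AC and to_utf16le the bytes AC 20. If such bytes are then written to a file, the stream would be opened with std::ios::binary, in line with (5.3).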

For the facet codecvt_utf8_utf16:

(6.1) The facet shall convert between UTF-8 multibyte sequences and UTF-16 (one or two 16-bit codes) within the program.
(6.2) Endianness shall not affect how multibyte sequences are read or written.
(6.3) The multibyte sequences may be written as either a text or a binary file.
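
The "one or two 16-bit codes" wording can be seen in a short sketch with a character outside the Basic Multilingual Plane. It assumes Elem = char16_t and std::wstring_convert as the driver; the helper names utf8_to_utf16 and utf16_to_utf8 are hypothetical.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
std::u16string utf8_to_utf16(const std::string& bytes) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    std::u16string units = conv.from_bytes(bytes);
    // A UTF-8 sequence never decodes to more UTF-16 units than it has bytes.
    assert(units.size() <= bytes.size());
    return units;
}

// Hypothetical helper: encode UTF-16 code units as UTF-8 bytes.
std::string utf16_to_utf8(const std::u16string& units) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.to_bytes(units);
}
```

U+1F600 occupies four bytes in UTF-8 (F0 9F 98 80) and decodes to the surrogate pair D83D DE00, that is, two 16-bit codes within the program, whereas U+20AC decodes to a single code.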
