22 Localization library [localization]

22.3 Locales [locales]

22.3.1 Class locale [locale]

namespace std {
  class locale {
  public:
    // types:
    class facet;
    class id;
    using category = int;
    static const category   // values assigned here are for exposition only
      none     = 0,
      collate  = 0x010, ctype    = 0x020,
      monetary = 0x040, numeric  = 0x080,
      time     = 0x100, messages = 0x200,
      all = collate | ctype | monetary | numeric | time  | messages;

    // construct/copy/destroy:
    locale() noexcept;
    locale(const locale& other) noexcept;
    explicit locale(const char* std_name);
    explicit locale(const string& std_name);
    locale(const locale& other, const char* std_name, category);
    locale(const locale& other, const string& std_name, category);
    template <class Facet> locale(const locale& other, Facet* f);
    locale(const locale& other, const locale& one, category);
    ~locale();                  // not virtual
    const locale& operator=(const locale& other) noexcept;
    template <class Facet> locale combine(const locale& other) const;

    // locale operations:
    basic_string<char>                  name() const;

    bool operator==(const locale& other) const;
    bool operator!=(const locale& other) const;

    template <class charT, class traits, class Allocator>
      bool operator()(const basic_string<charT,traits,Allocator>& s1,
                      const basic_string<charT,traits,Allocator>& s2) const;

    // global locale objects:
    static       locale  global(const locale&);
    static const locale& classic();
  };
}

Class locale implements a type-safe polymorphic set of facets, indexed by facet type. In other words, a facet has a dual role: in one sense, it's just a class interface; at the same time, it's an index into a locale's set of facets.

Access to the facets of a locale is via two function templates, use_facet<> and has_facet<>.

Example: An iostream operator<< might be implemented as:231

template <class charT, class traits>
basic_ostream<charT,traits>&
operator<< (basic_ostream<charT,traits>& s, Date d) {
  typename basic_ostream<charT,traits>::sentry cerberos(s);
  if (cerberos) {
    ios_base::iostate err = ios_base::iostate::goodbit;
    tm tmbuf; d.extract(tmbuf);
    use_facet<time_put<charT,ostreambuf_iterator<charT,traits>> >(
      s.getloc()).put(s, s, s.fill(), err, &tmbuf, 'x');
    s.setstate(err);            // might throw
  }
  return s;
}

 — end example ]

In the call to use_facet<Facet>(loc), the type argument chooses a facet, making available all members of the named type. If Facet is not present in a locale, it throws the standard exception bad_cast. A C++ program can check if a locale implements a particular facet with the function template has_facet<Facet>(). User-defined facets may be installed in a locale, and used identically as may standard facets ([facets.examples]).

Note: All locale semantics are accessed via use_facet<> and has_facet<>, except that:

  • A member operator template operator()(const basic_string<C, T, A>&, const basic_string<C, T, A>&) is provided so that a locale may be used as a predicate argument to the standard collections, to collate strings.

  • Convenient global interfaces are provided for traditional ctype functions such as isdigit() and isspace(), so that given a locale object loc a C++ program can call isspace(c,loc). (This eases upgrading existing extractors ([istream.formatted]).)

 — end note ]

Once a facet reference is obtained from a locale object by calling use_facet<>, that reference remains usable, and the results from member functions of it may be cached and re-used, as long as some locale object refers to that facet.

In successive calls to a locale facet member function on a facet object installed in the same locale, the returned result shall be identical.

A locale constructed from a name string (such as "POSIX"), or from parts of two named locales, has a name; all others do not. Named locales may be compared for equality; an unnamed locale is equal only to (copies of) itself. For an unnamed locale, locale::name() returns the string "*".

Whether there is one global locale object for the entire program or one global locale object per thread is implementation-defined. Implementations should provide one global locale object per thread. If there is a single global locale object for the entire program, implementations are not required to avoid data races on it ([res.on.data.races]).

Note that in the call to put the stream is implicitly converted to an ostreambuf_iterator<charT,traits>.

22.3.1.1 locale types [locale.types]

22.3.1.1.1 Type locale::category [locale.category]

using category = int;

Valid category values include the locale member bitmask elements collate, ctype, monetary, numeric, time, and messages, each of which represents a single locale category. In addition, locale member bitmask constant none is defined as zero and represents no category. And locale member bitmask constant all is defined such that the expression

(collate | ctype | monetary | numeric | time | messages | all) == all

is true, and represents the union of all categories. Further, the expression (X | Y), where X and Y each represent a single category, represents the union of the two categories.

locale member functions expecting a category argument require one of the category values defined above, or the union of two or more such values. Such a category value identifies a set of locale categories. Each locale category, in turn, identifies a set of locale facets, including at least those shown in Table [tab:localization.category.facets].

Table 69 — Locale category facets
CategoryIncludes facets
collate collate<char>, collate<wchar_t>
ctype ctype<char>, ctype<wchar_t>
codecvt<char,char,mbstate_t>
codecvt<char16_t,char,mbstate_t>
codecvt<char32_t,char,mbstate_t>
codecvt<wchar_t,char,mbstate_t>
monetary moneypunct<char>, moneypunct<wchar_t>
moneypunct<char,true>, moneypunct<wchar_t,true>
money_get<char>, money_get<wchar_t>
money_put<char>, money_put<wchar_t>
numeric numpunct<char>, numpunct<wchar_t>
num_get<char>, num_get<wchar_t>
num_put<char>, num_put<wchar_t>
time time_get<char>, time_get<wchar_t>
time_put<char>, time_put<wchar_t>
messages messages<char>, messages<wchar_t>

For any locale loc either constructed, or returned by locale::classic(), and any facet Facet shown in Table [tab:localization.category.facets], has_facet<Facet>(loc) is true. Each locale member function which takes a locale::category argument operates on the corresponding set of facets.

An implementation is required to provide those specializations for facet templates identified as members of a category, and for those shown in Table [tab:localization.required.specializations].

Table 70 — Required specializations
CategoryIncludes facets
collate collate_byname<char>, collate_byname<wchar_t>
ctype ctype_byname<char>, ctype_byname<wchar_t>
codecvt_byname<char,char,mbstate_t>
codecvt_byname<char16_t,char,mbstate_t>
codecvt_byname<char32_t,char,mbstate_t>
codecvt_byname<wchar_t,char,mbstate_t>
monetary moneypunct_byname<char,International>
moneypunct_byname<wchar_t,International>
money_get<C,InputIterator>
money_put<C,OutputIterator>
numeric numpunct_byname<char>, numpunct_byname<wchar_t>
num_get<C,InputIterator>, num_put<C,OutputIterator>
time time_get<char,InputIterator>
time_get_byname<char,InputIterator>
time_get<wchar_t,InputIterator>
time_get_byname<wchar_t,InputIterator>
time_put<char,OutputIterator>
time_put_byname<char,OutputIterator>
time_put<wchar_t,OutputIterator>
time_put_byname<wchar_t,OutputIterator>
messages messages_byname<char>, messages_byname<wchar_t>

The provided implementation of members of facets num_get<charT> and num_put<charT> calls use_facet <F> (l) only for facet F of types numpunct<charT> and ctype<charT>, and for locale l the value obtained by calling member getloc() on the ios_base& argument to these functions.

In declarations of facets, a template parameter with name InputIterator or OutputIterator indicates the set of all possible specializations on parameters that satisfy the requirements of an Input Iterator or an Output Iterator, respectively ([iterator.requirements]). A template parameter with name C represents the set of types containing char, wchar_t, and any other implementation-defined character types that satisfy the requirements for a character on which any of the iostream components can be instantiated. A template parameter with name International represents the set of all possible specializations on a bool parameter.

22.3.1.1.2 Class locale::facet [locale.facet]

namespace std {
  class locale::facet {
  protected:
    explicit facet(size_t refs = 0);
    virtual ~facet();
    facet(const facet&) = delete;
    void operator=(const facet&) = delete;
  };
}

Class facet is the base class for locale feature sets. A class is a defnfacet if it is publicly derived from another facet, or if it is a class derived from locale::facet and contains a publicly accessible declaration as follows:232

static ::std::locale::id id;

Template parameters in this Clause which are required to be facets are those named Facet in declarations. A program that passes a type that is not a facet, or a type that refers to a volatile-qualified facet, as an (explicit or deduced) template parameter to a locale function expecting a facet, is ill-formed. A const-qualified facet is a valid template argument to any locale function that expects a Facet template parameter.

The refs argument to the constructor is used for lifetime management. For refs == 0, the implementation performs delete static_cast<locale::facet*>(f) (where f is a pointer to the facet) when the last locale object containing the facet is destroyed; for refs == 1, the implementation never destroys the facet.

Constructors of all facets defined in this Clause take such an argument and pass it along to their facet base class constructor. All one-argument constructors defined in this Clause are explicit, preventing their participation in automatic conversions.

For some standard facets a standard “_byname” class, derived from it, implements the virtual function semantics equivalent to that facet of the locale constructed by locale(const char*) with the same name. Each such facet provides a constructor that takes a const char* argument, which names the locale, and a refs argument, which is passed to the base class constructor. Each such facet also provides a constructor that takes a string argument str and a refs argument, which has the same effect as calling the first constructor with the two arguments str.c_str() and refs. If there is no “_byname” version of a facet, the base class implements named locale semantics itself by reference to other facets.

This is a complete list of requirements; there are no other requirements. Thus, a facet class need not have a public copy constructor, assignment, default constructor, destructor, etc.

22.3.1.1.3 Class locale::id [locale.id]

namespace std {
  class locale::id {
  public:
    id();
    void operator=(const id&) = delete;
    id(const id&) = delete;
  };
}

The class locale::id provides identification of a locale facet interface, used as an index for lookup and to encapsulate initialization.

Note: Because facets are used by iostreams, potentially while static constructors are running, their initialization cannot depend on programmed static initialization. One initialization strategy is for locale to initialize each facet's id member the first time an instance of the facet is installed into a locale. This depends only on static storage being zero before constructors run ([basic.start.static]).  — end note ]

22.3.1.2 locale constructors and destructor [locale.cons]

locale() noexcept;

Default constructor: a snapshot of the current global locale.

Effects: Constructs a copy of the argument last passed to locale::global(locale&), if it has been called; else, the resulting facets have virtual function semantics identical to those of locale::classic(). [ Note: This constructor is commonly used as the default value for arguments of functions that take a const locale& argument.  — end note ]

locale(const locale& other) noexcept;

Effects: Constructs a locale which is a copy of other.

explicit locale(const char* std_name);

Effects: Constructs a locale using standard C locale names, e.g., "POSIX". The resulting locale implements semantics defined to be associated with that name.

Throws: runtime_error if the argument is not valid, or is null.

Remarks: The set of valid string argument values is "C", "", and any implementation-defined values.

explicit locale(const string& std_name);

Effects: The same as locale(std_name.c_str()).

locale(const locale& other, const char* std_name, category);

Effects: Constructs a locale as a copy of other except for the facets identified by the category argument, which instead implement the same semantics as locale(std_name).

Throws: runtime_error if the argument is not valid, or is null.

Remarks: The locale has a name if and only if other has a name.

locale(const locale& other, const string& std_name, category cat);

Effects: The same as locale(other, std_name.c_str(), cat).

template <class Facet> locale(const locale& other, Facet* f);

Effects: Constructs a locale incorporating all facets from the first argument except that of type Facet, and installs the second argument as the remaining facet. If f is null, the resulting object is a copy of other.

Remarks: The resulting locale has no name.

locale(const locale& other, const locale& one, category cats);

Effects: Constructs a locale incorporating all facets from the first argument except those that implement cats, which are instead incorporated from the second argument.

Remarks: The resulting locale has a name if and only if the first two arguments have names.

const locale& operator=(const locale& other) noexcept;

Effects: Creates a copy of other, replacing the current value.

Returns: *this.

~locale();

A non-virtual destructor that throws no exceptions.

22.3.1.3 locale members [locale.members]

template <class Facet> locale combine(const locale& other) const;

Effects: Constructs a locale incorporating all facets from *this except for that one facet of other that is identified by Facet.

Returns: The newly created locale.

Throws: runtime_error if has_facet<Facet>(other) is false.

Remarks: The resulting locale has no name.

basic_string<char> name() const;

Returns: The name of *this, if it has one; otherwise, the string "*".

22.3.1.4 locale operators [locale.operators]

bool operator==(const locale& other) const;

Returns: true if both arguments are the same locale, or one is a copy of the other, or each has a name and the names are identical; false otherwise.

bool operator!=(const locale& other) const;

Returns: !(*this == other).

template <class charT, class traits, class Allocator> bool operator()(const basic_string<charT,traits,Allocator>& s1, const basic_string<charT,traits,Allocator>& s2) const;

Effects: Compares two strings according to the collate<charT> facet.

Remarks: This member operator template (and therefore locale itself) satisfies requirements for a comparator predicate template argument (Clause [algorithms]) applied to strings.

Returns: use_facet<collate<charT>>(*this).compare(s1.data(), s1.data() + s1.size(),
s2.data(), s2.data() + s2.size()) < 0
.

Example: A vector of strings v can be collated according to collation rules in locale loc simply by ([alg.sort], [vector]):

std::sort(v.begin(), v.end(), loc);

 — end example ]

22.3.1.5 locale static members [locale.statics]

static locale global(const locale& loc);

Sets the global locale to its argument.

Effects: Causes future calls to the constructor locale() to return a copy of the argument. If the argument has a name, does

setlocale(LC_ALL, loc.name().c_str());

otherwise, the effect on the C locale, if any, is implementation-defined. No library function other than locale::global() shall affect the value returned by locale(). [ Note: See [c.locales] for data race considerations when setlocale is invoked.  — end note ]

Returns: The previous value of locale().

static const locale& classic();

The "C" locale.

Returns: A locale that implements the classic "C" locale semantics, equivalent to the value locale("C").

Remarks: This locale, its facets, and their member functions, do not change with time.

22.3.2 locale globals [locale.global.templates]

template <class Facet> const Facet& use_facet(const locale& loc);

Requires: Facet is a facet class whose definition contains the public static member id as defined in [locale.facet].

Returns: A reference to the corresponding facet of loc, if present.

Throws: bad_cast if has_facet<Facet>(loc) is false.

Remarks: The reference returned remains valid at least as long as any copy of loc exists.

template <class Facet> bool has_facet(const locale& loc) noexcept;

Returns: true if the facet requested is present in loc; otherwise false.

22.3.3 Convenience interfaces [locale.convenience]

22.3.3.1 Character classification [classification]

template <class charT> bool isspace (charT c, const locale& loc); template <class charT> bool isprint (charT c, const locale& loc); template <class charT> bool iscntrl (charT c, const locale& loc); template <class charT> bool isupper (charT c, const locale& loc); template <class charT> bool islower (charT c, const locale& loc); template <class charT> bool isalpha (charT c, const locale& loc); template <class charT> bool isdigit (charT c, const locale& loc); template <class charT> bool ispunct (charT c, const locale& loc); template <class charT> bool isxdigit(charT c, const locale& loc); template <class charT> bool isalnum (charT c, const locale& loc); template <class charT> bool isgraph (charT c, const locale& loc); template <class charT> bool isblank (charT c, const locale& loc);

Each of these functions isF returns the result of the expression:

use_facet<ctype<charT>>(loc).is(ctype_base::F, c)

where F is the ctype_base::mask value corresponding to that function ([category.ctype]).233

When used in a loop, it is faster to cache the ctype<> facet and use it directly, or use the vector form of ctype<>::is.

22.3.3.2 Conversions [conversions]

22.3.3.2.1 Character conversions [conversions.character]

template <class charT> charT toupper(charT c, const locale& loc);

Returns: use_facet<ctype<charT>>(loc).toupper(c).

template <class charT> charT tolower(charT c, const locale& loc);

Returns: use_facet<ctype<charT>>(loc).tolower(c).

22.3.3.2.2 string conversions [conversions.string]

Class template wstring_convert performs conversions between a wide string and a byte string. It lets you specify a code conversion facet (like class template codecvt) to perform the conversions, without affecting any streams or locales. [ Example: If you want to use the code conversion facet codecvt_utf8 to output to cout a UTF-8 multibyte sequence corresponding to a wide string, but you don't want to alter the locale for cout, you can write something like:

wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
std::string mbstring = myconv.to_bytes(L"Hello\n");
std::cout << mbstring;

 — end example ]

Class template wstring_convert synopsis

namespace std {
template<class Codecvt, class Elem = wchar_t,
    class Wide_alloc = allocator<Elem>,
    class Byte_alloc = allocator<char>> class wstring_convert {
  public:
    using byte_string = basic_string<char, char_traits<char>, Byte_alloc>;
    using wide_string = basic_string<Elem, char_traits<Elem>, Wide_alloc>;
    using state_type  = typename Codecvt::state_type;
    using int_type    = typename wide_string::traits_type::int_type;

    explicit wstring_convert(Codecvt* pcvt = new Codecvt);
    wstring_convert(Codecvt* pcvt, state_type state);
    explicit wstring_convert(const byte_string& byte_err,
                             const wide_string& wide_err = wide_string());
    ~wstring_convert();

    wstring_convert(const wstring_convert&) = delete;
    wstring_convert& operator=(const wstring_convert&) = delete;

    wide_string from_bytes(char byte);
    wide_string from_bytes(const char* ptr);
    wide_string from_bytes(const byte_string& str);
    wide_string from_bytes(const char* first, const char* last);

    byte_string to_bytes(Elem wchar);
    byte_string to_bytes(const Elem* wptr);
    byte_string to_bytes(const wide_string& wstr);
    byte_string to_bytes(const Elem* first, const Elem* last);

    size_t converted() const noexcept;
    state_type state() const;
  private:
    byte_string byte_err_string;    // exposition only
    wide_string wide_err_string;    // exposition only
    Codecvt* cvtptr;                // exposition only
    state_type cvtstate;            // exposition only
    size_t cvtcount;                // exposition only
  };
}

The class template describes an object that controls conversions between wide string objects of class basic_string<Elem, char_traits<Elem>, Wide_alloc> and byte string objects of class basic_string<char, char_traits<char>, Byte_alloc>. The class template defines the types wide_string and byte_string as synonyms for these two types. Conversion between a sequence of Elem values (stored in a wide_string object) and multibyte sequences (stored in a byte_string object) is performed by an object of class Codecvt, which meets the requirements of the standard code-conversion facet codecvt<Elem, char, mbstate_t>.

An object of this class template stores:

  • byte_err_string — a byte string to display on errors

  • wide_err_string — a wide string to display on errors

  • cvtptr — a pointer to the allocated conversion object (which is freed when the wstring_convert object is destroyed)

  • cvtstate — a conversion state object

  • cvtcount — a conversion count

using byte_string = basic_string<char, char_traits<char>, Byte_alloc>;

The type shall be a synonym for basic_string<char, char_traits<char>, Byte_alloc>

size_t converted() const noexcept;

Returns: cvtcount.

wide_string from_bytes(char byte); wide_string from_bytes(const char* ptr); wide_string from_bytes(const byte_string& str); wide_string from_bytes(const char* first, const char* last);

Effects: The first member function shall convert the single-element sequence byte to a wide string. The second member function shall convert the null-terminated sequence beginning at ptr to a wide string. The third member function shall convert the sequence stored in str to a wide string. The fourth member function shall convert the sequence defined by the range [first, last) to a wide string.

In all cases:

  • If the cvtstate object was not constructed with an explicit value, it shall be set to its default value (the initial conversion state) before the conversion begins. Otherwise it shall be left unchanged.

  • The number of input elements successfully converted shall be stored in cvtcount.

Returns: If no conversion error occurs, the member function shall return the converted wide string. Otherwise, if the object was constructed with a wide-error string, the member function shall return the wide-error string. Otherwise, the member function throws an object of class range_error.

using int_type = typename wide_string::traits_type::int_type;

The type shall be a synonym for wide_string::traits_type::int_type.

state_type state() const;

returns cvtstate.

using state_type = typename Codecvt::state_type;

The type shall be a synonym for Codecvt::state_type.

byte_string to_bytes(Elem wchar); byte_string to_bytes(const Elem* wptr); byte_string to_bytes(const wide_string& wstr); byte_string to_bytes(const Elem* first, const Elem* last);

Effects: The first member function shall convert the single-element sequence wchar to a byte string. The second member function shall convert the null-terminated sequence beginning at wptr to a byte string. The third member function shall convert the sequence stored in wstr to a byte string. The fourth member function shall convert the sequence defined by the range [first, last) to a byte string.

In all cases:

  • If the cvtstate object was not constructed with an explicit value, it shall be set to its default value (the initial conversion state) before the conversion begins. Otherwise it shall be left unchanged.

  • The number of input elements successfully converted shall be stored in cvtcount.

Returns: If no conversion error occurs, the member function shall return the converted byte string. Otherwise, if the object was constructed with a byte-error string, the member function shall return the byte-error string. Otherwise, the member function shall throw an object of class range_error.

using wide_string = basic_string<Elem, char_traits<Elem>, Wide_alloc>;

The type shall be a synonym for basic_string<Elem, char_traits<Elem>, Wide_alloc>.

explicit wstring_convert(Codecvt* pcvt = new Codecvt); wstring_convert(Codecvt* pcvt, state_type state); explicit wstring_convert(const byte_string& byte_err, const wide_string& wide_err = wide_string());

Requires: For the first and second constructors, pcvt != nullptr.

Effects: The first constructor shall store pcvt in cvtptr and default values in cvtstate, byte_err_string, and wide_err_string. The second constructor shall store pcvt in cvtptr, state in cvtstate, and default values in byte_err_string and wide_err_string; moreover the stored state shall be retained between calls to from_bytes and to_bytes. The third constructor shall store new Codecvt in cvtptr, state_type() in cvtstate, byte_err in byte_err_string, and wide_err in wide_err_string.

~wstring_convert();

Effects: The destructor shall delete cvtptr.

22.3.3.2.3 Buffer conversions [conversions.buffer]

Class template wbuffer_convert looks like a wide stream buffer, but performs all its I/O through an underlying byte stream buffer that you specify when you construct it. Like class template wstring_convert, it lets you specify a code conversion facet to perform the conversions, without affecting any streams or locales.

Class template wbuffer_convert synopsis

namespace std {
template<class Codecvt,
    class Elem = wchar_t,
    class Tr = char_traits<Elem>>
  class wbuffer_convert
    : public basic_streambuf<Elem, Tr> {
  public:
    using state_type = typename Codecvt::state_type;

    explicit wbuffer_convert(streambuf* bytebuf = 0,
                             Codecvt* pcvt = new Codecvt,
                             state_type state = state_type());

    ~wbuffer_convert();

    wbuffer_convert(const wbuffer_convert&) = delete;
    wbuffer_convert& operator=(const wbuffer_convert&) = delete;

    streambuf* rdbuf() const;
    streambuf* rdbuf(streambuf* bytebuf);

    state_type state() const;

  private:
    streambuf* bufptr;         // exposition only
    Codecvt* cvtptr;                // exposition only
    state_type cvtstate;            // exposition only
  };
}

The class template describes a stream buffer that controls the transmission of elements of type Elem, whose character traits are described by the class Tr, to and from a byte stream buffer of type streambuf. Conversion between a sequence of Elem values and multibyte sequences is performed by an object of class Codecvt, which shall meet the requirements of the standard code-conversion facet codecvt<Elem, char, mbstate_t>.

An object of this class template stores:

  • bufptr — a pointer to its underlying byte stream buffer

  • cvtptr — a pointer to the allocated conversion object (which is freed when the wbuffer_convert object is destroyed)

  • cvtstate — a conversion state object

state_type state() const;

Returns: cvtstate.

streambuf* rdbuf() const;

Returns: bufptr.

streambuf* rdbuf(streambuf* bytebuf);

Effects: Stores bytebuf in bufptr.

Returns: The previous value of bufptr.

using state_type = typename Codecvt::state_type;

The type shall be a synonym for Codecvt::state_type.

explicit wbuffer_convert(streambuf* bytebuf = 0, Codecvt* pcvt = new Codecvt, state_type state = state_type());

Requires: pcvt != nullptr.

Effects: The constructor constructs a stream buffer object, initializes bufptr to bytebuf, initializes cvtptr to pcvt, and initializes cvtstate to state.

~wbuffer_convert();

Effects: The destructor shall delete cvtptr.