3094. §[time.duration.io]p4 makes surprising claims about encoding

Section: 30.5.11 [time.duration.io] Status: C++20 Submitter: Richard Smith Opened: 2018-04-02 Last modified: 2021-02-25

Priority: 0

Discussion:

[time.duration.io]p4 says:

For streams where charT has an 8-bit representation, "µs" should be encoded as UTF-8. Otherwise UTF-16 or UTF-32 is encouraged. The implementation may substitute other encodings, including "us".

This choice of encoding is not up to the <chrono> library to decide or encourage. The basic execution character set determines how a mu should be encoded in type char, for instance, and it would be truly bizarre to use a UTF-8 encoding if that character set is, say, Latin-1 or EBCDIC.
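To make the objection concrete, the sketch below computes the bytes for U+00B5 (MICRO SIGN) under UTF-8 and under Latin-1. The helper names `utf8_two_byte` and `latin1_byte` are hypothetical and exist only for illustration; they are not part of `<chrono>` or any standard library. EBCDIC is left out because the byte value depends on the code page.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: UTF-8 encoding of a code point in the two-byte
// range U+0080..U+07FF (layout 110xxxxx 10xxxxxx).
std::string utf8_two_byte(char32_t cp) {
    assert(cp >= 0x80 && cp <= 0x7FF);
    std::string out;
    out += static_cast<char>(0xC0 | (cp >> 6));    // lead byte: top 5 bits
    out += static_cast<char>(0x80 | (cp & 0x3F));  // continuation: low 6 bits
    return out;
}

// Hypothetical helper: Latin-1 (ISO 8859-1) maps U+0000..U+00FF to a
// single byte with the same numeric value.
std::string latin1_byte(char32_t cp) {
    assert(cp <= 0xFF);
    return std::string(1, static_cast<char>(cp));
}
```

For U+00B5 the first helper yields the two bytes 0xC2 0xB5 and the second yields the single byte 0xB5, so a Latin-1 stream that received the UTF-8 form would display two characters where one was intended.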

I suggest we strike at least the first two sentences of this paragraph, as the meaning of the prior wording is unambiguous without them and confusing with them, and they do not provide any normative requirements (although they do provide recommendations). The third sentence appears to have a normative impact, but it's hard to see how it's legitimate to call "us" an "encoding" of "µs"; it's really just an alternative unit suffix. So how about replacing that paragraph with this:

If Period::type is micro, but the character U+00B5 cannot be represented in the encoding used for charT, the unit suffix "us" is used instead of "µs".

(This also removes the permission for an implementation to choose an arbitrary alternative "encoding", which seems undesirable.)
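As a rough illustration of the suggested rule, the sketch below picks the microsecond suffix for a narrow stream from an assumed, known encoding. The `NarrowEncoding` enumeration and the `micro_suffix` function are hypothetical; a real implementation would have to determine for itself whether U+00B5 is representable, and nothing here reflects how any particular standard library does so.

```cpp
#include <cassert>
#include <string>

// Hypothetical tag for the encoding used for char; not a standard facility.
enum class NarrowEncoding { utf8, latin1, ascii };

// Returns the unit suffix for Period::type == micro.
// "µs" is used when U+00B5 is representable; otherwise the alternative
// suffix "us" is used, as in the suggested wording.
std::string micro_suffix(NarrowEncoding enc) {
    switch (enc) {
    case NarrowEncoding::utf8:   return "\xC2\xB5s";  // U+00B5 as UTF-8, then 's'
    case NarrowEncoding::latin1: return "\xB5s";      // U+00B5 as Latin-1, then 's'
    case NarrowEncoding::ascii:  return "us";         // U+00B5 not representable
    }
    assert(false && "unknown encoding");
    return "us";
}
```

Note that "us" appears here only as a different suffix for the unrepresentable case, not as one more byte sequence for "µs", which is the distinction the suggested wording draws.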

[ 2018-04-23 Moved to Tentatively Ready after 6 positive votes on c++std-lib. ]

[2018-06 Rapperswil: Adopted]

Proposed resolution:

This wording is relative to N4741.

  1. Edit 30.5.11 [time.duration.io] as indicated:

    template<class charT, class traits, class Rep, class Period>
      basic_ostream<charT, traits>&
        operator<<(basic_ostream<charT, traits>& os, const duration<Rep, Period>& d);
    

    -1- Requires: […]

    -2- Effects: […]

    -3- The units suffix depends on the type Period::type as follows:

    1. […]

    2. (3.5) — Otherwise, if Period::type is micro, the suffix is "µs" ("\u00b5\u0073").

    3. […]

    4. (3.21) — Otherwise, the suffix is "[num/den]s".

    […]

    -4- [Struck: For streams where charT has an 8-bit representation, "µs" should be encoded as UTF-8. Otherwise UTF-16 or UTF-32 is encouraged. The implementation may substitute other encodings, including "us".] [Added: If Period::type is micro, but the character U+00B5 cannot be represented in the encoding used for charT, the unit suffix "us" is used instead of "µs".]

    -5- Returns: os.