std::string is not good for UTF-8
Section: D.21 [depr.fs.path.factory] Status: C++20 Submitter: The Netherlands Opened: 2019-11-07 Last modified: 2021-02-25
Priority: 0
Discussion:
Addresses NL 375
Example in deprecated section implies that std::string is the type to use for UTF-8 strings.
[Example: A string is to be read from a database that is encoded in UTF-8, and used to create a directory using the native encoding for filenames:
    namespace fs = std::filesystem;
    std::string utf8_string = read_utf8_data();
    fs::create_directory(fs::u8path(utf8_string));
Proposed change:
Add clarification that std::string is the wrong type for UTF-8 strings.
Jeff Garland:
SG16 in Belfast: Recommend to accept with a modification to update the example in D.21 [depr.fs.path.factory] p4 to state that std::u8string should be preferred for UTF-8 data.
Rationale: The example code is representative of historic use of std::filesystem::u8path and should not be changed to use std::u8string. The recommended change is to a non-normative example and may therefore be considered editorial.
Previous resolution [SUPERSEDED]:
This wording is relative to N4835.
Modify D.21 [depr.fs.path.factory] as indicated:
-4- [Example: A string is to be read from a database that is encoded in UTF-8, and used to create a directory using the native encoding for filenames:
    namespace fs = std::filesystem;
    std::string utf8_string = read_utf8_data();
    fs::create_directory(fs::u8path(utf8_string));
For POSIX-based operating systems with the native narrow encoding set to UTF-8, no encoding or type conversion occurs. For POSIX-based operating systems with the native narrow encoding not set to UTF-8, a conversion to UTF-32 occurs, followed by a conversion to the current native narrow encoding. Some Unicode characters may have no native character set representation. For Windows-based operating systems a conversion from UTF-8 to UTF-16 occurs. — end example]
[Note: The example above is representative of historic use of filesystem u8path. New code should use std::u8string in place of std::string. — end note]
LWG Belfast Friday Morning
Requested changes:
[2020-02 Moved to Immediate on Tuesday in Prague.]
Proposed resolution:
This wording is relative to N4835.
Modify D.21 [depr.fs.path.factory] as indicated:
-4- [Example: A string is to be read from a database that is encoded in UTF-8, and used to create a directory using the native encoding for filenames:
    namespace fs = std::filesystem;
    std::string utf8_string = read_utf8_data();
    fs::create_directory(fs::u8path(utf8_string));
For POSIX-based operating systems with the native narrow encoding set to UTF-8, no encoding or type conversion occurs. For POSIX-based operating systems with the native narrow encoding not set to UTF-8, a conversion to UTF-32 occurs, followed by a conversion to the current native narrow encoding. Some Unicode characters may have no native character set representation. For Windows-based operating systems a conversion from UTF-8 to UTF-16 occurs. — end example]
[Note: The example above is representative of a historical use of filesystem::u8path. Passing a std::u8string to path's constructor is preferred for an indication of UTF-8 encoding more consistent with path's handling of other encodings. — end note]