4102. string_view(Iter, Iter) constructor breaks existing code

Section: 27.3.3.2 [string.view.cons] Status: New Submitter: Derek Zhang Opened: 2024-05-14 Last modified: 2024-08-02

Priority: 2

Discussion:

As a result of the new constructor added by P1391, the following code stopped compiling in C++20:


void fun(string_view);
void fun(vector<string_view>);
fun({"a", "b"});

Previously the first fun wasn't viable, so it constructed a vector<string_view> of two elements using its initializer-list constructor and then called the second fun. Now {"a", "b"} could also be a call to the new string_view(Iter, Iter), so it's ambiguous and fails to compile.
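
As a rough illustration of why the first fun became viable, the change can be modelled with two hypothetical stand-in types, OldView and NewView (neither name appears in the standard). The unconstrained template in NewView is a simplification of the constrained constructor added by P1391, so this is a sketch of the viability change only, not of the real constructor set:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the pre-C++20 constructors (illustration only).
struct OldView {
    OldView(const char*);
    OldView(const char*, std::size_t);
};

// Hypothetical stand-in for C++20: the same constructors plus an iterator pair.
// The real constructor is constrained; the constraints are omitted here.
struct NewView {
    NewView(const char*);
    NewView(const char*, std::size_t);
    template<class It, class End>
    NewView(It, End);
};

// The type of each element in {"a", "b"}: an lvalue of type const char[2].
using Lit = const char (&)[2];

// A pointer does not convert implicitly to size_t, so no constructor fits.
static_assert(!std::is_constructible<OldView, Lit, Lit>::value,
              "two literals cannot initialize the old stand-in");

// Both literals decay to const char*, so the iterator-pair template matches.
static_assert(std::is_constructible<NewView, Lit, Lit>::value,
              "two literals can initialize the new stand-in");
```

Under this model, a fun taking NewView competes with the vector overload for the same braced list, which is the source of the ambiguity.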

The following case is arguably worse, because it is not ill-formed in C++20; it still compiles, but now has undefined behaviour:


fun({{"a", "b"}});

Previously the first fun wasn't viable, so this constructed a vector<string_view> of two elements (via somewhat bizarre syntax, but using the same initializer-list constructor as above). Now it constructs a vector from an initializer_list with one element, where that element is constructed from the two const char* using string_view(Iter, Iter). But those two pointers are unrelated and do not form a valid range, so this violates the constructor's precondition and has undefined behaviour. If you're lucky it crashes at runtime when trying to reach "b" from "a", but it could also form a string_view that reads arbitrary secrets from the memory between the two pointers.
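
To make the precondition concrete, here is a minimal sketch using a hypothetical MiniView type (an assumption for illustration, not the library's implementation). It assumes the view's length is obtained as last - first, which is only meaningful when both pointers refer to the same array:

```cpp
#include <cstddef>

// Hypothetical, simplified model of a view built from an iterator pair.
struct MiniView {
    const char* data_;
    std::size_t size_;

    // Assumed behaviour: the length is the distance from first to last.
    template<class It, class End>
    constexpr MiniView(It first, End last)
        : data_(first), size_(static_cast<std::size_t>(last - first)) {}
};

constexpr char buf[] = "hello";

// Valid range: both pointers refer to the same array.
constexpr MiniView ok(buf, buf + 5);
static_assert(ok.size_ == 5, "length is the distance between the pointers");

// Not executed here: MiniView bad("a", "b");
// The two literals are distinct arrays, so subtracting their addresses has
// undefined behaviour and the resulting "length" would be meaningless.
```

With unrelated pointers there is no valid distance to compute, which is why the one-element initializer_list case above cannot be given defined behaviour after the fact.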

[Jonathan comments]

At the very least, we should have an Annex C entry documenting the change. Making the new string_view(Iter, Iter) constructor explicit would prevent the runtime behaviour change for the second example, but GCC thinks the first example would still be ambiguous (it seems to depend on how list-initialization handles explicit constructors, which has implementation divergence).

Maybe we should have a deleted constructor matching string literals:


template<size_t N1, size_t N2>
basic_string_view(const charT(&)[N1], const charT(&)[N2]) = delete;

Or to handle both const char[N] and char[N]:


template<class A1, class A2>
requires (rank_v<A1> == 1) && (rank_v<A2> == 1)
basic_string_view(A1&, A2&) = delete;

Both options would prevent this currently valid (but weird) code:


const char arr[] = "str";
std::string_view s(arr, arr); // s.size() == 0 and s.data() == arr

That seems acceptable, because std::string_view s(arr, 0) is simpler and clearer anyway.
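
The effect of the first deleted overload can be sketched on a hypothetical FixedView type (an illustrative name, with the iterator-pair constructor left unconstrained for brevity). The sketch assumes the deleted template is selected for array arguments because it is more specialized than the iterator-pair template:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in used only to illustrate the deleted overload.
struct FixedView {
    // Simplified, unconstrained model of the iterator-pair constructor.
    template<class It, class End>
    FixedView(It, End);

    // Selected for two const char arrays, making such calls ill-formed.
    template<std::size_t N1, std::size_t N2>
    FixedView(const char (&)[N1], const char (&)[N2]) = delete;
};

// Two string literals, as in {"a", "b"}, are rejected.
static_assert(!std::is_constructible<FixedView,
                                     const char (&)[2],
                                     const char (&)[2]>::value,
              "two literals select the deleted overload");

// The (arr, arr) case with arr of type const char[4] is rejected as well.
static_assert(!std::is_constructible<FixedView,
                                     const char (&)[4],
                                     const char (&)[4]>::value,
              "two arrays select the deleted overload");

// A genuine pointer pair still reaches the iterator-pair constructor.
static_assert(std::is_constructible<FixedView,
                                    const char*,
                                    const char*>::value,
              "a pointer pair is still accepted");
```

This sketch covers only const char arrays; how non-const char[N] arguments behave is the reason for the second, more general option and is not modelled here.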

[2024-08-02; Reflector poll]

Set priority to 2 after reflector poll. "The constructor should be made explicit as part of any resolution for this."

Proposed resolution: