2475. Allow overwriting of std::basic_string terminator with charT() to allow cleaner interoperation with legacy APIs

Section: 27.4.3.6 [string.access] Status: C++17 Submitter: Matt Weber Opened: 2015-02-21 Last modified: 2017-07-30

Priority: 3

View all other issues in [string.access].

View all issues with C++17 status.

Discussion:

It is often desirable to use a std::basic_string object as a buffer when interoperating with libraries that mutate null-terminated arrays of characters. In many cases, these legacy APIs write a null terminator at the specified end of the provided buffer. Providing such a function with an appropriately-sized std::basic_string results in undefined behavior when the charT object at the size() position is overwritten, even if the value remains unchanged.

Absent the ability to allow for this, applications are forced into pessimizations such as: providing appropriately-sized std::vectors of charT for interoperating with the legacy API, and then copying the std::vector to a std::basic_string; providing an oversized std::basic_string object and then calling resize() later.

A trivial example:

#include <string>
#include <vector>

void legacy_function(char *out, size_t count) {
  for (size_t i = 0; i < count; ++i) {
    *out++ = '0' + (i % 10);
  }
  *out = '\0'; // if size() == count, this results in undefined behavior
}

int main() {
  std::string s(10, '\0');
  legacy_function(&s[0], s.size()); // undefined behavior

  std::vector<char> buffer(11);
  legacy_function(&buffer[0], buffer.size() - 1);
  std::string t(&buffer[0], buffer.size() - 1); // potentially expensive copy

  std::string u(11, '\0');
  legacy_function(&u[0], u.size() - 1);
  u.resize(u.size() - 1); // needlessly complicates the program's logic
}

A slight relaxation of the requirement on the returned object from the element access operator would allow for this interaction with no semantic change to existing programs.

[2016-08 Chicago]

Tues PM: This should also apply to non-const data(). Billy to update wording.

Fri PM: Move to Tentatively Ready

Proposed resolution:

This wording is relative to N4296.

  1. Edit 27.4.3.6 [string.access] as indicated:

    const_reference operator[](size_type pos) const;
    reference operator[](size_type pos);
    

    -1- Requires: […]

    -2- Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object to any value other than charT() leads to undefined behavior.

    […]