2360. reverse_iterator::operator*() is unimplementable

Section: 24.5.1.2 [reverse.iterator] Status: C++14 Submitter: Stephan T. Lavavej Opened: 2014-02-07 Last modified: 2014-02-27

Priority: 1

View all other issues in [reverse.iterator].

View all issues with C++14 status.

Discussion:

Previously, C++03 24.4.1.3.3 [lib.reverse.iter.op.star] required:

reference operator*() const;

Effects:

Iterator tmp = current;
return *--tmp;

Now, N3797 24.5.1.1 [reverse.iterator] depicts:

private:
  Iterator deref_tmp; // exposition only
};

And 24.5.1.3.4 [reverse.iter.op.star] requires:

reference operator*() const;

Effects:

deref_tmp = current;
--deref_tmp;
return *deref_tmp;

[Note: This operation must use an auxiliary member variable rather than a temporary variable to avoid returning a reference that persists beyond the lifetime of its associated iterator. (See 24.2.) — end note]

As written, this won't compile, because operator*() is const yet it's modifying (via assignment and decrement) the deref_tmp data member. So what happens if you say "mutable Iterator deref_tmp;"?

DANGER: WARP CORE BREACH IMMINENT.

The Standard requires const member functions to be callable from multiple threads simultaneously. This is 16.4.6.10 [res.on.data.races]/3: "A C++ standard library function shall not directly or indirectly modify objects (1.10) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's non-const arguments, including this."

Multiple threads simultaneously modifying deref_tmp will trigger data races, so both mutable and some form of synchronization (e.g. mutex or atomic) are actually necessary!

Here's what implementations currently do: Dinkumware/VC follows C++03 and doesn't use deref_tmp (attempting to implement it is what led me to file this issue). According to Jonathan Wakely, libstdc++ also follows C++03 (see PR51823 which is suspended until LWG 2204 is resolved). According to Marshall Clow, libc++ uses deref_tmp with mutable but without synchronization, so it can trigger data races.

This deref_tmp Standardese was added by LWG 198 "Validity of pointers and references unspecified after iterator destruction" and is present in Working Papers going back to N1638 on April 11, 2004, long before C++ recognized the existence of multithreading and developed the "const means simultaneously readable" convention.

A related issue is LWG 1052 "reverse_iterator::operator-> should also support smart pointers" which mentioned the need to depict mutable in the Standardese, but it was resolved NAD Future and no change was made.

Finally, LWG 2204 "reverse_iterator should not require a second copy of the base iterator" talked about removing deref_tmp, but without considering multithreading.

I argue that deref_tmp must be removed. Its existence has highly undesirable consequences: either no synchronization is used, violating the Standard's usual multithreading guarantees, or synchronization is used, adding further costs for all users that benefit almost no iterators.

deref_tmp is attempting to handle iterators that return references to things "inside themselves", which I usually call "stashing iterators" (as they have a secret stash). Note that these are very unusual, and are different from proxy iterators like vector<bool>::iterator. While vector<bool>::iterator's operator*() does not return a true reference, it refers to a bit that is unrelated to the iterator's lifetime.

[2014-02-14 Issaquah meeting: Move to Immediate]

Strike superfluous note to avoid potential confusion, and move to Immediate.

Proposed resolution:

This wording is relative to N3797.

  1. Change class template reverse_iterator synopsis, 24.5.1.2 [reverse.iterator], as indicated:

    […]
    protected:
      Iterator current;
    private:
      Iterator deref_tmp; // exposition only
    };
    
  2. Change [reverse.iter.op.star] as indicated:

    reference operator*() const;
    

    -1- Effects:

    deref_tmp = current;
    --deref_tmp;
    return *deref_tmp;
    Iterator tmp = current;
    return *--tmp;
    

    -2- [Note: This operation must use an auxiliary member variable rather than a temporary variable to avoid returning a reference that persists beyond the lifetime of its associated iterator. (See 24.2.) — end note]