Section: 24.2.8 [unord.req], 24.5 [unord] Status: NAD Submitter: Matt Austern Opened: 2009-08-10 Last modified: 2019-02-26 17:49:57 UTC
Priority: Not Prioritized
View other active issues in [unord.req].
View all other issues in [unord.req].
View all issues with NAD status.
Unordered associative containers have a notion of a maximum load factor: when the number of elements grows large enough, the containers automatically perform a rehash so that the number of elements per bucket stays below a user-specified bound. This ensures that the hash table's performance characteristics don't change dramatically as the size increases.
For similar reasons, Google has found it useful to specify a minimum load factor: when the number of elements shrinks by a large enough, the containers automatically perform a rehash so that the number of elements per bucket stays above a user-specified bound. This is useful for two reasons. First, it prevents wasting a lot of memory when an unordered associative container grows temporarily. Second, it prevents amortized iteration time from being arbitrarily large; consider the case of a hash table with a billion buckets and only one element. (This was discussed even before TR1 was published; it was TR issue 6.13, which the LWG closed as NAD on the grounds that it was a known design feature. However, the LWG did not consider the approach of a minimum load factor.)
The only interesting question is when shrinking is allowed. In principle the cleanest solution would be shrinking on erase, just as we grow on insert. However, that would be a usability problem; it would break a number of common idioms involving erase. Instead, Google's hash tables only shrink on insert and rehash.
The proposed resolution allows, but does not require, shrinking in rehash, mostly because a postcondition for rehash that involves the minimum load factor would be fairly complicated. (It would probably have to involve a number of special cases and it would probably have to mention yet another parameter, a minimum bucket count.)
The current behavior is equivalent to a minimum load factor of 0. If we specify that 0 is the default, this change will have no impact on backward compatibility.
[ 2010 Rapperswil: ]
This seems to a useful extension, but is too late for 0x. Move to Tentatively NAD Future.
[ Moved to NAD Future at 2010-11 Batavia ]
[LEWG Kona 2017]
Should there be a shrink_to_fit()? Is it too surprising to shrink on insert()? (We understand that shrinking on erase() is not an option.) Maybe make people call rehash(0) to shrink to the min_load_factor? On clear(), the load factor goes to 0 or undefined (0/0), which is likely to violate min_load_factor() min_load_factor(z)'s wording should match max_load_factor(z)'s, e.g. "May change the container’s maximum load factor" Want a paper exploring whether shrink-on-insert has been surprising. From Titus: Google's experience is that maps don't shrink in the way this would help with. NAD, not worth the time. Write a paper if you can demonstrate a need for this.
Add two new rows, and change rehash's postcondition in the unordered associative container requirements table in 24.2.8 [unord.req]:
Table 87 — Unordered associative container requirements (in addition to container) Expression Return type Assertion/note pre-/post-condition Complexity a.rehash(n) void Post: a.bucket_cout > a.size() / a.max_load_factor() and a.bucket_count() >= n. Average case linear in a.size(), worst case quadratic.
Add a footnote to 24.2.8 [unord.req] p12:
The insert members shall not affect the validity of references to container elements, but may invalidate all iterators to the container. The erase members shall invalidate only iterators and references to the erased elements.
[A consequence of these requirements is that while insert may change the number of buckets, erase may not. The number of buckets may be reduced on calls to insert or rehash.]
Change paragraph 13:
The insert members shall not affect the validity of iterators if
(N+n) < z * B, where N is the number of elements in the container prior to the insert operation, n is the number of elements inserted, B is the container's bucket count, and z is the container's maximum load factor.
Add to the unordered_map class synopsis in section 24.5.4 [unord.map], the unordered_multimap class synopsis in 24.5.5 [unord.multimap], the unordered_set class synopsis in 24.5.6 [unord.set], and the unordered_multiset class synopsis in 24.5.7 [unord.multiset]:
In 220.127.116.11 [unord.map.cnstr], 18.104.22.168 [unord.multimap.cnstr], 22.214.171.124 [unord.set.cnstr], and 126.96.36.199 [unord.multiset.cnstr], change:
... max_load_factor() returns 1.0 .