Section: 32.5.4 [atomics.order] Status: NAD Editorial Submitter: Anthony Williams Opened: 2008-10-19 Last modified: 2016-01-28
Priority: Not Prioritized
Discussion:
Addresses UK 313
There was an interesting issue raised over on comp.programming.threads today regarding the following example:
    // Thread 1:
    x.store(1, memory_order_relaxed);           // SX
    atomic_thread_fence(memory_order_seq_cst);  // F1
    y.store(1, memory_order_relaxed);           // SY1
    atomic_thread_fence(memory_order_seq_cst);  // F2
    r1 = y.load(memory_order_relaxed);          // RY

    // Thread 2:
    y.store(0, memory_order_relaxed);           // SY2
    atomic_thread_fence(memory_order_seq_cst);  // F3
    r2 = x.load(memory_order_relaxed);          // RX
Is the outcome r1 == 0 and r2 == 0 possible?
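For readers who want to try the example, the following is a minimal sketch of a stress harness in C++. It assumes x and y are std::atomic<int> objects that both start at 0 (the post does not state initial values), and the names Outcome, run_once and count_both_zero are made up for illustration.

```cpp
#include <atomic>
#include <thread>

// Values read by the two loads in one run of the example.
struct Outcome {
    int r1;  // value read by RY
    int r2;  // value read by RX
};

// Runs both threads of the example once.
// Assumption: x and y start at 0; the original post does not say.
Outcome run_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = -1;
    int r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);                // SX
        std::atomic_thread_fence(std::memory_order_seq_cst);  // F1
        y.store(1, std::memory_order_relaxed);                // SY1
        std::atomic_thread_fence(std::memory_order_seq_cst);  // F2
        r1 = y.load(std::memory_order_relaxed);               // RY
    });
    std::thread t2([&] {
        y.store(0, std::memory_order_relaxed);                // SY2
        std::atomic_thread_fence(std::memory_order_seq_cst);  // F3
        r2 = x.load(std::memory_order_relaxed);               // RX
    });

    // join() makes the plain writes to r1 and r2 visible here.
    t1.join();
    t2.join();
    return Outcome{r1, r2};
}

// Counts the runs in which r1 == 0 and r2 == 0 were both observed.
int count_both_zero(int iterations) {
    int hits = 0;
    for (int i = 0; i < iterations; ++i) {
        Outcome o = run_once();
        if (o.r1 == 0 && o.r2 == 0) {
            ++hits;
        }
    }
    return hits;
}
```

A harness like this can only fail to observe the outcome on a particular compiler and machine. It says nothing about whether the wording forbids it, which is the question raised here.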
I think the intent is that this is not possible, but I am not sure the wording guarantees that. Here is my analysis:
Since all the fences are SC, there must be a total order between them. F1 must be before F2 in that order since they are in the same thread. Therefore F3 is either before F1, between F1 and F2, or after F2.
If F3 is after F2, then we can apply 32.5.4 [atomics.order]p5 from N2798:
    For atomic operations A and B on an atomic object M, where A modifies M and B takes its value, if there are memory_order_seq_cst fences X and Y such that A is sequenced before X, Y is sequenced before B, and X precedes Y in S, then B observes either the effects of A or a later modification of M in its modification order.
In this case, A is SX, B is RX, the fence X is F2 and the fence Y is F3, so RX must see 1.
If F3 is before F2, this doesn't apply, but F3 can therefore be before or after F1.
If F3 is after F1, the same logic applies, but this time the fence X is F1. Therefore again, RX must see 1.
Finally we have the case that F3 is before F1 in the SC ordering. There are now no guarantees about RX, and RX can read 0 (r2==0).
We can apply 32.5.4 [atomics.order]p5 again. This time, A is SY2, B is RY, X is F3 and Y is F1. Thus RY must observe the effects of SY2 or a later modification of y in its modification order.
Since SY1 is sequenced before RY, RY must observe the effects of SY1 or a later modification of y in its modification order.
In order to ensure that RY sees 1 (r1==1), we must show that SY1 is later in the modification order of y than SY2.
We're now skating on thin ice. Conceptually, SY2 happens-before F3, F3 is SC-ordered before F1, F1 happens-before SY1, so SY1 is later in the modification order M of y, and RY must see the result of SY1 (r1==1). However, I don't think the words are clear on that.
[ Post Summit Hans adds: ]
In my (Hans') view, our definition of fences will always be weaker than what particular hardware will guarantee. memory_order_seq_cst fences inherently don't guarantee sequential consistency anyway, for good reasons (e.g. because they can't enforce a total order on stores). Hence I don't think the issue demonstrates a gross failure to achieve what we intended to achieve. The example in question is a bit esoteric. Hence, in my view, living with the status quo certainly wouldn't be a disaster either.
In any case, we should probably add text along the lines of the following between p5 and p6 in 32.5.4 [atomics.order]:
    [Note: memory_order_seq_cst only ensures sequential consistency for a data-race-free program that uses exclusively memory_order_seq_cst operations. Any use of weaker ordering will invalidate this guarantee unless extreme care is used. In particular, memory_order_seq_cst fences only ensure a total order for the fences themselves. They cannot, in general, be used to restore sequential consistency for atomic operations with weaker ordering specifications.]
Also see thread beginning at c++std-lib-23271.
[ Herve's correction: ]
Minor point, and sorry for the knee-jerk reaction: I admit to having no knowledge of memory_order_seq_cst, but my former boss (John Lakos) has ingrained an automatic introspection on the use of "only". I think you meant:
    [Note: memory_order_seq_cst ensures sequential consistency only for . . . . In particular, memory_order_seq_cst fences ensure a total order only for . . .
Unless, of course, memory_order_seq_cst really does nothing but ensure sequential consistency for a data-race-free program that uses exclusively memory_order_seq_cst operations.
[ 2009-10 Santa Cruz: ]
NAD Editorial. Solved by N2992.
Proposed resolution:
Add a new paragraph after 32.5.4 [atomics.order]p5 that says
    For atomic operations A and B on an atomic object M, where A and B modify M, if there are memory_order_seq_cst fences X and Y such that A is sequenced before X, Y is sequenced before B, and X precedes Y in S, then B occurs later than A in the modification order of M.
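The case analysis from the discussion, with and without the proposed paragraph, can be sketched as a small enumeration in C++. This is a deliberately simplified model, not the memory model itself: it encodes only p5, the proposed paragraph, and the assumption that a load sequenced after a store to the same object reads that store or a later one in the modification order. The function names and the encoding of the fence order are made up for illustration.

```cpp
#include <set>
#include <utility>

// Returns the (r1, r2) pairs that the simplified model allows.
// f3_pos: 0 = F3 before F1 in S, 1 = F3 between F1 and F2, 2 = F3 after F2.
// sy2_last: true if y's modification order is SY1, SY2; false if SY2, SY1.
std::set<std::pair<int, int>> allowed_outcomes(bool with_proposed_rule) {
    std::set<std::pair<int, int>> outcomes;
    for (int f3_pos = 0; f3_pos < 3; ++f3_pos) {
        for (int k = 0; k < 2; ++k) {
            const bool sy2_last = (k == 1);

            // Proposed paragraph: SY2 is sequenced before F3 and F1 is
            // sequenced before SY1, so F3 before F1 in S puts SY1 after SY2.
            if (with_proposed_rule && f3_pos == 0 && sy2_last) {
                continue;
            }

            // p5 on x: if F1 or F2 precedes F3 in S, RX must observe SX.
            std::set<int> r2_values;
            if (f3_pos >= 1) {
                r2_values = {1};
            } else {
                r2_values = {0, 1};
            }

            // Assumption: RY follows SY1 in its thread, so it reads SY1 or
            // a later write. p5 on y: if F3 precedes F2 in S, RY also reads
            // SY2 or a later write, which leaves only the last write.
            const int last_value = sy2_last ? 0 : 1;
            std::set<int> r1_values;
            if (f3_pos <= 1) {
                r1_values = {last_value};
            } else {
                r1_values = {1};
                if (sy2_last) {
                    r1_values.insert(0);
                }
            }

            for (int r1 : r1_values) {
                for (int r2 : r2_values) {
                    outcomes.emplace(r1, r2);
                }
            }
        }
    }
    return outcomes;
}

// True if the model allows r1 == 0 and r2 == 0 together.
bool both_zero_allowed(bool with_proposed_rule) {
    return allowed_outcomes(with_proposed_rule).count(std::make_pair(0, 0)) > 0;
}
```

Under these assumptions, both_zero_allowed(false) is true and both_zero_allowed(true) is false: the only way to reach r1 == 0 and r2 == 0 is F3 before F1 with SY2 last in the modification order of y, and the proposed paragraph excludes exactly that combination.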