Issue 4177: §[atomics.order] p8 "circularly depend on their own computation" is unclear for loop

4177. §[atomics.order] p8 "circularly depend on their own computation" is unclear for loop

Section: 32.5.4 [atomics.order] Status: SG1 Submitter: jim x Opened: 2024-11-29 Last modified: 2025-02-07

Priority: 4

View other active issues in [atomics.order].

View all other issues in [atomics.order].

View all issues with SG1 status.

Discussion:

32.5.4 [atomics.order] p8 and p9 gave two paradigmatic examples of how "circularly depend on their own computation" means. However, consider this example:

std::atomic<int> x = 0, y = 2;

// thread 1:
if (y.load(relaxed) == 1) { // #1
  x.store(1, relaxed); // #2
}

//thread 2:
int pre = x.load(relaxed); // #3
while (pre != 0) {
  if (x.compare_exchange_strong(pre, pre + 1, acquire, relaxed)) {  // #4
    break;
  }
}
y.store(1, relaxed); // #5

when both #1 and #3 read 1, is this a kind of OOTA? #3 depends on #2, #2 depends on #1, #1 depends on #5, and the execution of #5 depends on the exiting of the loop, which in turn initially depends on pre.

The loop can never execute, exit after certain iterations, or be a long-time-running without exiting (i.e. cmpxchg keeps failing). So, it is unclear whether the execution of #5 depends on the loop. However, it resembles the spin-loop (a failed cmpxchg is a pure load with a relaxed load), and the subsequent codes won't execute until the loop exits. So, the scenario of spin-lock seems to agree that the code after a loop depends on the loop(regardless of whether the loop can quickly exit or be a long-time-run loop).

From this perspective, the while case is something like the if, for if, the condition is not true, and the code thereof cannot be executed. Similarly, a code after a while cannot be executed if the loop doesn't exit.

Suggested resolution:

Either accurately specify what "circularly depend on their own computation" means, or add a paradigmatic example regarding loop to indicate what it means.

[Additional discussion from submitter:]

I meant, that for a classic spin-lock, the code after the loop won't be executed until the loop exits, and the operation is just a pure load with relaxed memory order during the busy wait. The loop for a spin-lock can have three possible cases:

CAS immediately succeeds
the loop exits after certain times of iterations
the loop cannot be exited(e.g. the lock is not released)

The compiler cannot assume which case the loop for the spin-lock is. The code after the loop won't be reordered(during the busy waiting with a relaxed memory order); otherwise, implementing the spin-lock would be an issue based on the assumption that the codes after the loop wouldn't be reached when the loop was busy waiting.

In this example, the loop has the same possible cases as the loop in the spin-lock. So, the execution of #5 seems to depend on the loop and the loop depends on the pre. why I mention the spin-lock just is to justify that the code after the loop won't be reordered in both compile-time and runtime even though the condition of the loop is a pure load with a relaxed memory order(otherwise, the foundation for which the busy-waiting of a spink-lock is based on won't be true.). So, the loop has a similar effect to the if statement on the control flow: the code in the block of the if statement won't be executed if the condition is false, as well, the code after the loop won't be executed if the loop hasn't yet exited.

[2025-02-07; Reflector poll]

Set priority to 4 after reflector poll. Send to SG1.

"Should the example use unsigned so there's no chance of UB due to int overflowing?"

"Contrived example showing well-known fact that OOTA prohibition is just hand-waving. N3786 introduced the current wording and made no secret of that. P1217 explored the issue further, as have other SG1 papers. Hard problems. Solutions welcome. Don't think this is the place to focus however."

Proposed resolution: