Here is the scenario again:
CPU 0 | CPU 1 | CPU 2 | CPU 3 |
---|---|---|---|
rcu_read_lock(); |
r3 = B; |
A = 1; |
B = 1; |
What can we say about the values of r1
, r2
,
r3
, and r4
?
In particular, once execution is complete, can this counter-intuitive
result occur?
r1 == 1 && r2 == 0 && r3 == 1 && r4 == 0
In other words, can CPUs 0 and 1 see CPU 2's and CPU 3's stores as having happened in opposite orders?
Given that both the compiler and the CPU are within their rights (at least as far as the Linux kernel is concerned) to reorder CPU 0's loads, it is quite tempting to say “yes.” But we should instead take this question one step at a time, using the fundamental property of RCU.
Let's start by assuming that r3 == 1 && r4 == 0
.
This is simple enough: the store to B
happened before
CPU 1's load from B
(and thus before the
grace period) and the store to A
happened after CPU 1's
load from A
(and thus after the grace period).
Then r1 == 1
implies that CPU 0's load from A
sees CPU 2's store.
Because this store happened after the grace period, CPU 0's load
must also have happened after the grace period.
However, the fundamental property of RCU says that if part of an
RCU read-side critical section comes after a given grace period,
none of that same critical section can come before that same
grace period.
This implies that CPU 0's load from B
cannot
precede the grace period.
Because we know that CPU 3's store to B
happened before
the grace period, CPU 0's load from B
must see
the results of that store, so that r2
will be equal to the
value 1
, not 0
.
Therefore, the above counter-intuitive result is prohibited, even though both the CPU and the compiler would be within their rights (at least as far as the Linux kernel is concerned) to reorder CPU 0's loads.