Again, the code fragment is as follows:
1 rcu_read_lock(); 2 p = rcu_dereference(gp); 3 if (must_delete(p)) { 4 spin_lock(&my_lock); 5 if (must_delete(p)) { 6 kfree_rcu(p); 7 rcu_assign_pointer(gp, q); 8 already_deleted(p); 9 } 10 spin_unlock(&my_lock); 11 } else { 12 do_something_with(p); 13 } 14 rcu_read_unlock();
This is a bit unconventional: Normally lines 6 and 7 would be
interchanged.
But the RCU read-side critical section will prevent the structure
from being freed until after the global pointer is NULL
ed
out, right?
Yes it will, but that does not make the above code sequence safe. Far from it. To see this, consider the following sequence of events:
kfree_rcu()
on line 6
above, which starts a new grace period.
spin_unlock()
on line 10
and the rcu_read_unlock()
on line 14.
kfree_rcu()
is now freed.
The moral of this story: Always prevent readers from obtaining new references to the structure before starting the grace period. Always!
In this case, it is necessary to interchange lines 6 and 7
so that the readers' path to the to-be-deleted data item is removed
before the grace period is started.
In other words, the rcu_assign_pointer()
must precede the
kfree_rcu()
.
The resulting corrected code is as follows:
1 rcu_read_lock(); 2 p = rcu_dereference(gp); 3 if (must_delete(p)) { 4 spin_lock(&my_lock); 5 if (must_delete(p)) { 6 rcu_assign_pointer(gp, q); 7 kfree_rcu(p); 8 already_deleted(p); 9 } 10 spin_unlock(&my_lock); 11 } else { 12 do_something_with(p); 13 } 14 rcu_read_unlock();
Many thanks to Neil Brown for asking some excellent questions that led to this scenario.