author | Peter Zijlstra <peterz@infradead.org> | 2019-04-24 13:38:23 +0200 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2019-06-17 12:09:59 +0200 |
commit | 69d927bba39517d0980462efc051875b7f4db185 (patch) | |
tree | 5552ef9cc71fcdde90c1e6544cc82b4b682362ed /Documentation/atomic_t.txt | |
parent | dd471efe345bf6f9e1206f6c629ca3e80eb43523 (diff) | |
x86/atomic: Fix smp_mb__{before,after}_atomic()
Recent probing at the Linux Kernel Memory Model uncovered a
'surprise'. Strongly ordered architectures where the atomic RmW
primitive implies full memory ordering and
smp_mb__{before,after}_atomic() are a simple barrier() (such as x86)
fail for:
	*x = 1;
	atomic_inc(u);
	smp_mb__after_atomic();
	r0 = *y;
Because, while the atomic_inc() implies memory order, it
(surprisingly) does not provide a compiler barrier. This then allows
the compiler to re-order like so:
	atomic_inc(u);
	*x = 1;
	smp_mb__after_atomic();
	r0 = *y;
Which the CPU is then allowed to re-order (under TSO rules) like:
	atomic_inc(u);
	r0 = *y;
	*x = 1;
And this very much was not intended. Therefore strengthen the atomic
RmW ops to include a compiler barrier.
NOTE: atomic_{or,and,xor} and the bitops already had the compiler
barrier.
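To illustrate the difference the message describes, here is a minimal user-space sketch, not the kernel's code. All sketch_* names are hypothetical stand-ins. It assumes GCC or Clang extended asm, where a "memory" clobber is what makes an asm statement act as a compiler barrier; the non-x86 branch is an added fallback so the sketch builds elsewhere.

```c
/* Hypothetical user-space stand-in for the kernel's atomic_t. */
typedef struct { int counter; } sketch_atomic_t;

/* Compiler-only barrier, the same idea as the kernel's barrier(). */
#define sketch_barrier() __asm__ __volatile__("" : : : "memory")

#if defined(__x86_64__) || defined(__i386__)
/*
 * Locked increment WITHOUT a "memory" clobber. The CPU orders it fully,
 * but the compiler only knows that v->counter changes, so it may still
 * move unrelated loads and stores across the asm statement.
 */
static inline void sketch_atomic_inc_weak(sketch_atomic_t *v)
{
	__asm__ __volatile__("lock; incl %0" : "+m" (v->counter));
}

/* Same instruction WITH the clobber: now also a compiler barrier. */
static inline void sketch_atomic_inc(sketch_atomic_t *v)
{
	__asm__ __volatile__("lock; incl %0" : "+m" (v->counter) : : "memory");
}
#else
/* Fallback for other architectures (an assumption of this sketch). */
static inline void sketch_atomic_inc_weak(sketch_atomic_t *v)
{
	__atomic_fetch_add(&v->counter, 1, __ATOMIC_RELAXED);
}

static inline void sketch_atomic_inc(sketch_atomic_t *v)
{
	sketch_barrier();
	__atomic_fetch_add(&v->counter, 1, __ATOMIC_SEQ_CST);
	sketch_barrier();
}
#endif

/* The sequence from the message above, using the strengthened op. */
static inline int sketch_sequence(int *x, sketch_atomic_t *u, int *y)
{
	*x = 1;
	sketch_atomic_inc(u);	/* the store to *x cannot be sunk below this */
	sketch_barrier();	/* stand-in for smp_mb__after_atomic() */
	return *y;		/* r0 */
}
```

Single-threaded, both variants produce the same result. The difference only shows in what the compiler is allowed to emit: with sketch_atomic_inc_weak() in place of sketch_atomic_inc(), the store to *x could legally be moved past the increment.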
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Diffstat (limited to 'Documentation/atomic_t.txt')
-rw-r--r-- | Documentation/atomic_t.txt | 3 |
1 files changed, 3 insertions, 0 deletions
diff --git a/Documentation/atomic_t.txt b/Documentation/atomic_t.txt
index 89eae7f6b360..d439a0fdbe47 100644
--- a/Documentation/atomic_t.txt
+++ b/Documentation/atomic_t.txt
@@ -196,6 +196,9 @@ These helper barriers exist because architectures have varying implicit
 ordering on their SMP atomic primitives. For example our TSO architectures
 provide full ordered atomics and these barriers are no-ops.
 
+NOTE: when the atomic RmW ops are fully ordered, they should also imply a
+compiler barrier.
+
 Thus:
 
   atomic_fetch_add();