Re: [PATCH] MIPS: implement smp_cond_load_acquire() for Loongson-3

19 Jun 2018

Hi all,
On Tue, Jun 19, 2018 at 09:17:10AM +0200, Peter Zijlstra wrote:
...
On Mon, Jun 18, 2018 at 11:51:41AM -0700, Paul Burton wrote:
...
On Fri, Jun 15, 2018 at 02:07:38PM +0800, Huacai Chen wrote:
...
After commit 7f56b58a92aaf2c ("locking/mcs: Use smp_cond_load_acquire()
in MCS spin loop") Loongson-3 fails to boot. This is because Loongson-3
has SFB (Store Fill Buffer) and READ_ONCE() may get an old value in a
tight loop. So in smp_cond_load_acquire() we need a __smp_mb() after
every READ_ONCE().
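
For readers following along, the shape of the proposed change can be sketched in userspace C11. This is only an illustration, not the kernel macro or the actual patch: cond_load_acquire_fenced() and relax_hint() are hypothetical names, and a seq_cst fence is assumed as a stand-in for __smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for cpu_relax(); a no-op here. */
static inline void relax_hint(void)
{
}

/*
 * Spin until *ptr becomes non-zero and return the value that was seen.
 * The full fence after every load stands in for the __smp_mb() that the
 * patch description asks for; a seq_cst fence also gives the acquire
 * ordering expected once the loop exits.
 */
static int cond_load_acquire_fenced(atomic_int *ptr)
{
	int val;

	assert(ptr != NULL);
	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_relaxed);
		atomic_thread_fence(memory_order_seq_cst);
		if (val != 0)
			break;
		relax_hint();
	}
	return val;
}
```

Called on a variable that another thread eventually sets to a non-zero value, the function returns that value; if nothing ever stores to the variable, it spins forever, as any such wait loop would.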
Thanks - modifying smp_cond_load_acquire() is a step better than
modifying arch_mcs_spin_lock_contended() to avoid it, but I'm still not
sure we've reached the root of the problem.
Agreed, this looks entirely dodgy.
...
If tight loops using READ_ONCE() are at fault then what's special about
smp_cond_load_acquire()? Could other such loops not hit the same
problem?
Right again, Linux has a number of places where it relies on loops like
this.
	for (;;) {
		if (READ_ONCE(*ptr))
			break;

		cpu_relax();
	}
That is assumed to terminate -- provided the store to make *ptr != 0
happens of course.
And this has nothing to do with store buffers per se. Sure, store buffers
might delay the store from becoming visible for a (little) while, but we
very much assume store buffers will not hold on to data indefinitely.
We had an issue 8 years ago with the ARM11MPCore CPU where reads were
prioritised over writes, so code doing something like:
	WRITE_ONCE(*foo, 1);
	while (!READ_ONCE(*bar));
might never make the store to foo visible to other CPUs. This caused a
livelock in KGDB, where two CPUs were doing this on opposite variables
(i.e. the "SB" litmus test, but with the reads looping until they read
1).
See 534be1d5a2da ("ARM: 6194/1: change definition of cpu_relax() for
ARM11MPCore") for the ugly fix, assuming that the "Store Fill Buffer"
suffers from the same disease.
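
The livelock described above can be reproduced in a small deterministic toy model. This is a sketch under loudly stated assumptions, not a description of any real hardware: each CPU has a one-entry store buffer, a buffered store is assumed to drain only once its CPU stops issuing reads, and the relax_drains flag models a cpu_relax() that forces the buffer out. All names (toy_cpu, run_sb_loop) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one CPU with a one-entry store buffer. */
struct toy_cpu {
	bool store_pending;	/* buffered store of 1, not yet visible */
	int *store_target;	/* variable this CPU writes */
	const int *poll_target;	/* variable this CPU spins on */
	bool done;		/* left the polling loop */
};

/*
 * Run two CPUs in lock-step, each doing:
 *	WRITE_ONCE(*mine, 1); while (!READ_ONCE(*theirs)) cpu_relax();
 * Assumed policy: reads always win over the buffered store, so the store
 * drains only when relax_drains is set or the CPU has stopped reading.
 * Returns the number of rounds until both CPUs finish, or -1 if they are
 * still spinning after max_steps rounds.
 */
static int run_sb_loop(bool relax_drains, int max_steps)
{
	int foo = 0, bar = 0;
	struct toy_cpu cpu[2] = {
		{ true, &foo, &bar, false },
		{ true, &bar, &foo, false },
	};

	assert(max_steps > 0);
	for (int step = 1; step <= max_steps; step++) {
		for (int i = 0; i < 2; i++) {
			struct toy_cpu *c = &cpu[i];

			if (!c->done && *c->poll_target)
				c->done = true;	/* the read saw 1 */

			/* Drain if idle, or if the relax hint forces it. */
			if (c->store_pending && (c->done || relax_drains)) {
				*c->store_target = 1;
				c->store_pending = false;
			}
		}
		if (cpu[0].done && cpu[1].done)
			return step;
	}
	return -1;
}
```

With relax_drains set to false, neither store ever becomes visible and the function reports that both CPUs are still spinning; with it set to true, both loops terminate after a few rounds. That is the general idea behind making the relax hint push buffered stores out, shown here only within this model.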
Will
