On Fri, 2013-02-22 at 15:16 +0100, Ingo Molnar wrote:
I checked arch/x86/include/asm/atomic64_32.h and we use cmpxchg8b for everything from _set() to _read(), which translates into 'horridly stupendifyingly slow' for a number of machines, but coherent.
That's a valid concern - and cmpxchg8b is the only 64-bit op available on most 32-bit x86 CPUs which does not involve the FPU.
Wondering how significant this range of x86 problem boxes will be by the time any of these changes reaches upstream and distros - and how much 'horridly stupendifyingly slow' is in terms of cycles expended.
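For anyone wondering why even _read() and _set() end up as a locked compare-and-exchange, here is a minimal userspace sketch using C11 atomics. It is not the kernel code in atomic64_32.h; the helper names read64_via_cas() and set64_via_cas() are made up for illustration, and whether the compiler emits cmpxchg8b for them depends on the target.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: read a 64-bit value using only compare-and-swap.
 * If *v is 0, the CAS stores 0 again, which changes nothing. Otherwise
 * the CAS fails and copies the current value into 'expected'. Either
 * way 'expected' ends up holding a coherent 64-bit snapshot.
 */
static int64_t read64_via_cas(_Atomic int64_t *v)
{
	int64_t expected = 0;

	atomic_compare_exchange_strong(v, &expected, 0);
	return expected;
}

/*
 * Hypothetical helper: store a 64-bit value using only compare-and-swap.
 * Start from a guess of 0; every failed CAS refreshes 'old' with the
 * value currently in memory, so the loop retries with a better guess.
 */
static void set64_via_cas(_Atomic int64_t *v, int64_t val)
{
	int64_t old = 0;

	while (!atomic_compare_exchange_weak(v, &old, val))
		;
}
```

Both helpers pay for a full locked read-modify-write even though one of them is logically just a load, which is where the cost being discussed comes from.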
On the !x86 side of things we're implementing (generic) atomic64 using hashed spinlocks, so there too using a single spinlock around the entire data structure is a complete win.
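To make the hashed-spinlock point concrete, here is a rough userspace sketch of the idea. It is not the kernel's generic implementation: the type atomic64_sketch_t, the helpers, the table size of 16, the 64-byte cache line and the hash itself are all assumptions for illustration, and the spinlock is a toy built on C11 atomics.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NR_LOCKS 16	/* assumed table size; must be a power of two */

/* Hypothetical type: a plain 64-bit counter with no hardware atomicity. */
typedef struct { int64_t counter; } atomic64_sketch_t;

/* Statically zero-initialized, so every lock starts out unlocked. */
static atomic_int lock_table[NR_LOCKS];

/* Pick a lock by hashing the address of the variable. */
static atomic_int *lock_for(const atomic64_sketch_t *v)
{
	uintptr_t addr = (uintptr_t)v;

	addr >>= 6;	/* drop offset bits of an assumed 64-byte cache line */
	addr ^= (addr >> 8) ^ (addr >> 16);
	return &lock_table[addr & (NR_LOCKS - 1)];
}

static void sketch_spin_lock(atomic_int *l)
{
	while (atomic_exchange_explicit(l, 1, memory_order_acquire))
		;	/* spin until the previous holder stores 0 */
}

static void sketch_spin_unlock(atomic_int *l)
{
	atomic_store_explicit(l, 0, memory_order_release);
}

/* Every operation, including a plain read, takes the hashed lock. */
static int64_t atomic64_sketch_read(const atomic64_sketch_t *v)
{
	atomic_int *l = lock_for(v);
	int64_t val;

	sketch_spin_lock(l);
	val = v->counter;
	sketch_spin_unlock(l);
	return val;
}

static void atomic64_sketch_set(atomic64_sketch_t *v, int64_t val)
{
	atomic_int *l = lock_for(v);

	sketch_spin_lock(l);
	v->counter = val;
	sketch_spin_unlock(l);
}

static int64_t atomic64_sketch_add_return(int64_t a, atomic64_sketch_t *v)
{
	atomic_int *l = lock_for(v);
	int64_t val;

	sketch_spin_lock(l);
	val = v->counter += a;
	sketch_spin_unlock(l);
	return val;
}
```

Since each atomic64 operation here already costs a lock/unlock pair, a data structure that touches several such counters pays for several pairs, whereas one spinlock around the whole structure pays for one.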