On Tue, 2011-05-31 at 13:17 +0100, Dave Martin wrote:
On Mon, May 30, 2011 at 09:38:25AM +0200, Ken Werner wrote:
On 05/25/2011 03:17 PM, Dave Martin wrote:
On Wed, May 25, 2011 at 12:58:30PM +0100, David Gilbert wrote:
On 25 May 2011 04:45, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
FWIW, here's what the kernel part might look like, i.e. for compatibility with pre-ARMv6K systems (beware, only compile-tested):
OK, so that makes an eglibc part for that pretty easy. For things like fetch_and_add (which I can see membase needs), would you expect an implementation using this cmpxchg, so that it has a fallback, or just to use ldrexd directly, which I assume would be somewhat more efficient?
(Question holds for both eglibc and gcc's __sync_*)
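To make the first option in the question concrete, here is a minimal sketch of a fetch-and-add built on top of a 64-bit compare-and-exchange. The names cmpxchg64_sketch and fetch_and_add64_sketch are hypothetical, and the helper's signature (pointers to the old and new values, zero on success) is an assumption rather than the interface of the kernel patch. So that the sketch runs anywhere, the stand-in is backed by a GCC builtin instead of a kernel helper call.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit compare-and-exchange helper.
 * Assumed contract: returns 0 if *ptr equalled *oldval and was replaced
 * by *newval, non-zero otherwise. It is backed by a GCC builtin here so
 * the sketch is runnable; a real fallback would call the kernel helper. */
static int cmpxchg64_sketch(const int64_t *oldval, const int64_t *newval,
                            volatile int64_t *ptr)
{
    return !__sync_bool_compare_and_swap(ptr, *oldval, *newval);
}

/* Fetch-and-add as a retry loop: read the current value, compute the sum,
 * and try to publish it. If another writer changed *ptr in between, the
 * compare-and-exchange fails and the loop starts over. */
static int64_t fetch_and_add64_sketch(volatile int64_t *ptr, int64_t delta)
{
    int64_t oldval, newval;

    do {
        /* A plain 64-bit read may tear on a 32-bit CPU; a torn value
         * simply fails the compare below and is retried. */
        oldval = *ptr;
        newval = oldval + delta;
    } while (cmpxchg64_sketch(&oldval, &newval, ptr) != 0);

    return oldval; /* value before the addition */
}
```

The loop costs at least one helper call per operation, which is why a direct ldrexd/strexd sequence would be expected to be cheaper where the CPU has it.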
It depends on the baseline architecture for the build.
An eglibc built for ARMv6 and above would need to call the helper by default, though it could also use ldrexd/strexd if it determines at run time that this is supported by the CPU.
Similarly, if GCC is building for -march=armv7-a it can inline the atomics directly using ldrex/strex and friends, but for -march=armv6 it will need to call helpers via libgcc.
I would have thought that the libc does not decide this directly but just calls the GCC __sync_* routines (if built with a GCC that supports them). GCC then decides whether to inline them using ldrexd/strexd (ARMv6K+) or to emit calls to libgcc, which calls the kernel helpers.
You're right; it looks like eglibc uses the GCC __sync_*() functions if they exist. So, it would be natural to follow this model for 64-bit atomics too.
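A small sketch of what that model looks like from the library's side, assuming a GCC that provides the 8-byte __sync_* builtins for the target (counter_sketch and bump_counter_sketch are hypothetical names, not eglibc code):

```c
#include <stdint.h>

/* Hypothetical 64-bit counter of the kind a library might maintain. */
static uint64_t counter_sketch;

/* The caller only names the builtin. Whether this becomes an inline
 * exclusive load/store sequence or an out-of-line call into libgcc is
 * decided by the compiler from the target options, not by this code. */
static uint64_t bump_counter_sketch(uint64_t n)
{
    return __sync_fetch_and_add(&counter_sketch, n); /* returns old value */
}
```

When GCC cannot expand the builtin inline for the selected architecture, it emits a call to an out-of-line routine (for this case, __sync_fetch_and_add_8), which is where a libgcc implementation on top of the kernel helper would plug in.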
I think the difficulty here is that glibc expects either the compiler or libgcc to provide the sync primitives. While GCC can tie the inlined copy of a primitive to CPUs that have the relevant instruction, the libgcc version has no way to specify that the code it relies on requires a minimum kernel version...
It could throw the dependency back on glibc, but then you've got an expensive operation (the libgcc copies are normally implemented as private, per-library, helpers to avoid a PLT call overhead).
I'm not sure what the best solution is here.
R.