On Wed, Nov 11, 2015 at 04:23:41PM +0000, Will Deacon wrote:
> If we're going to document it, a bug tracker might be a good place to start. The behaviour, as it stands, is broken wrt the definition of the __sync primitives. That is, there is no way to build __sync_fetch_and_add out of BPF_XADD without changing its semantics.
BPF_XADD == atomic_add() in the kernel. Period. We are not going to deprecate it or introduce something else. The semantics of __sync* or atomics in the C standard and/or gcc/llvm have nothing to do with this. The arm64 JIT needs to JIT the bpf_xadd insn to code equivalent to atomic_add(), which is 'stadd' in armv8.1. The cpu check can be done by the JIT, and for older cpus just fall back to the interpreter. Trivial.
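To make the difference being argued about concrete, here is a minimal
C11 sketch (illustrative only; the memory orderings model the usual
reading of the two primitives, not the kernel's actual atomic_add()
implementation):

  #include <stdatomic.h>

  /* what BPF_XADD / kernel atomic_add() gives you: an add with
   * no return value and no implied barrier */
  static inline void bpf_xadd_like(atomic_int *p, int v)
  {
          atomic_fetch_add_explicit(p, v, memory_order_relaxed);
  }

  /* what __sync_fetch_and_add requires: the old value comes back
   * and the operation acts as a full barrier -- neither of which
   * BPF_XADD supplies */
  static inline int sync_fetch_and_add_like(atomic_int *p, int v)
  {
          return atomic_fetch_add_explicit(p, v, memory_order_seq_cst);
  }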
> We could fix this in one of the following ways:
> (1) Defining BPF_XADD to match __sync_fetch_and_add (including memory barriers).
nope.
> (2) Introducing some new BPF_ atomics that map to something like the C11 __atomic builtins and deprecating BPF_XADD in favour of these.
nope.
> (3) Introducing new source-language intrinsics to match what BPF can do (unlikely to be popular).
llvm's __sync intrinsic is used temporarily until we have time to do a new intrinsic in llvm that matches the kernel's atomic_add() properly. It will be done similarly to the llvm-bpf load_byte/word intrinsics. Note that we've been hiding it under a lock_xadd() wrapper, like here: https://github.com/iovisor/bcc/blob/master/examples/networking/tunnel_monito...
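For reference, the wrapper in that example is a one-line macro over
the intrinsic (paraphrased from the bcc tree, so treat it as a
sketch):

  /* hide the temporary __sync intrinsic, which llvm currently
   * lowers to BPF_XADD; the fetched old value is discarded */
  #define lock_xadd(ptr, val) ((void)__sync_fetch_and_add(ptr, val))

so BPF C programs just write lock_xadd(&counter, 1) and never depend
on the return value that __sync_fetch_and_add nominally produces.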