On Wed, Nov 11, 2015 at 01:21:04PM +0100, Daniel Borkmann wrote:
On 11/11/2015 12:58 PM, Will Deacon wrote:
On Wed, Nov 11, 2015 at 11:42:11AM +0100, Daniel Borkmann wrote:
On 11/11/2015 11:24 AM, Will Deacon wrote:
On Wed, Nov 11, 2015 at 09:49:48AM +0100, Arnd Bergmann wrote:
On Tuesday 10 November 2015 18:52:45 Z Lim wrote:
On Tue, Nov 10, 2015 at 4:42 PM, Alexei Starovoitov alexei.starovoitov@gmail.com wrote: >On Tue, Nov 10, 2015 at 04:26:02PM -0800, Shi, Yang wrote: >>On 11/10/2015 4:08 PM, Eric Dumazet wrote: >>>On Tue, 2015-11-10 at 14:41 -0800, Yang Shi wrote: >>>>aarch64 doesn't have native support for XADD instruction, implement it by >>>>the below instruction sequence:
aarch64 supports atomic add in ARMv8.1. For ARMv8(.0), please consider using LDXR/STXR sequence.
Is it worth optimizing for the 8.1 case? It would add a bit of complexity to make the code depend on the CPU feature, but it's certainly doable.
What's the atomicity required for? Put another way, what are we racing with (I thought bpf was single-threaded)? Do we need to worry about memory barriers?
Apologies if these are stupid questions, but all I could find was samples/bpf/sock_example.c and it didn't help much :(
The equivalent code more readable in restricted C syntax (that can be compiled by llvm) can be found in samples/bpf/sockex1_kern.c. So the built-in __sync_fetch_and_add() will be translated into a BPF_XADD insn variant.
Yikes, so the memory-model for BPF is based around the deprecated GCC __sync builtins, that inherit their semantics from ia64? Any reason not to use the C11-compatible __atomic builtins[1] as a base?
Hmm, gcc doesn't have an eBPF compiler backend, so this won't work on gcc at all. The eBPF backend in LLVM recognizes the __sync_fetch_and_add() keyword and maps that to a BPF_XADD version (BPF_W or BPF_DW). In the interpreter (__bpf_prog_run()), as Eric mentioned, this maps to atomic_add() and atomic64_add(), respectively. So the struct bpf_insn prog[] you saw from sock_example.c can be regarded as one possible equivalent program section output from the compiler.
Ok, so if I understand you correctly, then __sync_fetch_and_add() has different semantics depending on the backend target. That seems counter to the LLVM atomics Documentation:
http://llvm.org/docs/Atomics.html
which specifically calls out the __sync_* primitives as being sequentially-consistent and requiring barriers on ARM (which isn't the case for atomic[64]_add in the kernel).
If we re-use the __sync_* naming scheme in the source language, I don't think we can overlay our own semantics in the backend. The __sync_fetch_and_add primitive is also expected to return the old value, which doesn't appear to be the case for BPF_XADD.
Will