On Wed, Nov 11, 2015 at 07:54:15PM +0100, Peter Zijlstra wrote:
On Wed, Nov 11, 2015 at 07:44:27PM +0100, Peter Zijlstra wrote:
On Wed, Nov 11, 2015 at 07:31:28PM +0100, Peter Zijlstra wrote:
Add a new one that does 'fetch_and_add'? What is the real use case it would be used for?
Look at all the atomic_{add,dec}_return*() users in the kernel. A typical example would be a reader-writer lock implementation. See include/asm-generic/rwsem.h for examples.
Maybe a better example would be refcounting, where you free on 0.
if (!fetch_add(&obj->ref, -1)) free(obj);
Urgh, too used to the atomic_add_return(), which returns post op. That wants to be:
if (fetch_add(&obj->ref, -1) == 1) free(obj);
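The difference between the two return conventions above can be sketched in user space with C11 atomics. This is only an illustration, not kernel code: struct obj, fetch_add(), add_return() and obj_put() are hypothetical names defined here, and fetch_add() is assumed to return the value before the addition, as in the corrected line above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; the names are illustrative only. */
struct obj {
    atomic_int ref;
};

/* fetch_add() convention: returns the value *before* the addition. */
static int fetch_add(atomic_int *v, int delta)
{
    return atomic_fetch_add(v, delta);
}

/* atomic_add_return() convention: returns the value *after* the addition. */
static int add_return(atomic_int *v, int delta)
{
    return atomic_fetch_add(v, delta) + delta;
}

/* Drop one reference; returns true when the caller held the last one. */
static bool obj_put(struct obj *o)
{
    int old = fetch_add(&o->ref, -1);

    assert(old > 0);   /* a non-positive old value means refcount underflow */
    return old == 1;   /* pre-op value 1 means the count is now 0 */
}
```

With a count of 2, the first obj_put() returns false and the second returns true, at which point the caller would free the object. Written with add_return(), the same test would compare against 0 instead of 1.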
This type of code will never be acceptable in the bpf world. If C code does cmpxchg-like things, it's clearly beyond bpf's abilities. There are no locks, or support for locks, in the bpf design, and there will not be. We don't want a program to grab a lock and then terminate automatically because it divided by zero. Programs are not allowed to directly allocate or free memory either; we don't want dangling pointers. Therefore things like memory barriers and the full set of atomics are not applicable in the bpf world.

The only goal for bpf_xadd (could have been named better, agreed) was to do counters, like counting packets or bytes or events. In all such cases there is no need for the 'fetch' part. Another reason for the lack of a 'fetch' part is simplifying the JIT: it's easier to emit an 'atomic_add' equivalent than an 'atomic_add_return'.

The only shared data structure two programs can see is a map element. They can increment counters via bpf_xadd or replace the whole map element atomically via the bpf_map_update_elem() helper. That's it. If a program needs to grab a lock, do some writes and release it, then bpf is probably not suitable for such a use case.

The bpf programs should be "fast by design", meaning that there should be no mechanisms in the bpf architecture that would allow a program to slow down other programs or the kernel in general.
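As a rough user-space analogy for the counter-only use described above, the sketch below updates shared counters with an atomic add whose result is discarded. It is not bpf code: struct pkt_stats and stats_account() are hypothetical names, and C11 atomics stand in for bpf_xadd on a map element.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a map element shared by several programs. */
struct pkt_stats {
    atomic_ulong packets;
    atomic_ulong bytes;
};

/*
 * Add-only update: the value returned by atomic_fetch_add() is
 * deliberately ignored, so the caller never sees the old count.
 */
static void stats_account(struct pkt_stats *s, unsigned long len)
{
    (void)atomic_fetch_add(&s->packets, 1UL);
    (void)atomic_fetch_add(&s->bytes, len);
}
```

Because nothing is read back, two callers racing on the same element can only make the totals grow; neither can use the counter to decide who goes first, which is the property a lock or a refcount would need.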