On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
When assiging and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The tasks reads sig->stats without holding sighand lock.
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
The first approach used smp_load_acquire() and smp_store_release(). However, after having discussed this it seems that the data dependency for kmem_cache_alloc() would be fixed by WRITE_ONCE(). Furthermore, the smp_load_acquire() would only manage to order the stats check before the thread_group_empty() check. So it seems just using READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this up for discussion at least.
Mmh, the RELEASE was intended to order the memory initialization in kmem_cache_zalloc() with the later ->stats pointer assignment; AFAICT, there is no data dependency between such memory accesses.
Correspondingly, the ACQUIRE was intended to order the ->stats pointer load with later, _independent dereferences of the same pointer; the latter are, e.g., in taskstats_exit() (but not thread_group_empty()).
Looking again, I see that fill_tgid_exit()'s dereferences of ->stats are protected by ->siglock: maybe you meant to rely on such a critical section pairing with the critical section in taskstats_tgid_alloc()?
That memcpy(-, tsk->signal->stats, -) at the end of taskstats_exit() also bugs me: could these dereferences of ->stats happen concurrently with other stores to the same memory locations?
Thanks, Andrea