On 04.04.2018 19:18, Peter Zijlstra wrote:
On Wed, Apr 04, 2018 at 06:51:08PM +0300, Kirill Tkhai wrote:
On 04.04.2018 18:35, Peter Zijlstra wrote:
On Wed, Apr 04, 2018 at 06:24:39PM +0300, Kirill Tkhai wrote:
The following situation leads to deadlock:
[task 1] [task 2] [task 3] kill_fasync() mm_update_next_owner() copy_process() spin_lock_irqsave(&fa->fa_lock) read_lock(&tasklist_lock) write_lock_irq(&tasklist_lock) send_sigio() <IRQ> ... read_lock(&fown->lock) kill_fasync() ... read_lock(&tasklist_lock) spin_lock_irqsave(&fa->fa_lock) ...
Task 1 can't acquire read locked tasklist_lock, since there is already task 3 expressed its wish to take the lock exclusive. Task 2 holds the read locked lock, but it can't take the spin lock.
The patch makes queued_read_lock_slowpath() to give task 1 the same priority as it was an interrupt handler, and to take the lock
That re-introduces starvation scenarios. And the above looks like a proper deadlock that should be sorted by fixing the locking order.
We can move tasklist_lock out of send_sigio(), but I'm not sure it's possible for read_lock(&fown->lock).
So the scenario is:
CPU0 CPU1 CPU2
spin_lock_irqsave(&fa->fa_lock); read_lock(&tasklist_lock) write_lock_irq(&tasklist_lock) read_lock(&tasklist_lock) <IRQ> spin_lock_irqsave(&fa->fa_lock);
Right? (where the row now signifies time)
That doesn't seem to include fown->lock, you're saying it has an identical issue?
There is also read_lock(), which can be taken from interrupt:
CPU0 CPU1 CPU2
f_getown() kill_fasync() read_lock(&f_own->lock) spin_lock_irqsave(&fa->fa_lock, flags) <IRQ> send_sigio() write_lock_irq(&f_own->lock); kill_fasync() read_lock(&fown->lock) spin_lock_irqsave(&fa->fa_lock, flags)
To prevent deadlock, this requires all &f_own->lock be taken via read_lock_irqsave().
This may be formalized as lockdep rule: if spinlock nests into read_lock(), and they both can be called from interrupt, the rest of read_lock() always must disable interrupts.
Is there another solution? Is there reliable way to iterate do_each_pid_task() with rcu_read_lock()?
Depends on what you call reliable :-), Yes you can use do_each_pid_task() with RCU, but as always you're prone to see tasks that are dead and miss tasks that just came in.
If that is sufficient for the signal muck, dunno :/ Typically signals use sighand lock, not tasklist_lock.
The first thing is not a problem, while missing a newly moved task is not suitable. So, it seems it's not reliable...
Kirill