On (10/21/18 11:09), Daniel Wang wrote:
> Just got back from vacation. Thanks for the continued discussion. Just so I understand the current state. Looks like we've got a pretty good explanation of what's going on (though not completely sure), and backporting Steven's patches is still the way to go?
Up to -stable maintainers.
Note that, with or without Steven's patch, non-reentrant consoles remain non-reentrant, so the deadlock is still there:
    spin_lock_irqsave(&port->lock, flags)
     <NMI>
      panic()
       console_flush_on_panic()
        spin_lock_irqsave(&port->lock, flags)   // deadlock
// And I wouldn't mind having more reviews/testing on [1].
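To make the port->lock trace concrete, here is a minimal userspace sketch of the same re-entry pattern. It is a model, not kernel code: struct model_lock, model_trylock(), model_unlock(), panic_flush_would_deadlock() and write_interrupted_by_nmi() are hypothetical names, and the "NMI" is just a nested function call on the same thread. The real panic path spins unconditionally; the model only probes the lock so that it can report the hang instead of reproducing it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for port->lock; not the kernel's spinlock_t. */
struct model_lock {
	atomic_bool held;
};

/* Returns true if the lock was free and is now held by the caller. */
bool model_trylock(struct model_lock *l)
{
	return !atomic_exchange(&l->held, true);
}

void model_unlock(struct model_lock *l)
{
	assert(atomic_load(&l->held));	/* unlocking a free lock is a bug */
	atomic_store(&l->held, false);
}

/*
 * Models <NMI> -> panic() -> console_flush_on_panic() on the same CPU.
 * Reports whether an unconditional spin on the lock would hang.
 */
bool panic_flush_would_deadlock(struct model_lock *port_lock)
{
	if (model_trylock(port_lock)) {
		model_unlock(port_lock);
		return false;
	}
	/* The owner is the interrupted context, so it can never unlock. */
	return true;
}

/* Models a console write interrupted by an NMI while holding the lock. */
bool write_interrupted_by_nmi(void)
{
	struct model_lock port_lock = { false };
	bool deadlock;

	model_trylock(&port_lock);	/* spin_lock_irqsave(&port->lock, flags) */
	deadlock = panic_flush_would_deadlock(&port_lock);	/* <NMI> */
	model_unlock(&port_lock);	/* never reached in the real scenario */
	return deadlock;
}
```

Calling write_interrupted_by_nmi() returns true in this model, while panic_flush_would_deadlock() on a lock nobody holds returns false. The lock itself is fine; the problem is that the only context able to release it is the one the NMI interrupted.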
Another deadlock scenario could be the following one:
    printk()
     console_trylock()
      down_trylock()
       raw_spin_lock_irqsave(&sem->lock, flags)
        <NMI>
         panic()
          console_flush_on_panic()
           console_trylock()
            raw_spin_lock_irqsave(&sem->lock, flags)   // deadlock
There are no patches addressing this one at the moment. And it's unclear if you are hitting this scenario.
> I see that Sergey had sent an RFC series for similar things. Are those trying to solve the deadlock problem in a different way?
Umm, I wouldn't call it "another way". It turns non-reentrant serial consoles into re-entrant ones. Did you test patch [1] from that series in your environment, by the way?
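As a rough illustration of what "re-entrant" can mean for a serial console, below is a userspace sketch of the trylock-when-oops_in_progress idiom that some serial drivers use in their console write path. This is not the code from [1] and makes no claim about how that patch is implemented. Every name here (struct model_port, model_oops_in_progress, port_trylock(), port_unlock(), model_console_write()) is hypothetical, and the return value of model_console_write() only marks the spot where a plain spin_lock() would hang.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's oops_in_progress flag. */
int model_oops_in_progress;

struct model_port {
	atomic_bool lock_held;	/* stands in for port->lock */
	int chars_written;	/* stands in for output reaching the UART */
};

/* Returns true if the lock was free and is now held by the caller. */
bool port_trylock(struct model_port *p)
{
	return !atomic_exchange(&p->lock_held, true);
}

void port_unlock(struct model_port *p)
{
	assert(atomic_load(&p->lock_held));	/* must be held to unlock */
	atomic_store(&p->lock_held, false);
}

/*
 * Returns false where an unconditional spin_lock() would hang.
 * When model_oops_in_progress is set, a failed trylock is tolerated
 * and the message is written without holding the lock.
 */
bool model_console_write(struct model_port *p, int len)
{
	bool locked = port_trylock(p);

	if (!locked && !model_oops_in_progress)
		return false;	/* the normal path would spin forever here */

	p->chars_written += len;	/* emit the message */

	/* Only release the lock if this call actually took it. */
	if (locked)
		port_unlock(p);
	return true;
}
```

With the lock already held by an interrupted writer, model_console_write() gives up when model_oops_in_progress is 0 and still gets the message out when it is 1, leaving the lock owned by the interrupted context. The trade-off is that output written without the lock may interleave with whatever the interrupted writer was doing.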
[1] lkml.kernel.org/r/20181016050428.17966-2-sergey.senozhatsky@gmail.com
-ss