On Wed 2018-10-03 13:37:04, Steven Rostedt wrote:
On Wed, 3 Oct 2018 10:16:08 -0700 Daniel Wang <wonderfly@google.com> wrote:
On Wed, Oct 3, 2018 at 2:14 AM Petr Mladek <pmladek@suse.com> wrote:
On Tue 2018-10-02 21:23:27, Steven Rostedt wrote:
I don't see the big deal of backporting this. The biggest complaints about backports are from fixes that were added to late -rc releases where the fixes didn't get much testing. This commit was added in 4.16, and hasn't had any issues due to the design. Although a fix has been added:
c14376de3a1 ("printk: Wake klogd when passing console_lock owner")
As I said, I am fine with backporting the console_lock owner stuff into the stable release.
I just wonder (like Sergey) what the real problem is. The console_lock owner handshake is not fully reliable. It might be good enough
I'm not sure what you mean by 'not fully reliable'
I mean that it is not guaranteed that the very first printk() takes over the console. It will happen only when the other printk() calls console_trylock_spinning() while the current console owner does the code between:
	console_lock_spinning_enable();
	console_lock_spinning_disable_and_check();
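
To make the window concrete, here is a minimal single-threaded sketch of the handshake. It is a toy model, not the kernel code: the struct and all the *_model names are made up for illustration, and the real spinning, memory barriers and IRQ handling are left out. It only shows that the second caller gets the console when its call lands inside the window, and gets nothing otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the console_lock owner handshake; not kernel code. */
struct console_model {
	bool locked;		/* console_lock is held by somebody */
	bool spinning_allowed;	/* owner is inside the hand-over window */
	bool waiter;		/* another printk() caller is waiting */
};

/* Owner enters the window where a hand-over is possible. */
static void spinning_enable_model(struct console_model *c)
{
	assert(c->locked);
	c->spinning_allowed = true;
}

/*
 * Owner leaves the window. Returns true when a waiter registered in
 * the meantime; the lock then stays held and belongs to the waiter.
 */
static bool spinning_disable_and_check_model(struct console_model *c)
{
	bool handed_over = c->waiter;

	c->spinning_allowed = false;
	c->waiter = false;
	return handed_over;
}

/*
 * Another printk() caller tries to get the console. Returns true when
 * it took a free lock or registered as the waiter (the real code would
 * then spin until the owner hands over).
 */
static bool trylock_spinning_model(struct console_model *c)
{
	if (!c->locked) {
		c->locked = true;
		return true;
	}
	if (!c->spinning_allowed || c->waiter)
		return false;	/* outside the window, or someone already waits */
	c->waiter = true;
	return true;
}

/* Returns true when the second caller ends up owning the console. */
static bool second_printk_takes_over(bool inside_window)
{
	struct console_model c = { .locked = true };
	bool registered;

	if (inside_window) {
		spinning_enable_model(&c);
		registered = trylock_spinning_model(&c);
		return spinning_disable_and_check_model(&c) && registered;
	}
	/* The call arrives before the owner opens the window. */
	registered = trylock_spinning_model(&c);
	spinning_enable_model(&c);
	return spinning_disable_and_check_model(&c) && registered;
}
```

In this model second_printk_takes_over(true) succeeds and second_printk_takes_over(false) fails, which is the "not fully reliable" part: a printk() that arrives outside the window just returns and leaves the current owner to flush everything.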
Just to be sure. Daniel, could you please send a log with the console_lock owner stuff backported? There we would see who called the panic() and why it rebooted early.
Sure. Here is one. It's a bit long but complete. I attached another log snippet below it, which is what I got when `softlockup_panic` was turned off. The log was from the IRQ task that was flushing the printk buffer. I will be taking a closer look at it too, but I am sending it in case you find it helpful.
Just so I understand correctly. Does the panic hit with and without the suggested backport patch? The only difference is that you get the full output with the patch and limited output without it?
Sigh, the other mail suggests that there was a real deadlock. It means that the console owner logic might help but it would not prevent the deadlock completely.
Best Regards,
Petr