On Fri, 10 Feb 2023 17:35:10 -0300 "Guilherme G. Piccoli" gpiccoli@igalia.com wrote:
Commit 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in panic_print") introduced a setting for the "panic_print" kernel parameter to allow users to request an NMI backtrace on panic. The problem is that the panic_print handling happens after the secondary CPUs are already disabled, hence this option ended up being kind of a no-op - the kernel skips the NMI trace on idling CPUs, which is the case for offline CPUs.
Fix it by checking the NMI backtrace bit in the panic_print prior to the CPU disabling function.
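
For readers following the thread, the ordering problem can be illustrated with a small userspace model. This is only a sketch, not the patch itself: every `model_*` name, the `cpu_online` array and the CPU count are hypothetical stand-ins, and only the `PANIC_PRINT_ALL_CPU_BT` bit value is meant to mirror kernel/panic.c.

```c
#include <stdbool.h>
#include <stdio.h>

#define PANIC_PRINT_ALL_CPU_BT 0x00000040 /* bit value assumed to match kernel/panic.c */
#define NR_CPUS_MODEL 4                   /* hypothetical CPU count for this model */

static bool cpu_online[NR_CPUS_MODEL];

/* Hypothetical setup: mark every modelled CPU as online. */
static void model_bring_up_all_cpus(void)
{
    for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
        cpu_online[cpu] = true;
}

/* Stand-in for the CPU disabling step: only CPU 0 (the panicking one) stays up. */
static void model_stop_secondary_cpus(void)
{
    for (int cpu = 1; cpu < NR_CPUS_MODEL; cpu++)
        cpu_online[cpu] = false;
}

/* Stand-in for the NMI backtrace: offline CPUs are skipped.
 * Returns the number of CPUs that were actually traced. */
static int model_backtrace_all_cpus(void)
{
    int traced = 0;

    for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
        if (!cpu_online[cpu])
            continue;
        printf("backtrace for CPU %d\n", cpu);
        traced++;
    }
    return traced;
}

/* Model of the panic path. With fixed_order == false the backtrace bit is
 * checked after the secondary CPUs are stopped (the behaviour described as
 * a near no-op); with fixed_order == true it is checked before. */
int model_panic(unsigned long panic_print, bool fixed_order)
{
    int traced = 0;

    model_bring_up_all_cpus();

    if (fixed_order && (panic_print & PANIC_PRINT_ALL_CPU_BT))
        traced = model_backtrace_all_cpus();

    model_stop_secondary_cpus();

    if (!fixed_order && (panic_print & PANIC_PRINT_ALL_CPU_BT))
        traced = model_backtrace_all_cpus();

    return traced;
}
```

In this model, checking the bit after the stop step traces only the panicking CPU, while checking it first traces all of them.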
...
Notice that while at it, I got rid of the "crash_kexec_post_notifiers" local copy in panic(). This was introduced by commit b26e27ddfd2a ("kexec: use core_param for crash_kexec_post_notifiers boot option"), but it is not clear from comments or commit message why this local copy is required.
My understanding is that it's a mechanism to prevent some concurrency, in case some other CPU modifies this variable while panic() is running. I find it very unlikely, hence I removed it - but if people consider this copy needed, I can respin this patch and keep it, even providing a comment about that, in order to be explicit about its need.
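
The concurrency concern above can be sketched in userspace as follows. This is a hedged illustration only: it assumes the flag is consulted at more than one point during the panic path, and `post_notifiers_flag`, `model_concurrent_writer()` and `model_panic_decisions_agree()` are hypothetical names, not kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the global crash_kexec_post_notifiers flag. */
static bool post_notifiers_flag;

/* Hypothetical hook simulating another CPU flipping the flag mid-panic. */
static void model_concurrent_writer(void)
{
    post_notifiers_flag = !post_notifiers_flag;
}

/* Reads the flag at two decision points, with a simulated concurrent write
 * in between. Returns true if both decisions saw the same value. */
bool model_panic_decisions_agree(bool use_local_copy, bool initial)
{
    post_notifiers_flag = initial;

    /* Snapshot taken once on entry, as a local copy would be. */
    bool snapshot = post_notifiers_flag;

    bool first = use_local_copy ? snapshot : post_notifiers_flag;

    model_concurrent_writer();

    bool second = use_local_copy ? snapshot : post_notifiers_flag;

    return first == second;
}
```

With the snapshot, both decision points agree even if the global changes in between; reading the global directly lets them diverge, which is the unlikely case under discussion.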
Only two sites change crash_kexec_post_notifiers, in arch/powerpc/kernel/fadump.c and drivers/hv/hv_common.c. Yes, it's very unlikely that this will be altered while panic() is running and the consequences will be slight anyway.
But formally, we shouldn't do this, especially in a -stable backportable patch. So please, let's have the minimal bugfix for now and we can look at removing that local at a later time?