From: Paul E. McKenney paulmck@kernel.org
commit bfb3aa735f82c8d98b32a669934ee7d6b346264d upstream.
An outgoing CPU is marked offline in a stop-machine handler and most of that CPU's services stop at that point, including IRQ work queues. However, that CPU must take another pass through the scheduler and through a number of CPU-hotplug notifiers, many of which contain RCU readers. In the past, these readers were not a problem because the outgoing CPU has interrupts disabled, so that rcu_read_unlock_special() would not be invoked, and thus RCU would never attempt to queue IRQ work on the outgoing CPU.
This changed with the advent of the CONFIG_RCU_STRICT_GRACE_PERIOD Kconfig option, in which rcu_read_unlock_special() is invoked upon exit from almost all RCU read-side critical sections. Worse yet, because interrupts are disabled, rcu_read_unlock_special() cannot immediately report a quiescent state and will therefore attempt to defer this reporting, for example, by queueing IRQ work. Which fails with a splat because the CPU is already marked as being offline.
But it turns out that there is no need to report this quiescent state because rcu_report_dead() will do this job shortly after the outgoing CPU makes its final dive into the idle loop. This commit therefore makes rcu_read_unlock_special() refrain from queuing IRQ work onto outgoing CPUs.
Fixes: 44bad5b3cca2 ("rcu: Do full report for .need_qs for strict GPs") Signed-off-by: Paul E. McKenney paulmck@kernel.org Cc: Jann Horn jannh@google.com Signed-off-by: Zhen Lei thunder.leizhen@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/rcu/tree_plugin.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -628,7 +628,7 @@ static void rcu_read_unlock_special(stru set_tsk_need_resched(current); set_preempt_need_resched(); if (IS_ENABLED(CONFIG_IRQ_WORK) && irqs_were_disabled && - !rdp->defer_qs_iw_pending && exp) { + !rdp->defer_qs_iw_pending && exp && cpu_online(rdp->cpu)) { // Get scheduler to re-evaluate and call hooks. // If !IRQ_WORK, FQS scan will eventually IPI. init_irq_work(&rdp->defer_qs_iw,