When CPU1 is in the nohz_full state, a kworker on CPU0 runs sched_tick_remote(), takes CPU1's rq lock, and may trigger WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3). Printing the warning takes the console_sem semaphore. If a task on CPU1's rq tries to print at the same time, it cannot acquire console_sem, so it joins the semaphore's wait list and sleeps in the UNINTERRUPTIBLE state until console_sem is released. When CPU0 releases console_sem, it wakes up the waiting task. try_to_wake_up() then tries to take CPU1's rq lock, which CPU0 already holds, resulting in a deadlock.
The triggering scenario is as follows:
CPU0                                            CPU1
sched_tick_remote
WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3)
report_bug                                      con_write
printk
console_unlock
                                                do_con_write
                                                console_lock
                                                down(&console_sem)
                                                list_add_tail(&waiter.list, &sem->wait_list);
up(&console_sem)
wake_up_q(&wake_q)
try_to_wake_up
__task_rq_lock
_raw_spin_lock
This patch fixes the issue by deferring all printk console printing during the lock holding period.
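The effect of the deferral can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every model_* name and struct model_state are hypothetical and are not kernel APIs, the rq lock is reduced to a flag, and the console path is reduced to a counter. It assumes that deferred messages are stored first and flushed later, outside the lock, as irq_work would do in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_MAX_PENDING 8

struct model_state {
	bool rq_lock_held;        /* models holding CPU1's rq lock */
	int defer_depth;          /* models the printk deferral nesting count */
	const char *pending[MODEL_MAX_PENDING]; /* stored, not yet flushed */
	size_t nr_pending;
	size_t nr_flushed;        /* messages that went through the console path */
	size_t nr_unsafe_wakeups; /* console path run while the rq lock was held */
};

/*
 * Models console_unlock(): releasing console_sem may wake a waiter,
 * and the wakeup needs the rq lock.
 */
static void model_console_flush(struct model_state *s)
{
	if (s->rq_lock_held && s->nr_pending)
		s->nr_unsafe_wakeups++; /* the kernel would deadlock here */
	s->nr_flushed += s->nr_pending;
	s->nr_pending = 0;
}

static void model_deferred_enter(struct model_state *s)
{
	s->defer_depth++;
}

static void model_deferred_exit(struct model_state *s)
{
	assert(s->defer_depth > 0);
	s->defer_depth--;
}

/* Store the message; flush immediately only when not deferred. */
static void model_printk(struct model_state *s, const char *msg)
{
	if (s->nr_pending < MODEL_MAX_PENDING)
		s->pending[s->nr_pending++] = msg;
	if (s->defer_depth == 0)
		model_console_flush(s);
}

/* Runs the warning path under the modelled rq lock. */
static size_t model_sched_tick_remote(struct model_state *s, bool defer)
{
	s->rq_lock_held = true;
	if (defer)
		model_deferred_enter(s);
	model_printk(s, "WARNING: delta > 3s");
	if (defer)
		model_deferred_exit(s);
	s->rq_lock_held = false;

	/* Deferred output is flushed later, outside the lock. */
	model_console_flush(s);
	return s->nr_unsafe_wakeups;
}
```

Calling model_sched_tick_remote() with defer set to false runs the console path once while the modelled rq lock is held, which corresponds to the deadlock scenario. With defer set to true the message is still flushed, but only after the lock has been dropped.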
Fixes: d84b31313ef8 ("sched/isolation: Offload residual 1Hz scheduler tick")
Signed-off-by: Wang Tao <wangtao554@huawei.com>
---
 kernel/sched/core.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index be00629f0ba4..8b2d5b5bfb93 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5723,8 +5723,10 @@ static void sched_tick_remote(struct work_struct *work)
 			 * Make sure the next tick runs within a
 			 * reasonable amount of time.
 			 */
+			printk_deferred_enter();
 			u64 delta = rq_clock_task(rq) - curr->se.exec_start;
 			WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
+			printk_deferred_exit();
 		}
 		curr->sched_class->task_tick(rq, curr, 0);