Re: next: WARNING: CPU: 0 PID: 63 at kernel/workqueue.c:1999 worker_enter_idle+0xb2/0xc0

22 May 2023

...
Linux next 6.4.0-rc2-next-20230519 fails intermittently on qemu-x86_64
with the following kernel warning. The occurrence is 3 times out of 100
times.
...
Noticed with gcc-12 and clang-17 built kernels.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
Boot log:
<4>[    2.460214] ------------[ cut here ]------------
<4>[    2.460602] WARNING: CPU: 0 PID: 63 at kernel/workqueue.c:1999
worker_enter_idle+0xb2/0xc0
<4>[    2.462163] Modules linked in:
<4>[    2.463401] CPU: 0 PID: 63 Comm: kworker/0:2 Not tainted
6.4.0-rc2-next-20230519 #1
<4>[    2.463771] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS 1.14.0-2 04/01/2014
<4>[    2.465127] Workqueue:  0x0 (events)
<4>[    2.465678] RIP: 0010:worker_enter_idle+0xb2/0xc0
<4>[    2.466095] Code: e0 93 04 00 48 03 35 7d 59 bc 01 e8 48 1e 0a
00 8b 43 3c 39 43 38 75 06 83 7b 24 00 75 0c 5b 5d c3 cc cc cc cc cc
0f 0b eb f4 <0f> 0b eb f0 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
90 90 90
<4>[    2.467476] RSP: 0000:ffffa44a4042feb0 EFLAGS: 00000086
<4>[    2.468003] RAX: 0000000000000003 RBX: ffff8d71bbc2b000 RCX:
0000000050480000
<4>[    2.468470] RDX: 000000010000087f RSI: 0000000000000086 RDI:
0000000000000001
<4>[    2.468900] RBP: ffffa44a4042feb8 R08: 0000000000100000 R09:
0000000000000001
<4>[    2.469608] R10: ffff8d7140882d80 R11: 0000000000000000 R12:
ffff8d71bbc2b028
<4>[    2.470204] R13: ffff8d7140882d00 R14: ffff8d71bbc2b000 R15:
ffff8d71413df700
<4>[    2.470720] FS:  0000000000000000(0000)
GS:ffff8d71bbc00000(0000) knlGS:0000000000000000
<4>[    2.471248] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[    2.471648] CR2: ffff8d71bffff000 CR3: 0000000123e40000 CR4:
00000000000006f0
<4>[    2.472614] Call Trace:
<4>[    2.473152]  <TASK>
<4>[    2.474182]  worker_thread+0x71/0x430
<4>[    2.474992]  ? _raw_spin_unlock_irqrestore+0x28/0x50
<4>[    2.475263]  kthread+0x103/0x120
<4>[    2.475493]  ? __pfx_worker_thread+0x10/0x10
<4>[    2.476355]  ? __pfx_kthread+0x10/0x10
<4>[    2.476635]  ret_from_fork+0x2c/0x50
<4>[    2.477051]  </TASK>
<4>[    2.477785] ---[ end trace 0000000000000000 ]---
<6>[    2.748347] tsc: Refined TSC clocksource calibration: 2899.874 MHz
<6>[    2.749124] clocksource: tsc: mask: 0xffffffffffffffff
max_cycles: 0x29cccc5071e, max_idle_ns: 440795322437 ns
<6>[    2.751064] clocksource: Switched to clocksource tsc
CPU0
kworker/0:2 (events)
worker_clr_flags(worker, WORKER_PREP | WORKER_REBOUND);
        ->pool->nr_running++;   (1)
   process_one_work()
   ->worker->current_func(work);
       ->schedule()
            ->wq_worker_sleeping()
                  -> pool->nr_running--;  (0)
                  ..................
                  ->wq_worker_running()
                        ............
                          CPU0 by interrupt:
                           wq_worker_tick()
                             ->worker_set_flags(worker,
WORKER_CPU_INTENSIVE);
                                  ->pool->nr_running--;  (-1)
                       .............
                       ->if (!(worker->flags & WORKER_NOT_RUNNING))
                                 ->pool->nr_running++
<-----------------------------------------  will not execute
->worker_clr_flags(worker, WORKER_CPU_INTENSIVE);
        -> pool->nr_running++;  (0)
   worker_set_flags(worker, WORKER_PREP);
       ->pool->nr_running--;   (-1)
Please try it:

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 9c5c1cfa478f..f8d739fef311 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1060,10 +1060,9 @@ void wq_worker_running(struct task_struct *task)
         * and leave with an unexpected pool->nr_running == 1 on the newly
unbound
         * pool. Protect against such race.
         */
-       preempt_disable();
+       local_irq_disable();
        if (!(worker->flags & WORKER_NOT_RUNNING))
                worker->pool->nr_running++;
-       preempt_enable();
/*
         * CPU intensive auto-detection cares about how long a work item
hogged
@@ -1072,6 +1071,7 @@ void wq_worker_running(struct task_struct *task)
        worker->current_at = worker->task->se.sum_exec_runtime;
worker->sleeping = 0;
+       local_irq_enable();
 }
Thanks
Zqiang
...
log:



https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230519/tes...
...



https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230519/tes...
...
--
Linaro LKFT
https://lkft.linaro.org

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: next: WARNING: CPU: 0 PID: 63 at kernel/workqueue.c:1999 worker_enter_idle+0xb2/0xc0

Boot log:

log: