+ Anders, LKFT
On Wed, 24 May 2023 at 09:23, Zqiang <qiang.zhang1211@gmail.com> wrote:
Currently, nr_running can be modified from the timer tick; that means the timer tick can run nested inside a not-irq-protected critical section that is in the middle of modifying nr_running. Consider the following scenario:
CPU0
kworker/0:2 (events)
   worker_clr_flags(worker, WORKER_PREP | WORKER_REBOUND);
   ->pool->nr_running++;  (1)

   process_one_work()
   ->worker->current_func(work);
     ->schedule()
       ->wq_worker_sleeping()
         ->worker->sleeping = 1;
         ->pool->nr_running--;  (0)
           ....
       ->wq_worker_running()
           ....
           CPU0 by interrupt:
               wq_worker_tick()
               ->worker_set_flags(worker, WORKER_CPU_INTENSIVE);
                 ->pool->nr_running--;  (-1)
                 ->worker->flags |= WORKER_CPU_INTENSIVE;
           ....
         ->if (!(worker->flags & WORKER_NOT_RUNNING))
           ->pool->nr_running++;  (will not execute)
         ->worker->sleeping = 0;
         ....
     ->worker_clr_flags(worker, WORKER_CPU_INTENSIVE);
       ->pool->nr_running++;  (0)
   ....
   worker_set_flags(worker, WORKER_PREP);
     ->pool->nr_running--;  (-1)
   ....
   worker_enter_idle()
   ->WARN_ON_ONCE(pool->nr_workers == pool->nr_idle && pool->nr_running);
If nr_workers is equal to nr_idle, the WARN_ON_ONCE() triggers because nr_running is not zero.
[    2.460602] WARNING: CPU: 0 PID: 63 at kernel/workqueue.c:1999 worker_enter_idle+0xb2/0xc0
[    2.462163] Modules linked in:
[    2.463401] CPU: 0 PID: 63 Comm: kworker/0:2 Not tainted 6.4.0-rc2-next-20230519 #1
[    2.463771] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
[    2.465127] Workqueue: 0x0 (events)
[    2.465678] RIP: 0010:worker_enter_idle+0xb2/0xc0
...
[    2.472614] Call Trace:
[    2.473152]  <TASK>
[    2.474182]  worker_thread+0x71/0x430
[    2.474992]  ? _raw_spin_unlock_irqrestore+0x28/0x50
[    2.475263]  kthread+0x103/0x120
[    2.475493]  ? __pfx_worker_thread+0x10/0x10
[    2.476355]  ? __pfx_kthread+0x10/0x10
[    2.476635]  ret_from_fork+0x2c/0x50
[    2.477051]  </TASK>
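For context, the warning is the nr_running sanity check at the end of worker_enter_idle() in kernel/workqueue.c. A simplified sketch of that function, with the idle bookkeeping elided:

static void worker_enter_idle(struct worker *worker)
{
        struct worker_pool *pool = worker->pool;

        /* ... mark the worker idle and update pool->nr_idle ... */

        /*
         * Sanity check: once every worker in the pool is idle, no work
         * item can be running, so nr_running must have drained to zero.
         */
        WARN_ON_ONCE(pool->nr_workers == pool->nr_idle && pool->nr_running);
}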
This commit therefore adds a check of worker->sleeping in wq_worker_tick(): if worker->sleeping is non-zero, return directly.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Closes: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230519/tes...
Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Since this problem reproduces only about 3% of the time, Anders picked this up, applied it on top of linux-next, and ran 500 boot tests; all looked good. Thanks, Anders.
- Naresh
 kernel/workqueue.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 9c5c1cfa478f..a028b851333e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1051,7 +1051,7 @@ void wq_worker_running(struct task_struct *task)
 {
        struct worker *worker = kthread_data(task);
 
-       if (!worker->sleeping)
+       if (!READ_ONCE(worker->sleeping))
                return;
 
        /*
@@ -1071,7 +1071,7 @@ void wq_worker_running(struct task_struct *task)
         */
        worker->current_at = worker->task->se.sum_exec_runtime;
 
-       worker->sleeping = 0;
+       WRITE_ONCE(worker->sleeping, 0);
 }
 
 /**
@@ -1097,10 +1097,10 @@ void wq_worker_sleeping(struct task_struct *task)
        pool = worker->pool;
 
        /* Return if preempted before wq_worker_running() was reached */
-       if (worker->sleeping)
+       if (READ_ONCE(worker->sleeping))
                return;
 
-       worker->sleeping = 1;
+       WRITE_ONCE(worker->sleeping, 1);
        raw_spin_lock_irq(&pool->lock);
 
        /*
@@ -1143,8 +1143,13 @@ void wq_worker_tick(struct task_struct *task)
         * If the current worker is concurrency managed and hogged the CPU for
         * longer than wq_cpu_intensive_thresh_us, it's automatically marked
         * CPU_INTENSIVE to avoid stalling other concurrency-managed work items.
+        *
+        * A non-zero worker->sleeping means that the worker is voluntarily
+        * switching out and will not hog the CPU, or that the worker is
+        * running again but worker->sleeping has not yet been cleared in
+        * wq_worker_running().
         */
-       if ((worker->flags & WORKER_NOT_RUNNING) ||
+       if ((worker->flags & WORKER_NOT_RUNNING) || READ_ONCE(worker->sleeping) ||
            worker->task->se.sum_exec_runtime - worker->current_at <
            wq_cpu_intensive_thresh_us * NSEC_PER_USEC)
                return;
-- 
2.17.1
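As background on the annotations used above: worker->sleeping is written in task context but, with this patch, also read from the timer tick (hard-irq context), so every access is wrapped in READ_ONCE()/WRITE_ONCE() to keep the compiler from tearing, fusing, or re-reading the plain int. A minimal sketch of the pattern; struct example_worker and the two helpers below are hypothetical illustrations, not code from the patch:

#include <linux/compiler.h>
#include <linux/types.h>

struct example_worker {
        int sleeping;   /* written in task context, read from irq context */
};

/* Task context: about to voluntarily schedule out. */
static void example_mark_sleeping(struct example_worker *w)
{
        WRITE_ONCE(w->sleeping, 1);     /* single, non-torn store */
}

/*
 * Timer tick (irq context): can fire between any two task-context
 * statements; READ_ONCE() guarantees one untorn load that the
 * compiler cannot split or repeat.
 */
static bool example_tick_should_skip(struct example_worker *w)
{
        return READ_ONCE(w->sleeping);
}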
On Wed, May 24, 2023 at 07:23:16PM +0530, Naresh Kamboju wrote:
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> Tested-by: Anders Roxell <anders.roxell@linaro.org>
>
> Since this problem reproduces only about 3% of the time, Anders picked
> this up, applied it on top of linux-next, and ran 500 boot tests; all
> looked good. Thanks, Anders.
This was a tricky bug and I really appreciate the bug report and testing. Thank you so much.