From: Zqiang <qiang.zhang@linux.dev>
[ Upstream commit 36c6f3c03d104faf1aa90922f2310549c175420f ]
On PREEMPT_RT kernels, kick_cpus_irq_workfn() is invoked in the per-CPU irq_work/* task context, where no RCU read-side critical section protects it. This commit therefore uses IRQ_WORK_INIT_HARD() to initialize the per-CPU rq->scx.kick_cpus_irq_work in init_sched_ext_class().
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM-generated explanations, may be completely bogus:
1. **Commit Message Analysis**
   - **Problem:** On `PREEMPT_RT` kernels, `irq_work` items initialized with `init_irq_work()` execute in a threaded (preemptible) context where RCU read-side critical sections are not implicit. The work function `kick_cpus_irq_workfn()` accesses per-CPU runqueue state (`rq->scx`), which requires RCU protection or disabled preemption to be safe.
   - **Solution:** The commit changes the initialization to `IRQ_WORK_INIT_HARD()`. This macro sets the `IRQ_WORK_HARD_IRQ` flag, forcing the work item to execute in hard interrupt context even on `PREEMPT_RT` kernels.
   - **Keywords:** "PREEMPT_RT", "RCU-read critical section", "initialize".
   - **Tags:** No explicit `Cc: stable` or `Fixes:` tag in the provided text, but the nature of the fix (correctness on RT) makes it a strong stable candidate.
2. **Deep Code Research**
   - **Code Context:** The affected file `kernel/sched/ext.c` belongs to the `sched_ext` (extensible scheduler class) subsystem.
   - **Technical Mechanism:** On non-RT kernels, `irq_work` runs from the self-IPI in hard interrupt context, where RCU read-side protection is implicit. On `PREEMPT_RT`, unflagged work items are instead deferred to the per-CPU `irq_work/*` kthreads to reduce latency, which removes that implicit protection. Accessing scheduler runqueues (`rq`) without it can lead to use-after-free or data corruption, for example if a CPU goes offline concurrently.
   - **The Fix:** `IRQ_WORK_INIT_HARD()` is the standard mechanism for opting specific work items out of threaded execution when they require hard-IRQ semantics (atomic execution, implicit RCU protection). This is a well-understood pattern in the kernel.
   - **Subsystem Status:** `sched_ext` was merged in v6.12, so this fix is applicable to stable kernels v6.12 and newer.
3. **Stable Kernel Rules Evaluation**
   - **Fixes a real bug?** Yes. It fixes a correctness issue on `PREEMPT_RT` kernels that could lead to crashes.
   - **Obviously correct?** Yes. The fix uses a standard kernel primitive to enforce the required execution context.
   - **Small and contained?** Yes. It is a one-line change to an initialization function.
   - **No new features?** Yes. It only corrects the behavior of existing code.
   - **Regression risk:** Low. It forces hard-IRQ execution, which is generally safe for `irq_work` provided the work function is fast (as CPU kicking typically is).
4. **Conclusion**
   This commit is a textbook stable backport candidate. It addresses a correctness issue in the interaction between a specific subsystem (`sched_ext`) and the `PREEMPT_RT` configuration. The fix is minimal, surgical, and necessary to prevent potential crashes. While it applies only to kernels containing `sched_ext` (v6.12+), it is important for users running that configuration.
**YES**
 kernel/sched/ext.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index fa64fdb6e9796..6f8ef62c8216c 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -5281,7 +5281,7 @@ void __init init_sched_ext_class(void)
 		BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_preempt, GFP_KERNEL, n));
 		BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_wait, GFP_KERNEL, n));
 		rq->scx.deferred_irq_work = IRQ_WORK_INIT_HARD(deferred_irq_workfn);
-		init_irq_work(&rq->scx.kick_cpus_irq_work, kick_cpus_irq_workfn);
+		rq->scx.kick_cpus_irq_work = IRQ_WORK_INIT_HARD(kick_cpus_irq_workfn);

 		if (cpu_online(cpu))
 			cpu_rq(cpu)->scx.flags |= SCX_RQ_ONLINE;