Hi Harshit,
On 25/02/25 18:05, Harshit Agarwal wrote:
> Overview
>
> When a CPU calls push_rt_task() and picks a task to push to another
> CPU's runqueue, it calls find_lock_lowest_rq(), which takes a double
> lock on both CPUs' runqueues. If one of the locks isn't readily
> available, it may drop the current runqueue lock and reacquire both
> locks at once. During this window it is possible that the task has
> already migrated and is running on some other CPU. These cases are
> already handled. However, if the task has migrated, has already
> finished running, and another CPU is now trying to wake it up (ttwu)
> so that it is queued again on a runqueue (on_rq is 1), and if it ends
> up queued on the same CPU it was being pushed from, then the current
> checks will pass even though the task was migrated out and is no
> longer in the pushable tasks list.
>
> ...
> kernel/sched/rt.c | 54 +++++++++++++++++++++++------------------------
> 1 file changed, 26 insertions(+), 28 deletions(-)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 4b8e33c615b1..4762dd3f50c5 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1885,6 +1885,27 @@ static int find_lowest_rq(struct task_struct *task)
>  	return -1;
>  }
>
> +static struct task_struct *pick_next_pushable_task(struct rq *rq)
> +{
> +	struct task_struct *p;
> +
> +	if (!has_pushable_tasks(rq))
> +		return NULL;
> +
> +	p = plist_first_entry(&rq->rt.pushable_tasks,
> +			      struct task_struct, pushable_tasks);
> +
> +	BUG_ON(rq->cpu != task_cpu(p));
> +	BUG_ON(task_current(rq, p));
> +	BUG_ON(task_current_donor(rq, p));
> +	BUG_ON(p->nr_cpus_allowed <= 1);
> +
> +	BUG_ON(!task_on_rq_queued(p));
> +	BUG_ON(!rt_task(p));
> +
> +	return p;
> +}
> +
>  /* Will lock the rq it finds */
>  static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
>  {
> @@ -1915,18 +1936,16 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
>  			/*
>  			 * We had to unlock the run queue. In
>  			 * the mean time, task could have
> -			 * migrated already or had its affinity changed.
> -			 * Also make sure that it wasn't scheduled on its rq.
> +			 * migrated already or had its affinity changed,
> +			 * therefore check if the task is still at the
> +			 * head of the pushable tasks list.
>  			 * It is possible the task was scheduled, set
>  			 * "migrate_disabled" and then got preempted, so we must
>  			 * check the task migration disable flag here too.
>  			 */
> -			if (unlikely(task_rq(task) != rq ||
> +			if (unlikely(is_migration_disabled(task) ||
>  				     !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) ||
> -				     task_on_cpu(rq, task) ||
> -				     !rt_task(task) ||
> -				     is_migration_disabled(task) ||
> -				     !task_on_rq_queued(task))) {
> +				     task != pick_next_pushable_task(rq))) {
>
>  				double_unlock_balance(rq, lowest_rq);
>  				lowest_rq = NULL;
> @@ -1946,27 +1965,6 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
>  	return lowest_rq;
>  }
>
> -static struct task_struct *pick_next_pushable_task(struct rq *rq)
> -{
> -	struct task_struct *p;
> -
> -	if (!has_pushable_tasks(rq))
> -		return NULL;
> -
> -	p = plist_first_entry(&rq->rt.pushable_tasks,
> -			      struct task_struct, pushable_tasks);
> -
> -	BUG_ON(rq->cpu != task_cpu(p));
> -	BUG_ON(task_current(rq, p));
> -	BUG_ON(task_current_donor(rq, p));
> -	BUG_ON(p->nr_cpus_allowed <= 1);
> -
> -	BUG_ON(!task_on_rq_queued(p));
> -	BUG_ON(!rt_task(p));
> -
> -	return p;
> -}
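To restate the window being closed, here is a quick sketch of the
interleaving as I read the changelog above (my own reconstruction, so
the exact sequence of events may differ):

  CPU0 (pusher)                         elsewhere
  -------------                         ---------
  push_rt_task()
    next_task = pick_next_pushable_task(rq)
    find_lock_lowest_rq(next_task, rq)
      double_lock_balance(rq, lowest_rq)
        releases rq->lock
                                        next_task migrates away, runs
                                        and blocks; a later ttwu()
                                        enqueues it on this rq again
                                        (on_rq == 1)
        retakes both locks
      old checks (task_rq(task) == rq,
      queued, not running, rt_task,
      !migration_disabled) all pass,
      yet next_task may no longer be
      on rq's pushable list

With the change, the whole validity question collapses into "is this
task still exactly what pick_next_pushable_task(rq) would return?",
which subsumes each condition the old check listed (the BUG_ON()s in
the helper document those invariants).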
As usual, we have essentially the same code in deadline.c. Do you think we should/could implement the same fix proactively there as well? Steve?
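Just to make the suggestion concrete, and purely as an untested
strawman on my side: assuming pick_next_pushable_dl_task() were
similarly moved above find_lock_later_rq(), the equivalent re-check
there would presumably collapse to something like:

	if (unlikely(is_migration_disabled(task) ||
		     !cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) ||
		     task != pick_next_pushable_dl_task(rq))) {
		double_unlock_balance(rq, later_rq);
		later_rq = NULL;
		break;
	}

(names taken from today's deadline.c; I haven't checked whether
anything deadline-specific needs extra care in that path).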
Thanks,
Juri