(struct sg_lb_stats).idle_cpus is of type 'unsigned int'. (local->idle_cpus - busiest->idle_cpus) can underflow, wrapping around to UINT_MAX for instance, and max_t(long, 0, UINT_MAX) will output UINT_MAX rather than 0.
Use lsub_positive() instead of max_t().
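For illustration only (not part of the patch), a minimal stand-alone sketch of the same arithmetic, with an open-coded stand-in for lsub_positive(), behaves as follows on a system with a 64-bit long:

#include <stdio.h>

int main(void)
{
	unsigned int local_idle = 2, busiest_idle = 3;

	/* Old formula: the subtraction is done in unsigned arithmetic and
	 * wraps to UINT_MAX before the clamp against 0 is applied, so with
	 * a 64-bit long the clamp never triggers. */
	long old_imbalance = (long)(local_idle - busiest_idle);
	if (old_imbalance < 0)
		old_imbalance = 0;

	/* New formula: subtract at most what is actually available, which
	 * is what lsub_positive() does. */
	unsigned long new_imbalance = local_idle;
	new_imbalance -= (new_imbalance < busiest_idle) ? new_imbalance : busiest_idle;

	/* Prints "old: 4294967295, new: 0". */
	printf("old: %ld, new: %lu\n", old_imbalance, new_imbalance);
	return 0;
}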
Fixes: 0b0695f2b34a ("sched/fair: Rework load_balance()")
cc: stable@vger.kernel.org
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
---
 kernel/sched/fair.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9057584ec06d..6d9124499f52 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10775,8 +10775,8 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 			 * idle CPUs.
 			 */
 			env->migration_type = migrate_task;
-			env->imbalance = max_t(long, 0,
-					       (local->idle_cpus - busiest->idle_cpus));
+			env->imbalance = local->idle_cpus;
+			lsub_positive(&env->imbalance, busiest->idle_cpus);
 		}
 
 #ifdef CONFIG_NUMA
Hi Pierre
On Fri, 13 Sept 2024 at 10:58, Pierre Gondois pierre.gondois@arm.com wrote:
(struct sg_lb_stats).idle_cpus is of type 'unsigned int'. (local->idle_cpus - busiest->idle_cpus) can underflow, wrapping around to UINT_MAX for instance, and max_t(long, 0, UINT_MAX) will output UINT_MAX rather than 0.
Use lsub_positive() instead of max_t().
Have you faced the problem, or is this patch based on code review?
We have the check below in sched_balance_find_src_group(), which should ensure that local->idle_cpus > busiest->idle_cpus:
		if (busiest->group_weight > 1 &&
		    local->idle_cpus <= (busiest->idle_cpus + 1)) {
			/*
			 * If the busiest group is not overloaded
			 * and there is no imbalance between this and busiest
			 * group wrt idle CPUs, it is balanced. The imbalance
			 * becomes significant if the diff is greater than 1
			 * otherwise we might end up to just move the imbalance
			 * on another group. Of course this applies only if
			 * there is more than 1 CPU per group.
			 */
			goto out_balanced;
		}
Hello Vincent,
On 9/13/24 18:14, Vincent Guittot wrote:
Hi Pierre
On Fri, 13 Sept 2024 at 10:58, Pierre Gondois pierre.gondois@arm.com wrote:
(struct sg_lb_stats).idle_cpus is of type 'unsigned int'. (local->idle_cpus - busiest->idle_cpus) can underflow, wrapping around to UINT_MAX for instance, and max_t(long, 0, UINT_MAX) will output UINT_MAX rather than 0.
Use lsub_positive() instead of max_t().
Have you faced the problem, or is this patch based on code review?
We have the check below in sched_balance_find_src_group(), which should ensure that local->idle_cpus > busiest->idle_cpus:
		if (busiest->group_weight > 1 &&
		    local->idle_cpus <= (busiest->idle_cpus + 1)) {
			/*
			 * If the busiest group is not overloaded
			 * and there is no imbalance between this and busiest
			 * group wrt idle CPUs, it is balanced. The imbalance
			 * becomes significant if the diff is greater than 1
			 * otherwise we might end up to just move the imbalance
			 * on another group. Of course this applies only if
			 * there is more than 1 CPU per group.
			 */
			goto out_balanced;
		}
It was with this setup:
- busiest->group_type == group_overloaded

so the check above might not have been evaluated. I can check the exact path if desired.
Regards, Pierre
On Fri, 13 Sept 2024 at 19:36, Pierre Gondois pierre.gondois@arm.com wrote:
Hello Vincent,
On 9/13/24 18:14, Vincent Guittot wrote:
Hi Pierre
On Fri, 13 Sept 2024 at 10:58, Pierre Gondois pierre.gondois@arm.com wrote:
(struct sg_lb_stats).idle_cpus is of type 'unsigned int'. (local->idle_cpus - busiest->idle_cpus) can underflow, wrapping around to UINT_MAX for instance, and max_t(long, 0, UINT_MAX) will output UINT_MAX rather than 0.
Use lsub_positive() instead of max_t().
Have you faced the problem, or is this patch based on code review?
We have the check below in sched_balance_find_src_group(), which should ensure that local->idle_cpus > busiest->idle_cpus:
		if (busiest->group_weight > 1 &&
		    local->idle_cpus <= (busiest->idle_cpus + 1)) {
			/*
			 * If the busiest group is not overloaded
			 * and there is no imbalance between this and busiest
			 * group wrt idle CPUs, it is balanced. The imbalance
			 * becomes significant if the diff is greater than 1
			 * otherwise we might end up to just move the imbalance
			 * on another group. Of course this applies only if
			 * there is more than 1 CPU per group.
			 */
			goto out_balanced;
		}
It was with this setup:
- busiest->group_type == group_overloaded
so the check above might not have been evaluated. I can check the exact path if desired.
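To make the scenario concrete, here is a self-contained toy model (not the kernel code; the group classification and surrounding logic are simplified assumptions) of how an overloaded busiest group can bypass the check quoted above and then hit the wrap-around in the old imbalance formula:

#include <stdio.h>

enum group_type { group_has_spare, group_overloaded };

struct stats {
	enum group_type group_type;
	unsigned int group_weight;
	unsigned int idle_cpus;
};

/* Simplified model: the out_balanced check is only applied when the
 * busiest group is not overloaded. */
static long old_imbalance(const struct stats *local, const struct stats *busiest)
{
	if (busiest->group_type != group_overloaded &&
	    busiest->group_weight > 1 &&
	    local->idle_cpus <= (busiest->idle_cpus + 1))
		return 0; /* would have gone out_balanced */

	/* Old formula: the unsigned subtraction wraps before the clamp. */
	long imb = (long)(local->idle_cpus - busiest->idle_cpus);
	return imb > 0 ? imb : 0; /* the max_t(long, 0, ...) clamp */
}

int main(void)
{
	struct stats local   = { group_has_spare,  4, 2 };
	struct stats busiest = { group_overloaded, 4, 3 };

	/* Prints 4294967295 with a 64-bit long: the overloaded busiest group
	 * skipped the idle-CPUs check, so the wrap-around is reachable. */
	printf("imbalance = %ld\n", old_imbalance(&local, &busiest));
	return 0;
}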
I'm curious that we never faced this before, as the change is almost 4 years old now, but there are probably some topologies that can end up in such a situation.
Also, the Fixes tag should be:

Fixes: 16b0a7a1a0af ("sched/fair: Ensure tasks spreading in LLC during LB")

Before this commit, group_overloaded was not comparing the number of tasks.
With the correct Fixes tag:

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>