Hi Leo,
On 26/10/16 18:28, Leo Yan wrote:
o This patch series evaluates whether an rb tree can be used to track task load and util on the rq. One concern with this method is that an rb tree has O(log(N)) computation complexity, so maintaining it introduces extra overhead. To address this concern we stress test with hackbench: hackbench spawns many message sender and receiver tasks, which produces a large number of enqueue and dequeue operations, so it is a good way to find out whether the rb tree maintenance introduces a big workload or not (thanks a lot to Chris for suggesting this).
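For illustration, here is a minimal sketch of what such enqueue-time tracking could look like with the kernel rbtree API. The rq 'load_tree' and task_struct 'load_node' fields and the helper name are hypothetical, made up for this sketch and not taken from the series itself:

/*
 * Hypothetical sketch: keep the rq's tasks in an rb tree ordered by
 * task_h_load(), with the biggest task found at rb_first().
 * 'load_tree' and 'load_node' are invented names for illustration.
 */
static void rq_load_tree_enqueue(struct rq *rq, struct task_struct *p)
{
	struct rb_node **link = &rq->load_tree.rb_node, *parent = NULL;
	unsigned long load = task_h_load(p);

	while (*link) {
		struct task_struct *entry;

		parent = *link;
		entry = rb_entry(parent, struct task_struct, load_node);
		/* Descend left for bigger loads so rb_first() is the biggest. */
		if (load >= task_h_load(entry))
			link = &parent->rb_left;
		else
			link = &parent->rb_right;
	}
	rb_link_node(&p->load_node, parent, link);
	rb_insert_color(&p->load_node, &rq->load_tree);
}

This is the O(log(N)) insertion cost the series is worried about, paid on every enqueue and dequeue.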
Another concern is that the scheduler already provides the LB_MIN feature; with LB_MIN enabled the scheduler avoids migrating tasks with load < 16. To some extent this also helps to select bigger tasks for migration by filtering out tiny ones. So we need to compare power data between this patch series and simply enabling LB_MIN (the existing filter is sketched below).
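For reference, the LB_MIN filter in detach_tasks() looks roughly like this in kernels of this vintage (abridged from kernel/sched/fair.c):

	load = task_h_load(p);

	/* Skip tiny tasks unless lb has already failed repeatedly. */
	if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
		goto next;

	/* Skip a task whose half-load overshoots the remaining imbalance. */
	if ((load / 2) > env->imbalance)
		goto next;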
I have difficulty understanding the whole idea here. Basically, you're still doing classical load balancing (lb), with the aim of driving env->imbalance (which is (runnable) load based) to 0. On a system like Hikey (SMP), any ordering of the tasks (load or util related) can potentially change how many tasks a dst cpu might pull. With a list ordered from large to small load we potentially pull only one task, and it doesn't have to be the first one: the 'task_h_load(p)/2 > env->imbalance' check skips any task whose load exceeds twice env->imbalance, so we pull the first task whose load is smaller than, but possibly close to, that bound. But how can this help to increase performance in a workload-agnostic way? Besides this periodic lb case, what's the benefit of potentially pulling the biggest task instead of an arbitrary one in active lb?
Now if we think about a big.LITTLE system and we are on DIE sd level, an ordered list (large to small) on a LITTLE cpu, and the reverse order on a big one, could potentially make sense. It's not very scalable though, because we're just exploiting the idea of _two_ items with an opposite property (big vs. LITTLE) and a different order for each.
Another remark: where do you see the benefit of using util instead of load? In both cases you do 'task_h_load(p)/2 > env->imbalance', 'env->imbalance -= task_h_load(p)' and 'if (env->imbalance <= 0) break', i.e. a load based policy anyway (see the sketch below).
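To make the point concrete, here is a minimal skeleton of the detach_tasks() policy being referred to (abridged; other filters and bookkeeping omitted, and pick_next_candidate() is a hypothetical stand-in for however the next task is chosen from the list or tree):

	struct task_struct *p;

	/*
	 * Regardless of whether the candidates are ordered by load or
	 * by util, the accounting that decides when to stop pulling is
	 * (runnable) load based.
	 */
	while ((p = pick_next_candidate(env))) {	/* hypothetical helper */
		unsigned long load = task_h_load(p);

		/* Skip a task whose half-load overshoots the imbalance. */
		if ((load / 2) > env->imbalance)
			continue;

		detach_task(p, env);
		env->imbalance -= load;

		if (env->imbalance <= 0)
			break;
	}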
The LB_MIN thing (avoiding considering tiny tasks for pulling) is a related idea, but IMHO aren't we seeing results here which are pretty much inside the noise cloud for a particular system/workload combination?
Let's talk about it further in tomorrow's meeting.
-- Dietmar
o Testing results:
Tested hackbench on Hikey (8x Cortex-A53 CPUs) with SMP load balancing:
time sh -c 'for i in `seq 100`; do /data/hackbench -p -P > /dev/null; done'

            real        user        system
baseline    6m00.57s    1m41.72s    34m38.18s
rb tree     5m55.79s    1m33.68s    34m08.38s
For the hackbench test case we can see that with the rb tree the result is even better than with the baseline kernel.
Tested video playback on Juno for LB_MIN vs rb tree:
LB_MIN      Nrg:LITTLE    Nrg:Big      Nrg:Sum
            11.3122       8.983429     20.295629
            11.337446     8.174061     19.511507
            11.256941     8.547895     19.804836
            10.994329     9.633028     20.627357
            11.483148     8.522364     20.005512
avg.        11.2768128    8.7721554    20.0489682
rb tree     Nrg:LITTLE    Nrg:Big      Nrg:Sum
            11.384301     8.412714     19.797015
            11.673992     8.455219     20.129211
            11.586081     8.414606     20.000687
            11.423509     8.64781      20.071319
            11.43709      8.595252     20.032342
avg.        11.5009946    8.5051202    20.0061148
vs LB_MIN   +1.99%        -3.04%       -0.21%
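(The 'vs LB_MIN' row is the relative change of the rb tree averages over the LB_MIN averages, e.g. for the LITTLE cluster: 11.5009946 / 11.2768128 - 1 ~= +1.99%.)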
o Known issues:
For patch 2, function detach_tasks() iterates the rb tree for tasks; after one task has been detached it calls rb_first() to fetch the first node again and restarts the iteration from there. It would be better to use rb_next(), but changing the code to use rb_next() introduces a panic (see the sketch below).
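If the panic comes from calling rb_next() on a node that has already been erased, the usual pattern is to fetch the successor before detaching. A minimal sketch, reusing the hypothetical load_tree/load_node names from the earlier snippet:

	struct rb_node *node = rb_first(&env->src_rq->load_tree);

	while (node) {
		/*
		 * Fetch the successor before the node can be erased: a
		 * detached node's parent/child pointers are no longer
		 * valid, so rb_next() must not be called on it afterwards.
		 */
		struct rb_node *next = rb_next(node);
		struct task_struct *p = rb_entry(node, struct task_struct,
						 load_node);
		unsigned long load = task_h_load(p);

		if ((load / 2) <= env->imbalance && can_migrate_task(p, env)) {
			detach_task(p, env);	/* removes p's load_node */
			env->imbalance -= load;
			if (env->imbalance <= 0)
				break;
		}
		node = next;
	}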
Any suggestions for a better implementation are welcome.
Leo Yan (3):
  sched/fair: support to track biggest task on rq
  sched/fair: select biggest task for migration
  sched: remove unused rq::cfs_tasks
 include/linux/sched.h        |   1 +
 include/linux/sched/sysctl.h |   1 +
 kernel/sched/core.c          |   2 -
 kernel/sched/fair.c          | 123 ++++++++++++++++++++++++++++++++++++-------
 kernel/sched/sched.h         |   5 +-
 kernel/sysctl.c              |   7 +++
 6 files changed, 116 insertions(+), 23 deletions(-)