On Fri, Oct 28, 2016 at 10:19:41AM +0200, Vincent Guittot wrote:
On 28 October 2016 at 10:13, Leo Yan <leo.yan@linaro.org> wrote:
On Thu, Oct 27, 2016 at 08:37:05PM +0100, Dietmar Eggemann wrote:
Hi Leo,
On 26/10/16 18:28, Leo Yan wrote:
o This patch series evaluates whether an rb tree can be used to track task load and utilization on the rq. One concern with this method is that an rb tree has O(log(N)) computational complexity, so maintaining it introduces extra overhead. To check this, hackbench is used for stress testing: hackbench spawns a large number of message sender and receiver tasks, which generates many enqueue and dequeue operations, so it can tell us whether the rb tree introduces a significant overhead or not (thanks a lot to Chris for suggesting this). A rough sketch of the bookkeeping this implies is shown after the next paragraph.
Another concern is that the scheduler already provides the LB_MIN feature; with LB_MIN enabled, the scheduler avoids migrating tasks with load < 16, which to some extent also helps single out big tasks for migration by filtering out the small ones. So we need to compare power data between this patch series and simply enabling LB_MIN.
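A minimal sketch of the rb-tree bookkeeping described above. This is not taken from the posted patches: the load_tree field in cfs_rq, the load_node field in sched_entity and the helper names are assumptions for illustration only; the rbtree calls themselves are the kernel's real API.

#include <linux/rbtree.h>
#include "sched.h"	/* kernel/sched/sched.h: struct cfs_rq, struct sched_entity */

/*
 * Assumed extra state (hypothetical, not in mainline):
 *   struct rb_root load_tree;   added to struct cfs_rq
 *   struct rb_node load_node;   added to struct sched_entity
 * Tasks are kept ordered by se->avg.load_avg, so each enqueue/dequeue
 * costs O(log N) and the biggest task is simply the rightmost node.
 */
static void big_task_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
	struct rb_node **link = &cfs_rq->load_tree.rb_node, *parent = NULL;

	while (*link) {
		struct sched_entity *entry;

		parent = *link;
		entry = rb_entry(parent, struct sched_entity, load_node);
		if (se->avg.load_avg < entry->avg.load_avg)
			link = &parent->rb_left;
		else
			link = &parent->rb_right;
	}

	rb_link_node(&se->load_node, parent, link);
	rb_insert_color(&se->load_node, &cfs_rq->load_tree);
}

static void big_task_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
	rb_erase(&se->load_node, &cfs_rq->load_tree);
}

static struct sched_entity *big_task_peek(struct cfs_rq *cfs_rq)
{
	struct rb_node *last = rb_last(&cfs_rq->load_tree);

	return last ? rb_entry(last, struct sched_entity, load_node) : NULL;
}

hackbench stresses exactly these two hooks, since every message wakes a receiver and blocks a sender, i.e. a constant stream of enqueues and dequeues.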
I have difficulties understanding the whole idea here. Basically, you're still doing classical load balancing (lb), with the aim of driving env->imbalance ((runnable) load based) to 0. On a system like Hikey (SMP), any ordering of the tasks (load or util related) can potentially change how many tasks a dst cpu might pull: in the case of an ordered list (large to small load) we potentially pull only one task, and it doesn't have to be the first one, because of 'task_h_load(p)/2 > env->imbalance' in case its load is smaller but close to env->imbalance/2. But how can this help to increase performance in a workload-agnostic way?
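For reference, the loop in question is detach_tasks() in kernel/sched/fair.c. The sketch below is abridged and paraphrased from the v4.4-era code (loop limits, throttling and preemption details are dropped), just to show where the LB_MIN and 'task_h_load(p)/2 > env->imbalance' checks sit:

/* Abridged/paraphrased sketch of detach_tasks(), kernel/sched/fair.c (v4.4 era) */
static int detach_tasks(struct lb_env *env)
{
	struct list_head *tasks = &env->src_rq->cfs_tasks;
	struct task_struct *p;
	unsigned long load;
	int detached = 0;

	while (!list_empty(tasks)) {
		p = list_first_entry(tasks, struct task_struct, se.group_node);

		if (!can_migrate_task(p, env))
			goto next;

		load = task_h_load(p);

		/* LB_MIN: skip tasks with load < 16 (the filter mentioned above) */
		if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
			goto next;

		/* skip a task whose load would overshoot the remaining imbalance */
		if ((load / 2) > env->imbalance)
			goto next;

		detach_task(p, env);
		list_add(&p->se.group_node, &env->tasks);

		detached++;
		env->imbalance -= load;

		/* stop once the imbalance is covered */
		if (env->imbalance <= 0)
			break;
next:
		list_move_tail(&p->se.group_node, tasks);
	}

	return detached;
}

Since the loop walks src_rq->cfs_tasks in list order and stops as soon as env->imbalance is covered, reordering that list (e.g. biggest load first) directly changes which tasks are skipped and how many end up being detached.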
On 4.4 the result is better than with the simple list. Vincent also suggested that I do the comparison on the mainline kernel, and the result shows my rb-tree patches introduce a performance regression:
mainline kernel:
  real    2m23.701s   user    1m4.500s    sys    4m34.604s

mainline kernel + fork regression patch:
  real    2m24.377s   user    1m3.952s    sys    4m39.928s
  real    2m19.100s   user    0m48.776s   sys    3m33.440s

mainline with big task tracking:
  real    2m28.837s   user    1m16.388s   sys    5m26.864s
  real    2m28.501s   user    1m18.104s   sys    5m30.516s
It would be interesting to understand where the huge difference between mainline above and your 1st test with v4.4 comes from. The 1st results on v4.4 were:

              real        user        system
  baseline    6m00.57s    1m41.72s    34m38.18s
  rb tree     5m55.79s    1m33.68s    34m08.38s
Is the difference linked to v4.4 vs mainline? A different version of hackbench? A different rootfs/distro? Something else?
I think two things are quite different between the v4.4 and mainline kernels:
- The mainline kernel does not have CPUFreq enabled, so presumably it always runs at 1.2GHz; the v4.4 kernel has the "interactive" governor enabled;
- v4.4 has the EAS and WALT related code merged, but when I tested I disabled EAS with "echo NO_ENERGY_AWARE > sched_features" and used PELT signals rather than WALT.
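As a side note on the NO_ENERGY_AWARE knob: in the EAS-patched trees of that time the energy-aware path is gated by a scheduler feature bit, roughly as sketched below (quoted from memory of the out-of-tree EAS code, so exact names and file locations may differ). Writing NO_ENERGY_AWARE to /sys/kernel/debug/sched_features clears the bit, which is why those runs fall back to the regular PELT-based path.

/* kernel/sched/features.h in EAS-patched trees (not in mainline at the time) */
SCHED_FEAT(ENERGY_AWARE, true)

/* Helper used to gate the energy-aware wake-up/balance paths */
static inline bool energy_aware(void)
{
	return sched_feat(ENERGY_AWARE);
}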
Thanks,
Leo Yan