On Thu, Oct 13, 2016 at 9:22 AM, Patrick Bellasi <patrick.bellasi@arm.com> wrote:
On 13-Oct 09:15, Andres Oportus wrote:
> On Thu, Oct 13, 2016 at 9:04 AM, Patrick Bellasi <patrick.bellasi@arm.com>
> wrote:
>
> > On 13-Oct 08:50, Andres Oportus wrote:
> > > On Thu, Oct 13, 2016 at 6:43 AM, Leo Yan <leo.yan@linaro.org> wrote:
> > >
> > > > On Thu, Oct 13, 2016 at 03:18:16PM +0200, Vincent Guittot wrote:
> > > > > On 13 October 2016 at 15:05, Patrick Bellasi <
> > patrick.bellasi@arm.com>
> > > > wrote:
> > > > > > On 10-Oct 16:35, Leo Yan wrote:
> > > > > >> Add two extra performance optimization methods by setting sysctl
> > > > > >> nodes:
> > > > > >>
> > > > > >> - Method 1: set sched_migration_cost_ns to 0:
> > > > > >>
> > > > > >>   By default sched_migration_cost_ns = 50000, the scheduler calls
> > > > > >
> > > > > > That's 50us...
> > > > >
> > > > > In fact the default value is 500000 = 500us, not 50000 = 50us.
> > > >
> > > > Sorry, it should be 500us.
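
(For reference, the default lives in kernel/sched/fair.c; the exact value may
differ across kernel versions, but in kernels of this era it is declared as
below, with the comment added here for illustration.)

    /* cost used by the load balancer's cache-hotness check, in ns */
    const_debug unsigned int sysctl_sched_migration_cost = 500000UL;  /* 0.5ms */
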
> > > >
> > > > > >
> > > > > >>   can_migrate_task() to check whether tasks are cache hot,
> > > > > >>   comparing against sched_migration_cost_ns to avoid migrating
> > > > > >>   tasks too frequently.
> > > > > >>
> > > > > >>   This has the side effect of packing tasks onto the same CPU and
> > > > > >>   adds latency when spreading tasks across multiple cores,
> > > > > >>   especially since energy aware scheduling already tends to pack
> > > > > >>   tasks onto a single CPU. So once tasks have been packed onto one
> > > > > >>   CPU with high utilization, they can be spread out more easily
> > > > > >>   after sched_migration_cost_ns is set to 0.
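
(For reference, the cache-hotness check the patch refers to lives in
kernel/sched/fair.c; a simplified sketch of task_hot() is below. Details vary
by kernel version, but it shows why setting the sysctl to 0 means a task is
never considered cache hot and can always be picked for migration.)

    /* Simplified sketch of task_hot(), not verbatim kernel code. */
    static int task_hot(struct task_struct *p, struct lb_env *env)
    {
            s64 delta;

            if (sysctl_sched_migration_cost == -1)
                    return 1;   /* every task is treated as cache hot */
            if (sysctl_sched_migration_cost == 0)
                    return 0;   /* no task is hot: migration is never blocked */

            /* Hot if the task ran on the source CPU within the last N ns. */
            delta = rq_clock_task(env->src_rq) - p->se.exec_start;
            return delta < (s64)sysctl_sched_migration_cost;
    }
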
> > > > > >
> > > > > > ... dunno exactly how this metric is used by the scheduler but,
> > > > > > according to its name and your explanation, it seems that in the
> > > > > > use-case you are targeting, tasks need to be migrated more often
> > > > > > than every 50us. Is that the case?
> > > > >
> > > > > The main advantage is that there is a sysctl entry for it, so it can
> > > > > be tuned for each platform.
> > > >
> > > > If we set this value to 0, the main benefit I can see is that if there
> > > > is an idle CPU and two runnable tasks on another CPU, it gives more
> > > > chance to migrate one of the runnable tasks onto the idle CPU
> > > > immediately.
> > > >
> > >
> > > The currently available mechanism to reduce latency and spread tasks is
> > > to set the prefer_idle flag (it spreads tasks in the corresponding cgroup
> > > as long as there are idle cores).
> >
> > That's what AOSP's kernels use for the wakeup path.
> >
>
> Wouldn't placement in the wakeup path possibly put a task on an idle CPU to
> begin with, and get the same or higher performance improvement compared to
> moving the task as part of load balancing?
>
>
> >
> > > Is it set in this experiment?  Isn't load balancing supposed to be more
> > > about throughput and hence "slower" to kick in and move tasks as needed?
> >
> > What Leo is addressing here is idle load balance: when a CPU is about to
> > become idle we would like to pull tasks from CPUs which have many runnable
> > tasks, "as soon as possible".
> >
> > It seems that, based on his experiments, the "migration cost" is
> > affecting the movement of some tasks by introducing latencies.
> > By tuning the migration cost value (actually setting it to 0), _some_
> > benchmarks have been measured to get a performance uplift.
> >
> > What we need to understand (in the first instance) is how "generic"
> > (i.e. platform and workload independent) the proposed solution is.
> >
>
> I agree. I would think that migration cost tuning could improve load
> balancing behavior, but setting it to 0, effectively saying that there is
> no cost to moving tasks as part of load balancing, seems incorrect.  I'm
> wondering if this is a scenario that could be improved/tuned, say, in the
> wakeup path rather than by assuming no migration cost.

Independently of how smart the wakeup path is, there can still be
cases in which you end up with a CPU going idle while another
one has more than one RUNNABLE task in its RQ.

Moreover, consider that in a loaded system we bail out of EAS
mode and use the "normal" scheduler.

In both cases the load balancer (both idle balance and active balance)
is still a valuable opportunity to migrate tasks among CPUs.

Agreed.
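
(To make the "bail out" concrete: the EAS series gates energy-aware wakeup
placement on the root domain's overutilized flag, roughly as sketched below.
This is not verbatim kernel code, and the two helper names are placeholders
for illustration only.)

    /* Rough sketch of the EAS bail-out in the wakeup path. */
    if (energy_aware() && !cpu_rq(prev_cpu)->rd->overutilized)
            target = find_energy_aware_target(p, prev_cpu);  /* EAS placement */
    else
            target = find_load_based_target(p, prev_cpu);    /* "normal" path */
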

> > > > > > As a general comment, I can understand that a hardcoded 50us value
> > > > > > might not be generic at all; however, is there any indication on
> > > > > > how to properly dimension this value for a specific target?
> > > > > >
> > > > > > Maybe a specific set of synthetic experiments can be used to figure
> > > > > > out the best value to use.
> > > > > > In that case we should probably describe in the documentation how
> > > > > > to measure and tune this value experimentally, instead of just
> > > > > > replacing one hardcoded value with another.
> > > > > >
> > > > > >> - Method 2: set busy_factor to 1:
> > > > > >>
> > > > > >>   This decreases the load balance interval time, so it gives more
> > > > > >>   chance for active load balance to migrate a running task from a
> > > > > >>   little core to a big core.
> > > > > >
> > > > > > Same reasoning as before: how can we be sure that the value you are
> > > > > > proposing (i.e. busy_factor=1) is really generic enough?
> > > > > >
> > > > > >> Method 1 gives a prominent performance improvement on one
> > > > > >> big.LITTLE system (which has CA53x4 + CA72x4 cores); from the
> > > > > >> Geekbench testing result the score improves by ~5%.
> > > > > >>
> > > > > >> Method 1 was also tested with Geekbench on the ARM Juno R2 board
> > > > > >> for the multi-thread case; the score improves from 2348 to 2368,
> > > > > >> i.e. ~0.84% better performance.
> > > > > >
> > > > > > Am I correct in assuming that different values could potentially
> > > > > > give us even better performance, but only the two values you are
> > > > > > proposing have been tried and tested?
> > > >
> > > > Yes, I only tried these two values. As Patrick suggested, the
> > > > methodology is more important than the hard-coded value.
> > > >
> > > > > For the 1st test, the root cause was that a task was hot on a CPU and
> > > > > couldn't be selected to migrate to another CPU because of its hotness,
> > > > > so decreasing sched_migration_cost_ns directly reduces the hotness
> > > > > period during which a task can't migrate to another CPU.
> > > > >
> > > > > That being said, I'm not sure that this should be put in
> > > > > Documentation/scheduler/sched-energy.txt
> > > >
> > > > I think for EAS these two parameters are more important than for
> > > > traditional SMP load balance. Because EAS is more likely to pack tasks
> > > > onto a single CPU or into one cluster, we need to utilize the existing
> > > > mechanisms to spread out these tasks, so sched_migration_cost_ns and
> > > > busy_factor are two things we can rely on.
> > > >
> > > > > > Moreover, do we have any measure of the impact on energy
> > > > > > consumption for the proposed value?
> > > >
> > > > On one member's platform we have not observed any impact on energy
> > > > consumption. I will try to use a video playback case on Juno to
> > > > generate more power data.
> > > >
> > > > > >> Method 2 was tested on Juno as well, but it gives only a very
> > > > > >> minor performance boost.
> > > > > >
> > > > > > That seems to support the idea that the values you are proposing
> > > > > > are "optimal" only for performance on a specific platform. Isn't it?
> > > >
> > > > Yes.
> > > >
> > > > > >> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > > > > >> ---
> > > > > >>  Documentation/scheduler/sched-energy.txt | 24 ++++++++++++++++++++++++
> > > > > >>  1 file changed, 24 insertions(+)
> > > > > >>
> > > > > >> diff --git a/Documentation/scheduler/sched-energy.txt b/Documentation/scheduler/sched-energy.txt
> > > > > >> index dab2f90..c0e62fe 100644
> > > > > >> --- a/Documentation/scheduler/sched-energy.txt
> > > > > >> +++ b/Documentation/scheduler/sched-energy.txt
> > > > > >> @@ -360,3 +360,27 @@ of the cpu from idle/busy power of the shared resources. The cpu can be tricked
> > > > > >>  into different per-cpu idle states by disabling the other states. Based on
> > > > > >>  various combinations of measurements with specific cpus busy and disabling
> > > > > >>  idle-states it is possible to extrapolate the idle-state power.
> > > > > >> +
> > > > > >> +Performance tuning methods
> > > > > >> +==========================
> > > > > >> +
> > > > > >> +The settings below may have a heavy impact on performance:
> > > > > >> +
> > > > > >> +echo 0 > /proc/sys/kernel/sched_migration_cost_ns
> > > > > >> +
> > > > > >> +Setting sched_migration_cost_ns to 0 helps to spread tasks within the big
> > > > > >> +cluster. Otherwise, when the scheduler executes load balance, it calls
> > > > > >> +can_migrate_task() to check whether tasks are cache hot, comparing against
> > > > > >> +sched_migration_cost_ns to avoid migrating tasks too frequently. This has the
> > > > > >> +side effect of packing tasks onto the same CPU and adds latency when
> > > > > >> +spreading tasks across multiple cores, especially since energy aware
> > > > > >> +scheduling tends to pack tasks onto a single CPU.
> > > > > >> +
> > > > > >> +echo 1 > /proc/sys/kernel/sched_domain/cpuX/domain0/busy_factor
> > > > > >> +echo 1 > /proc/sys/kernel/sched_domain/cpuX/domain1/busy_factor
> > > > > >> +
> > > > > >> +Setting busy_factor to 1 decreases the load balance interval time. So if we
> > > > > >> +take min_interval = 8, the load balance interval becomes
> > > > > >> +busy_factor * min_interval = 8ms. This shortens task migration latency,
> > > > > >> +especially when we want to migrate a running task from a little core to a
> > > > > >> +big core via active load balance.
> > > > > >> --
> > > > > >> 1.9.1
> > > > > >>
> > > > > >
> > > > > > --
> > > > > > #include <best/regards.h>
> > > > > >
> > > > > > Patrick Bellasi
> >
> >
> > --
> > #include <best/regards.h>
> >
> > Patrick Bellasi
> >

--
#include <best/regards.h>

Patrick Bellasi