Hi Steve,
I took a week's holiday, sorry for the late response.
On Sun, Nov 15, 2015 at 11:33:02AM -0800, Steve Muckle wrote:
> Hi Leo,
>
> On 11/11/2015 07:15 AM, Leo Yan wrote:
> > If we pack task_A onto a busy CPU, we may pay a power penalty caused
> > by the higher OPP; on the other hand, if we spread task_A to an idle
> > CPU (whose cluster may also be in an idle state), we pay a power
> > penalty for powering up an extra power domain.
> >
> > So I think we can enhance the energy calculation when waking up a
> > task in energy_aware_wake_cpu(). For example, we can select two
> > candidate CPUs for the woken task: one in the same scheduling group
> > as the task's original CPU, and another in a different scheduling
> > group (the group with the best, or equal, power efficiency in the
> > system). Then we can decide whether to spread the task to a different
> > cluster, to a different CPU in the same cluster, or to leave it on
> > the original CPU.
> When placing a waking task I'd think you only need to evaluate one CPU
> in each cluster:
>
> - If there are utilized CPUs in the cluster with enough free capacity
>   at the current OPP to fit the waking task, then one of these CPUs
>   should be evaluated as the candidate (it's debatable which one IMO,
>   perhaps the one with the most capacity, but perhaps also the least
>   to better pack CPUs).
> - If no busy CPUs have enough free capacity at the current OPP to
>   contain the waking task, then the least utilized CPU in the cluster.
>
> Looking briefly at patch 32 of EASv5 (energy-aware task placement)
> this seems to be what is done, but we're only evaluating the smallest
> cluster that can fit the task, and we are also evaluating the task's
> prev CPU.
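If I read the per-cluster rule correctly, it amounts to something like
the following sketch (Python pseudocode; all the names here are
illustrative, not actual EAS code):

```python
from dataclasses import dataclass

@dataclass
class Cpu:
    util: int       # current utilization of the CPU
    cap_curr: int   # capacity available at the current OPP

def free_cap(cpu):
    return cpu.cap_curr - cpu.util

def pick_candidate(cluster, task_util):
    """Pick the one CPU in the cluster worth evaluating for the task."""
    # Busy CPUs that can still fit the task at their current OPP.
    fitting = [i for i, c in enumerate(cluster)
               if c.util > 0 and free_cap(c) >= task_util]
    if fitting:
        # Debatable choice: here, the one with the most free capacity.
        return max(fitting, key=lambda i: free_cap(cluster[i]))
    # Otherwise, fall back to the least utilized CPU in the cluster.
    return min(range(len(cluster)), key=lambda i: cluster[i].util)
```

So each cluster contributes exactly one candidate, and the final choice
is made between those candidates.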
In patch 32 of EASv5 there is an assumption: it will "find group with
sufficient capacity". This works well on a big.LITTLE system, but it
will spread tasks across two clusters on an SMP platform.
> Perhaps we could stop evaluating the prev CPU (if it's not the best
> CPU in the cluster as described above). Instead, we could evaluate
> placing the task in at least one other cluster in the system, and
> choose between those options.
Thanks for the suggestion; agreed. We can select CPUs with the below
priority (from high to low):

- Select CPUs in the most power efficient groups (so there may be more
  than one group on an SMP platform);
- Select CPUs with the lowest OPP that meets the capacity requirement;
- Select CPUs with the highest utilization (as you said, here we try to
  use the least free one; I think this is more suitable for the rt-app
  cases, since even rt-app-6 will take 35% of a CPU's utilization when
  the CPU runs at the lowest OPP);
- Select the CPU with the lowest CPU ID.

If you see no obvious logic error here, I will try it in the next 1~2
weeks and post results after finishing the related testing.
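The priority order above could be sketched as a single comparison key
(Python pseudocode; the fields and names are illustrative only):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    cpu_id: int
    group_rank: int  # power-efficiency rank of the CPU's group (0 = best)
    opp: int         # lowest OPP index that meets the capacity requirement
    util: int        # current utilization

def select_cpu(candidates):
    # Priority, high to low: most efficient group, lowest sufficient
    # OPP, highest utilization (to pack), lowest CPU ID as tie-break.
    return min(candidates,
               key=lambda c: (c.group_rank, c.opp, -c.util, c.cpu_id))
```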
> > I also observed another possible scenario. For example, tasks may
> > already have been packed onto several CPUs; though every task's
> > workload is not very high (such as rt-app-13), together they
> > accumulate load on one CPU, so finally the CPU runs at a high OPP.
> >
> > So if EAS picks only one of these tasks and tries to migrate it to
> > another CPU, it usually will not migrate. The reason is that even if
> > the target CPU is already running at a high OPP, it usually still
> > has capacity to run more work at the highest OPP; but energy_diff()
> > will compute a worse power result after the OPP increase, so the
> > task will stay on its original CPU. [1][2]
> Hopefully applying a policy like the one above would prevent us from
> getting into a situation like this.
> > Even picking one idle CPU from another cluster cannot resolve this
> > issue, because if we spread a task to another cluster, the original
> > cluster's and CPU's OPP will not decrease, yet the new cluster and
> > CPU introduce extra power.
> I would've expected the current algorithm to deal with this one
> properly. Although the original cluster OPP doesn't decrease, the
> utilization of it does, so the power consumed by it goes down. The
> power should decrease by more than the power increase in the new
> cluster, assuming the new cluster is indeed operating more
> efficiently.
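For illustration only, that argument can be checked with a toy energy
model (the numbers below are invented, not measurements): energy scales
with busy time, and busy time at a given OPP scales with
util / capacity.

```python
def busy_energy(util, cap_at_opp, power_at_opp):
    # Toy model: energy ~ (busy time) * (power at the OPP),
    # where busy time ~ util / capacity at that OPP.
    return util / cap_at_opp * power_at_opp

# Source cluster stuck at a high OPP: capacity 1024, busy power 600.
# Moving a util-200 task away leaves the OPP unchanged, but the
# cluster is busy less often, so its energy still drops.
saved = busy_energy(200, 1024, 600)
# Destination cluster at a low, efficient OPP: capacity 512, power 150.
added = busy_energy(200, 512, 150)
# saved (~117) > added (~59): the migration wins even without a
# source-side OPP drop, provided the destination is more efficient.
```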
> > So in this case, we should take a global view and define some
> > criteria:
> >
> > - CPUs are not staying at the lowest OPP, but the system has idle
> >   CPUs;
> > - a CPU's lower OPP can meet the capacity requirement for the
> >   average load of all tasks;
> > - a CPU's lower OPP can meet the capacity requirement for the
> >   highest-load task in the system.
> >
> > If these criteria are met, EAS can select an idle CPU from the
> > scheduling group with the best power efficiency.
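Those three criteria could be sketched as a single predicate (Python
pseudocode; the names and data layout are illustrative only):

```python
def should_spread(cpus, task_loads, lower_opp_cap):
    """cpus: list of (at_lowest_opp, is_idle) flags, one per CPU;
    task_loads: per-task load; lower_opp_cap: capacity at the lower OPP.
    """
    # (a) some CPU is above the lowest OPP while idle CPUs exist;
    above_lowest_opp = any(not at_lowest for at_lowest, _ in cpus)
    have_idle = any(idle for _, idle in cpus)
    # (b) the lower OPP fits the average task load;
    avg_fits = sum(task_loads) / len(task_loads) <= lower_opp_cap
    # (c) the lower OPP fits the highest-load task.
    max_fits = max(task_loads) <= lower_opp_cap
    return above_lowest_opp and have_idle and avg_fits and max_fits
```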
> Though I agree this may address the issue you mentioned, it'd be nice
> to avoid adding special cases that we must test for. Do you think it's
> possible to solve this problem with a more generic tweak to the
> wake-up logic like I proposed above?
After applying the above logic, it may help the rt-app cases, since
the rt-app cases have consistent load. But in reality a task's
instantaneous load may change: several tasks may be woken up with low
load, so EAS will pack them; after the tasks run for a while, their
load will increase, and EAS should detect this and spread the tasks
from a global view.

So EAS will select the most power efficient CPU for the woken task,
but this is purely from the task's perspective. I just wonder whether
we need to do one more calculation at the whole-system level.

I'd like to take this with lower priority; we can resolve the first
issue based on your above suggestion :).
Thanks,
Leo Yan