Hi all,
On Fri, Jan 19, 2018 at 12:34:20PM +0800, Leo Yan wrote:
Let's first look at an example: a small-utilization task is woken up and we need to calculate the energy for two candidate CPUs. A candidate CPU cannot decide the final OPP by itself, because it is bound to the other CPUs in the same clock domain, so in the end we need to calculate the energy of all CPUs.
Let's use below CPU topology as the example:
  Cluster_0    Cluster_1
  CPU_0        CPU_4
  CPU_1        CPU_5
  CPU_2        CPU_6
  CPU_3        CPU_7
The current code always calculates the energy for all CPUs in the bound clock domains. If the candidate CPUs are CPU_0 and CPU_4, the formulas for the energy calculation are as below (a trailing ` marks a term whose energy changes after the task is placed):
  E(CPU_0) = E(CPU_0)` + E(CPU_1) + E(CPU_2) + E(CPU_3) + E(CLS_0)`
           + E(CPU_4) + E(CPU_5) + E(CPU_6) + E(CPU_7) + E(CLS_1)

  E(CPU_4) = E(CPU_0) + E(CPU_1) + E(CPU_2) + E(CPU_3) + E(CLS_0)
           + E(CPU_4)` + E(CPU_5) + E(CPU_6) + E(CPU_7) + E(CLS_1)`
E_Diff(CPU_0 - CPU_4) = E(CPU_0) - E(CPU_4)
From the formulas above it is easy to see that the energy calculations for CPU_1/2/3/5/6/7 are redundant. If we only account for the energy consumed by the task after placing it on one specific CPU (rather than computing the energy of all CPUs), the energy calculation can be optimized as:
  E(CPU_0) = E(CPU_0)` + E(CLS_0)` - E(CPU_0) - E(CLS_0)
  E(CPU_4) = E(CPU_4)` + E(CLS_1)` - E(CPU_4) - E(CLS_1)
E_Diff(CPU_0 - CPU_4) = E(CPU_0) - E(CPU_4)
So the number of energy calculation iterations is reduced from 20 to 8, which can significantly reduce the energy calculation overhead.
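To make the two formulas above concrete, here is a minimal Python sketch of the cross-cluster case with the OPP unchanged. All energy numbers, the TASK_COST table and the helper names are made up for illustration; they are not taken from the patch or from any measurement. It only shows that both methods give the same E_Diff while the term count drops from 20 to 8.

```python
# Toy model only: every number below is hypothetical.
CLUSTERS = {
    "CLS_0": ["CPU_0", "CPU_1", "CPU_2", "CPU_3"],
    "CLS_1": ["CPU_4", "CPU_5", "CPU_6", "CPU_7"],
}

# Energy of every CPU and cluster before the task is placed.
BEFORE = {f"CPU_{i}": 10 + i for i in range(8)}
BEFORE.update({"CLS_0": 5, "CLS_1": 8})

# Extra (CPU, cluster) energy caused by the task on each cluster.
TASK_COST = {"CLS_0": (3, 1), "CLS_1": (7, 2)}


def cluster_of(cpu):
    return next(c for c, cpus in CLUSTERS.items() if cpu in cpus)


def after(dst):
    """Energy map after placing the task on dst, with the OPP unchanged."""
    cls = cluster_of(dst)
    cpu_cost, cls_cost = TASK_COST[cls]
    e = dict(BEFORE)
    e[dst] += cpu_cost
    e[cls] += cls_cost
    return e


def full_energy(dst):
    """Old method: sum all CPUs and clusters (10 terms per candidate)."""
    e = after(dst)
    return sum(e.values()), len(e)


def task_energy(dst):
    """Task-oriented method: only the terms the task changes (4 terms)."""
    e = after(dst)
    cls = cluster_of(dst)
    terms = [e[dst], e[cls], -BEFORE[dst], -BEFORE[cls]]
    return sum(terms), len(terms)


full_0, n_full_0 = full_energy("CPU_0")
full_4, n_full_4 = full_energy("CPU_4")
task_0, n_task_0 = task_energy("CPU_0")
task_4, n_task_4 = task_energy("CPU_4")

full_diff = full_0 - full_4
task_diff = task_0 - task_4
full_terms = n_full_0 + n_full_4
task_terms = n_task_0 + n_task_4
```

The unchanged terms cancel in the subtraction, which is why the difference is preserved even though the absolute values differ.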
With the task-oriented calculation there is one case where the energy calculation might take longer than with the previous method. For instance, suppose the candidate CPUs are CPU_0 and CPU_1, and placing the task on either CPU raises the CPU OPP. In this case, the old code calculates the energy as below:
  E(CPU_0) = E(CPU_0)` + E(CPU_1) + E(CPU_2) + E(CPU_3) + E(CLS_0)
  E(CPU_1) = E(CPU_0) + E(CPU_1)` + E(CPU_2) + E(CPU_3) + E(CLS_0)
E_Diff(CPU_1 - CPU_0) = E(CPU_1) - E(CPU_0)
With the task-oriented method, because the OPP increase affects the other CPUs in the same clock domain, the energy of all related CPUs has to be calculated:
  E(CPU_0) = E(CPU_0)` + E(CPU_1)' + E(CPU_2)' + E(CPU_3)' + E(CLS_0)`
           - E(CPU_0) - E(CPU_1) - E(CPU_2) - E(CPU_3) - E(CLS_0)

  E(CPU_1) = E(CPU_0)` + E(CPU_1)' + E(CPU_2)' + E(CPU_3)' + E(CLS_0)`
           - E(CPU_0) - E(CPU_1) - E(CPU_2) - E(CPU_3) - E(CLS_0)
E_Diff(CPU_1 - CPU_0) = E(CPU_1) - E(CPU_0)
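As a rough illustration of why this case can cost more, the following Python sketch models one clock domain whose OPP rises after placement. The utilization values, the per-OPP power numbers and the BUSY_COST term are all hypothetical and chosen only so the arithmetic is easy to follow. Under these assumptions both methods still agree on E_Diff, but the task-oriented one touches 10 terms per candidate instead of 5.

```python
# Toy model only: every number below is hypothetical.
DOMAIN = ["CPU_0", "CPU_1", "CPU_2", "CPU_3"]  # one clock domain (CLS_0)

UTIL = {"CPU_0": 0, "CPU_1": 35, "CPU_2": 10, "CPU_3": 5}
TASK_UTIL = 30
POWER = {"low": 2, "high": 3}        # energy per unit of utilization
CLS_POWER = {"low": 50, "high": 80}  # cluster-level energy
BUSY_COST = 10                       # fixed cost for any non-idle CPU


def energy_map(opp, dst=None):
    """Energy of each CPU and the cluster at the given OPP."""
    e = {}
    for cpu in DOMAIN:
        util = UTIL[cpu] + (TASK_UTIL if cpu == dst else 0)
        e[cpu] = util * POWER[opp] + (BUSY_COST if util > 0 else 0)
    e["CLS_0"] = CLS_POWER[opp]
    return e


BEFORE = energy_map("low")


def old_method(dst):
    """Absolute energy after placement: 5 terms."""
    e = energy_map("high", dst)
    return sum(e.values()), len(e)


def task_method(dst):
    """Energy delta caused by the task: 5 'after' + 5 'before' terms."""
    e = energy_map("high", dst)
    terms = list(e.values()) + [-v for v in BEFORE.values()]
    return sum(terms), len(terms)


old_0, n_old_0 = old_method("CPU_0")
old_1, n_old_1 = old_method("CPU_1")
new_0, n_new_0 = task_method("CPU_0")
new_1, n_new_1 = task_method("CPU_1")

old_diff = old_1 - old_0
new_diff = new_1 - new_0
old_terms = n_old_0 + n_old_1
new_terms = n_new_0 + n_new_1
```

The "before" terms are identical for both candidates in this model, which is the observation behind reusing the pre-placement energy data.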
We could use more complex methods for optimization, e.g. first calculate the OPPs for CPU_0 and CPU_1 and directly select the CPU with the most power-efficient OPP. Or we could reuse the pre-placement energy data for the two candidates. These methods can be left for later optimization.
As a side effect, this patch also resolves an energy calculation consistency issue: in some cases the energy is calculated for one cluster and in other cases for multiple clusters, so the semantics of the energy data are not consistent across scenarios. This patch fixes the issue by always calculating task-based energy.
To achieve the optimization, this patch uses the 'eenv->sg_cap' and 'eenv->sg_top' parameters. 'eenv->sg_cap' only covers the shared CPU capacity attribute, so in effect it describes the clock domain shared between CPUs; from this parameter we can determine the final OPP selection. 'eenv->sg_top' defines which CPUs we care about: if the frequency does not change after placing the woken task, it is set to the first-level scheduling group (that is, the single CPU), so the energy calculation can be limited to this single CPU.
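The scope decision described above could be modeled roughly as follows. This is a Python sketch of the idea only, not the kernel code: pick_opp(), energy_scope() and the capacity values are hypothetical names and numbers, and the real patch works on sched groups rather than lists of CPU names.

```python
def pick_opp(max_util, opps):
    """Return the lowest capacity in opps (sorted ascending) that fits max_util."""
    for cap in opps:
        if max_util <= cap:
            return cap
    return opps[-1]


def energy_scope(dst, domain_cpus, util, task_util, opps):
    """CPUs whose energy must be recalculated when the task lands on dst.

    The clock domain (the role 'sg_cap' plays in the mail) decides the OPP;
    the returned list stands in for what 'sg_top' would cover.
    """
    opp_before = pick_opp(max(util[c] for c in domain_cpus), opps)
    opp_after = pick_opp(
        max(util[c] + (task_util if c == dst else 0) for c in domain_cpus),
        opps,
    )
    if opp_after == opp_before:
        return [dst]              # frequency unchanged: single CPU only
    return list(domain_cpus)      # frequency changed: whole clock domain


# Hypothetical example data.
OPPS = [200, 400, 800]
CLS_0 = ["CPU_0", "CPU_1", "CPU_2", "CPU_3"]
UTIL = {"CPU_0": 100, "CPU_1": 350, "CPU_2": 50, "CPU_3": 0}

unchanged_scope = energy_scope("CPU_0", CLS_0, UTIL, 40, OPPS)
changed_scope = energy_scope("CPU_1", CLS_0, UTIL, 100, OPPS)
```

In the first call CPU_1 still dominates the domain, so the OPP stays the same and only CPU_0 is in scope; in the second call the task pushes CPU_1 past the current capacity, so all four CPUs are in scope.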
On Hikey960, after fixing the LITTLE CPU frequency to 1402000 kHz and the big CPU frequency to 1421000 kHz, a 10s ftrace log of the home screen scenario shows the energy calculation duration is optimized as below:
For the energy calculation between a LITTLE CPU and a big CPU, the duration decreases from 34660ns to 16565ns (a 52% decrease); for the energy calculation between two CPUs in the same cluster, the duration decreases from 24342ns to 21093ns (a 13% decrease).
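For reference, the percentages above follow directly from the quoted durations; percent_decrease() below is just a helper name picked for this note.

```python
def percent_decrease(before_ns, after_ns):
    """Relative reduction in duration, as a percentage of the original."""
    return 100.0 * (before_ns - after_ns) / before_ns


cross_cluster = percent_decrease(34660, 16565)  # LITTLE vs big candidates
same_cluster = percent_decrease(24342, 21093)   # candidates in one cluster
```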
Thanks a lot to Daniel for the review and suggestions. This patch is big and hard to digest, so I will split it into two smaller patches (or even more, if I can) for easier reviewing:
- The first patch adds CPU frequency prediction and drops the redundant CPUs from the energy calculation;
- The second patch introduces the task energy calculation.
I will prepare a new patch set for this. FYI.
[...]
Thanks,
Leo Yan