Hi all,
Below are some thoughts and questions after reviewing EAS's energy model; my goal is to understand the energy model from the user's perspective, so the questions below focus _ONLY_ on the model and do not dig into the implementation.
This email is relatively long, but I think formulas will help us get on the same page quickly. So I first list the energy model's formulas, then try to match them against TC2's power data and raise some questions. I look forward to your suggestions and comments.
* Basic Energy and Power Calculation Formulas
From Documentation/scheduler/sched-energy.txt, we know that energy can be calculated as:

  Energy [j] = Power [w] * Time [s]                             (F.1)

Now assume a piece of code with a fixed number of instructions to be executed on a CPU; the execution time depends on the CPU's pipeline and the CPU's frequency. So F.1 can be converted to F.2:

                           Code [instructions]
  Energy [j] = Power [w] * ------------------------------
                           (Inst Per Cycle) * Frequency

                           Code [instructions]
             = Power [w] * ------------------------------      (F.2)
                           MIPS(f)
                             `-> 'f' is the frequency of the OPP

Because MIPS(f) can be normalized to the CPU's capacity at the corresponding OPP, we can simply convert F.2 to F.3:

                           Code [instructions]
  Energy [j] = Power [w] * ------------------------------      (F.3)
                           CPU_Capacity(f)
If we break down Power [w], we can split it into two parts: static (leakage) power and dynamic (switching) power:

  Power [w] = Ps [w] + Pd [w]                                   (F.4)

Static power can be calculated with the formula below:

  Ps [w] = i * V [v]                                            (F.5)
             `-> 'i' is a coefficient determined by the silicon process
                 V [v] is the voltage of the OPP

Dynamic power can be calculated with the formula below:

  Pd [w] = b * V [v] * V [v] * frequency                        (F.6)
             `-> 'b' is a coefficient determined by the silicon process
                 V [v] is the voltage of the OPP

There are two special cases. If the island's clock is gated, then Pd [w] = 0, so:

  Power [w] = Ps [w]                                            (F.7)

If the island is powered off, then Ps [w] = 0 and Pd [w] = 0, so:

  Power [w] = 0                                                 (F.8)

So energy can be calculated as (combining F.3 and F.4):

                                   Code [instructions]
  Energy [j] = (Ps [w] + Pd [w]) * ----------------------       (F.9)
                                   CPU_Capacity(f)
* Formulas for duty cycle
We separate the logic (cluster or CPU) into two states: P-state and C-state. They have different power data, because once the logic enters a C-state it is clock-gated or powered off. So if we look at a relatively long time window, we need to calculate the CPU's utilization percentage (a fully running CPU has util = 100%). Let's take the ratio between "Code [instructions]" and "CPU_Capacity(f)" as the utilization, so the energy calculation can be depicted as:

            Code [instructions]
  Util(f) = --------------------------                          (F.10)
            CPU_Capacity(f)

  Energy [j] = Power_Pstate [w] * Util(f)
             + Power_Cstate [w] * (1 - Util(f))                 (F.11)

  Energy [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i))
             + Sum(i=0..MAX_IDLE)(Power_Cstate [w](i) * Util_IDLE(i))   (F.12)

  where: Sum(i=0..MAX_OPP)Util_OPP(i) + Sum(i=0..MAX_IDLE)Util_IDLE(i) = 1
* Formulas for clusters

  Energy [j] = Energy_cluster [j]
             + Sum(i=0..MAX_CPU_PER_CLUSTER)Energy_cpu(i) [j]   (F.13)

  Energy_cluster [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i))
                     + Sum(i=WFI, ClusterOff)(Power_Cstate [w](i) * Util_IDLE(i))   (F.14)

  Energy_cpu [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i))
                 + Sum(i=WFI, CPUOff)(Power_Cstate [w](i) * Util_IDLE(i))   (F.15)
* Thoughts and Questions
- Let's summarize EAS's energy model as below:

  CPU::capacity_state::power : CPU's power [w] at a specific OPP
      Power(OPP) = Ps [w] + Pd [w]

  CPU::idle_state::power : CPU's power [w] in a specific idle state
      Power(IDLE_WFI)    = Ps [w]
      Power(IDLE_CPUOff) = 0

  CPU's IDLE_WFI means the CPU is clock-gated, so it has static power but no dynamic power.

  CLUSTER::capacity_state::power : Cluster's power [w] at a specific OPP
      Power(OPP) = Ps [w] + Pd [w]

  CLUSTER::idle_state::power : Cluster's power [w] in a specific idle state
      Power(IDLE_WFI)    = Ps [w] + Pd [w]
      Power(IDLE_CLSOff) = 0

  Cluster's IDLE_WFI is quite special: it means all CPUs in the cluster have been powered off, but the cluster's logic (L2$, SCU, etc.) is powered on with the clock enabled, so it includes cluster-level static and dynamic power.

  Do these formulas match the original design?
- TC2's data for cluster sleep:

  static struct idle_state idle_states_cluster_a7[] = {
          { .power = 25 }, /* WFI */
          { .power = 10 }, /* cluster-sleep-l */
  };

  static struct idle_state idle_states_cluster_a15[] = {
          { .power = 70 }, /* WFI */
          { .power = 25 }, /* cluster-sleep-b */
  };

  For cluster-level sleep, the clock is gated and the domain is powered off, so both the dynamic and static power should be zero, right?
- TC2's data for CPU idle states:

  static struct idle_state idle_states_core_a7[] = {
          { .power = 0 }, /* WFI */
  };

  static struct idle_state idle_states_core_a15[] = {
          { .power = 0 }, /* WFI */
  };

  A CPU has two idle states, 'WFI' and 'C2'. In the 'WFI' state the power should not be zero, because 'WFI' means internal clock gating, so according to F.7 there should be static power.

  BTW, for TC2 there is no corresponding idle state for 'C2', which is weird. Could you confirm whether it has been deliberately removed?
- TC2's data for P-states:

  static struct capacity_state cap_states_cluster_a7[] = {
          /* Cluster only power */
          { .cap = 150, .power = 2967, }, /* 350 MHz */
          [...]
  };

  static struct capacity_state cap_states_core_a7[] = {
          /* Power per cpu */
          { .cap = 150, .power = 187, }, /* 350 MHz */
          [...]
  };

  From previous experience, the CPU-level power is much higher than the cluster-level power. For example, for CA7, if only the cluster is powered on (all CPUs in the cluster powered off), the power delta is ~10mA@156MHz; if one CPU is powered on, the power delta is about 30mA@156MHz. I also checked the data for CA53 and it shows a similar result.

  This conflicts with TC2's power data: there the cluster-level power is quite high (almost 15 times the CPU level). This would mean we can hardly benefit from CPU-level low power states, because the cluster level contributes most of the power consumption. This does not make sense.
- From formula F.4, we can compose the power from static and dynamic power; IPA also uses static/dynamic power to describe its energy model. But EAS uses another way, providing power data for every OPP and idle state. That means on one platform we need to provide two kinds of power data.

  IMHO, the static/dynamic decomposition is simpler, because usually we use (mW/MHz) to describe the power efficiency of a specific CPU. (mW/MHz) is not very accurate once the voltage changes (see formula F.6; usually the voltage is increased at higher frequencies), but with mW/MHz we can calculate the dynamic power in a very simple way, just by multiplying it with the frequency.

  So we would only need to provide the parameters below:
    P-state: static leakage, power efficiency (mW/MHz), capacity (DMIPS/MHz);
    C-state: static leakage, power efficiency (mW/MHz).
What are your thoughts on unifying the energy models?
Thanks,
Leo Yan
Hi Leo,
Thanks for sharing this excellent write-up. I'm tempted to suggest that we add this to the documentation.
On Thu, Sep 17, 2015 at 04:02:09PM +0100, Leo Yan wrote:
> [...]
>
>   Energy_cluster [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i))
>                      + Sum(i=WFI, ClusterOff)(Power_Cstate [w](i) * Util_IDLE(i))   (F.14)
A minor detail here is that a cluster and/or cpu may be idle (from a utilization point of view) but not actually in an idle state (from a hardware point of view). For example, all the cpus may be in WFI or cpu_power_down while the cluster still has power and clock going. You point that out towards the end as well. For this reason, the model has to consider this idle, but not really idle, state too. I called it 'active idle' in the past.
> [...]
>
>   CPU::idle_state::power : CPU's power [w] in a specific idle state
>       Power(IDLE_WFI)    = Ps [w]
>       Power(IDLE_CPUOff) = 0
>
>   CPU's IDLE_WFI means the CPU is clock-gated, so it has static power
>   but no dynamic power.
Agreed, but if we imagine a state between WFI and CPUOff which powers down part of the cpu core, but not everything (like CPUOff does), it would consume

  Power(IDLE_CPUalmostOff) = a * Ps [w]
    -> a = ratio of transistors powered down.

F.5 assumes that all transistors are affected, which holds as long as all transistors in the power domains that we provide separate model data for (cpu core and cluster) are equally affected by each idle-state.

F.6 makes a similar assumption about the toggling rate of all transistors scaling linearly with the frequency. I think that one is probably fine for the model precision that we are after, but I haven't verified it against actual measurements.
>   CLUSTER::capacity_state::power : Cluster's power [w] at a specific OPP
>       Power(OPP) = Ps [w] + Pd [w]
>
>   CLUSTER::idle_state::power : Cluster's power [w] in a specific idle state
>       Power(IDLE_WFI)    = Ps [w] + Pd [w]
>       Power(IDLE_CLSOff) = 0
>
>   Cluster's IDLE_WFI is quite special: it means all CPUs in the cluster
>   have been powered off, but the cluster's logic (L2$, SCU, etc.) is
>   powered on with the clock enabled, so it includes cluster-level static
>   and dynamic power.
Right, this is the 'active idle' state I mentioned earlier.
> Do these formulas match the original design?
Very much, yes. The only difference is that in the current design I don't distinguish between static and dynamic power, so if you substitute Ps [w] + Pd [w] = P [w] it is the same.
> TC2's data for cluster sleep:
>
>   static struct idle_state idle_states_cluster_a7[] = {
>           { .power = 25 }, /* WFI */
>           { .power = 10 }, /* cluster-sleep-l */
>   };
>
>   static struct idle_state idle_states_cluster_a15[] = {
>           { .power = 70 }, /* WFI */
>           { .power = 25 }, /* cluster-sleep-b */
>   };
>
>   For cluster-level sleep, the clock is gated and the domain is powered
>   off, so both the dynamic and static power should be zero, right?
In an ideal world, yes. These numbers come from actual measurements using the TC2 energy counters so this is down to practical issues. Something must still be leaking while the cluster is off which is included in the power domain monitored by the counters, or the energy counter circuits may not be 100% accurate. We didn't tweak the numbers to make them fit theory ;-)
> TC2's data for CPU idle states:
>
>   static struct idle_state idle_states_core_a7[] = {
>           { .power = 0 }, /* WFI */
>   };
>
>   static struct idle_state idle_states_core_a15[] = {
>           { .power = 0 }, /* WFI */
>   };
>
>   A CPU has two idle states, 'WFI' and 'C2'. In the 'WFI' state the
>   power should not be zero, because 'WFI' means internal clock gating,
>   so according to F.7 there should be static power. BTW, for TC2 there
>   is no corresponding idle state for 'C2', which is weird. Could you
>   confirm whether it has been deliberately removed?
I assume that by 'C2' you mean CPUOff. You seem to be assuming that all cpus have WFI and CPUOff. This is not the case. TC2 has no CPUOff state, so it wasn't removed, it was never there :-) It only has WFI (clock-gating each individual core) and CLSOff (power down the entire cluster). We need to be able to handle those systems too, as well as systems with more per-cpu idle-states.
The WFI power is zero for practical reasons. It is not possible to derive the per-core WFI power with the energy counters. We can put all cpus into WFI and measure the cluster energy, which would be the result of F.13, but we have no way of figuring out how to decompose it into cluster and cpu energy contributions. We have to account for all the energy somewhere, so instead of assuming some arbitrary split between cluster and cpu energy, we assume that it is all cluster energy. Hence, the WFI power is accounted for in the cluster 'active idle' power.
IOW, it isn't missing, it is just accounted for somewhere else as we didn't have a way to figure out the true split between cluster and core.
Talking about idle-state representation. The current idle-state tables are quite confusing. We only have per-cpu states listed in the per-cpu tables, and per-cluster in the per-cluster tables (+ active idle). This is why we have WFI for the core tables and 'active idle' (WFI) + CLSOff for the cluster tables for TC2. I'm planning on changing that so we have the full list of states in all tables, but with zeros or repeated power numbers for states that don't affect the associated power domain.
> TC2's data for P-states:
>
>   static struct capacity_state cap_states_cluster_a7[] = {
>           /* Cluster only power */
>           { .cap = 150, .power = 2967, }, /* 350 MHz */
>           [...]
>   };
>
>   static struct capacity_state cap_states_core_a7[] = {
>           /* Power per cpu */
>           { .cap = 150, .power = 187, }, /* 350 MHz */
>           [...]
>   };
>
>   From previous experience, the CPU-level power is much higher than the
>   cluster-level power. For example, for CA7, if only the cluster is
>   powered on (all CPUs in the cluster powered off), the power delta is
>   ~10mA@156MHz; if one CPU is powered on, the power delta is about
>   30mA@156MHz. I also checked the data for CA53 and it shows a similar
>   result. This conflicts with TC2's power data: there the cluster-level
>   power is quite high (almost 15 times the CPU level). This would mean
>   we can hardly benefit from CPU-level low power states, because the
>   cluster level contributes most of the power consumption. This does
>   not make sense.
As said above, TC2 doesn't have a CPUOff state, which makes it really crippled in terms of power management. As soon as the cluster is powered up, all cores are sitting in WFI leaking (Ps) with caches being kept coherent and everything. As said above, we had to account for the core WFI power in the cluster active power (OPP), so it ends up becoming quite high.
So the numbers do make sense for TC2; it is just not a very well-designed SoC from a power management point of view. It was a very early test chip not designed for power management experiments at all, but it has really good power measurement infrastructure (energy counters), and everything is upstream and has been for years. Your previous experience has most likely been with more representative platforms, so I expect numbers for other platforms to be in line with your experience. Juno, which is also a test chip, is closer to what you describe but still not really representative of product grade SoCs; we don't have anything better with upstream support, though.
> From formula F.4, we can compose the power from static and dynamic
> power; IPA also uses static/dynamic power to describe its energy model.
> But EAS uses another way, providing power data for every OPP and idle
> state. That means on one platform we need to provide two kinds of power
> data.
>
> [...]
>
> So we would only need to provide the parameters below:
>   P-state: static leakage, power efficiency (mW/MHz), capacity (DMIPS/MHz);
>   C-state: static leakage, power efficiency (mW/MHz).
>
> What are your thoughts on unifying the energy models?
We want to unify the power models if at all possible. The IPA people are looking into it. The difficulty is that we are looking for different things, so the models have to capture enough detail to be useful for both.
Are you proposing to derive the individual P-state numbers from global numbers or do you propose to have the three parameters for each P-state in tables like we currently have them?
If you want to derive them from global numbers, you would need to compensate for voltage scaling for both Ps and Pd, so you would need the voltage for each state. Otherwise your energy efficiency will _improve_ as you increase frequency.
It might work. I think the first step is to see if the derived curves would correlate well with real measurements. We would need a way to derive static leakage and power efficiency from measurements. I don't know if that can be easily done. Do you have any suggestions for that?
Deriving the table data using F.5 and F.6 would mean that we can only model systems that follow those formulas reasonably well. The current tables are pure measurement data with a little bit of extrapolation to find the cluster power, which should be a bit more flexible. I'm not sure if that really matters, though.
Thanks,
Morten
Hi Morten,
Thanks for the review; please see my comments and some further questions below.
On Fri, Sep 18, 2015 at 05:57:48PM +0100, Morten Rasmussen wrote:
> Thanks for sharing this excellent write-up. I'm tempted to suggest that
> we add this to the documentation.
Glad it's helpful; feel free to use it if you want.
> On Thu, Sep 17, 2015 at 04:02:09PM +0100, Leo Yan wrote:
> > [...]
> >
> >   Energy_cluster [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i))
> >                      + Sum(i=WFI, ClusterOff)(Power_Cstate [w](i) * Util_IDLE(i))   (F.14)
> A minor detail here is that a cluster and/or cpu may be idle (from a
> utilization point of view) but not actually in an idle state (from a
> hardware point of view). For example, all the cpus may be in WFI or
> cpu_power_down while the cluster still has power and clock going. You
> point that out towards the end as well. For this reason, the model has
> to consider this idle, but not really idle, state too. I called it
> 'active idle' in the past.
OK, 'active idle' is quite clear now; I'd like to discuss it further in the comments below.
> > [...]
> >
> >   CPU's IDLE_WFI means the CPU is clock-gated, so it has static power
> >   but no dynamic power.
> Agreed, but if we imagine a state between WFI and CPUOff which powers
> down part of the cpu core, but not everything (like CPUOff does), it
> would consume
>
>   Power(IDLE_CPUalmostOff) = a * Ps [w]
>     -> a = ratio of transistors powered down.
Totally agree that a CPU may have other extra idle states; for a common solution, we should not impose limitations on the idle states.
> F.5 assumes that all transistors are affected, which holds as long as
> all transistors in the power domains that we provide separate model
> data for (cpu core and cluster) are equally affected by each
> idle-state.
For one specific power state, whether it's a P-state or a C-state, we actually need to define it with three factors: voltage domain, power domain, and clock domain. Once these factors are well defined for a state, we can easily apply F.5/F.6.

So for cases like "IDLE_CPUalmostOff" and "IDLE_CPUOff", there must be some difference between them; for example, they may have different power domains but the same clock domain and voltage domain. Then we naturally calculate different power results for them.
> F.6 makes a similar assumption about the toggling rate of all
> transistors scaling linearly with the frequency. I think that one is
> probably fine for the model precision that we are after, but I haven't
> verified it against actual measurements.
To clarify: F.5 and F.6 should _NOT_ assume all transistors are affected; it totally depends on the definitions of the three domains above, and then these two formulas can be used correctly.

Usually when there are errors, it's very likely that we haven't defined these three domains clearly and have therefore introduced incorrect concepts.
> > [...]
> >
> >   Cluster's IDLE_WFI is quite special: it means all CPUs in the
> >   cluster have been powered off, but the cluster's logic (L2$, SCU,
> >   etc.) is powered on with the clock enabled, so it includes
> >   cluster-level static and dynamic power.
> Right, this is the 'active idle' state I mentioned earlier.
> > Do these formulas match the original design?
> Very much, yes. The only difference is that in the current design I
> don't distinguish between static and dynamic power, so if you
> substitute Ps [w] + Pd [w] = P [w] it is the same.
Got it; it's fine to just use the summed power data.
> > TC2's data for cluster sleep:
> >
> > [...]
> >
> >   For cluster-level sleep, the clock is gated and the domain is
> >   powered off, so both the dynamic and static power should be zero,
> >   right?
> In an ideal world, yes. These numbers come from actual measurements
> using the TC2 energy counters, so this is down to practical issues.
> Something must still be leaking while the cluster is off which is
> included in the power domain monitored by the counters, or the energy
> counter circuits may not be 100% accurate. We didn't tweak the numbers
> to make them fit theory ;-)
Makes sense; a little inaccuracy is acceptable.
> > TC2's data for CPU idle states:
> >
> > [...]
> >
> >   BTW, for TC2 there is no corresponding idle state for 'C2', which
> >   is weird. Could you confirm whether it has been deliberately
> >   removed?
> I assume that by 'C2' you mean CPUOff. You seem to be assuming that
> all cpus have WFI and CPUOff. This is not the case. TC2 has no CPUOff
> state, so it wasn't removed, it was never there :-) It only has WFI
> (clock-gating each individual core) and CLSOff (power down the entire
> cluster). We need to be able to handle those systems too, as well as
> systems with more per-cpu idle-states.
Now I know why TC2 has such power data.
> The WFI power is zero for practical reasons. It is not possible to
> derive the per-core WFI power with the energy counters. We can put all
> cpus into WFI and measure the cluster energy, which would be the result
> of F.13, but we have no way of figuring out how to decompose it into
> cluster and cpu energy contributions. We have to account for all the
> energy somewhere, so instead of assuming some arbitrary split between
> cluster and cpu energy, we assume that it is all cluster energy. Hence,
> the WFI power is accounted for in the cluster 'active idle' power.
>
> IOW, it isn't missing, it is just accounted for somewhere else, as we
> didn't have a way to figure out the true split between cluster and
> core.
Yes, it's hard to extract the power data independently for the cluster level and the core level. The main reason is that it's hard to get the delta value for WFI if the SoC doesn't support CPU power-off.

Just curious, would the steps below be feasible for measuring the WFI state on TC2?
- First measure the power when the cluster is powered off;
- Then power on CPU0 only and place it into "WFI":
  Power_Delta0 = cluster-level power + one CPU's "WFI";
- Then power on CPU1 as well:
  Power_Delta1 = cluster-level power + two CPUs' "WFI";
- So finally: "WFI" power = Power_Delta1 - Power_Delta0.

The key point is step 2: when one core is powered on, will the other cores in the same cluster automatically be powered on as well?
> Talking about idle-state representation. The current idle-state tables
> are quite confusing. We only have per-cpu states listed in the per-cpu
> tables, and per-cluster states in the per-cluster tables (+ active
> idle). This is why we have WFI for the core tables and 'active idle'
> (WFI) + CLSOff for the cluster tables for TC2. I'm planning on changing
> that so we have the full list of states in all tables, but with zeros
> or repeated power numbers for states that don't affect the associated
> power domain.
Here I think we should create a clear principle for the energy model and apply it. If we go back and review the "WFI" state, its power domain/voltage domain/clock domain are all at the CPU level, not the cluster level. So the most reasonable calculation for the 'active idle' state should be depicted as below:

  Energy [j] = Energy_cluster [j]
             + Sum(i=0..MAX_CPU_PER_CLUSTER)Energy_cpu(i) [j]

  Energy_cluster [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i))

  where: Sum(i=0..MAX_OPP)Util_OPP(i) = 1

  Energy_cpu [j] = Power(IDLE_WFI)

So for the 'active idle' state, all CPUs stay in the "WFI" state, but the cluster level actually always stays in a P-state, not a C-state. This is decided by the fact that the cluster-level power domain/clock domain is always ON during 'active idle'.

But currently EAS considers the cluster level to be in an idle state for 'active idle', right?

So let's dig further into this question; it is really about how we look at the idle states. We need to create an idle voting mechanism for the different scheduling domain levels, and there should be a mechanism to fall back to the lower-level scheduling domain for idle state selection if the upper scheduling domain is in a P-state.
Below is a example for voting:
0: Power On state 1: idle state 1 2: idle state 2 ...
Example 1:
SCHED_DOMAIN (CPU) SCHED_DOMAIN(MC) CPU0 0 1 CPU1 0 1 CPU2 0 1 CPU3 0 1
So all 4 CPUs vote 0 for cluster level, means to power on cluster and all 4 CPUs run into idle state 1; finally scheduler can easily know for SCHED_DOMAIN (CPU) (or cluster level) is not in idle state, so it can rollback to SCHED_DOMAIN(MC) (or cpu level) to find correct idle state.
Example 2:
SCHED_DOMAIN (CPU) SCHED_DOMAIN(MC) CPU0 1 1 CPU1 1 1 CPU2 1 1 CPU3 0 1
3 CPUs vote 1 to power off cluster and 1 CPU votes 0 to power on cluster, finally scheduler can easily know for SCHED_DOMAIN (CPU) (or cluster level) the minimum vote is 0, means cluster will be powered on, so it will rollback to SCHED_DOMAIN(MC) (or cpu level) to find correct idle state for CPU level.
Example 3:
SCHED_DOMAIN (CPU) SCHED_DOMAIN(MC) CPU0 1 2 CPU1 1 2 CPU2 1 2 CPU3 0 1
3 CPUs vote 1 to power off cluster and 1 CPU votes 0 to power on cluster, finally scheduler can easily know for SCHED_DOMAIN (CPU) (or cluster level) the minimum vote is 0, means cluster will be powered on, so it will rollback to SCHED_DOMAIN(MC) (or cpu level) to find correct idle state for CPU level, Example 3 wants to demonstrate there have two different idle states for CPU level, so scheduler need to know the CPU will rollback to which exactly idle state for individual CPU.
TC2's data for P-state:
static struct capacity_state cap_states_cluster_a7[] = {
/* Cluster only power */ { .cap = 150, .power = 2967, }, /* 350 MHz */ [...] };
static struct capacity_state cap_states_core_a7[] = {
/* Power per cpu */ { .cap = 150, .power = 187, }, /* 350 MHz */ [...] };
From previous experience, the CPU level's power leakage is very higher than cluster level's leakage. For example, for CA7, if only power on cluster (all CPUs in cluster are powered off), the power delta is ~10mA@156MHz; if power on one CPUs, the power delta is about 30mA@156MHz. I also checked the data for CA53, it has similar result. So this is confilict with TC2's power data, you can see the cluster level's power leakage is quite high (almost 15 times than CPU level). This means almostly we cannot get much benefit from CPU level's low power state, due cluster level will contribute most of power consumption. This is not make sense.
As said above, TC2 doesn't have a CPUOff state which makes it really crippled in terms of power management. As soon as the cluster is power up, all cores are sitting in WFI leaking (Ps) with caches being kept coherent and everything. As said above, we had to account for the core WFI power in the cluster active power (OPP) so it ends up becoming quite high.
So the numbers do make sense for TC2, it is just not a very well-designed SoC from a power management point of view. It was a very early test chip not designed for power management experiments at all, but it has really good power measurement infrastructure (energy counters) and everything is upstream and has been that for years. Your previous experience has most likely been with more representative platforms, so I expect numbers for other platforms to be in line with your experience. Juno, which is also a test chip, is closer to what you describe but still not really representative for product grade SoCs, but we don't have anything better with upstream support.
So P-state's Power data for TC2 is actually below combination :)
CLUSTER::capacity_state::power Power_CLUSTER(OPP) = Cluster (Ps [w] + Pd [w]) + CPU (Ps [w]) * 4 `-> include 4 CPU's static leakage
CPU::capacity_state::power Power_CPU(OPP) = CPU (Pd [w])
[...]
Thanks, Leo Yan
On Mon, Sep 21, 2015 at 06:58:30AM +0100, Leo Yan wrote:
Hi Morten,
Thanks for review, please see below comments and further more questions.
On Fri, Sep 18, 2015 at 05:57:48PM +0100, Morten Rasmussen wrote:
Thanks for sharing this excellent write-up. I'm tempted to suggest that we add this to the documention.
Glad it's helpful and free to use it if you want.
On Thu, Sep 17, 2015 at 04:02:09PM +0100, Leo Yan wrote:
Hi all,
Below are some thoughts and questions after reviewing EAS's energy model; my purpose is to get a clear picture of the energy model from the user's perspective, so the questions below will _ONLY_ focus on the model and not dig into the implementation.
This email is relatively long, but I think if we use formulas we can easily get on the same page; so I list the energy model's formulas first, then based on them I try to match TC2's power data and bring up some questions. Looking forward to your suggestions and comments.
* Basic Energy and Power Calculation Formulas
From the doc Documentation/scheduler/sched-energy.txt, we can see that energy can be calculated with:
Energy [j] = Power [w] * Time [s] (F.1)
So let's assume there is a piece of code with a fixed number of instructions to be executed on a CPU; the execution duration depends on the CPU's pipeline and the CPU's frequency. So we can convert F.1 to F.2:

                                 Code [instructions]
    Energy [j] = Power [w] * ------------------------------
                              (Inst Per Cycle) * Frequency

                                 Code [instructions]
               = Power [w] * ------------------------------      (F.2)
                                      MIPS(f)
                 `-> 'f' is a factor of frequency
Because MIPS(f) can be normalized as the CPU's capacity corresponding to the OPP, we can simply convert F.2 to F.3:

                                 Code [instructions]
    Energy [j] = Power [w] * ------------------------------      (F.3)
                                  CPU_Capacity(f)
If we break down Power [w], we can split it into two parts: static (leakage) power and dynamic (switching) power:
Power [w] = Ps [w] + Pd [w] (F.4)
Static power can be calculated with the formula below:

    Ps [w] = i * V [v]                                           (F.5)
    `-> 'i' is a coefficient determined by the silicon process
        V [v] is the voltage for the OPP

Dynamic power can be calculated with the formula below:

    Pd [w] = b * V [v] * V [v] * frequency                       (F.6)
    `-> 'b' is a coefficient determined by the silicon process
        V [v] is the voltage for the OPP
There are two special cases here. If the island's clock is gated, then Pd [w] = 0, so:

    Power [w] = Ps [w]                                           (F.7)

If the island is powered off, then Ps [w] = 0 and Pd [w] = 0, so:

    Power [w] = 0                                                (F.8)
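The F.4-F.8 power model can be sketched in C; the coefficients below are purely illustrative stand-ins for the process-dependent 'i' and 'b', not real silicon data:

```c
/* Illustrative process coefficients; real values come from silicon
 * characterization, these are made up for the sketch. */
#define COEFF_STATIC	2.0	/* 'i' in F.5 */
#define COEFF_DYNAMIC	3.0	/* 'b' in F.6 */

/* F.5: static (leakage) power scales with voltage. */
static double power_static(double volt)
{
	return COEFF_STATIC * volt;
}

/* F.6: dynamic power scales with V^2 * f. */
static double power_dynamic(double volt, double freq)
{
	return COEFF_DYNAMIC * volt * volt * freq;
}

/* F.4, with the F.7/F.8 special cases. */
static double power_total(double volt, double freq,
			  int clock_gated, int powered_off)
{
	if (powered_off)	/* F.8: no leakage, no switching */
		return 0.0;
	if (clock_gated)	/* F.7: Pd = 0, only leakage remains */
		return power_static(volt);
	return power_static(volt) + power_dynamic(volt, freq);
}
```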
So energy can be calculated as (combining F.3 and F.4):

                                       Code [instructions]
    Energy [j] = (Ps [w] + Pd [w]) * ----------------------      (F.9)
                                         CPU_Capacity(f)
* Formulas for Duty Cycle

We separate the logic (cluster or CPU) into two states: P-state and C-state. They have different power data, because after the logic enters a C-state it is clock-gated or powered off. So if we expand the time axis over a relatively long period, we need to calculate the CPU's utilization percentage (util = 100% when the CPU is fully running). Let's simplify the ratio between "Code [instructions]" and "CPU_Capacity(f)" as the utilization, so the energy calculation can be depicted as:
                  Code [instructions]
    Util(f) = --------------------------                         (F.10)
                   CPU_Capacity(f)
Energy [j] = Power_Pstate [w] * Util(f) + Power_Cstate [w] * (1 - Util(f)) (F.11)
Energy [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i)) +
             Sum(i=0..MAX_IDL)(Power_Cstate [w](i) * Util_IDL(i))       (F.12)

    Sum(i=0..MAX_OPP)Util_OPP(i) + Sum(i=0..MAX_IDL)Util_IDL(i) = 1
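F.11/F.12 can be sketched as a residency-weighted sum; `struct state_power` and the numbers in the test are hypothetical, only the bookkeeping mirrors the formulas:

```c
/* One P-state (OPP) or C-state entry: its power and the fraction of
 * time (residency) spent in it.  All residencies must sum to 1. */
struct state_power {
	double power;	/* [w] */
	double util;	/* time fraction, 0..1 */
};

/* F.12 per unit time: average power is the residency-weighted sum
 * over all P-states and C-states. */
static double avg_power(const struct state_power *pstates, int nr_p,
			const struct state_power *cstates, int nr_c)
{
	double power = 0.0;
	int i;

	for (i = 0; i < nr_p; i++)
		power += pstates[i].power * pstates[i].util;
	for (i = 0; i < nr_c; i++)
		power += cstates[i].power * cstates[i].util;
	return power;
}
```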
* Formulas for Clusters

Energy [j] = Energy_cluster [j] + Sum(i=0..MAX_CPU_PER_CLUSTER)Energy_cpu(i) [j]    (F.13)

Energy_cluster [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i)) +
                     Sum(i=WFI, ClusterOff)(Power_Cstate [w](i) * Util_IDL(i))      (F.14)
A minor detail here is that a cluster and/or cpu may be idle (from a utilization point of view) but not actually in an idle state (from a hardware point of view). For example, all the cpus may be in WFI or cpu_power_down while the cluster still has power and clock going. You point that out towards the end as well. For this reason, the model has to consider this idle, but not really idle, state too. I called it 'active idle' in the past.
OK, now 'active idle' is quite clear. I'd like to discuss it further in the comments below.
Energy_cpu [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i)) + Sum(i=WFI, CPUOff)(Power_Cstate [w](i) * Util_IDL(i))
* Thoughts and Questions
Let's summarize EAS's energy model as below:

CPU::capacity_state::power : CPU's power [w] for a specific OPP
    Power(OPP) = Ps [w] + Pd [w]

CPU::idle_state::power : CPU's power [w] for a specific idle state
    Power(IDLE_WFI) = Ps [w]
    Power(IDLE_CPUOff) = 0

CPU's IDLE_WFI means the CPU is clock-gated, so it has static leakage but no dynamic power.
Agreed, but if we imagine a state between WFI and CPUOff which powers down part of the cpu core, but not everything (unlike CPUOff), it would consume

    Power(IDLE_CPUalmostOff) = a * Ps [w]
    `-> 'a' = ratio of transistors powered down.
Totally agree that a CPU may have other extra idle states; for a common solution, we should not impose limitations on the idle states.
F.5 assumes that all transistors are affected, which holds as long as all transistors in the power domains that we provide separate model data for (cpu core and cluster) are equally affected by each idle-state.
For one specific power state, whether it's a P-state or a C-state, we actually need to define it with three factors: voltage domain, power domain, and clock domain. Once we have defined these factors well for a state, we can easily apply F.5/F.6.

So just like the cases of "IDLE_CPUalmostOff" and "IDLE_CPUOff", there must be some difference between them; for example, they may have different power domains but the same clock domain and voltage domain. So naturally we can calculate different power results for them.
Agreed, if we capture all power domains in the model, applying F.5/F.6 isn't a problem as all transistors in the domain will be affected by definition. However, it does mean potentially having power domains which are more fine-grained than just one cpu core. It doesn't map well to our current model representation using the sched_domain hierarchy. Also, while it makes a lot of sense from a theoretical point of view, I'm not sure if we should worry about intra-core power domains. I don't see how we would define them beyond just being some interpolated factor which would basically be 'a' in the above formula. 'a' should be sufficient for what we want to do as well.
F.6 makes a similar assumption about the toggling rate of all transistors scaling linearly with the frequency. I think that one is probably fine for the model precision that we are after, but I haven't verified it using actual measurements.
To clarify: F.5 and F.6 do _NOT_ assume all transistors are affected; it totally depends on the definitions of the three domains above, which then let us use these two formulas correctly.
I meant all transistors as in all transistors in the power domain with our current definition of power domains (not the more fine-grained one that you are proposing). I think we agree, it is just two different ways of expressing the model (ratio/toggling rate vs fine-grained power and clock domains).
Usually if there are errors, it's very likely because we cannot define these three domains clearly and thus introduce incorrect concepts.
Yes. I think it might be difficult to define those domains correctly.
CLUSTER::capacity_state::power : Cluster's power [w] for a specific OPP
    Power(OPP) = Ps [w] + Pd [w]

CLUSTER::idle_state::power : Cluster's power [w] for a specific idle state
    Power(IDLE_WFI) = Ps [w] + Pd [w]
    Power(IDLE_CLSOff) = 0

Cluster's IDLE_WFI is quite special: it means all CPUs in the cluster have been powered off, but the cluster's logic (L2$ and SCU, etc.) is powered on and clocked, so it includes the cluster level's static and dynamic power.
Right, this is the 'active idle' state I mentioned earlier.
Are these formulas matching the original design?
Very much, yes. The only difference is that in the current design I don't distinguish between static and dynamic power, so if you substitute Ps [w] + Pd [w] = P [w] it is the same.
Got it, it's fine to just use summed power data.
TC2's data for cluster's sleep:
static struct idle_state idle_states_cluster_a7[] = {
	{ .power = 25 }, /* WFI */
	{ .power = 10 }, /* cluster-sleep-l */
};
static struct idle_state idle_states_cluster_a15[] = {
	{ .power = 70 }, /* WFI */
	{ .power = 25 }, /* cluster-sleep-b */
};
For cluster-level sleep, the clock is gated and the domain is powered off, so the dynamic and static leakage should both be zero, right?
In an ideal world, yes. These numbers come from actual measurements using the TC2 energy counters so this is down to practical issues. Something must still be leaking while the cluster is off which is included in the power domain monitored by the counters, or the energy counter circuits may not be 100% accurate. We didn't tweak the numbers to make them fit theory ;-)
Makes sense; some small inaccuracy is acceptable.
TC2's data for CPU's idle state:
static struct idle_state idle_states_core_a7[] = {
	{ .power = 0 }, /* WFI */
};
static struct idle_state idle_states_core_a15[] = {
	{ .power = 0 }, /* WFI */
};
A CPU has two idle states: 'WFI' and 'C2'. For the 'WFI' state the power should not be zero, because 'WFI' means internal clock gating, so according to F.7 there should be static leakage.

BTW, for TC2 there is no idle state corresponding to 'C2', which is weird. Could you confirm it has been deliberately removed?
I assume that by 'C2' you mean CPUOff. You seem to be assuming that all cpus have WFI and CPUOff. This is not the case. TC2 has no CPUOff state, so it wasn't removed, it was never there :-) It only has WFI (clock-gating each individual core) and CLSOff (power down the entire cluster). We need to be able to handle those systems too, as well as systems with more per-cpu idle-states.
Now I know why TC2 has such power data.
:-)
The WFI power is zero for practical reasons. It is not possible to derive the per-core WFI power with the energy counters. We can put all cpus into WFI and measure the cluster energy, which would be the result of F.13, but we have no way of figuring out how to decompose it into cluster and cpu energy contributions. We have to account for all the energy somewhere, so instead of assuming some arbitrary split between cluster and cpu energy, we assume that it is all cluster energy. Hence, the WFI power is accounted for in the cluster 'active idle' power.
IOW, it isn't missing, it is just accounted for somewhere else as we didn't have a way to figure out the true split between cluster and core.
Yes, it's hard to extract power data independently for the cluster level and core level. The main reason is that it is hard to get the delta value for WFI if the SoC doesn't support CPU power-off.
It is actually a generic problem: we can't derive the per-core idle power for the deepest per-core state. If we had CPUOff and WFI, we could measure the WFI-CPUOff delta, which would give us a non-zero WFI power. But we run into the same problem with measuring CPUOff as we currently have with WFI on TC2.
Also, the WFI-CPUOff delta isn't the true per-core WFI power, it is the delta on top of the CPUOff power which we can't measure. So the whole table of per-core idle-state power is offset such that CPUOff = 0 (or whatever the deepest per-core state is). As with the TC2 case it doesn't mean that the power is unaccounted for, it is just accounted for elsewhere (in the cluster power).
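The offsetting described above can be sketched as follows, assuming a per-core idle table ordered from shallowest to deepest state (the numbers in the test are made up): the deepest state's unmeasurable power is subtracted from every entry and re-accounted at cluster level.

```c
/* Offset a per-core idle-state power table so the deepest state reads
 * 0, moving the remainder into the cluster-level power, as described
 * for TC2 above.  Table is ordered shallowest..deepest. */
static void offset_idle_table(double *state_power, int nr_states,
			      double *cluster_power)
{
	double deepest = state_power[nr_states - 1];
	int i;

	*cluster_power += deepest;	/* account for it elsewhere */
	for (i = 0; i < nr_states; i++)
		state_power[i] -= deepest;
}
```

The total energy stays accounted for; only the split between the per-core table and the cluster power changes.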
Just curious, is it feasible to measure the WFI state on TC2 with the steps below?

- Firstly measure the power data when the cluster is powered off;

- Then power on CPU0 only, and place CPU0 into "WFI":
    Power_Delta0 = cluster level power + one CPU's "WFI" power;

- Then power on CPU1 as well, place both CPUs into "WFI", and get:
    Power_Delta1 = cluster level power + two CPUs' "WFI" power;

- So finally we can get: "WFI" power = Power_Delta1 - Power_Delta0;

The key point is step 2: when we power on one core, will the other cores in the same cluster automatically be powered on as well?
Unfortunately yes. We only have one physical power domain which spans the entire cluster, so you can only power up everything in one go. If you try tricks like hotplugging cpus out, they are just parked in WFI by the driver/firmware even though they are removed from an OS perspective. It is a limitation in the hardware which we can't work around.
Talking about idle-state representation. The current idle-state tables are quite confusing. We only have per-cpu states listed in the per-cpu tables, and per-cluster in the per-cluster tables (+ active idle). This is why we have WFI for the core tables and 'active idle' (WFI) + CLSOff for the cluster tables for TC2. I'm planning on changing that so we have the full list of states in all tables, but with zeros or repeated power numbers for states that don't affect the associated power domain.
Here I think we should create a clear principle for the energy model and apply it. If we go back and review the "WFI" state, its power domain/voltage domain/clock domain are all at the CPU level, not the cluster level. So the most reasonable calculation for the 'active idle' state should be depicted as below:

Energy [j] = Energy_cluster [j] + Sum(i=0..MAX_CPU_PER_CLUSTER)Energy_cpu(i) [j]

Energy_cluster [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i))

    Sum(i=0..MAX_OPP)Util_OPP(i) = 1

Energy_cpu(i) [j] = Power(IDLE_WFI)

So that means for the 'active idle' state all cpus stay in "WFI", but the cluster level actually always stays in a P-state, not a C-state. This follows from the cluster level's power/clock domains always being ON for 'active idle'.

But currently EAS considers 'active idle' a cluster-level idle state, right?
Yes, but it isn't easy to generalize based on the TC2 model due to the limitations of TC2. From a model point of view we want to know which state the cpu/cluster is in: running or idling. The C-states represent the hardware-supported idle-states (controlling clock and/or power). An idle cluster or core may idle in one of these states or sit idle with everything powered up and clocked. The latter is 'active idle'. A cluster may be active idle if all the cpus are idling in some per-cpu idle-state and the cpuidle governor has chosen to leave it powered up (possibly due to target residency constraints). The same could in theory be the case for a cpu core. It could be spinning in the idle loop if cpuidle didn't decide to enter a C-state. On ARM, WFI is practically free to enter, so we always enter a proper hardware idle-state whenever we are idle, even if it is only for a single clock cycle. Hence, we would never be active idling an ARM cpu, so WFI takes the role of active idle in this case. If WFI had a target_residency that would prevent cpus from entering it and leave them spinning, we would need an active idle state for the cpus as well.
In the model we treat active idle as an idle state despite the cpu/cluster being fully operational and running. The reason for this is that even though we are in some P-state, we aren't actually doing anything useful and the power consumption is likely to be very different from when we are busy. In the cluster active idle case, all the cpus are idling, which means nobody is accessing caches and memory hence the transistor toggling is very limited (though it might be affected by snooping traffic if another cluster is busy). If we used the busy P-state power, we would vastly over-estimate the active idle power for the cluster in most cases. In the cpu case (if we weren't guaranteed to enter WFI), we would be spinning in some simple loop that probably wouldn't exercise the entire cpu core and hopefully use a little less power (no cache access and expensive instructions).
Since we are technically running when active idling, one could argue that we should have an active idle power number for each P-state. For ARM that isn't an issue for per-core idling as we have WFI. For clusters we may want to consider it.
The short answer is: In active idle the cpu/cluster is in a P-state doing nothing. We can make WFI the active idle state per-core (cpu) on ARM as we are guaranteed to enter it when the cpu is idle.
So let's dig into this question further; actually it is closely related to how we look at the idle states. We need to create an idle voting mechanism for the different sched_domain levels, and there should be a mechanism to fall back to the lower sched_domain level for idle-state selection if the upper sched_domain is in a P-state.

Below is an example of the voting:

0: power-on state
1: idle state 1
2: idle state 2
...
Example 1:
        SCHED_DOMAIN (CPU)    SCHED_DOMAIN (MC)
CPU0            0                     1
CPU1            0                     1
CPU2            0                     1
CPU3            0                     1

So all 4 CPUs vote 0 at the cluster level, which means the cluster is powered on and all 4 CPUs go into idle state 1; the scheduler can then easily see that SCHED_DOMAIN (CPU) (the cluster level) is not in an idle state, so it falls back to SCHED_DOMAIN (MC) (the cpu level) to find the correct idle state.
Example 2:
        SCHED_DOMAIN (CPU)    SCHED_DOMAIN (MC)
CPU0            1                     1
CPU1            1                     1
CPU2            1                     1
CPU3            0                     1

3 CPUs vote 1 to power off the cluster and 1 CPU votes 0 to power on the cluster; the scheduler can then easily see that the minimum vote for SCHED_DOMAIN (CPU) (the cluster level) is 0, meaning the cluster will be powered on, so it falls back to SCHED_DOMAIN (MC) (the cpu level) to find the correct idle state for the CPU level.
Example 3:
        SCHED_DOMAIN (CPU)    SCHED_DOMAIN (MC)
CPU0            1                     2
CPU1            1                     2
CPU2            1                     2
CPU3            0                     1

As in Example 2, 3 CPUs vote 1 to power off the cluster and 1 CPU votes 0 to power it on, so the minimum vote for SCHED_DOMAIN (CPU) (the cluster level) is 0, the cluster stays powered on, and we fall back to SCHED_DOMAIN (MC) (the cpu level) for the correct idle state. Example 3 demonstrates that there can be two different idle states at the CPU level, so the scheduler needs to know exactly which idle state each individual CPU falls back to.
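The voting in the three examples boils down to a minimum over the per-cpu votes (0 = power on, higher = deeper idle). A sketch of the idea, not actual EAS code:

```c
#define NR_CPUS 4	/* cpus per cluster in the examples above */

/* The cluster can only enter the shallowest state any cpu asked for,
 * so the cluster-level result is the minimum vote.  A result of 0
 * means the cluster stays powered on and each cpu falls back to its
 * own MC-level vote for its idle state. */
static int cluster_vote(const int votes[NR_CPUS])
{
	int min = votes[0];
	int i;

	for (i = 1; i < NR_CPUS; i++)
		if (votes[i] < min)
			min = votes[i];
	return min;
}
```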
Yes, we basically have to redo the voting already done in cpuidle to figure out the actual idle-states. We do already have a function that tries to figure out the group idle state based on the idle-states requested by the cpus in the group. I'm not a big fan of it, as idle-states may change very frequently, so whatever we just computed might already be wrong a couple of cycles later. I think we should consider looking at the average idle-state instead and see if that makes sense.
TC2's data for P-state:
static struct capacity_state cap_states_cluster_a7[] = {
	/* Cluster only power */
	{ .cap = 150, .power = 2967, }, /* 350 MHz */
	[...]
};
static struct capacity_state cap_states_core_a7[] = {
	/* Power per cpu */
	{ .cap = 150, .power = 187, }, /* 350 MHz */
	[...]
};
From previous experience, the CPU level's power leakage is much higher than the cluster level's. For example, on CA7, if only the cluster is powered on (all CPUs in the cluster powered off), the power delta is ~10mA@156MHz; if one CPU is powered on, the power delta is about 30mA@156MHz. I also checked the data for CA53; it has similar results.

So this conflicts with TC2's power data, where the cluster level's power leakage is quite high (almost 15 times the CPU level's). It means we can hardly get much benefit from CPU-level low power states, since the cluster level contributes most of the power consumption. This does not make sense.
As said above, TC2 doesn't have a CPUOff state, which makes it really crippled in terms of power management. As soon as the cluster is powered up, all cores are sitting in WFI leaking (Ps) with caches being kept coherent and everything. As said above, we had to account for the core WFI power in the cluster active power (OPP), so it ends up becoming quite high.
So the numbers do make sense for TC2; it is just not a very well-designed SoC from a power management point of view. It was a very early test chip not designed for power management experiments at all, but it has really good power measurement infrastructure (energy counters), and everything is upstream and has been for years. Your previous experience has most likely been with more representative platforms, so I expect numbers for other platforms to be in line with your experience. Juno, which is also a test chip, is closer to what you describe but still not really representative of product-grade SoCs, but we don't have anything better with upstream support.
So the P-state power data for TC2 is actually the combination below :)

CLUSTER::capacity_state::power
    Power_CLUSTER(OPP) = Cluster (Ps [w] + Pd [w]) + CPU (Ps [w]) * 4
                         `-> includes 4 CPUs' static leakage
Yes :-)
CPU::capacity_state::power
    Power_CPU(OPP) = CPU (Pd [w])
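This bookkeeping can be checked numerically. The Ps/Pd split below is hypothetical (TC2's counters can't separate them); only the totals are taken from the TC2 tables (2967 and 187 at 350 MHz):

```c
/* Hypothetical decomposition of one TC2 A7 OPP (350 MHz).  The split
 * between static and dynamic parts is invented; only the sums match
 * the published cap_states tables. */
struct opp_power {
	double cluster_ps, cluster_pd;	/* cluster static/dynamic */
	double cpu_ps, cpu_pd;		/* per-core static/dynamic */
};

/* What lands in CLUSTER::capacity_state::power: cluster Ps + Pd plus
 * the four cores' static leakage. */
static double table_cluster_power(const struct opp_power *p)
{
	return p->cluster_ps + p->cluster_pd + 4.0 * p->cpu_ps;
}

/* What lands in CPU::capacity_state::power: the dynamic part only. */
static double table_cpu_power(const struct opp_power *p)
{
	return p->cpu_pd;
}
```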
Thanks, Morten
On Mon, Sep 21, 2015 at 05:31:37PM +0100, Morten Rasmussen wrote:
On Mon, Sep 21, 2015 at 06:58:30AM +0100, Leo Yan wrote:
On Fri, Sep 18, 2015 at 05:57:48PM +0100, Morten Rasmussen wrote:
[...]
Energy_cpu [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i)) + Sum(i=WFI, CPUOff)(Power_Cstate [w](i) * Util_IDL(i))
Thoughts and Questions
Let's summarize EAS's energy model as below:

CPU::capacity_state::power : CPU's power [w] for a specific OPP
    Power(OPP) = Ps [w] + Pd [w]

CPU::idle_state::power : CPU's power [w] for a specific idle state
    Power(IDLE_WFI) = Ps [w]
    Power(IDLE_CPUOff) = 0

CPU's IDLE_WFI means the CPU is clock-gated, so it has static leakage but no dynamic power.
Agreed, but if we imagine a state between WFI and CPUOff which powers down part of the cpu core, but not everything (unlike CPUOff), it would consume

    Power(IDLE_CPUalmostOff) = a * Ps [w]
    `-> 'a' = ratio of transistors powered down.
Totally agree that a CPU may have other extra idle states; for a common solution, we should not impose limitations on the idle states.
F.5 assumes that all transistors are affected, which holds as long as all transistors in the power domains that we provide separate model data for (cpu core and cluster) are equally affected by each idle-state.
For one specific power state, whether it's a P-state or a C-state, we actually need to define it with three factors: voltage domain, power domain, and clock domain. Once we have defined these factors well for a state, we can easily apply F.5/F.6.

So just like the cases of "IDLE_CPUalmostOff" and "IDLE_CPUOff", there must be some difference between them; for example, they may have different power domains but the same clock domain and voltage domain. So naturally we can calculate different power results for them.
Agreed, if we capture all power domains in the model, applying F.5/F.6 isn't a problem as all transistors in the domain will be affected by definition. However, it does mean potentially having power domains which are more fine-grained than just one cpu core. It doesn't map well to our current model representation using the sched_domain hierarchy. Also, while it makes a lot of sense from a theoretical point of view, I'm not sure if we should worry about intra-core power domains. I don't see how we would define them beyond just being some interpolated factor which would basically be 'a' in the above formula. 'a' should be sufficient for what we want to do as well.
Understood, it's hard to handle the 'a' issue, especially if we want to simplify the energy model parameters.
[...]
The WFI power is zero for practical reasons. It is not possible to derive the per-core WFI power with the energy counters. We can put all cpus into WFI and measure the cluster energy, which would be the result of F.13, but we have no way of figuring out how to decompose it into cluster and cpu energy contributions. We have to account for all the energy somewhere, so instead of assuming some arbitrary split between cluster and cpu energy, we assume that it is all cluster energy. Hence, the WFI power is accounted for in the cluster 'active idle' power.
IOW, it isn't missing, it is just accounted for somewhere else as we didn't have a way to figure out the true split between cluster and core.
Yes, it's hard to extract power data independently for the cluster level and core level. The main reason is that it is hard to get the delta value for WFI if the SoC doesn't support CPU power-off.
It is actually a generic problem: we can't derive the per-core idle power for the deepest per-core state. If we had CPUOff and WFI, we could measure the WFI-CPUOff delta, which would give us a non-zero WFI power. But we run into the same problem with measuring CPUOff as we currently have with WFI on TC2.
Also, the WFI-CPUOff delta isn't the true per-core WFI power, it is the delta on top of the CPUOff power which we can't measure. So the whole table of per-core idle-state power is offset such that CPUOff = 0 (or whatever the deepest per-core state is). As with the TC2 case it doesn't mean that the power is unaccounted for, it is just accounted for elsewhere (in the cluster power).
Yes, exactly.
Just curious, is it feasible to measure the WFI state on TC2 with the steps below?

- Firstly measure the power data when the cluster is powered off;

- Then power on CPU0 only, and place CPU0 into "WFI":
    Power_Delta0 = cluster level power + one CPU's "WFI" power;

- Then power on CPU1 as well, place both CPUs into "WFI", and get:
    Power_Delta1 = cluster level power + two CPUs' "WFI" power;

- So finally we can get: "WFI" power = Power_Delta1 - Power_Delta0;

The key point is step 2: when we power on one core, will the other cores in the same cluster automatically be powered on as well?
Unfortunately yes. We only have one physical power domain which spans the entire cluster, so you can only power up everything in one go. If you try tricks like hotplugging cpus out, they are just parked in WFI by the driver/firmware even though they are removed from an OS perspective. It is a limitation in the hardware which we can't work around.
OK.
Talking about idle-state representation. The current idle-state tables are quite confusing. We only have per-cpu states listed in the per-cpu tables, and per-cluster in the per-cluster tables (+ active idle). This is why we have WFI for the core tables and 'active idle' (WFI) + CLSOff for the cluster tables for TC2. I'm planning on changing that so we have the full list of states in all tables, but with zeros or repeated power numbers for states that don't affect the associated power domain.
Here I think we should create a clear principle for the energy model and apply it. If we go back and review the "WFI" state, its power domain/voltage domain/clock domain are all at the CPU level, not the cluster level. So the most reasonable calculation for the 'active idle' state should be depicted as below:

Energy [j] = Energy_cluster [j] + Sum(i=0..MAX_CPU_PER_CLUSTER)Energy_cpu(i) [j]

Energy_cluster [j] = Sum(i=0..MAX_OPP)(Power_Pstate [w](i) * Util_OPP(i))

    Sum(i=0..MAX_OPP)Util_OPP(i) = 1

Energy_cpu(i) [j] = Power(IDLE_WFI)

So that means for the 'active idle' state all cpus stay in "WFI", but the cluster level actually always stays in a P-state, not a C-state. This follows from the cluster level's power/clock domains always being ON for 'active idle'.

But currently EAS considers 'active idle' a cluster-level idle state, right?
Yes, but it isn't easy to generalize based on the TC2 model due to the limitations of TC2. From a model point of view we want to know which state the cpu/cluster is in: running or idling. The C-states represent the hardware-supported idle-states (controlling clock and/or power). An idle cluster or core may idle in one of these states or sit idle with everything powered up and clocked. The latter is 'active idle'. A cluster may be active idle if all the cpus are idling in some per-cpu idle-state and the cpuidle governor has chosen to leave it powered up (possibly due to target residency constraints). The same could in theory be the case for a cpu core. It could be spinning in the idle loop if cpuidle didn't decide to enter a C-state. On ARM, WFI is practically free to enter, so we always enter a proper hardware idle-state whenever we are idle, even if it is only for a single clock cycle. Hence, we would never be active idling an ARM cpu, so WFI takes the role of active idle in this case. If WFI had a target_residency that would prevent cpus from entering it and leave them spinning, we would need an active idle state for the cpus as well.
In the model we treat active idle as an idle state despite the cpu/cluster being fully operational and running. The reason for this is that even though we are in some P-state, we aren't actually doing anything useful and the power consumption is likely to be very different from when we are busy. In the cluster active idle case, all the cpus are idling, which means nobody is accessing caches and memory hence the transistor toggling is very limited (though it might be affected by snooping traffic if another cluster is busy). If we used the busy P-state power, we would vastly over-estimate the active idle power for the cluster in most cases. In the cpu case (if we weren't guaranteed to enter WFI), we would be spinning in some simple loop that probably wouldn't exercise the entire cpu core and hopefully use a little less power (no cache access and expensive instructions).
Since we are technically running when active idling, one could argue that we should have an active idle power number for each P-state. For ARM that isn't an issue for per-core idling as we have WFI. For clusters we may want to consider it.
The short answer is: In active idle the cpu/cluster is in a P-state doing nothing. We can make WFI the active idle state per-core (cpu) on ARM as we are guaranteed to enter it when the cpu is idle.
Agreed, but I have two concerns:

- If we take the cluster's 'active idle' as an idle state, that means Pd [w] is totally ignored for it; whatever frequency the cluster level is running at, the dynamic power will be ignored.

Below is some power data measured on CA7 in 'active idle':

  CPUFreq@156MHz:  11mA
  CPUFreq@312MHz:  28mA
  CPUFreq@624MHz:  36mA
  CPUFreq@800MHz:  45mA
  CPUFreq@1100MHz: 56mA

So in practice, if we use the lowest frequency's number for the cluster's 'active idle', there will be some deviation if the cluster is actually running at the highest frequency.
- There may be more than one kind of 'active idle' state for a cluster; for example, all cores in the cluster entering the 'WFI' state gives one corresponding 'active idle' state, and all cores entering the 'CPUOFF' state gives another. Should we handle these two kinds of 'active idle' state as the same one?

Furthermore, if one CPU only enters 'WFI' while the other CPUs in the cluster enter 'CPUOFF', how do we select the 'active idle' state?
If we change to treat the 'active idle' state as a cluster-level P-state, the above issues are easily dismissed.
Thanks, Leo Yan
On Tue, Sep 22, 2015 at 08:44:40PM +0100, Leo Yan wrote:
On Mon, Sep 21, 2015 at 05:31:37PM +0100, Morten Rasmussen wrote:
On Mon, Sep 21, 2015 at 06:58:30AM +0100, Leo Yan wrote:
On Fri, Sep 18, 2015 at 05:57:48PM +0100, Morten Rasmussen wrote:
Talking about idle-state representation. The current idle-state tables are quite confusing. We only have per-cpu states listed in the per-cpu tables, and per-cluster in the per-cluster tables (+ active idle). This is why we have WFI for the core tables and 'active idle' (WFI) + CLSOff for the cluster tables for TC2. I'm planning on changing that so we have the full list of states in all tables, but with zeros or repeated power numbers for states that don't affect the associated power domain.
Here I think we should create a clear principle for the energy model and apply it. If we go back and review the "WFI" state, its power domain/voltage domain/clock domain are all at the CPU level, not the cluster level. So the most reasonable calculation for the 'active idle' state should be depicted as below:
  Energy [j] = Energy_cluster [j] + Sum(i=0..MAX_CPU_PER_CLUSTER) Energy_cpu(i) [j]

  Energy_cluster [j] = Time [s] * Sum(i=0..MAX_OPP) (Power_Pstate(i) [w] * Util_OPP(i))
    `-> Util_OPP(i) is the fraction of time spent at OPP i,
        so Sum(i=0..MAX_OPP) Util_OPP(i) = 1

  Energy_cpu(i) [j] = Power(IDLE_WFI) [w] * Time [s]
So that means for the 'active idle' state, all cpus stay in the "WFI" state, but the cluster level actually always stays in a P-state, not a C-state. This follows from the cluster level's power domain/clock domain always being ON during 'active idle'.
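The decomposition above can be sketched as follows (a minimal illustration with hypothetical names, not actual EAS code):

```python
# Sketch of: Energy = Energy_cluster + Sum(Energy_cpu(i))
# Cluster energy is the utilization-weighted P-state power over time;
# each idling cpu contributes its WFI power over the same period.

def cluster_energy(pstate_power_w, opp_util, time_s):
    # opp_util[i] is the fraction of time spent at OPP i; fractions sum to 1.
    assert abs(sum(opp_util) - 1.0) < 1e-9
    avg_power = sum(p * u for p, u in zip(pstate_power_w, opp_util))
    return avg_power * time_s  # Energy [J] = Power [W] * Time [s] (F.1)

def total_energy(pstate_power_w, opp_util, wfi_power_w, n_cpus, time_s):
    e_cluster = cluster_energy(pstate_power_w, opp_util, time_s)
    e_cpus = n_cpus * wfi_power_w * time_s  # each cpu sits in WFI
    return e_cluster + e_cpus
```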
But now EAS considers the cluster level to be in an idle state for 'active idle', right?
Yes, but it isn't easy to generalize based on the TC2 model due to the limitations of TC2. From a model point of view we want to know which state the cpu/cluster is in: running or idling. The C-states represent the hardware-supported idle-states (controlling clock and/or power). An idle cluster or core may idle in one of these states or sit idle with everything powered up and clocked. The latter is 'active idle'. A cluster may be active idle if all the cpus are idling in some per-cpu idle-state and the cpuidle governor has chosen to leave it powered up (possibly due to target residency constraints). The same could in theory be the case for a cpu core. It could be spinning in the idle loop if cpuidle didn't decide to enter a C-state. On ARM, WFI is practically free to enter, so we always enter a proper hardware idle-state whenever we are idle, even if it is only for a single clock cycle. Hence, we would never be active idling an ARM cpu, so WFI takes the role of active idle in this case. If WFI had a target_residency that would prevent cpus from entering it and leave them spinning, we would need an active idle state for the cpus as well.
In the model we treat active idle as an idle state despite the cpu/cluster being fully operational and running. The reason for this is that even though we are in some P-state, we aren't actually doing anything useful and the power consumption is likely to be very different from when we are busy. In the cluster active idle case, all the cpus are idling, which means nobody is accessing caches and memory hence the transistor toggling is very limited (though it might be affected by snooping traffic if another cluster is busy). If we used the busy P-state power, we would vastly over-estimate the active idle power for the cluster in most cases. In the cpu case (if we weren't guaranteed to enter WFI), we would be spinning in some simple loop that probably wouldn't exercise the entire cpu core and hopefully use a little less power (no cache access and expensive instructions).
Since we are technically running when active idling, one could argue that we should have an active idle power number for each P-state. For ARM that isn't an issue for per-core idling as we have WFI. For clusters we may want to consider it.
I should add that the P-state influence does not go away entirely for cores when they enter WFI. Ps (F.5) is still there since WFI is only clock gating, so the voltage of the P-state still has an effect. It isn't voltage squared, so I'm not sure if it is really a problem.
The short answer is: In active idle the cpu/cluster is in a P-state doing nothing. We can make WFI the active idle state per-core (cpu) on ARM as we are guaranteed to enter it when the cpu is idle.
Agreed, but I have two concerns:

- If we take the cluster's 'active idle' as an idle state, that means Pd [w] is totally ignored for it; whatever frequency the cluster level is running at, the dynamic power will be ignored.
I wouldn't say we totally ignore Pd, we measure the total power P = Ps + Pd, but I agree with you that Pd depends on the P-state in which we are active idling. As I just added above, the same is also true for Ps (F.5). It is just worse for Pd (F.6) as it has voltage squared.
Below is some power data measured on CA7 in 'active idle':

  CPUFreq@156MHz:  11mA
  CPUFreq@312MHz:  28mA
  CPUFreq@624MHz:  36mA
  CPUFreq@800MHz:  45mA
  CPUFreq@1100MHz: 56mA

So in practice, if we use the lowest frequency's number for the cluster's 'active idle', there will be some deviation if the cluster is actually running at the highest frequency.
Yes, that is quite a difference, around 5x. The question is whether it actually affects the scheduling decisions if we include this in the model, or if we can get away with just picking something in the middle, like 36mA. If we pick 36mA, we would overestimate energy expense of idling the cluster in low-utilization scenarios, and under-estimate in high-utilization scenarios. I think it could give some strange results if active idling turns out to consume more energy than being busy for the lowest P-states. I can't come up with a scenario where it is a problem though. More thinking is needed I think.
If it turns out that we need to capture active idle more accurately in the model, we could extend the P-state table to have idle-power numbers for each state in addition to the busy power. We would need a special case in the idle energy calculation to use those numbers instead when we are in active idle and use the C-state data when we are in a true hardware idle state.
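A rough sketch of that extension (the table layout and names are invented for illustration; the active-idle currents loosely follow the CA7 numbers quoted earlier, the busy numbers are made up):

```python
# Hypothetical P-state table extended with per-P-state active-idle power,
# plus the special case in the idle energy calculation described above.

PSTATES = [  # (freq_mhz, busy_current_ma, active_idle_current_ma) - example values
    (156,  110, 11),
    (1100, 500, 56),
]

CSTATES = {"cluster-off": 0}  # true hardware idle-states keep their own table

def idle_power(active_idle, pstate_idx, cstate=None):
    # Special case: in active idle we are technically in a P-state, so use
    # that P-state's idle number instead of any C-state number.
    if active_idle:
        return PSTATES[pstate_idx][2]
    return CSTATES[cstate]
```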
There may be more than one kind of 'active idle' state for a cluster; for example, all cores in the cluster entering the 'WFI' state gives one corresponding 'active idle' state, and all cores entering the 'CPUOFF' state gives another. Should we handle these two kinds of 'active idle' state as the same one?

Furthermore, if one CPU only enters 'WFI' while the other CPUs in the cluster enter 'CPUOFF', how do we select the 'active idle' state?
Wouldn't it primarily affect the core energy consumption? I would associate the energy delta between all WFI and all CPUOff with the cores and not the cluster as I would have thought it was caused by powering off the cores. The cluster logic would be on and clocked in both cases and since the cores are idling they shouldn't cause any (different) Pd for the cluster in the two cases. Why would the selected core idle-state affect the cluster? Do you have an example?
If we change to treat the 'active idle' state as a cluster-level P-state, the above issues are easily dismissed.
Agreed, I think we should consider letting the active idle power depend on the actual P-state. Your numbers above definitely show it is something that needs further investigation. Thanks for sharing the numbers.
Thanks, Morten
On Wed, Sep 23, 2015 at 12:01:34PM +0100, Morten Rasmussen wrote:
On Tue, Sep 22, 2015 at 08:44:40PM +0100, Leo Yan wrote:
On Mon, Sep 21, 2015 at 05:31:37PM +0100, Morten Rasmussen wrote:
On Mon, Sep 21, 2015 at 06:58:30AM +0100, Leo Yan wrote:
On Fri, Sep 18, 2015 at 05:57:48PM +0100, Morten Rasmussen wrote:
[...]
There may be more than one kind of 'active idle' state for a cluster; for example, all cores in the cluster entering the 'WFI' state gives one corresponding 'active idle' state, and all cores entering the 'CPUOFF' state gives another. Should we handle these two kinds of 'active idle' state as the same one?

Furthermore, if one CPU only enters 'WFI' while the other CPUs in the cluster enter 'CPUOFF', how do we select the 'active idle' state?
Wouldn't it primarily affect the core energy consumption? I would associate the energy delta between all WFI and all CPUOff with the cores and not the cluster as I would have thought it was caused by powering off the cores. The cluster logic would be on and clocked in both cases and since the cores are idling they shouldn't cause any (different) Pd for the cluster in the two cases. Why would the selected core idle-state affect the cluster? Do you have an example?
Totally agree with this; the selected core idle-state will _NOT_ affect the cluster level at all.
"The cluster logic would be on and clocked in both cases and since the cores are idling they shouldn't cause any (different) Pd for the cluster in the two cases."
So during the 'active idle' period we need to directly use the cluster's (Ps + Pd) to calculate the cluster level's power.

Whether it is in the 'active idle' state or other running states, the cluster level is always active (there may be a small difference due to snooping), so we can calculate the cluster level's power in the same way.
Thanks, Leo Yan
On Fri, Sep 18, 2015 at 05:57:48PM +0100, Morten Rasmussen wrote:
On Thu, Sep 17, 2015 at 04:02:09PM +0100, Leo Yan wrote:
[...]
From formula F.4, we can split power into static leakage and dynamic power; IPA also uses static/dynamic leakage to depict its energy model. But EAS uses another way, which provides the power data for every OPP and idle state. So that means on one platform, we need to provide two kinds of power data.

IMHO, I think the static and dynamic representation is simpler, because usually we use (mW/MHz) to describe the power efficiency of a specific CPU, though (mW/MHz) cannot describe power consumption very accurately if the voltage has been changed (see formula F.6; usually the voltage is increased at higher frequencies). But if we use mW/MHz, we can calculate in a very simple way: we just multiply it by the frequency to get the dynamic power.
So we only need to provide the below parameters:
  P-state: static leakage, power efficiency (mW/MHz), capacity (DMIPS/MHz);
  C-state: static leakage, power efficiency (mW/MHz);
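For illustration, the reduced parameter set could look roughly like this (a sketch; the field names are invented, and the linear Ps + Pe*f power model only holds per F.5/F.6 while the voltage stays fixed):

```python
from dataclasses import dataclass

@dataclass
class PStateParams:
    static_leakage_mw: float            # Ps at this state's voltage (F.5)
    power_efficiency_mw_per_mhz: float  # Pe = b * V^2 (F.6)
    capacity_dmips_per_mhz: float       # performance, for mW-per-DMIPS comparison

@dataclass
class CStateParams:
    static_leakage_mw: float
    power_efficiency_mw_per_mhz: float

def pstate_power(p: PStateParams, freq_mhz: float) -> float:
    # F.4 with F.5/F.6 folded in: P = Ps + Pe * f (valid at a fixed voltage)
    return p.static_leakage_mw + p.power_efficiency_mw_per_mhz * freq_mhz
```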
What are your thoughts on unifying the energy model?
We want to unify the power models if at all possible. The IPA people are looking into it. The difficulty is that we are looking for different things, so the models have to capture enough detail to be useful for both.
Are you proposing to derive the individual P-state numbers from global numbers or do you propose to have the three parameters for each P-state in tables like we currently have them?
I'm referring to the first one: deriving the individual P-state numbers from global numbers.
If you want to derive them from global numbers, you would need to compensate for voltage scaling for both Ps and Pd, so you would need the voltage for each state. Otherwise your energy efficiency will _improve_ as you increase frequency.
Correct.
It might work. I think the first step is to see if the derived curves would correlate well with real measurements. We would need a way to derive static leakage and power efficiency from measurements. I don't know if that can be easily done. Do you have any suggestions for that?
Pd [w] = b * V [v] * V [v] * frequency (F.6)
From previous experience, if we fix the voltage for all OPPs then we get an almost linear ratio between Pd [w] and frequency, because the voltage in 'b * V [v] * V [v]' is fixed. The ratio will skew after the voltage is increased.
We can do power measurements in a simple environment (bare metal code or a simple generic Linux environment); below are some measurement methods:
1. First we need a stable baseline before power measurement; for example, first power off all other CPUs and only use one CPU for measurement. So we can first hotplug out all unused CPUs.
2. CPU(Ps [w]) = Power(CPU_WFI) - Power(CPU_OFF)
   CPU(Pd [w]) = Power(OPP) - Power(CPU_WFI)
   or
   CPU(Pd [w]) = Power(OPP') - Power(OPP)
For Pd [w], we need to run a benchmark (CoreMark) to keep the CPU at 100% utilization.
Then we can get "b * V [v] * V [v]" = Pd [w] / frequency, which is what we usually call the value of Pe (mW/MHz).
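The subtraction method above can be written out as follows (a sketch with illustrative numbers; per F.6, b = Pd / (V^2 * f)):

```python
# Derive the dynamic-power coefficient 'b' and Pe (mW/MHz) from the
# subtraction-based measurements described above. All numbers illustrative.

def dynamic_power(p_opp_mw, p_wfi_mw):
    # CPU(Pd) = Power(OPP) - Power(CPU_WFI), measured at 100% load
    return p_opp_mw - p_wfi_mw

def coeff_b(pd_mw, volt, freq_mhz):
    # F.6: Pd = b * V^2 * f  =>  b = Pd / (V^2 * f)
    return pd_mw / (volt * volt * freq_mhz)

def power_efficiency(pd_mw, freq_mhz):
    # Pe (mW/MHz) = Pd / f = b * V^2 at this OPP's voltage
    return pd_mw / freq_mhz
```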
I think Pe (mW/MHz) still cannot really reflect power efficiency; we also need to take into account the CPU's performance improvement (with a deeper pipeline) and its relationship with power consumption. So Pe (mW/MHz) / DMIPS (or capacity) easily lets us know, for one specific piece of code, which CPU will consume more power.
Deriving the table data using F.5 and F.6 would mean that we can only model systems that follow those formulas reasonably well. The current tables are pure measurement data with a little bit of extrapolation to find the cluster power, which should be a bit more flexible. I'm not sure if that really matter though.
Agreed, we can first use pure measurement data, and later check if we can use a global power efficiency number for some optimization (maybe we can simplify the energy model and improve scheduling performance).

I also have no confidence about which way is better :)
Thanks, Leo Yan
On Mon, Sep 21, 2015 at 02:17:37PM +0100, Leo Yan wrote:
On Fri, Sep 18, 2015 at 05:57:48PM +0100, Morten Rasmussen wrote:
On Thu, Sep 17, 2015 at 04:02:09PM +0100, Leo Yan wrote:
[...]
From formula F.4, we can split power into static leakage and dynamic power; IPA also uses static/dynamic leakage to depict its energy model. But EAS uses another way, which provides the power data for every OPP and idle state. So that means on one platform, we need to provide two kinds of power data.

IMHO, I think the static and dynamic representation is simpler, because usually we use (mW/MHz) to describe the power efficiency of a specific CPU, though (mW/MHz) cannot describe power consumption very accurately if the voltage has been changed (see formula F.6; usually the voltage is increased at higher frequencies). But if we use mW/MHz, we can calculate in a very simple way: we just multiply it by the frequency to get the dynamic power.
So we only need to provide the below parameters:
  P-state: static leakage, power efficiency (mW/MHz), capacity (DMIPS/MHz);
  C-state: static leakage, power efficiency (mW/MHz);
What are your thoughts on unifying the energy model?
We want to unify the power models if at all possible. The IPA people are looking into it. The difficulty is that we are looking for different things, so the models have to capture enough detail to be useful for both.
Are you proposing to derive the individual P-state numbers from global numbers or do you propose to have the three parameters for each P-state in tables like we currently have them?
I'm referring to the first one: deriving the individual P-state numbers from global numbers.
If you want to derive them from global numbers, you would need to compensate for voltage scaling for both Ps and Pd, so you would need the voltage for each state. Otherwise your energy efficiency will _improve_ as you increase frequency.
Correct.
It might work. I think the first step is to see if the derived curves would correlate well with real measurements. We would need a way to derive static leakage and power efficiency from measurements. I don't know if that can be easily done. Do you have any suggestions for that?
Pd [w] = b * V [v] * V [v] * frequency (F.6)
From previous experience, if we fix the voltage for all OPPs then we get an almost linear ratio between Pd [w] and frequency, because the voltage in 'b * V [v] * V [v]' is fixed. The ratio will skew after the voltage is increased.
Yes, fixing the voltage would be one way of getting more measurement points to derive 'b'. It does require setting up cpufreq to leave the voltage fixed though. We can't use an optimized cpufreq driver which scales the voltage.
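As a sketch of that approach: with the voltage fixed, F.6 says Pd is linear in f, so a simple least-squares fit of measured Pd against frequency yields the slope b * V^2 (the data points here are invented for illustration):

```python
# Fit Pd = slope * f + intercept from fixed-voltage measurements.
# The slope is b * V^2, i.e. Pe in mW/MHz at the fixed voltage; a large
# non-zero intercept would flag residual static power in the measurements.

def fit_slope(freqs_mhz, pd_mw):
    n = len(freqs_mhz)
    mf = sum(freqs_mhz) / n
    mp = sum(pd_mw) / n
    num = sum((f - mf) * (p - mp) for f, p in zip(freqs_mhz, pd_mw))
    den = sum((f - mf) ** 2 for f in freqs_mhz)
    return num / den
```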
We can do power measurements in a simple environment (bare metal code or a simple generic Linux environment); below are some measurement methods:
- First we need a stable baseline before power measurement; for example, first power off all other CPUs and only use one CPU for measurement. So we can first hotplug out all unused CPUs.
You may want to repeat the experiments with more than one cpu just to verify that the power consumption should be associated with the core and not the cluster.
As mentioned in my reply from yesterday, hotplug may not actually power down the cpu (it doesn't on TC2). It most likely will on most systems, but it is worth keeping in mind.
- CPU(Ps [w]) = Power(CPU_WFI) - Power(CPU_OFF)
  CPU(Pd [w]) = Power(OPP) - Power(CPU_WFI)
  or
  CPU(Pd [w]) = Power(OPP') - Power(OPP)
The last formula with fixed frequency and some additional computation to figure out the Pd, I assume.
For Pd [w], we need to run a benchmark (CoreMark) to keep the CPU at 100% utilization.
Then we can get "b * V [v] * V [v]" = Pd [w] / frequency, which is what we usually call the value of Pe (mW/MHz).
And when we have Pe, we can then compensate for the voltage scaling afterwards. Either directly as part of the energy calculatations or to generate tables similar to the existing ones with precomputed values.
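A sketch of that compensation step (hypothetical values): given Pe measured at a reference voltage, F.6 says Pd scales with V^2, so precomputed table entries could be generated as:

```python
# Generate a per-OPP dynamic-power table from a single Pe number measured
# at a reference voltage, compensating each OPP by (V / Vref)^2 per F.6.

def build_pd_table(opps, pe_ref_mw_per_mhz, v_ref):
    # opps: list of (freq_mhz, voltage) pairs for each P-state
    table = []
    for freq, volt in opps:
        pd = pe_ref_mw_per_mhz * freq * (volt / v_ref) ** 2
        table.append((freq, round(pd, 1)))
    return table
```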
I think Pe (mW/MHz) still cannot really reflect power efficiency; we also need to take into account the CPU's performance improvement (with a deeper pipeline) and its relationship with power consumption. So Pe (mW/MHz) / DMIPS (or capacity) easily lets us know, for one specific piece of code, which CPU will consume more power.
Right, Pe is just a value expressing the relation between frequency and dynamic power for a particular processor implementation at a specific voltage. You are right that energy efficiency is a comparison of real work (instructions executed) and energy cost (work/energy, or the inverse). IPC is different between processors.
It actually depends on the workload, but in the interest of keeping the model simple enough to be used for scheduling decisions I think we should stick to some average expression of the IPC (and compute capacity).
Deriving the table data using F.5 and F.6 would mean that we can only model systems that follow those formulas reasonably well. The current tables are pure measurement data with a little bit of extrapolation to find the cluster power, which should be a bit more flexible. I'm not sure if that really matter though.
Agreed, we can first use pure measurement data, and later check if we can use a global power efficiency number for some optimization (maybe we can simplify the energy model and improve scheduling performance).

I also have no confidence about which way is better :)
If it turns out that we can express the energy model using fewer input parameters and it works for real systems, I think it could make things easier for us in the long run. Fewer input parameters means fewer opportunities for people to do something wrong, and we can probably more easily do some quick checks of the values to see if they make sense.
Also it means less data to stick into DT or wherever it is going to live.
Thanks, Morten