On Mon, Sep 21, 2015 at 02:17:37PM +0100, Leo Yan wrote:
On Fri, Sep 18, 2015 at 05:57:48PM +0100, Morten Rasmussen wrote:
On Thu, Sep 17, 2015 at 04:02:09PM +0100, Leo Yan wrote:
[...]
From formula F.4, we can express power as the combination of static leakage and dynamic power; IPA also uses static and dynamic power to describe its energy model. EAS takes a different approach and provides power data for every OPP and idle state. That means on one platform we need to provide two kinds of power data.
IMHO, the static/dynamic model is simpler, because we usually use mW/MHz to describe the power efficiency of a specific CPU, though mW/MHz cannot describe power consumption very accurately once the voltage changes (see formula F.6; the voltage is usually increased at higher frequencies). But with mW/MHz the calculation becomes very simple: we just multiply it by the frequency to get the dynamic power.
So we only need to provide the following parameters:
  P-state: static leakage, power efficiency (mW/MHz), capacity (DMIPS/MHz);
  C-state: static leakage, power efficiency (mW/MHz);
What are your thoughts on unifying the energy model?
We want to unify the power models if at all possible. The IPA people are looking into it. The difficulty is that we are looking for different things, so the models have to capture enough detail to be useful for both.
Are you proposing to derive the individual P-state numbers from global numbers or do you propose to have the three parameters for each P-state in tables like we currently have them?
I'm referring to the first one: deriving the individual P-state numbers from global numbers.
If you want to derive them from global numbers, you would need to compensate for voltage scaling for both Ps and Pd, so you would need the voltage for each state. Otherwise your energy efficiency will _improve_ as you increase frequency.
Correct.
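To illustrate the point about uncompensated numbers, here is a minimal sketch. It assumes total power is simply Ps + Pe * f with a single global Ps and Pe and no voltage term; the function name and the numbers in the usage note are made up for illustration, not measurements.

```python
def energy_per_work(ps_mw, pe_mw_per_mhz, freq_mhz):
    """Power divided by frequency, as a stand-in for energy per unit of work.

    Assumes work done per second is proportional to frequency and that
    neither Ps nor Pe is adjusted for voltage (the case being warned about).
    """
    power_mw = ps_mw + pe_mw_per_mhz * freq_mhz
    return power_mw / freq_mhz
```

With a hypothetical Ps of 100 mW and Pe of 0.5 mW/MHz, the value at 1000 MHz comes out lower than at 500 MHz, because the fixed static term is spread over more work. That is why the per-state voltage is needed to make the derived numbers realistic.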
It might work. I think the first step is to see if the derived curves would correlate well with real measurements. We would need a way to derive static leakage and power efficiency from measurements. I don't know if that can be easily done. Do you have any suggestions for that?
Pd [W] = b * V [V] * V [V] * frequency    (F.6)
From previous experience, if we fix the voltage for all OPPs we get an almost linear relationship between Pd [W] and frequency, because 'b * V [V] * V [V]' is then constant. The ratio skews once the voltage is increased.
Yes, fixing the voltage would be one way of getting more measurement points to derive 'b'. It does require setting up cpufreq to leave the voltage fixed though. We can't use an optimized cpufreq driver which scales the voltage.
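As a rough sketch of how 'b' could be derived from several fixed-voltage measurement points, the following fits a line through the origin to (frequency, Pd) pairs and divides the slope by V^2, following F.6. The function name and the choice of a least-squares fit through the origin are assumptions of this sketch, not something the thread prescribes.

```python
def fit_b(freqs_hz, pd_watts, voltage_v):
    """Estimate b in Pd = b * V^2 * f from points taken at one fixed voltage.

    Least-squares slope of a line through the origin: the slope equals
    b * V^2, so dividing by V^2 gives b.
    """
    num = sum(f * p for f, p in zip(freqs_hz, pd_watts))
    den = sum(f * f for f in freqs_hz)
    slope = num / den  # watts per Hz
    return slope / (voltage_v ** 2)
```

If the measured points do not lie close to a straight line, that is a hint that the voltage was not actually held constant or that the platform does not follow F.6 well.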
We can do the power measurements in a simple environment (bare-metal code or a simple generic Linux environment). Below are some measurement methods:
- First we need a stable baseline before the power measurement; for example, we need to power off all other CPUs and use only one CPU for the measurement. So we can first hotplug out all unused CPUs.
You may want to repeat the experiments with more than one cpu just to verify that the power consumption should be associated with the core and not the cluster.
As mentioned in my reply from yesterday, hotplug may not actually power down the cpu (it doesn't on TC2). It most likely will on most systems, but it is worth keeping in mind.
- CPU(Ps [W]) = Power(CPU_WFI) - Power(CPU_OFF)
  CPU(Pd [W]) = Power(OPP) - Power(CPU_WFI)
  or
  CPU(Pd [W]) = Power(OPP') - Power(OPP)
The last formula with fixed frequency and some additional computation to figure out the Pd, I assume.
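The subtractions above can be sketched as follows. All function and parameter names are hypothetical. The two-OPP variant is one possible reading of the "additional computation": it assumes both OPPs are measured at the same voltage, so the power difference divided by the frequency difference gives the mW/MHz slope.

```python
def static_power(p_wfi_mw, p_off_mw):
    """Ps = Power(CPU_WFI) - Power(CPU_OFF)."""
    return p_wfi_mw - p_off_mw


def dynamic_power(p_opp_mw, p_wfi_mw):
    """Pd = Power(OPP) - Power(CPU_WFI), with the CPU fully loaded at that OPP."""
    return p_opp_mw - p_wfi_mw


def slope_from_two_opps(p_hi_mw, f_hi_mhz, p_lo_mw, f_lo_mhz):
    """mW/MHz from two OPPs, assuming both were measured at the same voltage."""
    return (p_hi_mw - p_lo_mw) / (f_hi_mhz - f_lo_mhz)
```

The two-OPP form has the advantage that the static and baseline contributions cancel out, so it does not depend on an accurate CPU_WFI measurement.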
For Pd [W], we need to run a benchmark (CoreMark) to keep the CPU 100% busy.
Then we can get "b * V [V] * V [V]" = Pd [W] / frequency; this is the value we usually call Pe (mW/MHz).
And when we have Pe, we can then compensate for the voltage scaling afterwards. Either directly as part of the energy calculations or to generate tables similar to the existing ones with precomputed values.
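A minimal sketch of the table-generation option, assuming Pe was measured at a known reference voltage and that dynamic power follows F.6, so it scales with (V / V_ref)^2. The function name and the (frequency, voltage) input format are assumptions. Static power is left out because the formula for its voltage dependence is not shown in this part of the thread.

```python
def dynamic_power_table(pe_ref_mw_per_mhz, v_ref, opps):
    """Precompute dynamic power for each OPP from one global Pe value.

    opps is a list of (freq_mhz, voltage_v) pairs; the result is a list of
    (freq_mhz, pd_mw) pairs with Pe rescaled to each OPP's voltage.
    """
    table = []
    for freq_mhz, voltage_v in opps:
        scale = (voltage_v / v_ref) ** 2
        table.append((freq_mhz, pe_ref_mw_per_mhz * scale * freq_mhz))
    return table
```

The same scaling could instead be applied at run time inside the energy calculation; precomputing keeps the per-decision cost at a table lookup.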
I think Pe (mW/MHz) still cannot really reflect power efficiency. We also need to take into account the CPU's performance improvement (e.g. from a pipeline with more stages) and its relationship with power consumption. So Pe (mW/MHz) / DMIPS (or capacity) easily lets us know which CPU will consume more power when running one specific piece of code.
Right, Pe is just a value expressing the relation between frequency and dynamic power for a particular processor implementation at a specific voltage. You are right that energy efficiency is a comparison of real work (instructions executed) and energy cost (work/energy or the inverse). IPC is different between processors.
It actually depends on the workload, but in the interest of keeping the model simple enough to be used for scheduling decisions I think we should stick to some average expression of the IPC (and compute capacity).
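As a small sketch of the Pe / capacity comparison, dividing mW/MHz by DMIPS/MHz gives mW per DMIPS, which can be compared across CPU types. The function name is hypothetical, and it relies on a single average capacity value per CPU type as suggested above.

```python
def power_per_dmips(pe_mw_per_mhz, capacity_dmips_per_mhz):
    """Dynamic power cost per unit of work: (mW/MHz) / (DMIPS/MHz) = mW/DMIPS."""
    return pe_mw_per_mhz / capacity_dmips_per_mhz
```

For example, with made-up numbers, a CPU at 0.8 mW/MHz and 3.2 DMIPS/MHz costs twice as much per DMIPS as one at 0.2 mW/MHz and 1.6 DMIPS/MHz, even though it also does twice the work per MHz.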
Deriving the table data using F.5 and F.6 would mean that we can only model systems that follow those formulas reasonably well. The current tables are pure measurement data with a little bit of extrapolation to find the cluster power, which should be a bit more flexible. I'm not sure if that really matters though.
Agreed, we can first use pure measurement data, and later check whether we can use a global power efficiency number for some optimization (maybe we can simplify the energy model and improve scheduling performance).
I also have no confidence about which way is better :)
If it turns out that we can express the energy model using fewer input parameters and it works for real systems, I think it could make things easier for us in the long run. Fewer input parameters means fewer opportunities for people to do something wrong, and we can probably more easily do some quick checks of the values to see if they make sense.
Also it means less data to stick into DT or wherever it is going to live.
Thanks, Morten