On Fri, Sep 18, 2015 at 12:23:32PM +0100, Dietmar Eggemann wrote:
On 17/09/15 18:09, Dietmar Eggemann wrote:
On 07/09/15 06:50, Leo Yan wrote:
Hi all,
[...]
Also have enclosed these two patches for review.
Let's discuss these patches on LKML since you have sent out emails to LKML discussing these changes.
[...]
Software Environment
Kernel (4.2 + EAS RFCv5) + extra two patches [1]
ARM-TF [2]
Enable CPUIdle with PSCI
Enable CPUFreq with cpufreq-dt driver
Profiling scripts:
  calc_idle_diff.py [3]: calculate C-state's difference for different configurations
  calc_pstate_time.py [4]: calculate P-state's difference for different configurations
  calc_sched_preformance.py [5]: calculate scheduler performance
I saw that you saved an x86_64 idlestat binary on your github utility/profile_eas project. I thought so far we have to run idlestat on the target so it can retrieve the target idle state names like WFI, C2 or M2?
There is this energy model (EM) feature in idlestat (-e energy_model_file) which calculates energy consumption per trace file.
example on TC2:
# idlestat --trace -f trace.dat -t T -e energy_model_arm_tc2
Parsed energy model file successfully
...
ClusterA Energy Caps 22027 (2.202654e+04)
ClusterA Energy Idle 57 (5.740462e+01)
ClusterA Energy Index 22084 (2.208395e+04)
ClusterB Energy Caps 3236 (3.235515e+03)
ClusterB Energy Idle 40 (4.041970e+01)
ClusterB Energy Index 3276 (3.275935e+03)

Total Energy Index 25360 (2.535988e+04)
The current idlestat code has only this ARM TC2-specific EM file, but it should be easy for you to create one for your Hikey board.
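For comparing runs, the energy summary in the TC2 example can be post-processed with a few lines of Python. This is only a sketch: it assumes the output keeps the exact "<name> Energy <kind> <rounded> (<float>)" layout shown in that example, and `parse_energy` is a hypothetical helper, not part of idlestat. In this sample each cluster's Energy Index appears to be its Caps plus its Idle value, and the total appears to be the sum of the cluster indices; the script just checks that on the quoted numbers.

```python
import re

# Sample text copied from the TC2 example output quoted in this mail.
SAMPLE = """\
ClusterA Energy Caps 22027 (2.202654e+04)
ClusterA Energy Idle 57 (5.740462e+01)
ClusterA Energy Index 22084 (2.208395e+04)
ClusterB Energy Caps 3236 (3.235515e+03)
ClusterB Energy Idle 40 (4.041970e+01)
ClusterB Energy Index 3276 (3.275935e+03)
Total Energy Index 25360 (2.535988e+04)
"""

# Assumed line layout: "<name> Energy <kind> <rounded> (<float>)"
LINE_RE = re.compile(r"^(\S+)\s+Energy\s+(Caps|Idle|Index)\s+\d+\s+\(([^)]+)\)")


def parse_energy(text):
    """Return {(name, kind): value}, using the float printed in parentheses."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            result[(match.group(1), match.group(2))] = float(match.group(3))
    return result


if __name__ == "__main__":
    energy = parse_energy(SAMPLE)
    for cluster in ("ClusterA", "ClusterB"):
        caps_plus_idle = energy[(cluster, "Caps")] + energy[(cluster, "Idle")]
        print(cluster, "caps+idle:", caps_plus_idle, "index:", energy[(cluster, "Index")])
    cluster_sum = energy[("ClusterA", "Index")] + energy[("ClusterB", "Index")]
    print("sum of cluster indices:", cluster_sum, "total:", energy[("Total", "Index")])
```

The same parser could be pointed at the output of two runs (e.g. with and without EAS) to compare the Total Energy Index, provided both runs use the same energy model file.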
[...]
Profiling: performance
sysbench --test=cpu --num-threads=1 --max-time=10 run
rt-app performance is calculated with the formula below:
  task performance = slack/(c_period - c_run) * 1024
energy      mainline (ndm)  noeas (ndm)  eas (ndm)  eas (sched)
            prf             prf          prf        prf
sysbench    100             100          100        92

rt-app  6%  662             665          393        615
rt-app 13%  648             645          465        394
rt-app 19%  610             648          479        57
rt-app 25%  649             664          306        518
rt-app 31%  600             585          596        366
rt-app 38%  576             584          259        -166
rt-app 44%  466             487          30         -349
rt-app 50%  583             602          598        612
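To make the rt-app formula above concrete, here is a minimal sketch in Python. It assumes slack, c_period and c_run are the per-activation values from the rt-app log, all in the same time unit, and that slack goes negative when an activation overruns its period. `task_performance` and the sample values are hypothetical, not taken from the scripts or from any measurement.

```python
def task_performance(slack, c_period, c_run, scale=1024):
    """Compute slack / (c_period - c_run) * scale.

    scale defaults to 1024 as in the formula above; it is a parameter
    because other setups may normalize to a different value.
    """
    budget = c_period - c_run  # idle time the task would have if it ran on schedule
    if budget <= 0:
        raise ValueError("c_period must be larger than c_run")
    return slack / budget * scale


if __name__ == "__main__":
    # Hypothetical activation: 100000 us period, 25000 us configured run time.
    print(task_performance(75000, 100000, 25000))   # all of the idle budget left
    print(task_performance(37500, 100000, 25000))   # half of the idle budget left
    print(task_performance(-12500, 100000, 25000))  # overrun, so the result is negative
```

Under these assumptions, a negative value in the table means the task finished after the end of its period on average.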
Seeing these performance numbers, have you calibrated your json files for your hikey board?
ARM TC2 example, calibrated against A15:
# cat wl_test.json | grep calibration
"calibration": 141,
You're multiplying w/ 1024 whereas we use 100 :-)
Yes.
Have you found the reason for these crazy outliers (e.g. 'eas (sched)' 19%, 38%, 44% or 'eas (ndm)' 44%)?
Not yet and will dig into this issue.
Morten just mentioned that even if you calibrated your system correctly, you will not stress the hikey board as much as we stress the TC2, especially with rt-app 38%, 44% and 50%. We calibrate against a big cpu on TC2, and that means starting with a run/period ratio of 38% we start to saturate the 3 little cpus of the 5 TC2 cpus (we're using # rt-app threads eq. # logical cpus).
Please help review the json file for Hikey which I pasted in another email; it will launch 8 threads for 8 CPUs.
So with the current setup you should never see negative performance numbers on hikey (SMP) but the performance numbers should decrease with higher rt-app percentage values.
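The saturation argument can be illustrated with some rough arithmetic. The sketch below assumes one rt-app thread per cpu, a duty cycle calibrated against the reference (big) cpu, and a little cpu that delivers a fixed fraction of the big cpu's throughput. The 0.4 ratio is a made-up illustrative number, not a measured TC2 value, and `cpu_load` is a hypothetical helper.

```python
# Run/period ratios used in the rt-app runs discussed in this thread.
DUTY_CYCLES = (0.06, 0.13, 0.19, 0.25, 0.31, 0.38, 0.44, 0.50)

# ASSUMPTION: illustrative capacity of a little cpu relative to the big cpu
# the workload was calibrated on. Not a measured value.
LITTLE_REL_CAPACITY = 0.4

# On an SMP system every cpu equals the calibration cpu.
SMP_REL_CAPACITY = 1.0


def cpu_load(duty, relative_capacity):
    """Fraction of a cpu's time needed for a task calibrated on the reference cpu.

    Values above 1.0 mean the cpu cannot finish the work within the period.
    """
    return duty / relative_capacity


if __name__ == "__main__":
    for duty in DUTY_CYCLES:
        little = cpu_load(duty, LITTLE_REL_CAPACITY)
        smp = cpu_load(duty, SMP_REL_CAPACITY)
        state = "saturated" if little > 1.0 else "ok"
        print(f"{duty:.0%}: little {little:.2f} ({state}), smp {smp:.2f}")
```

With this assumed ratio the little cpus get close to full load around 38% and exceed it at 44% and 50%, while on an SMP system the load never exceeds the configured duty cycle, which matches the expectation that performance should decrease but stay positive.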
Thanks,
Leo Yan