Hi Dietmar,
Thanks a lot for reviewing; please see my comments below.
On Thu, Sep 17, 2015 at 06:09:43PM +0100, Dietmar Eggemann wrote:
On 07/09/15 06:50, Leo Yan wrote:
Hi all,
[...]
I have also enclosed these two patches for review.
Let's discuss these patches on LKML since you have sent out emails to LKML discussing these changes.
Yeah, I will look into Morten's comments for the related patches.
[...]
Software Environment
Kernel (4.2 + EAS RFCv5) + extra two patches [1]
ARM-TF [2]
Enable CPUIdle with PSCI
Enable CPUFreq with cpufreq-dt driver
Profiling scripts:
- calc_idle_diff.py [3]: calculate C-state differences between configurations
- calc_pstate_time.py [4]: calculate P-state differences between configurations
- calc_sched_preformance.py [5]: calculate scheduler performance
I saw that you saved an x86_64 idlestat binary on your github utility/profile_eas project.
I use the x86_64 idlestat to compare the trace logs and get the difference in the idle duty cycle. For example, I take the rt-app 6% trace log files for mainline and EAS (ndm), then I use the command below on the host PC to get the difference in the CPUs' idle duty cycle:
./idlestat --import -f eas_ndm_trace.log -b mainline_trace.log -r comparison >> idlestat_compare.txt
So finally I can summarize the idle duty cycles for the different configurations.
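For what it's worth, the comparison step boils down to a per-C-state diff of the residency numbers; below is a rough Python sketch of the idea (the state names and numbers are placeholders, and this is not the actual calc_idle_diff.py script):

# Sketch: compare per-C-state residency between two configurations.
# The state names and residency numbers below are placeholders; in
# practice they come from the idlestat reports, not from this script.

mainline = {"WFI": 12.3, "CPU_OFF": 45.6, "CLUSTER_OFF": 30.1}   # seconds
eas_ndm  = {"WFI": 10.8, "CPU_OFF": 48.2, "CLUSTER_OFF": 28.7}

for state in sorted(mainline):
    diff = eas_ndm.get(state, 0.0) - mainline[state]
    print("%-12s mainline %7.2fs  eas(ndm) %7.2fs  diff %+7.2fs"
          % (state, mainline[state], eas_ndm.get(state, 0.0), diff))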
I thought so far we have to run idlestat on the target so it can retrieve the target idle state names like WFI, C2 or M2?
Yes, I run idlestat on the target with the commands below:
./idlestat --trace -f ./result/mp3/trace.log -t 30 -p -c -w -o ./result/mp3/report.log -- ./rt-app ./doc/examples/mp3-long.json
./idlestat --trace -f ./result/rt-app-6/trace.log -t 30 -p -c -w -o ./result/rt-app-6/report.log -- ./rt-app ./doc/examples/rt-app-6.json
There is this energy model (EM) feature in idlestat (-e energy_model_file) which calculates energy consumption per trace file.
example on TC2:
# idlestat --trace -f trace.dat -t T -e energy_model_arm_tc2
Parsed energy model file successfully
...
ClusterA Energy Caps   22027  (2.202654e+04)
ClusterA Energy Idle      57  (5.740462e+01)
ClusterA Energy Index  22084  (2.208395e+04)
ClusterB Energy Caps    3236  (3.235515e+03)
ClusterB Energy Idle      40  (4.041970e+01)
ClusterB Energy Index   3276  (3.275935e+03)

Total Energy Index     25360  (2.535988e+04)
The current idlestat code only has this ARM TC2-specific EM file, but it should be easy for you to create one for your HiKey board.
Thanks for pointing this out; I did not know about it before. I just gave it a quick try and it works on my side. But I found that if I add the "WFI" state it reports the error below, and if I remove the "WFI" state the error goes away. The "WFI" state should not be ignored, so I will check idlestat's source code.
Error: parse_energy_model: too many C states specified for cluster in energy_model_hikey can't parse energy model file
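For my own understanding, the energy index is essentially the per-state residency weighted by the per-state power taken from the EM file; below is a rough Python sketch of that idea (made-up frequencies, states and power numbers, not idlestat's actual implementation and not a real HiKey model):

# Sketch: energy index as per-state residency weighted by per-state power.
# The frequencies, state names and power numbers are made up for
# illustration; they are NOT taken from idlestat or from a HiKey EM file.

cap_power  = {1200000: 400.0, 800000: 200.0}   # freq (kHz) -> power (mW)
idle_power = {"WFI": 5.0, "CPU_OFF": 1.0}      # C-state    -> power (mW)

cap_time   = {1200000: 3.2, 800000: 10.5}      # residency in seconds
idle_time  = {"WFI": 6.0, "CPU_OFF": 10.3}

energy_caps = sum(cap_power[f] * t for f, t in cap_time.items())
energy_idle = sum(idle_power[s] * t for s, t in idle_time.items())

print("Energy Caps  %.1f" % energy_caps)
print("Energy Idle  %.1f" % energy_idle)
print("Energy Index %.1f" % (energy_caps + energy_idle))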
Profiling: performance
sysbench --test=cpu --num-threads=1 --max-time=10 run
rt-app performance is calculated with the formula below:
task performance = slack / (c_period - c_run) * 1024
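A small worked example of this formula, with placeholder values roughly matching the rt-app 6% configuration (2000us period, 120us run); the slack value is made up:

# Sketch: rt-app task performance index, per the formula above:
#   performance = slack / (c_period - c_run) * 1024
# All values are in microseconds; the slack value is a placeholder.

def perf_index(slack_us, c_period_us, c_run_us):
    return slack_us / float(c_period_us - c_run_us) * 1024

# 2000us period, 120us run (~6% duty cycle), 1200us of slack left over.
print(perf_index(slack_us=1200, c_period_us=2000, c_run_us=120))  # ~653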
energy        mainline (ndm)   noeas (ndm)   eas (ndm)   eas (sched)
              prf              prf           prf         prf
sysbench      100              100           100          92
rt-app 6%     662              665           393         615
rt-app 13%    648              645           465         394
rt-app 19%    610              648           479          57
rt-app 25%    649              664           306         518
rt-app 31%    600              585           596         366
rt-app 38%    576              584           259        -166
rt-app 44%    466              487            30        -349
rt-app 50%    583              602           598         612
Seeing these performance numbers, have you calibrated your json files for your HiKey board?
ARM TC2 example, calibrated against A15:
# cat wl_test.json | grep calibration
    "calibration": 141,
No, I still use "CPU0" for calibration. Below is my rt-app-6.json file; could you check whether there is anything else I missed?
{ "tasks": { "thread0": { "instance": 5, "loop": -1, "run": 120, "sleep": 0, "timer": { "ref": "unique", "period": 2000 } } }, "global": { "duration": 20, "calibration": "CPU0", "default_policy": "SCHED_OTHER", "pi_enabled": false, "lock_pages": false, "logdir": "./", "log_basename": "rt-app-6", "gnuplot": true } }
Summary
- After applying the two extra patches, the profiling results are consistent and stable for EAS (ndm) and EAS (sched). The tasks will be placed into the first cluster for LITTLE.LITTLE; so EAS (ndm) and EAS (sched) are much [...]
Shouldn’t we call it an SMP system instead of LITTLE.LITTLE?
From the CPU topology point of view, LITTLE.LITTLE is somewhat different from an SMP system with only one cluster. Later I will directly use "SMP".
Thanks,
Leo Yan