Re: [Sched-tools] [RESEND PATCH v2] doc: measure the efficiency of cpufreq governors

25 Jun 2015

(This might be of interest to folks on eas-dev, adding them to cc)
On Wed, Jun 24, 2015 at 11:53 AM, Vincent Guittot
vincent.guittot@linaro.org wrote:
...
On 24 June 2015 at 03:41,  pang.xunlei@zte.com.cn wrote:
...
Hi Vincent,
Vincent Guittot vincent.guittot@linaro.org wrote 2015-06-23 PM 09:43:55:
...
Re: [RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
Hi Xunlei,
I have run the bench on a quad A15 with sched-dvfs (but without eas
patches unlike you)
#sudo ./test.sh 4 100 1000
Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 18%
ondemand efficiency: 48%
cfs efficiency: 24%
#sudo ./test.sh 4 200 1000
Frequency domain CPU0~CPU3, run 200ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 40%
ondemand efficiency: 68%
cfs efficiency: 30%
$ sudo ./test.sh 4 50 1000
Frequency domain CPU0~CPU3, run 50ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 0%
ondemand efficiency: 25%
cfs efficiency: 19%
As an example, here is the result when the ondemand parameters are
tuned for the platform:
 sudo ./test.sh 4 100 1000
Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 19%
ondemand efficiency: 94%
cfs efficiency: 23%
Besides these results, I have seen variation in the results, which
confirms the value of having more statistics such as min, max and stdev.
I guess the hardware environment has something to do with this. On my
platform there are 11 available frequencies in total, 1200MHz~2200MHz,
with a step size of 100MHz.
Yes, for sure; it was just to give some figures from a different platform.
Regarding the variation, I have seen these variations on the same
platform with the same SW. Nevertheless, these variations are somewhat
normal if we consider the default sampling rate of 164ms for some
governors compared to a run duration of 100ms.
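As a rough illustration of the min/max/stdev statistics mentioned above, a sketch along these lines could summarise the efficiency figures collected over repeated runs. The helper name `summarize`, the one-value-per-line input and the sample values are assumptions made for this example; none of them come from the patch.

```sh
#!/bin/sh
# Hypothetical helper (not part of the patch): read one numeric sample per
# line on stdin and print min, max, mean and population standard deviation.
summarize() {
	awk '
		NR == 1 { min = $1; max = $1 }
		{
			if ($1 < min) min = $1
			if ($1 > max) max = $1
			sum += $1
			sumsq += $1 * $1
		}
		END {
			if (NR == 0) exit 1
			mean = sum / NR
			var = sumsq / NR - mean * mean
			if (var < 0) var = 0	# guard against rounding noise
			printf "min=%d max=%d mean=%.1f stdev=%.1f\n", min, max, mean, sqrt(var)
		}
	'
}

# Made-up efficiency values (in %) from five runs of one governor.
printf '%s\n' 48 52 45 50 55 | summarize
# prints: min=45 max=55 mean=50.0 stdev=3.4
```

With a 100ms run phase and a sampling period of the same order, a spread of a few percent between runs would not be surprising, so reporting the spread next to the mean makes the comparison between governors easier to judge.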
Regards,
Vincent
...
Also it may get different results when running with different workload
loops.
-Xunlei
...
Regards,
Vincent
On 18 June 2015 at 13:35,  pang.xunlei@zte.com.cn wrote:
...
Just tested on my Intel EAS test environment (with the x86 frequency
invariant hook implemented), with EAS disabled and sched-dvfs enabled.
#./test.sh 3 100 1000
Frequency domain CPU0~CPU2, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 92%
ondemand efficiency: 97%
cfs efficiency: 79%
#./test.sh 3 200 1000
Frequency domain CPU0~CPU2, run 200ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 97%
ondemand efficiency: 99%
cfs efficiency: 89%
#./test.sh 3 50 1000
Frequency domain CPU0~CPU2, run 50ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 93%
ondemand efficiency: 96%
cfs efficiency: 58%
#./test.sh 3 1000 100
Frequency domain CPU0~CPU2, run 1000ms, sleep 100ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 99%
ondemand efficiency: 99%
cfs efficiency: 97%
Seems sched-dvfs is computing-inefficient at low cpu usage (which implies
it is power efficient), but computing-efficient at high cpu usage.
-Xunlei
Xunlei Pang xlpang@126.com wrote 2015-06-18 PM 05:06:07:
...
[RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
From: Xunlei Pang pang.xunlei@linaro.org
DVFS adds latency to the execution of a task because of the time needed
to decide to move to max freq. We need to measure this latency and check
that the governor stays in an acceptable range.

When workgen runs a json file, a log file is created for each thread.
This log file records the number of loops that have been executed and
the duration for executing these loops (per phase). We can use these
figures to evaluate the latency that is added by a cpufreq governor
and its "performance efficiency".

We use the run+sleep pattern to do the measurement. For the run time per
loop, the performance governor should run the expected duration as the
CPU stays at max freq. Conversely, the powersave governor will give us
the longest duration (as it stays at the lowest OPP). Other governors
will be somewhere between the 2 previous durations, as they will use
several OPPs and will go back to max frequency after a defined duration
which depends on their monitoring period.
The formula:

   duration of powersave gov - duration of the gov
 ----------------------------------------------------------- x 100%
   duration of powersave gov - duration of performance gov

will give the efficiency of the governor. 100% means as efficient as
the perf governor and 0% means as efficient as the powersave
governor.
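To make the formula concrete, here is a minimal sketch using integer shell arithmetic. The function name `efficiency`, its argument order and the sample durations are assumptions made for this example; the scripts in the patch do the equivalent computation with `expr`.

```sh
#!/bin/sh
# Sketch of the efficiency formula (illustrative, not the patch's code).
# Arguments: average run duration under powersave, under performance and
# under the governor being measured, all in the same unit (e.g. us).
efficiency() {
	powersave=$1
	performance=$2
	gov=$3

	# The formula only makes sense if performance is faster than powersave.
	if [ "$performance" -ge "$powersave" ]; then
		echo "performance duration must be shorter than powersave duration" >&2
		return 1
	fi

	numerator=$(( (powersave - gov) * 100 ))
	# A governor that is slower than powersave is reported as 0%.
	if [ "$numerator" -lt 0 ]; then
		numerator=0
	fi
	echo $(( numerator / (powersave - performance) ))
}

# Made-up durations: powersave 200000us, performance 100000us,
# governor under test 125000us.
efficiency 200000 100000 125000	# prints 75
```

In this example the governor recovers three quarters of the gap between powersave and performance, hence 75%.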
This patch offers json files and shell scripts to do the measurement.
Usage: ./test.sh <cpus> <runtime> <sleeptime>
cpus: number of cpus in the CPU0's frequency domain
runtime: running time in ms per loop of the workload pattern
sleeptime: sleeping time in ms per loop of the workload pattern
Example:
"./test.sh 4 100 1000" means
CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.
test result on my machine:
~#./test.sh 4 100 1000
Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 28%
ondemand efficiency: 95%
NOTE: Make sure the "sed", "cut", "grep", "rt-app", etc. tools are
available on your test machine, and run the script with root privileges.
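The requirements in the note could be checked up front with something like the sketch below. The function names `missing_tools` and `preflight` are hypothetical and not part of the patch; the tool list is the one named in the note.

```sh
#!/bin/sh
# Hypothetical pre-flight check (not part of the patch).

# Print every tool from the argument list that cannot be found in PATH.
missing_tools() {
	for tool in "$@"; do
		command -v "$tool" >/dev/null 2>&1 || echo "$tool"
	done
}

# Fail early if a required tool is missing or we are not running as root.
preflight() {
	missing=$(missing_tools sed cut grep rt-app)
	if [ -n "$missing" ]; then
		echo "Missing tools:" $missing >&2
		return 1
	fi
	if [ "$(id -u)" -ne 0 ]; then
		echo "Please run this script as root" >&2
		return 1
	fi
}
```

A script could call `preflight || exit 1` before touching any scaling_governor file, so that a missing rt-app binary is reported before the governors are changed.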
Signed-off-by: Xunlei Pang pang.xunlei@linaro.org
 doc/examples/cpufreq_governor_efficiency/README    | 54 ++++++++++++++
 .../cpufreq_governor_efficiency/calibration.json   | 27 +++++++
 .../cpufreq_governor_efficiency/calibration.sh     | 11 +++
 doc/examples/cpufreq_governor_efficiency/dvfs.json | 27 +++++++
 doc/examples/cpufreq_governor_efficiency/dvfs.sh   | 38 ++++++++++
 doc/examples/cpufreq_governor_efficiency/test.sh   | 82 ++++++++++++++++++++++
 6 files changed, 239 insertions(+)
 create mode 100644 doc/examples/cpufreq_governor_efficiency/README
 create mode 100644 doc/examples/cpufreq_governor_efficiency/calibration.json
 create mode 100755 doc/examples/cpufreq_governor_efficiency/calibration.sh
 create mode 100644 doc/examples/cpufreq_governor_efficiency/dvfs.json
 create mode 100755 doc/examples/cpufreq_governor_efficiency/dvfs.sh
 create mode 100755 doc/examples/cpufreq_governor_efficiency/test.sh

diff --git a/doc/examples/cpufreq_governor_efficiency/README b/doc/examples/cpufreq_governor_efficiency/README
new file mode 100644
index 0000000..cc8efe1
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/README
@@ -0,0 +1,54 @@
+Measure the efficiency of cpufreq governors using rt-app
+
+BACKGROUND:
+	DVFS adds latency to the execution of a task because of the time needed
+	to decide to move to max freq. We need to measure this latency and check
+	that the governor stays in an acceptable range.
+
+	When workgen runs a json file, a log file is created for each thread.
+	This log file records the number of loops that have been executed and
+	the duration for executing these loops (per phase). We can use these
+	figures to evaluate the latency that is added by a cpufreq governor
+	and its "performance efficiency".
+
+	We use the run+sleep pattern to do the measurement. For the run time per
+	loop, the performance governor should run the expected duration as the
+	CPU stays at max freq. Conversely, the powersave governor will give us
+	the longest duration (as it stays at the lowest OPP). Other governors
+	will be somewhere between the 2 previous durations, as they will use
+	several OPPs and will go back to max frequency after a defined duration
+	which depends on their monitoring period.
+
+	The formula:
+
+	   duration of powersave gov - duration of the gov
+	 ----------------------------------------------------------- x 100%
+	   duration of powersave gov - duration of performance gov
+
+	will give the efficiency of the governor. 100% means as efficient as
+	the perf governor and 0% means as efficient as the powersave governor.
+
+	This test offers json files and shell scripts to do the measurement.
+
+USAGE:
+	./test.sh <cpus> <runtime> <sleeptime>
+	cpus: number of cpus in the CPU0's frequency domain
+	runtime: running time in ms per loop of the workload pattern
+	sleeptime: sleeping time in ms per loop of the workload pattern
+
+Example:
+	"./test.sh 4 100 1000" means
+	CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.
+
+	test result on an Intel machine:
+	~#./test.sh 4 100 1000
+	Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
+	powersave efficiency: 0%
+	performance efficiency: 100%
+	conservative efficiency: 28%
+	ondemand efficiency: 95%
+
+NOTE:
+	Make sure the "sed", "cut", "grep", "rt-app", etc. tools are available
+	on your test machine, and run the script with root privileges.
+
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.json b/doc/examples/cpufreq_governor_efficiency/calibration.json
new file mode 100644
index 0000000..4377990
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.json
@@ -0,0 +1,27 @@
+{
+	"tasks" : {
+		"thread" : {
+			"instance" : 1,
+			"cpus" : [0],
+			"loop" : 1,
+			"phases" : {
+				"run" : {
+					"loop" : 1,
+					"run" : 200000,
+				},
+				"sleep" : {
+					"loop" : 1,
+					"sleep" : 200000,
+				}
+			}
+		}
+	},
+	"global" : {
+		"default_policy" : "SCHED_FIFO",
+		"calibration" : "CPU0",
+		"lock_pages" : true,
+		"ftrace" : true,
+		"logdir" : "./",
+	}
+}
+
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.sh b/doc/examples/cpufreq_governor_efficiency/calibration.sh
new file mode 100755
index 0000000..d10e644
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+set -e
+
+echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
+
+sleep 1
+
+pLoad=$(rt-app calibration.json 2>&1 |grep pLoad |sed 's/.*= \(.*\)ns.*/\1/')
+sed 's/"calibration" : .*,/"calibration" : '$pLoad',/' -i dvfs.json
+
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.json b/doc/examples/cpufreq_governor_efficiency/dvfs.json
new file mode 100644
index 0000000..b413156
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.json
@@ -0,0 +1,27 @@
+{
+	"tasks" : {
+		"thread" : {
+			"instance" : 1,
+			"cpus" : [0],
+			"loop" : 5,
+			"phases" : {
+				"running" : {
+					"loop" : 1,
+					"run" : 100000,
+				},
+				"sleeping" : {
+					"loop" : 1,
+					"sleep" : 1000000,
+				}
+			}
+		}
+	},
+	"global" : {
+		"default_policy" : "SCHED_OTHER",
+		"calibration" : 90,
+		"lock_pages" : true,
+		"ftrace" : true,
+		"logdir" : "./",
+	}
+}
+
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.sh b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
new file mode 100755
index 0000000..8591fc7
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+#echo $1 $2 $3
+set -e
+
+if [ $1 ] && [ $2 ] ; then
+	for i in $(seq 0 1 $(expr $2 - 1)); do
+		echo $1 > /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
+		#cat /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
+	done
+
+	sleep 3
+fi
+
+if [ $3 ] ; then
+	sed 's/"run" : .*,/"run" : '$3',/' -i dvfs.json
+fi
+
+if [ $4 ] ; then
+	sed 's/"sleep" : .*,/"sleep" : '$4',/' -i dvfs.json
+fi
+
+#cat dvfs.json
+
+rt-app dvfs.json 2> /dev/null
+
+if [ $1 ] ; then
+	mv -f rt-app-thread-0.log rt-app_$1_run$3us_sleep$4us.log
+
+	sum=0
+	for i in $(cat rt-app_$1_run$3us_sleep$4us.log | sed 'n;d' | sed '1d' |cut -f 3); do
+		sum=$(expr $sum + $i)
+	done
+	sum=$(expr $sum / 5)
+	echo $sum
+	rm -f rt-app_$1_run$3us_sleep$4us.log
+fi
+
diff --git a/doc/examples/cpufreq_governor_efficiency/test.sh b/doc/examples/cpufreq_governor_efficiency/test.sh
new file mode 100755
index 0000000..d72fc6a
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/test.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+
+set -e
+
+set_calibration() {
+	calibration.sh
+}
+
+test_efficiency() {
+
+	FILENAME="results_$RANDOM$$.txt"
+
+	if [ -e /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors ]; then
+		for i in $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors); do
+			export gov_$i=$(echo $i)
+		done
+	else
+		echo "cpufreq is not available!"
+		exit
+	fi
+
+	if [ ! $gov_performance ] ; then
+		echo "Can't find performance governor!"
+		exit
+	fi
+
+	if [ ! $gov_powersave ] ; then
+		echo "Can't find powersave governor!"
+		exit
+	fi
+
+	# Get powersave data
+	dvfs.sh powersave $1 $2 $3 > $FILENAME
+	powersave=$(cat $FILENAME |sed -n '1p')
+
+	# Get performance data
+	dvfs.sh performance $1 $2 $3 > $FILENAME
+	performance=$(cat $FILENAME |sed -n '1p')
+
+	if [ $performance -ge $powersave ] ; then
+		echo "Error! Probably not input all the cpus in the same frequency domain"
+		exit
+	fi
+
+	denominator=$(expr $powersave - $performance)
+	echo "powersave efficiency: 0%"
+	echo "performance efficiency: 100%"
+
+	# Calculate other governors data
+	for gov_next in $gov_conservative $gov_ondemand $gov_cfs; do
+		if [ "$gov_next" != "" ] ; then
+			dvfs.sh $gov_next $1 $2 $3 > $FILENAME
+			data=$(cat $FILENAME |sed -n '1p');
+			numerator=$(expr $powersave - $data)
+			numerator=$(expr $numerator \* 100)
+			if [ $numerator -lt 0 ] ; then
+				numerator=0
+			fi
+			data=$(expr $numerator / $denominator)
+			echo "$gov_next efficiency: $data%"
+		fi
+	done
+
+	rm -f $FILENAME
+}
+
+if [ $# -lt 3 ]; then
+	echo "Usage: ./test.sh <cpus> <runtime> <sleeptime>"
+	echo "cpus: number of cpus in the CPU0's frequency domain"
+	echo "runtime: running time in ms per loop of the workload pattern"
+	echo "sleeptime: sleeping time in ms per loop of the workload pattern"
+	echo -e "\nExample: \n\"./test.sh 4 100 1000\" means\nCPU0~CPU3 sharing frequency, \"100ms run + 1000ms sleep\" workload pattern.\n"
+	exit
+fi
+
+echo "Frequency domain CPU0~CPU$(expr $1 - 1), run $2ms, sleep $3ms:"
+
+sleep 1
+PATH=$PATH:.
+set_calibration
+test_efficiency $1 $(expr $2 \* 1000) $(expr $3 \* 1000)
+
--
1.9.1

    
