Re: [Sched-tools] [RESEND PATCH v2] doc: measure the efficiency of cpufreq governors

23 Jun 2015

Hi Vincent,
Vincent Guittot vincent.guittot@linaro.org wrote 2015-06-22 PM 09:43:52:
...
Re: [RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
Hi Xunlei,
On 18 June 2015 at 11:06, Xunlei Pang xlpang@126.com wrote:
...
From: Xunlei Pang pang.xunlei@linaro.org
DVFS adds latency to the execution of a task because of the time it
takes to decide to move to max freq. We need to measure this latency and
check that the governor stays in an acceptable range.
When workgen runs a json file, a log file is created for each thread.
You use rt-app and not workgen in your script.
I thought rt-app is the same thing as workgen, could you tell me the
difference?
This log file records the number of loops that have been executed and
the duration for executing these loops (per phase). We can use these
figures to evaluate the latency that is added by a cpufreq governor
and its "performance efficiency".
We use the run+sleep pattern to do the measurement. For the run time per
loop, the performance governor should run the expected duration as the
CPU stays at max freq. At the opposite, the powersave governor will give
the longest duration (as it stays at lowest OPP). Other governors will
be somewhere between the 2 previous durations as they will use several
OPPs and will go back to max frequency after a defined duration which
depends on their monitoring period.
The formula:

    duration of powersave gov - duration of the gov
  -------------------------------------------------------- x 100%
  duration of powersave gov - duration of performance gov

will give the efficiency of the governor. 100% means as efficient as
the perf governor and 0% means as efficient as the powersave governor.
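
As a quick sanity check of the formula, here is a minimal shell sketch using integer arithmetic. The function name `efficiency` and the sample durations are illustrative assumptions, not values taken from the patch; a negative result is clamped to 0, in the same way test.sh clamps the numerator.

```shell
#!/bin/sh
# efficiency <powersave_us> <performance_us> <governor_us>
# Prints the efficiency of a governor in percent (integer division).
efficiency() {
    powersave=$1
    performance=$2
    governor=$3

    denominator=$((powersave - performance))
    if [ "$denominator" -le 0 ]; then
        echo "error: powersave must be slower than performance" >&2
        return 1
    fi

    result=$(( (powersave - governor) * 100 / denominator ))
    # A governor slower than powersave is reported as 0%.
    if [ "$result" -lt 0 ]; then
        result=0
    fi
    echo "$result"
}

# Hypothetical durations in us: powersave 300000, performance 100000,
# tested governor 110000.
efficiency 300000 100000 110000   # prints 95
```

With these made-up numbers the governor loses 10000us against performance over a 200000us spread, hence 95%.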
This patch offers json files and shell scripts to do the measurement,
Usage: ./test.sh <cpus> <runtime> <sleeptime>
cpus: number of cpus in the CPU0's frequency domain
Why do you need the number of cpus in CPU0's freq domain as an
argument of the script?
The cpus that share the same freq domain are available in
/sys/devices/system/cpu/cpu0/cpufreq/related_cpus and/or
affected_cpus. I don't remember the exact difference between both
lists.
And why do you need this parameter at all?
In your script, you set the governor of all cpus of the freq domain,
but AFAICT cpus in the same freq domain share the same governor, so
all cpus will use the new governor as soon as you change the governor
of one cpu.
Then, how can I test the efficiency of a cpu that doesn't belong to
CPU0's freq domain? IMHO, you should remove this parameter and
add a new one to select the cpu on which you want to run the bench.
Furthermore, CPU0 is the cpu that is the most used by default in a
platform, so it's worth using another one in order not to be disturbed
by background activity.
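
A minimal sketch of how a script could look up the frequency domain of an arbitrary cpu instead of taking a cpu count as an argument. The function name `cpus_in_domain` and the optional sysfs root argument (there only so the function can be tried against a fake tree) are assumptions for illustration. As far as the kernel's cpufreq documentation goes, related_cpus lists all cpus of the policy (online and offline) while affected_cpus lists only the online ones.

```shell
#!/bin/sh
# cpus_in_domain <cpu> [sysfs_root]
# Prints the cpus sharing a frequency domain with <cpu>, as listed in the
# cpufreq "related_cpus" attribute. sysfs_root defaults to /sys.
cpus_in_domain() {
    cpu=$1
    root=${2:-/sys}
    file="$root/devices/system/cpu/cpu$cpu/cpufreq/related_cpus"

    if [ ! -r "$file" ]; then
        echo "cpufreq is not available for cpu$cpu" >&2
        return 1
    fi
    cat "$file"
}

# Hypothetical use: list the cpus that share their frequency with cpu2.
#   cpus_in_domain 2
```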
...
runtime: running time in ms per loop of the workload pattern
sleeptime: sleeping time in ms per loop of the workload pattern
Example:
"./test.sh 4 100 1000" means
CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.
test result on my machine:
~#./test.sh 4 100 1000
Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 28%
ondemand efficiency: 95%
NOTE: Make sure there are "sed", "cut", "grep", "rt-app", etc tools on
your test machine, and run the script under root privilege.
Signed-off-by: Xunlei Pang pang.xunlei@linaro.org
 doc/examples/cpufreq_governor_efficiency/README    | 54 ++++++++++++++
 .../cpufreq_governor_efficiency/calibration.json   | 27 +++++++
 .../cpufreq_governor_efficiency/calibration.sh     | 11 +++
 doc/examples/cpufreq_governor_efficiency/dvfs.json | 27 +++++++
 doc/examples/cpufreq_governor_efficiency/dvfs.sh   | 38 ++++++++++
 doc/examples/cpufreq_governor_efficiency/test.sh   | 82 ++++++++++++++++++++++
 6 files changed, 239 insertions(+)
 create mode 100644 doc/examples/cpufreq_governor_efficiency/README
 create mode 100644 doc/examples/cpufreq_governor_efficiency/calibration.json
 create mode 100755 doc/examples/cpufreq_governor_efficiency/calibration.sh
 create mode 100644 doc/examples/cpufreq_governor_efficiency/dvfs.json
 create mode 100755 doc/examples/cpufreq_governor_efficiency/dvfs.sh
 create mode 100755 doc/examples/cpufreq_governor_efficiency/test.sh

diff --git a/doc/examples/cpufreq_governor_efficiency/README b/doc/examples/cpufreq_governor_efficiency/README
new file mode 100644
index 0000000..cc8efe1
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/README
@@ -0,0 +1,54 @@
+Measure the efficiency of cpufreq governors using rt-app
+
+BACKGROUND:
+    DVFS adds latency to the execution of a task because of the time it
+    takes to decide to move to max freq. We need to measure this latency
+    and check that the governor stays in an acceptable range.
+
+    When workgen runs a json file, a log file is created for each thread.
+    This log file records the number of loops that have been executed and
+    the duration for executing these loops (per phase). We can use these
+    figures to evaluate the latency that is added by a cpufreq governor
+    and its "performance efficiency".
+
+    We use the run+sleep pattern to do the measurement. For the run time
+    per loop, the performance governor should run the expected duration as
+    the CPU stays at max freq. At the opposite, the powersave governor will
+    give the longest duration (as it stays at lowest OPP). Other governors
+    will be somewhere between the 2 previous durations as they will use
+    several OPPs and will go back to max frequency after a defined duration
+    which depends on their monitoring period.
+
+    The formula:
+
+      duration of powersave gov - duration of the gov
+    -------------------------------------------------------- x 100%
+    duration of powersave gov - duration of performance gov
+
+    will give the efficiency of the governor. 100% means as efficient as
+    the perf governor and 0% means as efficient as the powersave governor.
+
+    This test offers json files and shell scripts to do the measurement.

+USAGE:
+    ./test.sh <cpus> <runtime> <sleeptime>
+    cpus: number of cpus in the CPU0's frequency domain
+    runtime: running time in ms per loop of the workload pattern
+    sleeptime: sleeping time in ms per loop of the workload pattern
+
+Example:
+    "./test.sh 4 100 1000" means
+    CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.
+
+    test result on an Intel machine:
+    ~#./test.sh 4 100 1000
+    Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
+    powersave efficiency: 0%
+    performance efficiency: 100%
+    conservative efficiency: 28%
+    ondemand efficiency: 95%
+
+NOTE:
+    Make sure there are "sed", "cut", "grep", "rt-app", etc tools on your test
+    machine, and run the script under root privilege.


diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.json b/doc/examples/cpufreq_governor_efficiency/calibration.json
new file mode 100644
index 0000000..4377990
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.json
@@ -0,0 +1,27 @@
+{
+        "tasks" : {
+                "thread" : {
+                        "instance" : 1,
+                        "cpus" : [0],
+                        "loop" : 1,
+                        "phases" : {
+                                "run" : {
+                                        "loop" : 1,
+                                        "run" : 200000,
+                                },
+                                "sleep" : {
+                                        "loop" : 1,
+                                        "sleep" : 200000,
+                                }
+                        }
+                }
+        },
+        "global" : {
+                "default_policy" : "SCHED_FIFO",
+                "calibration" : "CPU0",
+                "lock_pages" : true,
+                "ftrace" : true,

Remove or disable the ftrace parameter. Otherwise rt-app will return
an error if ftrace is not enabled in the kernel. Also, your script
stops without any message if rt-app fails to run the use case; you
should detect the error and display a warning.

+                "logdir" : "./",
+        }
+}
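
The error detection asked for in the comment above could look roughly like the following sketch. The helper name `run_checked` is hypothetical and not part of the patch; it only wraps a command and reports a failure instead of stopping silently.

```shell
#!/bin/sh
# run_checked <command> [args...]
# Runs the command; on failure prints a warning on stderr and returns the
# command's exit status.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "warning: '$*' failed with exit status $status" >&2
    fi
    return "$status"
}

# Hypothetical use in calibration.sh:
#   run_checked rt-app calibration.json || exit 1
```

Since the scripts use "set -e", the helper should be called on the left-hand side of "||" as shown, so that the shell does not abort before the warning is printed.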



diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.sh b/doc/examples/cpufreq_governor_efficiency/calibration.sh
new file mode 100755
index 0000000..d10e644
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+set -e
+
+echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
+
+sleep 1
+
+pLoad=$(rt-app calibration.json 2>&1 |grep pLoad |sed 's/.*= \(.*\)ns.*/\1/')
+sed 's/"calibration" : .*,/"calibration" : '$pLoad',/' -i dvfs.json



diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.json b/doc/examples/cpufreq_governor_efficiency/dvfs.json
new file mode 100644
index 0000000..b413156
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.json
@@ -0,0 +1,27 @@
+{
+        "tasks" : {
+                "thread" : {
+                        "instance" : 1,
+                        "cpus" : [0],
+                        "loop" : 5,

I wonder if 5 loops are enough to let the system stabilize? Maybe we
can extract more statistics, like the min/max/average/stdev values?
While testing your script, I have seen some variation in the results
(especially with short run times).

+                        "phases" : {
+                                "running" : {
+                                        "loop" : 1,
+                                        "run" : 100000,
+                                },
+                                "sleeping" : {
+                                        "loop" : 1,
+                                        "sleep" : 1000000,
+                                }
+                        }
+                }
+        },
+        "global" : {
+                "default_policy" : "SCHED_OTHER",
+                "calibration" : 90,
+                "lock_pages" : true,
+                "ftrace" : true,

Remove or disable the ftrace parameter.

+                "logdir" : "./",
+        }
+}
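
The min/max/average/stdev statistics suggested in the comment on the loop count could be extracted with a small awk filter. This is only a sketch: the function name `duration_stats` is made up, it expects one run duration per line on stdin (for instance the column that dvfs.sh already selects with cut), and it reports the population standard deviation.

```shell
#!/bin/sh
# duration_stats: reads one duration per line on stdin and prints the
# min, max, average and population standard deviation.
duration_stats() {
    awk '
        {
            v = $1 + 0                      # force a numeric value
            if (NR == 1) { min = v; max = v }
            if (v < min) min = v
            if (v > max) max = v
            sum += v
            sumsq += v * v
        }
        END {
            if (NR == 0) exit 1             # no samples at all
            avg = sum / NR
            var = sumsq / NR - avg * avg
            if (var < 0) var = 0            # guard against rounding error
            printf "min=%d max=%d avg=%.2f stdev=%.2f\n", min, max, avg, sqrt(var)
        }'
}

# Made-up sample durations, not measured values.
printf '%s\n' 10 20 30 | duration_stats   # prints min=10 max=30 avg=20.00 stdev=8.16
```

Counting the samples with NR also avoids hard-coding the number of loops when averaging.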



diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.sh b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
new file mode 100755
index 0000000..8591fc7
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+#echo $1 $2 $3
+set -e
+
+if [ $1 ] && [ $2 ] ; then
+        for i in $(seq 0 1 $(expr $2 - 1)); do
+                echo $1 > /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
+                #cat /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor

Is it for debug purposes?

+        done
+
+        sleep 3
+fi
+
+if [ $3 ] ; then
+        sed 's/"run" : .*,/"run" : '$3',/' -i dvfs.json
+fi
+
+if [ $4 ] ; then
+        sed 's/"sleep" : .*,/"sleep" : '$4',/' -i dvfs.json
+fi
+
+#cat dvfs.json
+
+rt-app dvfs.json 2> /dev/null
+
+if [ $1 ] ; then
+        mv -f rt-app-thread-0.log rt-app_$1_run$3us_sleep$4us.log
+
+        sum=0
+        for i in $(cat rt-app_$1_run$3us_sleep$4us.log | sed 'n;d' | sed '1d' |cut -f 3); do
+                sum=$(expr $sum + $i)
+        done
+        sum=$(expr $sum / 5)
+        echo $sum
+        rm -f rt-app_$1_run$3us_sleep$4us.log
+fi



diff --git a/doc/examples/cpufreq_governor_efficiency/test.sh b/doc/examples/cpufreq_governor_efficiency/test.sh
new file mode 100755
index 0000000..d72fc6a
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/test.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+
+set -e
+
+set_calibration() {
+        calibration.sh
+}
+
+test_efficiency() {
+        FILENAME="results_$RANDOM$$.txt"
+
+        if [ -e /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors ]; then
+                for i in $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors); do
+                        export gov_$i=$(echo $i)
+                done
+        else
+                echo "cpufreq is not available!"
+                exit
+        fi
+
+        if [ ! $gov_performance ] ; then
+                echo "Can't find performance governor!"
+                exit
+        fi
+
+        if [ ! $gov_powersave ] ; then
+                echo "Can't find powersave governor!"
+                exit
+        fi
+
+        # Get powersave data
+        dvfs.sh powersave $1 $2 $3 > $FILENAME
+        powersave=$(cat $FILENAME |sed -n '1p')
+
+        # Get performance data
+        dvfs.sh performance $1 $2 $3 > $FILENAME
+        performance=$(cat $FILENAME |sed -n '1p')
+
+        if [ $performance -ge $powersave ] ; then
+                echo "Error! Probably not input all the cpus in the same frequency domain"
+                exit
+        fi
+
+        denominator=$(expr $powersave - $performance)
+        echo "powersave efficiency: 0%"
+        echo "performance efficiency: 100%"
+
+        # Calculate other governors data
+        for gov_next in $gov_conservative $gov_ondemand $gov_cfs; do

Why have you restricted the test to these 3 governors? What about the
userspace gov, or the interactive governor when available?

+                if [ "$gov_next" != "" ] ; then
+                        dvfs.sh $gov_next $1 $2 $3 > $FILENAME
+                        data=$(cat $FILENAME |sed -n '1p');
+                        numerator=$(expr $powersave - $data)
+                        numerator=$(expr $numerator \* 100)
+                        if [ $numerator -lt 0 ] ; then
+                                let numerator=0
+                        fi
+                        data=$(expr $numerator / $denominator)
+                        echo "$gov_next efficiency: $data%"
+                fi
+        done
+
+        rm -f $FILENAME
+}
+
+if [ $# -lt 3 ]; then
+        echo "Usage: ./test.sh <cpus> <runtime> <sleeptime>"
+        echo "cpus: number of cpus in the CPU0's frequency domain"
+        echo "runtime: running time in ms per loop of the workload pattern"
+        echo "sleeptime: sleeping time in ms per loop of the workload pattern"
+        echo -e "\nExample: \n\"./test.sh 4 100 1000\" means \nCPU0~CPU3 sharing frequency, \"100ms run + 1000ms sleep\" workload pattern.\n"
+        exit
+fi
+
+echo "Frequency domain CPU0~CPU$(expr $1 - 1), run $2ms, sleep $3ms:"
+
+sleep 1
+PATH=$PATH:.
+set_calibration
+test_efficiency $1 $(expr $2 \* 1000) $(expr $3 \* 1000)

You have created several temporary files while executing your script;
you should clean them up before exiting.
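
One common way to get this cleanup on every exit path is an EXIT trap. The sketch below is an assumption about how test.sh could be restructured, not code from the patch; the variable name `RESULTS` and the `cleanup` function are made up for illustration.

```shell
#!/bin/sh
# Scratch file for intermediate results, created with mktemp instead of a
# name built from $RANDOM.
RESULTS=$(mktemp) || exit 1

# cleanup: removes the scratch file; safe to call more than once.
cleanup() {
    rm -f "$RESULTS"
}

# Run cleanup whenever the script exits, including the early "exit" error
# paths. Turning INT and TERM into a normal exit makes the EXIT trap run
# for those signals as well.
trap cleanup EXIT
trap 'exit 1' INT TERM

echo 12345 > "$RESULTS"
```

Any other files the scripts leave behind (for example rt-app log files) could be added to the same cleanup function.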
Some governors, like ondemand or interactive, have parameters that
modify their responsiveness. How can I set them before testing their
efficiency?
As an example, ondemand governor efficiency moves from nearly 0% to
nearly 100% on my chromebook2 (only the quad A15 has been enabled for
the test) if you change the sampling_rate and the sampling_down_factor
when you test a short run: 100ms run, 1000ms sleep. The default
configuration of a governor is not always the best one for the platform.
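
For illustration, governor tunables such as the two named above are plain sysfs files, so a test could set them before a run along these lines. The helper name `set_tunable` is hypothetical, and the directory holding the tunables is an assumption: its location depends on the kernel version and on whether tunables are global or per policy.

```shell
#!/bin/sh
# set_tunable <dir> <name> <value>
# Writes <value> to the governor tunable <dir>/<name>, warning if the
# tunable does not exist or is not writable.
set_tunable() {
    dir=$1
    name=$2
    value=$3

    if [ ! -w "$dir/$name" ]; then
        echo "warning: tunable $dir/$name is not writable" >&2
        return 1
    fi
    echo "$value" > "$dir/$name"
}

# Hypothetical use; the path and values are examples, not recommendations:
#   ONDEMAND=/sys/devices/system/cpu/cpufreq/ondemand
#   set_tunable "$ONDEMAND" sampling_rate 10000
#   set_tunable "$ONDEMAND" sampling_down_factor 1
```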
Yes, thanks for the review.
I'm planning on changing test.sh to test only one governor at a time,
so the user can set the proper governor parameters before running
the test, like below:
Usage: ./test.sh <cpu> <runtime> <sleeptime> <loops> <governor>
cpu: cpu number on which you want to run the test
runtime: running time in ms per loop of the workload pattern
sleeptime: sleeping time in ms per loop of the workload pattern
loops: number of repetitions of the workload pattern
governor: CPUFreq governor you want to test
Example:
"./test.sh 0 100 1000 10 ondemand" means
Test ondemand on CPU0 with 10 loops of "100ms run + 1000ms sleep" workload
pattern.
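
A rough sketch of how the argument handling for this proposed usage might look. Nothing here is from a posted patch: the function name `parse_args` and the variable names are assumptions, and the ms to us conversion mirrors what the current test.sh does before calling dvfs.sh.

```shell
#!/bin/sh
# parse_args <cpu> <runtime_ms> <sleeptime_ms> <loops> <governor>
# Validates the argument count and stores the values in shell variables,
# converting the run and sleep times from ms to us.
parse_args() {
    if [ $# -ne 5 ]; then
        echo "Usage: ./test.sh <cpu> <runtime> <sleeptime> <loops> <governor>" >&2
        return 1
    fi
    CPU=$1
    RUN_US=$(($2 * 1000))
    SLEEP_US=$(($3 * 1000))
    LOOPS=$4
    GOVERNOR=$5
}

parse_args 0 100 1000 10 ondemand &&
    echo "Test $GOVERNOR on CPU$CPU: $LOOPS loops of ${RUN_US}us run + ${SLEEP_US}us sleep"
```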
-Xunlei
...
Regards,
Vincent
...



--
1.9.1