From: Xunlei Pang <pang.xunlei@linaro.org>
DVFS adds latency to task execution because the governor needs time to decide to move to the maximum frequency. We need to measure this latency and check that the governor stays within an acceptable range.
When workgen runs a json file, a log file is created for each thread. This log file records the number of loops that have been executed and the time taken to execute them (per phase). We can use these figures to evaluate the latency added by a cpufreq governor and its "performance efficiency".
We use a run+sleep pattern for the measurement. For the run time per loop, the performance governor should take the expected duration, as the CPU stays at max frequency. Conversely, the powersave governor will give us the longest duration, as it stays at the lowest OPP. Other governors will fall somewhere between these two durations, as they use several OPPs and return to max frequency after a delay that depends on their monitoring period.
The formula:
  duration of powersave gov - duration of the gov
  -------------------------------------------------------- x 100%
  duration of powersave gov - duration of performance gov
will give the efficiency of the governor. 100% means as efficient as the perf governor and 0% means as efficient as the powersave governor.
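As a rough illustration of the formula above, here is a minimal sketch in POSIX sh. The function name efficiency and the sample durations are made up for the example; they are not taken from the patch or from any measurement. It uses integer arithmetic and clamps negative results to 0%, which mirrors what test.sh in this patch does.

```sh
#!/bin/sh
# efficiency POWERSAVE GOV PERFORMANCE
# All three arguments are average run-phase durations in the same unit
# (for example microseconds). Prints the integer efficiency percentage.
# NOTE: "efficiency" is a hypothetical helper name used for illustration.
efficiency() {
    powersave=$1
    gov=$2
    performance=$3

    denominator=$((powersave - performance))
    if [ "$denominator" -le 0 ]; then
        echo "powersave duration must exceed performance duration" >&2
        return 1
    fi

    numerator=$(( (powersave - gov) * 100 ))
    # A governor slower than powersave is reported as 0%.
    if [ "$numerator" -lt 0 ]; then
        numerator=0
    fi

    echo $((numerator / denominator))
}

# Hypothetical durations, not measured data:
# (200000 - 110000) * 100 / (200000 - 100000) = 90
efficiency 200000 110000 100000
```

With these invented inputs, a governor that needs 110000us where powersave needs 200000us and performance needs 100000us comes out at 90%.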
This patch provides json files and shell scripts to do the measurement.
Usage:
  ./test.sh <cpus> <runtime> <sleeptime>
  cpus:      number of cpus in the CPU0's frequency domain
  runtime:   running time in ms per loop of the workload pattern
  sleeptime: sleeping time in ms per loop of the workload pattern
Example: "./test.sh 4 100 1000" means CPU0~CPU3 share a frequency domain and the workload pattern is "100ms run + 1000ms sleep".
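If you are unsure what to pass as <cpus>, one possible way to find out is the cpufreq sysfs file related_cpus, which lists the CPUs that share a policy with CPU0. The sketch below is an assumption on my side and not part of the patch; count_domain_cpus is a hypothetical helper name. Note that the scripts assume the domain is the contiguous range CPU0..CPU(n-1), which this count alone does not verify.

```sh
#!/bin/sh
# count_domain_cpus [FILE]
# Prints how many CPU ids are listed in FILE (default: CPU0's related_cpus).
# NOTE: hypothetical helper for illustration; the patch does not contain it.
count_domain_cpus() {
    file=${1:-/sys/devices/system/cpu/cpu0/cpufreq/related_cpus}
    [ -r "$file" ] || return 1
    # The file holds a whitespace-separated list of CPU ids, so count words.
    wc -w < "$file" | tr -d ' '
}

count_domain_cpus || echo "cpufreq related_cpus is not readable on this machine"
```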
Test result on my machine:
  ~#./test.sh 4 100 1000
  Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
  powersave efficiency: 0%
  performance efficiency: 100%
  conservative efficiency: 28%
  ondemand efficiency: 95%
NOTE: Make sure tools such as "sed", "cut", "grep" and "rt-app" are available on your test machine, and run the script with root privileges.
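The note above could be checked up front with something like the following sketch. The helper names missing_tools and is_root are invented for this example and are not part of the patch; the sketch only reports problems and leaves the decision to abort to the caller.

```sh
#!/bin/sh
# missing_tools TOOL...
# Prints the space-separated names of tools that are not found in PATH.
# NOTE: hypothetical helper for illustration.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space before printing.
    echo "${missing# }"
}

# is_root: succeeds only when the effective user id is 0.
is_root() {
    [ "$(id -u)" -eq 0 ]
}

m=$(missing_tools sed cut grep rt-app)
if [ -n "$m" ]; then
    echo "missing tools: $m"
fi
if ! is_root; then
    echo "not running as root; writing scaling_governor will likely fail"
fi
```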
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
---
 doc/examples/cpufreq_governor_efficiency/README    | 54 ++++++++++++++
 .../cpufreq_governor_efficiency/calibration.json   | 27 +++++++
 .../cpufreq_governor_efficiency/calibration.sh     | 11 +++
 doc/examples/cpufreq_governor_efficiency/dvfs.json | 27 +++++++
 doc/examples/cpufreq_governor_efficiency/dvfs.sh   | 38 ++++++++++
 doc/examples/cpufreq_governor_efficiency/test.sh   | 82 ++++++++++++++++++++++
 6 files changed, 239 insertions(+)
 create mode 100644 doc/examples/cpufreq_governor_efficiency/README
 create mode 100644 doc/examples/cpufreq_governor_efficiency/calibration.json
 create mode 100755 doc/examples/cpufreq_governor_efficiency/calibration.sh
 create mode 100644 doc/examples/cpufreq_governor_efficiency/dvfs.json
 create mode 100755 doc/examples/cpufreq_governor_efficiency/dvfs.sh
 create mode 100755 doc/examples/cpufreq_governor_efficiency/test.sh

diff --git a/doc/examples/cpufreq_governor_efficiency/README b/doc/examples/cpufreq_governor_efficiency/README
new file mode 100644
index 0000000..cc8efe1
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/README
@@ -0,0 +1,54 @@
+Measure the efficiency of cpufreq governors using rt-app
+
+BACKGROUND:
+    DVFS adds a latency in the execution of task because of the time to
+    decide to move at max freq. We need to measure this latency and check
+    that the governor stays in an acceptable range.
+
+    When workgen runs a json file, a log file is created for each thread.
+    This log file records the number of loop that has been executed and
+    the duration for executing these loops (per phase). We can use these
+    figures to evaluate to latency that is added by a cpufreq governor
+    and its "performance efficiency".
+
+    We use the run+sleep pattern to do the measurement, for the run time per
+    loop, the performance governor should run the expected duration as the
+    CPU stays a max freq. At the opposite, the powersave governor will give
+    use the longest duration (as it stays at lowest OPP). Other governor will
+    be somewhere between the 2 previous duration as they will use several OPP
+    and will go back to max frequency after a defined duration which depends
+    on its monitoring period.
+
+    The formula:
+
+    duration of powersave gov - duration of the gov
+    -------------------------------------------------------- x 100%
+    duration of powersave gov - duration of performance gov
+
+    will give the efficiency of the governor. 100% means as efficient as
+    the perf governor and 0% means as efficient as the powersave governor.
+
+    This test offers json files and shell scripts to do the measurement,
+
+USAGE:
+    ./test.sh <cpus> <runtime> <sleeptime>
+    cpus:       number of cpus in the CPU0's frequency domain
+    runtime:    running time in ms per loop of the workload pattern
+    sleeptime:  sleeping time in ms per loop of the workload pattern
+
+Example:
+    "./test.sh 4 100 1000" means
+    CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.
+
+    test result on an Intel machine:
+    ~#./test.sh 4 100 1000
+    Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
+    powersave efficiency: 0%
+    performance efficiency: 100%
+    conservative efficiency: 28%
+    ondemand efficiency: 95%
+
+NOTE:
+    Make sure there are "sed", "cut", "grep", "rt-app", etc tools on your test
+    machine, and run the script under root privilege.
+
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.json b/doc/examples/cpufreq_governor_efficiency/calibration.json
new file mode 100644
index 0000000..4377990
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.json
@@ -0,0 +1,27 @@
+{
+    "tasks" : {
+        "thread" : {
+            "instance" : 1,
+            "cpus" : [0],
+            "loop" : 1,
+            "phases" : {
+                "run" : {
+                    "loop" : 1,
+                    "run" : 200000,
+                },
+                "sleep" : {
+                    "loop" : 1,
+                    "sleep" : 200000,
+                }
+            }
+        }
+    },
+    "global" : {
+        "default_policy" : "SCHED_FIFO",
+        "calibration" : "CPU0",
+        "lock_pages" : true,
+        "ftrace" : true,
+        "logdir" : "./",
+    }
+}
+
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.sh b/doc/examples/cpufreq_governor_efficiency/calibration.sh
new file mode 100755
index 0000000..d10e644
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+set -e
+
+echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
+
+sleep 1
+
+pLoad=$(rt-app calibration.json 2>&1 |grep pLoad |sed 's/.*= \(.*\)ns.*/\1/')
+sed 's/"calibration" : .*,/"calibration" : '$pLoad',/' -i dvfs.json
+
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.json b/doc/examples/cpufreq_governor_efficiency/dvfs.json
new file mode 100644
index 0000000..b413156
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.json
@@ -0,0 +1,27 @@
+{
+    "tasks" : {
+        "thread" : {
+            "instance" : 1,
+            "cpus" : [0],
+            "loop" : 5,
+            "phases" : {
+                "running" : {
+                    "loop" : 1,
+                    "run" : 100000,
+                },
+                "sleeping" : {
+                    "loop" : 1,
+                    "sleep" : 1000000,
+                }
+            }
+        }
+    },
+    "global" : {
+        "default_policy" : "SCHED_OTHER",
+        "calibration" : 90,
+        "lock_pages" : true,
+        "ftrace" : true,
+        "logdir" : "./",
+    }
+}
+
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.sh b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
new file mode 100755
index 0000000..8591fc7
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+#echo $1 $2 $3
+set -e
+
+if [ $1 ] && [ $2 ] ; then
+    for i in $(seq 0 1 $(expr $2 - 1)); do
+        echo $1 > /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
+        #cat /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
+    done
+
+    sleep 3
+fi
+
+if [ $3 ] ; then
+    sed 's/"run" : .*,/"run" : '$3',/' -i dvfs.json
+fi
+
+if [ $4 ] ; then
+    sed 's/"sleep" : .*,/"sleep" : '$4',/' -i dvfs.json
+fi
+
+#cat dvfs.json
+
+rt-app dvfs.json 2> /dev/null
+
+if [ $1 ] ; then
+    mv -f rt-app-thread-0.log rt-app_$1_run$3us_sleep$4us.log
+
+    sum=0
+    for i in $(cat rt-app_$1_run$3us_sleep$4us.log | sed 'n;d' | sed '1d' |cut -f 3); do
+        sum=$(expr $sum + $i)
+    done
+    sum=$(expr $sum / 5)
+    echo $sum
+    rm -f rt-app_$1_run$3us_sleep$4us.log
+fi
+
diff --git a/doc/examples/cpufreq_governor_efficiency/test.sh b/doc/examples/cpufreq_governor_efficiency/test.sh
new file mode 100755
index 0000000..d72fc6a
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/test.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+
+set -e
+
+set_calibration() {
+    calibration.sh
+}
+
+test_efficiency() {
+
+    FILENAME="results_$RANDOM$$.txt"
+
+    if [ -e /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors ]; then
+        for i in $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors); do
+            export gov_$i=$(echo $i)
+        done
+    else
+        echo "cpufreq is not available!"
+        exit
+    fi
+
+    if [ ! $gov_performance ] ; then
+        echo "Can't find performance governor!"
+        exit
+    fi
+
+    if [ ! $gov_powersave ] ; then
+        echo "Can't find powersave governor!"
+        exit
+    fi
+
+    # Get powersave data
+    dvfs.sh powersave $1 $2 $3 > $FILENAME
+    powersave=$(cat $FILENAME |sed -n '1p')
+
+    # Get performance data
+    dvfs.sh performance $1 $2 $3 > $FILENAME
+    performance=$(cat $FILENAME |sed -n '1p')
+
+    if [ $performance -ge $powersave ] ; then
+        echo "Error! Probably not input all the cpus in the same frequency domain"
+        exit
+    fi
+
+    denominator=$(expr $powersave - $performance)
+    echo "powersave efficiency: 0%"
+    echo "performance efficiency: 100%"
+
+    # Calcuate other governors data
+    for gov_next in $gov_conservative $gov_ondemand $gov_cfs; do
+        if [ "$gov_next" != "" ] ; then
+            dvfs.sh $gov_next $1 $2 $3 > $FILENAME
+            data=$(cat $FILENAME |sed -n '1p');
+            numerator=$(expr $powersave - $data)
+            numerator=$(expr $numerator \* 100)
+            if [ $numerator -lt 0 ] ; then
+                let numerator=0
+            fi
+            data=$(expr $numerator / $denominator)
+            echo "$gov_next efficiency: $data%"
+        fi
+    done
+
+    rm -f $FILENAME
+}
+
+if [ $# -lt 3 ]; then
+    echo "Usage: ./test.sh <cpus> <runtime> <sleeptime>"
+    echo "cpus: number of cpus in the CPU0's frequency domain"
+    echo "runtime: running time in ms per loop of the workload pattern"
+    echo "sleeptime: sleeping time in ms per loop of the workload pattern"
+    echo -e "\nExample: \n\"./test.sh 4 100 1000\" means\nCPU0~CPU3 sharing frequency, \"100ms run + 1000ms sleep\" workload pattern.\n"
+    exit
+fi
+
+echo "Frequency domain CPU0~CPU$(expr $1 - 1), run $2ms, sleep $3ms:"
+
+sleep 1
+PATH=$PATH:.
+set_calibration
+test_efficiency $1 $(expr $2 \* 1000) $(expr $3 \* 1000)
+
Just tested this on my Intel EAS test environment (with an x86 frequency-invariance hook implemented), with EAS disabled and sched-dvfs enabled.
#./test.sh 3 100 1000
Frequency domain CPU0~CPU2, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 92%
ondemand efficiency: 97%
cfs efficiency: 79%

#./test.sh 3 200 1000
Frequency domain CPU0~CPU2, run 200ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 97%
ondemand efficiency: 99%
cfs efficiency: 89%

#./test.sh 3 50 1000
Frequency domain CPU0~CPU2, run 50ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 93%
ondemand efficiency: 96%
cfs efficiency: 58%

#./test.sh 3 1000 100
Frequency domain CPU0~CPU2, run 1000ms, sleep 100ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 99%
ondemand efficiency: 99%
cfs efficiency: 97%
It seems sched-dvfs is computationally inefficient at low CPU usage (which implies it is power efficient there), but computationally efficient at high CPU usage.
-Xunlei
Xunlei Pang xlpang@126.com wrote 2015-06-18 PM 05:06:07:
[RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
Just tested on my Intel EAS test environment(implemented x86 frequency invariant hook). With EAS disabled and sched-dvfs enabled.
#./test.sh 3 100 1000 Frequency domain CPU0~CPU2, run 100ms, sleep 1000ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 92% ondemand efficiency: 97% cfs efficiency: 79%
#./test.sh 3 200 1000 Frequency domain CPU0~CPU2, run 200ms, sleep 1000ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 97% ondemand efficiency: 99% cfs efficiency: 89%
#./test.sh 3 50 1000 Frequency domain CPU0~CPU2, run 50ms, sleep 1000ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 93% ondemand efficiency: 96% cfs efficiency: 58%
#./test.sh 3 1000 100 Frequency domain CPU0~CPU2, run 1000ms, sleep 100ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 99% ondemand efficiency: 99% cfs efficiency: 97%
Seems sched-dvfs is computing inefficient at low cpu usage(implies power efficient), but computing efficient at high cpu usage.
-Xunlei
Xunlei Pang xlpang@126.com wrote 2015-06-18 PM 05:06:07:
[RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
From: Xunlei Pang pang.xunlei@linaro.org
DVFS adds a latency in the execution of task because of the time to decide to move at max freq. We need to measure this latency and check that the governor stays in an acceptable range.
When workgen runs a json file, a log file is created for each thread. This log file records the number of loop that has been executed and the duration for executing these loops (per phase). We can use these figures to evaluate to latency that is added by a cpufreq governor and its "performance efficiency".
We use the run+sleep pattern to do the measurement, for the run time per loop, the performance governor should run the expected duration as the CPU stays a max freq. At the opposite, the powersave governor will give use the longest duration (as it stays at lowest OPP). Other governor
will
be somewhere between the 2 previous duration as they will use several
OPP
and will go back to max frequency after a defined duration which depends on its monitoring period.
The formula:
duration of powersave gov - duration of the gov
-------------------------------------------------------- x 100% duration of powersave gov - duration of performance gov
will give the efficiency of the governor. 100% means as efficient as the perf governor and 0% means as efficient as the powersave governor.
This patch offers json files and shell scripts to do the measurement,
Usage: ./test.sh <cpus> <runtime> <sleeptime> cpus: number of cpus in the CPU0's frequency domain runtime: running time in ms per loop of the workload pattern sleeptime: sleeping time in ms per loop of the workload pattern
Example: "./test.sh 4 100 1000" means CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload
pattern.
test result on my machine: ~#./test.sh 4 100 1000 Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 28% ondemand efficiency: 95%
NOTE: Make sure there are "sed", "cut", "grep", "rt-app", etc tools on your test machine, and run the script under root privilege.
Signed-off-by: Xunlei Pang pang.xunlei@linaro.org
doc/examples/cpufreq_governor_efficiency/README | 54 ++++++++++++++ .../cpufreq_governor_efficiency/calibration.json | 27 +++++++ .../cpufreq_governor_efficiency/calibration.sh | 11 +++ doc/examples/cpufreq_governor_efficiency/dvfs.json | 27 +++++++ doc/examples/cpufreq_governor_efficiency/dvfs.sh | 38 ++++++++++ doc/examples/cpufreq_governor_efficiency/test.sh | 82 +++++++++++ +++++++++++ 6 files changed, 239 insertions(+) create mode 100644 doc/examples/cpufreq_governor_efficiency/README create mode 100644
doc/examples/cpufreq_governor_efficiency/calibration.json
create mode 100755
doc/examples/cpufreq_governor_efficiency/calibration.sh
create mode 100644 doc/examples/cpufreq_governor_efficiency/dvfs.json create mode 100755 doc/examples/cpufreq_governor_efficiency/dvfs.sh create mode 100755 doc/examples/cpufreq_governor_efficiency/test.sh
diff --git a/doc/examples/cpufreq_governor_efficiency/README b/doc/ examples/cpufreq_governor_efficiency/README new file mode 100644 index 0000000..cc8efe1 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/README @@ -0,0 +1,54 @@ +Measure the efficiency of cpufreq governors using rt-app
+BACKGROUND:
- DVFS adds a latency in the execution of task because of the time to
- decide to move at max freq. We need to measure this latency and
check
- that the governor stays in an acceptable range.
- When workgen runs a json file, a log file is created for each
thread.
- This log file records the number of loop that has been executed and
- the duration for executing these loops (per phase). We can use
these
- figures to evaluate to latency that is added by a cpufreq governor
- and its "performance efficiency".
- We use the run+sleep pattern to do the measurement, for the run
time per
- loop, the performance governor should run the expected duration as
the
- CPU stays a max freq. At the opposite, the powersave governor will
give
- use the longest duration (as it stays at lowest OPP). Other
governor will
- be somewhere between the 2 previous duration as they will use
several OPP
- and will go back to max frequency after a defined duration which
depends
- on its monitoring period.
- The formula:
duration of powersave gov - duration of the gov
- -------------------------------------------------------- x 100%
duration of powersave gov - duration of performance gov
- will give the efficiency of the governor. 100% means as efficient
as
- the perf governor and 0% means as efficient as the powersave
governor.
- This test offers json files and shell scripts to do the
measurement,
+USAGE:
- ./test.sh <cpus> <runtime> <sleeptime>
- cpus: number of cpus in the CPU0's frequency domain
- runtime: running time in ms per loop of the workload pattern
- sleeptime: sleeping time in ms per loop of the workload pattern
+Example:
- "./test.sh 4 100 1000" means
- CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload
pattern.
- test result on an Intel machine:
- ~#./test.sh 4 100 1000
- Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
- powersave efficiency: 0%
- performance efficiency: 100%
- conservative efficiency: 28%
- ondemand efficiency: 95%
+NOTE:
- Make sure there are "sed", "cut", "grep", "rt-app", etc tools
on your test
- machine, and run the script under root privilege.
diff --git a/doc/examples/cpufreq_governor_efficiency/ calibration.json
b/doc/examples/cpufreq_governor_efficiency/calibration.json
new file mode 100644 index 0000000..4377990 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/calibration.json @@ -0,0 +1,27 @@ +{
- "tasks" : {
"thread" : {
"instance" : 1,
"cpus" : [0],
"loop" : 1,
"phases" : {
"run" : {
"loop" : 1,
"run" : 200000,
},
"sleep" : {
"loop" : 1,
"sleep" : 200000,
}
}
}
- },
- "global" : {
"default_policy" : "SCHED_FIFO",
"calibration" : "CPU0",
"lock_pages" : true,
"ftrace" : true,
"logdir" : "./",
- }
+}
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.sh b/doc/examples/cpufreq_governor_efficiency/calibration.sh new file mode 100755 index 0000000..d10e644 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/calibration.sh @@ -0,0 +1,11 @@ +#!/bin/sh
+set -e
+echo performance >
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
+sleep 1
+pLoad=$(rt-app calibration.json 2>&1 |grep pLoad |sed 's/.*=
(.*)ns.*/\1/')
+sed 's/"calibration" : .*,/"calibration" : '$pLoad',/' -i dvfs.json
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.json b/ doc/examples/cpufreq_governor_efficiency/dvfs.json new file mode 100644 index 0000000..b413156 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/dvfs.json @@ -0,0 +1,27 @@ +{
- "tasks" : {
"thread" : {
"instance" : 1,
"cpus" : [0],
"loop" : 5,
"phases" : {
"running" : {
"loop" : 1,
"run" : 100000,
},
"sleeping" : {
"loop" : 1,
"sleep" : 1000000,
}
}
}
- },
- "global" : {
"default_policy" : "SCHED_OTHER",
"calibration" : 90,
"lock_pages" : true,
"ftrace" : true,
"logdir" : "./",
- }
+}
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.sh b/doc/ examples/cpufreq_governor_efficiency/dvfs.sh new file mode 100755 index 0000000..8591fc7 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/dvfs.sh @@ -0,0 +1,38 @@ +#!/bin/sh
+#echo $1 $2 $3 +set -e
+if [ $1 ] && [ $2 ] ; then
- for i in $(seq 0 1 $(expr $2 - 1)); do
echo $1 > /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
#cat /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
- done
- sleep 3
+fi
+if [ $3 ] ; then
- sed 's/"run" : .*,/"run" : '$3',/' -i dvfs.json
+fi
+if [ $4 ] ; then
- sed 's/"sleep" : .*,/"sleep" : '$4',/' -i dvfs.json
+fi
+#cat dvfs.json
+rt-app dvfs.json 2> /dev/null
+if [ $1 ] ; then
- mv -f rt-app-thread-0.log rt-app_$1_run$3us_sleep$4us.log
- sum=0
- for i in $(cat rt-app_$1_run$3us_sleep$4us.log | sed 'n;d' | sed
'1d' |cut -f 3); do
sum=$(expr $sum + $i)
- done
- sum=$(expr $sum / 5)
- echo $sum
- rm -f rt-app_$1_run$3us_sleep$4us.log
+fi
diff --git a/doc/examples/cpufreq_governor_efficiency/test.sh b/doc/examples/cpufreq_governor_efficiency/test.sh
new file mode 100755
index 0000000..d72fc6a
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/test.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+set -e
+set_calibration() {
+    calibration.sh
+}
+test_efficiency() {
+    FILENAME="results_$RANDOM$$.txt"
+    if [ -e /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors ]; then
+        for i in $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors); do
+            export gov_$i=$(echo $i)
+        done
+    else
+        echo "cpufreq is not available!"
+        exit
+    fi
+    if [ ! $gov_performance ] ; then
+        echo "Can't find performance governor!"
+        exit
+    fi
+    if [ ! $gov_powersave ] ; then
+        echo "Can't find powersave governor!"
+        exit
+    fi
+    # Get powersave data
+    dvfs.sh powersave $1 $2 $3 > $FILENAME
+    powersave=$(cat $FILENAME |sed -n '1p')
+    # Get performance data
+    dvfs.sh performance $1 $2 $3 > $FILENAME
+    performance=$(cat $FILENAME |sed -n '1p')
+    if [ $performance -ge $powersave ] ; then
+        echo "Error! Probably not input all the cpus in the same frequency domain"
+        exit
+    fi
+    denominator=$(expr $powersave - $performance)
+    echo "powersave efficiency: 0%"
+    echo "performance efficiency: 100%"
+    # Calculate other governors data
+    for gov_next in $gov_conservative $gov_ondemand $gov_cfs; do
+        if [ "$gov_next" != "" ] ; then
+            dvfs.sh $gov_next $1 $2 $3 > $FILENAME
+            data=$(cat $FILENAME |sed -n '1p');
+            numerator=$(expr $powersave - $data)
+            numerator=$(expr $numerator \* 100)
+            if [ $numerator -lt 0 ] ; then
+                let numerator=0
+            fi
+            data=$(expr $numerator / $denominator)
+            echo "$gov_next efficiency: $data%"
+        fi
+    done
+    rm -f $FILENAME
+}
+if [ $# -lt 3 ]; then
+    echo "Usage: ./test.sh <cpus> <runtime> <sleeptime>"
+    echo "cpus: number of cpus in the CPU0's frequency domain"
+    echo "runtime: running time in ms per loop of the workload pattern"
+    echo "sleeptime: sleeping time in ms per loop of the workload pattern"
+    echo -e "\nExample: \n\"./test.sh 4 100 1000\" means\nCPU0~CPU3 sharing frequency, \"100ms run + 1000ms sleep\" workload pattern.\n"
+    exit
+fi
+echo "Frequency domain CPU0~CPU$(expr $1 - 1), run $2ms, sleep $3ms:"
+sleep 1
+PATH=$PATH:.
+set_calibration
+test_efficiency $1 $(expr $2 \* 1000) $(expr $3 \* 1000)
-- 1.9.1
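For reference, the efficiency computation that test.sh performs can be sketched as a standalone shell function. This is only an illustration: it assumes the three inputs are average run durations in microseconds, and the function name gov_efficiency and the sample durations are hypothetical, not part of the patch.

```sh
#!/bin/sh
# Hypothetical helper (not part of the patch): efficiency of a governor in
# percent, computed from average run durations in microseconds.
#   $1: powersave duration  $2: performance duration  $3: governor duration
gov_efficiency() {
    denominator=$(( $1 - $2 ))
    if [ "$denominator" -le 0 ]; then
        echo "performance must be faster than powersave" >&2
        return 1
    fi
    numerator=$(( ($1 - $3) * 100 ))
    # Clamp at 0 when the governor is slower than powersave, as test.sh does.
    if [ "$numerator" -lt 0 ]; then
        numerator=0
    fi
    echo $(( numerator / denominator ))
}

# Made-up durations, for illustration only.
gov_efficiency 200000 100000 105000
```

Like the expr calls in test.sh, this uses integer division, so the result is truncated to a whole percent.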
-------------------------------------------------------- ZTE Information Security Notice: The information contained in this mail (and any attachment transmitted herewith) is privileged and confidential and is intended for the exclusive use of the addressee(s). If you are not an intended recipient, any disclosure, reproduction, distribution or other dissemination or use of the information contained is strictly prohibited. If you have received this mail in error, please delete it and notify us immediately.
On Thu, Jun 18, 2015 at 5:05 PM, pang.xunlei@zte.com.cn wrote:
Just tested on my Intel EAS test environment (with the x86 frequency invariant hook implemented), with EAS disabled and sched-dvfs enabled.
#./test.sh 3 100 1000
Frequency domain CPU0~CPU2, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 92%
ondemand efficiency: 97%
cfs efficiency: 79%
#./test.sh 3 200 1000
Frequency domain CPU0~CPU2, run 200ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 97%
ondemand efficiency: 99%
cfs efficiency: 89%
#./test.sh 3 50 1000
Frequency domain CPU0~CPU2, run 50ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 93%
ondemand efficiency: 96%
cfs efficiency: 58%
#./test.sh 3 1000 100
Frequency domain CPU0~CPU2, run 1000ms, sleep 100ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 99%
ondemand efficiency: 99%
cfs efficiency: 97%
It seems sched-dvfs is computing-inefficient at low CPU usage (which implies it is power efficient), but computing-efficient at high CPU usage.
-Xunlei
Hi Xunlei,
Thanks for the numbers. So this is along expected lines: the scheduler is not responsive enough in raising the OPP during short-lived (<250ms?) bursts of activity.
IIUC, PELT needs ~300ms to reflect the full load (and a little over 100ms to get to reflect 90% load).
Just curious, does the use of the RT scheduling class improve the efficiency?
What heuristics can be added to control hysteresis? Taking into account the number of tasks per CPU, or the number of tasks per group?
Regards, Amit
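The PELT ramp-up figures mentioned above can be checked with a rough calculation. The sketch below assumes PELT's usual geometric decay (1024us periods, with a contribution that halves every 32 periods) and a task that starts from zero load and then runs continuously. The function name pelt_ramp_ms is hypothetical, and the model ignores the kernel's integer rounding.

```sh
#!/bin/sh
# Sketch: time for a PELT-style signal to reach a fraction of full load,
# assuming a task that starts from zero and then runs continuously.
# Assumed parameters: 1024us periods, contribution halves every 32 periods.
#   $1: target fraction of full load, e.g. 0.9
pelt_ramp_ms() {
    awk -v target="$1" 'BEGIN {
        y = exp(log(0.5) / 32)      # per-period decay factor, y^32 = 0.5
        remaining = 1               # equals y^n, i.e. 1 - accumulated load
        n = 0
        while (1 - remaining < target) {
            remaining *= y
            n++
        }
        printf "%d\n", n * 1024 / 1000   # periods -> whole milliseconds
    }'
}

pelt_ramp_ms 0.9
pelt_ramp_ms 0.999
```

Under these assumptions the 90% point lands a little over 100ms and the 99.9% point at roughly 0.3s, consistent with the estimates quoted in the message.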
Hi Amit,
Amit Kucheria amit.kucheria@linaro.org wrote 2015-06-22 PM 06:18:35:
Re: [RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
Hi Xunlei,
Thanks for the numbers. So this is along expected lines: the scheduler is not responsive enough in raising the OPP during short-lived (<250ms?) bursts of activity.
IIUC, PELT needs ~300ms to reflect the full load (and a little over 100ms to get to reflect 90% load).
Yeah, I found this when I tested EAS. I used a "200ms run + 200ms sleep" pattern back then, and task_utilization() was around 1000, so the tasks weren't all put onto the energy-efficient CPUs by energy_aware_wake_cpu().
Just curious, does the use of the RT scheduling class improve the efficiency?
I changed kcpufreq_cfs_task to a CFS task, and the result is almost the same. With "50ms run + 1000ms sleep", 10 loops running on CPU0: FIFO: 49%, non-RT: 50%
Maybe FIFO is used for it just to ensure the response time and to avoid disturbing the CFS load contribution (which would happen if kcpufreq_cfs_task were a CFS task).
What heuristics can be added to control hysteresis? Taking into account the number of tasks per CPU, or the number of tasks per group?
I think we can set a threshold for it: a low threshold means energy efficient (like conservative), and a high threshold means computing efficient, which means we use the highest frequency directly if the CPU usage exceeds the threshold, like ondemand?
-Xunlei
Regards, Amit
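The threshold idea discussed above could look roughly like the following sketch. It is purely hypothetical and is not sched-dvfs code: the function name pick_freq, the linear scaling used below the threshold, and the sample frequencies are all assumptions made for illustration.

```sh
#!/bin/sh
# Hypothetical sketch of a usage-threshold policy (not sched-dvfs code).
#   $1: cpu usage in percent   $2: threshold in percent
#   $3: min frequency (kHz)    $4: max frequency (kHz)
pick_freq() {
    if [ "$1" -gt "$2" ]; then
        # Above the threshold: jump straight to the highest frequency.
        echo "$4"
    else
        # Assumed behaviour below the threshold: scale linearly with usage.
        echo $(( $3 + ($4 - $3) * $1 / 100 ))
    fi
}

# Illustrative values only.
pick_freq 30 80 1200000 2200000
pick_freq 90 80 1200000 2200000
```

With a single tunable like this, the threshold value decides how early the policy gives up on intermediate OPPs and goes directly to the maximum frequency.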
Hi Xunlei,
I have run the bench on a quad A15 with sched-dvfs (but without the EAS patches, unlike you):
#sudo ./test.sh 4 100 1000
Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 18%
ondemand efficiency: 48%
cfs efficiency: 24%
#sudo ./test.sh 4 200 1000
Frequency domain CPU0~CPU3, run 200ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 40%
ondemand efficiency: 68%
cfs efficiency: 30%
$ sudo ./test.sh 4 50 1000
Frequency domain CPU0~CPU3, run 50ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 0%
ondemand efficiency: 25%
cfs efficiency: 19%
As an example, here is the result when the ondemand parameters are tuned for the platform:
sudo ./test.sh 4 100 1000
Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 19%
ondemand efficiency: 94%
cfs efficiency: 23%
Besides these results, I have seen variation in the results, which confirms the interest of having more statistics like min, man stdev
Regards, Vincent
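As a sketch of what such extra statistics could look like, the helper below summarises a list of per-loop run durations. It is not part of the patch: the function name run_stats, the choice of statistics (min, max, mean and population standard deviation) and the sample values are assumptions for illustration.

```sh
#!/bin/sh
# Sketch: min, max, mean and population standard deviation of a list of
# per-loop run durations (one integer per line on stdin, in microseconds).
run_stats() {
    awk '
        NR == 1 { min = $1; max = $1 }
        {
            if ($1 < min) min = $1
            if ($1 > max) max = $1
            sum += $1
            sumsq += $1 * $1
        }
        END {
            if (NR == 0) exit 1
            mean = sum / NR
            var = sumsq / NR - mean * mean
            if (var < 0) var = 0     # guard against rounding error
            printf "min=%d max=%d mean=%.1f stdev=%.1f\n", min, max, mean, sqrt(var)
        }'
}

# Made-up durations, for illustration only.
printf '%s\n' 100 110 120 130 140 | run_stats
```

In dvfs.sh, the values that are currently summed in the for loop could be piped into a helper of this kind instead of being reduced to a single average.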
Hi Vincent,
Vincent Guittot vincent.guittot@linaro.org wrote 2015-06-23 PM 09:43:55:
Re: [RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
Besides these results, I have seen variation in the results, which confirms the interest of having more statistics like min, man stdev
I guess the hardware environment has something to do with this. On my platform there are 11 available frequencies in total, from 1200MHz to 2200MHz, with a step size of 100MHz.
The results may also differ when running with a different number of workload loops.
-Xunlei
Regards, Vincent
On 18 June 2015 at 13:35, pang.xunlei@zte.com.cn wrote:
Just tested on my Intel EAS test environment(implemented x86 frequency invariant hook). With EAS disabled and sched-dvfs enabled.
#./test.sh 3 100 1000 Frequency domain CPU0~CPU2, run 100ms, sleep 1000ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 92% ondemand efficiency: 97% cfs efficiency: 79%
#./test.sh 3 200 1000 Frequency domain CPU0~CPU2, run 200ms, sleep 1000ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 97% ondemand efficiency: 99% cfs efficiency: 89%
#./test.sh 3 50 1000 Frequency domain CPU0~CPU2, run 50ms, sleep 1000ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 93% ondemand efficiency: 96% cfs efficiency: 58%
#./test.sh 3 1000 100 Frequency domain CPU0~CPU2, run 1000ms, sleep 100ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 99% ondemand efficiency: 99% cfs efficiency: 97%
Seems sched-dvfs is computing inefficient at low cpu usage(implies
power
efficient), but computing efficient at high cpu usage.
-Xunlei
Xunlei Pang xlpang@126.com wrote 2015-06-18 PM 05:06:07:
[RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
From: Xunlei Pang pang.xunlei@linaro.org
DVFS adds a latency in the execution of task because of the time to decide to move at max freq. We need to measure this latency and check that the governor stays in an acceptable range.
When workgen runs a json file, a log file is created for each thread. This log file records the number of loop that has been executed and the duration for executing these loops (per phase). We can use these figures to evaluate to latency that is added by a cpufreq governor and its "performance efficiency".
We use the run+sleep pattern to do the measurement, for the run time
per
loop, the performance governor should run the expected duration as
the
CPU stays a max freq. At the opposite, the powersave governor will
give
use the longest duration (as it stays at lowest OPP). Other governor
will
be somewhere between the 2 previous duration as they will use several
OPP
and will go back to max frequency after a defined duration which
depends
on its monitoring period.
The formula:
duration of powersave gov - duration of the gov
-------------------------------------------------------- x 100% duration of powersave gov - duration of performance gov
will give the efficiency of the governor. 100% means as efficient as the perf governor and 0% means as efficient as the powersave
governor.
This patch offers json files and shell scripts to do the measurement,
Usage: ./test.sh <cpus> <runtime> <sleeptime> cpus: number of cpus in the CPU0's frequency domain runtime: running time in ms per loop of the workload pattern sleeptime: sleeping time in ms per loop of the workload pattern
Example: "./test.sh 4 100 1000" means CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload
pattern.
test result on my machine: ~#./test.sh 4 100 1000 Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 28% ondemand efficiency: 95%
NOTE: Make sure there are "sed", "cut", "grep", "rt-app", etc tools
on
your test machine, and run the script under root privilege.
Signed-off-by: Xunlei Pang pang.xunlei@linaro.org
doc/examples/cpufreq_governor_efficiency/README | 54
++++++++++++++
.../cpufreq_governor_efficiency/calibration.json | 27 +++++++ .../cpufreq_governor_efficiency/calibration.sh | 11 +++ doc/examples/cpufreq_governor_efficiency/dvfs.json | 27 +++++++ doc/examples/cpufreq_governor_efficiency/dvfs.sh | 38 ++++++++++ doc/examples/cpufreq_governor_efficiency/test.sh | 82 +++++++++++ +++++++++++ 6 files changed, 239 insertions(+) create mode 100644 doc/examples/cpufreq_governor_efficiency/README create mode 100644
doc/examples/cpufreq_governor_efficiency/calibration.json
create mode 100755
doc/examples/cpufreq_governor_efficiency/calibration.sh
create mode 100644
doc/examples/cpufreq_governor_efficiency/dvfs.json
create mode 100755 doc/examples/cpufreq_governor_efficiency/dvfs.sh create mode 100755 doc/examples/cpufreq_governor_efficiency/test.sh
diff --git a/doc/examples/cpufreq_governor_efficiency/README b/doc/ examples/cpufreq_governor_efficiency/README new file mode 100644 index 0000000..cc8efe1 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/README @@ -0,0 +1,54 @@ +Measure the efficiency of cpufreq governors using rt-app
+BACKGROUND:
- DVFS adds a latency in the execution of task because of the time
to
- decide to move at max freq. We need to measure this latency and
check
- that the governor stays in an acceptable range.
- When workgen runs a json file, a log file is created for each
thread.
- This log file records the number of loop that has been executed
and
- the duration for executing these loops (per phase). We can use
these
- figures to evaluate to latency that is added by a cpufreq
governor
- and its "performance efficiency".
- We use the run+sleep pattern to do the measurement, for the run
time per
- loop, the performance governor should run the expected duration
as
the
- CPU stays a max freq. At the opposite, the powersave governor
will
give
- use the longest duration (as it stays at lowest OPP). Other
governor will
- be somewhere between the 2 previous duration as they will use
several OPP
- and will go back to max frequency after a defined duration which
depends
- on its monitoring period.
- The formula:
duration of powersave gov - duration of the gov
- -------------------------------------------------------- x 100%
duration of powersave gov - duration of performance gov
- will give the efficiency of the governor. 100% means as
efficient
as
- the perf governor and 0% means as efficient as the powersave
governor.
- This test offers json files and shell scripts to do the
measurement,
+USAGE:
- ./test.sh <cpus> <runtime> <sleeptime>
- cpus: number of cpus in the CPU0's frequency domain
- runtime: running time in ms per loop of the workload pattern
- sleeptime: sleeping time in ms per loop of the workload pattern
+Example:
- "./test.sh 4 100 1000" means
- CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload
pattern.
- test result on an Intel machine:
- ~#./test.sh 4 100 1000
- Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
- powersave efficiency: 0%
- performance efficiency: 100%
- conservative efficiency: 28%
- ondemand efficiency: 95%
+NOTE:
- Make sure there are "sed", "cut", "grep", "rt-app", etc tools
on your test
- machine, and run the script under root privilege.
diff --git a/doc/examples/cpufreq_governor_efficiency/ calibration.json
b/doc/examples/cpufreq_governor_efficiency/calibration.json
new file mode 100644 index 0000000..4377990 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/calibration.json @@ -0,0 +1,27 @@ +{
- "tasks" : {
"thread" : {
"instance" : 1,
"cpus" : [0],
"loop" : 1,
"phases" : {
"run" : {
"loop" : 1,
"run" : 200000,
},
"sleep" : {
"loop" : 1,
"sleep" : 200000,
}
}
}
- },
- "global" : {
"default_policy" : "SCHED_FIFO",
"calibration" : "CPU0",
"lock_pages" : true,
"ftrace" : true,
"logdir" : "./",
- }
+}
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.sh b/doc/examples/cpufreq_governor_efficiency/calibration.sh new file mode 100755 index 0000000..d10e644 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/calibration.sh @@ -0,0 +1,11 @@ +#!/bin/sh
+set -e
+echo performance >
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
+sleep 1
+pLoad=$(rt-app calibration.json 2>&1 |grep pLoad |sed 's/.*=
(.*)ns.*/\1/')
+sed 's/"calibration" : .*,/"calibration" : '$pLoad',/' -i dvfs.json
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.json b/ doc/examples/cpufreq_governor_efficiency/dvfs.json new file mode 100644 index 0000000..b413156 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/dvfs.json @@ -0,0 +1,27 @@ +{
- "tasks" : {
"thread" : {
"instance" : 1,
"cpus" : [0],
"loop" : 5,
"phases" : {
"running" : {
"loop" : 1,
"run" : 100000,
},
"sleeping" : {
"loop" : 1,
"sleep" : 1000000,
}
}
}
- },
- "global" : {
"default_policy" : "SCHED_OTHER",
"calibration" : 90,
"lock_pages" : true,
"ftrace" : true,
"logdir" : "./",
- }
+}
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.sh b/doc/ examples/cpufreq_governor_efficiency/dvfs.sh new file mode 100755 index 0000000..8591fc7 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/dvfs.sh @@ -0,0 +1,38 @@ +#!/bin/sh
+#echo $1 $2 $3 +set -e
+if [ $1 ] && [ $2 ] ; then
- for i in $(seq 0 1 $(expr $2 - 1)); do
echo $1 >
/sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
#cat /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
- done
- sleep 3
+fi
+if [ $3 ] ; then
- sed 's/"run" : .*,/"run" : '$3',/' -i dvfs.json
+fi
+if [ $4 ] ; then
- sed 's/"sleep" : .*,/"sleep" : '$4',/' -i dvfs.json
+fi
+#cat dvfs.json
+rt-app dvfs.json 2> /dev/null
+if [ $1 ] ; then
- mv -f rt-app-thread-0.log rt-app_$1_run$3us_sleep$4us.log
- sum=0
- for i in $(cat rt-app_$1_run$3us_sleep$4us.log | sed 'n;d' | sed
'1d' |cut -f 3); do
sum=$(expr $sum + $i)
- done
- sum=$(expr $sum / 5)
- echo $sum
- rm -f rt-app_$1_run$3us_sleep$4us.log
+fi
diff --git a/doc/examples/cpufreq_governor_efficiency/test.sh b/doc/examples/cpufreq_governor_efficiency/test.sh
new file mode 100755
index 0000000..d72fc6a
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/test.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+set -e
+set_calibration() {
+    calibration.sh
+}
+test_efficiency() {
+    FILENAME="results_$RANDOM$$.txt"
+    if [ -e /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors ]; then
+        for i in $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors); do
+            export gov_$i=$(echo $i)
+        done
+    else
+        echo "cpufreq is not available!"
+        exit
+    fi
+    if [ ! $gov_performance ] ; then
+        echo "Can't find performance governor!"
+        exit
+    fi
+    if [ ! $gov_powersave ] ; then
+        echo "Can't find powersave governor!"
+        exit
+    fi
+    # Get powersave data
+    dvfs.sh powersave $1 $2 $3 > $FILENAME
+    powersave=$(cat $FILENAME |sed -n '1p')
+    # Get performance data
+    dvfs.sh performance $1 $2 $3 > $FILENAME
+    performance=$(cat $FILENAME |sed -n '1p')
+    if [ $performance -ge $powersave ] ; then
+        echo "Error! Probably not input all the cpus in the same frequency domain"
+        exit
+    fi
+    denominator=$(expr $powersave - $performance)
+    echo "powersave efficiency: 0%"
+    echo "performance efficiency: 100%"
+    # Calcuate other governors data
+    for gov_next in $gov_conservative $gov_ondemand $gov_cfs; do
+        if [ "$gov_next" != "" ] ; then
+            dvfs.sh $gov_next $1 $2 $3 > $FILENAME
+            data=$(cat $FILENAME |sed -n '1p');
+            numerator=$(expr $powersave - $data)
+            numerator=$(expr $numerator \* 100)
+            if [ $numerator -lt 0 ] ; then
+                let numerator=0
+            fi
+            data=$(expr $numerator / $denominator)
+            echo "$gov_next efficiency: $data%"
+        fi
+    done
+    rm -f $FILENAME
+}
+if [ $# -lt 3 ]; then
+    echo "Usage: ./test.sh <cpus> <runtime> <sleeptime>"
+    echo "cpus: number of cpus in the CPU0's frequency domain"
+    echo "runtime: running time in ms per loop of the workload pattern"
+    echo "sleeptime: sleeping time in ms per loop of the workload pattern"
+    echo -e "\nExample: \n\"./test.sh 4 100 1000\" means\nCPU0~CPU3 sharing frequency, \"100ms run + 1000ms sleep\" workload pattern.\n"
+    exit
+fi
+echo "Frequency domain CPU0~CPU$(expr $1 - 1), run $2ms, sleep $3ms:"
+sleep 1
+PATH=$PATH:.
+set_calibration
+test_efficiency $1 $(expr $2 \* 1000) $(expr $3 \* 1000)
-- 1.9.1
ZTE Information Security Notice: The information contained in this
mail (and any attachment transmitted herewith) is privileged and confidential and is intended for the exclusive use of the addressee (s). If you are not an intended recipient, any disclosure, reproduction, distribution or other dissemination or use of the information contained is strictly prohibited. If you have received this mail in error, please delete it and notify us immediately.
On 24 June 2015 at 03:41, pang.xunlei@zte.com.cn wrote:
Hi Vincent,
Vincent Guittot vincent.guittot@linaro.org wrote 2015-06-23 PM 09:43:55:
Re: [RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
Hi Xunlei,
I have run the bench on a quad A15 with sched-dvfs (but without eas patches unlike you)
#sudo ./test.sh 4 100 1000
Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 18%
ondemand efficiency: 48%
cfs efficiency: 24%
#sudo ./test.sh 4 200 1000
Frequency domain CPU0~CPU3, run 200ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 40%
ondemand efficiency: 68%
cfs efficiency: 30%
$ sudo ./test.sh 4 50 1000
Frequency domain CPU0~CPU3, run 50ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 0%
ondemand efficiency: 25%
cfs efficiency: 19%
As an example, here is the result when the ondemand parameters are tuned for the platform:
sudo ./test.sh 4 100 1000
Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 19%
ondemand efficiency: 94%
cfs efficiency: 23%
Besides these results, I have seen variation in the results, which confirms the value of having more statistics such as min, max and stdev.
I guess the hardware environment has something to do with this: on my platform there are 11 available frequencies in total, 1200MHz~2200MHz, and the step size is 100MHz.
Yes, for sure; it was just to give some figures from a different platform. Regarding the variation, I have seen it on the same platform with the same SW. Nevertheless, these variations are somewhat normal if we consider the default sampling rate of 164ms for some governors compared to a run duration of 100ms.
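The extra statistics discussed here could be gathered along the lines of the sketch below. It is not part of the patch: duration_stats is a hypothetical helper, and it assumes one run-phase duration per line on stdin (for example the third column that dvfs.sh cuts out of the rt-app log). It reports min, max, mean and the population standard deviation.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): read one duration per line on
# stdin and print min, max, mean and population standard deviation.
duration_stats() {
    # LC_ALL=C keeps "." as the decimal separator in the output.
    LC_ALL=C awk '
        {
            v = $1 + 0                      # force a numeric value
            if (NR == 1) { min = v; max = v }
            if (v < min) min = v
            if (v > max) max = v
            sum += v
            sumsq += v * v
        }
        END {
            if (NR == 0) exit 1             # no samples: report failure
            mean = sum / NR
            var = sumsq / NR - mean * mean
            if (var < 0) var = 0            # guard against rounding error
            printf "min=%d max=%d mean=%.1f stdev=%.1f\n", min, max, mean, sqrt(var)
        }'
}
```

A possible use would be to pipe the per-loop durations into it, e.g. `cut -f 3 some.log | duration_stats`, instead of only averaging them; the log file name here is only a placeholder.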
Regards, Vincent
Also it may get different results when running with different workload loops.
-Xunlei
Regards, Vincent
On 18 June 2015 at 13:35, pang.xunlei@zte.com.cn wrote:
Just tested on my Intel EAS test environment(implemented x86 frequency invariant hook). With EAS disabled and sched-dvfs enabled.
#./test.sh 3 100 1000
Frequency domain CPU0~CPU2, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 92%
ondemand efficiency: 97%
cfs efficiency: 79%
#./test.sh 3 200 1000
Frequency domain CPU0~CPU2, run 200ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 97%
ondemand efficiency: 99%
cfs efficiency: 89%
#./test.sh 3 50 1000
Frequency domain CPU0~CPU2, run 50ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 93%
ondemand efficiency: 96%
cfs efficiency: 58%
#./test.sh 3 1000 100
Frequency domain CPU0~CPU2, run 1000ms, sleep 100ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 99%
ondemand efficiency: 99%
cfs efficiency: 97%
It seems sched-dvfs is computationally inefficient at low cpu usage (which implies it is power efficient), but computationally efficient at high cpu usage.
-Xunlei
(This might be of interest to folks on eas-dev, adding them to cc)
Hi Xunlei,
On 18 June 2015 at 11:06, Xunlei Pang xlpang@126.com wrote:
From: Xunlei Pang pang.xunlei@linaro.org
DVFS adds a latency in the execution of task because of the time to decide to move at max freq. We need to measure this latency and check that the governor stays in an acceptable range.
When workgen runs a json file, a log file is created for each thread.
You use rt-app, not workgen, in your script.
This log file records the number of loops that have been executed and the duration for executing these loops (per phase). We can use these figures to evaluate the latency that is added by a cpufreq governor and its "performance efficiency".
We use the run+sleep pattern to do the measurement. For the run time per loop, the performance governor should run the expected duration as the CPU stays at max freq. At the opposite, the powersave governor will give us the longest duration (as it stays at the lowest OPP). Other governors will be somewhere between the 2 previous durations, as they will use several OPPs and will go back to max frequency after a defined duration which depends on their monitoring period.
The formula:
  duration of powersave gov - duration of the gov
  -------------------------------------------------------- x 100%
  duration of powersave gov - duration of performance gov
will give the efficiency of the governor. 100% means as efficient as the perf governor and 0% means as efficient as the powersave governor.
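As a rough illustration of the formula, the sketch below computes the efficiency from three mean run-phase durations. governor_efficiency is a hypothetical helper, not something the patch provides, and the durations in the example call are made-up numbers in microseconds. Like test.sh, it uses integer arithmetic and clamps negative results to 0.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): governor efficiency in percent.
# All three durations must use the same unit (e.g. microseconds):
#   $1 = powersave duration, $2 = performance duration, $3 = tested governor
governor_efficiency() {
    powersave=$1
    performance=$2
    governor=$3

    denominator=$((powersave - performance))
    # The formula only makes sense if performance is faster than powersave.
    if [ "$denominator" -le 0 ]; then
        echo "performance duration must be below powersave duration" >&2
        return 1
    fi

    numerator=$(( (powersave - governor) * 100 ))
    # A governor slower than powersave is reported as 0%, as test.sh does.
    if [ "$numerator" -lt 0 ]; then
        numerator=0
    fi

    echo $((numerator / denominator))
}

# Made-up figures: powersave 200000us, performance 100000us, governor 105000us
governor_efficiency 200000 100000 105000    # prints 95
```

With these numbers the governor loses 5000us against the performance governor out of a possible 100000us, hence 95%.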
This patch offers json files and shell scripts to do the measurement,
Usage: ./test.sh <cpus> <runtime> <sleeptime>
cpus: number of cpus in the CPU0's frequency domain
Why do you need the number of cpus in CPU0's freq domain as an argument of the script? The cpus that share the same freq domain are available in /sys/devices/system/cpu/cpu0/cpufreq/related_cpus and/or affected_cpus. I don't remember the exact difference between the two lists.
And why do you need this parameter at all? In your script, you set the governor of all cpus of the freq domain, but AFAICT cpus in the same freq domain share the same governor, so all cpus will use the new governor as soon as you change the governor of one cpu.
Then, how can I test the efficiency of a cpu that doesn't belong to CPU0's freq domain? IMHO, you should remove this parameter and add a new one to select the cpu on which you want to run the bench. Furthermore, CPU0 is the cpu that is used the most by default in a platform, so it's worth using another one in order not to be disturbed by background activity.
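The suggestion above could be sketched as follows: instead of asking the user for the number of cpus, read the frequency domain of a chosen cpu from sysfs. The function name freq_domain_cpus and its optional second argument (an alternative sysfs root, useful for dry runs) are hypothetical. As far as the kernel cpufreq documentation goes, related_cpus lists online and offline cpus of the domain while affected_cpus lists only the online ones, so related_cpus is used here.

```shell
#!/bin/sh
# freq_domain_cpus <cpu> [sysfs_root]
# Hypothetical helper: prints the space-separated list of cpus that share
# the frequency domain of <cpu>, as exposed by cpufreq in sysfs.
freq_domain_cpus() {
    cpu=$1
    root=${2:-/sys/devices/system/cpu}
    file="$root/cpu$cpu/cpufreq/related_cpus"

    if [ ! -r "$file" ]; then
        echo "cpufreq is not available for cpu$cpu" >&2
        return 1
    fi
    cat "$file"
}
```

A caller could then run the bench on any cpu, for example "freq_domain_cpus 2", and would only need to write scaling_governor for one cpu of the returned list if the governor is indeed shared across the domain.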
runtime: running time in ms per loop of the workload pattern
sleeptime: sleeping time in ms per loop of the workload pattern
Example: "./test.sh 4 100 1000" means CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.
test result on my machine:
~#./test.sh 4 100 1000
Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
powersave efficiency: 0%
performance efficiency: 100%
conservative efficiency: 28%
ondemand efficiency: 95%
NOTE: Make sure there are "sed", "cut", "grep", "rt-app", etc tools on your test machine, and run the script under root privilege.
Signed-off-by: Xunlei Pang pang.xunlei@linaro.org
doc/examples/cpufreq_governor_efficiency/README | 54 ++++++++++++++
.../cpufreq_governor_efficiency/calibration.json | 27 +++++++
.../cpufreq_governor_efficiency/calibration.sh | 11 +++
doc/examples/cpufreq_governor_efficiency/dvfs.json | 27 +++++++
doc/examples/cpufreq_governor_efficiency/dvfs.sh | 38 ++++++++++
doc/examples/cpufreq_governor_efficiency/test.sh | 82 ++++++++++++++++++++++
6 files changed, 239 insertions(+)
create mode 100644 doc/examples/cpufreq_governor_efficiency/README
create mode 100644 doc/examples/cpufreq_governor_efficiency/calibration.json
create mode 100755 doc/examples/cpufreq_governor_efficiency/calibration.sh
create mode 100644 doc/examples/cpufreq_governor_efficiency/dvfs.json
create mode 100755 doc/examples/cpufreq_governor_efficiency/dvfs.sh
create mode 100755 doc/examples/cpufreq_governor_efficiency/test.sh
diff --git a/doc/examples/cpufreq_governor_efficiency/README b/doc/examples/cpufreq_governor_efficiency/README
new file mode 100644
index 0000000..cc8efe1
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/README
@@ -0,0 +1,54 @@
+Measure the efficiency of cpufreq governors using rt-app
+BACKGROUND:
- DVFS adds a latency in the execution of task because of the time to
- decide to move at max freq. We need to measure this latency and check
- that the governor stays in an acceptable range.
- When workgen runs a json file, a log file is created for each thread.
- This log file records the number of loop that has been executed and
- the duration for executing these loops (per phase). We can use these
- figures to evaluate to latency that is added by a cpufreq governor
- and its "performance efficiency".
- We use the run+sleep pattern to do the measurement, for the run time per
- loop, the performance governor should run the expected duration as the
- CPU stays a max freq. At the opposite, the powersave governor will give
- use the longest duration (as it stays at lowest OPP). Other governor will
- be somewhere between the 2 previous duration as they will use several OPP
- and will go back to max frequency after a defined duration which depends
- on its monitoring period.
- The formula:
duration of powersave gov - duration of the gov
- -------------------------------------------------------- x 100%
duration of powersave gov - duration of performance gov
- will give the efficiency of the governor. 100% means as efficient as
- the perf governor and 0% means as efficient as the powersave governor.
- This test offers json files and shell scripts to do the measurement,
+USAGE:
- ./test.sh <cpus> <runtime> <sleeptime>
- cpus: number of cpus in the CPU0's frequency domain
- runtime: running time in ms per loop of the workload pattern
- sleeptime: sleeping time in ms per loop of the workload pattern
+Example:
- "./test.sh 4 100 1000" means
- CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.
- test result on an Intel machine:
- ~#./test.sh 4 100 1000
- Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
- powersave efficiency: 0%
- performance efficiency: 100%
- conservative efficiency: 28%
- ondemand efficiency: 95%
+NOTE:
- Make sure there are "sed", "cut", "grep", "rt-app", etc tools on your test
- machine, and run the script under root privilege.
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.json b/doc/examples/cpufreq_governor_efficiency/calibration.json
new file mode 100644
index 0000000..4377990
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.json
@@ -0,0 +1,27 @@
+{
"tasks" : {
"thread" : {
"instance" : 1,
"cpus" : [0],
"loop" : 1,
"phases" : {
"run" : {
"loop" : 1,
"run" : 200000,
},
"sleep" : {
"loop" : 1,
"sleep" : 200000,
}
}
}
},
"global" : {
"default_policy" : "SCHED_FIFO",
"calibration" : "CPU0",
"lock_pages" : true,
"ftrace" : true,
Remove or disable the ftrace parameter; otherwise rt-app will return an error if ftrace is not enabled in the kernel. Also, your script stops without any message if rt-app fails to run the use case; you should detect the error and display a warning.
"logdir" : "./",
}
+}
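The error handling requested in the comment above could look like the following minimal sketch. It assumes that rt-app exits with a non-zero status when it cannot run the use case; the wrapper name run_use_case and the RTAPP variable (which only exists so the binary can be overridden) are hypothetical.

```shell
#!/bin/sh
# Binary to run; overridable for testing (hypothetical variable).
RTAPP=${RTAPP:-rt-app}

# run_use_case <json_file>
# Hypothetical wrapper: runs the use case and prints a warning instead of
# failing silently when rt-app returns an error.
run_use_case() {
    json=$1
    if ! "$RTAPP" "$json" > /dev/null 2>&1; then
        echo "warning: $RTAPP failed to run $json (is ftrace enabled in the kernel?)" >&2
        return 1
    fi
}
```

Because the failure is tested inside an "if", the warning is printed even when the calling script uses "set -e".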
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.sh b/doc/examples/cpufreq_governor_efficiency/calibration.sh
new file mode 100755
index 0000000..d10e644
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+set -e
+echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
+sleep 1
+pLoad=$(rt-app calibration.json 2>&1 |grep pLoad |sed 's/.*= \(.*\)ns.*/\1/')
+sed 's/"calibration" : .*,/"calibration" : '$pLoad',/' -i dvfs.json
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.json b/doc/examples/cpufreq_governor_efficiency/dvfs.json
new file mode 100644
index 0000000..b413156
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.json
@@ -0,0 +1,27 @@
+{
"tasks" : {
"thread" : {
"instance" : 1,
"cpus" : [0],
"loop" : 5,
I wonder whether 5 loops are enough to let the system stabilize. Maybe we can extract more statistics, such as the min/max/average/stdev values? While testing your script, I have seen some variation in the results (especially with short run times).
"phases" : {
"running" : {
"loop" : 1,
"run" : 100000,
},
"sleeping" : {
"loop" : 1,
"sleep" : 1000000,
}
}
}
},
"global" : {
"default_policy" : "SCHED_OTHER",
"calibration" : 90,
"lock_pages" : true,
"ftrace" : true,
Remove or disable the ftrace parameter.
"logdir" : "./",
}
+}
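The extra statistics suggested in the comment above could be computed with awk, which is available wherever sed and cut are. This is a minimal sketch that assumes the per-loop run durations have already been extracted one per line (for instance by a cut pipeline over the rt-app log); the function name duration_stats is hypothetical, and the standard deviation is the population one.

```shell
#!/bin/sh
# duration_stats
# Hypothetical helper: reads one duration per line on stdin and prints
# min, max, average and (population) standard deviation as integers.
duration_stats() {
    awk '
        NR == 1 { min = $1; max = $1 }
        {
            if ($1 < min) min = $1
            if ($1 > max) max = $1
            sum += $1
            sumsq += $1 * $1
        }
        END {
            if (NR == 0) exit 1          # no samples at all
            mean = sum / NR
            var = sumsq / NR - mean * mean
            if (var < 0) var = 0         # guard against rounding error
            printf "min=%d max=%d avg=%d stdev=%d\n", min, max, mean, sqrt(var)
        }'
}
```

A large stdev relative to the average would indicate that the number of loops is too small for the chosen run time.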
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.sh b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
new file mode 100755
index 0000000..8591fc7
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+#echo $1 $2 $3
+set -e
+if [ $1 ] && [ $2 ] ; then
for i in $(seq 0 1 $(expr $2 - 1)); do
echo $1 > /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
#cat /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
Is this for debugging purposes?
done
sleep 3
+fi
+if [ $3 ] ; then
sed 's/"run" : .*,/"run" : '$3',/' -i dvfs.json
+fi
+if [ $4 ] ; then
sed 's/"sleep" : .*,/"sleep" : '$4',/' -i dvfs.json
+fi
+#cat dvfs.json
+rt-app dvfs.json 2> /dev/null
+if [ $1 ] ; then
mv -f rt-app-thread-0.log rt-app_$1_run$3us_sleep$4us.log
sum=0
for i in $(cat rt-app_$1_run$3us_sleep$4us.log | sed 'n;d' | sed '1d' |cut -f 3); do
sum=$(expr $sum + $i)
done
sum=$(expr $sum / 5)
echo $sum
rm -f rt-app_$1_run$3us_sleep$4us.log
+fi
diff --git a/doc/examples/cpufreq_governor_efficiency/test.sh b/doc/examples/cpufreq_governor_efficiency/test.sh
new file mode 100755
index 0000000..d72fc6a
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/test.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+set -e
+set_calibration() {
calibration.sh
+}
+test_efficiency() {
FILENAME="results_$RANDOM$$.txt"
if [ -e /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors ]; then
for i in $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors); do
export gov_$i=$(echo $i)
done
else
echo "cpufreq is not available!"
exit
fi
if [ ! $gov_performance ] ; then
echo "Can't find performance governor!"
exit
fi
if [ ! $gov_powersave ] ; then
echo "Can't find powersave governor!"
exit
fi
# Get powersave data
dvfs.sh powersave $1 $2 $3 > $FILENAME
powersave=$(cat $FILENAME |sed -n '1p')
# Get performance data
dvfs.sh performance $1 $2 $3 > $FILENAME
performance=$(cat $FILENAME |sed -n '1p')
if [ $performance -ge $powersave ] ; then
echo "Error! Probably not input all the cpus in the same frequency domain"
exit
fi
denominator=$(expr $powersave - $performance)
echo "powersave efficiency: 0%"
echo "performance efficiency: 100%"
# Calculate other governors data
for gov_next in $gov_conservative $gov_ondemand $gov_cfs; do
Why have you restricted the test to these 3 governors? What about the userspace governor, or the interactive governor when available?
if [ "$gov_next" != "" ] ; then
dvfs.sh $gov_next $1 $2 $3 > $FILENAME
data=$(cat $FILENAME |sed -n '1p');
numerator=$(expr $powersave - $data)
numerator=$(expr $numerator \* 100)
if [ $numerator -lt 0 ] ; then
let numerator=0
fi
data=$(expr $numerator / $denominator)
echo "$gov_next efficiency: $data%"
fi
done
rm -f $FILENAME
+}
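One way to avoid hard-coding the governor list, as questioned in the comment above, is to iterate over whatever scaling_available_governors reports and skip only the two reference governors. This is a sketch only; the function name other_governors and its optional file argument are hypothetical, and whether a governor such as userspace yields a meaningful result depends on the frequency it has been set to.

```shell
#!/bin/sh
# other_governors [file]
# Hypothetical helper: prints, one per line, every available governor
# except the two reference points (powersave and performance).
other_governors() {
    file=${1:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors}
    for gov in $(cat "$file"); do
        case "$gov" in
            performance|powersave) ;;   # measured separately as 100% and 0%
            *) echo "$gov" ;;
        esac
    done
}
```

The main loop could then be written as "for gov_next in $(other_governors); do ... done".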
+if [ $# -lt 3 ]; then
echo "Usage: ./test.sh <cpus> <runtime> <sleeptime>"
echo "cpus: number of cpus in the CPU0's frequency domain"
echo "runtime: running time in ms per loop of the workload pattern"
echo "sleeptime: sleeping time in ms per loop of the workload pattern"
echo -e "\nExample: \n\"./test.sh 4 100 1000\" means\nCPU0~CPU3 sharing frequency, \"100ms run + 1000ms sleep\" workload pattern.\n"
exit
+fi
+echo "Frequency domain CPU0~CPU$(expr $1 - 1), run $2ms, sleep $3ms:"
+sleep 1
+PATH=$PATH:.
+set_calibration
+test_efficiency $1 $(expr $2 \* 1000) $(expr $3 \* 1000)
You create several temporary files while executing your script; you should clean them up before exiting.
Some governors, like ondemand or interactive, have parameters that modify their responsiveness. How can I set them before testing their efficiency?
As an example, the ondemand governor's efficiency moves from nearly 0% to nearly 100% on my chromebook2 (only the quad A15 has been enabled for the test) if you change the sampling_rate and the sampling_down_factor when you test a short run: 100ms run, 1000ms sleep. The default configuration of a governor is not always the best one for the platform.
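Setting such parameters before a run could be sketched as below. The location of the tunables directory differs between kernels and cpufreq drivers (a global directory such as /sys/devices/system/cpu/cpufreq/ondemand or a per-policy one under cpuN/cpufreq/ondemand), so it is passed as an argument rather than guessed. The function name set_tunable is hypothetical.

```shell
#!/bin/sh
# set_tunable <tunables_dir> <name> <value>
# Hypothetical helper: writes a governor tunable and warns if it is missing
# or not writable (for example when the governor is not the active one).
set_tunable() {
    dir=$1
    name=$2
    value=$3

    if [ ! -w "$dir/$name" ]; then
        echo "warning: tunable $dir/$name is not writable" >&2
        return 1
    fi
    echo "$value" > "$dir/$name"
}
```

For example, "set_tunable /sys/devices/system/cpu/cpufreq/ondemand sampling_rate 10000" would apply an illustrative (not recommended) value, assuming that directory exists on the platform under test.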
Regards, Vincent
-- 1.9.1
Hi Xunlei,
On 22 June 2015 at 15:43, Vincent Guittot vincent.guittot@linaro.org wrote:
Hi Xunlei,
On 18 June 2015 at 11:06, Xunlei Pang xlpang@126.com wrote:
From: Xunlei Pang pang.xunlei@linaro.org
DVFS adds a latency in the execution of task because of the time to
if [ "$gov_next" != "" ] ; then
dvfs.sh $gov_next $1 $2 $3 > $FILENAME
data=$(cat $FILENAME |sed -n '1p');
numerator=$(expr $powersave - $data)
numerator=$(expr $numerator \* 100)
if [ $numerator -lt 0 ] ; then
let numerator=0
numerator=0 should be enough
fi
data=$(expr $numerator / $denominator)
echo "$gov_next efficiency: $data%"
fi
done
rm -f $FILENAME
+}
Hi Vincent,
Vincent Guittot vincent.guittot@linaro.org wrote 2015-06-22 PM 09:43:52:
Re: [RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
Hi Xunlei,
On 18 June 2015 at 11:06, Xunlei Pang xlpang@126.com wrote:
From: Xunlei Pang pang.xunlei@linaro.org
DVFS adds a latency in the execution of task because of the time to decide to move at max freq. We need to measure this latency and check that the governor stays in an acceptable range.
When workgen runs a json file, a log file is created for each thread.
you use rt-app and not workgen in your script
I thought rt-app is the same thing as workgen, could you tell me the difference?
This log file records the number of loop that has been executed and the duration for executing these loops (per phase). We can use these figures to evaluate to latency that is added by a cpufreq governor and its "performance efficiency".
We use the run+sleep pattern to do the measurement, for the run time
per
loop, the performance governor should run the expected duration as the CPU stays a max freq. At the opposite, the powersave governor will
give
use the longest duration (as it stays at lowest OPP). Other governor
will
be somewhere between the 2 previous duration as they will use several
OPP
and will go back to max frequency after a defined duration which
depends
on its monitoring period.
The formula:
 duration of powersave gov - duration of the gov
-------------------------------------------------------- x 100%
 duration of powersave gov - duration of performance gov
will give the efficiency of the governor. 100% means as efficient as the perf governor and 0% means as efficient as the powersave governor.
This patch offers json files and shell scripts to do the measurement,
Usage: ./test.sh <cpus> <runtime> <sleeptime> cpus: number of cpus in the CPU0's frequency domain
Why do you need the number of cpus in CPU0's freq domain as an argument of the script ? The cpus that share the same freq domain, are available in /sys/devices/system/cpu/cpu0/cpufreq/related_cpus and/or affected_cpus. I don't remember the exact difference between both lists
And why do you need this parameter at all ? In your script, you set the governor of all cpus of the freq domain but AFAICT, cpus in the same freq domain, share the same governor so all cpus will use the new governor as soon as you change the governor of one cpu.
Then, how can i test the efficiency of a cpu that doesn't belong to CPU0 freq domain ? IMHO, you should better remove this parameter and add a new one to select the cpu on which you want to run the bench. Futhermore, CPU0 is the cpu that is the more used by default in a platform so it's worth to use another one in order to not be disturbed by background activity.
runtime: running time in ms per loop of the workload pattern sleeptime: sleeping time in ms per loop of the workload pattern
Example: "./test.sh 4 100 1000" means CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload
pattern.
test result on my machine: ~#./test.sh 4 100 1000 Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 28% ondemand efficiency: 95%
NOTE: Make sure there are "sed", "cut", "grep", "rt-app", etc tools on your test machine, and run the script under root privilege.
Signed-off-by: Xunlei Pang pang.xunlei@linaro.org
doc/examples/cpufreq_governor_efficiency/README | 54
++++++++++++++
.../cpufreq_governor_efficiency/calibration.json | 27 +++++++ .../cpufreq_governor_efficiency/calibration.sh | 11 +++ doc/examples/cpufreq_governor_efficiency/dvfs.json | 27 +++++++ doc/examples/cpufreq_governor_efficiency/dvfs.sh | 38 ++++++++++ doc/examples/cpufreq_governor_efficiency/test.sh | 82 +++++++++
+++++++++++++
6 files changed, 239 insertions(+) create mode 100644 doc/examples/cpufreq_governor_efficiency/README create mode 100644 doc/examples/cpufreq_governor_efficiency/
calibration.json
create mode 100755
doc/examples/cpufreq_governor_efficiency/calibration.sh
create mode 100644 doc/examples/cpufreq_governor_efficiency/dvfs.json create mode 100755 doc/examples/cpufreq_governor_efficiency/dvfs.sh create mode 100755 doc/examples/cpufreq_governor_efficiency/test.sh
diff --git a/doc/examples/cpufreq_governor_efficiency/README b/
doc/examples/cpufreq_governor_efficiency/README
new file mode 100644 index 0000000..cc8efe1 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/README @@ -0,0 +1,54 @@ +Measure the efficiency of cpufreq governors using rt-app
+BACKGROUND:
- DVFS adds a latency in the execution of task because of the time
to
- decide to move at max freq. We need to measure this latency and
check
- that the governor stays in an acceptable range.
- When workgen runs a json file, a log file is created for each
thread.
- This log file records the number of loop that has been executed
and
- the duration for executing these loops (per phase). We can use
these
- figures to evaluate to latency that is added by a cpufreq
governor
- and its "performance efficiency".
- We use the run+sleep pattern to do the measurement, for the
run time per
- loop, the performance governor should run the expected duration
as the
- CPU stays a max freq. At the opposite, the powersave governor will
give
- use the longest duration (as it stays at lowest OPP). Other
governor will
- be somewhere between the 2 previous duration as they will use
several OPP
- and will go back to max frequency after a defined duration
which depends
- on its monitoring period.
- The formula:
duration of powersave gov - duration of the gov
- -------------------------------------------------------- x 100%
duration of powersave gov - duration of performance gov
- will give the efficiency of the governor. 100% means as efficient
as
- the perf governor and 0% means as efficient as the powersave
governor.
- This test offers json files and shell scripts to do the
measurement,
+USAGE:
- ./test.sh <cpus> <runtime> <sleeptime>
- cpus: number of cpus in the CPU0's frequency domain
- runtime: running time in ms per loop of the workload pattern
- sleeptime: sleeping time in ms per loop of the workload pattern
+Example:
- "./test.sh 4 100 1000" means
- CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep"
workload pattern.
- test result on an Intel machine:
- ~#./test.sh 4 100 1000
- Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
- powersave efficiency: 0%
- performance efficiency: 100%
- conservative efficiency: 28%
- ondemand efficiency: 95%
+NOTE:
- Make sure there are "sed", "cut", "grep", "rt-app", etc tools
on your test
- machine, and run the script under root privilege.
diff --git a/doc/examples/cpufreq_governor_efficiency/
calibration.json
b/doc/examples/cpufreq_governor_efficiency/calibration.json
new file mode 100644 index 0000000..4377990 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/calibration.json @@ -0,0 +1,27 @@ +{
"tasks" : {
"thread" : {
"instance" : 1,
"cpus" : [0],
"loop" : 1,
"phases" : {
"run" : {
"loop" : 1,
"run" : 200000,
},
"sleep" : {
"loop" : 1,
"sleep" : 200000,
}
}
}
},
"global" : {
"default_policy" : "SCHED_FIFO",
"calibration" : "CPU0",
"lock_pages" : true,
"ftrace" : true,
remove or disable the ftrace parameter. Otherwise rt-app will return an error if ftrace is not enable in the kernel. Then, your script stops without any message if rt-app fails to run the use case, you should detect the error and display a warning
"logdir" : "./",
}
+}
diff --git a/doc/examples/cpufreq_governor_efficiency/
calibration.sh b/doc/examples/cpufreq_governor_efficiency/calibration.sh
new file mode 100755 index 0000000..d10e644 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/calibration.sh @@ -0,0 +1,11 @@ +#!/bin/sh
+set -e
+echo performance >
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
+sleep 1
+pLoad=$(rt-app calibration.json 2>&1 |grep pLoad |sed 's/.*= \(.*\)ns.*/\1/')
+sed 's/"calibration" : .*,/"calibration" : '$pLoad',/' -i dvfs.json
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.json b/
doc/examples/cpufreq_governor_efficiency/dvfs.json
new file mode 100644 index 0000000..b413156 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/dvfs.json @@ -0,0 +1,27 @@ +{
"tasks" : {
"thread" : {
"instance" : 1,
"cpus" : [0],
"loop" : 5,
i wonder if 5 loops are enough to let the system stabilize ? May be we can extract more statistic like the min/max/average/stdev value ? While testing your script, i have seen some variation on the results (especially with short run time)
"phases" : {
"running" : {
"loop" : 1,
"run" : 100000,
},
"sleeping" : {
"loop" : 1,
"sleep" : 1000000,
}
}
}
},
"global" : {
"default_policy" : "SCHED_OTHER",
"calibration" : 90,
"lock_pages" : true,
"ftrace" : true,
Remove or disable ftrace parameter
"logdir" : "./",
}
+}
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.sh b/
doc/examples/cpufreq_governor_efficiency/dvfs.sh
new file mode 100755 index 0000000..8591fc7 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/dvfs.sh @@ -0,0 +1,38 @@ +#!/bin/sh
+#echo $1 $2 $3 +set -e
+if [ $1 ] && [ $2 ] ; then
for i in $(seq 0 1 $(expr $2 - 1)); do
echo $1 > /sys/devices/system/cpu/cpu$i/cpufreq/
scaling_governor
#cat
/sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
is it for debug purpose ?
done
sleep 3
+fi
+if [ $3 ] ; then
sed 's/"run" : .*,/"run" : '$3',/' -i dvfs.json
+fi
+if [ $4 ] ; then
sed 's/"sleep" : .*,/"sleep" : '$4',/' -i dvfs.json
+fi
+#cat dvfs.json
+rt-app dvfs.json 2> /dev/null
+if [ $1 ] ; then
mv -f rt-app-thread-0.log rt-app_$1_run$3us_sleep$4us.log
sum=0
for i in $(cat rt-app_$1_run$3us_sleep$4us.log | sed 'n;d'
| sed '1d' |cut -f 3); do
sum=$(expr $sum + $i)
done
sum=$(expr $sum / 5)
echo $sum
rm -f rt-app_$1_run$3us_sleep$4us.log
+fi
diff --git a/doc/examples/cpufreq_governor_efficiency/test.sh b/
doc/examples/cpufreq_governor_efficiency/test.sh
new file mode 100755 index 0000000..d72fc6a --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/test.sh @@ -0,0 +1,82 @@ +#!/bin/sh
+set -e
+set_calibration() {
calibration.sh
+}
+test_efficiency() {
FILENAME="results_$RANDOM$$.txt"
if [ -e /sys/devices/system/cpu/cpu0/cpufreq/
scaling_available_governors ]; then
for i in $(cat /sys/devices/system/cpu/cpu0/
cpufreq/scaling_available_governors); do
export gov_$i=$(echo $i)
done
else
echo "cpufreq is not available!"
exit
fi
if [ ! $gov_performance ] ; then
echo "Can't find performance governor!"
exit
fi
if [ ! $gov_powersave ] ; then
echo "Can't find powersave governor!"
exit
fi
# Get powersave data
dvfs.sh powersave $1 $2 $3 > $FILENAME
powersave=$(cat $FILENAME |sed -n '1p')
# Get performance data
dvfs.sh performance $1 $2 $3 > $FILENAME
performance=$(cat $FILENAME |sed -n '1p')
if [ $performance -ge $powersave ] ; then
echo "Error! Probably not input all the cpus in
the same frequency domain"
exit
fi
denominator=$(expr $powersave - $performance)
echo "powersave efficiency: 0%"
echo "performance efficiency: 100%"
# Calcuate other governors data
for gov_next in $gov_conservative $gov_ondemand $gov_cfs; do
Why have you restricted the test to these 3 governors ? what about userspace gov ? or interactive governor when available ?
if [ "$gov_next" != "" ] ; then
dvfs.sh $gov_next $1 $2 $3 > $FILENAME
data=$(cat $FILENAME |sed -n '1p');
numerator=$(expr $powersave - $data)
numerator=$(expr $numerator \* 100)
if [ $numerator -lt 0 ] ; then
let numerator=0
fi
data=$(expr $numerator / $denominator)
echo "$gov_next efficiency: $data%"
fi
done
rm -f $FILENAME
+}
+if [ $# -lt 3 ]; then
echo "Usage: ./test.sh <cpus> <runtime> <sleeptime>"
echo "cpus: number of cpus in the CPU0's frequency domain"
echo "runtime: running time in ms per loop of the workload
pattern"
echo "sleeptime: sleeping time in ms per loop of the
workload pattern"
echo -e "\nExample: \n\"./test.sh 4 100 1000\" means
\nCPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.\n"
exit
+fi
+echo "Frequency domain CPU0~CPU$(expr $1 - 1), run $2ms, sleep $3ms:"
+sleep 1
+PATH=$PATH:.
+set_calibration
+test_efficiency $1 $(expr $2 \* 1000) $(expr $3 \* 1000)
You have created several temporary file while executing your script, you should clean them before exiting.
Some governors like ondemand or interactive ones have parameters that modified their responsiveness. How can i set them before testing their efficiency ?
As an example, ondemand governor efficiency moves from nearly 0% to nearly 100% on my chromebook2 (only the quad A15 have been enable for the test) if you change the sampling_rate and the sampling_down_factor when you test short run: 100ms run 1000ms sleep. Default configuration of a governor are not always the best one for the platform
Yes, thanks for the review.
I'm planning to change test.sh to test only one governor at a time, so the user can set the proper governor parameters before running the test, like below:
Usage: ./test.sh <cpu> <runtime> <sleeptime> <loops> <governor>
cpu: cpu number on which you want to run the test
runtime: running time in ms per loop of the workload pattern
sleeptime: sleeping time in ms per loop of the workload pattern
loops: number of repetitions of the workload pattern
governor: CPUFreq governor you want to test
Example: "./test.sh 0 100 1000 10 ondemand" means test ondemand on CPU0 with 10 loops of the "100ms run + 1000ms sleep" workload pattern.
-Xunlei
Regards, Vincent
-- 1.9.1
On 23 June 2015 at 12:52, pang.xunlei@zte.com.cn wrote:
Hi Vincent,
Vincent Guittot vincent.guittot@linaro.org wrote 2015-06-22 PM 09:43:52:
Re: [RESEND PATCH v2] doc: measure the efficiency of cpufreq governors
Hi Xunlei,
On 18 June 2015 at 11:06, Xunlei Pang xlpang@126.com wrote:
From: Xunlei Pang pang.xunlei@linaro.org
DVFS adds a latency in the execution of task because of the time to decide to move at max freq. We need to measure this latency and check that the governor stays in an acceptable range.
When workgen runs a json file, a log file is created for each thread.
you use rt-app and not workgen in your script
I thought rt-app is the same thing as workgen, could you tell me the difference?
workgen is a python script that parses the json file and checks its correctness, updating and/or correcting it if necessary, before calling rt-app, which runs the use case.
This log file records the number of loop that has been executed and the duration for executing these loops (per phase). We can use these figures to evaluate to latency that is added by a cpufreq governor and its "performance efficiency".
We use the run+sleep pattern to do the measurement, for the run time
per
loop, the performance governor should run the expected duration as the CPU stays a max freq. At the opposite, the powersave governor will
give
use the longest duration (as it stays at lowest OPP). Other governor
will
be somewhere between the 2 previous duration as they will use several
OPP
and will go back to max frequency after a defined duration which
depends
on its monitoring period.
The formula:
 duration of powersave gov - duration of the gov
-------------------------------------------------------- x 100%
 duration of powersave gov - duration of performance gov
will give the efficiency of the governor. 100% means as efficient as the perf governor and 0% means as efficient as the powersave governor.
This patch offers json files and shell scripts to do the measurement,
Usage: ./test.sh <cpus> <runtime> <sleeptime> cpus: number of cpus in the CPU0's frequency domain
Why do you need the number of cpus in CPU0's freq domain as an argument of the script ? The cpus that share the same freq domain, are available in /sys/devices/system/cpu/cpu0/cpufreq/related_cpus and/or affected_cpus. I don't remember the exact difference between both lists
And why do you need this parameter at all ? In your script, you set the governor of all cpus of the freq domain but AFAICT, cpus in the same freq domain, share the same governor so all cpus will use the new governor as soon as you change the governor of one cpu.
Then, how can i test the efficiency of a cpu that doesn't belong to CPU0 freq domain ? IMHO, you should better remove this parameter and add a new one to select the cpu on which you want to run the bench. Futhermore, CPU0 is the cpu that is the more used by default in a platform so it's worth to use another one in order to not be disturbed by background activity.
runtime: running time in ms per loop of the workload pattern sleeptime: sleeping time in ms per loop of the workload pattern
Example: "./test.sh 4 100 1000" means CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload
pattern.
test result on my machine: ~#./test.sh 4 100 1000 Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms: powersave efficiency: 0% performance efficiency: 100% conservative efficiency: 28% ondemand efficiency: 95%
NOTE: Make sure there are "sed", "cut", "grep", "rt-app", etc tools on your test machine, and run the script under root privilege.
Signed-off-by: Xunlei Pang pang.xunlei@linaro.org
doc/examples/cpufreq_governor_efficiency/README | 54
++++++++++++++
.../cpufreq_governor_efficiency/calibration.json | 27 +++++++ .../cpufreq_governor_efficiency/calibration.sh | 11 +++ doc/examples/cpufreq_governor_efficiency/dvfs.json | 27 +++++++ doc/examples/cpufreq_governor_efficiency/dvfs.sh | 38 ++++++++++ doc/examples/cpufreq_governor_efficiency/test.sh | 82 +++++++++
+++++++++++++
6 files changed, 239 insertions(+) create mode 100644 doc/examples/cpufreq_governor_efficiency/README create mode 100644 doc/examples/cpufreq_governor_efficiency/
calibration.json
create mode 100755
doc/examples/cpufreq_governor_efficiency/calibration.sh
create mode 100644 doc/examples/cpufreq_governor_efficiency/dvfs.json create mode 100755 doc/examples/cpufreq_governor_efficiency/dvfs.sh create mode 100755 doc/examples/cpufreq_governor_efficiency/test.sh
diff --git a/doc/examples/cpufreq_governor_efficiency/README b/
doc/examples/cpufreq_governor_efficiency/README
new file mode 100644 index 0000000..cc8efe1 --- /dev/null +++ b/doc/examples/cpufreq_governor_efficiency/README @@ -0,0 +1,54 @@ +Measure the efficiency of cpufreq governors using rt-app
+BACKGROUND:
- DVFS adds a latency in the execution of task because of the time
to
- decide to move at max freq. We need to measure this latency and
check
- that the governor stays in an acceptable range.
- When workgen runs a json file, a log file is created for each
thread.
- This log file records the number of loop that has been executed
and
- the duration for executing these loops (per phase). We can use
these
- figures to evaluate to latency that is added by a cpufreq
governor
- and its "performance efficiency".
- We use the run+sleep pattern to do the measurement, for the
run time per
- loop, the performance governor should run the expected duration
as the
- CPU stays a max freq. At the opposite, the powersave governor will
give
- use the longest duration (as it stays at lowest OPP). Other
governor will
- be somewhere between the 2 previous duration as they will use
several OPP
- and will go back to max frequency after a defined duration
which depends
- on its monitoring period.
- The formula:
duration of powersave gov - duration of the gov
- -------------------------------------------------------- x 100%
duration of powersave gov - duration of performance gov
- will give the efficiency of the governor. 100% means as efficient
as
- the perf governor and 0% means as efficient as the powersave
governor.
- This test offers json files and shell scripts to do the
measurement,
+USAGE:
- ./test.sh <cpus> <runtime> <sleeptime>
- cpus: number of cpus in the CPU0's frequency domain
- runtime: running time in ms per loop of the workload pattern
- sleeptime: sleeping time in ms per loop of the workload pattern
+Example:
- "./test.sh 4 100 1000" means
- CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep"
workload pattern.
- test result on an Intel machine:
- ~#./test.sh 4 100 1000
- Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
- powersave efficiency: 0%
- performance efficiency: 100%
- conservative efficiency: 28%
- ondemand efficiency: 95%
+NOTE:
- Make sure the "sed", "cut", "grep", "rt-app", etc. tools are installed
- on your test machine, and run the script with root privileges.
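The efficiency formula from the README's BACKGROUND section can be sketched as a small shell function. This is only an illustration: the function name `efficiency` and the sample durations are assumptions, not part of the patch, and the clamping to 0% mirrors what test.sh does for a negative numerator.

```shell
#!/bin/sh
# efficiency POWERSAVE PERFORMANCE GOVERNOR
# Prints the efficiency of a governor as an integer percentage, given the
# average run duration measured under powersave, under performance and
# under the governor being tested (all in the same unit).
# The name "efficiency" is hypothetical; it does not exist in the patch.
efficiency() {
    powersave=$1
    performance=$2
    gov=$3

    denominator=$((powersave - performance))
    if [ "$denominator" -le 0 ]; then
        echo "Error: powersave must be slower than performance" >&2
        return 1
    fi

    numerator=$(( (powersave - gov) * 100 ))
    # Clamp to 0% when the governor turns out slower than powersave.
    if [ "$numerator" -lt 0 ]; then
        numerator=0
    fi

    echo $((numerator / denominator))
}

# Made-up durations: (300000 - 110000) * 100 / (300000 - 100000) = 95
efficiency 300000 100000 110000
```

Integer division truncates the result, which matches the `expr`-based arithmetic used in the scripts.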
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.json b/doc/examples/cpufreq_governor_efficiency/calibration.json
new file mode 100644
index 0000000..4377990
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.json
@@ -0,0 +1,27 @@
+{
"tasks" : {
"thread" : {
"instance" : 1,
"cpus" : [0],
"loop" : 1,
"phases" : {
"run" : {
"loop" : 1,
"run" : 200000,
},
"sleep" : {
"loop" : 1,
"sleep" : 200000,
}
}
}
},
"global" : {
"default_policy" : "SCHED_FIFO",
"calibration" : "CPU0",
"lock_pages" : true,
"ftrace" : true,
Remove or disable the ftrace parameter. Otherwise rt-app will return an error if ftrace is not enabled in the kernel. Also, your script stops without any message if rt-app fails to run the use case; you should detect the error and display a warning.
"logdir" : "./",
}
+}
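The error detection asked for in the comment on the ftrace parameter could look roughly like the sketch below. It assumes a hypothetical helper, `run_checked`, that is not part of the patch; the point is only that the command is run inside an `||` list, so `set -e` does not abort the script silently and a warning can be printed first.

```shell
#!/bin/sh
set -e

# run_checked COMMAND [ARGS...]
# Runs the command and prints a warning on stderr if it fails.
# "run_checked" is a hypothetical helper name, not something from the patch.
run_checked() {
    status=0
    # A failing command on the left of "||" does not trigger "set -e".
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "Warning: '$*' failed with status $status" >&2
    fi
    return "$status"
}

# Possible use in calibration.sh or dvfs.sh (commented out, needs rt-app):
# run_checked rt-app dvfs.json 2> /dev/null || exit 1
```

The caller still decides whether a failure is fatal, for example with `|| exit 1` as in the commented-out line.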
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.sh b/doc/examples/cpufreq_governor_efficiency/calibration.sh
new file mode 100755
index 0000000..d10e644
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+set -e
+echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
+sleep 1
+pLoad=$(rt-app calibration.json 2>&1 |grep pLoad |sed 's/.*= \(.*\)ns.*/\1/')
+sed 's/"calibration" : .*,/"calibration" : '$pLoad',/' -i dvfs.json
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.json b/doc/examples/cpufreq_governor_efficiency/dvfs.json
new file mode 100644
index 0000000..b413156
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.json
@@ -0,0 +1,27 @@
+{
"tasks" : {
"thread" : {
"instance" : 1,
"cpus" : [0],
"loop" : 5,
I wonder whether 5 loops are enough to let the system stabilize. Maybe we can extract more statistics, like the min/max/average/stdev values? While testing your script, I have seen some variation in the results (especially with short run times).
"phases" : {
"running" : {
"loop" : 1,
"run" : 100000,
},
"sleeping" : {
"loop" : 1,
"sleep" : 1000000,
}
}
}
},
"global" : {
"default_policy" : "SCHED_OTHER",
"calibration" : 90,
"lock_pages" : true,
"ftrace" : true,
Remove or disable the ftrace parameter here as well.
"logdir" : "./",
}
+}
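The extra statistics suggested in the comment on the loop count (min/max/average/stdev) could be computed with a short awk filter. This is a sketch under assumptions: `duration_stats` is a hypothetical name, it expects one duration per line on stdin, and it reports the population standard deviation truncated to an integer.

```shell
#!/bin/sh
# duration_stats
# Reads one duration per line on stdin and prints "min max avg stdev",
# each truncated to an integer. stdev is the population standard deviation.
# "duration_stats" is a hypothetical helper, not part of the patch.
duration_stats() {
    awk '
        NR == 1 { min = $1; max = $1 }
        {
            if ($1 < min) min = $1
            if ($1 > max) max = $1
            sum += $1
            sumsq += $1 * $1
        }
        END {
            if (NR == 0) exit 1
            avg = sum / NR
            var = sumsq / NR - avg * avg
            # Guard against a tiny negative value caused by rounding.
            if (var < 0) var = 0
            printf "%d %d %d %d\n", min, max, avg, sqrt(var)
        }
    '
}

# Made-up sample durations, one per line:
printf '%s\n' 2 4 4 4 5 5 7 9 | duration_stats
```

In dvfs.sh the input could come from the same pipeline that already extracts the third column of the rt-app log, replacing the hard-coded division by 5.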
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.sh b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
new file mode 100755
index 0000000..8591fc7
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+#echo $1 $2 $3
+set -e
+if [ $1 ] && [ $2 ] ; then
for i in $(seq 0 1 $(expr $2 - 1)); do
echo $1 > /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
#cat /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
Is this for debugging purposes?
done
sleep 3
+fi
+if [ $3 ] ; then
sed 's/"run" : .*,/"run" : '$3',/' -i dvfs.json
+fi
+if [ $4 ] ; then
sed 's/"sleep" : .*,/"sleep" : '$4',/' -i dvfs.json
+fi
+#cat dvfs.json
+rt-app dvfs.json 2> /dev/null
+if [ $1 ] ; then
mv -f rt-app-thread-0.log rt-app_$1_run$3us_sleep$4us.log
sum=0
for i in $(cat rt-app_$1_run$3us_sleep$4us.log | sed 'n;d' | sed '1d' |cut -f 3); do
sum=$(expr $sum + $i)
done
sum=$(expr $sum / 5)
echo $sum
rm -f rt-app_$1_run$3us_sleep$4us.log
+fi
diff --git a/doc/examples/cpufreq_governor_efficiency/test.sh b/doc/examples/cpufreq_governor_efficiency/test.sh
new file mode 100755
index 0000000..d72fc6a
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/test.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+set -e
+set_calibration() {
calibration.sh
+}
+test_efficiency() {
FILENAME="results_$RANDOM$$.txt"
if [ -e /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors ]; then
for i in $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors); do
export gov_$i=$(echo $i)
done
else
echo "cpufreq is not available!"
exit
fi
if [ ! $gov_performance ] ; then
echo "Can't find performance governor!"
exit
fi
if [ ! $gov_powersave ] ; then
echo "Can't find powersave governor!"
exit
fi
# Get powersave data
dvfs.sh powersave $1 $2 $3 > $FILENAME
powersave=$(cat $FILENAME |sed -n '1p')
# Get performance data
dvfs.sh performance $1 $2 $3 > $FILENAME
performance=$(cat $FILENAME |sed -n '1p')
if [ $performance -ge $powersave ] ; then
echo "Error! Probably not input all the cpus in the same frequency domain"
exit
fi
denominator=$(expr $powersave - $performance)
echo "powersave efficiency: 0%"
echo "performance efficiency: 100%"
# Calculate other governors data
for gov_next in $gov_conservative $gov_ondemand $gov_cfs; do
Why have you restricted the test to these 3 governors? What about the userspace governor, or the interactive governor when available?
if [ "$gov_next" != "" ] ; then
dvfs.sh $gov_next $1 $2 $3 > $FILENAME
data=$(cat $FILENAME |sed -n '1p');
numerator=$(expr $powersave - $data)
numerator=$(expr $numerator \* 100)
if [ $numerator -lt 0 ] ; then
let numerator=0
fi
data=$(expr $numerator / $denominator)
echo "$gov_next efficiency: $data%"
fi
done
rm -f $FILENAME
+}
+if [ $# -lt 3 ]; then
echo "Usage: ./test.sh <cpus> <runtime> <sleeptime>"
echo "cpus: number of cpus in the CPU0's frequency domain"
echo "runtime: running time in ms per loop of the workload pattern"
echo "sleeptime: sleeping time in ms per loop of the workload pattern"
echo -e "\nExample: \n\"./test.sh 4 100 1000\" means \nCPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.\n"
exit
+fi
+echo "Frequency domain CPU0~CPU$(expr $1 - 1), run $2ms, sleep $3ms:"
+sleep 1
+PATH=$PATH:.
+set_calibration
+test_efficiency $1 $(expr $2 \* 1000) $(expr $3 \* 1000)
You create several temporary files while executing your script; you should clean them up before exiting.
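One way to handle the temporary-file cleanup is an EXIT trap. The sketch below is an assumption about how it could be done, not the author's method: `register_tmp`, `cleanup` and `TMPFILES` are hypothetical names, and the file list is space-separated, so it only works for paths without whitespace.

```shell
#!/bin/sh
# Space-separated list of files to delete on exit (paths must not
# contain whitespace). All names here are hypothetical, not from the patch.
TMPFILES=""

# register_tmp FILE: remember FILE for deletion.
register_tmp() {
    TMPFILES="$TMPFILES $1"
}

# cleanup: delete every registered file and reset the list.
cleanup() {
    for f in $TMPFILES; do
        rm -f "$f"
    done
    TMPFILES=""
}

# Run cleanup on normal exit; turn INT/TERM into an exit so the
# EXIT trap also fires when the test is interrupted.
trap cleanup EXIT
trap 'exit 1' INT TERM

# Example: a results file created with mktemp instead of $RANDOM$$.
results=$(mktemp)
register_tmp "$results"
```

The per-governor rt-app log files could be registered the same way once their names are known.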
Some governors, like ondemand or interactive, have parameters that modify their responsiveness. How can I set them before testing their efficiency?
As an example, the ondemand governor's efficiency moves from nearly 0% to nearly 100% on my chromebook2 (only the quad A15 cores were enabled for the test) if you change sampling_rate and sampling_down_factor when you test a short run: 100ms run, 1000ms sleep. The default configuration of a governor is not always the best one for the platform.
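Setting such tunables before a run could be sketched as below. Everything specific here is an assumption: `set_tunable` is a hypothetical helper, the directory holding the ondemand tunables differs between kernels and platforms (a global directory on some, a per-policy directory on others), and the values in the commented-out calls are examples rather than recommendations.

```shell
#!/bin/sh
# set_tunable DIR NAME VALUE
# Writes VALUE to DIR/NAME if that file exists and is writable.
# "set_tunable" is a hypothetical helper, not part of the patch.
set_tunable() {
    dir=$1
    name=$2
    value=$3
    if [ -w "$dir/$name" ]; then
        echo "$value" > "$dir/$name"
    else
        echo "Warning: $dir/$name is not writable, skipping" >&2
        return 1
    fi
}

# Assumed location of the ondemand tunables; check your platform,
# and override ONDEMAND_DIR if they live elsewhere.
ONDEMAND_DIR=${ONDEMAND_DIR:-/sys/devices/system/cpu/cpufreq/ondemand}

# Example values only (needs root and the ondemand governor selected):
# set_tunable "$ONDEMAND_DIR" sampling_rate 10000
# set_tunable "$ONDEMAND_DIR" sampling_down_factor 1
```

Keeping the directory as a parameter makes the helper usable for other governors whose tunables live in a different directory.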
Yes, thanks for the review.
I'm planning on changing test.sh to test only one governor at a time, so the user can set the proper governor parameters before running the test, like below:
Usage: ./test.sh <cpu> <runtime> <sleeptime> <loops> <governor>
cpu: cpu number on which you want to run the test
runtime: running time in ms per loop of the workload pattern
sleeptime: sleeping time in ms per loop of the workload pattern
loops: number of times the workload pattern is repeated
governor: CPUFreq governor you want to test
Could you put the governor parameter before the loops parameter and make the latter optional?
Example: "./test.sh 0 100 1000 10 ondemand" means
test ondemand on CPU0 with 10 loops of the "100ms run + 1000ms sleep" workload pattern.
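If the governor is moved before the loops argument and loops becomes optional, as requested in the review comment, the argument handling might look like the sketch below. `parse_args` and `DEFAULT_LOOPS` are hypothetical; the default of 5 is only borrowed from the loop count in the posted dvfs.json, and the ms to us conversion follows what the posted test.sh does.

```shell
#!/bin/sh
# Assumed default, taken from the "loop" : 5 setting in the posted dvfs.json.
DEFAULT_LOOPS=5

# parse_args CPU RUNTIME SLEEPTIME GOVERNOR [LOOPS]
# Sets cpu, runtime_us, sleeptime_us, governor and loops.
# "parse_args" is a hypothetical helper, not part of the patch.
parse_args() {
    if [ $# -lt 4 ] || [ $# -gt 5 ]; then
        echo "Usage: ./test.sh <cpu> <runtime> <sleeptime> <governor> [loops]" >&2
        return 1
    fi
    cpu=$1
    runtime_us=$(($2 * 1000))      # ms -> us, as the posted test.sh does
    sleeptime_us=$(($3 * 1000))
    governor=$4
    loops=${5:-$DEFAULT_LOOPS}     # optional fifth argument
}
```

With this ordering, "./test.sh 0 100 1000 ondemand" would use the default loop count and "./test.sh 0 100 1000 ondemand 10" would override it.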
-Xunlei
Regards, Vincent
-- 1.9.1