[RESEND PATCH v2] doc: measure the efficiency of cpufreq governors - Sched-tools

18 Jun 2015

From: Xunlei Pang <pang.xunlei@linaro.org>
DVFS adds latency to the execution of a task because of the time the
governor takes to decide to move to max freq. We need to measure this
latency and check that the governor stays in an acceptable range.

When workgen runs a json file, a log file is created for each thread.
This log file records the number of loops that have been executed and
the duration for executing these loops (per phase). We can use these
figures to evaluate the latency that is added by a cpufreq governor
and its "performance efficiency".

We use the run+sleep pattern to do the measurement. For the run time per
loop, the performance governor should run the expected duration as the
CPU stays at max freq. At the opposite end, the powersave governor will
give us the longest duration (as it stays at the lowest OPP). Other
governors will be somewhere between these two durations, as they will use
several OPPs and will go back to max frequency after a defined duration
which depends on their monitoring period.

The formula:

    duration of powersave gov - duration of the gov
--------------------------------------------------------  x 100%
 duration of powersave gov - duration of performance gov

will give the efficiency of the governor. 100% means as efficient as the
performance governor and 0% means as efficient as the powersave governor.
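
As a worked illustration of the formula, here is a minimal sketch in
POSIX sh. The function name efficiency and the durations passed to it
are made up for the example and are not measurements; the scripts in
this patch use expr rather than $(( )), and clamp negative results to
0% in the same way.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: integer efficiency in %.
# $1: duration with powersave, $2: duration with performance,
# $3: duration with the governor under test (all in the same unit).
efficiency() {
	powersave=$1
	performance=$2
	gov=$3

	denominator=$((powersave - performance))
	if [ "$denominator" -le 0 ]; then
		echo "powersave must be slower than performance" >&2
		return 1
	fi

	numerator=$(( (powersave - gov) * 100 ))
	# A governor slower than powersave is reported as 0%.
	if [ "$numerator" -lt 0 ]; then
		numerator=0
	fi

	echo $((numerator / denominator))
}

# Illustrative durations in us, chosen only to keep the arithmetic simple.
efficiency 400000 100000 100000   # prints 100
efficiency 400000 100000 250000   # prints 50
efficiency 400000 100000 400000   # prints 0
```

A governor that needs 250000us per run phase sits exactly halfway
between the two reference governors, hence 50%.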
This patch offers json files and shell scripts to do the measurement.

Usage: ./test.sh <cpus> <runtime> <sleeptime>
    cpus: number of cpus in the CPU0's frequency domain
    runtime: running time in ms per loop of the workload pattern
    sleeptime: sleeping time in ms per loop of the workload pattern

Example:
    "./test.sh 4 100 1000" means
    CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.

Test result on my machine:
    ~#./test.sh 4 100 1000
    Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
    powersave efficiency: 0%
    performance efficiency: 100%
    conservative efficiency: 28%
    ondemand efficiency: 95%

NOTE: Make sure the "sed", "cut", "grep", "rt-app", etc. tools are
installed on your test machine, and run the script with root privileges.
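
The requirements in the note can be checked before running the test.
The following is only a sketch of such a pre-flight check; the function
name missing_tools is hypothetical and nothing like it is included in
this patch.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of the patch.
# Prints the name of every tool in the argument list that is not in PATH.
missing_tools() {
	for tool in "$@"; do
		command -v "$tool" > /dev/null 2>&1 || echo "$tool"
	done
}

missing=$(missing_tools sed cut grep rt-app)
if [ -n "$missing" ]; then
	echo "Missing tools:" $missing
fi

# id -u prints 0 when running as root.
if [ "$(id -u)" -ne 0 ]; then
	echo "Not running as root; writing scaling_governor will likely fail."
fi
```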
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
---
 doc/examples/cpufreq_governor_efficiency/README    | 54 ++++++++++++++
 .../cpufreq_governor_efficiency/calibration.json   | 27 +++++++
 .../cpufreq_governor_efficiency/calibration.sh     | 11 +++
 doc/examples/cpufreq_governor_efficiency/dvfs.json | 27 +++++++
 doc/examples/cpufreq_governor_efficiency/dvfs.sh   | 38 ++++++++++
 doc/examples/cpufreq_governor_efficiency/test.sh   | 82 ++++++++++++++++++++++
 6 files changed, 239 insertions(+)
 create mode 100644 doc/examples/cpufreq_governor_efficiency/README
 create mode 100644 doc/examples/cpufreq_governor_efficiency/calibration.json
 create mode 100755 doc/examples/cpufreq_governor_efficiency/calibration.sh
 create mode 100644 doc/examples/cpufreq_governor_efficiency/dvfs.json
 create mode 100755 doc/examples/cpufreq_governor_efficiency/dvfs.sh
 create mode 100755 doc/examples/cpufreq_governor_efficiency/test.sh

diff --git a/doc/examples/cpufreq_governor_efficiency/README b/doc/examples/cpufreq_governor_efficiency/README
new file mode 100644
index 0000000..cc8efe1
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/README
@@ -0,0 +1,54 @@
+Measure the efficiency of cpufreq governors using rt-app
+
+BACKGROUND:
+    DVFS adds latency to the execution of a task because of the time the
+    governor takes to decide to move to max freq. We need to measure this
+    latency and check that the governor stays in an acceptable range.
+
+    When workgen runs a json file, a log file is created for each thread.
+    This log file records the number of loops that have been executed and
+    the duration for executing these loops (per phase). We can use these
+    figures to evaluate the latency that is added by a cpufreq governor
+    and its "performance efficiency".
+
+    We use the run+sleep pattern to do the measurement. For the run time per
+    loop, the performance governor should run the expected duration as the
+    CPU stays at max freq. At the opposite end, the powersave governor will
+    give us the longest duration (as it stays at the lowest OPP). Other
+    governors will be somewhere between these two durations, as they will use
+    several OPPs and will go back to max frequency after a defined duration
+    which depends on their monitoring period.
+
+    The formula:
+
+        duration of powersave gov - duration of the gov
+    --------------------------------------------------------  x 100%
+     duration of powersave gov - duration of performance gov
+
+    will give the efficiency of the governor. 100% means as efficient as the
+    performance governor and 0% means as efficient as the powersave governor.
+
+    This test offers json files and shell scripts to do the measurement.
+
+USAGE:
+    ./test.sh <cpus> <runtime> <sleeptime>
+    cpus: number of cpus in the CPU0's frequency domain
+    runtime: running time in ms per loop of the workload pattern
+    sleeptime: sleeping time in ms per loop of the workload pattern
+
+Example:
+    "./test.sh 4 100 1000" means
+    CPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.
+
+    test result on an Intel machine:
+    ~#./test.sh 4 100 1000
+    Frequency domain CPU0~CPU3, run 100ms, sleep 1000ms:
+    powersave efficiency: 0%
+    performance efficiency: 100%
+    conservative efficiency: 28%
+    ondemand efficiency: 95%
+
+NOTE:
+    Make sure the "sed", "cut", "grep", "rt-app", etc. tools are installed on
+    your test machine, and run the script with root privileges.
+
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.json b/doc/examples/cpufreq_governor_efficiency/calibration.json
new file mode 100644
index 0000000..4377990
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.json
@@ -0,0 +1,27 @@
+{
+	"tasks" : {
+		"thread" : {
+			"instance" : 1,
+			"cpus" : [0],
+			"loop" : 1,
+			"phases" : {
+				"run" : {
+					"loop" : 1,
+					"run" : 200000,
+				},
+				"sleep" : {
+					"loop" : 1,
+					"sleep" : 200000,
+				}
+			}
+		}
+	},
+	"global" : {
+		"default_policy" : "SCHED_FIFO",
+		"calibration" : "CPU0",
+		"lock_pages" : true,
+		"ftrace" : true,
+		"logdir" : "./",
+	}
+}
+
diff --git a/doc/examples/cpufreq_governor_efficiency/calibration.sh b/doc/examples/cpufreq_governor_efficiency/calibration.sh
new file mode 100755
index 0000000..d10e644
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/calibration.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+set -e
+
+echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
+
+sleep 1
+
+pLoad=$(rt-app calibration.json 2>&1 |grep pLoad |sed 's/.*= \(.*\)ns.*/\1/')
+sed 's/"calibration" : .*,/"calibration" : '$pLoad',/' -i dvfs.json
+
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.json b/doc/examples/cpufreq_governor_efficiency/dvfs.json
new file mode 100644
index 0000000..b413156
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.json
@@ -0,0 +1,27 @@
+{
+	"tasks" : {
+		"thread" : {
+			"instance" : 1,
+			"cpus" : [0],
+			"loop" : 5,
+			"phases" : {
+				"running" : {
+					"loop" : 1,
+					"run" : 100000,
+				},
+				"sleeping" : {
+					"loop" : 1,
+					"sleep" : 1000000,
+				}
+			}
+		}
+	},
+	"global" : {
+		"default_policy" : "SCHED_OTHER",
+		"calibration" : 90,
+		"lock_pages" : true,
+		"ftrace" : true,
+		"logdir" : "./",
+	}
+}
+
diff --git a/doc/examples/cpufreq_governor_efficiency/dvfs.sh b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
new file mode 100755
index 0000000..8591fc7
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/dvfs.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+#echo $1 $2 $3
+set -e
+
+if [ $1 ] && [ $2 ] ; then
+	for i in $(seq 0 1 $(expr $2 - 1)); do
+		echo $1 > /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
+		#cat /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
+	done
+
+	sleep 3
+fi
+
+if [ $3 ] ; then
+	sed 's/"run" : .*,/"run" : '$3',/' -i dvfs.json
+fi
+
+if [ $4 ] ; then
+	sed 's/"sleep" : .*,/"sleep" : '$4',/' -i dvfs.json
+fi
+
+#cat dvfs.json
+
+rt-app dvfs.json 2> /dev/null
+
+if [ $1 ] ; then
+	mv -f rt-app-thread-0.log rt-app_$1_run$3us_sleep$4us.log
+
+	sum=0
+	for i in $(cat rt-app_$1_run$3us_sleep$4us.log | sed 'n;d' | sed '1d' |cut -f 3); do
+		sum=$(expr $sum + $i)
+	done
+	sum=$(expr $sum / 5)	# mean over the 5 iterations ("loop" : 5 in dvfs.json)
+	echo $sum
+	rm -f rt-app_$1_run$3us_sleep$4us.log
+fi
+
diff --git a/doc/examples/cpufreq_governor_efficiency/test.sh b/doc/examples/cpufreq_governor_efficiency/test.sh
new file mode 100755
index 0000000..d72fc6a
--- /dev/null
+++ b/doc/examples/cpufreq_governor_efficiency/test.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+
+set -e
+
+set_calibration() {
+	calibration.sh
+}
+
+test_efficiency() {
+
+	FILENAME="results_$RANDOM$$.txt"
+
+	if [ -e /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors ]; then
+		for i in $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors); do
+			export gov_$i=$(echo $i)
+		done
+	else
+		echo "cpufreq is not available!"
+		exit 1
+	fi
+
+	if [ ! $gov_performance ] ; then
+		echo "Can't find performance governor!"
+		exit 1
+	fi
+
+	if [ ! $gov_powersave ] ; then
+		echo "Can't find powersave governor!"
+		exit 1
+	fi
+
+	# Get powersave data
+	dvfs.sh powersave $1 $2 $3 > $FILENAME
+	powersave=$(cat $FILENAME |sed -n '1p')
+
+	# Get performance data
+	dvfs.sh performance $1 $2 $3 > $FILENAME
+	performance=$(cat $FILENAME |sed -n '1p')
+
+	if [ $performance -ge $powersave ] ; then
+		echo "Error! Probably not all the cpus in the same frequency domain were given"
+		exit 1
+	fi
+
+	denominator=$(expr $powersave - $performance)
+	echo "powersave efficiency: 0%"
+	echo "performance efficiency: 100%"
+
+	# Calculate the data for the other governors
+	for gov_next in $gov_conservative $gov_ondemand $gov_cfs; do
+		if [ "$gov_next" != "" ] ; then
+			dvfs.sh $gov_next $1 $2 $3 > $FILENAME
+			data=$(cat $FILENAME |sed -n '1p');
+			numerator=$(expr $powersave - $data)
+			numerator=$(expr $numerator \* 100)
+			if [ $numerator -lt 0 ] ; then
+				numerator=0
+			fi
+			data=$(expr $numerator / $denominator)
+			echo "$gov_next efficiency: $data%"
+		fi
+	done
+
+	rm -f $FILENAME
+}
+
+if [ $# -lt 3 ]; then
+	echo "Usage: ./test.sh <cpus> <runtime> <sleeptime>"
+	echo "cpus: number of cpus in the CPU0's frequency domain"
+	echo "runtime: running time in ms per loop of the workload pattern"
+	echo "sleeptime: sleeping time in ms per loop of the workload pattern"
+	printf '\nExample:\n"./test.sh 4 100 1000" means\nCPU0~CPU3 sharing frequency, "100ms run + 1000ms sleep" workload pattern.\n\n'
+	exit 1
+fi
+
+echo "Frequency domain CPU0~CPU$(expr $1 - 1), run $2ms, sleep $3ms:"
+
+sleep 1
+PATH=$PATH:.
+set_calibration
+test_efficiency $1 $(expr $2 \* 1000) $(expr $3 \* 1000)
+
-- 
1.9.1