On 03/09/16 00:27, markivx@codeaurora.org wrote:
From: Vikram Mulukutla <markivx@codeaurora.org>
Translating utilization to frequency may not be a simple operation, since on some architectures certain frequencies represent "boost" frequencies that may allow hardware to boost frequency beyond what is represented in software. For example, Intel x86 machines have a max frequency that is only 1MHz greater than the next highest frequency in cpufreq tables, but can provide 200MHz more capacity depending on the number of non-idle CPUs.
This is a temporary/hack patch to use a translation table in cpufreq_schedutil to translate scheduler utilization to the next_freq value in get_next_freq. The capacity values in the table are calculated by running appropriate workloads (like sysbench) at each P-state.
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
 include/linux/sched/sysctl.h     |  1 +
 kernel/sched/cpufreq_schedutil.c | 37 ++++++++++++++++++++++++++++++++++++-
 kernel/sysctl.c                  |  9 +++++++++
 3 files changed, 46 insertions(+), 1 deletion(-)
diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 7007815..3b2dac1 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -32,6 +32,7 @@ extern unsigned int sysctl_numa_balancing_scan_period_min;
 extern unsigned int sysctl_numa_balancing_scan_period_max;
 extern unsigned int sysctl_numa_balancing_scan_size;
 extern unsigned int sysctl_sched_use_walt_metrics;
+extern unsigned int sysctl_sched_use_cap_table;

 #ifdef CONFIG_SCHED_DEBUG
 extern unsigned int sysctl_sched_migration_cost;
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 2eef34d..ef688216 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -107,6 +107,27 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
 	}
 }

+struct util_freq {
+	unsigned int util;
+	unsigned long freq;
+};
+static struct util_freq cap_table[] = {
+	{589, 3401000},
Another thing: the util of a single hw-thread on a two-threaded core is 589, but cpu_efficiency returns 1024. There is an ongoing discussion about this issue on LKML:
https://lkml.org/lkml/2016/8/18/993
Most of that discussion is about load balancing, though.
+	{526, 3400000},
+	{494, 3200000},
+	{463, 3000000},
+	{433, 2800000},
+	{401, 2600000},
+	{362, 2400000},
+	{339, 2200000},
+	{308, 2000000},
+	{276, 1800000},
+	{245, 1600000},
+};
This probably only makes sense on a certain type of recent x86 platform.
My i7-4750HQ gives me:
2001000 2000000 1900000 1800000 1700000 1600000 1500000 1400000 1300000 1200000 1100000 1000000 900000 800000
I ran with 'intel_pstate=disable', which I guess is worth mentioning as well.
+unsigned int sysctl_sched_use_cap_table;
Requires CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y. Maybe worth mentioning in the patch header since it's not part of make defconfig.
[...]
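To make the table semantics concrete, here is a minimal userspace sketch of how a utilization value might be translated through such a table. It is not the code from the patch (that part is trimmed above). The helper name util_to_freq(), the ascending ordering of the entries, and the "lowest frequency whose measured capacity covers the request" policy are all assumptions made for illustration; only the struct layout and the util/frequency pairs come from the quoted hunk.

```c
#include <stddef.h>

struct util_freq {
	unsigned int util;	/* measured capacity at this P-state */
	unsigned long freq;	/* frequency in kHz, as in cpufreq tables */
};

/*
 * Values copied from the quoted patch, reordered ascending (an assumption
 * of this sketch) so that a simple forward scan finds the first match.
 */
static const struct util_freq cap_table[] = {
	{245, 1600000},
	{276, 1800000},
	{308, 2000000},
	{339, 2200000},
	{362, 2400000},
	{401, 2600000},
	{433, 2800000},
	{463, 3000000},
	{494, 3200000},
	{526, 3400000},
	{589, 3401000},
};

#define CAP_TABLE_SIZE (sizeof(cap_table) / sizeof(cap_table[0]))

/*
 * Hypothetical helper: return the lowest table frequency whose measured
 * capacity is at least 'util'. Requests above the largest entry are
 * clamped to the highest (boost) frequency.
 */
static unsigned long util_to_freq(unsigned int util)
{
	size_t i;

	for (i = 0; i < CAP_TABLE_SIZE; i++) {
		if (cap_table[i].util >= util)
			return cap_table[i].freq;
	}

	return cap_table[CAP_TABLE_SIZE - 1].freq;
}
```

With these numbers, a request of 526 still fits the 3400000 entry, while 527 already selects the 3401000 boost entry. That is the non-linearity the commit message describes: a 1MHz step in the table covers a much larger step in capacity.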