On 4/24/2015 7:50, Morten Rasmussen wrote:
On Mon, Apr 20, 2015 at 12:54:15PM +0100, Morten Rasmussen wrote:
On Fri, Apr 17, 2015 at 10:34:21PM +0100, Wu, Junjie wrote:
On 4/15/2015 22:29, Michael Turquette wrote:
From: Morten Rasmussen Morten.Rasmussen@arm.com
Architectures that don't have any other means for tracking cpu frequency changes need a callback from cpufreq to implement a scaling factor to enable scale-invariant per-entity load-tracking in the scheduler.
To compute the scale invariance correction factor the architecture would need to know both the max frequency and the current frequency. This patch defines weak functions for setting both from cpufreq.
Related architecture specific functions use weak function definitions. The same approach is followed here.
These callbacks can be used to implement frequency scaling of cpu capacity later.
Cc: Rafael J. Wysocki rjw@rjwysocki.net
Cc: Viresh Kumar viresh.kumar@linaro.org
Signed-off-by: Morten Rasmussen morten.rasmussen@arm.com
 drivers/cpufreq/cpufreq.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 28e59a4..3c6398a 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -280,6 +280,10 @@ static void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
 #endif
 }

+void __weak arch_scale_set_curr_freq(int cpu, unsigned long freq) {}
+
+void __weak arch_scale_set_max_freq(int cpu, unsigned long freq) {}
+
 static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 		struct cpufreq_freqs *freqs, unsigned int state)
 {
@@ -317,6 +321,7 @@ static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 		pr_debug("FREQ: %lu - CPU: %lu\n",
 			 (unsigned long)freqs->new, (unsigned long)freqs->cpu);
 		trace_cpu_frequency(freqs->new, freqs->cpu);
+		arch_scale_set_curr_freq(freqs->cpu, freqs->new);
 		srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
 				CPUFREQ_POSTCHANGE, freqs);
 		if (likely(policy) && likely(policy->cpu == freqs->cpu))
@@ -2148,7 +2153,7 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 				struct cpufreq_policy *new_policy)
 {
 	struct cpufreq_governor *old_gov;
-	int ret;
+	int ret, cpu;

 	pr_debug("setting new policy for CPU %u: %u - %u kHz\n",
 		 new_policy->cpu, new_policy->min, new_policy->max);
@@ -2186,6 +2191,12 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	policy->min = new_policy->min;
 	policy->max = new_policy->max;

+	for_each_cpu(cpu, policy->cpus) {
+		arch_scale_set_max_freq(cpu, policy->max);
+		/* Workaround for corner cases where notifiers don't fire */
+		arch_scale_set_curr_freq(cpu, policy->cur);
+	}
+
 	pr_debug("new min and max freqs are %u - %u kHz\n",
 		 policy->min, policy->max);
Just curious why we need these new callbacks? I don't think they are providing any new information not already given by existing cpufreq policy and transition notifications.
Right. They are providing the same information I think. I went with the __weak function callbacks as this method is used for the existing scaling functions used by the scheduler (arch_scale_{cpu,freq}_capacity()). However, that is changing now so I should give it a try and see if we can use the notifiers instead. There might be some initialization problems though, as the notifier has to be registered before cpufreq initializes the first (default) policy for us to know what the max frequency is.
I think the patch below gives us what we need using the cpufreq notifiers instead. It is just a single patch, with no need to touch cpufreq, so it replaces both patch 1 and 2.
I haven't found any initialization problems on TC2. I'm not 100% sure that policy->cur is set at initialization for all cpufreq drivers. Drivers are not required to have a get-function, which seems to be required for policy->cur to be set.
Even if the cpufreq driver's init doesn't fill in policy->cur, cpufreq_init_policy() would call cpufreq_set_policy(), which sets the frequency. You would receive both policy and transition notifications, where the POSTCHANGE would have the right frequency information. I think your patch would be fine.
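The ordering argument above can be checked with a small userspace model. It is only a sketch under simplifying assumptions: a single CPU, plain variables instead of per-CPU atomics, and hypothetical function names on_policy_notify(), on_postchange() and read_freq_capacity() standing in for the two notifier callbacks and arch_scale_freq_capacity() from the patch. SCHED_CAPACITY_SHIFT is assumed to be 10.

```c
/* Assumed fixed-point scale: 1 << 10 = 1024 means "running at max frequency". */
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

static unsigned long max_freq;		/* cached by the policy notification */
static unsigned long freq_capacity;	/* 0 until a factor has been computed */

static void update_capacity(unsigned long curr, unsigned long max)
{
	if (!max)
		return;		/* max not known yet: leave the factor unset */

	freq_capacity = (curr << SCHED_CAPACITY_SHIFT) / max;
}

/* Model of the policy notification; cur may still be 0 at this point. */
static void on_policy_notify(unsigned long cur, unsigned long max)
{
	update_capacity(cur, max);
	max_freq = max;
}

/* Model of the POSTCHANGE transition, which carries the frequency just set. */
static void on_postchange(unsigned long new_freq)
{
	update_capacity(new_freq, max_freq);
}

/* Reader side: report full capacity while no factor is available. */
static unsigned long read_freq_capacity(void)
{
	return freq_capacity ? freq_capacity : SCHED_CAPACITY_SCALE;
}
```

In this model, a policy notification with cur = 0 and max = 1200000 kHz leaves the reader at the 1024 fallback, and a following POSTCHANGE at 600000 kHz moves it to 512. A transition that arrives before any policy notification is ignored, which is the case the registration-order concern is about.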
I haven't tested with hotplug yet.
Mike: I have pushed the patch to linux-arm.org as well.
Morten
From 0fe329c77782acde0290954779a2ee17920a3dad Mon Sep 17 00:00:00 2001
From: Morten Rasmussen Morten.Rasmussen@arm.com
Date: Mon, 22 Sep 2014 17:24:03 +0100
Subject: [PATCH] arm: Frequency invariant scheduler load-tracking support
Implements arch-specific function to provide the scheduler with a frequency scaling correction factor for more accurate load-tracking. The factor is:
(current_freq(cpu) << SCHED_CAPACITY_SHIFT) / max_freq(cpu)
This implementation only provides frequency invariance. No micro-architecture invariance yet.
Cc: Russell King linux@arm.linux.org.uk
Signed-off-by: Morten Rasmussen morten.rasmussen@arm.com
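As a worked illustration of the factor described in the commit message (a userspace sketch, not part of the patch): assuming SCHED_CAPACITY_SHIFT is 10, a CPU running at half its maximum frequency gets a factor of 512 out of 1024. The shift has to be applied before the division, which is how scale_freq_capacity() in the diff computes it. The helper name freq_scale_factor() and the frequencies used with it are made up for the example.

```c
/* Assumed to match the kernel's fixed-point capacity scale (1 << 10 = 1024). */
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/*
 * Hypothetical helper using the same arithmetic as scale_freq_capacity():
 * shift first, then divide. When max is unknown (0) it returns full
 * capacity; the patch itself skips the update in that case instead.
 */
static unsigned long freq_scale_factor(unsigned long curr_khz,
				       unsigned long max_khz)
{
	if (!max_khz)
		return SCHED_CAPACITY_SCALE;

	return (curr_khz << SCHED_CAPACITY_SHIFT) / max_khz;
}
```

For example, freq_scale_factor(600000, 1200000) yields 512 and freq_scale_factor(300000, 1200000) yields 256. Dividing before shifting would truncate both to 0, since integer division of curr by a larger max is 0.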
 arch/arm/include/asm/topology.h |  7 ++++++
 arch/arm/kernel/smp.c           | 53 +++++++++++++++++++++++++++++++++++++++--
 arch/arm/kernel/topology.c      | 17 +++++++++++++
 3 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
index 2fe85ff..4b985dc 100644
--- a/arch/arm/include/asm/topology.h
+++ b/arch/arm/include/asm/topology.h
@@ -24,6 +24,13 @@ void init_cpu_topology(void);
 void store_cpu_topology(unsigned int cpuid);
 const struct cpumask *cpu_coregroup_mask(int cpu);

+#define arch_scale_freq_capacity arm_arch_scale_freq_capacity
+struct sched_domain;
+extern
+unsigned long arm_arch_scale_freq_capacity(struct sched_domain *sd, int cpu);
+
+DECLARE_PER_CPU(atomic_long_t, cpu_freq_capacity);
+
 #else

 static inline void init_cpu_topology(void) { }
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 86ef244..297ce1b 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -672,12 +672,34 @@ static DEFINE_PER_CPU(unsigned long, l_p_j_ref);
 static DEFINE_PER_CPU(unsigned long, l_p_j_ref_freq);
 static unsigned long global_l_p_j_ref;
 static unsigned long global_l_p_j_ref_freq;
+static DEFINE_PER_CPU(atomic_long_t, cpu_max_freq);
+DEFINE_PER_CPU(atomic_long_t, cpu_freq_capacity);
+
+/*
+ * Scheduler load-tracking scale-invariance
+ *
+ * Provides the scheduler with a scale-invariance correction factor that
+ * compensates for frequency scaling through arch_scale_freq_capacity()
+ * (implemented in topology.c).
+ */
+static inline
+void scale_freq_capacity(int cpu, unsigned long curr, unsigned long max)
+{
+	unsigned long capacity;
+
+	if (!max)
+		return;
+
+	capacity = (curr << SCHED_CAPACITY_SHIFT) / max;
+	atomic_long_set(&per_cpu(cpu_freq_capacity, cpu), capacity);
+}

 static int cpufreq_callback(struct notifier_block *nb,
 					unsigned long val, void *data)
 {
 	struct cpufreq_freqs *freq = data;
 	int cpu = freq->cpu;
+	unsigned long max = atomic_long_read(&per_cpu(cpu_max_freq, cpu));

 	if (freq->flags & CPUFREQ_CONST_LOOPS)
 		return NOTIFY_OK;
@@ -702,6 +724,9 @@ static int cpufreq_callback(struct notifier_block *nb,
 						per_cpu(l_p_j_ref_freq, cpu),
 						freq->new);
 	}
+
+	scale_freq_capacity(cpu, freq->new, max);
+
 	return NOTIFY_OK;
 }

@@ -709,11 +734,35 @@ static struct notifier_block cpufreq_notifier = {
 	.notifier_call  = cpufreq_callback,
 };
+static int cpufreq_policy_callback(struct notifier_block *nb,
+					unsigned long val, void *data)
+{
+	struct cpufreq_policy *policy = data;
+	int i;
+
+	for_each_cpu(i, policy->cpus) {
+		scale_freq_capacity(i, policy->cur, policy->max);
+		atomic_long_set(&per_cpu(cpu_max_freq, i), policy->max);
+	}
+
+	return NOTIFY_OK;
+}
You need to skip all events other than CPUFREQ_NOTIFY. CPUFREQ_ADJUST and CPUFREQ_INCOMPATIBLE are meant for drivers to change policy->min/max, and it is not uncommon to have policy->max adjusted by kernel thermal drivers. CPUFREQ_NOTIFY carries the final result of the new limits.
- Junjie
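One way to act on this comment is an early return for every event other than CPUFREQ_NOTIFY. The sketch below is a userspace mock-up, not the revised patch: struct mock_policy, the cpu_max_freq array, NR_CPUS, NOTIFY_OK and the event values are stand-ins chosen for illustration (the real definitions live in the kernel headers), and the mock policy covers a contiguous CPU range instead of a cpumask.

```c
/* Stand-ins for the policy notifier events; the values are illustrative. */
enum { CPUFREQ_ADJUST, CPUFREQ_INCOMPATIBLE, CPUFREQ_NOTIFY };

#define NOTIFY_OK	1	/* stand-in return value */
#define NR_CPUS		4	/* arbitrary size for the model */

struct mock_policy {
	int first_cpu, last_cpu;	/* simplified stand-in for policy->cpus */
	unsigned long cur, max;		/* kHz */
};

static unsigned long cpu_max_freq[NR_CPUS];

static int cpufreq_policy_callback(unsigned long event, void *data)
{
	struct mock_policy *policy = data;
	int cpu;

	/*
	 * ADJUST and INCOMPATIBLE are rounds in which limits may still be
	 * changed; only NOTIFY carries the final limits, so ignore the rest.
	 */
	if (event != CPUFREQ_NOTIFY)
		return NOTIFY_OK;

	for (cpu = policy->first_cpu; cpu <= policy->last_cpu; cpu++)
		cpu_max_freq[cpu] = policy->max;

	return NOTIFY_OK;
}
```

With this filter, an intermediate policy->max seen during CPUFREQ_ADJUST (for instance before a thermal driver has lowered it) never reaches the cached maximum; only the value delivered with CPUFREQ_NOTIFY does.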
+static struct notifier_block cpufreq_policy_notifier = {
+	.notifier_call	= cpufreq_policy_callback,
+};
+
 static int __init register_cpufreq_notifier(void)
 {
-	return cpufreq_register_notifier(&cpufreq_notifier,
+	int ret;
+
+	ret = cpufreq_register_notifier(&cpufreq_notifier,
 						CPUFREQ_TRANSITION_NOTIFIER);
+	if (ret)
+		return ret;
+
+	return cpufreq_register_notifier(&cpufreq_policy_notifier,
+						CPUFREQ_POLICY_NOTIFIER);
 }
 core_initcall(register_cpufreq_notifier);

 #endif
diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 08b7847..9c09e6e 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -169,6 +169,23 @@ static void update_cpu_capacity(unsigned int cpu)
 		cpu, arch_scale_cpu_capacity(NULL, cpu));
 }

+/*
+ * Scheduler load-tracking scale-invariance
+ *
+ * Provides the scheduler with a scale-invariance correction factor that
+ * compensates for frequency scaling (arch_scale_freq_capacity()). The scaling
+ * factor is updated in smp.c
+ */
+unsigned long arm_arch_scale_freq_capacity(struct sched_domain *sd, int cpu)
+{
+	unsigned long curr = atomic_long_read(&per_cpu(cpu_freq_capacity, cpu));
+
+	if (!curr)
+		return SCHED_CAPACITY_SCALE;
+
+	return curr;
+}
+
 #else
 static inline void parse_dt_topology(void) {}
 static inline void update_cpu_capacity(unsigned int cpuid) {}