Re: [Eas-dev] [RFC internal v2 1/4] arm: Frequency invariant scheduler load-tracking support

29 Apr 2015


      On Wed, Apr 29, 2015 at 03:12:38PM +0100, Daniel Lezcano wrote:
...
On 04/29/2015 03:33 PM, Vincent Guittot wrote:
...
On 29 April 2015 at 15:27, Daniel Lezcano daniel.lezcano@linaro.org wrote:
...
On 04/29/2015 03:17 PM, Vincent Guittot wrote:
...
On 29 April 2015 at 15:10, Daniel Lezcano daniel.lezcano@linaro.org
wrote:
...
On 04/29/2015 02:34 PM, Vincent Guittot wrote:
...
On 29 April 2015 at 14:22, Daniel Lezcano daniel.lezcano@linaro.org
wrote:
>
>
> On 04/29/2015 01:15 PM, Vincent Guittot wrote:
>>
>>
>>
>> On 29 April 2015 at 12:34, Daniel Lezcano daniel.lezcano@linaro.org
>> wrote:
>>>
>>>
>>>
>>> On 04/27/2015 09:46 AM, Michael Turquette wrote:
>>>>
>>>>
>>>>
>>>>
>>>> From: Morten Rasmussen Morten.Rasmussen@arm.com
>>>>
>>>> Implements arch-specific function to provide the scheduler with a
>>>> frequency scaling correction factor for more accurate load-tracking.
>>>> The
>>>> factor is:
>>>>
>>>>             current_freq(cpu) << SCHED_CAPACITY_SHIFT / max_freq(cpu)
>>>>
>>>> This implementation only provides frequency invariance. No
>>>> micro-architecture invariance yet.
>>>>
>>>> Signed-off-by: Morten Rasmussen morten.rasmussen@arm.com
>>>> ---
>>>> changes since internal v1:
>>>> * replaced two commits from eas v3 with this new one from Morten
>>>>
>>>>       arch/arm/include/asm/topology.h |  7 ++++++
>>>>       arch/arm/kernel/smp.c           | 53
>>>> +++++++++++++++++++++++++++++++++++++++--
>>>>       arch/arm/kernel/topology.c      | 17 +++++++++++++
>>>>       3 files changed, 75 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/arm/include/asm/topology.h
>>>> b/arch/arm/include/asm/topology.h
>>>> index 2fe85ff..4b985dc 100644
>>>> --- a/arch/arm/include/asm/topology.h
>>>> +++ b/arch/arm/include/asm/topology.h
>>>> @@ -24,6 +24,13 @@ void init_cpu_topology(void);
>>>>       void store_cpu_topology(unsigned int cpuid);
>>>>       const struct cpumask *cpu_coregroup_mask(int cpu);
>>>>
>>>> +#define arch_scale_freq_capacity arm_arch_scale_freq_capacity
>>>
>>>
>>>
>>>
>>>
>>> What is for this macro ?
>>
>>
>>
>>
>> This is used to doesn't add any useless computation in the hot path
>> when arch_scale_freq_capacity is not used. This has been asked by
>> peter
>
>
>
>
> What is the difference with having a dummy empty function with a 'weak'
> attribute (which is how it is done currently in the kernel) ?
You can have a look at the thread for the full discussion:
https://lkml.org/lkml/2015/3/24/113
Thanks for the pointer. The link seems down for the moment, but I was
able
to dig in the different folder and find the thread in my mailbox (that
would
be easier if I have been cc'ed).
It is not clear for the me why the macro is better than 'weak'. It sounds
like using the 'weak' attribute is the best thing to do, no ?
the weak function adds the useless sequence:
value *= 1024
value >>=10
whereas the macro doesn't add any additional instruction
So if the macro is not defined for the architecture, the compilation will
fail.
no there is a test in sched.h
#ifndef arch_scale_freq_capacity
static __always_inline
unsigned long arch_scale_freq_capacity(struct sched_domain *sd, int cpu)
{
return SCHED_CAPACITY_SCALE;
}
#endif
Ah, ok I didn't see this one.
...
...
I don't see why the function below is not right:
void __weak arch_scale_freq_capacity(struct sched_domain *sd, int cpu)
{
         return SCHED_CAPACITY_SCALE;
}
AFAIU, it doesn't ensure that the function will be inlined and
optimized by the compiler.
AFAIU, the __weak attribute guarantees that the function cannot be
inlined at all. The __weak symbols are resolved at link time, not at
compile time, hence they cannot be inlined. So Daniel's example above,
which is pretty much what we used to have prior to 4.1, will always lead
to a function call. In addition to that, the compiler has no chance of
figuring out that it always returns a constant which is paired with a
shift or division in every call site in fair.c which could all have been
optimized out.
The _weak trick was fine while we didn't call arch_scale_freq_capacity()
very often. Now that we call it all the time, we have to minimize the
overhead and the #define trick Peter came up with seems to be the only
feasible way to get zero overhead when the architecture doesn't care
about frequency scaling of capacity.
Morten

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

Re: [Eas-dev] [RFC internal v2 1/4] arm: Frequency invariant scheduler load-tracking support