On 26 March 2015 at 18:38, Morten Rasmussen morten.rasmussen@arm.com wrote:
On Wed, Mar 25, 2015 at 06:08:42PM +0000, Vincent Guittot wrote:
On 25 March 2015 at 18:33, Peter Zijlstra peterz@infradead.org wrote:
On Tue, Mar 24, 2015 at 11:00:57AM +0100, Vincent Guittot wrote:
On 23 March 2015 at 14:19, Peter Zijlstra peterz@infradead.org wrote:
On Fri, Feb 27, 2015 at 04:54:07PM +0100, Vincent Guittot wrote:
	unsigned long scale_freq = arch_scale_freq_capacity(NULL, cpu);

	sa->running_avg_sum += delta_w * scale_freq >> SCHED_CAPACITY_SHIFT;
so the only thing that could be improved is somehow making this multiplication go away when the arch doesn't implement the function.
But I'm not sure how to do that without #ifdef.
Maybe a little something like so then... that should make the compiler get rid of those multiplications unless the arch needs them.
Yes, it removes the useless multiplication when it is not used by an arch. It also adds a constraint on the arch side, which has to define arch_scale_freq_capacity like below:
	#define arch_scale_freq_capacity xxx_arch_scale_freq_capacity

with xxx_arch_scale_freq_capacity being an architecture-specific function.
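For readers following along, here is a minimal user-space sketch of the pattern being discussed. It is not the actual patch: the `#ifndef` guard, the value used for SCHED_CAPACITY_SHIFT, and the helper name scale_delta() are assumptions made for illustration. The idea is that the default is a static inline returning a constant, so the compiler can fold the multiply and shift away unless an arch supplies its own function through the #define.

```c
#include <stdio.h>

/* Assumed values for this sketch; the kernel defines its own. */
#define SCHED_CAPACITY_SHIFT	10
#define SCHED_CAPACITY_SCALE	(1UL << SCHED_CAPACITY_SHIFT)

/*
 * An arch that wants frequency scaling would provide, before this point:
 *
 *	#define arch_scale_freq_capacity xxx_arch_scale_freq_capacity
 *
 * Without that define, the constant default below is used.
 */
#ifndef arch_scale_freq_capacity
static inline unsigned long arch_scale_freq_capacity(void *sd, int cpu)
{
	(void)sd;
	(void)cpu;
	return SCHED_CAPACITY_SCALE;
}
#endif

/*
 * Hypothetical helper mirroring the accumulation in the quoted code.
 * With the default above, x * 1024 >> 10 can be reduced at compile time.
 */
static unsigned long scale_delta(unsigned long delta_w, int cpu)
{
	unsigned long scale_freq = arch_scale_freq_capacity(NULL, cpu);

	return delta_w * scale_freq >> SCHED_CAPACITY_SHIFT;
}
```

With the default in place, scale_delta(1000, 0) returns 1000, i.e. the delta is left unscaled. If an arch forgets the #define while still providing its own function, the two definitions no longer silently coexist the way a weak/strong pair would, which is the compile-time warn/fail mentioned in the reply.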
Yeah, but it not being weak should make that a compile time warn/fail, which should be pretty easy to deal with.
If that sounds acceptable, I can update the patch with your proposal?
I'll stick it at the end; I just wanted to float the patch to see if people had better solutions.
OK. All the other methods that I have tried removed the optimization when the default arch_scale_freq_capacity was used.
Another potential solution is to stay with weak functions but move the multiplication and shift into the arch_scale_*() functions by passing the value we want to scale into the arch_scale_*() function. That way we can completely avoid the multiplication and shift in the default case (no arch_scale_*() implementations), which is better than what we have today.
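A rough sketch of that alternative, under stated assumptions: the function names arch_scale_freq_value() and example_arch_scale_freq_value(), the per-cpu table, and its values are all hypothetical and only illustrate the shape of the interface. The weak attribute is the GCC/Clang spelling. In a real tree the arch version would carry the same name as the weak default and replace it at link time; it has a different name here only so that both fit in one file.

```c
#include <stddef.h>

#define SCHED_CAPACITY_SHIFT	10	/* assumed value for this sketch */

/*
 * Weak default: the value to scale is passed in and returned unchanged,
 * so the default case does no multiplication and no shift.
 */
__attribute__((weak))
unsigned long arch_scale_freq_value(unsigned long value, int cpu)
{
	(void)cpu;
	return value;
}

/* Hypothetical per-cpu frequency scale factors (1024 == full speed). */
static const unsigned long example_freq_scale[] = { 1024, 512 };

/*
 * What an arch override could look like: the multiply and shift now
 * live inside the arch function instead of at every call site.
 */
unsigned long example_arch_scale_freq_value(unsigned long value, int cpu)
{
	return value * example_freq_scale[cpu] >> SCHED_CAPACITY_SHIFT;
}
```

With these made-up factors, the default returns 1000 for an input of 1000, while the example override returns 500 for cpu 1. The trade-off compared with the #define approach is that a function call remains in the default case, since the compiler cannot assume the weak definition is the one that ends up being linked.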
sched_rt_avg_update only uses the multiplication by arch_scale_freq_capacity, because the shift by SCHED_CAPACITY_SHIFT has been factored out into scale_rt_capacity.
The only downside is that for frequency invariance we need three arch_scale_freq_capacity() calls instead of two.