Plumbers: Tweaking scheduler policy micro-conf RFP
Pantelis Antoniou
panto at antoniou-consulting.com
Fri May 18 17:02:39 UTC 2012
On May 18, 2012, at 7:36 PM, Peter Zijlstra wrote:
> On Fri, 2012-05-18 at 17:18 +0100, Morten Rasmussen wrote:
>
>>> one knob: sched_balance_policy with tri-state {performance, power, auto}
>>
>> Interesting. What would the power policy look like? Would performance
>> and power be the two extremes of the power/performance trade-off?
>
> Performance is basically what we do now, power I'll leave up to whomever
> wants to implement it. My only concern is that the code is pretty and I
> can actually understand it.
>
> Well, and that 'everybody' can agree on it :-)
>
> One thing to keep in mind though is that the goal shouldn't be to make
> it the best power aware scheduler for your platform of interest but to
> make a reasonably coherent framework that provides sufficient power
> awareness for all our platforms.
>
> I much prefer simplicity and robustness over the last 10% at this point.
>
>> In that case I would assume that most embedded systems would be using auto.
>
> Ah, I think you mis-understand auto, that would just be a binary flip
> between either based on external data. Like if you're on AC you pick
> performance, if you're on battery you pick power.
A binary switch is easy to understand, but I don't see how it would
work internally, because we basically have two extremes.

At the first (performance at 100%), the scheme that delivers it is
something like the current scheduler, with all cores running at max
speed and the task load spread out to the highest-performing CPUs first.

At the other (power at 100%), the scheme with the least power draw would
be one that keeps all processors but the least powerful one shut down
(not taking into account race to idle on the highest-performance CPU at
this point).

IMO any policy would have to be expressed as a point somewhere between
those two extremes. Those points could be moved by external inputs,
like the one you've mentioned (AC/battery), or by some other condition
set from user-space (for example, user-space could tweak the power
setting according to the remaining battery level, i.e. a full battery
could lean towards higher performance, while an almost depleted one
could lean more towards lower power).
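
To make the contrast concrete, here is a minimal sketch in C of the two
ways "auto" could pick a policy. Everything in it is an assumption made
for illustration: the 0..100 scale (0 = full performance, 100 = full
power saving), the names policy_point_binary() and
policy_point_graded(), and the linear battery mapping are all
hypothetical and do not correspond to any existing kernel interface.

```c
/* Hypothetical illustration only; not an existing kernel interface. */
enum power_source { SOURCE_AC, SOURCE_BATTERY };

/*
 * Assumed scale: 0 = "performance at 100%", 100 = "power at 100%".
 * Binary auto: flip between the two extremes based on the power source.
 */
static int policy_point_binary(enum power_source src)
{
	return src == SOURCE_AC ? 0 : 100;
}

/*
 * Graded auto: on battery, lean further towards power saving as the
 * remaining charge drops. The linear mapping is an arbitrary choice.
 * battery_pct is the remaining charge, clamped to 0..100.
 */
static int policy_point_graded(enum power_source src, int battery_pct)
{
	if (src == SOURCE_AC)
		return 0;
	if (battery_pct < 0)
		battery_pct = 0;
	if (battery_pct > 100)
		battery_pct = 100;
	return 100 - battery_pct;
}
```

With this mapping a full battery still gets the performance extreme and
an almost depleted one approaches the power extreme; the binary variant
is just the special case that only ever returns 0 or 100.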
>
>>> The 'performance' policy is typically to spread over shared resources so
>>> as to minimize contention on these.
>>>
>>
>> Would it be worth extending this architecture specification to contain
>> more information like CPU_POWER for each core?
>
> Yes, currently we assume all logical cpus are equal (with a small
> exception for SMT).
>
>> After having experimented
>> a bit with scheduling on big.LITTLE my experience is that more
>> information about the platform is needed to make proper scheduling
>> decisions. So if the topology definition is going to be more generic and
>> be set up by the architecture it could be worth adding all the bits of
>> information that the scheduler would need to that data structure.
>
> Agreed, although we should strive to minimize the set. And we should
> make all interfaces optional, so that if an architecture doesn't use it
> it'll go with the defaults.
>
>> With such data structure, the scheduler would only need one knob to
>> adjust the power/performance trade-off. Any thoughts?
>
> That's the plan.
>
>>> To over-ride the defaults. But ideally I'd leave those until after we've
>>> got the basics working and there is a clear need for them (with a
>>> spread/pack default for perf/power aware).
>>
>> In my experience power optimized scheduling is quite tricky, especially
>> if you still want some level of performance. For heterogeneous
>> architecture packing might not be the best solution. Some indication of
>> the power/performance profile of each core could be useful.
>
> Right, hence my suggestion to add possible hints to over-ride the
> default policies.
>
> But also the desire to only add them once the 'regular'/'simple' bits
> work properly.
>
Regards
-- Pantelis