On 6/21/2013 1:50 AM, Morten Rasmussen wrote:
> in control of the p-state selection and changes it fast enough to match the current load, the scheduler doesn't have to care? By fast enough I mean, faster than the scheduler would notice if a cpu was temporarily overloaded at a low p-state. In that case, you wouldn't need cpufreq/p-state hints, and the scheduler would only move tasks between cpus when cpus are fully loaded at their max p-state.
With the migration hint, I'm pretty sure we're typically there today.
> A hint when a task is moved to a new cpu is too late if the migration shouldn't have happened at all. If the scheduler knows that the cpu is able to switch to a higher p-state it can decide to wait for the p-state change instead of migrating the task and waking up another cpu.
OK, maybe I am missing something, but at least on the hardware I am familiar with (Intel and, somewhat, AMD), the frequency (and voltage) when idle is ... 0 Hz ... no matter what the OS chose for when the CPU is running. And part of the cost of coming out of idle is ramping up to something appropriate.
And such ramps are FAST. As a result, changing P state is generally quite fast as well; think "single digit microseconds" kind of fast. That is much faster than waking a CPU up in the first place (by design, since waking up a CPU effectively includes a P state change).
I read your statement as "let's wait for the idle CPU to ramp its frequency up first", which doesn't really make sense to me...
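
For reference, the "wait for the p-state change instead of migrating" idea from the quoted text could be sketched roughly as below. This is only an illustration under loud assumptions: `struct cpu_snapshot`, `util_at_max_pct()` and `should_wait_for_pstate()` are hypothetical names, not kernel or cpufreq APIs, and the sketch assumes the work a cpu gets done scales linearly with its frequency, which real hardware only approximates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-cpu snapshot; none of these names are kernel APIs. */
struct cpu_snapshot {
    uint32_t cur_freq_khz; /* frequency of the current p-state */
    uint32_t max_freq_khz; /* frequency of the highest p-state */
    uint32_t util_pct;     /* busy time at cur_freq_khz, 0..100 */
};

/*
 * Estimate how busy the cpu would be if the same load ran at the max
 * p-state. Assumes throughput scales linearly with frequency.
 */
static uint32_t util_at_max_pct(const struct cpu_snapshot *c)
{
    if (c->max_freq_khz == 0)
        return c->util_pct; /* no frequency data: assume no headroom */
    return (uint32_t)(((uint64_t)c->util_pct * c->cur_freq_khz) /
                      c->max_freq_khz);
}

/*
 * Return true if the cpu looks overloaded only because it sits at a low
 * p-state, i.e. the load would drop below threshold_pct at max frequency.
 * In that case migrating the task (and waking another cpu) can be skipped.
 */
static bool should_wait_for_pstate(const struct cpu_snapshot *c,
                                   uint32_t threshold_pct)
{
    if (c->max_freq_khz == 0 || c->cur_freq_khz >= c->max_freq_khz)
        return false; /* already at max p-state: nothing to wait for */
    return util_at_max_pct(c) < threshold_pct;
}
```

A cpu that is 100% busy at one third of its max frequency would count as roughly one third loaded here, so the task stays put; a cpu that is 100% busy at its max p-state has no headroom, and migration is the only option left.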