On 7/15/2013 2:12 PM, Peter Zijlstra wrote:
On Mon, Jul 15, 2013 at 11:06:50PM +0200, Peter Zijlstra wrote:
OK, but isn't that part of why the microcontroller might not make you go faster even if you do program a higher P state?
But yes, I understand this issue in the 'traditional' cpufreq sense. There's no point in ramping the speed if all you do is stall more.
But I was under the impression the 'hardware' was doing this. If not, then we need the whole go-faster and go-slower thing, places to call them from, a means to determine when to call them, etc.
So with the scheduler measuring CPU utilization we could say go-faster when u>0.8 and go-slower when u<0.75 or so, lacking any better metrics like the stall stuff.
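To make the two-threshold idea concrete, here is a rough sketch of the hysteresis in plain userspace C. The names (pstate_hint_for_util, next_pstate), the "higher index means faster" convention and the one-step-at-a-time policy are all made up for illustration; only the 0.8 and 0.75 figures come from the paragraph above, and nothing here is an existing kernel interface.

```c
#include <assert.h>

/* Thresholds from the "u>0.8 / u<0.75 or so" suggestion; purely illustrative. */
#define UTIL_UP_THRESHOLD   0.80
#define UTIL_DOWN_THRESHOLD 0.75

/* Hypothetical hint values: step the P state index up, down, or not at all. */
enum pstate_hint {
	PSTATE_SLOWER = -1,
	PSTATE_HOLD   = 0,
	PSTATE_FASTER = 1,
};

/* Map a utilization in [0.0, 1.0] to a hint. The gap between the two
 * thresholds is a dead band, so a load hovering around 0.78 does not
 * make the speed flip back and forth. */
enum pstate_hint pstate_hint_for_util(double u)
{
	if (u > UTIL_UP_THRESHOLD)
		return PSTATE_FASTER;
	if (u < UTIL_DOWN_THRESHOLD)
		return PSTATE_SLOWER;
	return PSTATE_HOLD;
}

/* Apply the hint to a speed index, assuming (hypothetically) that
 * index 0 is the slowest setting and max_idx the fastest. */
int next_pstate(int cur, int max_idx, double u)
{
	int next;

	assert(max_idx >= 0 && cur >= 0 && cur <= max_idx);

	next = cur + (int)pstate_hint_for_util(u);
	if (next < 0)
		next = 0;
	if (next > max_idx)
		next = max_idx;
	return next;
}
```

With these assumptions, a utilization of 0.9 moves index 1 to index 2, 0.78 leaves it alone, and 0.5 moves it down to 0. Where such a function would be called from is exactly the open question above.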
So I understand that ondemand spends quite a lot of time 'sampling' what the system does while the scheduler mostly already knows this.
yeah ondemand does this, but ondemand is actually a pretty bad governor. not because of the sampling, but because of its algorithm.
if you look at what the ondemand algorithm tries to do, it's trying to manage the cpu "frequency" primarily for when the system is idle. Ten to twelve years ago, this was actually important, and it does a decent job of that.
HOWEVER, on modern CPUs, even many of the ARM ones, the frequency when you're idle is zero anyway regardless of what you as OS ask for.
And when Linux went tickless, ondemand went to deferred timers, which made it even worse.
btw technically ondemand does not sample things; you may (or may not) already know what it actually does. Every 10 (or 100) milliseconds, ondemand makes a new P state decision. It does this by asking the scheduler for the time used, taking the delta against the previous decision, and ending up at a utilization percentage which then goes into a formula. It's not that ondemand samples in between decision moments to see if the system is busy or not; the microaccounting that the scheduler does is used instead, and only at decision moments.
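As a rough illustration of that delta step (not the actual ondemand code), a userspace sketch could look like the following. The struct and function names are hypothetical, and the cumulative busy time is simply passed in as a number standing in for whatever the scheduler's accounting reports.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state kept between two decision moments. */
struct decision_state {
	uint64_t prev_busy_ns;	/* cumulative busy time at the last decision */
	uint64_t prev_wall_ns;	/* wall-clock time at the last decision */
};

/* Called once per decision moment (e.g. every 10 ms). Takes the
 * cumulative busy time and the current wall-clock time, computes the
 * deltas since the previous decision, and returns utilization as a
 * percentage in 0..100. Nothing is looked at in between calls. */
unsigned int util_percent_since_last(struct decision_state *s,
				     uint64_t busy_ns, uint64_t wall_ns)
{
	uint64_t d_busy, d_wall;

	assert(wall_ns > s->prev_wall_ns);
	assert(busy_ns >= s->prev_busy_ns);

	d_busy = busy_ns - s->prev_busy_ns;
	d_wall = wall_ns - s->prev_wall_ns;

	/* Remember this moment for the next delta. */
	s->prev_busy_ns = busy_ns;
	s->prev_wall_ns = wall_ns;

	/* Guard against accounting skew pushing the result above 100%. */
	if (d_busy > d_wall)
		d_busy = d_wall;

	return (unsigned int)(d_busy * 100 / d_wall);
}
```

For example, starting from zeroed state, 4 ms of busy time at the 10 ms mark gives 40; if cumulative busy time has reached 13 ms at the 20 ms mark, the second interval comes out at 90. What formula that percentage then feeds into is not shown here.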