On 11/10/2014 08:48 PM, Peter Zijlstra wrote:
On Mon, Nov 10, 2014 at 06:19:02PM +0100, Daniel Lezcano wrote:
I really don't get why the governors should know about this though, its just another state, they should iterate all states and pick the best, given the power usage this state should really never be eligible unless we're QoS forced or whatnot.
The governors just don't use the poll state at all, except for a couple of cases in menu.c defined above in the previous email. What is the rational of adding a state in the cpuidle driver and do everything we can to avoid using it ? From my POV, the poll state is a special state, we should remove from the driver's idle states like the arch_cpu_idle() is a specific idle state only used in idle.c (but which may overlap with an idle state in different archs eg. cpu_do_idle() and the 0th idle state).
So I disagree, I think poll-idle is an idle mode just like all the others. It should be an available state to the governor and it should treat it like any other.
The governors are just ignoring it, except for a small timer optimization in menu.c (and I am still not convinced it is worth to have it). I don't see the point to add a state we don't want to use.
Eg. on my server it was called 2 times over 1313856 times.
I don't tihnk the whole ARCH_HAS_CPU_RELAX thing makes any kind of sense, _every_ arch has some definition of it, the generic polling loop is always a valid idle implementation.
What we can do is always populate the idle state table with it before calling the regular drivers.
I am not sure to understand. You want to add the poll idle loop in all the drivers ?
What about "safe_halt()" ? (arch_cpu_idle() for x86). It is also an idle state. Why not add it in the idle state table also ?
If the arch drivers have a 'better' latency_req==0 idle routine -- note my argument on the ppc issue, I think its wrong -- it can replace the existing one.
We should further remove all the special casing in the governors, its always a valid state, but it should hardly ever be the most desirable state.
I think the whole arch specific idle loop is a mistake, we already have an (arch) interface into the idle routines, we don't need yet another.