On Sat, 1 Feb 2014, Brown, Len wrote:
> > And your point is?
> It is a bad idea for an individual CPU to track the C-state of another CPU, which can change the cycle after it was checked.
Absolutely. And I'm far from advocating we do this either.
> We know it is a bad idea because we used to do it, until we realized code here can easily impact the performance critical path.
> In general, it is the OS's job to communicate constraints to the HW, and the HW to act on them. Not all HW is smart, so sometimes the OS has to do more hand-holding -- but less is more.
Less is more indeed. I'm certainly a big fan of that principle.
Just so you understand more about the context we need to care for on ARM, I'd invite you to have a look at Documentation/arm/cluster-pm-race-avoidance.txt.
I'm advocating we do _not_ track everything at the scheduler domain level, simply because some cluster states are possible only if all the CPUs in a cluster are idle, and that idleness is already tracked by the scheduler at the scheduling domain level. So the information we don't update can already be inferred indirectly and cheaply from the information in place today.
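To illustrate the kind of inference meant here, a minimal sketch follows. It is not kernel code: `struct cluster`, `cluster_all_idle()` and the plain 32-bit masks are hypothetical stand-ins for whatever the scheduler domain already maintains. The point is only that "a cluster-wide state is possible" can be derived from existing per-CPU idleness without any CPU recording another CPU's C-state.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model, not the kernel's data structures:
 * one bit per CPU, set in idle_mask when that CPU is idle.
 */
struct cluster {
	uint32_t cpu_mask;	/* CPUs that belong to this cluster */
};

/*
 * A cluster-wide state is only possible when every CPU in the
 * cluster is idle, so the answer is a single mask comparison on
 * information that is assumed to be tracked already.
 */
static bool cluster_all_idle(const struct cluster *c, uint32_t idle_mask)
{
	return (idle_mask & c->cpu_mask) == c->cpu_mask;
}
```

The result is only a hint: as noted above, the idle state of another CPU can change right after it was checked, so anything acting on it still needs proper synchronization.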
Nicolas