On 02-12-15, 13:43, Saravana Kannan wrote:
There's a separate thread where we proposed a fix for deferrable timers: store them globally if they are not CPU bound. That way, as long as even one CPU is up, they get handled. But TGLX had some valid concerns about cache thrashing and the impact on some network code. So, last I heard, he was going to rewrite this and fix the deferrable timer problem by having the "orphaned" deferrable timers (orphaned because their CPU has gone idle) adopted by other CPUs while the original CPU is idle.
Once that's fixed, we just need one timer per policy. Long story short, CPU freq is working around a poor API semantic of deferrable timers.
There is another idea that I have.
Let's sacrifice the idleness of CPU0 (which the scheduler already treats as the housekeeping CPU) to save us from all the complexity we have today.
Suppose we have 16 CPUs, with 4 CPUs per policy and hence 4 policies.

- Keep a single delayed-work (non-deferrable) per policy and queue them as: queue_work_on(CPU0).
- This will work because any CPU can calculate the load of other CPUs, and there is no dependency on the local CPU.
- CPU0 will hence get interrupted, check if the policy->cpus are idle or not, and if not, update their frequency (perhaps with an IPI).
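
To make the flow above concrete, here is a rough userspace model of what the per-policy work run on CPU0 would do. This is only a sketch, not kernel code: struct cpu_sample, struct policy, policy_max_load() and policy_work_on_cpu0() are hypothetical names invented for illustration, and the "target = max_freq * load / 100" rule is a simplifying assumption, not what any governor actually computes.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS         16
#define CPUS_PER_POLICY 4
#define NR_POLICIES     (NR_CPUS / CPUS_PER_POLICY)

/* Hypothetical per-CPU sample for the last sampling period. */
struct cpu_sample {
	bool idle;              /* CPU was idle for the whole period */
	unsigned int load_pct;  /* load in percent, 0..100 */
};

/* Hypothetical, heavily simplified stand-in for a cpufreq policy. */
struct policy {
	unsigned int first_cpu;     /* policy covers first_cpu .. first_cpu + 3 */
	unsigned int cur_freq_khz;
	unsigned int max_freq_khz;
};

/*
 * Highest load among the non-idle CPUs of a policy, or -1 if every CPU in
 * the policy is idle. Nothing here depends on which CPU runs the code,
 * which is why CPU0 can do it on behalf of the other CPUs.
 */
static int policy_max_load(const struct policy *p,
			   const struct cpu_sample *cpus)
{
	int max_load = -1;

	assert(p->first_cpu + CPUS_PER_POLICY <= NR_CPUS);

	for (unsigned int i = 0; i < CPUS_PER_POLICY; i++) {
		const struct cpu_sample *s = &cpus[p->first_cpu + i];

		if (!s->idle && (int)s->load_pct > max_load)
			max_load = (int)s->load_pct;
	}
	return max_load;
}

/*
 * Model of the per-policy work handler that would always run on CPU0.
 * Returns true if the frequency changed, i.e. the point where the real
 * thing might have to poke the policy's CPUs (perhaps with an IPI).
 */
static bool policy_work_on_cpu0(struct policy *p,
				const struct cpu_sample *cpus)
{
	int load = policy_max_load(p, cpus);
	unsigned int target;

	if (load < 0)
		return false;   /* all policy->cpus idle: nothing to do */

	/* Assumed scaling rule, for illustration only. */
	target = p->max_freq_khz / 100 * (unsigned int)load;
	if (target == p->cur_freq_khz)
		return false;

	p->cur_freq_khz = target;
	return true;
}
```

With 4 such policies, CPU0 would run this handler once per policy per sampling period, so the cost on CPU0 grows with the number of policies rather than the number of CPUs.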
Not sure if this will be better performance-wise, though.