On Thu, Oct 17, 2013 at 06:18:38PM +0100, Arjan van de Ven wrote:
cpufreq has pre- and post-change notifiers so the current TC2 clock driver
yeah those are EVIL ;-)
waits (yields) in its clk_set_rate() implementation until the change has happened, to ensure that the post-change notifier happens at the right time. Since clk_set_rate() is allowed to sleep, other tasks may run while it waits for the change to complete. This may be true for other clock drivers as well.
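To make the ordering problem concrete, here is a minimal user-space sketch of the sequence described above: a pre-change notification, a rate change that yields until the hardware has settled, and only then the post-change notification. Every name in it (sketch_clk, sketch_trace, sketch_clk_set_rate(), sketch_set_target(), the settle_polls counter) is hypothetical and chosen for illustration; it is not the TC2 clock driver or the cpufreq core, and sched_yield() merely stands in for whatever sleeping the real driver does.

```c
#include <assert.h>
#include <sched.h>

/* Hypothetical event kinds, loosely modelled on pre-/post-change notifiers. */
enum sketch_event { SKETCH_PRECHANGE, SKETCH_POSTCHANGE };

/* Hypothetical clock: settle_polls fakes the hardware latency. */
struct sketch_clk {
    unsigned long rate_khz;   /* rate the fake hardware currently runs at */
    int settle_polls;         /* yields needed before a change takes effect */
};

/* Records each notification and the rate visible at that moment. */
struct sketch_trace {
    enum sketch_event ev[8];
    unsigned long rate_seen[8];
    int n;
};

static void sketch_notify(struct sketch_trace *t, enum sketch_event ev,
                          const struct sketch_clk *clk)
{
    assert(t->n < 8);
    t->ev[t->n] = ev;
    t->rate_seen[t->n] = clk->rate_khz;
    t->n++;
}

/* Blocking rate change: yields the CPU until the fake hardware has settled.
 * Other tasks can run here, which is why this cannot be called with
 * preemption disabled. */
static void sketch_clk_set_rate(struct sketch_clk *clk, unsigned long new_khz)
{
    int polls = clk->settle_polls;

    while (polls-- > 0)
        sched_yield();
    clk->rate_khz = new_khz;
}

/* Full transition: the post-change notifier fires only after the wait. */
static void sketch_set_target(struct sketch_clk *clk, struct sketch_trace *t,
                              unsigned long new_khz)
{
    sketch_notify(t, SKETCH_PRECHANGE, clk);
    sketch_clk_set_rate(clk, new_khz);
    sketch_notify(t, SKETCH_POSTCHANGE, clk);
}
```

The point of the sketch is only that the post-change notification is correct because the caller blocked; remove the wait and the notifier would report a rate the hardware has not reached yet.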
AFAICT, there is no way to reuse the existing cpufreq drivers in a sensible way for scheduler driven frequency scaling.
that's the conclusion we came to as well about a year ago (and is also why we're no longer using the cpufreq core for the Intel pstate driver). the locking/sleeping/callback/cpuhotplug/sysfs/etc stuff is just a MESS for something that ends up being extremely simple if you just code the sequence... for us it's just one register write to change, which obviously makes this the extreme case.
We should be able to boil it down to a sequence on ARM as well. But it means dropping cpufreq and looking at the clock framework.
Are you still using the pre- and post-change notifiers on Intel, or can they be ignored safely?
Note that you still have preemption disabled in your late callback from finish_task_switch(). There's no way you can wait/yield/whatever from there. Nor is that really sane.
the other fun one with this could be that if you have a series of schedulable tasks for changing stuff... somehow you want to keep ordering in the requests, and only do the last one/etc. Not Fun(tm)
Agreed, I don't want to go there. Also, the overhead will probably kill any benefit that there might be.
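For reference, the "only do the last one" idea mentioned above can be sketched as a single latest-wins slot: the non-sleeping caller overwrites one pending value, and a worker that is allowed to sleep atomically takes whatever is newest. This is a user-space illustration using C11 atomics under assumed names (sketch_freq_req, sketch_request(), sketch_worker_run()), not code from any kernel driver, and it says nothing about the overhead concern raised in the thread.

```c
#include <assert.h>
#include <stdatomic.h>

/* Sentinel meaning "nothing pending"; assumes 0 kHz is never a valid target. */
#define SKETCH_NO_REQUEST 0UL

/* Hypothetical request slot shared between the caller and the worker. */
struct sketch_freq_req {
    atomic_ulong pending_khz;   /* newest requested rate, or SKETCH_NO_REQUEST */
    unsigned long applied_khz;  /* last rate the worker actually applied */
    unsigned int apply_count;   /* number of (slow) rate changes performed */
};

static void sketch_init(struct sketch_freq_req *r)
{
    atomic_init(&r->pending_khz, SKETCH_NO_REQUEST);
    r->applied_khz = 0;
    r->apply_count = 0;
}

/* Caller side: must not sleep, so it only publishes the newest target.
 * An older request that was not picked up yet is simply overwritten. */
static void sketch_request(struct sketch_freq_req *r, unsigned long khz)
{
    assert(khz != SKETCH_NO_REQUEST);
    atomic_store(&r->pending_khz, khz);
}

/* Worker side: runs where sleeping is allowed. Takes the newest request,
 * clears the slot, and applies it. Returns 1 if a change was applied. */
static int sketch_worker_run(struct sketch_freq_req *r)
{
    unsigned long khz = atomic_exchange(&r->pending_khz, SKETCH_NO_REQUEST);

    if (khz == SKETCH_NO_REQUEST)
        return 0;
    r->applied_khz = khz;   /* stand-in for the slow, possibly sleeping change */
    r->apply_count++;
    return 1;
}
```

With three back-to-back requests and one worker pass, only the last target is applied; the intermediate ones are dropped, which sidesteps ordering at the cost of never reaching them.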