On Fri, Mar 27, 2015 at 10:19:54AM +0530, Viresh Kumar wrote:
On 27 March 2015 at 01:48, Andrew Morton <akpm@linux-foundation.org> wrote:
Shouldn't this be viewed as a shortcoming of the core timer code?
Yeah, it is. Some (not so pretty) solutions were tried earlier to fix that, but they were rejected for obvious reasons [1].
vmstat_shepherd() is merely rescheduling itself with schedule_delayed_work(). That's a dead bog simple operation and if it's producing suboptimal behaviour then we shouldn't be fixing it with elaborate workarounds in the caller?
I understand that, and that's why I sent it as an RFC to get the discussion started. Does anyone else have another (acceptable) idea to get this resolved?
So the issue seems to be that we need base->running_timer in order to tell if a callback is running, right?
We could align the base on 8 bytes to gain an extra bit in the pointer and use that bit to indicate the running state. Then these sites can spin on that bit while we can change the actual base pointer.
Since the timer->base pointer is locked through the base->lock and hand-over is safe vs lock_timer_base, this should all work.
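A rough userspace sketch of the pointer-tagging part of that idea, not kernel code. All names here (struct timer_base, struct timer, BASE_RUNNING, the timer_*() helpers) are made up for illustration, and the locking that base->lock and lock_timer_base provide is left out entirely:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the per-CPU timer base. Forcing 8-byte
 * alignment guarantees the low 3 bits of any pointer to it are zero.
 */
struct timer_base {
	_Alignas(8) int cpu;
};

#define BASE_RUNNING	((uintptr_t)1)	/* hypothetical: callback is running */
#define BASE_FLAG_MASK	((uintptr_t)7)	/* bits freed up by the alignment */

/* Hypothetical timer: base pointer and flag bits share one word. */
struct timer {
	uintptr_t base_and_flags;
};

/* Strip the flag bits to recover the real base pointer. */
static struct timer_base *timer_get_base(const struct timer *t)
{
	return (struct timer_base *)(t->base_and_flags & ~BASE_FLAG_MASK);
}

static bool timer_is_running(const struct timer *t)
{
	return (t->base_and_flags & BASE_RUNNING) != 0;
}

/* Switch to a new base while preserving the flag bits. */
static void timer_set_base(struct timer *t, struct timer_base *base)
{
	/* The scheme only works if the new base is suitably aligned. */
	assert(((uintptr_t)base & BASE_FLAG_MASK) == 0);
	t->base_and_flags = (uintptr_t)base |
			    (t->base_and_flags & BASE_FLAG_MASK);
}

/* Set or clear the running bit without touching the base pointer. */
static void timer_set_running(struct timer *t, bool running)
{
	if (running)
		t->base_and_flags |= BASE_RUNNING;
	else
		t->base_and_flags &= ~BASE_RUNNING;
}
```

The point is that the running state travels with the timer itself rather than living in base->running_timer, so a waiter could in principle spin on timer_is_running() even after the base pointer has been changed underneath it. A real implementation would need the appropriate locking and memory ordering around these accesses.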