On 20 January 2014 21:21, Frederic Weisbecker fweisbec@gmail.com wrote:
I fear you can't. If you schedule a timer 4 seconds away and your clock device can only count up to 2 seconds, you can't avoid the interrupt in the middle that copes with the overflow.
So you need to act on the source of the timer:
- identify what causes this timer
- try to turn that feature off
- if you can't, then move the timer to the housekeeping CPU
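To illustrate the overflow point above (a timer 4 seconds away on a clock device that can only count up to 2 seconds), here is a rough sketch in Python. It is a simplified model, not kernel code: it assumes the device is simply reprogrammed with the largest delta it supports until the timer expires, and the function name is made up for this example.

```python
NSEC_PER_SEC = 1_000_000_000


def intermediate_interrupts(delay_ns: int, max_delta_ns: int) -> int:
    """Count the extra interrupts needed before a timer can finally expire.

    Simplified model (an assumption, not the kernel's actual logic): the
    device is programmed with at most max_delta_ns each time, so a longer
    delay is covered in several steps, each ending in an interrupt.
    """
    if delay_ns <= 0 or max_delta_ns <= 0:
        raise ValueError("delay and max delta must be positive")
    # Ceiling division: how many times the device has to be programmed.
    programmings = -(-delay_ns // max_delta_ns)
    # The last programming is the real expiry; the rest are intermediate.
    return programmings - 1


# The case from the text: 4 second timer, device counts up to 2 seconds.
print(intermediate_interrupts(4 * NSEC_PER_SEC, 2 * NSEC_PER_SEC))
```

Under this model the only ways to get rid of the intermediate interrupt are to shorten the delay to fit the device, or to not have the timer on that CPU at all, which is what the list above comes down to.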
So, the main problem in my case was caused by this:
<...>-2147 [001] d..2 302.573881: hrtimer_start: hrtimer=c172aa50 function=tick_sched_timer expires=602075000000 softexpires=602075000000
I mentioned this earlier when I sent you the attachments. I think this is somehow tied to the NO_HZ_FULL stuff, as the timer is queued for about 300 seconds after the current time.
How do I get rid of it?
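For reference, the 300 second figure can be checked from the trace line itself. The sketch below is only a quick sanity check: it assumes the trace timestamp (in seconds) and the expires= value (in nanoseconds) are on the same time base, which may not hold exactly depending on the trace clock in use. The helper name is made up for this example.

```python
import re

# Trace line copied from the message above.
TRACE = (
    "<...>-2147 [001] d..2 302.573881: hrtimer_start: hrtimer=c172aa50 "
    "function=tick_sched_timer expires=602075000000 softexpires=602075000000"
)

NSEC_PER_SEC = 1_000_000_000


def expiry_delta_seconds(line: str) -> float:
    """Return seconds between the trace timestamp and the hrtimer expiry."""
    # \b keeps "expires=" from matching inside "softexpires=".
    m = re.search(r"\s(\d+\.\d+): hrtimer_start: .*?\bexpires=(\d+)", line)
    if m is None:
        raise ValueError("not an hrtimer_start trace line")
    now_s = float(m.group(1))                   # timestamp, in seconds
    expires_s = int(m.group(2)) / NSEC_PER_SEC  # expiry, converted from ns
    return expires_s - now_s


print(f"{expiry_delta_seconds(TRACE):.3f}")
```

With the values above that is 602.075 - 302.573881, so roughly 299.5 seconds.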
I'll have a look into the latter point, affining global timers to the housekeeping CPU. Per-CPU timers need more inspection though. Either we rework them so they can be handled by remote/housekeeping CPUs, or we let the associated feature be turned off. All in all, it's case-by-case work.
Which CPUs are housekeeping CPUs? How do we declare them?