On 3 June 2014 01:51, Kevin Hilman khilman@linaro.org wrote:
In general, your cover letters and change logs still mix up what things are like currently and what things will be like (or should be like) after this series. Now your adding another layer of adding things that will likely need fix *in addition to* this series. IMO, all three categories are still mixed up in the descriptions.
Another try :)
A clockevent device is used to service timers and hrtimers requests and the next event (when it should fire) is decided by the timer/hrtimer expiring next. When no timers/hrtimers are pending to be serviced, the expiry time is set to a special value: KTIME_MAX. This means that no events are required for indefinite amount of time.
This would normally happen with NO_HZ (both NO_HZ_IDLE and NO_HZ_FULL) when tick-sched timer is removed and we don't have any more timers scheduled for long.
At this time the clockevent device can be stopped, or its events can be masked in order to reduce latency added due to these unwanted interrupts (as hrtimer_interrupt() wouldn't have anything to do when called) OR to save some power associated with clockevent device & running extra code to handle these interrupts.
This wouldn't have any side effects of missing events when they are added later, as we will reprogram the next event again then.
If we don't STOP clockevent device, we are guaranteed to receive at least one spurious interrupt, as hrtimer_force_reprogram() isn't reprogramming the event device (on expires == KTIME_MAX). Depending on particular implementation of clockevent device, this can be a fake interrupt at tick-rate (When driver is emulating ONESHOT over PERIODIC mode. This was observed on at least one implementation: arm_arch_timer.c)..
A simple (yet hacky) solution to get this fixed could be: update hrtimer_force_reprogram() to always reprogram clockevent device and update clockevent drivers to STOP generating events (or delay it to max time), when 'expires' is set to KTIME_MAX. But the drawback here is that every clockevent driver has to be hacked for this particular case and its very easy for new ones to miss this.
However, Thomas suggested to add an optional mode: ONESHOT_STOPPED (lkml.org/lkml/2014/5/9/508) to solve this problem.
With this proposal, introducing a new ONESHOT_STOPPED mode would require the core to know whether the platform implements this mode so it could be reprogrammed later.
In order for the core to tell if the mode is implemented, ->set_mode() callback needs to be able to return success or failure.
To change return type of set_mode(), Thomas suggested to implement another callback: ->set_dev_mode(), with return type 'int'. We can then convert clockevent drivers to use this interface instead of existing ->set_mode() and then finally remove ->set_mode()'s support.
This patchset first adds another callback with return capability: set_dev_mode(), it then migrates all drivers one by one to start using it and finally removes existing callback set_mode() when it has no more users.
FIXME: It _might_ be required to fix two issues separately after applying this series:
1. Few drivers have 'switch cases' for ONESHOT or PERIODIC modes, when these modes aren't supported by '.features' field of 'struct clock_event_device'. These drivers are doing WARN/BUG/pr_err/empty/etc for these cases. Its guaranteed that core wouldn't try to set PERIODIC/ONESHOT modes if they aren't specified in .features field for a particular driver. And if this happens due to some bug at driver/core, then we do have proper handling at core level while calling ->set_dev_mode(). Drivers can be simplified by removing code around these unsupported cases and return -ENOSYS from 'default' case.
2. Few clockevent drivers are always disabling clock events from their ->set_dev_mode() callbacks and so they stay disabled for 'default case' as well. We *may* need to fix these drivers in case we don't want them to disable events in 'default case'. After this series we will never hit 'default case' as we are handling all currently available cases separately, but it might be required to fix them before ONESHOT_STOPPED series gets in.