On Tue, 18 Dec 2012, Todd Poynor wrote:
I put this up on AOSP Gerrit at https://android-review.googlesource.com/#/c/48233/ with a minor change to the semaphore field name to be more descriptive and to identify it as a semaphore.
After some out-of-band discussion it was agreed the race with the idle notifier definitely exists, but the race described with the timer function is unclear. As I understand it, del_timer_sync() spins waiting for the timer to stop running (and possibly rescheduling itself) and then deletes the timer. When try_to_del_timer_sync() returns successfully "the timer is not queued and the handler is not running on any CPU". Is there still a race here anyway?
After re-reading the code, I do agree.
What confused me is this comment for del_timer_sync():
* Synchronization rules: Callers must prevent restarting of the timer, * otherwise this function is meaningless. It must not be called from * interrupt contexts unless the timer is an irqsafe one. The caller must * not hold locks which would prevent completion of the timer's * handler. The timer's handler must not call add_timer_on(). [...]
However, the timer handler is using mod_timer_pinned(), not add_timer_on().
I stumbled upon the idle notifier race first, and initially my understanding of the timer code wasn't very clear. I used the timer self scheduling scenario to illustrate the race instead of the idle notifier simply because it was simpler.
Nicolas