On 28 March 2015 at 15:23, Peter Zijlstra peterz@infradead.org wrote:
Well, for one your patch is indeed disgusting.
Yeah, I agree :)
But yes I'm aware Thomas wants to rewrite the timer thing. But Thomas is away for a little while and if this really needs to happen then it does.
Sometime back I was trying to use another bit from base pointer for marking a timer as PINNED:
diff --git a/include/linux/timer.h b/include/linux/timer.h index 8c5a197e1587..e7184f57449c 100644 --- a/include/linux/timer.h +++ b/include/linux/timer.h @@ -67,8 +67,9 @@ extern struct tvec_base boot_tvec_bases; */ #define TIMER_DEFERRABLE 0x1LU #define TIMER_IRQSAFE 0x2LU +#define TIMER_PINNED 0x4LU -#define TIMER_FLAG_MASK 0x3LU +#define TIMER_FLAG_MASK 0x7LU
And Fenguang's build-bot showed the problem (only) on blackfin [1].
config: make ARCH=blackfin allyesconfig
All error/warnings:
kernel/timer.c: In function 'init_timers': >> kernel/timer.c:1683:2: error: call to '__compiletime_assert_1683' >> declared with attribute error: BUILD_BUG_ON failed: >> __alignof__(struct tvec_base) & TIMER_FLAG_MASK
So probably we need to make 'base' aligned to 8 bytes ?
So, what you are suggesting is something like this (untested):
diff --git a/include/linux/timer.h b/include/linux/timer.h index 8c5a197e1587..68bf09d69352 100644 --- a/include/linux/timer.h +++ b/include/linux/timer.h @@ -67,8 +67,9 @@ extern struct tvec_base boot_tvec_bases; */ #define TIMER_DEFERRABLE 0x1LU #define TIMER_IRQSAFE 0x2LU +#define TIMER_RUNNING 0x4LU
-#define TIMER_FLAG_MASK 0x3LU +#define TIMER_FLAG_MASK 0x7LU
#define __TIMER_INITIALIZER(_function, _expires, _data, _flags) { \ .entry = { .prev = TIMER_ENTRY_STATIC }, \ diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 2d3f5c504939..8f9efa64bd34 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -105,6 +105,21 @@ static inline unsigned int tbase_get_irqsafe(struct tvec_base *base) return ((unsigned int)(unsigned long)base & TIMER_IRQSAFE); }
+static inline unsigned int tbase_get_running(struct tvec_base *base) +{ + return ((unsigned int)(unsigned long)base & TIMER_RUNNING); +} + +static inline unsigned int tbase_set_running(struct tvec_base *base) +{ + return ((unsigned int)(unsigned long)base | TIMER_RUNNING); +} + +static inline unsigned int tbase_clear_running(struct tvec_base *base) +{ + return ((unsigned int)(unsigned long)base & ~TIMER_RUNNING); +} + static inline struct tvec_base *tbase_get_base(struct tvec_base *base) { return ((struct tvec_base *)((unsigned long)base & ~TIMER_FLAG_MASK)); @@ -781,21 +796,12 @@ __mod_timer(struct timer_list *timer, unsigned long expires, new_base = per_cpu(tvec_bases, cpu);
if (base != new_base) { - /* - * We are trying to schedule the timer on the local CPU. - * However we can't change timer's base while it is running, - * otherwise del_timer_sync() can't detect that the timer's - * handler yet has not finished. This also guarantees that - * the timer is serialized wrt itself. - */ - if (likely(base->running_timer != timer)) { - /* See the comment in lock_timer_base() */ - timer_set_base(timer, NULL); - spin_unlock(&base->lock); - base = new_base; - spin_lock(&base->lock); - timer_set_base(timer, base); - } + /* See the comment in lock_timer_base() */ + timer_set_base(timer, NULL); + spin_unlock(&base->lock); + base = new_base; + spin_lock(&base->lock); + timer_set_base(timer, base); }
timer->expires = expires; @@ -1016,7 +1022,7 @@ int try_to_del_timer_sync(struct timer_list *timer)
base = lock_timer_base(timer, &flags);
- if (base->running_timer != timer) { + if (tbase_get_running(timer->base)) { timer_stats_timer_clear_start_info(timer); ret = detach_if_pending(timer, base, true); } @@ -1202,6 +1208,7 @@ static inline void __run_timers(struct tvec_base *base) timer_stats_account_timer(timer);
base->running_timer = timer; + tbase_set_running(timer->base); detach_expired_timer(timer, base);
if (irqsafe) { @@ -1216,6 +1223,7 @@ static inline void __run_timers(struct tvec_base *base) } } base->running_timer = NULL; + tbase_clear_running(timer->base); spin_unlock_irq(&base->lock); }
------------x--------------------x----------------------
Right?
Now there are few issues I see here (Sorry if they are all imaginary): - In case a timer re-arms itself from its handler and is migrated from CPU A to B, what happens if the re-armed timer fires before the first handler finishes ? i.e. timer->fn() hasn't finished running on CPU A and it has fired again on CPU B. Wouldn't this expose us to a lot of other problems? It wouldn't be serialized to itself anymore ?
- Because the timer has migrated to another CPU, the locking in __run_timers() needs to be fixed. And that will make it complicated ..
- __run_timer() doesn't lock bases of other CPUs, and it has to do it now.. - We probably need to take locks of both local CPU and the one to which timer migrated.
- Its possible now that there can be more than one running timer for a base, which wasn't true earlier. Not sure if it will break something.
Thanks for your continuous support to reply to my (sometimes stupid) queries.
-- viresh
[1] https://lists.01.org/pipermail/kbuild-all/2014-April/003982.html