Hi Sasha,
[2025-11-04 14:17] Sasha Levin:
This is a note to let you know that I've just added the patch titled
clocksource/drivers/timer-rtl-otto: Work around dying timersto the 6.17-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git%3Ba=su...
The filename of the patch is: clocksource-drivers-timer-rtl-otto-work-around-dying.patch and it can be found in the queue-6.17 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree, please let stable@vger.kernel.org know about it.
commit fbc0494f847969d81c1f087117dca462c816bedb Author: Markus Stockhausen markus.stockhausen@gmx.de Date: Mon Aug 4 04:03:25 2025 -0400
clocksource/drivers/timer-rtl-otto: Work around dying timers[ Upstream commit e7a25106335041aeca4fdf50a84804c90142c886 ] The OpenWrt distribution has switched from kernel longterm 6.6 to 6.12. Reports show that devices with the Realtek Otto switch platform die during operation and are rebooted by the watchdog. Sorting out other possible reasons the Otto timer is to blame. The platform currently consists of 4 targets with different hardware revisions. It is not 100% clear which devices and revisions are affected. Analysis shows: A more aggressive sched/deadline handling leads to more timer starts with small intervals. This increases the bug chances. See https://marc.info/?l=linux-kernel&m=175276556023276&w=2 Focusing on the real issue a hardware limitation on some devices was found. There is a minimal chance that a timer ends without firing an interrupt if it is reprogrammed within the 5us before its expiration time. Work around this issue by introducing a bounce() function. It restarts the timer directly before the normal restart functions as follows: - Stop timer - Restart timer with a slow frequency. - Target time will be >5us - The subsequent normal restart is outside the critical window Downstream has already tested and confirmed a patch. See https://github.com/openwrt/openwrt/pull/19468 https://forum.openwrt.org/t/support-for-rtl838x-based-managed-switches/57875... Signed-off-by: Markus Stockhausen markus.stockhausen@gmx.de Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Tested-by: Stephen Howell howels@allthatwemight.be Tested-by: Bjørn Mork bjorn@mork.no Link: https://lore.kernel.org/r/20250804080328.2609287-2-markus.stockhausen@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org
diff --git a/drivers/clocksource/timer-rtl-otto.c b/drivers/clocksource/timer-rtl-otto.c index 8a3068b36e752..8be45a11fb8b6 100644 --- a/drivers/clocksource/timer-rtl-otto.c +++ b/drivers/clocksource/timer-rtl-otto.c @@ -38,6 +38,7 @@ #define RTTM_BIT_COUNT 28 #define RTTM_MIN_DELTA 8 #define RTTM_MAX_DELTA CLOCKSOURCE_MASK(28) +#define RTTM_MAX_DIVISOR GENMASK(15, 0) /*
- Timers are derived from the LXB clock frequency. Usually this is a fixed
@@ -112,6 +113,22 @@ static irqreturn_t rttm_timer_interrupt(int irq, void *dev_id) return IRQ_HANDLED; } +static void rttm_bounce_timer(void __iomem *base, u32 mode) +{
- /*
* When a running timer has less than ~5us left, a stop/start sequence* might fail. While the details are unknown the most evident effect is* that the subsequent interrupt will not be fired.** As a workaround issue an intermediate restart with a very slow* frequency of ~3kHz keeping the target counter (>=8). So the follow* up restart will always be issued outside the critical window.*/- rttm_disable_timer(base);
- rttm_enable_timer(base, mode, RTTM_MAX_DIVISOR);
+}
static void rttm_stop_timer(void __iomem *base) { rttm_disable_timer(base); @@ -129,6 +146,7 @@ static int rttm_next_event(unsigned long delta, struct clock_event_device *clkev struct timer_of *to = to_timer_of(clkevt); RTTM_DEBUG(to->of_base.base);
- rttm_bounce_timer(to->of_base.base, RTTM_CTRL_COUNTER); rttm_stop_timer(to->of_base.base); rttm_set_period(to->of_base.base, delta); rttm_start_timer(to, RTTM_CTRL_COUNTER);
@@ -141,6 +159,7 @@ static int rttm_state_oneshot(struct clock_event_device *clkevt) struct timer_of *to = to_timer_of(clkevt); RTTM_DEBUG(to->of_base.base);
- rttm_bounce_timer(to->of_base.base, RTTM_CTRL_COUNTER); rttm_stop_timer(to->of_base.base); rttm_set_period(to->of_base.base, RTTM_TICKS_PER_SEC / HZ); rttm_start_timer(to, RTTM_CTRL_COUNTER);
@@ -153,6 +172,7 @@ static int rttm_state_periodic(struct clock_event_device *clkevt) struct timer_of *to = to_timer_of(clkevt); RTTM_DEBUG(to->of_base.base);
- rttm_bounce_timer(to->of_base.base, RTTM_CTRL_TIMER); rttm_stop_timer(to->of_base.base); rttm_set_period(to->of_base.base, RTTM_TICKS_PER_SEC / HZ); rttm_start_timer(to, RTTM_CTRL_TIMER);
this patch is part of a series of 4 patches, but it seems you have only cherry-picked patches 1 and 3 from that series, although all 4 were merged into Linus' tree:
https://lore.kernel.org/all/20250804080328.2609287-1-markus.stockhausen@gmx....
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/drive...
I could only find the "linux-stable-commits" mails for queue-6.17, but you selected the same 2 patches for queue-6.12 as well. Is that selection of only 2 of the 4 patches intentional?
Regards Pascaö