On Sun, Jun 01, 2025 at 07:24:14PM -0400, Sasha Levin wrote:
From: Brian Norris <briannorris@chromium.org>
[ Upstream commit 788019eb559fd0b365f501467ceafce540e377cc ]
Affinity-managed interrupts can be shut down and restarted during CPU hotunplug/plug, which may leave the interrupt in an unexpected state. Specifically:
- Interrupt is affine to CPU N
- disable_irq() -> depth is 1
- CPU N goes offline
- irq_shutdown() -> depth is set to 1 (again)
- CPU N goes online
- irq_startup() -> depth is set to 0 (BUG! driver expects that the interrupt is still disabled)
- enable_irq() -> depth underflow / unbalanced enable_irq() warning
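The sequence above can be replayed with a toy counter. This is only a minimal user-space sketch, not kernel code: struct toy_irq and the toy_* helpers are hypothetical names that mimic how the list describes the disable depth changing, nothing more.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the disable depth; not kernel code. */
struct toy_irq {
	unsigned int depth;	/* 0 means enabled, >0 means disabled */
	bool warned;		/* set when an unbalanced enable is seen */
};

/* Models disable_irq(): every call nests one level deeper. */
static void toy_disable_irq(struct toy_irq *irq)
{
	irq->depth++;
}

/* Models enable_irq(): flags a warning instead of underflowing at depth 0. */
static void toy_enable_irq(struct toy_irq *irq)
{
	if (irq->depth == 0) {
		irq->warned = true;
		return;
	}
	irq->depth--;
}

/* Models the old irq_shutdown() as described: depth is forced to 1. */
static void toy_shutdown_old(struct toy_irq *irq)
{
	irq->depth = 1;
}

/* Models irq_startup() as described: depth is reset to 0 unconditionally. */
static void toy_startup_old(struct toy_irq *irq)
{
	irq->depth = 0;
}

/* Replays the listed steps and reports whether the warning fired. */
static bool toy_replay_bug(void)
{
	struct toy_irq irq = { .depth = 0, .warned = false };

	toy_disable_irq(&irq);		/* driver: depth 1 */
	toy_shutdown_old(&irq);		/* CPU N offline: depth 1 (again) */
	toy_startup_old(&irq);		/* CPU N online: depth 0, disable is lost */
	toy_enable_irq(&irq);		/* driver: depth already 0, warning */
	return irq.warned;
}
```

In this model toy_replay_bug() returns true, because the driver's own disable has been wiped out by the time it calls the enable helper.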
This is only a problem for managed interrupts and CPU hotplug. All other cases, like request()/free()/request(), truly need to reset a possibly stale disable depth value.
Provide a startup function, which takes the disable depth into account, and invoke it for the managed interrupts in the CPU hotplug path.
This requires changing irq_shutdown() to increment the depth instead of setting it to 1, which retains the disable depth. The change is harmless for the other code paths using irq_startup(), which still reset the disable depth unconditionally to keep the original correct behaviour.
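As a rough illustration of the changelog's description, the depth handling could be modelled as below. This is a user-space sketch with hypothetical names (struct depth_model, model_*), not the code from the commit. In particular, having the depth-aware startup decrement the depth and start the interrupt only when it reaches 0 is an assumption about what "takes the disable depth into account" means.

```c
#include <stdbool.h>

/* Hypothetical user-space model; names and logic are illustrative only. */
struct depth_model {
	unsigned int depth;	/* 0 means enabled, >0 means disabled */
	bool started;		/* whether the interrupt is started up */
};

/* Shutdown as described in the changelog: increment instead of set to 1. */
static void model_shutdown(struct depth_model *m)
{
	m->started = false;
	m->depth++;
}

/* Unconditional startup (request()/free()/request()): reset stale depth. */
static void model_startup(struct depth_model *m)
{
	m->depth = 0;
	m->started = true;
}

/*
 * Assumed depth-aware startup for the hotplug path: undo the increment
 * done at shutdown and start only if no other disable is outstanding.
 */
static void model_startup_managed(struct depth_model *m)
{
	m->depth--;
	if (m->depth == 0)
		m->started = true;
}

/* disable, offline, online: returns the depth left for the driver. */
static unsigned int model_hotplug_while_disabled(void)
{
	struct depth_model m = { .depth = 0, .started = true };

	m.depth++;			/* disable_irq(): depth 1 */
	model_shutdown(&m);		/* CPU N offline: depth 2 */
	model_startup_managed(&m);	/* CPU N online: depth 1, still disabled */
	return m.depth;
}
```

With this model the driver's disable survives the offline/online cycle (the depth comes back as 1), so a later enable_irq() would be balanced, while model_startup() still clears a stale depth for the non-hotplug paths.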
KUnit tests will be added separately to cover some of these aspects.
[ tglx: Massaged changelog ]
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250514201353.3481400-2-briannorris@chromium.or...
Signed-off-by: Sasha Levin <sashal@kernel.org>
This one breaks suspend on laptops like the Lenovo ThinkPad T14s. The issue was just reported by Alex here:
https://lore.kernel.org/lkml/24ec4adc-7c80-49e9-93ee-19908a97ab84@gmail.com/
Please drop from all stable queues for now.
Johan