On Sat, Feb 15, 2025 at 01:48:34AM +0100, Holger Hoffstätte wrote:
On 2025-02-15 00:18, Linus Torvalds wrote:
Adding more people: Peter / Phil / Waiman. Juri was already on the list earlier.
On Fri, 14 Feb 2025 at 02:12, Holger Hoffstätte <holger@applied-asynchrony.com> wrote:
Whoop! Whoop! The sound of da police!
2ce2a62881abcd379b714bf41aa671ad7657bdd2 is the first bad commit
commit 2ce2a62881abcd379b714bf41aa671ad7657bdd2 (HEAD)
Author: Juri Lelli <juri.lelli@redhat.com>
Date:   Fri Nov 15 11:48:29 2024 +0000

    sched/deadline: Check bandwidth overflow earlier for hotplug

    [ Upstream commit 53916d5fd3c0b658de3463439dd2b7ce765072cb ]
With this reverted it reliably suspends again.
Can you check that it works (or - more likely - doesn't work) in upstream?
That commit 53916d5fd3c0 ("sched/deadline: Check bandwidth overflow earlier for hotplug") got merged during the current merge window, so it would be lovely if you can check whether current -git (or just the latest 6.14-rc) works for you, or has the same breakage.
Background for new people on the participants list: original report at
https://lore.kernel.org/all/e7096ec2-68db-fc3e-9c48-f20d3e80df72@applied-asy...
which says
Common symptom on all machines seems to be
[ +0.000134] Disabling non-boot CPUs ...
[ +0.000072] Error taking CPU15 down: -16
[ +0.000002] Non-boot CPUs are not disabled
and this bisection result is from
https://lore.kernel.org/all/9a44f314-c101-4ed1-98ad-547c84df7cdd@applied-asy...
and if it breaks in 6.13 -stable, I would expect the same in the current tree. Unless there's some non-obvious interaction with something else?
I just booted into current 6.14-git and could suspend/wakeup multiple times without any problem - no reverting necessary, so that is good.
As for 6.12/6.13 it might be necessary to revert an accompanying commit as well since it seems to cause test failures with hotplug, as documented here:
https://lore.kernel.org/stable/bcf76664-e77c-44b3-b78f-bcefc7aa3fc1@nvidia.c...
..but I don't know anything about that; I just wanted to find the patch causing the suspend problem. Other than that 6.13.3-rc2 works fine.
Not sure if that was useful information. :)
Yes, thanks, I'll go drop this other patch from the stable queues as taking only 1 of a 3 patch series generally isn't good :)
greg k-h