On 17.04.23 09:37, Acid Bong wrote:
So, I followed your advice and used the sources (6.3-rc6). I even compiled two versions: one with my config (cf. head letter) and one with the Arch Linux config (I'm using Gentoo, but it still fits well), both updated with `olddefconfig`, just to make sure that the problem is independent of the config.
Good news: I experienced the hanging 3 times with both kernels yesterday.
Two of them were on the custom kernel, and they were of the rare kind - they occurred on shutdown. It proceeds normally: init disables the services, unmounts the filesystems, turns off the screen, but then - no response, and the LED and the fan are still on. Another couple of shutdowns went normally, so the issue is still irregular.
One happened later on the Arch-based one and after a suspend.
/var/log/kern.log showed nothing specific in all cases.
Bad news: it seems the fix hasn't arrived yet.
How do I proceed next?
Ideally you should still try to bisect this to find the change that causes your problems.
But I'm CCing the ACPI and PCI maintainers nevertheless, now that it's clear that it happens in vanilla mainline, too. *If* you are lucky they have an idea what might be wrong and can point you in a direction to narrow the cause down. But if you are unlucky, they will have no idea and just ignore this until you bisect the problem.
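In case it helps, starting a bisection could look roughly like the sketch below. The helper name `start_bisect` and the `~/linux-stable` path are made up for illustration, and the tags assume the problem really entered between 6.1.3 and 6.1.6, as the report suggests; adjust them to whatever good/bad pair you can confirm.

```shell
# Sketch only: start a bisection in a kernel git tree.
# Arguments: tree path, last known-good tag, first known-bad tag.
start_bisect() {
    tree=$1 good=$2 bad=$3
    cd "$tree" || return 1
    git bisect start &&
    git bisect bad "$bad" &&
    git bisect good "$good"
}

# Assumed usage (hypothetical path, tags taken from the report):
#   start_bisect ~/linux-stable v6.1.3 v6.1.6
# Then, for every commit git checks out:
#   make olddefconfig && make -j"$(nproc)"   # build, then install and boot it
#   ... suspend and power off several times, since the hang is irregular ...
#   git bisect good    # or: git bisect bad
# Once git names the first bad commit:
#   git bisect log     # worth including in the reply
#   git bisect reset   # return to the original checkout
```

Because the hang does not show up on every suspend or shutdown, a kernel should only be marked good after it has survived noticeably more cycles than it usually takes to trigger the problem.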
FWIW, Rafael, Bjorn: the thread starts here: https://lore.kernel.org/all/CRVU11I7JJWF.367PSO4YAQQEI@bong/
To quote some parts of it:

```
Sometimes when I suspend (by closing the lid, less often - by pressing Fn+F1 (sleep key combo)) or poweroff my laptop (both by pressing the power button and running "loginctl poweroff"), it goes into a state where it doesn't respond to opening/closing the lid, the power button or Ctrl+Alt+Del, but, unlike in sleep mode, the fan is rotating and the "awake status" LED is on.

[...]

The issue appeared when I was using pf-kernel with genpatches and updated from 6.1-pf2 to 6.1-pf3 (corresponding to vanilla versions 6.1.3 -> 6.1.6). I used that fork until 6.2-pf2, but since then (early March) moved to vanilla sources and started following the 6.1.y branch when it was declared LTS. And the issue was present on all of them.
```
P.S. On the `pci=nomsi` case: I don't consider it related to the issue we're discussing. To me it looks like a hardware issue that can be bypassed by reconfiguration.
I wouldn't be so sure about that.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.