Hi,
On Bugzilla, danilrybakov249@gmail.com reported a stable-specific ACPI error regression that led to high CPU temperature [1]. He wrote:
Overview:
After updating from lts v6.6.14-2 to lts v6.6.17-1 noticed high CPU temperature and lag. After running htop noticed that journald was using 30-60% of CPU. Afterwards, tried switching to stable, or lts v6.6.18-1, but encountered the same issue.
Running journalctl -f gives these lines over and over again:
Feb 19 21:09:12 danirybe kernel: ACPI Error: Could not disable RealTimeClock events (20230628/evxfevnt-243)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 08, disabling event (20230628/evgpe-839)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 0A, disabling event (20230628/evgpe-839)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 0B, disabling event (20230628/evgpe-839)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - PM_Timer (0), disabling (20230628/evevent-255)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - PowerButton (2), disabling (20230628/evevent-255)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - SleepButton (3), disabling (20230628/evevent-255)
My system info:
Laptop model: ASUS VivoBook D540NV-GQ065T
OS: Arch Linux x86_64
Kernel: 6.6.14-2-lts
WM: sway
CPU: Intel Pentium N420 (4) @ 2.500GHz
GPU1: Intel Apollo Lake [HD Graphics 505]
GPU2: NVIDIA GeForce 920MX
I've pinned down the commit after which the problem occurs:
847e1eb30e269a094da046c08273abe3f3361cf2 is the first bad commit
commit 847e1eb30e269a094da046c08273abe3f3361cf2
Author: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Date:   Mon Jan 8 15:20:58 2024 +0900

    platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe

    commit 5913320eb0b3ec88158cfcb0fa5e996bf4ef681b upstream.
<snipped>...
See Bugzilla for the full thread.
Thanks.
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218531
On Feb 27, 2024 / 15:22, Bagas Sanjaya wrote:
Hi,
On Bugzilla, danilrybakov249@gmail.com reported a stable-specific ACPI error regression that led to high CPU temperature [1]. He wrote:
Thanks for the report, and sorry for the trouble.
Overview:
After updating from lts v6.6.14-2 to lts v6.6.17-1 noticed high CPU temperature and lag. After running htop noticed that journald was using 30-60% of CPU. Afterwards, tried switching to stable, or lts v6.6.18-1, but encountered the same issue.
Running journalctl -f gives these lines over and over again:
Feb 19 21:09:12 danirybe kernel: ACPI Error: Could not disable RealTimeClock events (20230628/evxfevnt-243)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 08, disabling event (20230628/evgpe-839)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 0A, disabling event (20230628/evgpe-839)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 0B, disabling event (20230628/evgpe-839)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - PM_Timer (0), disabling (20230628/evevent-255)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - PowerButton (2), disabling (20230628/evevent-255)
Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - SleepButton (3), disabling (20230628/evevent-255)
My system info:
Laptop model: ASUS VivoBook D540NV-GQ065T
OS: Arch Linux x86_64
Kernel: 6.6.14-2-lts
WM: sway
CPU: Intel Pentium N420 (4) @ 2.500GHz
I think this CPU belongs to the Goldmont microarchitecture group. That group is handled in a somewhat unique way in drivers/platform/x86/p2sb.c. I guess the commit affected the handling of the P2SB resource on machines with that architecture.
GPU1: Intel Apollo Lake [HD Graphics 505]
GPU2: NVIDIA GeForce 920MX
I've pinned down the commit after which the problem occurs:
847e1eb30e269a094da046c08273abe3f3361cf2 is the first bad commit
commit 847e1eb30e269a094da046c08273abe3f3361cf2
Author: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Date:   Mon Jan 8 15:20:58 2024 +0900

    platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe

    commit 5913320eb0b3ec88158cfcb0fa5e996bf4ef681b upstream.
<snipped>...
See Bugzilla for the full thread.
Thanks.
I do not have access to the hardware. As I commented on the Bugzilla link above, I would like to ask for help with debugging.
On Tue, Feb 27, 2024 at 09:57:28AM +0000, Shinichiro Kawasaki wrote:
On Feb 27, 2024 / 15:22, Bagas Sanjaya wrote:
On Bugzilla, danilrybakov249@gmail.com reported a stable-specific ACPI error regression that led to high CPU temperature [1]. He wrote:
Thanks for the report, and sorry for the trouble.
Heads up. The problem seems to be with the caching algorithm, which includes function 0 in the scan. Investigation and development of a fix are in progress.
[TLDR: I'm adding this report to the list of tracked Linux kernel regressions; the text you find below is based on a few template paragraphs you might have encountered already in similar form. See link in footer if these mails annoy you.]
On 27.02.24 09:22, Bagas Sanjaya wrote:
On Bugzilla, danilrybakov249@gmail.com reported a stable-specific ACPI error regression that led to high CPU temperature [1]. He wrote: [...]
#regzbot ^introduced 847e1eb30e269a094da046c08273abe3f3361cf2
#regzbot duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=218531
#regzbot title platform/x86: p2sb: Continuous ACPI errors resulting in high CPU usage by journald
#regzbot ignore-activity
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
Hi,
On Thu, Feb 29, 2024 at 09:49:08AM +0100, Linux regression tracking #adding (Thorsten Leemhuis) wrote:
[TLDR: I'm adding this report to the list of tracked Linux kernel regressions; the text you find below is based on a few template paragraphs you might have encountered already in similar form. See link in footer if these mails annoy you.]
On 27.02.24 09:22, Bagas Sanjaya wrote:
On Bugzilla, danilrybakov249@gmail.com reported a stable-specific ACPI error regression that led to high CPU temperature [1]. He wrote: [...]
#regzbot ^introduced 847e1eb30e269a094da046c08273abe3f3361cf2
#regzbot duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=218531
#regzbot title platform/x86: p2sb: Continuous ACPI errors resulting in high CPU usage by journald
#regzbot ignore-activity
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
Everything you wanna know about Linux kernel regression tracking: https://linux-regtracking.leemhuis.info/about/#tldr That page also explains what to do if mails like this annoy you.
The fix for this issue seems to have landed in mainline:
aec7d25b497c ("platform/x86: p2sb: On Goldmont only cache P2SB and SPI devfn BAR")
Regards, Salvatore
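
For readers following the thread, here is a minimal user-space sketch of the difference between the two caching strategies discussed above: scanning every function of the P2SB slot when P2SB sits at function 0, versus caching only the P2SB and SPI devfns on Goldmont, as the title of the fix suggests. It is not the driver code. The helper names devfns_scan_all() and devfns_scan_narrow() are hypothetical, and the Goldmont slot/function numbers (13.0 for P2SB, 13.2 for SPI) are assumptions for illustration that should be checked against drivers/platform/x86/p2sb.c. Only the PCI_DEVFN/PCI_SLOT/PCI_FUNC bit layout mirrors the kernel's macros.

```c
#include <stdio.h>

/* Same bit layout as the kernel's PCI_DEVFN/PCI_SLOT/PCI_FUNC macros. */
#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)       ((devfn) & 0x07)

/* ASSUMED values, for illustration only; verify against p2sb.c. */
#define P2SB_DEVFN_GOLDMONT PCI_DEVFN(13, 0)
#define SPI_DEVFN_GOLDMONT  PCI_DEVFN(13, 2)
#define NR_FUNCS            8 /* a PCI slot has at most 8 functions */

/*
 * Hypothetical model of the broad strategy: if P2SB is function 0,
 * every function of its slot is scanned; otherwise only P2SB itself.
 * Fills out[] (room for NR_FUNCS entries) and returns the count.
 */
int devfns_scan_all(unsigned int p2sb, unsigned int *out)
{
    int n = 0;

    if (PCI_FUNC(p2sb) == 0) {
        for (unsigned int fn = 0; fn < NR_FUNCS; fn++)
            out[n++] = PCI_DEVFN(PCI_SLOT(p2sb), fn);
    } else {
        out[n++] = p2sb;
    }
    return n;
}

/*
 * Hypothetical model of the narrow strategy: always P2SB itself,
 * plus the SPI devfn only when P2SB is at the assumed Goldmont devfn.
 */
int devfns_scan_narrow(unsigned int p2sb, unsigned int *out)
{
    int n = 0;

    out[n++] = p2sb;
    if (p2sb == P2SB_DEVFN_GOLDMONT)
        out[n++] = SPI_DEVFN_GOLDMONT;
    return n;
}

/* Print a devfn list as slot.function pairs. */
void print_devfns(const char *label, const unsigned int *v, int n)
{
    printf("%s:", label);
    for (int i = 0; i < n; i++)
        printf(" %02x.%u", PCI_SLOT(v[i]), PCI_FUNC(v[i]));
    printf("\n");
}
```

Calling both helpers with P2SB_DEVFN_GOLDMONT and passing the results to print_devfns() shows the broad strategy touching all eight functions of the slot while the narrow one touches two, which is the reduction in scanned devices that the fix title describes.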