Hi,
I'm seeing a reproducible kernel oops on my home router after updating from 5.15.181 to 5.15.189:
kernel BUG at lib/list_debug.c:50!
invalid opcode: 0000 [#1] SMP NOPTI
..
Call Trace:
 <TASK>
 drr_qlen_notify+0x11/0x50 [sch_drr]
 qdisc_tree_reduce_backlog+0x93/0xf0
 drr_graft_class+0x109/0x220 [sch_drr]
 qdisc_graft+0xdd/0x510
 ? qdisc_create+0x335/0x510
 tc_modify_qdisc+0x53f/0x9d0
 rtnetlink_rcv_msg+0x134/0x370
 ? __getblk_gfp+0x22/0x230
 ? rtnl_calcit.isra.38+0x130/0x130
 netlink_rcv_skb+0x4f/0x100
 rtnetlink_rcv+0x10/0x20
 netlink_unicast+0x1d2/0x2a0
 netlink_sendmsg+0x22a/0x480
 ? netlink_broadcast+0x20/0x20
 ____sys_sendmsg+0x25f/0x280
 ? copy_msghdr_from_user+0x5b/0x90
 ___sys_sendmsg+0x77/0xc0
 ? __sys_recvmsg+0x5a/0xb0
 ? do_filp_open+0xc3/0x120
 __sys_sendmsg+0x5d/0xb0
 __x64_sys_sendmsg+0x1a/0x20
 x64_sys_call+0x17f1/0x1c80
 do_syscall_64+0x53/0x80
 ? exit_to_user_mode_prepare+0x2c/0x140
 ? irqentry_exit_to_user_mode+0xe/0x20
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x1e7/0x610
 ? do_syscall_64+0x5f/0x80
 entry_SYSCALL_64_after_hwframe+0x6c/0xd6
..
RIP: 0010:__list_del_entry_valid.cold.1+0xf/0x69
syzbot reported a similar-looking issue here:
[v5.15] BUG: unable to handle kernel paging request in drr_qlen_notify https://groups.google.com/g/syzkaller-lts-bugs/c/_QJHiMHwfRw/m/2j1nSU1hBgAJ
and here:
"[syzbot] [net?] general protection fault in drr_qlen_notify" https://www.spinics.net/lists/netdev/msg1105420.html
syzbot bisected it to:
****************************************
commit e269f29e9395527bc00c213c6b15da04ebb35070
Refs: v5.15.186-114-ge269f29e9395
Author:     Lion Ackermann <nnamrec@gmail.com>
AuthorDate: Mon Jun 30 15:27:30 2025 +0200
Commit:     Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CommitDate: Thu Jul 10 15:57:46 2025 +0200

    net/sched: Always pass notifications when child class becomes empty

    [ Upstream commit 103406b38c600fec1fe375a77b27d87e314aea09 ]
****************************************
The last line of the commit message mentions:
"This is not a problem after the recent patch series that made all the classful qdiscs qlen_notify() handlers idempotent."
It looks like the "idempotent" patches are missing from the 5.15 stable series.
Like this one: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/ne...
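For illustration, here is a minimal userspace sketch of what "idempotent" means for such a handler. It is not the kernel code: struct demo_class, demo_class_is_active(), demo_qlen_notify() and demo_double_notify() are made-up names, and the list helpers only mimic the kernel's list_head API. The assumption is that the handler unlinks the class from an active list, so calling it a second time on an already unlinked entry is only safe if the entry was re-initialised on removal:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry and make it point at itself again, so that a
 * repeated call only touches the entry's own pointers. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* Hypothetical class that is linked into an "active" list. */
struct demo_class {
	struct list_head alist;
};

static bool demo_class_is_active(const struct demo_class *cl)
{
	return !list_empty(&cl->alist);
}

/* Idempotent notify handler: safe to call on an inactive class. */
static void demo_qlen_notify(struct demo_class *cl)
{
	list_del_init(&cl->alist);
}

/* Returns 0 if a class survives being notified twice in a row. */
static int demo_double_notify(void)
{
	struct list_head active;
	struct demo_class cl;

	INIT_LIST_HEAD(&active);
	INIT_LIST_HEAD(&cl.alist);

	list_add_tail(&cl.alist, &active);	/* class becomes active */
	assert(demo_class_is_active(&cl));

	demo_qlen_notify(&cl);			/* first call unlinks the class */
	demo_qlen_notify(&cl);			/* second call is a no-op */

	return (list_empty(&active) && !demo_class_is_active(&cl)) ? 0 : 1;
}
```

A plain removal that leaves the entry's pointers stale or poisoned would make the second call operate on invalid pointers, which is the kind of corruption the check in lib/list_debug.c is there to catch.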
I've tried Ubuntu's backport for 5.15: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy/com...
It seems to be identical to: https://lore.kernel.org/stable/bcf9c70e9cf750363782816c21c69792f6c81cd9.1754...
While the kernel didn't oops anymore with the patch applied, the network traffic behaved erratically: TCP traffic worked, but ICMP seemed "stuck". tcpdump showed no ICMP traffic on the ppp device.
Tomorrow I will try to reproduce the issue on a test VM.
Anything else I should try?
Thanks in advance, Thomas
Hello again,
You wrote on Thu, Aug 14, 2025 at 11:13:32PM +0200:
> I'm seeing a reproducible kernel oops on my home router after updating from 5.15.181 to 5.15.189: ..
> It seems to be identical to: https://lore.kernel.org/stable/bcf9c70e9cf750363782816c21c69792f6c81cd9.1754...
> While the kernel didn't oops anymore with the patch applied, the network traffic behaved erratically: TCP traffic worked, but ICMP seemed "stuck". tcpdump showed no ICMP traffic on the ppp device.
I didn't realize yesterday that the bandwidth management script uses both drr and hfsc.
There is an upcoming patch series proposed for the 5.15 stable queue:
1. "[PATCH 5.15, 5.10 2/6] sch_drr: make drr_qlen_notify() idempotent" https://lore.kernel.org/stable/bcf9c70e9cf750363782816c21c69792f6c81cd9.1754...
2. "[PATCH 5.15, 5.10 3/6] sch_hfsc: make hfsc_qlen_notify() idempotent" https://lore.kernel.org/stable/8f1d425178ad93064465e15c68b38890b10b5814.1754...
With those two patches applied, kernel 5.15.189 doesn't oops anymore. If I drop the hfsc-related patch, ICMP is broken as reported.
Thanks for all the hard work on the stable kernel series! It's highly appreciated.
Cheers, Thomas