Hi, Thorsten here, the Linux kernel's regression tracker.
I noticed a report about a linux-6.6.y regression in bugzilla.kernel.org that appears to be caused by this commit from Dan applied by Greg:
15fffc6a5624b1 ("driver core: Fix uevent_show() vs driver detach race") [v6.11-rc3, v6.10.5, v6.6.46, v6.1.105, v5.15.165, v5.10.224, v5.4.282, v4.19.320]
The reporter has not yet checked whether mainline is affected; I decided to forward the report by mail nevertheless, as the maintainer for the subsystem is also the maintainer for the stable tree. ;-)
To quote from https://bugzilla.kernel.org/show_bug.cgi?id=219244 :
The symptoms of this bug are as follows:
- After booting (to the graphical login screen) the mouse pointer
would be frozen, and only after unplugging and re-plugging the usb plug of the mouse would the mouse work as expected.
- If one would log in without fixing the mouse issue, the mouse
pointer would still be frozen after login.
- The usb keyboard usually is not affected even though plugged into
the same usb-hub - thus logging in is possible.
- The mouse pointer is also frozen if the usb connector is plugged
into a different usb-port (different from the usb-hub)
- The pointer is moveable via the inbuilt synaptics trackpad
The kernel log shows almost the same messages (not sure if the differences mean anything in regards to this bug) for the initial recognition of the mouse (frozen mouse pointer) and for the re-plugged-in mouse (subsequently moveable mouse pointer):
[kernel] [ 8.763158] usb 1-2.2.1.2: new low-speed USB device number 10 using xhci_hcd
[kernel] [ 8.956028] usb 1-2.2.1.2: New USB device found, idVendor=045e, idProduct=00cb, bcdDevice= 1.04
[kernel] [ 8.956036] usb 1-2.2.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[kernel] [ 8.956039] usb 1-2.2.1.2: Product: Microsoft Basic Optical Mouse v2.0
[kernel] [ 8.956041] usb 1-2.2.1.2: Manufacturer: Microsoft
[kernel] [ 8.963554] input: Microsoft Microsoft Basic Optical Mouse v2.0 as /devices/pci0000:00/0000:00:14.0/usb1/1-2/1-2.2/1-2.2.1/1-2.2.1.2/1-2.2.1.2:1.0/0003:045E:00CB.0002/input/input18
[kernel] [ 8.964417] hid-generic 0003:045E:00CB.0002: input,hidraw1: USB HID v1.11 Mouse [Microsoft Microsoft Basic Optical Mouse v2.0 ] on usb-0000:00:14.0-2.2.1.2/input0

[kernel] [ 31.258381] usb 1-2.2.1.2: USB disconnect, device number 10
[kernel] [ 31.595051] usb 1-2.2.1.2: new low-speed USB device number 16 using xhci_hcd
[kernel] [ 31.804002] usb 1-2.2.1.2: New USB device found, idVendor=045e, idProduct=00cb, bcdDevice= 1.04
[kernel] [ 31.804010] usb 1-2.2.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[kernel] [ 31.804013] usb 1-2.2.1.2: Product: Microsoft Basic Optical Mouse v2.0
[kernel] [ 31.804016] usb 1-2.2.1.2: Manufacturer: Microsoft
[kernel] [ 31.812933] input: Microsoft Microsoft Basic Optical Mouse v2.0 as /devices/pci0000:00/0000:00:14.0/usb1/1-2/1-2.2/1-2.2.1/1-2.2.1.2/1-2.2.1.2:1.0/0003:045E:00CB.0004/input/input20
[kernel] [ 31.814028] hid-generic 0003:045E:00CB.0004: input,hidraw1: USB HID v1.11 Mouse [Microsoft Microsoft Basic Optical Mouse v2.0 ] on usb-0000:00:14.0-2.2.1.2/input0
Differences:
../0003:045E:00CB.0002/input/input18 vs ../0003:045E:00CB.0004/input/input20
and
hid-generic 0003:045E:00CB.0002 vs hid-generic 0003:045E:00CB.0004
The connector / usb-port was not changed in this case!
The symptoms of this bug were present at one point in the recent past, but with kernel v6.6.45 (or maybe even some version before that) they were gone. With v6.6.45 it definitely seems to be fine.
But with v6.6.46 the symptoms returned. That's the reason I suspected the kernel to be the cause of this issue. So I did some bisecting - which wasn't easy because that bug would often times not appear if the system was simply rebooted into the test kernel. As the bug would definitely appear on the affected kernels (v6.6.46 ff) after shutting down the system for the night and booting the next day, I resorted to simulating the over-night powering-off by shutting the system down, unplugging the power and pressing the power button to get rid of residual voltage. But even then a few times the bug would only appear if I repeated this procedure before booting the system again with the respective kernel.
This is on a Thinkpad with Kaby Lake and integrated Intel graphics. Even though it is a laptop, it is used as a desktop device, and the internal battery is disconnected, the removable battery is removed as the system is plugged-in via the power cord at all times (when in use)! Also, the system has no power (except for the bios battery, of course) over night as the power outlet is switched off if the device is not in use.
Not sure if this affects the issue - or how it does. But for successful bisecting I had to resort to the above procedure.
Bisecting the issue (between the release commits of v6.6.45 and v6.6.46) resulted in this commit [1] being the probable culprit.
I then tested kernel v6.6.49. It still produced the bug for me. So I reverted the changes of the assumed "bad commit" and re-compiled kernel v6.6.49. With this modified kernel the bug seems to be gone.
Now, I assume the commit has a reason for being introduced, but maybe needs some adjusting in order to avoid this bug I'm experiencing on my system. Also, I can't say why the issue appeared in the past even without this commit being present, as I haven't bisected any kernel version before v6.6.45.
[1]:
4d035c743c3e391728a6f81cbf0f7f9ca700cf62 is the first bad commit
commit 4d035c743c3e391728a6f81cbf0f7f9ca700cf62
Author: Dan Williams dan.j.williams@intel.com
Date: Fri Jul 12 12:42:09 2024 -0700

driver core: Fix uevent_show() vs driver detach race

commit 15fffc6a5624b13b428bb1c6e9088e32a55eb82c upstream.

uevent_show() wants to de-reference dev->driver->name. There is no clean
See the ticket for more details. Note, you have to use bugzilla to reach the reporter, as I sadly[1] cannot CC them in mails like this.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
[1] because bugzilla.kernel.org tells users upon registration their "email address will never be displayed to logged out users"
P.S.: let me use this mail to also add the report to the list of tracked regressions to ensure it doesn't fall through the cracks:
#regzbot introduced: 4d035c743c3e391728a6f81cbf0f7f9ca700cf62
#regzbot from: brmails+k
#regzbot duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=219244
#regzbot title: driver core: frozen usb mouse pointer at boot
#regzbot ignore-activity