On Thu, Mar 06, 2025 at 09:06:23PM +0000, Colin Evans wrote:
Please try collecting a usbmon trace for bus 2 showing the problem. Ideally the trace should show what happens from system boot-up, but there's no way to do that. Instead, you can do this (the first command below disables the bus, the second starts the usbmon trace, and the third re-enables the bus):
echo 0 >/sys/bus/usb/devices/usb2/bConfigurationValue
cat /sys/kernel/debug/usb/usbmon/2u >usbmon.txt &
echo 1 >/sys/bus/usb/devices/usb2/bConfigurationValue
Then after enough time has passed for the errors to show up, kill the "cat" process and post the resulting trace file. (Note: If your keyboard is attached to bus 2, you won't be able to use it to issue the second and third commands. You could use a network login, or put the commands into a shell file and run them that way.)
In fact, you should do this twice: The second time, run it on machine 2 with the powered hub plugged in to suppress the errors.
Alan Stern
Happy to try this, but as it stands there is no such file, or file-like thing, on my machine-
# ls /sys/kernel/debug/usb/usbmon/2u
ls: cannot access '/sys/kernel/debug/usb/usbmon/2u': No such file or directory
# find /sys/kernel/debug/usb -name "2u"
#
# ls /sys/kernel/debug/usb
devices ehci ohci uhci uvcvideo xhci
It seems something is missing?
Ah -- you have to load the usbmon module first:
modprobe usbmon
Some distributions do this for you automatically.
Alan Stern
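For anyone following along, the check-then-load step described above can be sketched as a small shell helper. This is only an illustration: the function names and the DEBUGFS override are assumptions introduced here, not part of the thread. The path and the "modprobe usbmon" call are the ones quoted above.

```shell
#!/bin/sh
# Sketch: make sure the usbmon text interface for a bus exists before tracing.
# DEBUGFS is an illustrative override so the helper can be pointed at a
# scratch directory; the default matches the path used in this thread.
DEBUGFS="${DEBUGFS:-/sys/kernel/debug}"

usbmon_file() {
    # $1 = bus number; prints the path of the "<bus>u" trace file
    printf '%s/usb/usbmon/%su\n' "$DEBUGFS" "$1"
}

ensure_usbmon() {
    # $1 = bus number; succeeds if the trace file is present
    f=$(usbmon_file "$1")
    [ -e "$f" ] && return 0
    # Not there yet: try loading the module (needs root), then re-check.
    modprobe usbmon 2>/dev/null || return 1
    [ -e "$f" ]
}
```

Run as root, something like "ensure_usbmon 2 && echo ready" would then report whether the bus 2 trace file is available.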
On 06/03/2025 21:43, Alan Stern wrote:
[...]
------------------------------------------------------
I believe I have the information requested. The output of usbmon for the "problem" scenario is large, I hope it doesn't exceed any email attachment limits, but if it does I will have to work out another way to share it.
It may be that 30s of data is more than is needed. If that's the case I can easily run a shorter usbmon cycle.
Additional Observations
-----------------------
It appears that having pretty much any external device plugged into a motherboard port connected to the _problem_ controller is enough to suppress the stream of "usb usb2-port4: Cannot enable. Maybe the USB cable is bad?" dmesg errors.
For these tests the results named "working" had a USB2.0 memory stick plugged into one of the top 4 USB ports on the motherboard, while the "problem" results didn't.
For info- the older machine that exhibits this problem ("machine 1") also shows device manager errors if booted into Windows 10, suggesting that machine may in fact have a motherboard hardware fault.
However "machine 2" (which is less than 2 weeks old), shows no errors when booted into Windows.
How the Results Were Generated
------------------------------
"working" results
-----------------
The command string used (after "modprobe usbmon") was-
timeout -k 30 30 echo 0 >/sys/bus/usb/devices/usb2/bConfigurationValue ; \
cat /sys/kernel/debug/usb/usbmon/2u >usbmon_filename.txt & \
echo 1 >/sys/bus/usb/devices/usb2/bConfigurationValue
I booted the machine with the USB stick connected, checked that there were no dmesg errors and the performance of 'lsusb' was sane, then ran the command above. I also ran-
lsusb -t > lsusb_t_filename.txt
to document the USB device structure.
"problem" results
-----------------
Rebooted the machine with nothing connected to the problem USB ports. Confirmed the issue was present (slow boot, dmesg errors etc.). Re-ran the same commands.
I hope this gives the information required.
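A note on the command string as quoted: "timeout" is followed by "echo 0 ... ;", so it appears to bound only that echo, while the backgrounded "cat" keeps running until it is killed by hand. A minimal sketch of a capture that stops on its own is below. The function name and the SYSFS/DEBUGFS overrides are assumptions made for illustration; the sysfs and debugfs paths are the ones used in this thread.

```shell
#!/bin/sh
# Sketch: capture a usbmon trace for one bus for a fixed number of seconds.
# SYSFS and DEBUGFS are illustrative overrides for dry runs; the defaults
# are the real locations used in this thread.
SYSFS="${SYSFS:-/sys}"
DEBUGFS="${DEBUGFS:-/sys/kernel/debug}"

capture_bus() {
    # $1 = bus number, $2 = duration in seconds, $3 = output file
    cfg="$SYSFS/bus/usb/devices/usb${1}/bConfigurationValue"
    mon="$DEBUGFS/usb/usbmon/${1}u"

    echo 0 >"$cfg"                    # disable the bus
    timeout "$2" cat "$mon" >"$3" &   # trace in the background, bounded by timeout
    echo 1 >"$cfg"                    # re-enable the bus
    wait                              # returns once the background cat has stopped
}
```

As root, "capture_bus 2 30 usbmon_problem.txt" would then correspond to the 30-second capture described above. A shorter duration could be passed instead if less data is wanted.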
On Sat, Mar 08, 2025 at 11:19:22PM +0000, Colin Evans wrote:
I believe I have the information requested. The output of usbmon for the "problem" scenario is large, I hope it doesn't exceed any email attachment limits, but if it does I will have to work out another way to share it.
It may be that 30s of data is more than is needed. If that's the case I can easily run a shorter usbmon cycle.
It is a lot more than needed, but that's okay.
Additional Observations
It appears that having pretty much any external device plugged into a motherboard port connected to the _problem_ controller is enough to suppress the stream of "usb usb2-port4: Cannot enable. Maybe the USB cable is bad?" dmesg errors.
For these tests the results named "working" had a USB2.0 memory stick plugged into one of the top 4 USB ports on the motherboard, while the "problem" results didn't.
For info- the older machine that exhibits this problem ("machine 1") also shows device manager errors if booted into Windows 10, suggesting that machine may in fact have a motherboard hardware fault.
However "machine 2" (which is less than 2 weeks old), shows no errors when booted into Windows.
Well, I have no idea what Windows is doing on that machine.
The usbmon trace shows that port 4 on bus 2 generates a continual stream of link-state-change events, constantly interrupting the system and consuming computational resources. That's why the performance goes way down.
I can't tell what's causing those link-state changes. It _looks_ like what you would get if there was an intermittent electrical connection causing random voltage fluctuations. Whatever the cause is, plugging in the memory stick does seem to suppress those changes; they don't show up at all in the "working" trace.
In theory, turning off power to port 4 might stop all the events from being reported. You can try this to see if it works:
echo 1 >/sys/bus/usb/devices/2-0:1.0/usb2-port4/disable
Alan Stern
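The port-disable suggestion above can be wrapped with a basic existence check, since the "disable" attribute may not be present on every kernel (that caveat is an assumption here, not something stated in the thread). The function names and the SYSFS override are hypothetical; the path layout follows the command quoted above.

```shell
#!/bin/sh
# Sketch: write 1 to a root-hub port's "disable" attribute, if it exists.
# SYSFS is an illustrative override; the default is the real sysfs mount.
SYSFS="${SYSFS:-/sys}"

port_attr() {
    # $1 = bus number, $2 = port number on that bus's root hub
    printf '%s/bus/usb/devices/%s-0:1.0/usb%s-port%s/disable\n' \
        "$SYSFS" "$1" "$1" "$2"
}

disable_port() {
    attr=$(port_attr "$1" "$2")
    if [ ! -w "$attr" ]; then
        echo "no writable disable attribute at $attr" >&2
        return 1
    fi
    echo 1 >"$attr"
}
```

For the port discussed in this thread that would be "disable_port 2 4", run as root.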
On 09/03/2025 21:01, Alan Stern wrote:
[...]
Thank you, that is very helpful, for a couple of reasons.
"Machine 2" is a new build, so if (as it sounds) the motherboard has a hardware problem, then I need to look into returning it.
BTW- it seems I spoke too soon about the USB stick suppressing the error. After a couple of reboots with it in place the problem re-occurred. It does seem that connecting a hub (switch) is the only way to reliably stop the error. The switch has a bunch of wiring connected to USB peripherals and other machines. I would have guessed that might make the likelihood of picking up electrical noise actually worse, but that seems not to be the case here.
"Machine 1" is several years old, it's actually the guts of the same PC that was upgraded to make M/c 2. It's not usable, or sellable, with this performance hit happening. I have tried all the external USB ports on this machine and not found the failing controller, my guess is it's going to be one that supports some of the on-board USB headers.
I had been looking on the web for a way to shut down the problem port, or worst case the whole hub, however all the Linux examples I found worked by either-
a) Preventing the loading of the driver for the chipset, by type. However that would kill all ports supported by the same type of controller, and this motherboard has multiple controllers of the same type onboard.
b) Shutting down a port by searching for the connected device identifier. However in these cases there _are_ no connected devices, the fault happens when the controller is not connected to anything.
Hopefully the command you recommended will do the trick, I will let you know.
Would I be correct in thinking this would need to be run at every boot, some time after device enumeration, or would it need to be run after every re-enumeration of devices after a USB device is connected / disconnected? Not sure how to achieve that.
I very much appreciate your help in identifying the fault. Thank you.
Regards: C Evans
On Sun, Mar 09, 2025 at 09:57:21PM +0000, Colin Evans wrote:
In theory, turning off power to port 4 might stop all the events from being reported. You can try this to see if it works:
echo 1 >/sys/bus/usb/devices/2-0:1.0/usb2-port4/disable
Alan Stern
Thank you, that is very helpful, for a couple of reasons.
"Machine 2" is a new build, so if (as it sounds) the motherboard has a hardware problem, then I need to look into returning it.
BTW- it seems I spoke too soon about the USB stick suppressing the error. After a couple of reboots with it in place the problem re-occurred. It does seem that connecting a hub (switch) is the only way to reliably stop the error. The switch has a bunch of wiring connected to USB peripherals and other machines. I would have guessed that might make the likelihood of picking up electrical noise actually worse, but that seems not to be the case here.
It may have something to do with whether the attached device is USB-3 or USB-2. Hubs are both (or are USB-2 only).
"Machine 1" is several years old, it's actually the guts of the same PC that was upgraded to make M/c 2. It's not usable, or sellable, with this performance hit happening. I have tried all the external USB ports on this machine and not found the failing controller, my guess is it's going to be one that supports some of the on-board USB headers.
In fact, the port in question might not be attached to anything, or improperly grounded, or something like that.
I had been looking on the web for a way to shut down the problem port, or worst case the whole hub, however all the Linux examples I found worked by either-
a) Preventing the loading of the driver for the chipset, by type. However that would kill all ports supported by the same type of controller, and this motherboard has multiple controllers of the same type onboard.
b) Shutting down a port by searching for the connected device identifier. However in these cases there _are_ no connected devices, the fault happens when the controller is not connected to anything.
Hopefully the command you recommended will do the trick, I will let you know.
Would I be correct in thinking this would need to be run at every boot, some time after device enumeration, or would it need to be run after every re-enumeration of devices after a USB device is connected / disconnected? Not sure how to achieve that.
At every boot. It doesn't have to be after all the other devices are enumerated; after the USB controller itself is enumerated will be good enough.
I very much appreciate your help in identifying the fault. Thank you.
You're welcome.
Alan Stern
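Since the write only needs to happen once per boot, after the USB controller itself has been enumerated, one possible approach is a small script that waits for the port's sysfs entry to appear and then disables it. The sketch below is an assumption about how that could be done, not something tested in this thread: the function name, the SYSFS override, and the one-second polling interval are all illustrative.

```shell
#!/bin/sh
# Sketch: wait (up to a limit) for a root-hub port to show up in sysfs,
# then disable it. Intended to be called once from a boot-time script.
# SYSFS is an illustrative override; the default is the real sysfs mount.
SYSFS="${SYSFS:-/sys}"

disable_port_when_ready() {
    # $1 = bus number, $2 = port number, $3 = max seconds to wait
    attr="$SYSFS/bus/usb/devices/${1}-0:1.0/usb${1}-port${2}/disable"
    waited=0
    while [ ! -e "$attr" ]; do
        if [ "$waited" -ge "$3" ]; then
            return 1            # controller never appeared; give up
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo 1 >"$attr"
}
```

A call such as "disable_port_when_ready 2 4 30" could be placed in whatever boot hook the distribution provides (for example a local startup script); which hook to use is left open here.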
linux-stable-mirror@lists.linaro.org