Apologies for the delay in reporting this: I messed up my first attempt at bisecting, then I've spent a week going to, enjoying, returning from and recovering from a music festival.
Up to and including 5.18.18 things are fine. With 5.19.0 (and .1 and .2) I see lots of errors and hangs on the USB2 chipset, e.g.
$ grep "usb 9-4" dmesg.5.19.2
[ 6.669075] usb 9-4: new full-speed USB device number 2 using ohci-pci
[ 6.829087] usb 9-4: device descriptor read/64, error -32
[ 7.097094] usb 9-4: device descriptor read/64, error -32
[ 7.361087] usb 9-4: new full-speed USB device number 3 using ohci-pci
[ 7.521152] usb 9-4: device descriptor read/64, error -32
[ 7.789066] usb 9-4: device descriptor read/64, error -32
[ 8.081070] usb 9-4: new full-speed USB device number 4 using ohci-pci
[ 8.497138] usb 9-4: device not accepting address 4, error -32
[ 8.653140] usb 9-4: new full-speed USB device number 5 using ohci-pci
[ 9.069141] usb 9-4: device not accepting address 5, error -32
$

$ grep "usb 1-2" dmesg.5.19.2
[ 5.917102] usb 1-2: new high-speed USB device number 2 using ehci-pci
[ 6.277076] usb 1-2: device descriptor read/64, error -71
[ 6.513143] usb 1-2: device descriptor read/64, error -32
[ 6.753146] usb 1-2: new high-speed USB device number 3 using ehci-pci
[ 6.881143] usb 1-2: device descriptor read/64, error -32
[ 7.117144] usb 1-2: device descriptor read/64, error -32
[ 7.429141] usb 1-2: new high-speed USB device number 4 using ehci-pci
[ 7.845134] usb 1-2: device not accepting address 4, error -32
[ 7.977142] usb 1-2: new high-speed USB device number 5 using ehci-pci
[ 8.393158] usb 1-2: device not accepting address 5, error -32
$
the USB port is then no longer usable
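For readers decoding the numbers in the dmesg excerpts above: on Linux, -32 is EPIPE (for USB transfers, typically a stalled endpoint) and -71 is EPROTO (a protocol-level failure such as no response from the device). The following is only a rough sketch for tallying such errors per port from a saved dmesg file; the helper name count_usb_errors is made up for illustration and is not part of any existing tool.

```shell
# Hypothetical helper: tally "error -N" lines per USB port in a saved dmesg file.
# Usage: count_usb_errors FILE
count_usb_errors() {
    # Input lines look like:
    #   [ 6.829087] usb 9-4: device descriptor read/64, error -32
    awk '
        match($0, /usb [0-9]+-[0-9.]+:/) {
            # Strip the leading "usb " and the trailing ":" to get the port, e.g. 9-4
            port = substr($0, RSTART + 4, RLENGTH - 5)
            if (match($0, /error -[0-9]+/)) {
                # Keep only the signed number, e.g. -32
                code = substr($0, RSTART + 6, RLENGTH - 6)
                count[port " " code]++
            }
        }
        END {
            # One output line per (port, code) pair: PORT CODE COUNT
            for (key in count) print key, count[key]
        }
    ' "$1" | sort
}
```

Running count_usb_errors dmesg.5.19.2 would print one line per port and error code, which makes it easier to see at a glance which controllers are affected.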
This is not reproducible on the other chipset (USB3) on this machine, nor on two other systems. Swapping USB cables doesn't help.
I have bisected it to
$ git bisect bad
78013eaadf696d2105982abb4018fbae394ca08f is the first bad commit
commit 78013eaadf696d2105982abb4018fbae394ca08f
Author: Christoph Hellwig <hch@lst.de>
Date:   Mon Feb 14 14:11:44 2022 +0100

    x86: remove the IOMMU table infrastructure
however it will not easily revert
I'll be more than happy to assist with any debugging/testing.
$ git revert 78013eaadf696d2105982abb4018fbae394ca08f
Auto-merging arch/x86/include/asm/dma-mapping.h
CONFLICT (content): Merge conflict in arch/x86/include/asm/dma-mapping.h
Auto-merging arch/x86/include/asm/iommu.h
Auto-merging arch/x86/include/asm/xen/swiotlb-xen.h
Auto-merging arch/x86/kernel/Makefile
Auto-merging arch/x86/kernel/pci-dma.c
CONFLICT (content): Merge conflict in arch/x86/kernel/pci-dma.c
Auto-merging arch/x86/kernel/vmlinux.lds.S
Auto-merging drivers/iommu/amd/init.c
Auto-merging drivers/iommu/amd/iommu.c
CONFLICT (content): Merge conflict in drivers/iommu/amd/iommu.c
Auto-merging drivers/iommu/intel/dmar.c
error: could not revert 78013eaadf69... x86: remove the IOMMU table infrastructure

# dmidecode | grep -A2 "^Base Board"
Base Board Information
        Manufacturer: Gigabyte Technology Co., Ltd.
        Product Name: 970A-DS3P
#

# lspci -nn | grep -i usb
00:12.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:12.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:14.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
02:00.0 USB controller [0c03]: VIA Technologies, Inc. VL805/806 xHCI USB 3.0 Controller [1106:3483] (rev 01)
#

# lspci -v -s 00:12
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller (prog-if 10 [OHCI])
        Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
        Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
        Memory at fe50a000 (32-bit, non-prefetchable) [size=4K]
        Kernel driver in use: ohci-pci
        Kernel modules: ohci_pci

00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller (prog-if 20 [EHCI])
        Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
        Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17
        Memory at fe509000 (32-bit, non-prefetchable) [size=256]
        Capabilities: [c0] Power Management version 2
        Capabilities: [e4] Debug port: BAR=1 offset=00e0
        Kernel driver in use: ehci-pci
        Kernel modules: ehci_pci

#

# lsusb
Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 009 Device 002: ID 067b:2303 Prolific Technology, Inc. PL2303 Serial Port / Mobile Action MA-8910P
Bus 009 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 008 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 007 Device 002: ID 03f0:0317 HP, Inc LaserJet 1200
Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 002: ID 04e8:6860 Samsung Electronics Co., Ltd Galaxy A5 (MTP)
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 002 Device 002: ID 2109:3431 VIA Labs, Inc. Hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
#
$ git bisect log
git bisect start
# good: [4b0986a3613c92f4ec1bdc7f60ec66fea135991f] Linux 5.18
git bisect good 4b0986a3613c92f4ec1bdc7f60ec66fea135991f
# good: [07e0b709cab7dc987b5071443789865e20481119] Linux 5.18.18
git bisect good 07e0b709cab7dc987b5071443789865e20481119
# bad: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
git bisect bad 3d7cb6b04c3f3115719235cc6866b10326de34cd
# bad: [c011dd537ffe47462051930413fed07dbdc80313] Merge tag 'arm-soc-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad c011dd537ffe47462051930413fed07dbdc80313
# good: [7e062cda7d90543ac8c7700fc7c5527d0c0f22ad] Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect good 7e062cda7d90543ac8c7700fc7c5527d0c0f22ad
# good: [f8122500a039abeabfff41b0ad8b6a2c94c1107d] Merge branch 'etnaviv/next' of https://git.pengutronix.de/git/lst/linux into drm-next
git bisect good f8122500a039abeabfff41b0ad8b6a2c94c1107d
# good: [2518f226c60d8e04d18ba4295500a5b0b8ac7659] Merge tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm
git bisect good 2518f226c60d8e04d18ba4295500a5b0b8ac7659
# good: [f7a344468105ef8c54086dfdc800e6f5a8417d3e] ASoC: max98090: Move check for invalid values before casting in max98090_put_enab_tlv()
git bisect good f7a344468105ef8c54086dfdc800e6f5a8417d3e
# good: [fbe86daca0ba878b04fa241b85e26e54d17d4229] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
git bisect good fbe86daca0ba878b04fa241b85e26e54d17d4229
# good: [709c8632597c3276cd21324b0256628f1a7fd4df] xfs: rework deferred attribute operation setup
git bisect good 709c8632597c3276cd21324b0256628f1a7fd4df
# bad: [babf0bb978e3c9fce6c4eba6b744c8754fd43d8e] Merge tag 'xfs-5.19-for-linus' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
git bisect bad babf0bb978e3c9fce6c4eba6b744c8754fd43d8e
# bad: [8b728edc5be161799434cc17e1279db2f8eabe29] Merge tag 'fs_for_v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
git bisect bad 8b728edc5be161799434cc17e1279db2f8eabe29
# bad: [3f70356edf5611c28a68d8d5a9c2b442c9eb81e6] swiotlb: merge swiotlb-xen initialization into swiotlb
git bisect bad 3f70356edf5611c28a68d8d5a9c2b442c9eb81e6
# good: [f39f8d0eb081407e470396fd4cc376c526d13066] MIPS/octeon: use swiotlb_init instead of open coding it
git bisect good f39f8d0eb081407e470396fd4cc376c526d13066
# bad: [c6af2aa9ffc9763826607bc2664ef3ea4475ed18] swiotlb: make the swiotlb_init interface more useful
git bisect bad c6af2aa9ffc9763826607bc2664ef3ea4475ed18
# bad: [a3e230926708125205ffd06d3dc2175a8263ae7e] x86: centralize setting SWIOTLB_FORCE when guest memory encryption is enabled
git bisect bad a3e230926708125205ffd06d3dc2175a8263ae7e
# bad: [78013eaadf696d2105982abb4018fbae394ca08f] x86: remove the IOMMU table infrastructure
git bisect bad 78013eaadf696d2105982abb4018fbae394ca08f
# first bad commit: [78013eaadf696d2105982abb4018fbae394ca08f] x86: remove the IOMMU table infrastructure
$
[Adding in linux-usb@vger]
On Thu, Aug 18, 2022 at 03:36:44PM +0100, Alan J. Wylie wrote:
# lspci -nn | grep -i usb
00:12.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:12.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:14.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
02:00.0 USB controller [0c03]: VIA Technologies, Inc. VL805/806 xHCI USB 3.0 Controller [1106:3483] (rev 01)
So this only happens with the on-board USB 2 controller?
This is odd, I would not expect one PCI controller to work, but the other one not.
# lspci -v -s 00:12
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller (prog-if 10 [OHCI])
        Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
        Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
        Memory at fe50a000 (32-bit, non-prefetchable) [size=4K]
        Kernel driver in use: ohci-pci
        Kernel modules: ohci_pci

00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller (prog-if 20 [EHCI])
        Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
        Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17
        Memory at fe509000 (32-bit, non-prefetchable) [size=256]
        Capabilities: [c0] Power Management version 2
        Capabilities: [e4] Debug port: BAR=1 offset=00e0
        Kernel driver in use: ehci-pci
        Kernel modules: ehci_pci

#
What is the output of the lspci -v for the USB 3 controller?
Christoph, any ideas?
thanks,
greg k-h
at 16:47 on Thu 18-Aug-2022 Greg Kroah-Hartman (gregkh@linuxfoundation.org) wrote:
So this only happens with the on-board USB 2 controller?
That is correct
What is the output of the lspci -v for the USB 3 controller?
# lspci -v -s 02:00
02:00.0 USB controller: VIA Technologies, Inc. VL805/806 xHCI USB 3.0 Controller (rev 01) (prog-if 30 [XHCI])
        Subsystem: Gigabyte Technology Co., Ltd VL805/806 xHCI USB 3.0 Controller
        Flags: bus master, fast devsel, latency 0, IRQ 36
        Memory at fe400000 (64-bit, non-prefetchable) [size=4K]
        Capabilities: [80] Power Management version 3
        Capabilities: [90] MSI: Enable+ Count=1/4 Maskable- 64bit+
        Capabilities: [c4] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
On Thu, Aug 18, 2022 at 04:47:14PM +0200, Greg Kroah-Hartman wrote:
What is the output of the lspci -v for the USB 3 controller?
Christoph, any ideas?
Well, with that commit it must be related to dma ops selection. As this appears to be an AMD system the options here are direct, amd_iommu and possibly amd_gart as the odd one in the mix.
Alan, can you send me your .config?
at 08:23 on Sun 21-Aug-2022 Christoph Hellwig (hch@lst.de) wrote:
Alan, can you send me your .config?
I hope that with the following information there is no need for me to do so.
It is indeed an old AMD CPU

Model name: AMD FX(tm)-4300 Quad-Core Processor
CPU family: 21
Model: 2
Comparing with another AMD system that doesn't show the problem, I see that CONFIG_GART_IOMMU is only set on the one with the problem.
The configs have just had "make oldconfig" run on them for years, I have no idea why one has it set.
Clearing it fixes the problem!
Thanks for the hint, although there is still a wider issue with this regression.
$ diff .config.old .config
353c353
< CONFIG_GART_IOMMU=y
---
> # CONFIG_GART_IOMMU is not set
4683d4682
< CONFIG_IOMMU_HELPER=y
4987d4985
< # CONFIG_IOMMU_DEBUG is not set
$
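Anyone wanting to check whether their own build has the same option enabled could use something along these lines. It is a minimal sketch, assuming the kernel configuration is available as a plain-text .config file; the helper name kconfig_state is hypothetical.

```shell
# Hypothetical helper: report how a Kconfig symbol is set in a .config file.
# Usage: kconfig_state SYMBOL FILE
kconfig_state() {
    sym=$1
    file=$2
    if line=$(grep -E "^${sym}=" "$file"); then
        # e.g. CONFIG_GART_IOMMU=y  ->  y
        printf '%s\n' "${line#*=}"
    elif grep -qx "# ${sym} is not set" "$file"; then
        # The symbol is visible in this configuration but disabled
        printf 'not set\n'
    else
        # The symbol does not appear in the file at all
        printf 'absent\n'
    fi
}
```

Going by the diff above, kconfig_state CONFIG_GART_IOMMU .config.old should report y, and the same call on the new .config should report "not set".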
On Sun, Aug 21, 2022 at 09:21:22AM +0100, Alan J. Wylie wrote:
Comparing with another AMD system that doesn't show the problem, I see that CONFIG_GART_IOMMU is only set on the one with the problem.
The configs have just had "make oldconfig" run on them for years, I have no idea why one has it set.
Clearing it fixes the problem!
Thanks for confirming my suspicion. I'd still like to fix the issue with CONFIG_GART_IOMMU enabled once I've tracked it down. Would you be willing to test patches?
at 16:26 on Sun 21-Aug-2022 Christoph Hellwig (hch@lst.de) wrote:
Thanks for confirming my suspicion. I'd still like to fix the issue with CONFIG_GART_IOMMU enabled once I've tracked it down. Would you be willing to test patches?
I'll be glad to help.
I've also had a look in the loft and my box of bits for an old Athlon64/Opteron/Turion/Sempron processor, but I'm afraid all I've got are:
Phenom II X6 1055T
Phenom II X2 545
Athlon 2 x2 270
TWIMC: this mail is primarily sent for documentation purposes and for regzbot, my Linux kernel regression tracking bot. These mails usually contain '#forregzbot' in the subject, to make them easy to spot and filter.
On 21.08.22 18:50, Alan J. Wylie wrote:
at 16:26 on Sun 21-Aug-2022 Christoph Hellwig (hch@lst.de) wrote:
Thanks for confirming my suspicion. I'd still like to fix the issue with CONFIG_GART_IOMMU enabled once I've tracked it down. Would you be willing to test patches?
I'll be glad to help.
I've also had a look in the loft and my box of bits for an old Athlon64/Opteron/Turion/Sempron processor, but I'm afraid all I've got are:
Phenom II X6 1055T
Phenom II X2 545
Athlon 2 x2 270
#regzbot backburner: unusual config, workaround found, devs still want to fix it, but apparently not urgent
#regzbot ignore-activity
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight.
[TLDR: I'm adding this regression report to the list of tracked regressions; all text from me you find below is based on a few template paragraphs you might have encountered already in similar form.]
On 18.08.22 16:36, Alan J. Wylie wrote:
Thanks for the report. To be sure this issue doesn't fall through the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression tracking bot:
#regzbot ^introduced 78013eaadf696d2105982abb4018fbae394ca08
#regzbot title dma/iommu/gart/usb: USB errors during boot
#regzbot ignore-activity
This isn't a regression? This issue or a fix for it are already discussed somewhere else? It was fixed already? You want to clarify when the regression started to happen? Or point out I got the title or something else totally wrong? Then just reply -- ideally with also telling regzbot about it, as explained here: https://linux-regtracking.leemhuis.info/tracked-regression/
Reminder for developers: When fixing the issue, add 'Link:' tags pointing to the report (the mail this one replies to), as explained in the Linux kernel's documentation; the webpage above explains why this is important for tracked regressions.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)