Hi Juergen, hi all,
Radoslav Bodó reported in Debian an issue after updating our kernel from 6.1.112 to 6.1.115. His report in full is at:
https://bugs.debian.org/1088159
He reports that after switching to 6.1.115 (the problem is also present in all later 6.1.y releases), the mpt3sas devices are no longer accessible when booting under Xen. The boot log shows:
mpt3sas version 43.100.00.00 loaded
mpt3sas_cm0: 63 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (8086116 kB)
mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
mpt3sas_cm0: MSI-X vectors supported: 96
mpt3sas_cm0: 0 40 40
mpt3sas_cm0: High IOPs queues : disabled
mpt3sas0-msix0: PCI-MSI-X enabled: IRQ 447
mpt3sas0-msix1: PCI-MSI-X enabled: IRQ 448
mpt3sas0-msix2: PCI-MSI-X enabled: IRQ 449
mpt3sas0-msix3: PCI-MSI-X enabled: IRQ 450
mpt3sas0-msix4: PCI-MSI-X enabled: IRQ 451
mpt3sas0-msix5: PCI-MSI-X enabled: IRQ 452
mpt3sas0-msix6: PCI-MSI-X enabled: IRQ 453
mpt3sas0-msix7: PCI-MSI-X enabled: IRQ 454
mpt3sas0-msix8: PCI-MSI-X enabled: IRQ 455
mpt3sas0-msix9: PCI-MSI-X enabled: IRQ 456
mpt3sas0-msix10: PCI-MSI-X enabled: IRQ 457
mpt3sas0-msix11: PCI-MSI-X enabled: IRQ 458
mpt3sas0-msix12: PCI-MSI-X enabled: IRQ 459
mpt3sas0-msix13: PCI-MSI-X enabled: IRQ 460
mpt3sas0-msix14: PCI-MSI-X enabled: IRQ 461
mpt3sas0-msix15: PCI-MSI-X enabled: IRQ 462
mpt3sas0-msix16: PCI-MSI-X enabled: IRQ 463
mpt3sas0-msix17: PCI-MSI-X enabled: IRQ 464
mpt3sas0-msix18: PCI-MSI-X enabled: IRQ 465
mpt3sas0-msix19: PCI-MSI-X enabled: IRQ 466
mpt3sas0-msix20: PCI-MSI-X enabled: IRQ 467
mpt3sas0-msix21: PCI-MSI-X enabled: IRQ 468
mpt3sas0-msix22: PCI-MSI-X enabled: IRQ 469
mpt3sas0-msix23: PCI-MSI-X enabled: IRQ 470
mpt3sas0-msix24: PCI-MSI-X enabled: IRQ 471
mpt3sas0-msix25: PCI-MSI-X enabled: IRQ 472
mpt3sas0-msix26: PCI-MSI-X enabled: IRQ 473
mpt3sas0-msix27: PCI-MSI-X enabled: IRQ 474
mpt3sas0-msix28: PCI-MSI-X enabled: IRQ 475
mpt3sas0-msix29: PCI-MSI-X enabled: IRQ 476
mpt3sas0-msix30: PCI-MSI-X enabled: IRQ 477
mpt3sas0-msix31: PCI-MSI-X enabled: IRQ 478
mpt3sas0-msix32: PCI-MSI-X enabled: IRQ 479
mpt3sas0-msix33: PCI-MSI-X enabled: IRQ 480
mpt3sas0-msix34: PCI-MSI-X enabled: IRQ 481
mpt3sas0-msix35: PCI-MSI-X enabled: IRQ 482
mpt3sas0-msix36: PCI-MSI-X enabled: IRQ 483
mpt3sas0-msix37: PCI-MSI-X enabled: IRQ 484
mpt3sas0-msix38: PCI-MSI-X enabled: IRQ 485
mpt3sas0-msix39: PCI-MSI-X enabled: IRQ 486
mpt3sas_cm0: iomem(0x00000000ac400000), mapped(0x00000000d9f45f61), size(65536)
mpt3sas_cm0: ioport(0x0000000000006000), size(256)
mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
mpt3sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(7), sge_per_io(128), chains_per_io(19)
mpt3sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:12348/_scsih_probe()!
We were able to bisect the changes (see https://bugs.debian.org/1088159#64) down to
b1e6e80a1b42 ("xen/swiotlb: add alignment check for dma buffers")
#regzbot introduced: b1e6e80a1b42
#regzbot link: https://bugs.debian.org/1088159
Reverting the commit resolves the issue.
Does that ring any bells?
In fact, we have two more bugs reported with similar symptoms. They are not yet confirmed to be the same issue, but I'm referencing them here as well in case we can cross-match them to a common root cause:
https://bugs.debian.org/1093371 (megaraid_sas didn't work anymore with Xen)
and
https://bugs.debian.org/1087807 (Unable to boot: i40e swiotlb buffer is full)
(but again, these are not yet confirmed to have the same root cause).
Thanks in advance,
Regards, Salvatore
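For readers unfamiliar with what an alignment check on a DMA buffer involves, here is a minimal userspace sketch in Python. It is not the kernel code from b1e6e80a1b42. It assumes 4 KiB pages, and it assumes that a buffer is expected to be aligned to its size rounded up to a power-of-two number of pages. The helper names (`page_order`, `required_alignment`, `is_dma_aligned`) are hypothetical and exist only for illustration.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT


def page_order(size: int) -> int:
    """Smallest order such that (PAGE_SIZE << order) >= size."""
    if size <= 0:
        raise ValueError("size must be positive")
    pages = (size + PAGE_SIZE - 1) // PAGE_SIZE  # round up to whole pages
    return (pages - 1).bit_length()


def required_alignment(size: int) -> int:
    """Alignment in bytes assumed for a buffer of the given size."""
    return PAGE_SIZE << page_order(size)


def is_dma_aligned(addr: int, size: int) -> bool:
    """True if addr is a multiple of the assumed alignment for size."""
    return addr % required_alignment(size) == 0
```

Under these assumptions, a 64 KiB buffer starting at 0x11000 is page-aligned but not 64 KiB-aligned, so a check of this kind would flag it, while the same address is fine for a single 4 KiB page.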
Hi Salvatore,
On 08/02/25 21:26, Salvatore Bonaccorso wrote:
> Hi Juergen, hi all,
> Radoslav Bodó reported in Debian an issue after updating our kernel from 6.1.112 to 6.1.115. His report in full is at:
Note: We have seen this on the 5.4.y kernel. More details here: https://lore.kernel.org/all/9dd91f6e-1c66-4961-994e-dbda87d69dad@oracle.com/
Thanks, Harshit
Hi Harshit,
On Sun, Feb 09, 2025 at 01:45:38AM +0530, Harshit Mogalapalli wrote:
> Note: We have seen this on the 5.4.y kernel. More details here: https://lore.kernel.org/all/9dd91f6e-1c66-4961-994e-dbda87d69dad@oracle.com/
Thanks for the pointer. Looking at that thread, I suspect the three referenced bugs in Debian are all related in the end: one for the megaraid_sas driver, this one for the mpt3sas driver, and one for the i40e driver.
AFAICS, no patch has landed upstream yet that I could point an affected user to for testing?
Regards, Salvatore
Hi Salvatore,
On 12/02/25 00:56, Salvatore Bonaccorso wrote:
> AFAICS, no patch has landed upstream yet that I could point an affected user to for testing?
Konrad pointed me at this thread: https://lore.kernel.org/all/20250211120432.29493-1-jgross@suse.com/
This has some fixes, but they have not landed upstream yet.
Thanks, Harshit
On 12.02.25 16:12, Harshit Mogalapalli wrote:
> This has some fixes, but they have not landed upstream yet.
Patches are upstream now. In case you still experience any problems, please speak up.
Juergen
On Sat, Feb 15, 2025 at 12:47:57PM +0100, Jürgen Groß wrote:
> Patches are upstream now. In case you still experience any problems, please speak up.
What specific commits should be backported here?
thanks,
greg k-h
On 15.02.25 13:34, Greg KH wrote:
> What specific commits should be backported here?
Those are:
e93ec87286bd1fd30b7389e7a387cfb259f297e3
85fcb57c983f423180ba6ec5d0034242da05cc54
Juergen
On Sat, Feb 15, 2025 at 02:39:46PM +0100, Jürgen Groß wrote:
> Those are:
> e93ec87286bd1fd30b7389e7a387cfb259f297e3
> 85fcb57c983f423180ba6ec5d0034242da05cc54
Ugh, neither of them was marked for stable inclusion, why not? Anyway, I'll go queue them up after this round of kernels is released, hopefully tomorrow, but next time, please follow the stable kernel rules if you know you want a patch included in a tree.
thanks,
greg k-h
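For context, the stable kernel rules ask that a patch intended for the stable trees carry a `Cc: stable@vger.kernel.org` tag in its sign-off area. A minimal Python sketch of checking a commit message for that tag follows. The function name `has_stable_tag` and the sample messages are hypothetical; this only illustrates the convention and is not a tool used by the stable maintainers.

```python
import re

# Matches a "Cc: stable@vger.kernel.org" trailer, with or without angle
# brackets, and with an optional trailing comment such as "# 6.1.x".
STABLE_TAG = re.compile(
    r"^Cc:\s*<?stable@vger\.kernel\.org>?(\s*#.*)?\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def has_stable_tag(commit_message: str) -> bool:
    """Return True if the commit message contains a stable Cc trailer."""
    return STABLE_TAG.search(commit_message) is not None
```

For example, `has_stable_tag("subject\n\nCc: stable@vger.kernel.org\n")` is true, while a message that carries only a Signed-off-by line is not matched.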
Hi Juergen,
On 15/02/25 7:09 PM, Jürgen Groß wrote:
> Those are:
> e93ec87286bd1fd30b7389e7a387cfb259f297e3
> 85fcb57c983f423180ba6ec5d0034242da05cc54
Is there a plan to backport a 5.4 variant of this series? I tried backporting it to 5.4 myself but ran into a lot of conflicts; it doesn't seem to be compatible with the 5.4 swiotlb code. If you could advise on how you would recommend backporting this to 5.4, whether by backporting multiple supporting patches to make the cherry-pick clean or by manually resolving conflicts in the patch itself, that would be highly appreciated.
Thanks, Harshvardhan
linux-stable-mirror@lists.linaro.org