On the TI BeagleBoard-X15, the connected SSD is not detected on the linux-next tag next-20221006.
+ export STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
+ STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
+ test -n /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
+ echo y
+ mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
mke2fs 1.46.5 (30-Dec-2021)
The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does not exist and no size was specified.
+ lava-test-raise 'mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job exit'
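The trace above runs mkfs.ext4 against a /dev/disk/by-id path that was never created, so the failure only surfaces as an mke2fs error. A minimal sketch of a pre-flight check follows. The helper name `storage_device_ready` is hypothetical and is not part of LAVA or the LKFT test definitions; it only illustrates how a missing device node could be reported before the format step.

```python
import os
import stat


def storage_device_ready(dev_path):
    """Return True only if dev_path exists and resolves to a block device.

    Hypothetical helper for illustration. os.stat() follows the udev
    symlink under /dev/disk/by-id to the underlying device node.
    """
    try:
        mode = os.stat(dev_path).st_mode
    except OSError:
        # Covers the case in the trace: the by-id link does not exist.
        return False
    return stat.S_ISBLK(mode)
```

A test job could call this with the STORAGE_DEV path and report "device not detected" instead of attempting the format.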
Test log:
 - https://lkft.validation.linaro.org/scheduler/job/5634743#L2580

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>

metadata:
  git_ref: master
  git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
  git_sha: 7da9fed0474b4cd46055dd92d55c42faf32c19ac
  git_describe: next-20221006
  kernel_version: 6.0.0
  kernel-config: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F/config
  build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/659754170
  artifact-location: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F
  toolchain: gcc-10

--
Linaro LKFT
https://lkft.linaro.org
On 2022/10/12 16:24, Naresh Kamboju wrote:
> [...]
> The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does not exist and no size was specified.
> [...]
The kernel messages that are shown in the links above do not show any "libata version 3.00 loaded." message nor any ata/ahci message that I can see. So I think the eSATA adapter is not even being detected and libata/ahci driver not used.
Was this working before? If yes, can you try with the following patches reverted?

d3243965f24a ("ata: make PATA_PLATFORM selectable only for suitable architectures")
3ebe59a54111 ("ata: clean up how architectures enable PATA_PLATFORM and PATA_OF_PLATFORM")
If reverting these patches restores the eSATA port on this board, then you need to fix the defconfig for that board.
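The check described above (looking for the libata banner and any ata/ahci lines in the boot log) can be sketched as a small filter. This is an illustration only: the function name and the regular expression are assumptions, chosen to cover the message types named in this thread, and are not an existing LKFT tool.

```python
import re

# Patterns for the message types named in this thread: the libata banner,
# ahci driver lines, and per-port "ataN:" lines. Assumed, not exhaustive.
ATA_MARKERS = re.compile(r"libata version|\bahci\b|\bata\d+:")


def find_ata_messages(kernel_log):
    """Return the kernel log lines that mention libata, ahci or an ataN port."""
    return [line for line in kernel_log.splitlines() if ATA_MARKERS.search(line)]
```

An empty result for a boot log would match the observation that no libata/ahci driver output appears at all.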
On Thu, 13 Oct 2022 at 12:41, Damien Le Moal damien.lemoal@opensource.wdc.com wrote:
> On 2022/10/12 16:24, Naresh Kamboju wrote:
> > [...]
> > The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does not exist and no size was specified.
> > [...]
The reported issue is now noticed on the Linux mainline master branch.

1) I see that the following config is missing on the latest problematic builds:
 - CONFIG_HAVE_PATA_PLATFORM=y

2) The following ahci sata kernel messages are missing on problematic boots:
[ 1.408660] ahci 4a140000.sata: forcing port_map 0x0 -> 0x1
[ 1.408691] ahci 4a140000.sata: AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl platform mode
[ 1.408721] ahci 4a140000.sata: flags: 64bit ncq sntf pm led clo only pmp pio slum part ccc apst
[ 1.409820] scsi host0: ahci
[ 1.410064] ata1: SATA max UDMA/133 mmio [mem 0x4a140000-0x4a1410ff] port 0x100 irq 98

3) GOOD: 9d84bb40bcb30a7fa16f33baa967aeb9953dda78
   BAD: e08466a7c00733a501d3c5328d29ec974478d717

4) Here are links to the working and non-working test jobs and kernel configs:
problematic test job:
 - https://lkft.validation.linaro.org/scheduler/job/5641407#L2602
Good test job:
 - https://lkft.validation.linaro.org/scheduler/job/5640672#L2198

5) metadata:
  git_ref: master
  git_repo: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline
  git_sha: e08466a7c00733a501d3c5328d29ec974478d717
  git_describe: v6.0-7220-ge08466a7c007
  kernel_version: 6.0.0
  kernel-config: https://builds.tuxbuild.com/2Fourpiqf1OrlPFFtKwhHV0wAiq/config
  build-url: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline/-/pipelines/6...
  artifact-location: https://builds.tuxbuild.com/2Fourpiqf1OrlPFFtKwhHV0wAiq
  toolchain: gcc-10

6) For your information, these are the drivers/ata commits between the good and bad commits:
$ git log --oneline 9d84bb40bcb3..e08466a7c007 -- drivers/ata
4078aa685097 Merge tag 'ata-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
71d7b6e51ad3 ata: libata-eh: avoid needless hard reset when revalidating link
e3b1fff6c051 ata: libata: drop superfluous ata_eh_analyze_tf() parameter
b46c760e11c8 ata: libata: drop superfluous ata_eh_request_sense() parameter
cb6e73aaadff ata: libata-eh: Remove the unneeded result variable
ecf8322f464d ata: ahci_st: Enable compile test
2d29dd108c78 ata: ahci_st: Fix compilation warning
9628711aa649 ata: ahci-dwc: Add Baikal-T1 AHCI SATA interface support
bc7af9100fa8 ata: ahci-dwc: Add platform-specific quirks support
33629d35090f ata: ahci: Add DWC AHCI SATA controller support
6ce73f3a6fc0 ata: libahci_platform: Add function returning a clock-handle by id
18ee7c49f75b ata: ahci: Introduce firmware-specific caps initialization
7cbbfbe01a72 ata: ahci: Convert __ahci_port_base to accepting hpriv as arguments
fad64dc06579 ata: libahci: Don't read AHCI version twice in the save-config method
88589772e80c ata: libahci: Discard redundant force_port_map parameter
eb7cae0b6afd ata: libahci: Extend port-cmd flags set with port capabilities
f67f12ff57bc ata: libahci_platform: Introduce reset assertion/deassertion methods
3f74cd046fbe ata: libahci_platform: Parse ports-implemented property in resources getter
3c132ea6508b ata: libahci_platform: Sanity check the DT child nodes number
e28b3abf8020 ata: libahci_platform: Convert to using devm bulk clocks API
82d437e6dcb1 ata: libahci_platform: Convert to using platform devm-ioremap methods
d3243965f24a ata: make PATA_PLATFORM selectable only for suitable architectures
3ebe59a54111 ata: clean up how architectures enable PATA_PLATFORM and PATA_OF_PLATFORM
55d5ba550535 ata: libata-core: Check errors in sata_print_link_status()
03070458d700 ata: libata-sff: Fix double word in comments
0b2436d3d25f ata: pata_macio: Remove unneeded word in comments
024811a2da45 ata: libata-core: Simplify ata_dev_set_xfermode()
066de3b9d93b ata: libata-core: Simplify ata_build_rw_tf()
e00923c59e68 ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE
614065aba704 ata: libata-core: remove redundant err_mask variable
fee6073051c3 ata: ahci: Do not check ACPI_FADT_LOW_POWER_S0
99ad3f9f829f ata: libata-core: improve parameter names for ata_dev_set_feature()
16169fb78182 ata: libata-core: Print timeout value when internal command times
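The config comparison in the list above (spotting a symbol such as CONFIG_HAVE_PATA_PLATFORM that is present in the good build but not in the bad one) can be sketched as follows. The function names are hypothetical and the inputs are assumed to be the text of two kernel .config files, such as the kernel-config links given in the metadata.

```python
def parse_kconfig(text):
    """Map CONFIG_* symbols to values; '# CONFIG_X is not set' maps to 'n'."""
    suffix = " is not set"
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Strip the leading "# " and the trailing " is not set".
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def changed_options(good, bad):
    """Return {symbol: (good_value, bad_value)} for symbols that differ.

    A symbol absent from one config is reported as None on that side.
    """
    g, b = parse_kconfig(good), parse_kconfig(bad)
    return {k: (g.get(k), b.get(k))
            for k in sorted(g.keys() | b.keys())
            if g.get(k) != b.get(k)}
```

Run on the good and bad configs, this lists every symbol that changed, which also surfaces newly introduced symbols that are absent from the good build.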

> The kernel messages that are shown in the links above do not show any "libata version 3.00 loaded." message nor any ata/ahci message that I can see. So I think the eSATA adapter is not even being detected and libata/ahci driver not used.
>
> Was this working before? If yes, can you try with the following patches reverted?
>
> d3243965f24a ("ata: make PATA_PLATFORM selectable only for suitable architectures")
> 3ebe59a54111 ("ata: clean up how architectures enable PATA_PLATFORM and PATA_OF_PLATFORM")

I have reverted the above two patches, but the problem has not been solved.

> If reverting these patches restores the eSATA port on this board, then you need to fix the defconfig for that board.

OTOH, Anders enabled the new config CONFIG_AHCI_DWC=y and tried, but the device failed to boot.
- Naresh
On Thu, 13 Oct 2022 at 14:39, Naresh Kamboju naresh.kamboju@linaro.org wrote:
> On Thu, 13 Oct 2022 at 12:41, Damien Le Moal damien.lemoal@opensource.wdc.com wrote:
> [...]
> > If reverting these patches restores the eSATA port on this board, then you need to fix the defconfig for that board.
>
> OTOH, Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the device failed to boot.
I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't... However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA controller support") from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was successful.
Build artifacts [1]. Any idea what happens?

Cheers,
Anders

[1] https://builds.tuxbuild.com/2G53i1F7vUWWTuZJtka3Fr7iH1B/
On 10/14/22 07:07, Anders Roxell wrote:
[...]
> > > If reverting these patches restores the eSATA port on this board, then you need to fix the defconfig for that board.
> >
> > OTOH, Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the device failed to boot.
>
> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...

As mentioned in my previous reply to Naresh, this is a new driver added in 6.1. Your board was working before so this should not be the driver needed for it.

> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA controller support") from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was successful.

Which is very strange... There is only one hunk in that commit that could be considered suspicious:

diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 9b56490ecbc3..8f5572a9f8f1 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
 static const struct of_device_id ahci_of_match[] = {
 	{ .compatible = "generic-ahci", },
 	/* Keep the following compatibles for device tree compatibility */
-	{ .compatible = "snps,spear-ahci", },
 	{ .compatible = "ibm,476gtr-ahci", },
-	{ .compatible = "snps,dwc-ahci", },
 	{ .compatible = "hisilicon,hisi-ahci", },
 	{ .compatible = "cavium,octeon-7130-ahci", },
 	{ /* sentinel */ }

Is your board using one of these compatible strings?

Serge? Any idea?
On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
> [...]
> Is your board using one of these compatible strings?
The x15 uses "snps,dwc-ahci". I would expect it to detect the device with the new driver if that is loaded, but it's possible that the driver does not work on all versions of the dwc-ahci hardware.
Anders, can you provide the boot log from a boot with the new driver built in? There should be some messages from dwc-ahci about finding the device, but then not ultimately working.
Depending on which way it goes wrong, the safest fallback for 6.1 is probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible strings back into the old driver, and leave the new one only for the "baikal,bt1-ahci" implementation of it, until it has been successfully verified on TI am5/dra7, spear13xx and exynos.
Arnd
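The effect of moving compatible strings between drivers, as discussed above, can be modelled with a small lookup. This is a simplified sketch: the table contents are transcribed from the diffs quoted in this thread, while `binding_driver` and the idea of a "built drivers" set are illustrative assumptions. Real devicetree matching in the kernel also ranks multiple compatible entries per node, which is not modelled here.

```python
# Match tables per driver, transcribed from the diffs quoted in this thread.
# Only the compatible strings shown there are listed.
BEFORE = {
    "ahci_platform": ["generic-ahci", "snps,spear-ahci", "ibm,476gtr-ahci",
                      "snps,dwc-ahci", "hisilicon,hisi-ahci",
                      "cavium,octeon-7130-ahci"],
}
AFTER = {
    "ahci_platform": ["generic-ahci", "ibm,476gtr-ahci",
                      "hisilicon,hisi-ahci", "cavium,octeon-7130-ahci"],
    "ahci_dwc": ["snps,dwc-ahci", "snps,spear-ahci", "baikal,bt1-ahci"],
}


def binding_driver(compatible, tables, built_drivers):
    """Return the first built driver whose table lists compatible, else None."""
    for driver, compatibles in tables.items():
        if driver in built_drivers and compatible in compatibles:
            return driver
    return None
```

With the AFTER tables and only ahci_platform built, `binding_driver("snps,dwc-ahci", AFTER, {"ahci_platform"})` returns None, which would be consistent with the missing ahci boot messages on a kernel built without CONFIG_AHCI_DWC.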
On 10/14/22 16:31, Arnd Bergmann wrote:
> [...]
> Depending on which way it goes wrong, the safest fallback for 6.1 is probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible strings back into the old driver, and leave the new one only for the "baikal,bt1-ahci" implementation of it, until it has been successfully verified on TI am5/dra7, spear13xx and exynos.
OK. So a fix patch until further tests/debug is completed would be this:
diff --git a/drivers/ata/ahci_dwc.c b/drivers/ata/ahci_dwc.c
index 8fb66860db31..7a0cbab00843 100644
--- a/drivers/ata/ahci_dwc.c
+++ b/drivers/ata/ahci_dwc.c
@@ -469,8 +469,6 @@ static struct ahci_dwc_plat_data ahci_bt1_plat = {
 };
 
 static const struct of_device_id ahci_dwc_of_match[] = {
-	{ .compatible = "snps,dwc-ahci", &ahci_dwc_plat },
-	{ .compatible = "snps,spear-ahci", &ahci_dwc_plat },
 	{ .compatible = "baikal,bt1-ahci", &ahci_bt1_plat },
 	{},
 };
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 8f5572a9f8f1..9b56490ecbc3 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -80,7 +80,9 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
 static const struct of_device_id ahci_of_match[] = {
 	{ .compatible = "generic-ahci", },
 	/* Keep the following compatibles for device tree compatibility */
+	{ .compatible = "snps,spear-ahci", },
 	{ .compatible = "ibm,476gtr-ahci", },
+	{ .compatible = "snps,dwc-ahci", },
 	{ .compatible = "hisilicon,hisi-ahci", },
 	{ .compatible = "cavium,octeon-7130-ahci", },
 	{ /* sentinel */ }
Anders, Naresh,
Can you try this ?
On Fri, 14 Oct 2022 at 09:53, Damien Le Moal damien.lemoal@opensource.wdc.com wrote:
> [...]
> OK. So a fix patch until further tests/debug is completed would be this:
> [...]
> Anders, Naresh,
>
> Can you try this ?
I tested this patch on today's linux-next tag, next-20221014, without enabling CONFIG_AHCI_DWC, and it booted as expected [1]. I also tried a build/boot with CONFIG_AHCI_DWC enabled, and that booted as expected too [2]. However, a warning [3] popped up during the build:

make --silent --keep-going --jobs=8 O=/home/tuxbuild/.cache/tuxmake/builds/2/build ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- 'CC=sccache arm-linux-gnueabihf-gcc' 'HOSTCC=sccache gcc'
/builds/linux/drivers/ata/ahci_dwc.c:462:34: warning: 'ahci_dwc_plat' defined but not used [-Wunused-variable]
  462 | static struct ahci_dwc_plat_data ahci_dwc_plat = {
      |                                  ^~~~~~~~~~~~~

Cheers,
Anders

[1] https://lkft.validation.linaro.org/scheduler/job/5678031
[2] https://lkft.validation.linaro.org/scheduler/job/5678152
[3] https://builds.tuxbuild.com/2G7PDSV5uzjnQqCCBybK4WpoTxz/build.log
On Fri, Oct 14, 2022, at 11:22 AM, Anders Roxell wrote:
> Tested this patch on todays linux-next tag: next-20221014 without enabling CONFIG_AHCI_DWC and it worked as expected when booting [1]. On the other hand I also tried a build/boot with CONFIG_AHCI_DWC enabled and it worked as expected to boot [2].

Ok, great. Can you send a patch to soc@kernel.org to enable the driver in the relevant defconfigs?

> However, during building a warning [3] popped up:
>
> /builds/linux/drivers/ata/ahci_dwc.c:462:34: warning: 'ahci_dwc_plat' defined but not used [-Wunused-variable]
> 462 | static struct ahci_dwc_plat_data ahci_dwc_plat = {

Strange, I can't reproduce this, and the ahci_dwc_plat symbol looks like it is clearly used in ahci_dwc_of_match[], at least in next-20221014. Do you also see this on mainline?
Arnd
On 10/14/22 18:37, Arnd Bergmann wrote:
> On Fri, Oct 14, 2022, at 11:22 AM, Anders Roxell wrote:
> > Tested this patch on todays linux-next tag: next-20221014 without enabling CONFIG_AHCI_DWC and it worked as expected when booting [1]. On the other hand I also tried a build/boot with CONFIG_AHCI_DWC enabled and it worked as expected to boot [2].

That is great news! So the new driver is OK, good!

> Ok, great. Can you send a patch to soc@kernel.org to enable the driver in the relevant defconfigs?
>
> > However, during building a warning [3] popped up:
> > [...]
>
> Strange, I can't reproduce this, and the ahci_dwc_plat symbol looks like it is clearly used in ahci_dwc_of_match[], at least in next-20221014. Do you also see this on mainline?

This is with the trial fix diff I sent. My bad, it was not even compile tested :). It does not happen otherwise.

> Arnd

Thanks for helping with this!
On Fri, Oct 14, 2022 at 11:22:38AM +0200, Anders Roxell wrote:
On Fri, 14 Oct 2022 at 09:53, Damien Le Moal damien.lemoal@opensource.wdc.com wrote:
On 10/14/22 16:31, Arnd Bergmann wrote:
On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
On 10/14/22 07:07, Anders Roxell wrote: [...]
> If reverting these patches restores the eSATA port on this board, then you need > to fix the defconfig for that board.
OTOH, Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the device failed to boot.
I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
As mentioned in my previous reply to Naresh, this is a new driver added in 6.1. Your board was working before so this should not be the driver needed for it.
However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA controller support") from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was successful.
Which is very strange... There is only one hunk in that commit that could be considered suspicious:
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c index 9b56490ecbc3..8f5572a9f8f1 100644 --- a/drivers/ata/ahci_platform.c +++ b/drivers/ata/ahci_platform.c @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend, static const struct of_device_id ahci_of_match[] = { { .compatible = "generic-ahci", }, /* Keep the following compatibles for device tree compatibility */
{ .compatible = "snps,spear-ahci", }, { .compatible = "ibm,476gtr-ahci", },
{ .compatible = "snps,dwc-ahci", }, { .compatible = "hisilicon,hisi-ahci", }, { .compatible = "cavium,octeon-7130-ahci", }, { /* sentinel */ }
Is your board using one of these compatible string ?
The x15 uses "snps,dwc-ahci". I would expect it to detect the device with the new driver if that is loaded, but it's possible that the driver does not work on all versions of the dwc-ahci hardware.
Anders, can you provide the boot log from a boot with the new driver built in? There should be some messages from dwc-ahci about finding the device, but then not ultimately working.
Depending on which way it goes wrong, the safest fallback for 6.1 is probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible strings back into the old driver, and leave the new one only for the "baikal,bt1-ahci" implementation of it, until it has been successfully verified on TI am5/dra7, spear13xx and exynos.
OK. So a fix patch until further tests/debug is completed would be this:
diff --git a/drivers/ata/ahci_dwc.c b/drivers/ata/ahci_dwc.c
index 8fb66860db31..7a0cbab00843 100644
--- a/drivers/ata/ahci_dwc.c
+++ b/drivers/ata/ahci_dwc.c
@@ -469,8 +469,6 @@ static struct ahci_dwc_plat_data ahci_bt1_plat = {
 };
 
 static const struct of_device_id ahci_dwc_of_match[] = {
-	{ .compatible = "snps,dwc-ahci", &ahci_dwc_plat },
-	{ .compatible = "snps,spear-ahci", &ahci_dwc_plat },
 	{ .compatible = "baikal,bt1-ahci", &ahci_bt1_plat },
 	{},
 };
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 8f5572a9f8f1..9b56490ecbc3 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -80,7 +80,9 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
 static const struct of_device_id ahci_of_match[] = {
 	{ .compatible = "generic-ahci", },
 	/* Keep the following compatibles for device tree compatibility */
+	{ .compatible = "snps,spear-ahci", },
 	{ .compatible = "ibm,476gtr-ahci", },
+	{ .compatible = "snps,dwc-ahci", },
 	{ .compatible = "hisilicon,hisi-ahci", },
 	{ .compatible = "cavium,octeon-7130-ahci", },
 	{ /* sentinel */ }
Anders, Naresh,
Can you try this ?
Tested this patch on today's linux-next tag, next-20221014, without enabling CONFIG_AHCI_DWC, and it booted as expected [1]. I also tried a build/boot with CONFIG_AHCI_DWC enabled, and that booted as expected too [2].
Expected result. The DWC driver will probe the device on our platform only, while your platform falls back to using the generic driver. Anders, in order to understand the root cause of the problem, could you please
1. upload the bogus boot log.
2. try what I suggested here
Link: https://lore.kernel.org/linux-ide/20221014133623.l6w4o7onoyhv2q34@mobilestat...
and, if the system fails to boot at some point, upload the boot log.
-Sergey
However, during building a warning [3] popped up:
make --silent --keep-going --jobs=8 O=/home/tuxbuild/.cache/tuxmake/builds/2/build ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- 'CC=sccache arm-linux-gnueabihf-gcc' 'HOSTCC=sccache gcc'
/builds/linux/drivers/ata/ahci_dwc.c:462:34: warning: 'ahci_dwc_plat' defined but not used [-Wunused-variable]
  462 | static struct ahci_dwc_plat_data ahci_dwc_plat = {
      |                                  ^~~~~~~~~~~~~

Cheers,
Anders

[1] https://lkft.validation.linaro.org/scheduler/job/5678031
[2] https://lkft.validation.linaro.org/scheduler/job/5678152
[3] https://builds.tuxbuild.com/2G7PDSV5uzjnQqCCBybK4WpoTxz/build.log
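The compatible-string behaviour discussed above can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct match_entry`, `struct driver_model`, `find_driver()` and the `enabled` flag are hypothetical stand-ins for `struct of_device_id`, a platform driver and its Kconfig option, not the kernel's actual matching code. It shows why a device whose compatible string is listed only in a driver that is not built ends up with no driver at all.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct of_device_id. */
struct match_entry {
	const char *compatible;
};

/* Simplified driver model: a name, a NULL-terminated table, a Kconfig flag. */
struct driver_model {
	const char *name;
	const struct match_entry *table;
	int enabled;	/* 1 models CONFIG_...=y, 0 models "is not set" */
};

/* Return the first enabled driver whose table lists the compatible string. */
const char *find_driver(const struct driver_model *drivers, size_t n,
			const char *compatible)
{
	for (size_t d = 0; d < n; d++) {
		if (!drivers[d].enabled)
			continue;
		for (const struct match_entry *m = drivers[d].table;
		     m->compatible; m++) {
			if (!strcmp(m->compatible, compatible))
				return drivers[d].name;
		}
	}
	return NULL;	/* nothing binds, so no SATA device shows up */
}

/* Tables mirroring the strings quoted in the two diffs above. */
const struct match_entry generic_with_dwc[] = {
	{ "generic-ahci" }, { "snps,spear-ahci" }, { "snps,dwc-ahci" }, { NULL }
};
const struct match_entry generic_without_dwc[] = {
	{ "generic-ahci" }, { NULL }
};
const struct match_entry dwc_full[] = {
	{ "snps,dwc-ahci" }, { "snps,spear-ahci" }, { "baikal,bt1-ahci" }, { NULL }
};
const struct match_entry dwc_only_bt1[] = {
	{ "baikal,bt1-ahci" }, { NULL }
};
```

With the DWC driver disabled and "snps,dwc-ahci" absent from the generic table, `find_driver()` returns NULL for the x15's string. With the strings moved back, the generic driver matches whether or not the DWC driver is enabled, which is consistent with both boots succeeding.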
On Fri, 14 Oct 2022 at 16:06, Serge Semin Sergey.Semin@baikalelectronics.ru wrote:
Expected result. The DWC driver will probe the device on our platform only, while your platform falls back to using the generic driver. Anders, in order to understand the root cause of the problem, could you please
- upload the bogus boot log.
This [1] is the bogus boot log.
- try what I suggested here
Link: https://lore.kernel.org/linux-ide/20221014133623.l6w4o7onoyhv2q34@mobilestat... and if the system fails to boot at some point upload the boot log.
Only doing this:
--- a/drivers/ata/ahci_dwc.c
+++ b/drivers/ata/ahci_dwc.c
@@ -316,12 +316,13 @@ static int ahci_dwc_init_host(struct ahci_host_priv *hpriv)
 		if (rc)
 			goto err_disable_resources;
 	}
-
+/*
 	ahci_dwc_check_cap(hpriv);
 
 	ahci_dwc_init_timer(hpriv);
 
 	rc = ahci_dwc_init_dmacr(hpriv);
+*/
 	if (rc)
 		goto err_clear_platform;

and enabling CONFIG_AHCI_DWC made mkfs detect the SATA drive [2].

Cheers,
Anders

[1] https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
[2] https://lkft.validation.linaro.org/scheduler/job/5679278#L2617
On Mon, Oct 17, 2022 at 09:43:24AM +0200, Anders Roxell wrote:
Judging by what is in [1] and [2], I very much doubt that [1] was executed with CONFIG_AHCI_DWC enabled: the boot log contains nothing about an ahci-dwc driver probe failure, nor any of the log messages seen in [2] (see every line with the word ahci-dwc in it).

1. If the device probe procedure had failed at some point, you would have got a line like this:
< ahci-dwc: probe of 4a140000.sata failed with error -errno
But there is no such line in [1]. There is literally nothing about AHCI/SATA/SCSI/DWC AHCI/ahci-dwc/etc. in it.

2. If the DW AHCI device probe had at least been performed, the following call chain would have been executed:
ahci_dwc_probe()
+-> ahci_dwc_get_resources()
    +-> ahci_platform_get_resources()
        +-> ...
        +-> devm_regulator_get(...)
        +-> ...
which would have produced the following log messages:
< [] ahci-dwc 4a140000.sata: supply ahci not found, using dummy regulator
< [] ahci-dwc 4a140000.sata: supply phy not found, using dummy regulator
< [] ahci-dwc 4a140000.sata: supply target not found, using dummy regulator
You have these lines in [2], but they are missing from [1]. Had any error been detected in ahci_dwc_probe() before that point, an error would have been printed as noted in 1.

3. If the problem were in the commented-out code lines, you would at least have got the messages above printed to the log [1], because the commented-out code is executed after the resource request procedure (ahci_dwc_init_host() is called after ahci_dwc_get_resources()).

4. Finally, the commented-out code doesn't really perform any actions that could have caused the device probe to halt silently.

All of that makes me think that the DW AHCI SATA device wasn't even probed in [1], which most likely means that either the driver config was omitted there or the device was disabled. So could you please restart the system as in [2], but with the lines above uncommented?

* Please make sure that Damien's fix https://www.spinics.net/lists/arm-kernel/msg1017920.html isn't applied to the kernel in [2].

[1] https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
[2] https://lkft.validation.linaro.org/scheduler/job/5679278#L2617
-Sergey
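The reasoning in points 1 and 2 amounts to a three-way classification of a boot log. A minimal sketch of that check follows, assuming the log is available as one NUL-terminated string; `classify_boot_log()`, `line_contains()` and the enum names are hypothetical helpers introduced for illustration, and the matched substrings are taken from the log lines quoted above.

```c
#include <stddef.h>
#include <string.h>

enum probe_state {
	PROBE_NOT_ATTEMPTED,	/* no line mentions ahci-dwc at all */
	PROBE_FAILED,		/* an ahci-dwc line reports a probe failure */
	PROBE_STARTED		/* ahci-dwc messages present, no failure line */
};

/* Return 1 if needle occurs within the first len bytes of line. */
int line_contains(const char *line, size_t len, const char *needle)
{
	size_t n = strlen(needle);

	for (size_t i = 0; n <= len && i <= len - n; i++) {
		if (!memcmp(line + i, needle, n))
			return 1;
	}
	return 0;
}

/* Walk the log line by line and apply the checks from points 1 and 2. */
enum probe_state classify_boot_log(const char *log)
{
	enum probe_state state = PROBE_NOT_ATTEMPTED;
	const char *line = log;

	while (*line) {
		const char *end = strchr(line, '\n');
		size_t len = end ? (size_t)(end - line) : strlen(line);

		if (line_contains(line, len, "ahci-dwc")) {
			if (line_contains(line, len, "failed with error"))
				return PROBE_FAILED;
			state = PROBE_STARTED;
		}
		line += len;
		if (*line == '\n')
			line++;
	}
	return state;
}
```

A log like [1], with no ahci-dwc line at all, would come out as PROBE_NOT_ATTEMPTED, which is the case the argument above attributes to a missing driver config or a disabled device.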
On Mon, 17 Oct 2022 at 21:22, Serge Semin fancer.lancer@gmail.com wrote:
FYI,
We have been noticing this problem [a] & [b] on Linux mainline master 6.1.0-rc7
Test error: mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job exit
Please suggest a way forward on this reported issue on the arm32 TI BeagleBoard X15 device. Build and kernel config details are provided in the metadata section.

metadata:
  git_ref: master
  git_repo: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline
  git_sha: b7b275e60bcd5f89771e865a8239325f86d9927d
  git_describe: v6.1-rc7
  kernel_version: 6.1.0-rc7
  kernel-config: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW/config
  build-url: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline/-/pipelines/7...
  artifact-location: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW
  toolchain: gcc-10

[a] https://lkft.validation.linaro.org/scheduler/job/5892099
[b] https://lore.kernel.org/all/20221017155246.zxal2cfehjgaajcu@mobilestation/
- Naresh
On Wed, Nov 30, 2022 at 03:10:37PM +0530, Naresh Kamboju wrote:
Hello Naresh,
Looking at the error from the log:
+ mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
mke2fs 1.46.5 (30-Dec-2021)
The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does not exist and no size was specified.
It seems like the device that you are trying to format does not exist.
On October 17th Serge suggested that you guys try enabling CONFIG_AHCI_DWC to see whether that solves your problem.
There was never any reply to his suggestion.
Looking at the config in:
kernel-config: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW/config
# CONFIG_AHCI_DWC is not set
This Kconfig is indeed not enabled.
Could you guys please try the suggestion from Serge?
Kind regards, Niklas
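The config check described above can be sketched as a small parser for the `.config` text format. This is an illustration only: `kconfig_lookup()` and the enum are hypothetical names, and the function assumes the whole config file has already been read into a NUL-terminated string. A line such as `# CONFIG_AHCI_DWC is not set` starts with `#`, so it never matches and the symbol is reported as unset.

```c
#include <stddef.h>
#include <string.h>

enum kconfig_state { KCONFIG_UNSET, KCONFIG_BUILTIN, KCONFIG_MODULE };

/* Look for a line of the form SYMBOL=y or SYMBOL=m in .config text. */
enum kconfig_state kconfig_lookup(const char *config, const char *symbol)
{
	size_t n = strlen(symbol);
	const char *line = config;

	while (line && *line) {
		/* Only lines that begin with the symbol name can enable it. */
		if (!strncmp(line, symbol, n) && line[n] == '=') {
			if (line[n + 1] == 'y')
				return KCONFIG_BUILTIN;
			if (line[n + 1] == 'm')
				return KCONFIG_MODULE;
		}
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return KCONFIG_UNSET;
}
```

For the config linked above, a call such as `kconfig_lookup(text, "CONFIG_AHCI_DWC")` would be expected to return KCONFIG_UNSET, matching the "is not set" line quoted in the message.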
On Wed, 30 Nov 2022 at 11:03, Niklas Cassel Niklas.Cassel@wdc.com wrote:
I re-tested this on today's Linux tree, v6.1-rc7-103-gef4d3ea40565.

With CONFIG_AHCI_DWC=y the kernel fell over and produced no output. Changing ahci_dwc_init to be a late_initcall [1] let me see what was going on [2].

The kernel booted fine with CONFIG_AHCI_DWC=y + this patch [3]

--- a/drivers/ata/libahci_platform.c
+++ b/drivers/ata/libahci_platform.c
@@ -109,7 +109,8 @@ struct clk *ahci_platform_find_clk(struct ahci_host_priv *hpriv, const char *con
 	int i;
 
 	for (i = 0; i < hpriv->n_clks; i++) {
-		if (!strcmp(hpriv->clks[i].id, con_id))
+		if (hpriv->clks && hpriv->clks[i].id &&
+		    !strcmp(hpriv->clks[i].id, con_id))
 			return hpriv->clks[i].clk;
 	}
Bootlog [4]. Thank you Arnd for helping out with the investigation and for proposing the patch for me to test.
The patch was also tested [5] without enabling CONFIG_AHCI_DWC, this also worked fine.
Cheers,
Anders

[1] http://ix.io/4hmt
[2] https://lkft.validation.linaro.org/scheduler/job/5902935
[3] http://ix.io/4hmv
[4] https://lkft.validation.linaro.org/scheduler/job/5903220
[5] http://ix.io/4hmw
On Thu, Dec 01, 2022 at 12:48:32PM +0100, Anders Roxell wrote:
--- a/drivers/ata/libahci_platform.c
+++ b/drivers/ata/libahci_platform.c
@@ -109,7 +109,8 @@ struct clk *ahci_platform_find_clk(struct ahci_host_priv *hpriv, const char *con
 	int i;
 
 	for (i = 0; i < hpriv->n_clks; i++) {
-		if (!strcmp(hpriv->clks[i].id, con_id))
+		if (hpriv->clks && hpriv->clks[i].id &&
+		    !strcmp(hpriv->clks[i].id, con_id))
 			return hpriv->clks[i].clk;
 	}
Indeed I should have taken into account that devm_clk_bulk_get_all() can get unnamed clocks too. But checking the hpriv->clks pointer for being not null is redundant, since the ahci_platform_get_resources() procedure makes sure that the array is always allocated. At the very least you shouldn't check the pointer in the loop, but can make sure that the clks array is available before it.
-Serge(y)
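The shape suggested in this review can be tried out in userspace. The sketch below is not the kernel function: `struct clk_entry`, `struct host_priv` and `find_clk()` are hypothetical stand-ins for `struct clk_bulk_data`, `struct ahci_host_priv` and `ahci_platform_find_clk()`. It checks the array once before the loop (a check the review describes as redundant in the kernel, since the array is always allocated there) and skips unnamed entries inside the loop, because passing a NULL `id` to `strcmp()` is undefined behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct clk_bulk_data. */
struct clk_entry {
	const char *id;	/* NULL models an unnamed clock */
	void *clk;
};

/* Hypothetical stand-in for the clock fields of struct ahci_host_priv. */
struct host_priv {
	struct clk_entry *clks;
	int n_clks;
};

/* Find a clock by name, tolerating unnamed entries in the array. */
void *find_clk(const struct host_priv *hpriv, const char *con_id)
{
	/* Checked once, outside the loop, rather than on every iteration. */
	if (!hpriv->clks)
		return NULL;

	for (int i = 0; i < hpriv->n_clks; i++) {
		/* Skip unnamed clocks instead of handing NULL to strcmp(). */
		if (hpriv->clks[i].id && !strcmp(hpriv->clks[i].id, con_id))
			return hpriv->clks[i].clk;
	}
	return NULL;
}
```

Given an array whose first entry has `id == NULL`, the unguarded comparison from the original code would dereference NULL on the first iteration, which is consistent with the early boot failure reported above.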
On Mon, Dec 5, 2022, at 02:11, Serge Semin wrote:
On Thu, Dec 01, 2022 at 12:48:32PM +0100, Anders Roxell wrote:
 	for (i = 0; i < hpriv->n_clks; i++) {
-		if (!strcmp(hpriv->clks[i].id, con_id))
+		if (hpriv->clks && hpriv->clks[i].id &&
+		    !strcmp(hpriv->clks[i].id, con_id))
 			return hpriv->clks[i].clk;
 	}
Indeed I should have taken into account that devm_clk_bulk_get_all() can get unnamed clocks too. But checking the hpriv->clks pointer for being not null is redundant, since the ahci_platform_get_resources() procedure makes sure that the array is always allocated. At the very least you shouldn't check the pointer in the loop, but can make sure that the clks array is available before it.
Do you think this is otherwise the correct fix then? Any chance we can still get a version of it into 6.1?
Arnd
On 12/5/22 19:08, Arnd Bergmann wrote:
Do you think this is otherwise the correct fix then? Any chance we can still get a version of it into 6.1?
If someone sends me a proper patch to apply, I can send a last PR for 6.1 to Linus before the weekend.
On Mon, Dec 05, 2022 at 10:24:22PM +0900, Damien Le Moal wrote:
On 12/5/22 19:08, Arnd Bergmann wrote:
Do you think this is otherwise the correct fix then? Any chance we can still get a version of it into 6.1?
I'll think of a better solution. But at this stage it seems like the best choice seeing the bindings permit having unnamed clocks specified.
If someone sends me a proper patch to apply, I can send a last PR for 6.1 to Linus before week end.
I'll submit the patch today. Thanks.
-Serge(y)
On 12/6/22 17:46, Serge Semin wrote:
I'll submit the patch today. Thanks.
Anders just posted one. Can you review it please ?
-- Damien Le Moal Western Digital Research
On Tue, Dec 06, 2022 at 06:12:48PM +0900, Damien Le Moal wrote:
Anders just posted one. Can you review it please ?
Done. Thanks.
-Serge(y)
-Serge(y)
Arnd
-- Damien Le Moal Western Digital Research
-- Damien Le Moal Western Digital Research
On Fri, Oct 14, 2022 at 09:31:55AM +0200, Arnd Bergmann wrote:
On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
On 10/14/22 07:07, Anders Roxell wrote: [...]
If reverting these patches restores the eSATA port on this board, then you need to fix the defconfig for that board.
OTOH, Anders enabled the new config CONFIG_AHCI_DWC=y and tried it, but the device failed to boot.
I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
As mentioned in my previous reply to Naresh, this is a new driver added in 6.1. Your board was working before so this should not be the driver needed for it.
However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA controller support") from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was successful.
Which is very strange... There is only one hunk in that commit that could be considered suspicious:
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 9b56490ecbc3..8f5572a9f8f1 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
 static const struct of_device_id ahci_of_match[] = {
 	{ .compatible = "generic-ahci", },
 	/* Keep the following compatibles for device tree compatibility */
-	{ .compatible = "snps,spear-ahci", },
 	{ .compatible = "ibm,476gtr-ahci", },
-	{ .compatible = "snps,dwc-ahci", },
 	{ .compatible = "hisilicon,hisi-ahci", },
 	{ .compatible = "cavium,octeon-7130-ahci", },
 	{ /* sentinel */ }
Is your board using one of these compatible strings ?
The x15 uses "snps,dwc-ahci". I would expect it to detect the device with the new driver if that is loaded, but it's possible that the driver does not work on all versions of the dwc-ahci hardware.
Anders, can you provide the boot log from a boot with the new driver built in? There should be some messages from dwc-ahci about finding the device, but then not ultimately working.
Yes. The boot-log would be very useful.
Depending on which way it goes wrong, the safest fallback for 6.1 is probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible strings back into the old driver, and leave the new one only for the "baikal,bt1-ahci" implementation of it, until it has been successfully verified on TI am5/dra7, spear13xx and exynos.
Right. This would be a possible solution. But I'd rather suggest to at least try to debug the problem.
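The fallback discussed above can be pictured with a simplified userspace model of the two match tables. This is only a sketch under stated assumptions: struct of_device_id is reduced to a single field, real device-tree matching is more involved than a plain string comparison, and driver_for(), table_matches() and match_selftest() are hypothetical helpers. The returned labels are taken from the source file names mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the kernel's struct of_device_id. */
struct of_device_id {
	const char *compatible;
};

/* Fallback variant of the generic table: both Synopsys strings are back. */
static const struct of_device_id generic_match[] = {
	{ "generic-ahci" },
	{ "snps,spear-ahci" },
	{ "ibm,476gtr-ahci" },
	{ "snps,dwc-ahci" },
	{ "hisilicon,hisi-ahci" },
	{ "cavium,octeon-7130-ahci" },
	{ NULL } /* sentinel */
};

/* In the fallback, the new driver keeps only the Baikal-T1 string. */
static const struct of_device_id dwc_match[] = {
	{ "baikal,bt1-ahci" },
	{ NULL } /* sentinel */
};

/* Walk a table up to its sentinel and report whether compat is listed. */
static int table_matches(const struct of_device_id *table, const char *compat)
{
	for (; table->compatible; table++)
		if (!strcmp(table->compatible, compat))
			return 1;
	return 0;
}

/* Hypothetical helper: label of the driver that would claim the node. */
const char *driver_for(const char *compat)
{
	if (table_matches(dwc_match, compat))
		return "ahci_dwc";
	if (table_matches(generic_match, compat))
		return "ahci_platform";
	return NULL;
}

/* Usage example: the x15 string lands on the generic driver again. */
int match_selftest(void)
{
	assert(!strcmp(driver_for("snps,dwc-ahci"), "ahci_platform"));
	assert(!strcmp(driver_for("baikal,bt1-ahci"), "ahci_dwc"));
	assert(driver_for("vendor,unknown-ahci") == NULL);
	return 0;
}
```

The point of the model is that the fix is purely a table change: no probe code moves, only which driver's table lists "snps,dwc-ahci" and "snps,spear-ahci".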
-Sergey
Arnd
Hello Damien, Anders
On Fri, Oct 14, 2022 at 09:22:34AM +0900, Damien Le Moal wrote:
On 10/14/22 07:07, Anders Roxell wrote: [...]
If reverting these patches restores the eSATA port on this board, then you need to fix the defconfig for that board.
OTOH, Anders enabled the new config CONFIG_AHCI_DWC=y and tried it, but the device failed to boot.
I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
As mentioned in my previous reply to Naresh, this is a new driver added in 6.1. Your board was working before so this should not be the driver needed for it.
However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA controller support") from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was successful.
Which is very strange... There is only one hunk in that commit that could be considered suspicious:
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 9b56490ecbc3..8f5572a9f8f1 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
 static const struct of_device_id ahci_of_match[] = {
 	{ .compatible = "generic-ahci", },
 	/* Keep the following compatibles for device tree compatibility */
-	{ .compatible = "snps,spear-ahci", },
 	{ .compatible = "ibm,476gtr-ahci", },
-	{ .compatible = "snps,dwc-ahci", },
 	{ .compatible = "hisilicon,hisi-ahci", },
 	{ .compatible = "cavium,octeon-7130-ahci", },
 	{ /* sentinel */ }
Is your board using one of these compatible strings ?
No. My board isn't using them. As a quick fix they could be moved back to the generic driver. But please see below.
Serge ? Any idea ?
The only difference between ahci_platform.c and ahci_dwc.c relevant to these compatibles is in calling the following methods:
ahci_dwc_check_cap(hpriv);
ahci_dwc_init_timer(hpriv);
ahci_dwc_init_dmacr(hpriv);
As a first step in debugging the problem I would comment them out and try to boot the system with the snps,dwc-ahci device being probed by the ahci_dwc.c driver.
Let's try to test that out first. Then we can narrow down the scale by commenting out one of these methods and then up to some parts of it. What do you think?
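One way to organise the narrowing-down proposed above is a skip mask, so that each boot test disables a known subset of the three calls. The sketch below is a userspace model only: the three functions are stubs that merely share their names with the calls listed above, their void return type is an assumption, and steps_run, the STEP_* bits, dwc_init_steps() and dwc_steps_selftest() are hypothetical.

```c
#include <assert.h>

/* Cut-down stand-in; the real struct ahci_host_priv has many more fields. */
struct ahci_host_priv {
	unsigned int steps_run; /* hypothetical: records which steps executed */
};

#define STEP_CHECK_CAP  (1u << 0)
#define STEP_INIT_TIMER (1u << 1)
#define STEP_INIT_DMACR (1u << 2)

/* Stubs named after the calls under suspicion; bodies are placeholders. */
static void ahci_dwc_check_cap(struct ahci_host_priv *hpriv)
{
	hpriv->steps_run |= STEP_CHECK_CAP;
}

static void ahci_dwc_init_timer(struct ahci_host_priv *hpriv)
{
	hpriv->steps_run |= STEP_INIT_TIMER;
}

static void ahci_dwc_init_dmacr(struct ahci_host_priv *hpriv)
{
	hpriv->steps_run |= STEP_INIT_DMACR;
}

/* Run every step whose bit is NOT set in skip_mask. */
void dwc_init_steps(struct ahci_host_priv *hpriv, unsigned int skip_mask)
{
	if (!(skip_mask & STEP_CHECK_CAP))
		ahci_dwc_check_cap(hpriv);
	if (!(skip_mask & STEP_INIT_TIMER))
		ahci_dwc_init_timer(hpriv);
	if (!(skip_mask & STEP_INIT_DMACR))
		ahci_dwc_init_dmacr(hpriv);
}

/* Usage example: first skip everything, then re-enable steps one by one. */
int dwc_steps_selftest(void)
{
	struct ahci_host_priv hpriv = { 0 };

	dwc_init_steps(&hpriv,
		       STEP_CHECK_CAP | STEP_INIT_TIMER | STEP_INIT_DMACR);
	assert(hpriv.steps_run == 0);

	hpriv.steps_run = 0;
	dwc_init_steps(&hpriv, STEP_INIT_TIMER);
	assert(hpriv.steps_run == (STEP_CHECK_CAP | STEP_INIT_DMACR));
	return 0;
}
```

If the board boots with all three bits set, clearing one bit per test run identifies the call that breaks detection, which matches the two-stage plan described in the message.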
-Sergey
-- Damien Le Moal Western Digital Research
On 10/13/22 21:39, Naresh Kamboju wrote:
On Thu, 13 Oct 2022 at 12:41, Damien Le Moal damien.lemoal@opensource.wdc.com wrote:
On 2022/10/12 16:24, Naresh Kamboju wrote:
On TI beagle board x15 the connected SSD is not detected on linux next 20221006 tag.
+ export STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
+ STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
+ test -n /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
+ echo y
+ mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
mke2fs 1.46.5 (30-Dec-2021)
The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does not exist and no size was specified.
+ lava-test-raise 'mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job exit'
The reported issue is now noticed on the Linux mainline master branch.
I see the following config is missing on the latest problematic builds:
- CONFIG_HAVE_PATA_PLATFORM=y
The following ahci sata kernel messages are missing on problematic boots:
[ 1.408660] ahci 4a140000.sata: forcing port_map 0x0 -> 0x1
[ 1.408691] ahci 4a140000.sata: AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl platform mode
[ 1.408721] ahci 4a140000.sata: flags: 64bit ncq sntf pm led clo only pmp pio slum part ccc apst
[ 1.409820] scsi host0: ahci
[ 1.410064] ata1: SATA max UDMA/133 mmio [mem 0x4a140000-0x4a1410ff] port 0x100 irq 98
The proper driver for this board is not being loaded I think, or not builtin. What is the compat string in the device tree for this ahci adapter ? What driver does it need ? I quickly tried to google that info but did not find any details.
GOOD: 9d84bb40bcb30a7fa16f33baa967aeb9953dda78
BAD: e08466a7c00733a501d3c5328d29ec974478d717
What are these ? "git show" says they are drm and rdma pull request merge from Linus...
Here I am adding links to the working and non-working test jobs and kernel configs. Problematic test job:
Good test job:
Hard to read... Can you send a diff of the kernel configs ?
metadata:
git_ref: master
git_repo: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline
git_sha: e08466a7c00733a501d3c5328d29ec974478d717
git_describe: v6.0-7220-ge08466a7c007
kernel_version: 6.0.0
kernel-config: https://builds.tuxbuild.com/2Fourpiqf1OrlPFFtKwhHV0wAiq/config
build-url: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline/-/pipelines/6...
artifact-location: https://builds.tuxbuild.com/2Fourpiqf1OrlPFFtKwhHV0wAiq
toolchain: gcc-10
For your information,
I see diff on good to bad commits,
$ git log --oneline 9d84bb40bcb3..e08466a7c007 -- drivers/ata
4078aa685097 Merge tag 'ata-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
71d7b6e51ad3 ata: libata-eh: avoid needless hard reset when revalidating link
e3b1fff6c051 ata: libata: drop superfluous ata_eh_analyze_tf() parameter
b46c760e11c8 ata: libata: drop superfluous ata_eh_request_sense() parameter
cb6e73aaadff ata: libata-eh: Remove the unneeded result variable
ecf8322f464d ata: ahci_st: Enable compile test
2d29dd108c78 ata: ahci_st: Fix compilation warning
9628711aa649 ata: ahci-dwc: Add Baikal-T1 AHCI SATA interface support
bc7af9100fa8 ata: ahci-dwc: Add platform-specific quirks support
33629d35090f ata: ahci: Add DWC AHCI SATA controller support
6ce73f3a6fc0 ata: libahci_platform: Add function returning a clock-handle by id
18ee7c49f75b ata: ahci: Introduce firmware-specific caps initialization
7cbbfbe01a72 ata: ahci: Convert __ahci_port_base to accepting hpriv as arguments
fad64dc06579 ata: libahci: Don't read AHCI version twice in the save-config method
88589772e80c ata: libahci: Discard redundant force_port_map parameter
eb7cae0b6afd ata: libahci: Extend port-cmd flags set with port capabilities
f67f12ff57bc ata: libahci_platform: Introduce reset assertion/deassertion methods
3f74cd046fbe ata: libahci_platform: Parse ports-implemented property in resources getter
3c132ea6508b ata: libahci_platform: Sanity check the DT child nodes number
e28b3abf8020 ata: libahci_platform: Convert to using devm bulk clocks API
82d437e6dcb1 ata: libahci_platform: Convert to using platform devm-ioremap methods
d3243965f24a ata: make PATA_PLATFORM selectable only for suitable architectures
3ebe59a54111 ata: clean up how architectures enable PATA_PLATFORM and PATA_OF_PLATFORM
55d5ba550535 ata: libata-core: Check errors in sata_print_link_status()
03070458d700 ata: libata-sff: Fix double word in comments
0b2436d3d25f ata: pata_macio: Remove unneeded word in comments
024811a2da45 ata: libata-core: Simplify ata_dev_set_xfermode()
066de3b9d93b ata: libata-core: Simplify ata_build_rw_tf()
e00923c59e68 ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE
614065aba704 ata: libata-core: remove redundant err_mask variable
fee6073051c3 ata: ahci: Do not check ACPI_FADT_LOW_POWER_S0
99ad3f9f829f ata: libata-core: improve parameter names for ata_dev_set_feature()
16169fb78182 ata: libata-core: Print timeout value when internal command times
I do not understand what you are trying to say here. These are the latest ata patches for 6.1. They touch different drivers and ata core. I still do not know which driver needs to be used on that board...
Test log:
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
metadata:
git_ref: master
git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
git_sha: 7da9fed0474b4cd46055dd92d55c42faf32c19ac
git_describe: next-20221006
kernel_version: 6.0.0
kernel-config: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F/config
build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/659754170
artifact-location: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F
toolchain: gcc-10
The kernel messages that are shown in the links above do not show any "libata version 3.00 loaded." message nor any ata/ahci message that I can see. So I think the eSATA adapter is not even being detected and libata/ahci driver not used.
Was this working before ? If yes, can you try with the following patches reverted ?
d3243965f24a ("ata: make PATA_PLATFORM selectable only for suitable architectures")
3ebe59a54111 ("ata: clean up how architectures enable PATA_PLATFORM and PATA_OF_PLATFORM")
I have reverted the above two patches, but the problem has not been solved.
OK.
If reverting these patches restores the eSATA port on this board, then you need to fix the defconfig for that board.
OTOH, Anders enabled the new config CONFIG_AHCI_DWC=y and tried it, but the device failed to boot.
Why would you need to enable this new driver ? Your board was working before without this new driver, so it is not the one to use for this board, right ? Please send the ata related bits of the device tree to understand what this board needs.