On Thu, Sep 08, 2022 at 04:42:38PM +0000, Lazar, Lijo wrote:
I am not sure that ASPM settings can be generalized by the PCIe core. Balancing performance against power savings when ASPM is enabled will require some additional tuning, and that tuning will be device specific.
Can you elaborate on this? In the universe of drivers, very few do their own ASPM configuration, and it's usually to work around hardware defects, e.g., L1 doesn't work on some e1000e devices, L0s doesn't work on some iwlwifi devices, etc.
The core does know how to configure all the ASPM features defined in the PCIe spec, e.g., L0s, L1, L1.1, L1.2, and LTR.
In some of the other ASICs, this programming is done in VBIOS/SBIOS firmware. Having it in the driver provides the advantage of additional tuning without forcing a VBIOS upgrade.
I think it's clearly the intent of the PCIe spec that ASPM configuration be done by generic code. Here are some things that require a system-level view, not just an individual device view:
- L0s, L1, and L1 Substates cannot be enabled unless both ends support it (PCIe r6.0, secs 5.4.1.4, 7.5.3.7, 5.5.4).
- Devices advertise the "Acceptable Latency" they can accept for transitions from L0s or L1 to L0, and the actual latency depends on the "Exit Latencies" of all the devices in the path to the Root Port (sec 5.4.1.3.2).
- LTR (required by L1.2) cannot be enabled unless it is already enabled in all upstream devices (sec 6.18). This patch relies on "ltr_path", which works now but depends on the PCI core never reconfiguring the upstream path.
There might be amdgpu-specific features the driver needs to set up, but if drivers fiddle with architected features like LTR behind the PCI core's back, things are likely to break.
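To make the second bullet above concrete, here is a rough sketch of the kind of path-wide check the core has to make before enabling L1 for an endpoint. This is not the actual aspm.c code, and the two latency helpers are hypothetical (in reality the values come from Link Capabilities and Device Capabilities, and the full rules in sec 5.4.1.3.2 are more detailed):

#include <linux/pci.h>

/* Hypothetical helpers, for illustration only; not real kernel interfaces. */
u32 l1_exit_latency_ns(struct pci_dev *dev);        /* from Link Capabilities */
u32 l1_acceptable_latency_ns(struct pci_dev *dev);  /* from Device Capabilities */

/*
 * Sketch: accumulate L1 exit latency over every link between the
 * endpoint and the Root Port, and compare it with the latency the
 * endpoint says it can accept.  No single device can do this check
 * for itself.
 */
static bool l1_latency_acceptable(struct pci_dev *endpoint)
{
        u32 total_ns = 0;
        struct pci_dev *dev;

        for (dev = endpoint; dev; dev = pci_upstream_bridge(dev))
                total_ns += l1_exit_latency_ns(dev);

        return total_ns <= l1_acceptable_latency_ns(endpoint);
}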
From: Alex Deucher alexdeucher@gmail.com
On Thu, Sep 8, 2022 at 12:12 PM Bjorn Helgaas helgaas@kernel.org wrote:
Do you know why the driver configures ASPM itself? If the PCI core is doing something wrong (and I'm sure it is, ASPM support is kind of a mess), I'd much prefer to fix up the core where *all* drivers can benefit from it.
This is the programming sequence we get from our hardware team, and it is used on both Windows and Linux. As far as I understand it, Windows doesn't handle this in the core; it's up to the individual drivers to enable it. I'm not familiar with how this should be enabled generically, but at least for our hardware it seems to have some variation compared to what is done in the PCI core, due to stability, etc. It seems to me that this may need ASIC-specific implementations for a lot of hardware, depending on the required programming sequences. E.g., various ASICs may need hardware workarounds for bugs or platform issues, etc. I can ask for more details from our hardware team.
If the PCI core has stability issues, I want to fix them. This hardware may have its own stability issues, and I would ideally like to have drivers use interfaces like pci_disable_link_state() to avoid broken things. Maybe we need new interfaces for more subtle kinds of breakage.
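For illustration, the opt-out path looks roughly like this in a driver's probe code. The function name and the choice of L0s are made up here; the point is only that the request goes through the core rather than behind its back:

#include <linux/pci.h>

/* Example only: ask the core to keep a known-broken state disabled. */
static void example_quirk_aspm(struct pci_dev *pdev)
{
        /* e.g. L0s has been seen to be unstable on this device */
        pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S);
}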
Bjorn
On Thu, Sep 8, 2022 at 1:57 PM Bjorn Helgaas helgaas@kernel.org wrote:
On Thu, Sep 08, 2022 at 04:42:38PM +0000, Lazar, Lijo wrote:
I am not sure that ASPM settings can be generalized by the PCIe core. Balancing performance against power savings when ASPM is enabled will require some additional tuning, and that tuning will be device specific.
Can you elaborate on this? In the universe of drivers, very few do their own ASPM configuration, and it's usually to work around hardware defects, e.g., L1 doesn't work on some e1000e devices, L0s doesn't work on some iwlwifi devices, etc.
The core does know how to configure all the ASPM features defined in the PCIe spec, e.g., L0s, L1, L1.1, L1.2, and LTR.
In some of the other ASICs, this programming is done in VBIOS/SBIOS firmware. Having it in the driver provides the advantage of additional tuning without forcing a VBIOS upgrade.
I think it's clearly the intent of the PCIe spec that ASPM configuration be done by generic code. Here are some things that require a system-level view, not just an individual device view:
- L0s, L1, and L1 Substates cannot be enabled unless both ends support it (PCIe r6.0, secs 5.4.1.4, 7.5.3.7, 5.5.4).
- Devices advertise the "Acceptable Latency" they can accept for transitions from L0s or L1 to L0, and the actual latency depends on the "Exit Latencies" of all the devices in the path to the Root Port (sec 5.4.1.3.2).
- LTR (required by L1.2) cannot be enabled unless it is already enabled in all upstream devices (sec 6.18). This patch relies on "ltr_path", which works now but depends on the PCI core never reconfiguring the upstream path.
There might be amdgpu-specific features the driver needs to set up, but if drivers fiddle with architected features like LTR behind the PCI core's back, things are likely to break.
From: Alex Deucher alexdeucher@gmail.com
On Thu, Sep 8, 2022 at 12:12 PM Bjorn Helgaas helgaas@kernel.org wrote:
Do you know why the driver configures ASPM itself? If the PCI core is doing something wrong (and I'm sure it is, ASPM support is kind of a mess), I'd much prefer to fix up the core where *all* drivers can benefit from it.
This is the programming sequence we get from our hardware team, and it is used on both Windows and Linux. As far as I understand it, Windows doesn't handle this in the core; it's up to the individual drivers to enable it. I'm not familiar with how this should be enabled generically, but at least for our hardware it seems to have some variation compared to what is done in the PCI core, due to stability, etc. It seems to me that this may need ASIC-specific implementations for a lot of hardware, depending on the required programming sequences. E.g., various ASICs may need hardware workarounds for bugs or platform issues, etc. I can ask for more details from our hardware team.
If the PCI core has stability issues, I want to fix them. This hardware may have its own stability issues, and I would ideally like to have drivers use interfaces like pci_disable_link_state() to avoid broken things. Maybe we need new interfaces for more subtle kinds of breakage.
I'm not sure what, if anything, is wrong with the current generic PCIe ASPM code in Linux. I was speaking more from a hardware validation standpoint. E.g., our silicon validation and hardware teams run a lot of tests on a bunch of platforms and tune the programming sequence for speed/power/stability. Then they hand the programming sequence off to the software teams as a sort of golden config or rule set for ASPM enablement in the OS for each device. I'm not exactly sure how far these sequences stray from what the core PCI code does. I'll try to find out more.
Alex
On 9/8/2022 11:27 PM, Bjorn Helgaas wrote:
On Thu, Sep 08, 2022 at 04:42:38PM +0000, Lazar, Lijo wrote:
I am not sure that ASPM settings can be generalized by the PCIe core. Balancing performance against power savings when ASPM is enabled will require some additional tuning, and that tuning will be device specific.
Can you elaborate on this? In the universe of drivers, very few do their own ASPM configuration, and it's usually to work around hardware defects, e.g., L1 doesn't work on some e1000e devices, L0s doesn't work on some iwlwifi devices, etc.
The core does know how to configure all the ASPM features defined in the PCIe spec, e.g., L0s, L1, L1.1, L1.2, and LTR.
In some of the other ASICs, this programming is done in VBIOS/SBIOS firmware. Having it in the driver provides the advantage of additional tuning without forcing a VBIOS upgrade.
I think it's clearly the intent of the PCIe spec that ASPM configuration be done by generic code. Here are some things that require a system-level view, not just an individual device view:
- L0s, L1, and L1 Substates cannot be enabled unless both ends support it (PCIe r6.0, secs 5.4.1.4, 7.5.3.7, 5.5.4).
- Devices advertise the "Acceptable Latency" they can accept for transitions from L0s or L1 to L0, and the actual latency depends on the "Exit Latencies" of all the devices in the path to the Root Port (sec 5.4.1.3.2).
- LTR (required by L1.2) cannot be enabled unless it is already enabled in all upstream devices (sec 6.18). This patch relies on "ltr_path", which works now but depends on the PCI core never reconfiguring the upstream path.
There might be amdgpu-specific features the driver needs to set up, but if drivers fiddle with architected features like LTR behind the PCI core's back, things are likely to break.
The programming is mostly related to entry conditions, and the spec leaves those to the implementation.
From the r4.0 spec: "This specification does not dictate when a component with an Upstream Port must initiate a transition to the L1 state. The interoperable mechanisms for transitioning into and out of L1 are defined within this specification; however, the specific ASPM policy governing when to transition into L1 is left to the implementer. ... Another approach would be for the Downstream device to initiate a transition to the L1 state once the Link has been idle in L0 for a set amount of time."
Some of the programming, like the snippet below, relates to the entry timings.
def = data = RREG32_SOC15(NBIO, 0, regRCC_STRAP0_RCC_BIF_STRAP3);      /* read current strap value */
data |= 0x5DE0 << RCC_BIF_STRAP3__STRAP_VLINK_ASPM_IDLE_TIMER__SHIFT;   /* idle time before requesting ASPM entry */
data |= 0x0010 << RCC_BIF_STRAP3__STRAP_VLINK_PM_L1_ENTRY_TIMER__SHIFT; /* L1 entry timer */
if (def != data)
        WREG32_SOC15(NBIO, 0, regRCC_STRAP0_RCC_BIF_STRAP3, data);      /* write back only if something changed */
Similarly for LTR: since it provides a dynamic mechanism for reporting latency tolerance while in L1 substates, the tolerance timings can be tuned through registers, though there is a threshold.
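If the threshold in question maps to the architected LTR_L1.2_THRESHOLD, that field lives in the L1 PM Substates capability and is reachable through the generic config accessors. A sketch of reading it, for illustration only (interpretation of the scale encoding and error handling are omitted):

#include <linux/bitfield.h>
#include <linux/pci.h>

/* Sketch: read the architected LTR_L1.2_THRESHOLD value/scale fields. */
static int read_ltr_l12_threshold(struct pci_dev *pdev, u16 *value, u8 *scale)
{
        u32 ctl1;
        int pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_L1SS);

        if (!pos)
                return -ENODEV;

        pci_read_config_dword(pdev, pos + PCI_L1SS_CTL1, &ctl1);
        *value = FIELD_GET(PCI_L1SS_CTL1_LTR_L1_2_TH_VALUE, ctl1);
        *scale = FIELD_GET(PCI_L1SS_CTL1_LTR_L1_2_TH_SCALE, ctl1);
        return 0;
}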
Regardless, Alex is already checking with the hardware design team on possible improvements.
Thanks, Lijo
From: Alex Deucher alexdeucher@gmail.com
On Thu, Sep 8, 2022 at 12:12 PM Bjorn Helgaas helgaas@kernel.org wrote:
Do you know why the driver configures ASPM itself? If the PCI core is doing something wrong (and I'm sure it is, ASPM support is kind of a mess), I'd much prefer to fix up the core where *all* drivers can benefit from it.
This is the programming sequence we get from our hardware team, and it is used on both Windows and Linux. As far as I understand it, Windows doesn't handle this in the core; it's up to the individual drivers to enable it. I'm not familiar with how this should be enabled generically, but at least for our hardware it seems to have some variation compared to what is done in the PCI core, due to stability, etc. It seems to me that this may need ASIC-specific implementations for a lot of hardware, depending on the required programming sequences. E.g., various ASICs may need hardware workarounds for bugs or platform issues, etc. I can ask for more details from our hardware team.
If the PCI core has stability issues, I want to fix them. This hardware may have its own stability issues, and I would ideally like to have drivers use interfaces like pci_disable_link_state() to avoid broken things. Maybe we need new interfaces for more subtle kinds of breakage.
Bjorn
On Fri, Sep 09, 2022 at 01:11:54PM +0530, Lazar, Lijo wrote:
On 9/8/2022 11:27 PM, Bjorn Helgaas wrote:
On Thu, Sep 08, 2022 at 04:42:38PM +0000, Lazar, Lijo wrote:
I am not sure that ASPM settings can be generalized by the PCIe core. Balancing performance against power savings when ASPM is enabled will require some additional tuning, and that tuning will be device specific.
Can you elaborate on this? In the universe of drivers, very few do their own ASPM configuration, and it's usually to work around hardware defects, e.g., L1 doesn't work on some e1000e devices, L0s doesn't work on some iwlwifi devices, etc.
The core does know how to configure all the ASPM features defined in the PCIe spec, e.g., L0s, L1, L1.1, L1.2, and LTR.
In some of the other ASICs, this programming is done in VBIOS/SBIOS firmware. Having it in the driver provides the advantage of additional tuning without forcing a VBIOS upgrade.
I think it's clearly the intent of the PCIe spec that ASPM configuration be done by generic code. Here are some things that require a system-level view, not just an individual device view:
- L0s, L1, and L1 Substates cannot be enabled unless both ends support it (PCIe r6.0, secs 5.4.1.4, 7.5.3.7, 5.5.4).
- Devices advertise the "Acceptable Latency" they can accept for transitions from L0s or L1 to L0, and the actual latency depends on the "Exit Latencies" of all the devices in the path to the Root Port (sec 5.4.1.3.2).
- LTR (required by L1.2) cannot be enabled unless it is already enabled in all upstream devices (sec 6.18). This patch relies on "ltr_path", which works now but depends on the PCI core never reconfiguring the upstream path.
There might be amdgpu-specific features the driver needs to set up, but if drivers fiddle with architected features like LTR behind the PCI core's back, things are likely to break.
The programming is mostly related to entry conditions, and the spec leaves those to the implementation.
From the r4.0 spec: "This specification does not dictate when a component with an Upstream Port must initiate a transition to the L1 state. The interoperable mechanisms for transitioning into and out of L1 are defined within this specification; however, the specific ASPM policy governing when to transition into L1 is left to the implementer. ... Another approach would be for the Downstream device to initiate a transition to the L1 state once the Link has been idle in L0 for a set amount of time."
Some of the programming, like the snippet below, relates to the entry timings.
def = data = RREG32_SOC15(NBIO, 0, regRCC_STRAP0_RCC_BIF_STRAP3);      /* read current strap value */
data |= 0x5DE0 << RCC_BIF_STRAP3__STRAP_VLINK_ASPM_IDLE_TIMER__SHIFT;   /* idle time before requesting ASPM entry */
data |= 0x0010 << RCC_BIF_STRAP3__STRAP_VLINK_PM_L1_ENTRY_TIMER__SHIFT; /* L1 entry timer */
if (def != data)
        WREG32_SOC15(NBIO, 0, regRCC_STRAP0_RCC_BIF_STRAP3, data);      /* write back only if something changed */
Similarly for LTR: since it provides a dynamic mechanism for reporting latency tolerance while in L1 substates, the tolerance timings can be tuned through registers, though there is a threshold.
I don't object to the driver programming device-specific things, although there might be issues if it does that after the core has already configured and enabled ASPM -- the driver might need to temporarily disable ASPM while it updates parameters, then re-enable it.
I *do* object to the driver programming PCIe-generic things that the PCI core thinks it owns. It's especially annoying if the driver uses device-specific #defines and access methods for generic PCIe things because then we can't even find potential conflicts.
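As a concrete illustration of the difference: the same architected bit is easy to audit when touched through the generic helpers, and invisible when touched through a device-specific register mirror. The vendor names in the comment below are hypothetical, and this is not a suggestion that a driver should set the bit itself:

#include <linux/pci.h>

/*
 * Illustration only: an access through the generic helper shows up in
 * any audit of PCI_EXP_LNKCTL / ASPM users in the tree.
 */
static void architected_access_is_greppable(struct pci_dev *pdev)
{
        pcie_capability_set_word(pdev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_ASPM_L1);
}

/*
 * The same state changed through a vendor register mirror and
 * device-specific #defines (hypothetical names below) cannot be found
 * that way, which is what makes conflicts with the core hard to spot:
 *
 *      data = RREG32_PCIE(VENDOR_PCIE_LNKCTL_MIRROR);
 *      data |= VENDOR_LNKCTL_ASPM_L1_EN;
 *      WREG32_PCIE(VENDOR_PCIE_LNKCTL_MIRROR, data);
 */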
From: Alex Deucher alexdeucher@gmail.com
On Thu, Sep 8, 2022 at 12:12 PM Bjorn Helgaas helgaas@kernel.org wrote:
Do you know why the driver configures ASPM itself? If the PCI core is doing something wrong (and I'm sure it is, ASPM support is kind of a mess), I'd much prefer to fix up the core where *all* drivers can benefit from it.
This is the programming sequence we get from our hardware team, and it is used on both Windows and Linux. As far as I understand it, Windows doesn't handle this in the core; it's up to the individual drivers to enable it. I'm not familiar with how this should be enabled generically, but at least for our hardware it seems to have some variation compared to what is done in the PCI core, due to stability, etc. It seems to me that this may need ASIC-specific implementations for a lot of hardware, depending on the required programming sequences. E.g., various ASICs may need hardware workarounds for bugs or platform issues, etc. I can ask for more details from our hardware team.
If the PCI core has stability issues, I want to fix them. This hardware may have its own stability issues, and I would ideally like to have drivers use interfaces like pci_disable_link_state() to avoid broken things. Maybe we need new interfaces for more subtle kinds of breakage.
Bjorn