From: Bjorn Helgaas bhelgaas@google.com
When booting with "pci=noaer", we don't request control of AER, but we previously *did* request control of DPC, as in the dmesg log attached at the bugzilla below:
Command line: ... pci=noaer acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which says:
If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it must also set bit 7 of the Support field (indicating support for Error Disconnect Recover notifications) and bits 3 and 4 of the Control field (requesting control of PCI Express Advanced Error Reporting and the PCI Express Capability Structure).
Request DPC control only if we have also requested AER control.
Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support") Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12 Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org # v5.7+ Cc: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Cc: Matthew W Carlis mattc@purestorage.com Cc: Keith Busch kbusch@kernel.org Cc: Lukas Wunner lukas@wunner.de Cc: Mika Westerberg mika.westerberg@linux.intel.com Cc: Jesse Brandeburg jesse.brandeburg@intel.com --- drivers/acpi/pci_root.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c index 58b89b8d950e..efc292b6214e 100644 --- a/drivers/acpi/pci_root.c +++ b/drivers/acpi/pci_root.c @@ -518,17 +518,19 @@ static u32 calculate_control(void) if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC)) control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
- if (pci_aer_available()) + if (pci_aer_available()) { control |= OSC_PCI_EXPRESS_AER_CONTROL;
- /* - * Per the Downstream Port Containment Related Enhancements ECN to - * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5, - * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC - * and EDR. - */ - if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR)) - control |= OSC_PCI_EXPRESS_DPC_CONTROL; + /* + * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the + * OS can request DPC control only if it has advertised + * OSC_PCI_EDR_SUPPORT and requested both + * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and + * OSC_PCI_EXPRESS_AER_CONTROL. + */ + if (IS_ENABLED(CONFIG_PCIE_DPC)) + control |= OSC_PCI_EXPRESS_DPC_CONTROL; + }
return control; }
On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
From: Bjorn Helgaas bhelgaas@google.com
When booting with "pci=noaer", we don't request control of AER, but we previously *did* request control of DPC, as in the dmesg log attached at the bugzilla below:
Command line: ... pci=noaer acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which says:
If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it must also set bit 7 of the Support field (indicating support for Error Disconnect Recover notifications) and bits 3 and 4 of the Control field (requesting control of PCI Express Advanced Error Reporting and the PCI Express Capability Structure).
IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies Between _OSC Control Bits".
Because handling of Downstream Port Containment has a dependency on Advanced Error Reporting, the operating system is required to request control over Advanced Error Reporting (bit 3 of the Control field) while requesting control over Downstream Port Containment Configuration (bit 7 of the Control field). If the operating system attempts to claim control of Downstream Port Containment Configuration without also claiming control over Advanced Error Reporting, firmware is required to refuse control of the feature being illegally claimed and mask the corresponding bit. Firmware is required to maintain ownership of Advanced Error Reporting if it retains ownership of Downstream Port Containment Configuration. If the operating system sets bit 7 of the Control field, it must set bit 7 of the Support field, indicating support for the Error Disconnect Recover event.
Request DPC control only if we have also requested AER control.
Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support") Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12 Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org # v5.7+ Cc: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Cc: Matthew W Carlis mattc@purestorage.com Cc: Keith Busch kbusch@kernel.org Cc: Lukas Wunner lukas@wunner.de Cc: Mika Westerberg mika.westerberg@linux.intel.com Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Code wise it looks fine to me.
Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com
drivers/acpi/pci_root.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c index 58b89b8d950e..efc292b6214e 100644 --- a/drivers/acpi/pci_root.c +++ b/drivers/acpi/pci_root.c @@ -518,17 +518,19 @@ static u32 calculate_control(void) if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC)) control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
- if (pci_aer_available())
- if (pci_aer_available()) { control |= OSC_PCI_EXPRESS_AER_CONTROL;
- /*
* Per the Downstream Port Containment Related Enhancements ECN to
* the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
* OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
* and EDR.
*/
- if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
control |= OSC_PCI_EXPRESS_DPC_CONTROL;
/*
* Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
* OS can request DPC control only if it has advertised
* OSC_PCI_EDR_SUPPORT and requested both
* OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
* OSC_PCI_EXPRESS_AER_CONTROL.
*/
if (IS_ENABLED(CONFIG_PCIE_DPC))
control |= OSC_PCI_EXPRESS_DPC_CONTROL;
- }
return control; }
On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
From: Bjorn Helgaas bhelgaas@google.com
When booting with "pci=noaer", we don't request control of AER, but we previously *did* request control of DPC, as in the dmesg log attached at the bugzilla below:
Command line: ... pci=noaer acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which says:
If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it must also set bit 7 of the Support field (indicating support for Error Disconnect Recover notifications) and bits 3 and 4 of the Control field (requesting control of PCI Express Advanced Error Reporting and the PCI Express Capability Structure).
IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies Between _OSC Control Bits".
Because handling of Downstream Port Containment has a dependency on Advanced Error Reporting, the operating system is required to request control over Advanced Error Reporting (bit 3 of the Control field) while requesting control over Downstream Port Containment Configuration (bit 7 of the Control field). If the operating system attempts to claim control of Downstream Port Containment Configuration without also claiming control over Advanced Error Reporting, firmware is required to refuse control of the feature being illegally claimed and mask the corresponding bit. Firmware is required to maintain ownership of Advanced Error Reporting if it retains ownership of Downstream Port Containment Configuration. If the operating system sets bit 7 of the Control field, it must set bit 7 of the Support field, indicating support for the Error Disconnect Recover event.
So I guess you're suggesting that there are two defects here?
1) Linux requested DPC control without requesting AER control.
2) Platform granted DPC control when it shouldn't have.
I do agree with that, but obviously we can only fix 1) in Linux.
Request DPC control only if we have also requested AER control.
Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support") Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12 Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org # v5.7+ Cc: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Cc: Matthew W Carlis mattc@purestorage.com Cc: Keith Busch kbusch@kernel.org Cc: Lukas Wunner lukas@wunner.de Cc: Mika Westerberg mika.westerberg@linux.intel.com Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Code wise it looks fine to me.
Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com
drivers/acpi/pci_root.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c index 58b89b8d950e..efc292b6214e 100644 --- a/drivers/acpi/pci_root.c +++ b/drivers/acpi/pci_root.c @@ -518,17 +518,19 @@ static u32 calculate_control(void) if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC)) control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
- if (pci_aer_available())
- if (pci_aer_available()) { control |= OSC_PCI_EXPRESS_AER_CONTROL;
- /*
* Per the Downstream Port Containment Related Enhancements ECN to
* the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
* OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
* and EDR.
*/
- if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
control |= OSC_PCI_EXPRESS_DPC_CONTROL;
/*
* Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
* OS can request DPC control only if it has advertised
* OSC_PCI_EDR_SUPPORT and requested both
* OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
* OSC_PCI_EXPRESS_AER_CONTROL.
*/
if (IS_ENABLED(CONFIG_PCIE_DPC))
control |= OSC_PCI_EXPRESS_DPC_CONTROL;
- }
return control; }
-- Sathyanarayanan Kuppuswamy Linux Kernel Developer
On 2/26/24 7:18 AM, Bjorn Helgaas wrote:
On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
From: Bjorn Helgaas bhelgaas@google.com
When booting with "pci=noaer", we don't request control of AER, but we previously *did* request control of DPC, as in the dmesg log attached at the bugzilla below:
Command line: ... pci=noaer acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which says:
If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it must also set bit 7 of the Support field (indicating support for Error Disconnect Recover notifications) and bits 3 and 4 of the Control field (requesting control of PCI Express Advanced Error Reporting and the PCI Express Capability Structure).
IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies Between _OSC Control Bits".
Because handling of Downstream Port Containment has a dependency on Advanced Error Reporting, the operating system is required to request control over Advanced Error Reporting (bit 3 of the Control field) while requesting control over Downstream Port Containment Configuration (bit 7 of the Control field). If the operating system attempts to claim control of Downstream Port Containment Configuration without also claiming control over Advanced Error Reporting, firmware is required to refuse control of the feature being illegally claimed and mask the corresponding bit. Firmware is required to maintain ownership of Advanced Error Reporting if it retains ownership of Downstream Port Containment Configuration. If the operating system sets bit 7 of the Control field, it must set bit 7 of the Support field, indicating support for the Error Disconnect Recover event.
So I guess you're suggesting that there are two defects here?
Linux requested DPC control without requesting AER control.
Platform granted DPC control when it shouldn't have.
I do agree with that, but obviously we can only fix 1) in Linux.
Sorry, maybe my comment was not clear. I was just suggesting to change the spec reference from r3.3, sec 4.5.1, table 4-5 to r3.3, sec 4.5.2.4 "Dependencies Between _OSC Control Bits". I agree that we cannot do much if firmware misbehaves here.
Request DPC control only if we have also requested AER control.
Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support") Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12 Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org # v5.7+ Cc: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Cc: Matthew W Carlis mattc@purestorage.com Cc: Keith Busch kbusch@kernel.org Cc: Lukas Wunner lukas@wunner.de Cc: Mika Westerberg mika.westerberg@linux.intel.com Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Code wise it looks fine to me.
Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com
drivers/acpi/pci_root.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c index 58b89b8d950e..efc292b6214e 100644 --- a/drivers/acpi/pci_root.c +++ b/drivers/acpi/pci_root.c @@ -518,17 +518,19 @@ static u32 calculate_control(void) if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC)) control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
- if (pci_aer_available())
- if (pci_aer_available()) { control |= OSC_PCI_EXPRESS_AER_CONTROL;
- /*
* Per the Downstream Port Containment Related Enhancements ECN to
* the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
* OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
* and EDR.
*/
- if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
control |= OSC_PCI_EXPRESS_DPC_CONTROL;
/*
* Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
* OS can request DPC control only if it has advertised
* OSC_PCI_EDR_SUPPORT and requested both
* OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
* OSC_PCI_EXPRESS_AER_CONTROL.
*/
if (IS_ENABLED(CONFIG_PCIE_DPC))
control |= OSC_PCI_EXPRESS_DPC_CONTROL;
- }
return control; }
-- Sathyanarayanan Kuppuswamy Linux Kernel Developer
On Mon, Feb 26, 2024 at 07:46:05AM -0800, Kuppuswamy Sathyanarayanan wrote:
On 2/26/24 7:18 AM, Bjorn Helgaas wrote:
On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
From: Bjorn Helgaas bhelgaas@google.com
When booting with "pci=noaer", we don't request control of AER, but we previously *did* request control of DPC, as in the dmesg log attached at the bugzilla below:
Command line: ... pci=noaer acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which says:
If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it must also set bit 7 of the Support field (indicating support for Error Disconnect Recover notifications) and bits 3 and 4 of the Control field (requesting control of PCI Express Advanced Error Reporting and the PCI Express Capability Structure).
IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies Between _OSC Control Bits".
Because handling of Downstream Port Containment has a dependency on Advanced Error Reporting, the operating system is required to request control over Advanced Error Reporting (bit 3 of the Control field) while requesting control over Downstream Port Containment Configuration (bit 7 of the Control field). If the operating system attempts to claim control of Downstream Port Containment Configuration without also claiming control over Advanced Error Reporting, firmware is required to refuse control of the feature being illegally claimed and mask the corresponding bit. Firmware is required to maintain ownership of Advanced Error Reporting if it retains ownership of Downstream Port Containment Configuration. If the operating system sets bit 7 of the Control field, it must set bit 7 of the Support field, indicating support for the Error Disconnect Recover event.
So I guess you're suggesting that there are two defects here?
Linux requested DPC control without requesting AER control.
Platform granted DPC control when it shouldn't have.
I do agree with that, but obviously we can only fix 1) in Linux.
Sorry, maybe my comment was not clear. I was just suggesting to change the spec reference from r3.3, sec 4.5.1, table 4-5 to r3.3, sec 4.5.2.4 "Dependencies Between _OSC Control Bits".
The requirement that the OS request AER control whenever it requests DPC control is mentioned in both sec 4.5.1 and sec 4.5.2.4. IMO sec 4.5.2.4 should not exist because the per-bit table in sec 4.5.1 is a better place for implementation guidance. 4.5.2.4 is easy to miss, mostly redundant, and hard to integrate with the 4.5.1 table.
What advantage do you see for citing 4.5.2.4 instead of 4.5.1? The only real difference I see is that it also points out a firmware problem. I don't think the extra text is worth it since it doesn't motivate the Linux change.
Bjorn
On 2/26/24 8:33 AM, Bjorn Helgaas wrote:
On Mon, Feb 26, 2024 at 07:46:05AM -0800, Kuppuswamy Sathyanarayanan wrote:
On 2/26/24 7:18 AM, Bjorn Helgaas wrote:
On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
From: Bjorn Helgaas bhelgaas@google.com
When booting with "pci=noaer", we don't request control of AER, but we previously *did* request control of DPC, as in the dmesg log attached at the bugzilla below:
Command line: ... pci=noaer acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which says:
If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it must also set bit 7 of the Support field (indicating support for Error Disconnect Recover notifications) and bits 3 and 4 of the Control field (requesting control of PCI Express Advanced Error Reporting and the PCI Express Capability Structure).
IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies Between _OSC Control Bits".
Because handling of Downstream Port Containment has a dependency on Advanced Error Reporting, the operating system is required to request control over Advanced Error Reporting (bit 3 of the Control field) while requesting control over Downstream Port Containment Configuration (bit 7 of the Control field). If the operating system attempts to claim control of Downstream Port Containment Configuration without also claiming control over Advanced Error Reporting, firmware is required to refuse control of the feature being illegally claimed and mask the corresponding bit. Firmware is required to maintain ownership of Advanced Error Reporting if it retains ownership of Downstream Port Containment Configuration. If the operating system sets bit 7 of the Control field, it must set bit 7 of the Support field, indicating support for the Error Disconnect Recover event.
So I guess you're suggesting that there are two defects here?
Linux requested DPC control without requesting AER control.
Platform granted DPC control when it shouldn't have.
I do agree with that, but obviously we can only fix 1) in Linux.
Sorry, maybe my comment was not clear. I was just suggesting to change the spec reference from r3.3, sec 4.5.1, table 4-5 to r3.3, sec 4.5.2.4 "Dependencies Between _OSC Control Bits".
The requirement that the OS request AER control whenever it requests DPC control is mentioned in both sec 4.5.1 and sec 4.5.2.4. IMO sec 4.5.2.4 should not exist because the per-bit table in sec 4.5.1 is a better place for implementation guidance. 4.5.2.4 is easy to miss, mostly redundant, and hard to integrate with the 4.5.1 table.
What advantage do you see for citing 4.5.2.4 instead of 4.5.1? The only real difference I see is that it also points out a firmware problem. I don't think the extra text is worth it since it doesn't motivate the Linux change.
My thinking is, since the fix is related to the dependency between _OSC control bits (AER & DPC), and there is a special section in the spec which discuss it, I thought it is better to quote it.
But I get your point. I think either if fine.
Bjorn
linux-stable-mirror@lists.linaro.org