Hi all,
We've been working through the details of getting ACPI to work on arm64, and there have been lots of questions about what this means for PCI. I've outlined this for several people individually, but I'm going to send this separately, apart from a specific patch series, to make sure we're all on the same page. Please correct my errors and misunderstandings.
Bjorn
The basic requirement is that the ACPI namespace should describe *everything* that consumes address space unless there's another standard way for the OS to find it [1, 2]. For example, windows that are forwarded to PCI by a PCI host bridge should be described via ACPI devices, since the OS can't locate the host bridge by itself. PCI devices *below* the host bridge do not need to be described via ACPI, because the resources they consume are inside the host bridge windows, and the OS can discover them via the standard PCI enumeration mechanism (using config accesses to read and size the BARs).
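As a rough illustration of the "read and size the BARs" step mentioned above, here is a minimal sketch of sizing a 32-bit memory BAR. `FakeBar` and `size_bar` are hypothetical names invented for this example; a real OS does this through config accesses, not a Python object.

```python
class FakeBar:
    """Hypothetical stand-in for one 32-bit memory BAR of a PCI device."""

    def __init__(self, size, flags=0x0):
        self.size = size          # assumed to be a power of two, >= 16
        self.flags = flags & 0xF  # low 4 bits: memory type / prefetchable
        self.value = 0

    def write(self, v):
        # Hardware only latches the address bits above the size alignment.
        self.value = v & ~(self.size - 1) & 0xFFFFFFFF

    def read(self):
        return self.value | self.flags


def size_bar(bar):
    """Size a BAR: save it, write all 1s, read back, restore."""
    saved = bar.read()
    bar.write(0xFFFFFFFF)
    probe = bar.read()
    bar.write(saved)
    mask = probe & 0xFFFFFFF0  # drop the flag bits of a memory BAR
    if mask == 0:
        return 0               # BAR not implemented
    return (~mask + 1) & 0xFFFFFFFF
```

For example, `size_bar(FakeBar(0x100000, flags=0x8))` reports a 1 MB BAR. The point is that the device itself tells the OS how much space it needs, so nothing in the ACPI namespace has to.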
This ACPI resource description is done via _CRS methods of devices in the ACPI namespace [2]. _CRS methods are like generalized PCI BARs: the OS can read _CRS and figure out what resource is being consumed even if it doesn't have a driver for the device [3]. That's important because it means an old OS can work correctly even on a system with new devices unknown to the OS. The new devices won't do anything, but the OS can at least make sure no resources conflict with them.
Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for reserving address space! The static tables are for things the OS needs to know early in boot, before it can parse the ACPI namespace. If a new table is defined, an old OS needs to operate correctly even though it ignores the table. _CRS allows that because it is generic and understood by the old OS; a static table does not.
If the OS is expected to manage an ACPI device, that device will have a specific _HID/_CID that tells the OS what driver to bind to it, and the _CRS tells the OS and the driver where the device's registers are.
PNP0C02 "motherboard" devices are basically a catch-all. There's no programming model for them other than "don't use these resources for anything else." So any address space that is (1) not claimed by some other ACPI device and (2) should not be assigned by the OS to something else, should be claimed by a PNP0C02 _CRS method.
PCI host bridges are PNP0A03 or PNP0A08 devices. Their _CRS should describe all the address space they consume. In principle, this would be all the windows they forward down to the PCI bus, as well as the bridge registers themselves. The bridge registers include things like secondary/subordinate bus registers that determine the bus range below the bridge, window registers that describe the apertures, etc. These are all device-specific, non-architected things, so the only way a PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which contain the device-specific details. These bridge registers also include ECAM space, since it is consumed by the bridge.
ACPI defined a Producer/Consumer bit that was intended to distinguish the bridge apertures from the bridge registers [4, 5]. However, BIOSes didn't use that bit correctly, and the result is that OSes have to assume that everything in a PCI host bridge _CRS is a window. That leaves no way to describe the bridge registers in the PNP0A03/PNP0A08 device itself.
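To make the consequence concrete, here is a small sketch of the two interpretations. The `(start, end, general_flags)` tuples are a hypothetical simplification of decoded address space descriptors, and the function names are made up for this example; this is not how any particular OS implements it.

```python
CONSUMER_BIT = 0x01  # General Flags bit [0], per ACPI 6.0 sec 6.4.3.5.4


def usage_per_spec(general_flags):
    """What the descriptor claims: bit set means the device only consumes."""
    return "consumer" if general_flags & CONSUMER_BIT else "producer"


def host_bridge_windows(crs, trust_bit=False):
    """Return the (start, end) ranges treated as host bridge windows.

    crs is a list of (start, end, general_flags) tuples (hypothetical
    representation of a host bridge _CRS).
    """
    if trust_bit:
        # What the spec intended: only producer ranges are windows.
        return [(s, e) for s, e, f in crs if usage_per_spec(f) == "producer"]
    # The workaround described above: the bit is unreliable, so every
    # range in a host bridge _CRS is assumed to be a window.
    return [(s, e) for s, e, _ in crs]
```

With `trust_bit=False`, a range marked ResourceConsumer still comes back as a window, which is why bridge registers cannot safely be listed in the PNP0A03/PNP0A08 _CRS.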
The workaround is to describe the bridge registers (including ECAM space) in PNP0C02 catch-all devices [6]. With the exception of ECAM, the bridge register space is device-specific anyway, so the generic PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it. For ECAM, pci_root.c learns about the space from either MCFG or the _CBA method.
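A firmware sanity check along these lines might look like the sketch below. It assumes ranges are inclusive `(start, end)` tuples and that the ECAM region must fall inside a single PNP0C02 reservation; both are simplifications for this example, and `check_ecam` is a hypothetical helper, not an existing kernel function.

```python
def contains(outer, inner):
    """True if inclusive range `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def overlaps(a, b):
    """True if inclusive ranges `a` and `b` share any address."""
    return a[0] <= b[1] and b[0] <= a[1]


def check_ecam(ecam, motherboard, windows):
    """Return a list of problems with how the ECAM region is described.

    ecam:        (start, end) from MCFG or _CBA
    motherboard: ranges claimed by PNP0C02 _CRS methods
    windows:     ranges in the PNP0A03/PNP0A08 host bridge _CRS
    """
    problems = []
    if not any(contains(r, ecam) for r in motherboard):
        problems.append("ECAM not reserved by a PNP0C02 _CRS")
    if any(overlaps(w, ecam) for w in windows):
        problems.append("ECAM overlaps a host bridge window")
    return problems
```

An empty list means the ECAM space is reserved as a motherboard resource and is not also claimed in the host bridge _CRS, which is the arrangement [6] asks for.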
Note that the PCIe spec actually does require ECAM unless there's a standard firmware interface for config access, e.g., the ia64 SAL interface [7]. One reason is that we want a generic host bridge driver (pci_root.c), and a generic driver requires a generic way to access config space.
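For reference, the "generic way to access config space" that ECAM provides is just address arithmetic. The sketch below assumes `base` is the ECAM base address corresponding to bus 0 of the segment and that `start_bus`/`end_bus` give the decoded bus range; the function and parameter names are invented for this example.

```python
def ecam_address(base, start_bus, end_bus, bus, dev, fn, reg):
    """Memory address of a config register under ECAM.

    Layout within the region: bus in bits [27:20], device in [19:15],
    function in [14:12], register offset in [11:0].
    """
    if not start_bus <= bus <= end_bus:
        raise ValueError("bus outside the range decoded by this region")
    if not (0 <= dev < 32 and 0 <= fn < 8 and 0 <= reg < 4096):
        raise ValueError("device, function, or register out of range")
    return base + ((bus << 20) | (dev << 15) | (fn << 12) | reg)
```

Because the layout is the same on every platform, a generic driver only needs the base address and bus range, which is what MCFG or _CBA supplies.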
[1] ACPI 6.0, sec 6.1: For any device that is on a non-enumerable type of bus (for example, an ISA bus), OSPM enumerates the devices' identifier(s) and the ACPI system firmware must supply an _HID object ... for each device to enable OSPM to do that.
[2] ACPI 6.0, sec 3.7: The OS enumerates motherboard devices simply by reading through the ACPI Namespace looking for devices with hardware IDs.
Each device enumerated by ACPI includes ACPI-defined objects in the ACPI Namespace that report the hardware resources the device could occupy [_PRS], an object that reports the resources that are currently used by the device [_CRS], and objects for configuring those resources [_SRS]. The information is used by the Plug and Play OS (OSPM) to configure the devices.
[3] ACPI 6.0, sec 6.2: OSPM uses device configuration objects to configure hardware resources for devices enumerated via ACPI. Device configuration objects provide information about current and possible resource requirements, the relationship between shared resources, and methods for configuring hardware resources.
When OSPM enumerates a device, it calls _PRS to determine the resource requirements of the device. It may also call _CRS to find the current resource settings for the device. Using this information, the Plug and Play system determines what resources the device should consume and sets those resources by calling the device’s _SRS control method.
In ACPI, devices can consume resources (for example, legacy keyboards), provide resources (for example, a proprietary PCI bridge), or do both. Unless otherwise specified, resources for a device are assumed to be taken from the nearest matching resource above the device in the device hierarchy.
[4] ACPI 6.0, sec 6.4.3.5.4: Extended Address Space Descriptor General Flags: Bit [0] Consumer/Producer:
    1–This device consumes this resource
    0–This device produces and consumes this resource
[5] ACPI 6.0, sec 19.6.43: ResourceUsage specifies whether the Memory range is consumed by this device (ResourceConsumer) or passed on to child devices (ResourceProducer). If nothing is specified, then ResourceConsumer is assumed.
[6] PCI Firmware 3.0, sec 4.1.2: If the operating system does not natively comprehend reserving the MMCFG region, the MMCFG region must be reserved by firmware. The address range reported in the MCFG table or by _CBA method (see Section 4.1.3) must be reserved by declaring a motherboard resource. For most systems, the motherboard resource would appear at the root of the ACPI namespace (under _SB) in a node with a _HID of EISAID (PNP0C02), and the resources in this case should not be claimed in the root PCI bus’s _CRS. The resources can optionally be returned in Int15 E820 or EFIGetMemoryMap as reserved memory but must always be reported through ACPI as a motherboard resource.
[7] PCI Express 3.0, sec 7.2.2: For systems that are PC-compatible, or that do not implement a processor-architecture-specific firmware interface standard that allows access to the Configuration Space, the ECAM is required as defined in this section.
On 11/09/2016 03:05 PM, Bjorn Helgaas wrote:
[snip....]
A big +1 to all of this. This also looks like something that should be added to either PCI, ACPI or arm64 documentation (or even all three). What do you think?
Thank you for putting this together, Bjorn.
On 11/10/2016 06:18 PM, Al Stone wrote:
On 11/09/2016 03:05 PM, Bjorn Helgaas wrote:
[snip....]
A big +1 to all of this. This also looks like something that should be added to either PCI, ACPI or arm64 documentation (or even all three). What do you think?
Thank you for putting this together, Bjorn.
+1. One-stop shopping! :) Nice summary, and clarification(s).
On Thu, Nov 10, 2016 at 04:18:54PM -0700, Al Stone wrote:
On 11/09/2016 03:05 PM, Bjorn Helgaas wrote:
[snip....]
A big +1 to all of this. This also looks like something that should be added to either PCI, ACPI or arm64 documentation (or even all three).
And to arm64 platforms FW :)
What do you think?
I do not think there is anything ARM64 specific in Bjorn's description, but I do think it is very useful to have it in documentation. These bits of information are scattered around the ACPI specs and PCI FW specs; having a single source would help and would have prevented asking Bjorn the same questions 100 times.
Thank you for putting this together, Bjorn.
+1, Thank you very much for this nice summary Bjorn.
Lorenzo
On 11/11/2016 05:32 PM, Lorenzo Pieralisi wrote:
On Thu, Nov 10, 2016 at 04:18:54PM -0700, Al Stone wrote:
On 11/09/2016 03:05 PM, Bjorn Helgaas wrote:
[snip....]
A big +1 to all of this. This also looks like something that should be added to either PCI, ACPI or arm64 documentation (or even all three).
And to arm64 platforms FW :)
What do you think?
I do not think there is anything ARM64 specific in Bjorn's description, but I do think it is very useful to have it in documentation. These bits of information are scattered around the ACPI specs and PCI FW specs; having a single source would help and would have prevented asking Bjorn the same questions 100 times.
Thank you for putting this together, Bjorn.
+1, Thank you very much for this nice summary Bjorn.
+1, thanks a lot :)
Hanjun
On 11/10/2016 6:18 PM, Al Stone wrote:
On 11/09/2016 03:05 PM, Bjorn Helgaas wrote:
[snip....]
A big +1 to all of this. This also looks like something that should be added to either PCI, ACPI or arm64 documentation (or even all three). What do you think?
I agree. In order to have compliant systems, we have to make PNP0C02 required in the PCIe appendix of the SBSA specification.
Thank you for putting this together, Bjorn.
On 11/11/2016 09:24 AM, Sinan Kaya wrote:
On 11/10/2016 6:18 PM, Al Stone wrote:
On 11/09/2016 03:05 PM, Bjorn Helgaas wrote:
[snip....]
A big +1 to all of this. This also looks like something that should be added to either PCI, ACPI or arm64 documentation (or even all three). What do you think?
I agree. In order to have compliant systems, we have to make PNP0C02 required in the PCIe appendix of the SBSA specification.
We're ramping up for a "Constitutional Convention" on the SBBR between a few of the vendors and we'll make sure this is covered.
Jon.
On Wed, Nov 09, 2016 at 04:05:06PM -0600, Bjorn Helgaas wrote:
[...]
ACPI defined a Producer/Consumer bit that was intended to distinguish the bridge apertures from the bridge registers [4, 5]. However, BIOSes didn't use that bit correctly, and the result is that OSes have to assume that everything in a PCI host bridge _CRS is a window. That leaves no way to describe the bridge registers in the PNP0A03/PNP0A08 device itself.
The ACPI 6.1 revision history, under 4.0a Apr. 2010 (xiii), states "Consumer/Producer bit is ignored (Restored 2.0C change that had been lost)", and yet that bit is still marked as valid. If it is not reliable, it should be marked as "ignored" in the specs (as it was in ACPI 2.0C, BTW); as it stands, it is just a source of confusion.
Thanks again! Lorenzo
On 11/09/2016 05:05 PM, Bjorn Helgaas wrote:
We've been working through the details of getting ACPI to work on arm64, and there have been lots of questions about what this means for PCI. I've outlined this for several people individually, but I'm going to send this separately, apart from a specific patch series, to make sure we're all on the same page. Please correct my errors and misunderstandings.
Thanks so much again for writing this.
When we originally created the SBBR (Server Base Boot Requirements) we were very vague about things like PCI. On x86, it "just worked", and so we said generic things about MCFG tables and implementing PCI correctly, but we didn't think of all of the many ways it might be done badly. In that respect, Intel and AMD have spoiled us over the years (thanks!) :)
Since then, we've had a lot of opportunity to learn about buggy IP that's out there, and we've done a lot to have it fixed and to get all of the vendors to take care of these problems before their next-generation silicon lands. Indeed, we're doing a lot more on the pre-silicon front as well these days, but that's for another time.
And of course, once you have all the Linux distros and other OSes out there, it's easier for the next wave to come along anyway. It will either boot for them, or it won't. And if it doesn't boot, the vendors will have two choices: upstream a fix and get every distro to pick it up at some future point (and wait until then because you won't even be able to boot the installation media to install an update) or don't make the mistake in the first place and fix it pre-silicon.
What I would like to get out of this experience is not only the summary you've written, which we will point people to as a living document, but also a more useful update to the ARM SBBR that spells out the many actual requirements and expectations. I want to go a lot further and start prescribing lots of other things in the next major update to the SBBR (everything from "you will map your RAM at zero" to "you will implement your SMMU topology in this way...") that were too hand-wavy before, but I definitely want a giant section on how to do PCI right. So we'll come ping you for input on that :)
The basic requirement is that the ACPI namespace should describe *everything* that consumes address space unless there's another standard way for the OS to find it [1, 2].
...and by the way, this was a key lesson for me, too. I had not fully internalized before that you don't just want to describe the ECAM region in the MCFG but you also need to ensure it's properly described in the ACPI namespace. Lots of good things learned.
Jon.
On Thu, Dec 01, 2016 at 11:52:00PM -0500, Jon Masters wrote:
On 11/09/2016 05:05 PM, Bjorn Helgaas wrote:
The basic requirement is that the ACPI namespace should describe *everything* that consumes address space unless there's another standard way for the OS to find it [1, 2].
...and by the way, this was a key lesson for me, too. I had not fully internalized before that you don't just want to describe the ECAM region in the MCFG but you also need to ensure it's properly described in the ACPI namespace. Lots of good things learned.
I wish the ACPI spec contained explicit language to this effect, but if it does, I haven't found it. There might be firmware people who would disagree with it.
My rationale is that the OS may receive a device with no address space assigned, and for the OS to safely assign space, it has to know everything to avoid. The devil's advocate might argue that the OS doesn't need full knowledge as long as firmware constrains every device's _PRS to avoid the possibility of conflict. But that seems like it would be impractical for non-trivial systems.
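A toy version of that assignment problem, to show why the OS needs the full list of things to avoid. This is only a sketch under simple assumptions: ranges are inclusive `(start, end)` tuples, `align` is nonzero, and `find_free` is a hypothetical first-fit helper, not the kernel's resource allocator.

```python
def align_up(addr, align):
    """Round addr up to the next multiple of align."""
    return (addr + align - 1) // align * align


def find_free(window, size, align, reserved):
    """First-fit search for `size` bytes inside `window`.

    reserved is every range the OS knows it must avoid (other devices'
    _CRS, PNP0C02 reservations, and so on). Returns the start address,
    or None if nothing fits.
    """
    start, end = window
    addr = align_up(start, align)
    while addr + size - 1 <= end:
        last = addr + size - 1
        clash = next((r for r in reserved if r[0] <= last and addr <= r[1]), None)
        if clash is None:
            return addr
        # Skip past the conflicting range and try again.
        addr = align_up(clash[1] + 1, align)
    return None
```

If a range is missing from `reserved` because firmware never described it, the search will happily hand it out, and the conflict only shows up at runtime.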
Bjorn