Re: [Linaro-acpi] [RFC] ACPI on arm64 TODO List

12 Jan 2015

      On Mon, Jan 12, 2015 at 10:21 AM, Arnd Bergmann arnd@arndb.de wrote:
...
On Saturday 10 January 2015 14:44:02 Grant Likely wrote:
...
On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely grant.likely@linaro.org wrote:
...
I've posted an article on my blog, but I'm reposting it here because
the mailing list is more conducive to discussion...
http://www.secretlab.ca/archives/151
Why ACPI on ARM?
Why are we doing ACPI on ARM? That question has been asked many times,
but we haven't yet had a good summary of the most important reasons
for wanting ACPI on ARM. This article is an attempt to state the
rationale clearly.
Thanks for writing this up, much appreciated. I'd like to comment
on some of the points here, which seems easier than commenting on the
blog post.
Thanks for reading through it. Replies below...
...
...
Device Configurations

Support device configurations
Support dynamic device configurations (hot add/removal)

...
...
DT platforms have also supported dynamic configuration and hotplug for
years. There isn't a lot here that differentiates between ACPI and DT.
The biggest difference is that dynamic changes to the ACPI namespace
can be triggered by ACPI methods, whereas for DT changes are received
as messages from firmware and have been very much platform specific
(e.g. IBM pSeries does this).
This seems like a great fit for AML indeed, but I wonder what exactly
we want to hotplug here, since everything I can think of wouldn't need
AML support for the specific use case of SBSA compliant servers:
[...]
I've trimmed the specific examples here because I think that misses
the point. The point is that regardless of interface (either ACPI or
DT) there are always going to be cases where the data needs to change
at runtime. Not all platforms will need to change the CPU data, but
some will (say for a machine that detects a failed CPU and removes
it). Some PCI add-in boards will carry along with them additional data
that needs to be inserted into the ACPI namespace or DT. Some
platforms will have system-level components (i.e. non-PCI) that may not
always be accessible.
ACPI has an interface baked in already for tying data changes to
events. DT currently needs platform specific support (which we can
improve on). I'm not even trying to argue for ACPI over DT in this
section, but I included it in this document because it is one of the
reasons often given for choosing ACPI and I felt it required a more
nuanced discussion.
...
...
Power Management Model

Support hardware abstraction through control methods
Support power management
Support thermal management

Power, thermal, and clock management can all be dealt with as a group.
ACPI defines a power management model (OSPM) that both the platform
and the OS conform to. The OS implements the OSPM state machine, but
the platform can provide state change behaviour in the form of
bytecode methods. Methods can access hardware directly or hand off PM
operations to a coprocessor. The OS really doesn't have to care about
the details as long as the platform obeys the rules of the OSPM model.
With DT, the kernel has device drivers for each and every component in
the platform, and configures them using DT data. DT itself doesn't
have a PM model. Rather the PM model is an implementation detail of
the kernel. Device drivers use DT data to decide how to handle PM
state changes. We have clock, pinctrl, and regulator frameworks in the
kernel for working out runtime PM. However, this only works when all
the drivers and support code have been merged into the kernel. When
the kernel's PM model doesn't work for new hardware, then we change
the model. This works very well for mobile/embedded because the vendor
controls the kernel. We can change things when we need to, but we also
struggle with getting board support mainlined.
I can definitely see this point, but I can also see two important
downsides to the ACPI model that need to be considered for an
individual implementor:

As a high-level abstraction, there are limits to how fine-grained
the power management can be, or how finely it is implemented in a
particular BIOS. The thinner the abstraction, the better the power savings can
get when implemented right.

Agreed. That is the tradeoff. OSPM defines a power model, and the
machine must restrict any PM behaviour to fit within that power model.
This is important for interoperability, but it also leaves performance
on the table. ACPI at least gives us the option to pick that
performance back up by adding better power management to the drivers,
without sacrificing the interoperability provided by OSPM.
In other words, OSPM gets us going, but we can add specific
optimizations when required.
Also important: Vendors can choose to not implement any PM into their
ACPI tables at all. In this case the machine would be left running
at full tilt. It will be compatible with everything, but it won't be
optimized. Then they have the option of loading a PM driver at runtime
to optimize the system with the caveat that the PM driver must not be
required for the machine to be operational. In this case, as far as
the OS is concerned, it is still applying the OSPM state machine, but
the OSPM behaviour never changes the state of the hardware.
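The split described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Platform`, `Ospm`, and `set_state` are hypothetical names, not the Linux ACPI API, and the two-state machine is a simplification of the real device power states. It shows the OS-side state machine advancing whether or not the platform supplies a method for the transition.

```python
from typing import Callable, Dict, Optional


class Platform:
    """Firmware side (hypothetical): may supply one method per target state."""

    def __init__(self, methods: Optional[Dict[str, Callable[[], None]]] = None):
        self.methods = methods or {}


class Ospm:
    """OS side (hypothetical): owns the state machine and runs a platform
    method for a transition only if the platform provides one."""

    STATES = ("D0", "D3")

    def __init__(self, platform: Platform):
        self.platform = platform
        self.state = "D0"

    def set_state(self, target: str) -> bool:
        """Move to `target`; return True only if a platform method ran."""
        if target not in self.STATES:
            raise ValueError(f"unknown state: {target}")
        method = self.platform.methods.get(target)
        if method is not None:
            method()  # platform-specific work happens here
        self.state = target  # the OS-side model advances either way
        return method is not None


# A platform with no PM methods: the state machine still runs,
# but the (simulated) hardware is never touched.
hw = {"clock_on": True}
bare = Ospm(Platform())
ran = bare.set_state("D3")
print(bare.state, ran, hw["clock_on"])
```

A platform that does implement PM would pass a dictionary of callables to `Platform`, and the same `set_state` call would then change the simulated hardware as well.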
...

From the experience with x86, Linux tends to prefer using drivers
for hardware registers over the AML based drivers when both are
implemented, because of efficiency and correctness.

We should probably discuss at some point how to get the best of
both. I really don't like the idea of putting the low-level
details that we tend to have in DT into ACPI, but there are two
things we can do: For systems that have a high-level abstraction
for their PM in hardware (e.g. talking to an embedded controller
that does the actual work), the ACPI description should contain
enough information to implement a kernel-level driver for it as
we have on Intel machines. For more traditional SoCs that do everything
themselves, I would recommend always having a working DT for
those people wanting to get the most out of their hardware. This will
also enable any other SoC features that cannot be represented in
ACPI.
The nice thing about ACPI is that we always have the option of
ignoring it when the driver knows better since it is always executed
under the control of the kernel interpreter. There is no ACPI going
off and doing something behind the kernel's back. To start with we
have the OSPM state model and devices can use additional ACPI methods
as needed, but as an optimization, the driver can do those operations
directly if the driver author has enough knowledge about the device.
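As a rough illustration of that last point, here is a sketch with hypothetical names (`Interpreter`, `Driver`, `native_off`); none of this is kernel code. The only real identifier is `_PS3`, the ACPI control method that puts a device into D3. The point it tries to show is that a method runs only when the OS asks the interpreter to evaluate it, so a driver with better knowledge of the device can simply not ask.

```python
from typing import Callable, Dict, List, Optional


class Interpreter:
    """Stand-in for the kernel's AML interpreter (hypothetical).
    Platform methods run only when the OS explicitly evaluates them."""

    def __init__(self, methods: Dict[str, Callable[[], None]]):
        self.methods = methods
        self.calls: List[str] = []  # record of every method evaluated

    def evaluate(self, name: str) -> None:
        self.calls.append(name)
        self.methods[name]()


class Driver:
    """Uses the platform method by default; a driver author who knows the
    device can supply `native_off` to do the operation directly instead."""

    def __init__(self, interp: Interpreter,
                 native_off: Optional[Callable[[], None]] = None):
        self.interp = interp
        self.native_off = native_off

    def power_off(self) -> str:
        if self.native_off is not None:
            self.native_off()  # direct register access, ACPI method skipped
            return "native"
        self.interp.evaluate("_PS3")  # generic path through the interpreter
        return "acpi"


# Simulated device register, switched off by either path.
regs = {"power": 1}
interp = Interpreter({"_PS3": lambda: regs.update(power=0)})
print(Driver(interp).power_off(), regs["power"], interp.calls)
```

The design choice being illustrated is that the fallback is always available: a driver without device-specific knowledge still works through `_PS3`, and the optimized path is purely additive.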
...
...
Reliability, Availability & Serviceability (RAS)

Support RAS interfaces

This isn't a question of whether or not DT can support RAS. Of course
it can. Rather it is a matter of RAS bindings already existing for
ACPI, including a usage model. We've barely begun to explore this on
DT. This item doesn't make ACPI technically superior to DT, but it
certainly makes it more mature.
Unfortunately, RAS can mean a lot of things to different people.
Is there some high-level description of what the ACPI idea of RAS
is? On systems I've worked on in the past, this was generally done
out of band (e.g. in an IPMI BMC) because you can't really trust
the running OS when you report errors that may impact data consistency
of that OS.
RAS is also something where every company already has something that
they are using on their x86 machines. Those interfaces are being
ported over to the ARM platforms and will be equivalent to what they
already do for x86. So, for example, an ARM server from DELL will use
mostly the same RAS interfaces as an x86 server from DELL.
...
...
Multiplatform support

Support multiple OSes, including Linux and Windows

I'm tackling this item last because I think it is the most contentious
for those of us in the Linux world. I wanted to get the other issues
out of the way before addressing it.
I know that this line of thought is more about market forces rather
than a hard technical argument between ACPI and DT, but it is an
equally significant one. Agreeing on a single way of doing things is
important. The ARM server ecosystem is better for the agreement to use
the same interface for all operating systems. This is what is meant by
standards compliant. The standard is a codification of the mutually
agreed interface. It provides confidence that all vendors are using
the same rules for interoperability.
I do think that this is in fact the most important argument in favor
of doing ACPI on Linux, because a number of companies are betting on
Windows (or some in-house OS that uses ACPI) support. At the same time,
I don't think talking of a single 'ARM server ecosystem' that needs to
agree on one interface is helpful here. Each server company has their
own business plan and their own constraints. I absolutely think that
getting as many companies as possible to agree on SBSA and UEFI is
helpful here because it reduces the differences between the platforms
as seen by a distro. For companies that want to support Windows, it's
obvious they want to have ACPI on their machines, for others the
factors you mention above can be enough to justify the move to ACPI
even without Windows support. Then there are other companies for
which the tradeoffs are different, and I see no reason for forcing
it on them. Finally there are and will likely always be chips that
are not built around SBSA and someone will use the chips in creative
ways to build servers from them, so we already don't have a homogeneous
ecosystem.
Allow me to clarify my position here. This entire document is about
why ACPI was chosen for the ARM SBBR specification. The SBBR and the
SBSA are important because they document the agreements and
compromises made by vendors and industry representatives to get
interoperability. It is a tool for vendors to say that they are aiming
for compatibility with a particular hardware/software ecosystem.
*Nobody* is forced to implement these specifications. Any company is
free to ignore them and go their own way. The tradeoff in doing so is
it means they are on their own for support. Non-compliant hardware
vendors have to convince OS vendors to support them, and similarly,
non-compliant OS vendors need to convince hardware vendors of the
same. Red Hat has stated very clearly that they won't support any
hardware that isn't SBSA/SBBR compliant. So has Microsoft. Canonical
on the other hand has said they will support whatever if there is a
business case. This certainly is a business decision and each company
needs to make its own choices.
As far as we (Linux maintainers) are concerned, we've also been really
clear that DT is not a second class citizen to ACPI. Mainline cannot
and should not force certain classes of machines to use ACPI and other
classes of machines to use DT. As long as the code is well written and
conforms to our rules for what ACPI or DT code is allowed to do, then
we should be happy to take the patches.
g.
