On Mon, Sep 29, 2025 at 7:55 AM Ard Biesheuvel ardb@kernel.org wrote:
On Wed, 24 Sept 2025 at 18:27, Simon Glass sjg@chromium.org wrote:
Hi Ard,
On Wed, 24 Sept 2025 at 10:15, Ard Biesheuvel ardb@kernel.org wrote:
On Tue, 23 Sept 2025 at 21:32, Simon Glass sjg@chromium.org wrote:
Hi Ard,
On Fri, 19 Sept 2025 at 09:50, Ard Biesheuvel ardb@kernel.org wrote:
The main difference is the level of abstraction: AML carries code logic along with the device description, which can enable/disable the device and put it into different power states. This is backed by so-called OperationRegions, which expose [abstracted] SPI, I2C and serial buses (as well as MMIO memory) to the AML interpreter, so that the code sequences effectuating things like power state changes can be reduced to pokes of device registers, regardless of how those registers are accessed on the particular system.
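To make that concrete, here is a minimal ASL sketch of the pattern described above: an OperationRegion over an MMIO power-control register, so the OS-visible power transitions reduce to register pokes. The device, address and field names are invented for illustration, not taken from any real platform.

```asl
// Hypothetical sketch only: an MMIO-backed OperationRegion whose
// single bit implements the device's D0/D3 power transitions.
Device (ETH0)
{
    OperationRegion (PWRC, SystemMemory, 0xFE001000, 0x4)
    Field (PWRC, DWordAcc, NoLock, Preserve)
    {
        PWEN, 1     // power-enable bit (assumed layout)
    }
    Method (_PS0)   // enter D0 (full power)
    {
        PWEN = 1
    }
    Method (_PS3)   // enter D3 (off)
    {
        PWEN = 0
    }
}
```

The OS never needs to know how the register is reached; the interpreter resolves the OperationRegion access, which is exactly the abstraction AML provides over DT's explicit topology.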
On x86, many onboard devices are simply described as PCIe devices, even though they are not actually connected to any PCIe fabric. This solves the self-description problem, vastly reducing the number of devices that need to be described via AML.
Also, there is a lot more homogeneity in how the system topology is constructed: on embedded systems, it is quite common to, e.g., tie the PHY interrupt line from the PCIe NIC to some GPIO controller that is not naturally associated with that device at all; this is something ACPI struggles with, and where DT shines.
DT simply operates at a different abstraction level - it describes every detail of the system topology, including every clock generator and power source. This makes it very flexible and very powerful, but also a maintenance burden: e.g., if some OEM issues a v2 of some board where one clock generator IC has been replaced because the original is EOL, it requires a new DT and potentially an OS update if the new part was not supported yet. ACPI is more flexible here, as it can simply ship different ACPI tables that make the v2 board look 100% identical to the v1 as far as the OS is concerned.
There is also the PEP addition you mention below, which I tend to see as an admission that ACPI cannot handle the complexity of modern systems.
No. The problem is not the complexity itself, but the fact that it is exposed to software.
x86 systems are just as complex, but they a) make more effort to abstract away the OS visible differences in firmware, and b) design the system with ACPI in mind, e.g., masquerade on-board peripherals as PCIe (so-called root complex integrated endpoints) so they can describe themselves, and use PCI standard abstractions for configuration and power management.
Right. But are you saying that Windows shouldn't have PEP drivers? Or Linux shouldn't need them?
ACPI + PEP does not provide the advantage of the higher abstraction level that 'pure' ACPI provides. Windows only supports ACPI, so PEP was bolted onto the side to be able to support these systems. Linux should not implement ACPI + PEP, because it serves the same purpose as DT (i.e., a more vertically integrated system), so we already solved that problem.
...
The problem, of course, is that the idea that we would maintain the DTs for these systems in the kernel tree is laughable. So either these systems need to ship as vertically integrated systems (Android, CrOS), or we need to muster the self discipline to create a DT description and *stick with it* rather than drop it like a brick as soon as the Linux minor version changes, so that we can support users installing their own Linux distros.
Yes.
I'm assuming no one has a magic solution for this?
Well, if we cared about breaking DT compatibility as much as Linus makes us care about breaking user space, the problem wouldn't exist.
I try, but I'm not Linus nor can I police everything. I think we need tools to detect this first, then we can decide if and when compatibility breaks are okay. Sometimes they are unavoidable or just don't matter (e.g., new h/w which has no users). How to distinguish stable vs. unstable platforms has been discussed multiple times in the past with no conclusion.
One option could be for OEMs to provide a devicetree package for each kernel version, perhaps in a /boot/oem directory with the firmware / bootloader selecting the closest one available. In other words, we try to solve the problem of 'OEMs owning the platform vs. distros owning the OS' by separating the concerns.
No. The problem is on the kernel side, and that is where we should fix it.
Perhaps add a meta-property to DT bindings that indicate whether they will be kept compatible going forward, and tell OEMs to only use ones that do?
The challenge is that there are multiple aspects to being compatible: it applies both at the binding level and to the DTB as a whole.
At a binding level, there's changing required properties or the entries for properties (e.g. a new required clock). Now that we have schemas, we can actually check for these changes. I have a PoC tool that can detect these changes. It seems to work okay unless the schema is restructured in some way in addition. I haven't figured out how exactly to integrate it into our processes.
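The binding-level check described above can be sketched as follows. This is not Rob's actual PoC tool; it is a hypothetical illustration of the kind of check involved, diffing two versions of a binding schema for properties that became required (which would break existing DTBs). The schema dicts stand in for parsed dt-schema YAML, and all names are invented.

```python
# Hypothetical sketch: flag properties newly required by a binding
# schema revision, i.e. a backward-compatibility break for old DTBs.

def new_required_properties(old_schema, new_schema):
    """Return properties required in new_schema but not in old_schema."""
    old_req = set(old_schema.get("required", []))
    new_req = set(new_schema.get("required", []))
    return sorted(new_req - old_req)

old = {"required": ["compatible", "reg"]}
new = {"required": ["compatible", "reg", "clocks"]}

print(new_required_properties(old, new))  # → ['clocks']
```

A real tool would also have to follow `allOf`/`if`-conditional requirements and cope with restructured schemas, which is exactly where such a check gets hard.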
At a DTB level, we need to check for changed compatibles. For example, a platform changing from fixed clocks to a clock controller breaks forward compatibility, as an existing OS will not have the clock driver. Adding pinctrl or power-domains later on creates similar problems.
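The fixed-clock-to-clock-controller case above can be sketched in DT source. This is a hypothetical example with invented compatibles and addresses, just to show where the break happens:

```dts
/* v1: the UART's clock is a fixed rate; any OS can consume it. */
uart_clk: clock-24m {
	compatible = "fixed-clock";
	#clock-cells = <0>;
	clock-frequency = <24000000>;
};
serial@10000000 {
	compatible = "vendor,uart";
	clocks = <&uart_clk>;
};

/* v2: the same UART now references a clock controller. An existing
 * OS without a "vendor,ccu" driver can no longer probe the UART,
 * which is the forward-compatibility break. */
cc: clock-controller@10100000 {
	compatible = "vendor,ccu";
	#clock-cells = <1>;
};
serial@10000000 {
	compatible = "vendor,uart";
	clocks = <&cc 7>;
};
```

A DTB-level checker would diff the set of compatibles a platform's DTB depends on between releases, rather than any single binding.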
These aren't really hard tools to write, but no one seems to care enough to do something other than complain.
Rob