Gregory's modest proposal to fix cxl_mem_probe() failures due to delayed arrival of the CXL "root" infrastructure [1] prompted questions about how the existing mechanism for retrying cxl_mem_probe() could be failing.
The critical missing piece in debugging this was that Gregory's setup had almost all CXL modules built into the kernel.
On the way to that discovery, several other bugs and init-order corner cases were found.
The main fix is to make sure the drivers/cxl/Makefile object order supports root CXL ports being fully initialized upon cxl_acpi_probe() exit. The modular case has some similar potential holes that are fixed with MODULE_SOFTDEP() and other fixups. Finally, an attempt to update cxl_test to reproduce the original report resulted in the discovery of a separate, long-standing use-after-free bug in cxl_region_detach().
[1]: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net
---
Dan Williams (5):
      cxl/port: Fix CXL port initialization order when the subsystem is built-in
      cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()
      cxl/acpi: Ensure ports ready at cxl_acpi_probe() return
      cxl/port: Fix use-after-free, permit out-of-order decoder shutdown
      cxl/test: Improve init-order fidelity relative to real-world systems

 drivers/base/core.c          |   35 +++++++
 drivers/cxl/Kconfig          |    1 
 drivers/cxl/Makefile         |   12 +--
 drivers/cxl/acpi.c           |    7 +
 drivers/cxl/core/hdm.c       |   50 +++++++++--
 drivers/cxl/core/port.c      |   13 ++-
 drivers/cxl/core/region.c    |   48 +++-------
 drivers/cxl/cxl.h            |    3 -
 include/linux/device.h       |    3 +
 tools/testing/cxl/test/cxl.c |  200 +++++++++++++++++++++++-------------------
 tools/testing/cxl/test/mem.c |    1 
 11 files changed, 228 insertions(+), 145 deletions(-)

base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b