Alejandro Lucero Palau wrote: [..]
> > Setting @memdev->endpoint to ERR_PTR(-EPROBE_DEFER), as I originally had, is an even more indirect way to convey a similar result and is starting to feel a bit "mid-layer-y".
> I was a bit confused by this answer until I reread the commit message of your original patch.
> The confusion came from my assumption that if the root device is not there, it is because the hardware root initialization requires more time. But I realize now you specifically said "the root driver has not attached yet", which turns it into a problem of kernel modules not being loaded yet.
> If so, I think I can solve this within the type2 driver code and Kconfig. Kconfig will force the driver to be compiled as a module...
There should be no requirement that accelerator drivers must be built as modules. An accelerator driver simply cannot enforce, via module load order, that CXL root infrastructure is up and ready before the accelerator 'probe' routine runs. This is because enumeration order still dominates and enumeration order is effectively random*.
The accelerator driver only has 2 options: return -EPROBE_DEFER until all resource dependencies are ready, or do what cxl_pci + cxl_mem do. What cxl_pci + cxl_mem do is: cxl_pci_probe() registers a memdev, and then at some point later cxl_mem notices that the root infrastructure has arrived via the cxl_bus_rescan() event.
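To make the two options concrete, here is a minimal userspace sketch. It is a toy model, not kernel code: struct sim_bus, struct sim_dev, struct sim_memdev and every sim_*() function are hypothetical names invented for illustration, and the real driver core and cxl_bus_rescan() do considerably more. The only value taken from the kernel is EPROBE_DEFER (517).

```c
#include <stdbool.h>
#include <stddef.h>

/* Same numeric value the kernel uses for "driver requests probe retry". */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for "has the CXL root driver attached yet?" */
struct sim_bus {
	bool root_ready;
};

/* Hypothetical accelerator device (option 1). */
struct sim_dev {
	const char *name;
	bool bound;
};

/* Hypothetical memdev registered by a cxl_pci-like probe (option 2). */
struct sim_memdev {
	bool registered;
	bool endpoint_attached;
};

/* Option 1: refuse to bind until the root infrastructure is present. */
static int sim_accel_probe(const struct sim_bus *bus, struct sim_dev *dev)
{
	if (!bus->root_ready)
		return -EPROBE_DEFER;
	dev->bound = true;
	return 0;
}

/*
 * Toy model of a deferred-probe retry pass: re-probe every unbound
 * device and return how many are still deferred afterwards.
 */
static int sim_retry_deferred(const struct sim_bus *bus,
			      struct sim_dev *devs, size_t n)
{
	int still_deferred = 0;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].bound)
			continue;
		if (sim_accel_probe(bus, &devs[i]) == -EPROBE_DEFER)
			still_deferred++;
	}
	return still_deferred;
}

/* Option 2, step 1: probe always succeeds and only registers the memdev. */
static int sim_pci_probe(struct sim_memdev *md)
{
	md->registered = true;
	return 0;
}

/*
 * Option 2, step 2: a rescan event attaches the endpoint once the root
 * has arrived. Calling it before the root is ready is a harmless no-op.
 */
static void sim_bus_rescan(const struct sim_bus *bus, struct sim_memdev *md)
{
	if (bus->root_ready && md->registered)
		md->endpoint_attached = true;
}
```

In both paths the order of arrival does not matter: either the probe is retried until the root is ready, or the probe completes early and a later rescan finishes the attach. Neither path depends on how the driver was built or when its module was loaded.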
Note that these patches are about fixing the assumptions of cxl_bus_rescan(), not about ensuring init order.
* ...at least nothing should break if CXL root and CXL endpoint enumeration happens out of order.