Re: [PATCH 0/5] cxl: Initialization and shutdown fixes

16 Oct 2024


      On 10/15/24 17:37, Dan Williams wrote:
...
Alejandro Lucero Palau wrote:
[..]
...
...
...
Then it makes sense that change. I'll do it if not already taken. I'll
send v4 without the PROBE_FORCE_SYNCHRONOUS flag and without the
previous loop with delays implemented in v3.
So I think EPROBE_DEFER can stay out of the CXL core because it is up to
the accelerator driver to decide whether CXL availability is fatal to
init or not.
It needs support from the cxl core though. If the cxl root is not there
yet, the driver needs to know, and that is what you did in your original
patch and I'm keeping as well.
So there are two ways, check if a registered @memdev has
@memdev->dev.driver set, assuming you know that the cxl_mem driver is
available, or call devm_cxl_enumerate_ports() yourself and skip the
cxl_mem driver indirection.
Setting @memdev->endpoint to ERR_PTR(-EPROBE_DEFER), as I originally
had, is an even more indirect way to convey a similar result and is
starting to feel a bit "mid-layer-y".
I was a bit confused with this answer until I read again the patch 
commit from your original work.
The confusion came from my assumption about if the root device is not 
there, it is due to the hardware root initialization requiring more 
time. But I realize now you specifically said "the root driver has not 
attached yet" what turns it into this problem of kernel modules not 
loaded yet.
If so, I think I can solve this within the type2 driver code and 
kconfig. Kconfig will force the driver being compiled as a module if the 
cxl core is a module, and with MODULE_SOFTDEP("pre: cxl_core cxl_port 
cxl_acpi cxl-mem) the type2 driver modprobe will trigger loading of 
dependencies beforehand. This makes unnecessary EPROBE_DEFER.
What do you think?
...
...
...
Additionally, I am less and less convinced that Type-2 drivers should be
forced to depend on the cxl_mem driver to attach vs just arranging for
those Type-2 drivers to call devm_cxl_enumerate_ports() and
devm_cxl_add_endpoint() directly. In other words I am starting to worry
that the generic cxl_mem driver design pattern is a midlayer mistake.
You know better than me but in my view, a Type2 should follow what a
Type3 does with some small changes for dealing with the differences,
mainly the way it is going to be used and those optional capabilities
for Type2. This makes more sense to me for Type1.
If an accelerator driver *wants* to depend on cxl_mem, it can, but all
the cxl_core functions that cxl_mem uses are also exported for
accelerator drivers to use directly to avoid needing to guess about when
/ if cxl_mem is going to attach.
...
...
So, if it makes it easier for sfc, I would be open to exploring a direct
scheme for endpoint attachment, and not requiring the generic cxl_mem
driver as an intermediary.
V4 is ready (just having problems when testing with 6.12-rcX) so I would
like to keep it and explore this once we have something working and
accepted. Type2 and Type1 with CXL cache will bring new challenges and I
bet we will need refactoring in the code and potentially new design for
generic code (Type3 and Type2, Type2 and Type1).
Yeah, no need to switch horses mid-race if you already have a cxl_mem
dependent approach nearly ready, but a potential simplification to
explore going forward.

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH 0/5] cxl: Initialization and shutdown fixes