Hi,
On Mon, 2025-09-29 at 09:45 -0300, Jason Gunthorpe wrote:
On Mon, Sep 29, 2025 at 10:16:30AM +0200, Christian König wrote:
The point is that the exporter manages all accesses to its buffer, and there can be more than one importer accessing it at the same time.
So when an exporter sees that it already has an importer which can only do DMA to system memory, it will expose only DMA addresses to all other importers as well.
I would rephrase that: if the exporter supports multiple placement options for the memory (VRAM/CPU for example), then it needs to track which placement options all of its importers support and never place the memory somewhere an active importer cannot reach.
I don't want to say that just because one importer wants to use dma_addr_t, all private interconnect options are disabled. If the memory is in VRAM, then multiple importers using a private interconnect concurrently with dma_addr_t should be possible.
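To illustrate, a minimal sketch of that tracking (the names and the bitmask approach are made up here, not existing code): the exporter keeps the intersection of what every active importer can reach and only migrates within that set.

#define PLACEMENT_SYSTEM  BIT(0)
#define PLACEMENT_VRAM    BIT(1)

struct exporter_placements {
        u32 supported;  /* placements the exporter could use for this buffer */
        u32 reachable;  /* intersection of what all active importers can reach */
};

/* Called for every new attachment; imp_mask is what that importer can reach. */
static void exporter_restrict_placements(struct exporter_placements *p,
                                         u32 imp_mask)
{
        /* never place the buffer somewhere an active importer cannot reach */
        p->reachable &= p->supported & imp_mask;
}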
This seems like it is making the argument that the exporter does need to know the importers' capabilities so it can figure out which placement options are valid.
I didn't sketch further, but I think the exporter and importer should both provide a compatibility list and then, in almost all cases, the core code should do the matching.
More or less matches my idea. I would just start with the exporter providing a list of how its buffer is accessible, because it knows about other importers and can pre-reduce the list if necessary.
I think the importer also has to advertise what it is able to support. A big point of the private interconnect is that it won't use a scatterlist, so it needs to be a negotiated feature.
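Roughly what I have in mind (all names below are made up, none of this exists in dma-buf today): both sides hand the core an array of offers, and an importer that can only deal with scatterlists simply offers nothing beyond the generic dma_addr_t entry.

struct dma_buf_interconnect;    /* identifies one interconnect type */

struct dma_buf_ic_offer {
        const struct dma_buf_interconnect *ic;  /* private link or generic dma_addr_t */
        int quality;                            /* 'quality of path', higher is better */
};

/* What exporter and importer each advertise for an attachment. */
struct dma_buf_attach_caps {
        const struct dma_buf_ic_offer *offers;
        unsigned int num_offers;
};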
For example, we have some systems with multipath PCI. This could actually support those properly. The RDMA NIC has two struct devices it operates, each with a different path, so it would write out two &dmabuf_generic_dma_addr_t's - one for each.
That is actually something we try rather hard to avoid. E.g. the exporter should offer only one path to each importer.
Real systems have multipath. We need to do an NxM negotiation where both sides offer all their paths and the best-quality path is selected.
Once the attachment is made, it should be one interconnect and one stable address within that interconnect.
In this example I'd expect the Xe GPU driver to always offer its private interconnect and a dma_addr_t based interconnect as both exporter and importer. The core code should select one for the attachment.
We can of course do load balancing on a round-robin basis.
I'm not thinking about load balancing, more a 'quality of path' metric.
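The matching itself could stay trivial in the core; a sketch on top of the made-up structures above (again, nothing here is existing API), keeping the common interconnect with the best 'quality of path':

static const struct dma_buf_ic_offer *
dma_buf_pick_interconnect(const struct dma_buf_attach_caps *exp,
                          const struct dma_buf_attach_caps *imp)
{
        const struct dma_buf_ic_offer *best = NULL;
        unsigned int i, j;

        for (i = 0; i < exp->num_offers; i++) {
                for (j = 0; j < imp->num_offers; j++) {
                        if (exp->offers[i].ic != imp->offers[j].ic)
                                continue;
                        if (!best || exp->offers[i].quality > best->quality)
                                best = &exp->offers[i];
                }
        }
        /* NULL means falling back to the existing sg_table path */
        return best;
}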
This sounds like it's getting increasingly complex. TBH I think that at least all the fast interconnects we have in planning for xe are either fine with falling back to the current pcie-p2p / dma-buf path or, in the worst case, to system memory. The virtual interconnect we've been discussing would probably not be able to fall back at all, unless the negotiation somehow gets forwarded to the VM guest.
So I wonder whether for now it's simply sufficient to do something like

sg_table_replacement = dma_buf_map_interconnect();
if (IS_ERR(sg_table_replacement)) {
        sg_table = dma_buf_map_attachment();
        if (IS_ERR(sg_table))
                bail();
}
/Thomas
Jason