On Wed, Jul 30, 2025 at 01:58:46PM -0600, Alex Williamson wrote:
On Wed, 23 Jul 2025 16:00:01 +0300 Leon Romanovsky <leon@kernel.org> wrote:
From: Leon Romanovsky <leonro@nvidia.com>
Based on blk and DMA patches which will be sent during the coming merge window.
This series extends the VFIO PCI subsystem to support exporting MMIO regions from PCI device BARs as dma-buf objects, enabling safe sharing of non-struct page memory with controlled lifetime management. This allows RDMA and other subsystems to import dma-buf FDs and build them into memory regions for PCI P2P operations.
The series supports a use case for SPDK where an NVMe device will be owned by SPDK through VFIO but interacting with an RDMA device. The RDMA device may directly access the NVMe CMB or directly manipulate the NVMe device's doorbell using PCI P2P.
However, as a general mechanism, it can support many other scenarios with VFIO. This dmabuf approach can be used by iommufd as well for generic and safe P2P mappings.
I think this will eventually enable DMA mapping of device MMIO through an IOMMUFD IOAS for the VM P2P use cases, right?
This is the plan
How do we get from what appears to be a point-to-point mapping between two devices to a shared IOVA between multiple devices?
You have it right below: it is a point-to-point mapping between the vfio device and the iommufd.
I'm guessing we need IOMMUFD to support something like IOMMU_IOAS_MAP_FILE for dma-buf,
 1) The dma phys series which needs more work
 2) This series to get basic 'movable' DMABUF support in VFIO
 3) Add 'revokable' as a DMABUF concept and implement it with mlx5 and vfio
 4) Add some way to get the phys_addr list from the DMABUF
 5) IOMMU_IOAS_MAP_FILE using a revokable attachment and the phys_addr
    list. When VFIO does FLR the iommufd can remove the IOPTEs and then
    put them back when FLR is done.
It is not so much more code, but I think every step will take a lot of work to reach agreement.
Then we reuse all of the above with some tweaks for the CC problems too.
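
To make the FLR flow in steps 3 to 5 concrete, here is a rough user-space toy model in C. It is only a sketch and not code from this series: every toy_* name is hypothetical, nothing here is a kernel, dma-buf, or iommufd interface, and the "IOPTEs" are reduced to a single flag. It assumes the importer maps the exporter's phys_addr ranges contiguously starting at one IOVA, drops the mapping when the exporter revokes, and reinstalls the same mapping afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names are kernel or iommufd APIs. */

#define TOY_MAX_RANGES 4

/* Step 4: one entry of the phys_addr list the exporter would hand out. */
struct toy_phys_range {
	uint64_t paddr;
	uint64_t len;
};

/* Stand-in for a revocable attachment held by the importer. */
struct toy_attachment {
	struct toy_phys_range ranges[TOY_MAX_RANGES];
	size_t nr_ranges;
	bool revoked;		/* set by the exporter, e.g. around FLR */
};

/* Stand-in for the IOPTEs of one IOAS mapping backed by the attachment. */
struct toy_ioas_map {
	const struct toy_attachment *att;
	uint64_t iova;
	bool populated;		/* true while the IOPTEs are installed */
};

/* Step 5: map the attachment's ranges contiguously starting at iova. */
static int toy_ioas_map_file(struct toy_ioas_map *map,
			     const struct toy_attachment *att, uint64_t iova)
{
	if (att->revoked || att->nr_ranges == 0)
		return -1;
	map->att = att;
	map->iova = iova;
	map->populated = true;
	return 0;
}

/* FLR starts: the exporter revokes and the importer removes the IOPTEs. */
static void toy_revoke(struct toy_attachment *att, struct toy_ioas_map *map)
{
	assert(map->att == att);
	att->revoked = true;
	map->populated = false;
}

/* FLR is done: the importer puts the same IOPTEs back. */
static void toy_restore(struct toy_attachment *att, struct toy_ioas_map *map)
{
	assert(map->att == att);
	att->revoked = false;
	map->populated = true;
}

/* Translate an IOVA; fails while revoked or when outside the mapping. */
static bool toy_translate(const struct toy_ioas_map *map, uint64_t iova,
			  uint64_t *paddr)
{
	uint64_t off;
	size_t i;

	if (!map->populated || iova < map->iova)
		return false;
	off = iova - map->iova;
	for (i = 0; i < map->att->nr_ranges; i++) {
		const struct toy_phys_range *r = &map->att->ranges[i];

		if (off < r->len) {
			*paddr = r->paddr + off;
			return true;
		}
		off -= r->len;
	}
	return false;
}
```

In this model a caller would fill in a toy_attachment, call toy_ioas_map_file() once, and then see toy_translate() fail between toy_revoke() and toy_restore() while the IOVA layout stays the same across the reset. The real work, of course, is in the locking and in agreeing on the interfaces, which this model ignores entirely.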
Jason