On Fri, Sep 09, 2022 at 06:24:35AM -0700, Christoph Hellwig wrote:
On Wed, Sep 07, 2022 at 01:12:52PM -0300, Jason Gunthorpe wrote:
The PCI offset is some embedded thing - I've never seen it in a server platform.
That's not actually true, e.g. some Power systems definitely had it, although I don't know if the current ones do.
I thought those were all power embedded systems.
There is a reason why we have these proper APIs and no one has any business bypassing them.
Yes, we should try to support these things, but you said this patch didn't work and wasn't tested - that is not true at all.
And it isn't like we have APIs just sitting here to solve this specific problem. So let's make something.
So, would you be OK with this series if I try to make a dma_map_p2p() that resolves the offset issue?
Well, if it also solves the other issue of invalid scatterlists leaking outside of drm we can think about it.
The scatterlist stuff has already leaked outside of DRM anyhow.
Again, I think it is very problematic to let DRM get away with things and then insist all the poor non-DRM people be responsible for cleaning up their mess.
I'm skeptical I can fix AMD GPU, but I can try to create a DMABUF op that returns something that is not a scatterlist and teach RDMA to use it. So at least the VFIO/RDMA part can avoid the scatterlist abuse. I expected to need a non-scatterlist interface for iommufd anyhow.
Coupled with a series to add some dma_map_resource_pci() that handles the PCI_P2PDMA_MAP_BUS_ADDR and the PCI offset, would it be an agreeable direction?
Take a look at iommu_dma_map_sg and pci_p2pdma_map_segment to see how this is handled.
So there is a bug in all these DMABUF implementations: they ignore the PCI_P2PDMA_MAP_BUS_ADDR "distance type".
This isn't a real-world problem for VFIO because VFIO is largely incompatible with the non-ACS configuration that would trigger PCI_P2PDMA_MAP_BUS_ADDR, which explains why we never saw any problem. All our systems have ACS turned on so we can use VFIO.
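To make the distinction concrete, here is a rough userspace model of what a mapping helper would have to decide. This is only a sketch, not kernel code: every name in it (model_map_type, model_bar, model_map_p2p, iova_base) is made up for illustration, and the host bridge path is reduced to "offset into some IOVA window", which is an assumption standing in for a real DMA API mapping.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdint.h>

/* Loosely mirrors the idea of a P2P "distance type" (hypothetical enum). */
enum model_map_type {
	MODEL_MAP_NOT_SUPPORTED,
	MODEL_MAP_BUS_ADDR,         /* peers talk below the host bridge */
	MODEL_MAP_THRU_HOST_BRIDGE, /* traffic is routed via the host bridge */
};

/* One BAR as seen from the CPU and from the PCI bus (hypothetical struct). */
struct model_bar {
	uint64_t cpu_start; /* CPU physical address of the BAR */
	uint64_t bus_start; /* PCI bus address of the same BAR */
	uint64_t len;       /* size of the BAR in bytes */
};

/*
 * Compute the address the peer device should use to reach 'phys'.
 * Returns 0 on success, -1 if the address is outside the BAR or the
 * map type is not supported.
 */
static int model_map_p2p(const struct model_bar *bar, uint64_t phys,
			 enum model_map_type type, uint64_t iova_base,
			 uint64_t *dma_addr)
{
	assert(bar && dma_addr);

	if (phys < bar->cpu_start || phys - bar->cpu_start >= bar->len)
		return -1;

	switch (type) {
	case MODEL_MAP_BUS_ADDR:
		/* Must be a bus address, so the PCI offset applies here. */
		*dma_addr = bar->bus_start + (phys - bar->cpu_start);
		return 0;
	case MODEL_MAP_THRU_HOST_BRIDGE:
		/* Assumed stand-in for an IOMMU/DMA API mapping. */
		*dma_addr = iova_base + (phys - bar->cpu_start);
		return 0;
	default:
		return -1;
	}
}
```

A caller that always takes the host bridge branch, regardless of the type, is the bug described above: in the bus-address case the peer would be handed an address that is not a PCI bus address.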
I'm unclear how Habana or AMD have avoided a problem here.
This is much more serious than the PCI offset in my mind.
Thanks, Jason