On 29.03.2018 at 17:45, Logan Gunthorpe wrote:
On 29/03/18 05:44 AM, Christian König wrote:
On 28.03.2018 at 21:53, Logan Gunthorpe wrote:
On 28/03/18 01:44 PM, Christian König wrote:
Well, isn't that exactly what dma_map_resource() is good for? As far as I can see, it makes sure the IOMMU is aware of the access route and translates a CPU address into a PCI bus address. I'm using that with the AMD IOMMU driver, and at least there it works perfectly fine.
Yes, it would be nice, but no arch has implemented this yet. We are just lucky in the x86 case because that arch is simple and doesn't need to do anything for P2P (partially due to the Bus and CPU addresses being the same). But in the general case, you can't rely on it.
Well, the fact that no arch has implemented it yet doesn't mean that we don't have the right interface to do it.
Yes, but right now we don't have a performant way to check if we are doing P2P or not in the dma_map_X() wrappers.
Why not? I mean the dma_map_resource() function is for P2P while other dma_map_* functions are only for system memory.
And this is necessary to check if the DMA ops in use support it or not. We can't have the dma_map_X() functions do the wrong thing because they don't support it yet.
Well, that sounds like we should just return an error from dma_map_resource() when an architecture doesn't support P2P yet, as Alex suggested.
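To make the suggestion concrete, here is a rough userspace model of it, not actual kernel code. Every name prefixed with sketch_ is hypothetical and only illustrates the idea: if the DMA ops in use provide no resource-mapping callback, the mapping call fails instead of silently handing back an address that may be wrong.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
typedef uint64_t sketch_dma_addr_t;
#define SKETCH_DMA_MAPPING_ERROR (~(sketch_dma_addr_t)0)

struct sketch_dma_ops {
	/* Left NULL by an arch or IOMMU driver that cannot map P2P yet. */
	sketch_dma_addr_t (*map_resource)(uint64_t phys_addr, size_t size);
};

/* Models the simple case where bus and CPU addresses are identical. */
static sketch_dma_addr_t sketch_identity_map(uint64_t phys_addr, size_t size)
{
	(void)size;
	return phys_addr;
}

/* Fails loudly when the ops in use have no P2P mapping support. */
static sketch_dma_addr_t
sketch_dma_map_resource(const struct sketch_dma_ops *ops,
			uint64_t phys_addr, size_t size)
{
	if (!ops || !ops->map_resource)
		return SKETCH_DMA_MAPPING_ERROR;

	return ops->map_resource(phys_addr, size);
}
```

With this shape a caller only has to check the returned address against the error value, so no separate "are we doing P2P?" test is needed in the wrappers for system memory.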
Devices integrated in the CPU usually only "claim" to be PCIe devices. In reality, their memory requests go directly through the integrated north bridge. The reason for this is simply better throughput and latency.
These are just more reasons why our patchset restricts to devices behind a switch. And more mess for someone to deal with if they need to relax that restriction.
You don't seem to understand the implications: The devices do have a common upstream bridge! In other words your code would currently claim that P2P is supported, but in practice it doesn't work.
You need to involve both drivers that participate in the P2P transaction, to make sure that both support it and to give them the opportunity to chicken out or, in the case of AMD APUs, even redirect the request to another location (i.e. participate in the DMA translation).
DMA-buf fortunately seems to handle all this already; that's why we chose it as the base for our implementation.
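The negotiation described above could be modelled roughly as follows. This is a userspace sketch under loud assumptions: it is not the DMA-buf API, and all sketch_ names, the callback signature and the redirect offset are made up for illustration. Both sides are asked; either one may decline, and either one may rewrite the target address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one driver taking part in a P2P transaction. */
struct sketch_peer {
	const char *name;
	/*
	 * Returns false to decline the transaction. May rewrite *addr to
	 * redirect the request to another location.
	 */
	bool (*accept_p2p)(const struct sketch_peer *other, uint64_t *addr);
};

/* Asks both sides; *addr is only updated when both agree. */
static int sketch_p2p_setup(const struct sketch_peer *exporter,
			    const struct sketch_peer *importer,
			    uint64_t *addr)
{
	uint64_t candidate = *addr;

	if (!exporter->accept_p2p ||
	    !exporter->accept_p2p(importer, &candidate))
		return -1;
	if (!importer->accept_p2p ||
	    !importer->accept_p2p(exporter, &candidate))
		return -1;

	*addr = candidate;
	return 0;
}

/* Example policies: accept as is, decline, or accept with a redirect. */
static bool sketch_accept(const struct sketch_peer *other, uint64_t *addr)
{
	(void)other;
	(void)addr;
	return true;
}

static bool sketch_decline(const struct sketch_peer *other, uint64_t *addr)
{
	(void)other;
	(void)addr;
	return false;
}

static bool sketch_redirect(const struct sketch_peer *other, uint64_t *addr)
{
	(void)other;
	*addr += 0x100000; /* arbitrary offset, purely illustrative */
	return true;
}
```

The point of the sketch is that a topology check alone (for example, "both devices share an upstream bridge") never consults the drivers, whereas here a driver that knows its requests bypass PCIe can refuse or redirect.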
Regards, Christian.