Hi Daniel,
On Thursday 22 March 2012 19:01:01 Daniel Vetter wrote:
On Thu, Mar 22, 2012 at 11:54:55AM -0300, Christian Robottom Reis wrote:
Tomasz: proposed extension to DMA Mapping -- dma_get_pages
- Currently difficult to change the camera address into a list of pages
- The DMA framework has knowledge of this list and could do this
- Depends on dma_get_pages, which needs to be merged first
- Test application posted to dri-devel, with dependencies needed to run the demo (many dependencies)
I kind of missed my chance to yell at this patch when it first showed up, so I'll do that here ;-)
I think this is a gross layering violation and I don't like it at all. The entire point of the dma api is that device drivers only get to see device addresses and can forget about all the remapping/contig-alloc madness. And dma-buf should just follow this with its map/unmap interfaces.
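For reference, the importer-side flow this layering implies looks roughly like the sketch below, using the dma-buf interfaces as merged in 3.3 (program_device_dma() is a hypothetical driver-specific helper, and error paths are trimmed):

```c
#include <linux/dma-buf.h>
#include <linux/scatterlist.h>

/* Importer side: attach to the buffer, then map it into this
 * device's address space. All the driver ever sees are DMA
 * addresses in the returned sg_table. */
static int import_buffer(struct device *dev, struct dma_buf *dmabuf)
{
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
	struct scatterlist *sg;
	int i;

	attach = dma_buf_attach(dmabuf, dev);
	if (IS_ERR(attach))
		return PTR_ERR(attach);

	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	if (IS_ERR(sgt)) {
		dma_buf_detach(dmabuf, attach);
		return PTR_ERR(sgt);
	}

	/* Program the device with DMA addresses only -- no struct
	 * pages, no knowledge of how the exporter backed the buffer.
	 * program_device_dma() stands in for real device setup. */
	for_each_sg(sgt->sgl, sg, sgt->nents, i)
		program_device_dma(dev, sg_dma_address(sg), sg_dma_len(sg));

	dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
	dma_buf_detach(dmabuf, attach);
	return 0;
}
```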
Furthermore the exporter memory might simply not have any associated struct pages. The two examples I always bring up:
- special purpose remapping units (like omap's TILER) which are managed by the exporter and can do crazy things like tiling or rotation transparently for all devices.
- special carve-out memory which is unknown to linux memory management. drm/i915 is totally abusing this, mostly because windows is lame and doesn't have decent large-page allocation support. This is just plain system memory, but there's no struct page for it (because it's not part of the system map).
I agree with you that the DMA API is the proper layer to abstract physical memory and provide devices with a DMA address. DMA addresses are specific to a device, while dma-buf needs to share buffers between separate devices (otherwise it would be pretty pointless). As DMA addresses are device-local, they can't be used to describe a cross-device buffer.
When allocating a buffer using the DMA API, memory is "allocated" behind the scenes and mapped to the device address space ("allocated" in this case means anything from plain physical memory allocation to reservation of a special-purpose memory range, like in the OMAP TILER example). All the device driver gets to see is the DMA address and/or the DMA scatter list. So far, so good.
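To make the "so far, so good" part concrete, this is roughly what the exporter's allocation looks like through the DMA API (a trivial sketch; the wrapper function name is made up):

```c
#include <linux/dma-mapping.h>

/* Exporter side: allocate DMA-able memory. Whether this ends up
 * being plain system memory, CMA, or a carve-out is hidden behind
 * the API. The driver gets a kernel virtual address plus a DMA
 * address that is only valid for *this* device -- handing dma_addr
 * to another device would be meaningless. */
static void *alloc_exported_buffer(struct device *dev, size_t size,
				   dma_addr_t *dma_addr)
{
	return dma_alloc_coherent(dev, size, dma_addr, GFP_KERNEL);
}
```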
Then, when we want to share the memory with a second device, we need a way to map the memory to the second device's address space. There are several options here (and this is related to the "[RFCv2 PATCH 7/9] v4l: vb2-dma-contig: change map/unmap behaviour" mail thread).
- Let the importer driver map the memory to its own address space. This makes sense from the importer device's point of view, as that's where knowledge about the importer device is located (although you could argue that knowledge about the importer device is located in its struct device, which can be passed around - and I could agree with that). The importer driver would thus need to receive a cookie identifying the memory. As explained before, the exporter's DMA address isn't enough. There are various options here as well (list of pages or page frame numbers, exporter's DMA address + exporter's struct device, a new kind of DMA API-related cookie, ... to just list a few). The importer driver would then use that cookie to map the memory to the importer device's address space (and this should most probably be implemented in the DMA API, which would require extensions).
- Let the exporter driver map the memory to the importer device's address space. This makes sense from the exporter device's point of view, as that's where knowledge about the exported memory is located. In this case we also most probably want to extend the DMA API to handle the mapping operation, and we will need to pass the same kind of cookie as in the first option to the API.
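Both options boil down to the same requirement: a device-independent cookie, plus a DMA API extension that turns it into a mapping for a given struct device. Purely as a strawman (none of these types or functions exist today), such an extension could look like:

```c
/* Strawman only -- dma_memory_cookie, dma_map_cookie() and
 * dma_unmap_cookie() are hypothetical, not existing kernel API. */
struct dma_memory_cookie;	/* opaque: page list, PFN list, TILER region, ... */

/* Option 1: the importer driver calls this itself.
 * Option 2: the exporter driver calls it on the importer's behalf.
 * Either way 'dev' is the *importer's* struct device. */
dma_addr_t dma_map_cookie(struct device *dev,
			  struct dma_memory_cookie *cookie,
			  enum dma_data_direction dir);

void dma_unmap_cookie(struct device *dev, dma_addr_t addr,
		      struct dma_memory_cookie *cookie,
		      enum dma_data_direction dir);
```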
Now the core dma api isn't fully up to snuff for everything yet and there are things missing. But it's certainly not dma_get_pages; it's more things like mmap support for coherent memory, or allocating coherent memory which doesn't have a static mapping in the kernel address space. I very much hope that the interfaces we develop for dma-buf (and the insights gained) can serve as examples here, so that in the future there's not such a gaping difference for a driver between dma coherent allocations of its own and imported buffer objects.
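On the mmap point: ARM already carries a dma_mmap_coherent() helper, and a cross-architecture version would let drivers stop open-coding remap_pfn_range() for coherent buffers. A driver's mmap handler could then shrink to something like this (my_drv and its fields are hypothetical):

```c
#include <linux/dma-mapping.h>
#include <linux/fs.h>
#include <linux/mm.h>

/* Sketch of a driver mmap handler built on a generic
 * dma_mmap_coherent(). 'struct my_drv' is a made-up driver
 * context holding the dma_alloc_coherent() results. */
static int my_drv_mmap(struct file *file, struct vm_area_struct *vma)
{
	struct my_drv *drv = file->private_data;

	return dma_mmap_coherent(drv->dev, vma, drv->vaddr,
				 drv->dma_addr, drv->size);
}
```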