On Mon, Jan 23, 2012 at 4:54 AM, Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote:
Hi Daniel,
On Monday 23 January 2012 11:35:01 Daniel Vetter wrote:
On Mon, Jan 23, 2012 at 10:48, Laurent Pinchart wrote:
On Monday 23 January 2012 10:06:57 Marek Szyprowski wrote:
On Friday, January 20, 2012 5:29 PM Laurent Pinchart wrote:
On Friday 20 January 2012 17:20:22 Tomasz Stanislawski wrote:
IMO, one way to do this is adding a field 'struct device *dev' to struct vb2_queue. This field should be filled by a driver prior to calling vb2_queue_init.
I haven't looked into the details, but that sounds good to me. Do we have use cases where a queue is allocated before knowing which physical device it will be used for?
I don't think so. In the case of S5P drivers, vb2_queue_init is called while opening /dev/videoX.
BTW, this struct device may help vb2 produce logs with more descriptive client annotation.
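(For illustration, a rough sketch of how the suggestion above could look on the driver side, assuming vb2_queue gains a new 'struct device *dev' member for the allocator to use; the member itself and the mydrv_* names are hypothetical.)

#include <linux/platform_device.h>
#include <linux/string.h>
#include <media/videobuf2-core.h>
#include <media/videobuf2-dma-contig.h>

/* Called at open() time, before any buffers are allocated. */
static int mydrv_init_vb2_queue(struct vb2_queue *q,
				struct platform_device *pdev,
				const struct vb2_ops *ops,
				void *drv_priv)
{
	memset(q, 0, sizeof(*q));
	q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
	q->io_modes = VB2_MMAP | VB2_USERPTR;
	q->drv_priv = drv_priv;
	q->ops = ops;
	q->mem_ops = &vb2_dma_contig_memops;

	/*
	 * Hypothetical new member: the physical device the allocator should
	 * use for dma_map_*() calls (and, later, for dma-buf import/export).
	 */
	q->dev = &pdev->dev;

	return vb2_queue_init(q);
}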
What happens if such a device is NULL? That would happen for the vmalloc allocator used by VIVI.
Good question. Should dma-buf accept NULL devices? Or should vivi pass its V4L2 device to vb2?
I assume you suggested using the struct video_device->dev entry in such a case. That will not work. The DMA-mapping API requires some parameters to be set for the client device, for example the DMA mask. struct video_device contains only an artificial struct device entry, which has no relation to any physical device and cannot be used for calling DMA-mapping functions.
Performing dma_map_* operations with such an artificial struct device doesn't make any sense. It also slows things down significantly due to cache flushing (forced by dma-mapping), which should be avoided if the buffer is only accessed by the CPU (as is done by vb2-vmalloc style drivers).
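(As a concrete illustration of the per-device parameters mentioned above: a real client device typically gets its DMA masks set in the driver's probe path, which the artificial struct device embedded in video_device never does. A minimal sketch, with made-up driver names:)

#include <linux/dma-mapping.h>
#include <linux/platform_device.h>

static int mydrv_probe(struct platform_device *pdev)
{
	int ret;

	/*
	 * Tell the DMA-mapping core which addresses the device can reach.
	 * Without this (and the other per-device DMA parameters), calling
	 * dma_map_sg()/dma_map_single() on the device is meaningless.
	 */
	ret = dma_set_mask(&pdev->dev, DMA_BIT_MASK(32));
	if (ret)
		return ret;

	ret = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
	if (ret)
		return ret;

	/* ... rest of probe: map registers, register the video device ... */
	return 0;
}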
I agree that mapping the buffer to the physical device doesn't make any sense, as there's simply no physical device to map the buffer to. In that case we could simply skip the dma_map/dma_unmap calls.
See my other mail; dma-buf v1 does not support CPU access.
v1 is in the kernel now, let's start discussing v2 ;-)
So if you don't have a device around, you can't use it in its current form.
Note, however, that dma-buf v1 explicitly does not support CPU access by the importer.
IMHO this case perfectly shows the design mistake that has been made. The current version simply tries to do too much.
Each client of dma_buf should 'map' the provided sg_table/scatterlist on its own. Only the client device driver has all the knowledge to make a proper 'mapping'. Real physical devices will usually use dma_map_sg() for such an operation, while some virtual ones will only create a kernel mapping for the provided scatterlist (like vivi with the vmalloc memory module).
I tend to agree with that. Depending on the importer device, drivers could then map/unmap the buffer around each DMA access, or keep a mapping and sync the buffer.
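(As an aside, a rough sketch of what "the importer maps the scatterlist itself" could look like. Note this is not how dma-buf v1 works today, where map_dma_buf already returns a device-mapped sg_table; the assumption here is a hypothetical exporter op that hands back an unmapped sg_table describing the backing pages.)

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/*
 * Hypothetical: the exporter hands out an sg_table that only describes the
 * backing pages; the importer decides how (and whether) to map it.
 */
static int importer_map_for_dma(struct device *dev, struct sg_table *sgt,
				enum dma_data_direction dir)
{
	int nents;

	/* A real physical device maps the pages into its DMA/IOMMU space. */
	nents = dma_map_sg(dev, sgt->sgl, sgt->nents, dir);
	if (!nents)
		return -ENOMEM;

	return 0;
}

static void importer_unmap(struct device *dev, struct sg_table *sgt,
			   enum dma_data_direction dir)
{
	dma_unmap_sg(dev, sgt->sgl, sgt->nents, dir);
}

/*
 * A purely CPU-based 'device' (vivi-style) would instead create a kernel
 * mapping of the same pages, e.g. with vmap()/vm_map_ram(), and skip the
 * dma_map_sg() call entirely.
 */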
Again, we've discussed adding a syncing op to the interface that would allow keeping mappings around. The thing is that this also requires an unmap callback or something similar, so that the exporter can inform the importer that the memory has just moved around. And the exporter _needs_ to be able to do that, hence also the language in the doc that importers need to bracket all uses with a map/unmap and can't sit forever on a dma_buf mapping.
Not all exporters need to be able to move buffers around. If I'm not mistaken, only DRM exporters need such a feature (which obviously makes it an important feature). Does the exporter need to be able to do so at any time? Buffers obviously can't be moved around while they're being used by an active DMA, so I expect the exporter to be able to wait. How long can it wait?
Offhand, I think it would usually be a request from userspace (in some cases page faults, although I think only if there is hw de-tiling, or a command submission to the GPU involving some buffer(s) that are not currently mapped) that would trigger the exporter to want to evict something. So it could be blocked, or something else could be evicted/moved instead, although that is perhaps not ideal for performance. (App/toolkit writers seem to have a love of temporary pixmaps, so the x11/ddx driver can chew through a huge number of new buffer allocations in a very short amount of time.)
I'm not sure I would like a callback approach. If we add a sync operation, the exporter could signal to the importer that it must unmap the buffer by returning an appropriate value from the sync operation. Would that be usable for DRM?
It does seem a bit over-complicated, and deadlock-prone. Is there a reason the importer couldn't just unmap when the DMA is completed, and the exporter give some hint on the next map() that the buffer hasn't actually moved?
BR, -R
Another option would be to keep the mapping around, and check in the importer if the buffer has moved. If so, the importer would tear the mapping down and create a new one. This is a bit hackish though, as we would tear a mapping down for a buffer that doesn't exist anymore. Nothing should be accessing the mapping at that time, but it could be a security risk if we consider rogue hardware (that's pretty far-fetched though, as rogue hardware can probably already kill the system easily in many cases).
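(A rough sketch of that importer-side caching idea, combined with the "hint on the next map()" suggestion above, using a made-up 'generation' counter that the exporter would bump whenever it moves the buffer. Neither dma_buf_get_generation() nor the importer_mapping struct exists in the current interface; both are hypothetical.)

#include <linux/dma-buf.h>
#include <linux/err.h>
#include <linux/scatterlist.h>

struct importer_mapping {
	struct sg_table *sgt;
	unsigned long generation;	/* exporter bumps this when the buffer moves */
	bool valid;
};

static struct sg_table *importer_get_mapping(struct dma_buf_attachment *attach,
					     struct importer_mapping *cache,
					     enum dma_data_direction dir)
{
	/* Hypothetical query: has the exporter moved the buffer since last time? */
	unsigned long gen = dma_buf_get_generation(attach->dmabuf);

	if (cache->valid && cache->generation == gen)
		return cache->sgt;	/* buffer hasn't moved, reuse the mapping */

	/* Buffer moved (or first use): tear down the stale mapping and remap. */
	if (cache->valid)
		dma_buf_unmap_attachment(attach, cache->sgt, dir);

	cache->sgt = dma_buf_map_attachment(attach, dir);
	if (IS_ERR(cache->sgt)) {
		cache->valid = false;
		return cache->sgt;
	}

	cache->generation = gen;
	cache->valid = true;
	return cache->sgt;
}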
What about splitting the map_dma_buf operation into an operation that backs the buffer with pages and returns an sg_list, and an operation that performs DMA synchronization with the exporter? unmap_dma_buf would similarly be split into two operations.
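(A sketch of what such a split could look like at the ops level. None of these ops exist in the current dma-buf interface, which only has map_dma_buf/unmap_dma_buf; all names below are made up for illustration.)

#include <linux/dma-buf.h>
#include <linux/dma-direction.h>
#include <linux/scatterlist.h>

struct dma_buf_ops_split {
	/* Pin the backing pages and return a scatterlist describing them. */
	struct sg_table *(*get_pages)(struct dma_buf_attachment *attach);
	void (*put_pages)(struct dma_buf_attachment *attach,
			  struct sg_table *sgt);

	/*
	 * Synchronize the buffer with the exporter before/after DMA by the
	 * importer, without tearing the importer's mapping down.
	 */
	int (*sync_for_device)(struct dma_buf_attachment *attach,
			       enum dma_data_direction dir);
	void (*sync_for_cpu)(struct dma_buf_attachment *attach,
			     enum dma_data_direction dir);
};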
Again, for v1 that doesn't make sense, because you can't do CPU access anyway and you should not hang onto mappings forever.
For performance reasons I'd like to hang onto the mapping as long as possible. Creating and tearing down IOMMU mappings for large buffers is a costly operation, and I don't want to spend time there every time I use a buffer if the buffer hasn't moved since the previous time it was mapped.
Furthermore, we have cases where an unmapped sg_list for CPU access simply makes no sense.
Yours, Daniel
[Aside: you can do a dirty trick like PRIME does and grab the underlying pages of an already mapped sg list. Obviously highly non-portable.]
-- Regards,
Laurent Pinchart