> dma-buf file descriptor. Userspace access to the buffer should be bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to give the exporting driver a chance to deal with cache synchronization and such for cached userspace mappings without resorting to page faulting tricks.
There should be flags indicating if this is necessary; we don't want extra syscalls on hardware that doesn't need them. The other question is what info is needed, as you may only want to poke a few pages out of the cache, and prepare/finish on its own gives no such info.
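For reference, the bracketing pattern the RFC proposes looks roughly like this from userspace. This is a minimal sketch: the ioctl names come from the patch, but the 'b' type/number encoding and the absence of any argument are assumptions made here for illustration, and error handling is omitted.

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>

    /* Encoding assumed for illustration; the real definitions live in the RFC. */
    #define DMA_BUF_IOCTL_PREPARE_ACCESS _IO('b', 0)
    #define DMA_BUF_IOCTL_FINISH_ACCESS  _IO('b', 1)

    static void cpu_fill(int dmabuf_fd, size_t len)
    {
            uint8_t *ptr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_SHARED, dmabuf_fd, 0);

            /* Let the exporter sync caches before the CPU touches the pages... */
            ioctl(dmabuf_fd, DMA_BUF_IOCTL_PREPARE_ACCESS);

            memset(ptr, 0, len);

            /* ...and clean/flush again once the CPU is done. */
            ioctl(dmabuf_fd, DMA_BUF_IOCTL_FINISH_ACCESS);

            munmap(ptr, len);
    }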
> E.g. If another device was writing to the buffer, the prepare ioctl could block until that device had finished accessing that buffer.
How do you avoid deadlocks on this? We need very clear ways to ensure things always complete in some form, given multiple buffer owners/requestors and the fact that this API has no "prepare multiple buffers" support.
Alan
On Sat, Mar 17, 2012 at 3:17 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>> dma-buf file descriptor. Userspace access to the buffer should be bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to give the exporting driver a chance to deal with cache synchronization and such for cached userspace mappings without resorting to page faulting tricks.
> There should be flags indicating if this is necessary; we don't want extra syscalls on hardware that doesn't need them. The other question is what info is needed, as you may only want to poke a few pages out of the cache, and prepare/finish on its own gives no such info.
Well, there isn't really a convenient way to know, for some random code that is just handed a dmabuf fd, what the flags are without passing around an extra param in userspace. So I'd tend to say, just live with the syscall even if it is a no-op (because if you are doing sw access to the buffer, that is anyway a slow/legacy path). But I'm open to suggestions.
As for just peeking/poking a few pages, that is where later ioctls or additional params could come in to give some hints. But I wanted to keep it simple to start.
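A hypothetical shape for such a hint, sketched only to make the idea concrete (nothing below is in the RFC; the struct, the ioctl number and the zero-length convention are all made up):

    #include <linux/ioctl.h>
    #include <linux/types.h>

    /* Hypothetical: tell the exporter which byte range the CPU will
     * actually touch, so cache maintenance can be limited to the pages
     * that need it rather than covering the whole buffer. */
    struct dma_buf_access_hint {
            __u64 offset;   /* byte offset of the range about to be accessed */
            __u64 length;   /* length in bytes; 0 = whole buffer */
    };

    #define DMA_BUF_IOCTL_PREPARE_RANGE _IOW('b', 2, struct dma_buf_access_hint)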
>> E.g. If another device was writing to the buffer, the prepare ioctl could block until that device had finished accessing that buffer.
> How do you avoid deadlocks on this? We need very clear ways to ensure things always complete in some form, given multiple buffer owners/requestors and the fact that this API has no "prepare multiple buffers" support.
Probably some separate synchronization is needed.. I'm not really sure if prepare/finish (or map/unmap on the kernel side) is the right way to handle that.
BR, -R
-----Original Message-----
From: Alan Cox [mailto:alan@lxorguk.ukuu.org.uk]
Sent: 17 March 2012 20:17
To: Tom Cooksey
Cc: 'Rob Clark'; linaro-mm-sig@lists.linaro.org; dri-devel@lists.freedesktop.org; linux-media@vger.kernel.org; rschultz@google.com; Rob Clark; sumit.semwal@linaro.org; patches@linaro.org
Subject: Re: [PATCH] RFC: dma-buf: userspace mmap support
>> dma-buf file descriptor. Userspace access to the buffer should be bracketed with DMA_BUF_IOCTL_{PREPARE,FINISH}_ACCESS ioctl calls to give the exporting driver a chance to deal with cache synchronization and such for cached userspace mappings without resorting to page faulting tricks.
> There should be flags indicating if this is necessary; we don't want extra syscalls on hardware that doesn't need them. The other question is what info is needed, as you may only want to poke a few pages out of the cache, and prepare/finish on its own gives no such info.
>> E.g. If another device was writing to the buffer, the prepare ioctl could block until that device had finished accessing that buffer.
> How do you avoid deadlocks on this? We need very clear ways to ensure things always complete in some form, given multiple buffer owners/requestors and the fact that this API has no "prepare multiple buffers" support.
Yes, good point.
If the API were also to be used for synchronization, it would have to include an atomic "prepare multiple" ioctl which blocked until all the buffers listed by the application were available. In the same way, the kernel interface would also need to allow drivers to pass a list of buffers a job will access in an atomic "add job" operation. Actually, our current "KDS" (Kernel Dependency System) implementation already works like this.
This might be a good argument for keeping synchronization and cache maintenance separate, though even ignoring synchronization, I would think being able to issue cache maintenance operations for multiple buffers in a single ioctl might present some small efficiency gains. However, as Rob points out, CPU access is already in slow/legacy territory.
Note: Making the ioctl a "prepare multiple" would at least prevent accidental deadlocks due to cross-dependencies, etc., but I think some kind of watchdog/timeout would be useful on userspace locks to stop a malicious application from preventing other devices and processes from using buffers indefinitely.
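To make that concrete, an atomic multi-buffer prepare with a watchdog might look something like the sketch below. None of this comes from an actual patch (KDS itself is a kernel-side interface), so treat the layout and semantics as assumptions:

    #include <linux/ioctl.h>
    #include <linux/types.h>

    /* Hypothetical: prepare a set of dma-bufs in one atomic operation.
     * The kernel would acquire the buffers in a single global order so
     * that two tasks preparing overlapping sets cannot ABBA-deadlock,
     * and the timeout bounds how long one holder can starve the rest. */
    struct dma_buf_prepare_multi {
            __u64 fds;          /* userspace pointer to an array of dma-buf fds */
            __u32 count;        /* number of fds in the array */
            __u32 timeout_ms;   /* fail with -ETIMEDOUT rather than block forever */
    };

    #define DMA_BUF_IOCTL_PREPARE_MULTI _IOW('b', 3, struct dma_buf_prepare_multi)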
Finally, it's probably worth mentioning that when we implemented KDS, we differentiated jobs which needed "exclusive access" to a buffer from jobs which needed "shared access" to a buffer. Multiple jobs could access a buffer at the same time if those jobs all indicated they only needed shared access. Typically this would be a job which will only read a buffer, such as a display controller or texture read. The main use-case for this was implementing EGL's preserved swap behaviour when using "buffer flipping". Here, the display controller will be reading the front buffer, but the GPU might also need to read that front buffer. So perhaps adding "read-only" & "read-write" access flags to prepare could also be interpreted as shared & exclusive accesses, if we went down this route for synchronization, that is. :-)
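Expressed as prepare flags (again purely illustrative, not from any patch), that could be as simple as:

    /* Hypothetical access flags: read-only maps to shared access (many
     * concurrent readers allowed), write implies exclusive access (sole
     * owner), mirroring the KDS shared/exclusive distinction above. */
    #define DMA_BUF_ACCESS_READ     (1 << 0)   /* shared: scan-out, texturing */
    #define DMA_BUF_ACCESS_WRITE    (1 << 1)   /* exclusive: CPU or GPU rendering */

A prepare call asking only for DMA_BUF_ACCESS_READ could then proceed concurrently with the display controller's scan-out, which is exactly the preserved-swap case described above.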
Cheers,
Tom