[Linaro-mm-sig] [PATCH 3/3] dma_buf: Add documentation for the new cpu access support
rob at ti.com
Mon Mar 5 18:48:18 UTC 2012
On Fri, Mar 2, 2012 at 6:23 PM, Sakari Ailus <sakari.ailus at iki.fi> wrote:
> Hi Daniel,
> Thanks for the patch.
> On Thu, Mar 01, 2012 at 04:36:01PM +0100, Daniel Vetter wrote:
>> Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
>> Documentation/dma-buf-sharing.txt | 102 +++++++++++++++++++++++++++++++++++-
>> 1 files changed, 99 insertions(+), 3 deletions(-)
>> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
>> index 225f96d..f12542b 100644
>> --- a/Documentation/dma-buf-sharing.txt
>> +++ b/Documentation/dma-buf-sharing.txt
>> @@ -32,8 +32,12 @@ The buffer-user
>> *IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details]
>> For this first version, A buffer shared using the dma_buf sharing API:
>> - *may* be exported to user space using "mmap" *ONLY* by exporter, outside of
>> - this framework.
>> -- may be used *ONLY* by importers that do not need CPU access to the buffer.
>> + this framework.
>> +- with this new iteration of the dma-buf api cpu access from the kernel has been
>> + enable, see below for the details.
>> +dma-buf operations for device dma only
>> The dma_buf buffer sharing API usage contains the following steps:
>> @@ -219,7 +223,99 @@ NOTES:
>> If the exporter chooses not to allow an attach() operation once a
>> map_dma_buf() API has been called, it simply returns an error.
>> -Miscellaneous notes:
>> +Kernel cpu access to a dma-buf buffer object
>> +The motivation to allow cpu access from the kernel to a dma-buf object from the
>> +importers side are:
>> +- fallback operations, e.g. if the devices is connected to a usb bus and the
>> + kernel needs to shuffle the data around first before sending it away.
>> +- full transperancy for existing users on the importer side, i.e. userspace
>> + should not notice the difference between a normal object from that subsystem
>> + and an imported one backed by a dma-buf. This is really important for drm
>> + opengl drivers that expect to still use all the existing upload/download
>> + paths.
>> +Access to a dma_buf from the kernel context involves three steps:
>> +1. Prepare access, which invalidate any necessary caches and make the object
>> + available for cpu access.
>> +2. Access the object page-by-page with the dma_buf map apis
>> +3. Finish access, which will flush any necessary cpu caches and free reserved
>> + resources.
> Where it should be decided which operations are being done to the buffer
> when it is passed to user space and back to kernel space?
> How about spliting these operations to those done on the first time the
> buffer is passed to the user space (mapping to kernel address space, for
> example) and those required every time buffer is passed from kernel to user
> and back (cache flusing)?
> I'm asking since any unnecessary time-consuming operations, especially as
> heavy as mapping the buffer, should be avoidable in subsystems dealing
> with streaming video, cameras etc., i.e. non-GPU users.
Well, this is really something for the buffer exporter to deal with..
since there is no way for an importer to create a userspace mmap'ing
of the buffer. A lot of these expensive operations go away if you
don't even create a userspace virtual mapping in the first place ;-)
>> +1. Prepare acces
>> + Before an importer can acces a dma_buf object with the cpu from the kernel
>> + context, it needs to notice the exporter of the access that is about to
>> + happen.
>> + Interface:
>> + int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
>> + size_t start, size_t len,
>> + enum dma_data_direction direction)
>> + This allows the exporter to ensure that the memory is actually available for
>> + cpu access - the exporter might need to allocate or swap-in and pin the
>> + backing storage. The exporter also needs to ensure that cpu access is
>> + coherent for the given range and access direction. The range and access
>> + direction can be used by the exporter to optimize the cache flushing, i.e.
>> + access outside of the range or with a different direction (read instead of
>> + write) might return stale or even bogus data (e.g. when the exporter needs to
>> + copy the data to temporaray storage).
>> + This step might fail, e.g. in oom conditions.
>> +2. Accessing the buffer
>> + To support dma_buf objects residing in highmem cpu access is page-based using
>> + an api similar to kmap. Accessing a dma_buf is done in aligned chunks of
>> + PAGE_SIZE size. Before accessing a chunk it needs to be mapped, which returns
>> + a pointer in kernel virtual address space. Afterwards the chunk needs to be
>> + unmapped again. There is no limit on how often a given chunk can be mapped
>> + and unmmapped, i.e. the importer does not need to call begin_cpu_access again
>> + before mapping the same chunk again.
>> + Interfaces:
>> + void *dma_buf_kmap(struct dma_buf *, unsigned long);
>> + void dma_buf_kunmap(struct dma_buf *, unsigned long, void *);
>> + There are also atomic variants of these interfaces. Like for kmap they
>> + facilitate non-blocking fast-paths. Neither the importer nor the exporter (in
>> + the callback) is allowed to block when using these.
>> + Interfaces:
>> + void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
>> + void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
>> + For importers all the restrictions of using kmap apply, like the limited
>> + supply of kmap_atomic slots. Hence an importer shall only hold onto at most 2
>> + atomic dma_buf kmaps at the same time (in any given process context).
>> + dma_buf kmap calls outside of the range specified in begin_cpu_access are
>> + undefined. If the range is not PAGE_SIZE aligned, kmap needs to succeed on
>> + the partial chunks at the beginning and end but may return stale or bogus
>> + data outside of the range (in these partial chunks).
>> + Note that these calls need to always succeed. The exporter needs to complete
>> + any preparations that might fail in begin_cpu_access.
>> +3. Finish access
>> + When the importer is done accessing the range specified in begin_cpu_acces,
>> + it needs to announce this to the exporter (to facilitate cache flushing and
>> + unpinning of any pinned resources). The result of of any dma_buf kmap calls
>> + after end_cpu_access is undefined.
>> + Interface:
>> + void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
>> + size_t start, size_t len,
>> + enum dma_data_direction dir);
>> +Miscellaneous notes
>> - Any exporters or users of the dma-buf buffer sharing framework must have
>> a 'select DMA_SHARED_BUFFER' in their respective Kconfigs.
> Kind regards,
> Sakari Ailus
> e-mail: sakari.ailus at iki.fi jabber/XMPP/Gmail: sailus at retiisi.org.uk
> Linaro-mm-sig mailing list
> Linaro-mm-sig at lists.linaro.org
More information about the Linaro-mm-sig