On Sun, Sep 11, 2011 at 10:32:20AM -0500, Clark, Rob wrote:
On Sat, Sep 10, 2011 at 6:45 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
On Fri, Sep 09, 2011 at 06:36:23PM -0500, Clark, Rob wrote:
With this sort of approach, if a new device is attached after the first get_scatterlist(), the buffer can, if needed, be migrated using the union of all the attached devices' requirements, at a point in time when no DMA is active to/from the buffer. But if all the devices are known up front, then you never need to migrate unnecessarily.
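To sketch the idea (all names here are hypothetical, just to illustrate accumulating per-device requirements at attach time and migrating against their union):

struct dmabuf_constraints {
	unsigned long allowed_placements;	/* bitmask of reachable memory regions */
	unsigned int  max_segments;		/* scatterlist length limit */
};

/* called on each attach; the buffer-wide constraints end up as the
 * "union of all the devices' requirements", i.e. the intersection of
 * the placements every attached device can reach */
static void dmabuf_merge_constraints(struct dmabuf_constraints *buf,
				     const struct dmabuf_constraints *dev)
{
	buf->allowed_placements &= dev->allowed_placements;
	buf->max_segments = min(buf->max_segments, dev->max_segments);
}

If a new attach shrinks allowed_placements so the current placement drops out, the exporter migrates the pages the next time no mapping is outstanding.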
Well, the problem is with devices that hang onto mappings for way too long, so just waiting for all DMA to finish to be able to fix up the buffer placement is a no-go. But I think we can postpone that issue a bit, especially since the drivers that tend to do this (gpus) can also evict objects willy-nilly, so that should be fixable with some explicit kill_your_mappings callback attached to dma_buf_attachment (or full-blown sync objects à la ttm).
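Something like this is all I have in mind for now (purely hypothetical, none of it exists yet):

struct dma_buf_attachment;

/* importer-supplied ops the exporter can use to ask for the buffer
 * back: the importer tears down its cached mapping (and, for a gpu,
 * unbinds the buffer from its address space) as soon as its
 * outstanding dma has completed */
struct dma_buf_attach_ops {
	void (*kill_your_mappings)(struct dma_buf_attachment *attach);
};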
I'm ok if the weird fallback cases aren't fast... I just don't want things to explode catastrophically in those cases.
I guess in the GPU / deep-pipeline case, you can at least set things up to get an interrupt back when the GPU is done with some surface (i.e. when it gets to a certain point in the command stream)? I think it is ok if things stall in this case until the GPU pipeline is drained (and if you are targeting 60fps, that is probably still faster than video, which is likely at 30fps). Again, this is just for the cases where userspace doesn't do what we want, to avoid complete failure...
If the GPU is the one importing the dmabuf, its driver just calls put_scatterlist() once it gets the interrupt back from the GPU. If the GPU is the one exporting the dmabuf, then get_scatterlist() just blocks until the exporting driver gets the interrupt from the GPU. (Well, I guess then you need a get_scatterlist_interruptible()?)
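Roughly like this, I'd imagine (hypothetical names and fields, modeled on the usual wait_event pattern; assumes the exporter keeps a busy flag and a waitqueue that its interrupt handler kicks):

static struct scatterlist *
get_scatterlist_interruptible(struct dma_buf_attachment *attach)
{
	struct dma_buf *buf = attach->dmabuf;
	int ret;

	/* sleep until the exporting gpu signals completion, but let
	 * signals abort the wait instead of hanging the caller */
	ret = wait_event_interruptible(buf->wq, !buf->busy);
	if (ret)
		return ERR_PTR(ret);	/* -ERESTARTSYS */

	return buf->sglist;
}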
The problem with gpus is that they eat through data so _fast_ that not caching mappings kills performance. Now for simpler gpus we could shovel the mapping code into the dma/dma_buf subsystem and cache things there.
But desktop gpus already have (or will get) support for per-process gpu address spaces, and I don't think it makes sense to put that complexity into generic layers (nor is it imo feasible across different gpus; per-process stuff tends to integrate tightly with command submission). So I think we need some explicit unmap_ASAP callback support, but definitely not for v1 of dma_buf. But with attach separated from get_scatterlist and an explicit struct dma_buf_attachment around, such an extension should be pretty straightforward to implement.

-Daniel
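As a rough sketch of that split (only get_scatterlist()/put_scatterlist() come from the discussion above; the attach/detach names are hypothetical):

	/* once, up front, so the exporter knows about every device */
	attach = dma_buf_attach(dmabuf, dev);

	/* around each actual dma operation */
	sgl = get_scatterlist(attach);
	/* ... dma to/from the buffer ... */
	put_scatterlist(attach, sgl);

	/* when the device is done with the buffer for good */
	dma_buf_detach(dmabuf, attach);

A later unmap_ASAP callback would then just force the importer back to the unmapped state between put_scatterlist() and the next get_scatterlist().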