[Linaro-mm-sig] [RFC] dma-shared-buf: Add buffer sharing framework

Daniel Vetter daniel at ffwll.ch
Fri Sep 9 18:43:41 UTC 2011


On Thu, Sep 08, 2011 at 05:56:19PM -0500, Clark, Rob wrote:
> On Thu, Sep 8, 2011 at 3:28 PM, Daniel Vetter <daniel at ffwll.ch> wrote:
> > On Thu, Sep 08, 2011 at 12:39:41PM -0500, Clark, Rob wrote:
> >> On Thu, Sep 8, 2011 at 10:55 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
> >> > On Thu, Sep 08, 2011 at 08:45:43PM +0530, Sumit Semwal wrote:
> >> >> On 8 September 2011 19:48, Clark, Rob <rob at ti.com> wrote:
> >> >> > actually, all I think we needed was to add 'struct device *'  to
> >> >> > get_scatterlist (and maybe doesn't hurt for put)..  I'm not entirely
> >> >> > sure why we need create/attach_device?  The create fxn especially
> >> >> > seems wrong.. what if you are the 2nd device to import the buffer?
> >> >> >
> >> >> Ah... I guess I misunderstood your 'it seems like a sane approach' to mean
> >> >> you agree with the 2-step process suggested by Daniel :) - my bad!
> >> >
> >> > On re-reading Rob's response I think there's a misunderstanding. I don't
> >> > want a create function in the dma_buf_ops. The create would be a generic
> >> > dma_buf subsystem function that (like in the patch) allocates the struct
> >> > and properly registers the fd and what not else. The attach_device would
> >> > then be the only new function in dma_buf_ops.
> >> >
> >> > The confusion probably stems from my other proposal to hide these function
> >> > pointers behind small inline helpers for the in-kernel api.
> >>
> >> I guess/assume that what you are trying to get to is a way to
> >> differentiate between "I am done DMA'ing to the buffer now, but will
> >> likely use it again (ie. feel free to cache the mapping)", and "I am
> >> done forever w/ the buffer (don't bother caching)"?
> >
> > Hm, no, I haven't really thought much about the release side. dri2/gem
> > doesn't have any automatic caching of shared buffers, btw, that's all done
> > manually (i.e. in the higher levels like the mesa/ddx drivers) by keeping
> > track of recently shared buffers (and hanging onto them for as long as
> > possible without creating havoc on the other side).
> 
> ok, then I've lost sync.. (ok, I admit to be only having time to read
> email for 5 min here and there, which might be contributing to the
> problem)..
> 
> what was the purpose of attach() then (which couldn't be accomplished
> in the first get_scatterlist())?

Ok, I'll try to explain with this a use-case we've discussed: A video
input device, some yuv plane scanout engine and an mpeg encoder, i.e. the
video-shooting usecase. mpeg encoder is optional and only engaged when the
user presse the record button.

Now for live-view, userspaces grabs some dma_buf ids (i.e. fds) from the
v4l device and then creates a bunch of yuv planes at the scanout device
with these. For this case you're right that we could arbitrage the memory
layout at the get_scatterlist call of the scanout device (the other user,
the frame-grabber, is already known because the dma_buf orginitated
there).

But if we throw a third device into the mix, this won't work. Somewhen
(and it doesn't matter how much we delay this till the complete pipeline
is set up), either the the scanout device or the encoder has to call
get_scatterlist before the other one. And as soon is the allocator
(respectively the originator) has handed out a scatterlist (either in
system address space or device address space) it cannot move the backing
storage anymore. If the other device then calls get_scatterlist and has
more stringet buffer placement requirements, things stop working.

Now my idea is that when userspace establishes the shared buffer (i.e.
subsystems convert the dma_buf fd number into a subsystem specific id),
the device taking part in this buffer sharing only calls attach_device,
but not get_scatterlist.

After this has taken place for all devices, userspace starts the
pipelined. Kernel drivers only then call get_scatterlist and so force the
allocation of the backing storage. And by then all participating devices
are known, so the right placement can be chosen for the buffer.

So the reason for attach_device is to make sharing between 3 devices
reliably possible (without userspace needing to know in which sequence to
share buffers with devices). For 2 devices, attach_device isn't really
necessary because the first device does an implicit attach_device when
creating the buffer - it better not forget it's own memory placement
requirements ;-) So for just 2 devices delaying the allocation up to the
call to get_scatterlist is good enough.

Cheers, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48



More information about the Linaro-mm-sig mailing list