[Linaro-mm-sig] [PATCH] RFC: dma-buf: userspace mmap support
alan at lxorguk.ukuu.org.uk
Mon Mar 19 16:56:44 UTC 2012
> If the API was to also be used for synchronization it would have to
> include an atomic "prepare multiple" ioctl which blocked until all
> the buffers listed by the application were available. In the same
Too slow already. You are now serializing stuff, while what we want is to
maximise parallelism without allowing deadlocks. If you've got high memory
bandwidth and 8+ cores the 'stop everything' model isn't great.
> This might be a good argument for keeping synchronization and cache
> maintenance separate, though even ignoring synchronization I would
> think being able to issue cache maintenance operations for multiple
> buffers in a single ioctl might present some small efficiency gains.
> However as Rob points out, CPU access is already in slow/legacy
Dangerous assumption. I do think they should be separate. For one, it
makes the case where synchronization is needed but cache management is
done by hardware much easier to split cleanly. Assuming CPU access is
slow/legacy reflects a certain model of relatively slow CPUs and
accelerators, where falling off the acceleration path is bad. On a
higher-end processor, falling off the acceleration path isn't a performance
matter so much as a power concern.
> KDS we differentiated jobs which needed "exclusive access" to a
> buffer and jobs which needed "shared access" to a buffer. Multiple
> jobs could access a buffer at the same time if those jobs all
Makes sense, as it's a reader/writer lock, and it mirrors MESI/MOESI
caching and the cache policy in some hardware/software assists.
> display controller will be reading the front buffer, but the GPU
> might also need to read that front buffer. So perhaps adding
> "read-only" & "read-write" access flags to prepare could also be
> interpreted as shared & exclusive accesses, if we went down this
> route for synchronization that is. :-)
mmap already includes read/write info (the protection flags), so using that
probably works out. It also means that you have the stuff mapped in a way
that will bus error or segfault anyone who goofs, rather than giving them
the usual 'deep weirdness' behaviour you get from mishandling caching bits.
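
[Illustration: a minimal sketch of the fault behaviour, assuming Linux and
using an anonymous mapping as a stand-in for a buffer mapped from a file
descriptor. The function name is hypothetical. A child process writes to a
PROT_READ mapping so the parent can observe the signal that kills it.]

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the signal that killed a child which wrote to a read-only
 * mapping, 0 if the child exited normally, or -1 on a setup error. */
int signal_from_write_to_readonly_mapping(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int status = 0;
    pid_t pid;

    /* Map one page read-only; no PROT_WRITE is requested. */
    volatile char *p = mmap(NULL, len, PROT_READ,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    pid = fork();
    if (pid < 0) {
        munmap((void *)p, len);
        return -1;
    }
    if (pid == 0) {
        char c = p[0];        /* reading is permitted */
        p[0] = (char)(c + 1); /* writing faults in the child */
        _exit(0);             /* reached only if the write did not fault */
    }

    if (waitpid(pid, &status, 0) < 0) {
        munmap((void *)p, len);
        return -1;
    }
    munmap((void *)p, len);
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

[The point is that the mistake surfaces as an immediate, attributable
fault at the offending access.]
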