Re: [Linaro-mm-sig] noveau vs arm dma ops

25 Apr 2018

On Wed, Apr 25, 2018 at 12:04:29PM +0200, Daniel Vetter wrote:
...
...
>> Coordinating the backport of a trivial helper in the arm tree is not
>> the end of the world.  Really, this cowboy attitude is a good reason
>> why graphics folks have such a bad rep.  You keep poking into random
>> kernel internals, don't talk to anyone and then complain if people
>> are upset.  This shouldn't be surprising.
>
> Not really agreeing on the cowboy thing. The fundamental problem is that
> the dma api provides abstraction that seriously gets in the way of writing
> a gpu driver. Some examples:

So talk to other people.  Maybe people share your frustration.  Or maybe
other people have a way to help.
...

> We never want bounce buffers, ever. dma_map_sg gives us that, so there
> are hacks to fall back to a cache of pages allocated using
> dma_alloc_coherent if you build a kernel with bounce buffers.

get_required_mask() is supposed to tell you if you are safe.  However
we are missing lots of implementations of it for iommus so you might get
some false negatives, improvements welcome.  It's been on my list of
things to fix in the DMA API, but it is nowhere near the top.
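
To make the mask comparison concrete, here is a minimal userspace sketch
of the check a driver would do with the required mask.  It is not kernel
code: bit_mask() and mask_covers() are made-up names for illustration,
and in a real driver the required mask would come from
dma_get_required_mask(dev).  As noted above, a missing implementation
can make the reported mask pessimistic, so treat a "not covered" result
as a hint rather than proof that bouncing will happen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(n): the low n bits set. */
static uint64_t bit_mask(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    return n == 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/*
 * Hypothetical helper: true if a device that can address device_mask
 * can reach every address covered by required_mask, i.e. no bounce
 * buffering should be needed for addressing reasons.
 */
static bool mask_covers(uint64_t device_mask, uint64_t required_mask)
{
    return (device_mask & required_mask) == required_mask;
}
```

A device limited to 32-bit addressing on a platform that reports a
36-bit required mask fails this check, which is the case where bounce
buffers can show up.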
...

> dma api hides the cache flushing requirements from us. GPUs love
> non-snooped access, and worse give userspace control over that. We want
> a strict separation between mapping stuff and flushing stuff. With the
> IOMMU api we mostly have the former, but for the latter, arch maintainers
> regularly tell us they won't allow that. So we have drm_clflush.c.

The problem is that an entirely separate cache flushing API is hard.
That being said, if you look at my generic dma-noncoherent API series, it
tries to move that way.  So far it is in early stages and, unfortunately,
apparently rather buggy.
...

> dma api hides how/where memory is allocated. Kinda similar problem,
> except now for CMA or address limits. So either we roll our own
> allocators and then dma_map_sg (and pray it doesn't bounce buffer), or
> we use dma_alloc_coherent and then grab the sgt to get at the CMA
> allocations because that's the only way. Which sucks, because we can't
> directly tell CMA how to back off if there's some way to make CMA memory
> available through other means (gpus love to hog all of memory, so we
> have shrinkers and everything).

If you really care about doing explicit cache flushing anyway (see
above), allocating your own memory and mapping it where needed is by
far the superior solution.  On cache-coherent architectures
dma_alloc_coherent is nothing but allocate memory + dma_map_single.
For non-coherent allocations the memory might come from a special
pool or must be used through a special virtual address mapping that
is set up either statically or dynamically.  For that case splitting
allocation and mapping is a good idea in many ways, and I plan to move
towards that once the number of dma mapping implementations is down
to a reasonable number so that it can actually be done.
