Re: [Linaro-mm-sig] [PATCH 4/8] dma-buf: add peer2peer flag

25 Apr 2018

      On Wed, Apr 25, 2018 at 7:48 AM, Christoph Hellwig hch@infradead.org wrote:
...
On Tue, Apr 24, 2018 at 09:32:20PM +0200, Daniel Vetter wrote:
...
Out of curiosity, how much virtual flushing stuff is there still out
there? At least in drm we've pretty much ignore this, and seem to be
getting away without a huge uproar (at least from driver developers
and users, core folks are less amused about that).
As I've just been wading through the code, the following architectures
have non-coherent dma that flushes by virtual address for at least some
platforms:

arm [1], arm64, hexagon, nds32, nios2, parisc, sh, xtensa, mips,
powerpc

These have non-coherent dma ops that flush by physical address:

arc, arm [1], c6x, m68k, microblaze, openrisc, sparc

And these do not have non-coherent dma ops at all:

alpha, h8300, riscv, unicore32, x86

[1] arm ѕeems to do both virtually and physically based ops, further
audit needed.
Note that using virtual addresses in the cache flushing interface
doesn't mean that the cache actually is virtually indexed, but it at
least allows for the possibility.
...
...
I think the most important thing about such a buffer object is that
it can distinguish the underlying mapping types.  While
dma_alloc_coherent, dma_alloc_attrs with DMA_ATTR_NON_CONSISTENT,
dma_map_page/dma_map_single/dma_map_sg and dma_map_resource all give
back a dma_addr_t they are in now way interchangable.  And trying to
stuff them all into a structure like struct scatterlist that has
no indication what kind of mapping you are dealing with is just
asking for trouble.
Well the idea was to have 1 interface to allow all drivers to share
buffers with anything else, no matter how exactly they're allocated.
Isn't that interface supposed to be dmabuf?  Currently dma_map leaks
a scatterlist through the sg_table in dma_buf_map_attachment /
->map_dma_buf, but looking at a few of the callers it seems like they
really do not even want a scatterlist to start with, but check that
is contains a physically contiguous range first.  So kicking the
scatterlist our there will probably improve the interface in general.
I think by number most drm drivers require contiguous memory (or an
iommu that makes it look contiguous). But there's plenty others who
have another set of pagetables on the gpu itself and can
scatter-gather. Usually it's the former for display/video blocks, and
the latter for rendering.
...
...
dma-buf has all the functions for flushing, so you can have coherent
mappings, non-coherent mappings and pretty much anything else. Or well
could, because in practice people hack up layering violations until it
works for the 2-3 drivers they care about. On top of that there's the
small issue that x86 insists that dma is coherent (and that's true for
most devices, including v4l drivers you might want to share stuff
with), and gpus really, really really do want to make almost
everything incoherent.
How do discrete GPUs manage to be incoherent when attached over PCIe?
It has a non-coherent transaction mode (which the chipset can opt to
not implement and still flush), to make sure the AGP horror show
doesn't happen again and GPU folks are happy with PCIe. That's at
least my understanding from digging around in amd the last time we had
coherency issues between intel and amd gpus. GPUs have some bits
somewhere (in the pagetables, or in the buffer object description
table created by userspace) to control that stuff.
For anything on the SoC it's presented as pci device, but that's
extremely fake, and we can definitely do non-snooped transactions on
drm/i915. Again, controlled by a mix of pagetables and
userspace-provided buffer object description tables.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Re: [Linaro-mm-sig] [PATCH 4/8] dma-buf: add peer2peer flag