On Fri, 29 Apr 2011 08:59:58 +0100 Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
On Fri, Apr 29, 2011 at 07:50:12AM +0200, Thomas Hellstrom wrote:
However, we should be able to construct a completely generic API around these operations, and for architectures that don't support them we need to determine
a) Whether we want to support them anyway (IIRC the problem with PPC is that the linear kernel map uses huge TLB entries that are very inefficient to break up?)
That same issue applies to ARM too - you'd need to stop the entire machine, rewrite all processes' page tables, flush the TLBs, and only then restart. Otherwise there's the possibility of ending up with conflicting types of TLB entries, and I'm not sure what the effect of having two matching TLB entries for the same address would be.
Right, I don't think anyone wants to see this sort of thing happen with any frequency. So either a large, uncached region can be set up at boot time for allocations, or infrequent, large requests and conversions can be made on demand, with memory being freed back to the main, coherent pool under pressure.
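To make the second option concrete, here is a rough sketch of such a pool. This is illustration only - the pool structure and function names are made up, and set_pages_uc()/set_pages_wb() are the x86 helpers that change the linear-map attribute of individual pages; other architectures would need their own equivalent (which is exactly the problem being discussed).

/*
 * Illustrative only: a tiny pool of uncached pages.  Converting a page
 * is expensive (attribute change plus cache/TLB flushing inside
 * set_pages_uc()), so converted pages are kept on a free list and
 * reused; under memory pressure they are converted back with
 * set_pages_wb() and released to the normal cached pool.
 */
#include <linux/mm.h>
#include <linux/gfp.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <asm/cacheflush.h>	/* set_pages_uc()/set_pages_wb() on x86 */

static LIST_HEAD(uc_free_list);
static DEFINE_SPINLOCK(uc_lock);

struct page *uc_pool_get(void)
{
	struct page *page = NULL;

	spin_lock(&uc_lock);
	if (!list_empty(&uc_free_list)) {
		page = list_first_entry(&uc_free_list, struct page, lru);
		list_del(&page->lru);
	}
	spin_unlock(&uc_lock);
	if (page)
		return page;		/* already uncached: the cheap path */

	page = alloc_page(GFP_KERNEL);
	if (!page)
		return NULL;
	if (set_pages_uc(page, 1)) {	/* expensive: CPA + flushes */
		__free_page(page);
		return NULL;
	}
	return page;
}

void uc_pool_put(struct page *page)
{
	spin_lock(&uc_lock);
	list_add(&page->lru, &uc_free_list);	/* keep it uncached for reuse */
	spin_unlock(&uc_lock);
}

/* Called under memory pressure: restore attributes and give pages back. */
void uc_pool_shrink(void)
{
	struct page *page;

	spin_lock(&uc_lock);
	while (!list_empty(&uc_free_list)) {
		page = list_first_entry(&uc_free_list, struct page, lru);
		list_del(&page->lru);
		spin_unlock(&uc_lock);
		set_pages_wb(page, 1);
		__free_page(page);
		spin_lock(&uc_lock);
	}
	spin_unlock(&uc_lock);
}

The expensive part is the set_pages_uc() call, which is why reuse from the free list matters so much.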
b) Whether they are needed at all on the particular architecture. The Intel x86 spec is (according to AMD) supposed to forbid conflicting caching attributes, but the Intel graphics guys use them for GEM. PPC appears not to need it.
Some versions of the architecture manual say that having multiple mappings with differing attributes is unpredictable.
Yes, there's a bit of abuse going on there. We've received a guarantee that if the CPU speculates a line into the cache, then as long as it's not modified through the cacheable mapping, the CPU won't write it back to memory; it'll discard the line as needed instead (IIRC AMD CPUs will actually write back clean lines, so GEM wouldn't work the same way there).
But even with GEM, there is a large performance penalty the first time a new buffer object is allocated. Even though we don't have to change mappings by stopping the machine etc., we still have to flush out everything the CPU is holding for the object (since some lines may be dirty), and then flush the memory controller buffers before accessing it through the uncached mapping. So at least currently, we're all in the same boat when it comes to new object allocations: they will be expensive unless you already have some uncached mappings you can re-use.
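For reference, that first-touch flush looks roughly like this. The buffer object layout and function name are invented for the sketch; drm_clflush_pages() is the existing DRM helper, which uses clflush per cache line where available and falls back to wbinvd otherwise.

/*
 * Sketch of the "flush before first uncached access" cost.  The bo
 * structure and my_bo_make_cpu_coherent() are made up for illustration;
 * drm_clflush_pages() is the real DRM helper.
 */
#include <drm/drmP.h>		/* drm_clflush_pages() */

struct my_bo {
	struct page **pages;
	unsigned long num_pages;
};

static void my_bo_make_cpu_coherent(struct my_bo *bo)
{
	/*
	 * Write back and invalidate any lines the CPU may be holding for
	 * these pages, whether pulled in speculatively or dirtied through
	 * an earlier cacheable mapping.
	 */
	drm_clflush_pages(bo->pages, bo->num_pages);

	/*
	 * Make sure the flushed data has left the CPU's write buffers
	 * before the pages are touched through an uncached mapping or by
	 * the device.  On Intel hardware the driver additionally flushes
	 * the chipset's global write buffer (intel_gtt_chipset_flush()).
	 */
	wmb();
}

That per-cacheline walk over the whole object, plus the write-buffer flush, is the cost that remains even when no mapping attributes have to be changed.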