Re: linux kernel flush_cache_all behaviour on a Big.LITTLE system

10 Mar 2014


On Mon, Mar 10, 2014 at 10:52 AM, Catalin Marinas <catalin.marinas@arm.com> wrote:
...
On Mon, Mar 10, 2014 at 10:44:05AM +0000, karim.allah.ahmed@gmail.com wrote:
...
I have two questions:
1- I was wondering what the expected semantics of "flush_cache_all"
should be on a big.LITTLE architecture.
I can see that the implementation of this function in the Linux kernel
does the following:
a- Read the value of LoC ( level of coherency )
b- Flush each level of cache to that LoC value using DCCISW
co-processor register.
My expectation would be that if this is executed on one of the
processors of the big cluster, it should flush all L1 and L2 caches on
this cluster and then signal the cache cleaning operation to the CCI
interconnect, which would propagate it downstream to the LITTLE
cluster. This would mean that at the end all caches are flushed.
I am not sure exactly how the CCI behaves here, but cache flushing by
set/way (like the flush_cache_all function) is not safe on SMP
(independent of big.LITTLE) and it should only be used in certain
contexts like suspend/resume, where we have more control over cache
line migration between CPUs/clusters.
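
For reference, the set/way walk described in steps a and b above can be sketched in portable C. This is only an illustration under stated assumptions, not the kernel's implementation: it assumes the ARMv7-A CLIDR and CCSIDR field layouts, the helper names (clidr_loc, ccsidr_decode, setway_operand, walk_setway) are hypothetical, and the register values are passed in as plain integers instead of being read through CP15. On hardware, each operand would be written with DCCISW on the CPU doing the walk, which is why the effect stays local to that CPU's view of the caches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CLIDR fields on ARMv7-A: LoC is bits [26:24], LoUIS is bits [23:21]. */
static unsigned clidr_loc(uint32_t clidr)   { return (clidr >> 24) & 0x7; }
static unsigned clidr_louis(uint32_t clidr) { return (clidr >> 21) & 0x7; }

/* Ctype for a 0-based level: 0 = no cache, 2..4 = data or unified cache. */
static unsigned clidr_ctype(uint32_t clidr, unsigned level)
{
    assert(level < 7);
    return (clidr >> (3 * level)) & 0x7;
}

struct cache_geom {
    unsigned line_shift; /* log2(line size in bytes) */
    unsigned ways;
    unsigned sets;
};

/* ARMv7 CCSIDR: LineSize [2:0], Associativity [12:3], NumSets [27:13]. */
static struct cache_geom ccsidr_decode(uint32_t ccsidr)
{
    struct cache_geom g;
    g.line_shift = (ccsidr & 0x7) + 4;
    g.ways = ((ccsidr >> 3) & 0x3ff) + 1;
    g.sets = ((ccsidr >> 13) & 0x7fff) + 1;
    return g;
}

static unsigned ceil_log2(unsigned n)
{
    unsigned bits = 0;
    while ((1u << bits) < n)
        bits++;
    return bits;
}

/* Set/way operand: way in the top bits, set above the line offset,
 * 0-based cache level in bits [3:1]. */
static uint32_t setway_operand(unsigned level, struct cache_geom g,
                               unsigned set, unsigned way)
{
    unsigned way_bits = ceil_log2(g.ways);
    uint32_t v = ((uint32_t)level << 1) | ((uint32_t)set << g.line_shift);
    if (way_bits) /* direct-mapped: no way field, and a shift by 32 is undefined */
        v |= (uint32_t)way << (32 - way_bits);
    return v;
}

typedef void (*setway_fn)(uint32_t operand, void *ctx);

/* Visit every set/way of each data or unified cache below `limit`
 * (pass clidr_loc() or clidr_louis()). ccsidr[level] must hold the
 * data/unified CCSIDR value for that level. Returns the number of
 * operations; `issue` may be NULL to only count them. */
static unsigned long walk_setway(uint32_t clidr, const uint32_t *ccsidr,
                                 unsigned limit, setway_fn issue, void *ctx)
{
    unsigned long n = 0;
    for (unsigned level = 0; level < limit; level++) {
        if (clidr_ctype(clidr, level) < 2)
            continue; /* no cache, or instruction cache only */
        struct cache_geom g = ccsidr_decode(ccsidr[level]);
        for (unsigned way = 0; way < g.ways; way++)
            for (unsigned set = 0; set < g.sets; set++) {
                if (issue)
                    issue(setway_operand(level, g, set, way), ctx);
                n++;
            }
    }
    return n;
}
```

Passing clidr_louis() as the limit gives the cheaper L1-only walk, while clidr_loc() gives the walk up to the Level of Coherency. Nothing in this loop involves another CPU or the interconnect.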
...
Are those the proper semantics of this operation?
Or is it only going to affect this CPU and no other CPUs in the cluster
(and consequently no other CPUs in the other cluster)? And if that's
the case, does this mean that I have to do the cache flushing per CPU?
The safe thing is to assume that it only affects a single CPU (and as an
optimisation we use a flush_cache_louis which does the L1 cache only).
When the whole cluster is going down and we know that only one CPU is
running, we can use flush_cache_all for that cluster but it does not
affect the caches in the other cluster.
I see.
I assumed that flush_cache_all was going to be seen by the other
cluster as well! For example, if my system declares L2 as the LoC,
my understanding is that once I have flushed my caches up to L2, the
CCI (or something) should be signalled to propagate this flush
downstream to the other cluster in order to maintain the semantics of
the Level of Coherency in the ARM TRM. Is this a correct understanding
of LoC? Maybe I should explicitly notify the other observers as well
after flushing to refresh their view of the memory (like you said, by
flushing using MVA), otherwise they might see stale data?
The CCI already has signals to propagate cache maintenance operations
downstream, I just didn't know when they are invoked and when they're
not!
...
Per-CPU cache flushing isn't useful either when all the CPUs are active,
since cache lines can still migrate (unless you use something like
stop_machine, disable the MMU on all CPUs, and do the flushing after the
MMUs have been disabled).
...
2- And is there a difference in semantics between flushing each cache
until I reach the Level of Coherency (using the DCCISW register) and
flushing the first cache only to the Point of Coherency (using the
DCCIMVAC register)?
The difference is that the MVA operation is guaranteed to work on SMP
since it is broadcast to the other CPUs in hardware. The SW ops are not.
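
To illustrate the by-MVA alternative, here is a minimal sketch of how an address range is typically broken into cache-line-sized operations. The names for_each_cache_line and record_last are hypothetical, the line size is passed in by the caller, and the range is assumed not to wrap around the top of the address space. On real hardware the callback would issue DCCIMVAC for each address, followed by a barrier once the loop is done, and it is those per-address operations that the hardware broadcasts to the other CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*line_fn)(uintptr_t mva, void *ctx);

/* Call `op` once for every cache line overlapping [start, start + len).
 * `line_size` must be a power of two. Returns the number of lines
 * visited; `op` may be NULL to only count them. */
static size_t for_each_cache_line(uintptr_t start, size_t len,
                                  size_t line_size, line_fn op, void *ctx)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    if (len == 0)
        return 0;

    /* Round the start down so a partially covered first line is included. */
    uintptr_t addr = start & ~(uintptr_t)(line_size - 1);
    uintptr_t end = start + len;
    size_t n = 0;

    for (; addr < end; addr += line_size) {
        if (op)
            op(addr, ctx);
        n++;
    }
    return n;
}

/* Example callback: remember the last line address that was visited. */
static void record_last(uintptr_t mva, void *ctx)
{
    *(uintptr_t *)ctx = mva;
}
```

With a 64-byte line, a 0x80-byte buffer starting at 0x1004 touches three lines (0x1000, 0x1040 and 0x1080), so an unaligned buffer costs one more operation than its length alone suggests.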
Thanks Catalin for your reply.
...
--
Catalin
-- 
Karim Allah Ahmed.
