On Fri, Oct 15, 2010 at 2:49 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> One of the Valgrind subtools is Cachegrind; this is a cache profiler. (It simulates the I1, D1 and L2 caches so it can pinpoint the sources of cache misses in application code.)
> On x86 Cachegrind automatically queries the host CPU to find out what sort/size of cache it has installed, and by default will simulate that sort of cache. (You can also use command line options to specify a different cache layout to model.)
> On ARM, the ARMv7 VMSA coprocessor registers which describe the cache geometry are privileged-mode access only. This means Cachegrind can't do the same "default cache model is the same as your real CPU" behaviour that it does on x86.
> Can the kernel folks on this list suggest whether it would be a reasonable idea for the kernel to provide some sort of userspace API so tools like Cachegrind can find out the cache geometry?
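
For reference, here is a rough sketch of what a tool would do with the cache geometry if it could get at it. It decodes a raw ARMv7 Cache Size ID Register (CCSIDR) value into line size, ways and sets, and prints it in the `size,associativity,line size` form that Cachegrind's cache options take. The struct and function names are made up for illustration, and how the raw value would reach userspace is exactly the open question; the code only assumes the value has been obtained somehow.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the geometry decoded from one CCSIDR value. */
struct cache_geometry {
    uint32_t line_bytes;  /* cache line length in bytes */
    uint32_t ways;        /* associativity */
    uint32_t sets;        /* number of sets */
    uint32_t size_bytes;  /* total size = line_bytes * ways * sets */
};

/*
 * Decode an ARMv7 CCSIDR value:
 *   bits [2:0]   LineSize      = log2(words per line) - 2
 *   bits [12:3]  Associativity = ways - 1
 *   bits [27:13] NumSets       = sets - 1
 */
struct cache_geometry decode_ccsidr(uint32_t ccsidr)
{
    struct cache_geometry g;
    uint32_t line_field = ccsidr & 0x7u;

    /* Words are 4 bytes, so bytes per line = 4 << (LineSize + 2). */
    g.line_bytes = 1u << (line_field + 4);
    g.ways = ((ccsidr >> 3) & 0x3ffu) + 1;
    g.sets = ((ccsidr >> 13) & 0x7fffu) + 1;
    g.size_bytes = g.line_bytes * g.ways * g.sets;
    return g;
}

/* Print the geometry as a Cachegrind-style option, e.g. name = "D1". */
void print_geometry(const char *name, uint32_t ccsidr)
{
    struct cache_geometry g = decode_ccsidr(ccsidr);

    printf("--%s=%u,%u,%u\n", name,
           (unsigned)g.size_bytes, (unsigned)g.ways, (unsigned)g.line_bytes);
}
```

As a worked value, `decode_ccsidr(0x000fe01a)` gives 64-byte lines, 4 ways and 128 sets, i.e. a 32 KiB cache. That input is an example encoding, not a reading taken from any particular board.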
There are similar issues with the CPU ID and feature registers.
If we're going to suggest a change for this, it would be good to clear up the whole CPU identification / CPU feature detection area at the same time.
Note, a key stumbling block has been that the configuration in effect may not be the same as that supported by the hardware (because some kernel feature is compiled out, or errata workarounds are in force, for example). This means that simply mirroring the CPU registers up to userspace (or simplistically emulating the MRCs) may give rise to problems.
This is potentially something where we can make some worthwhile progress in Linaro, but there's a risk of creating Yet Another Interface which few people migrate to, exacerbating the fragmentation further. It would be interesting to hear if anyone has thoughts on how to manage that.
Cheers
---Dave