11.07 oprofile on panda busted?
siarhei.siamashka at gmail.com
Mon Sep 19 15:27:46 UTC 2011
On Fri, Sep 16, 2011 at 12:30 PM, Bianconi, Cyril <c-bianconi at ti.com> wrote:
> I don't think that the A9 issue is the same as the A8. However, effects are
> the same i.e. it's hard to use PMU.
BTW, if anybody is interested in the details about the Cortex-A8 PMU
issue, this information can be found in the i.MX51 errata list:
Just search for ENGcm10700 there or for 628216 which is the ARM
erratum ID. If Freescale keeps this nice tradition, eventually we may
also enjoy having the Cortex-A9 errata information freely and publicly
accessible.
> I cannot communicate the A9 errata document as-is due to legal stuff but I
> believe that I can explain the issue.
> The issue happens when counters are in overflow (then not sure that this
> impacts OProfile).
> Theoretically, an interrupt should fire in this case. In reality, this
> interrupt is lost randomly.
> The ARM proposed workaround is to use 2 counters: counter0 and counter1
> initialized at counter0+1. If one interrupt is lost, the other one should
> fire just after.
> We have noticed that this could not be sufficient and that a third counter
> should be used to have close to 0% of the interrupts lost.
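To put a rough number on the redundant-counter workaround described above, here is a minimal sketch. It assumes each overflow interrupt is lost independently with the same probability, which the erratum description does not state, and the 5% loss rate is a purely hypothetical figure chosen for illustration. The function name `residual_loss` is made up for this example.

```python
def residual_loss(p_lost: float, counters: int) -> float:
    """Chance that a sample is lost even with redundant counters.

    Assumption (not from the erratum text): each counter's overflow
    interrupt is lost independently with probability p_lost, so the
    sample disappears only when all of them are lost.
    """
    if not 0.0 <= p_lost <= 1.0:
        raise ValueError("p_lost must be within [0, 1]")
    if counters < 1:
        raise ValueError("at least one counter is required")
    return p_lost ** counters


if __name__ == "__main__":
    hypothetical_loss = 0.05  # illustrative value, not a measured rate
    for k in (1, 2, 3):
        print(f"{k} counter(s): residual loss {residual_loss(hypothetical_loss, k):.6f}")
```

Under this model every extra counter shrinks the loss but never brings it to exactly zero, and if the losses are correlated with what the CPU is executing, the remaining loss can still bias the profile.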
Close to 0% is still not good enough if we really want to rely on the
statistical properties of the collected profiling data. As a shameless
plug, here is a link to my oprofile related blog post from a bit less
than a month ago:
> Note: This HW issue has been fixed by ARM quite "late", so I think that most
> of the devices on the market should be impacted.
This pretty much rules out the use of PMU for oprofile on both A8 and
A9. How soon can we expect linaro kernels to switch to using timer
mode in oprofile with a reasonably high sample collection rate for
all the linaro supported boards? Increasing the sample collection rate is
very simple and can be done by replacing TICK_NSEC with something more
The majority of users never even use any counters other than the cycle
counter, so not using PMU is not a big loss. Just using a high
resolution timer is a viable replacement if the CPU clock frequency
does not change during the test. PMU can have a much better use for
timing short sequences of code if the performance counters could get
exposed to userspace in the mainline kernel.
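As a rough illustration of why a timer can stand in for the cycle counter at a fixed clock frequency, the sketch below converts a timer period into a sample rate and into the number of CPU cycles each sample represents. The helper names, the HZ=100 tick (TICK_NSEC is roughly 1e9/HZ nanoseconds), the 100 us period and the 1 GHz clock are all assumptions picked for the example, not values taken from any particular kernel configuration.

```python
NSEC_PER_SEC = 1_000_000_000


def samples_per_second(period_ns: int) -> float:
    """Sampling rate of a periodic timer firing every period_ns nanoseconds."""
    return NSEC_PER_SEC / period_ns


def cycles_per_sample(period_ns: int, cpu_hz: int) -> float:
    """CPU cycles elapsed between two samples.

    Only meaningful if cpu_hz stays constant during the whole test.
    """
    return period_ns * cpu_hz / NSEC_PER_SEC


def estimated_cycles(samples: int, period_ns: int, cpu_hz: int) -> float:
    """Approximate cycles attributed to a symbol that received `samples` hits."""
    return samples * cycles_per_sample(period_ns, cpu_hz)


if __name__ == "__main__":
    tick_ns = NSEC_PER_SEC // 100  # hypothetical HZ=100 tick period
    fast_ns = 100_000              # hypothetical 100 us sampling period
    cpu_hz = 1_000_000_000         # hypothetical fixed 1 GHz clock
    print(samples_per_second(tick_ns), "samples/s at the tick period")
    print(samples_per_second(fast_ns), "samples/s at the shorter period")
    print(estimated_cycles(250, fast_ns, cpu_hz), "cycles for 250 samples")
```

If the clock frequency changes during the run, the cycles-per-sample factor is no longer constant and the conversion stops being valid, which is the caveat mentioned above.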