Re: perf record cs_etm: data lost

5 Mar 2020


Hi,
Thanks, Leo, for the clarifications. This is a flow problem between a
producer and a consumer. You can handle it by reducing the amount of
trace generated before the context switch (use filters to trace only the
important sections of the code, shorten the time the process stays
scheduled, etc.), by increasing the size of the buffer (though I think that
is not possible in your case), or by implementing flow control. I guess
flow control should become possible once the CTIs are fully supported, on
platforms where a CTI is connected to the interrupt controller.
Is this use case being considered for the CTI drivers?
Kind Regards
Zied Guermazi
On Thu, 5 Mar 2020, 1:29 PM Leo Yan <leo.yan@linaro.org> wrote:
...
Hi Andrea,
On Thu, Mar 05, 2020 at 10:28:27AM +0000, Andrea Brunato wrote:
...
Thank you Leo, this is really valuable!
You are welcome!
...
I'm going to update my kernel version to the latest one - hopefully I'll
manage to do this as soon as possible
...
As I'm trying different configurations, I've got a very interesting
result:
...
$ taskset -c 0 perf record --per-thread -e cs_etm/sinkid=0xa6509eae/u
~/afdo/coremark/coremark/coremark.exe  0x3415 0x3415 0x66 0 7 1 2000  >
run2.log
...
[ perf record: Woken up 28 times to write data ]
Warning:
AUX data lost 25 times out of 27!
[ perf record: Captured and wrote 3.323 MB perf.data ]
While instead, when NOT pinning the program to a specific core
$ perf record --per-thread -e cs_etm/sinkid=0xa6509eae/u
~/afdo/coremark/coremark/coremark.exe  0x3415 0x3415 0x66 0 7 1 2000  >
run2.log
...
[ perf record: Woken up 4 times to write data ]
Warning:
AUX data lost 4 times out of 4!
[ perf record: Captured and wrote 0.502 MB perf.data ]
While the loss rate is still high, the number of times AUX data has
been lost is very different: 27 vs 4
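To compare runs like the two above, the warning line can be turned into a loss rate. The following is only a minimal sketch: the helper name `aux_loss` is hypothetical, and it assumes the warning text has exactly the form quoted in this thread ("AUX data lost N times out of M!").

```python
import re

# Matches the warning as quoted in this thread; other perf versions may word it differently.
AUX_LOST = re.compile(r"AUX data lost (\d+) times out of (\d+)")


def aux_loss(perf_output):
    """Return (lost, total, lost/total) from perf's warning text, or None if absent."""
    match = AUX_LOST.search(perf_output)
    if match is None:
        return None
    lost, total = int(match.group(1)), int(match.group(2))
    ratio = lost / total if total else 0.0
    return lost, total, ratio


pinned = "Warning:\nAUX data lost 25 times out of 27!"
unpinned = "Warning:\nAUX data lost 4 times out of 4!"
print(aux_loss(pinned))
print(aux_loss(unpinned))
```

Read this way, the pinned run lost data in 25 of its 27 captures and the unpinned run in all 4 of its captures.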
...
Also the reported perf.data file is way bigger when pinning the task to
a specific core.
...
Interestingly enough, when instead tracing a short-lived program such as
`ls`, there is no difference in the perf.data reported.
...
Is anybody aware of any specific part in the code base whose behavior
may change according to the traced program being rescheduled to another
core?
...
Any idea/suggestion is highly appreciated
As far as I know, Arm CoreSight cannot produce interrupts on many
platforms, so the perf tool only captures trace data when the profiled
program is switched out from a CPU.
In your first command you set the CPU affinity to CPU0. CPU0 is usually
the primary CPU for handling interrupts, and many interrupt threads run
on it. This gives the profiled program many chances to be scheduled out,
and so perf captures trace data many times (27 times).
Your second command doesn't use taskset. In that case the Linux kernel
scheduler spreads tasks across different CPUs where possible, which gives
the profiled program more chance to occupy a CPU without being scheduled
out. I think this is the main reason why, in the second command, the perf
tool captured trace data only 4 times.
Thanks,
Leo
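The producer/consumer behaviour described in this thread can be illustrated with a toy model. This is only a sketch under stated assumptions, not how perf or the CoreSight drivers are implemented: trace is produced at a constant rate into a fixed-size buffer, the buffer is drained only when the task is switched out, and switches happen at a fixed interval. The function name `simulate` and all numbers are made up for illustration.

```python
def simulate(run_ticks, switch_every, buffer_size, rate=1):
    """Toy model of a trace buffer that is drained only on context switch.

    Returns (captures, captures_with_loss, units_captured).
    """
    fill = 0              # units currently held in the buffer
    overflowed = False    # did the buffer fill up since the last drain?
    captures = lost = captured_units = 0

    for tick in range(1, run_ticks + 1):
        fill += rate
        if fill > buffer_size:
            # The buffer is full: anything beyond its size is dropped.
            fill = buffer_size
            overflowed = True
        if tick % switch_every == 0:
            # Task switched out: the consumer collects whatever is buffered.
            captures += 1
            captured_units += fill
            if overflowed:
                lost += 1
            fill = 0
            overflowed = False

    return captures, lost, captured_units


# Frequent switches (as on a busy CPU0) versus rare switches (task left alone).
print(simulate(run_ticks=2700, switch_every=100, buffer_size=50))
print(simulate(run_ticks=2700, switch_every=675, buffer_size=50))
```

In this model, more frequent switch-outs give more captures and more captured data in total, yet every capture still loses data as long as the interval between switches produces more trace than the buffer holds. Loss only stops when the buffer is drained before it fills, which is what trace filtering, a larger buffer, or interrupt-driven flow control would each aim for.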
