Thank you Mike - I'm getting a way better understanding of the whole system.

Sorry for sharing a wrong command, unfortunately on 5.5.0-rc7 I see the same error persisting when running:

$ perf record --per-thread -e cs_etm/@tmc_etr0/u  -- ls
failed to set sink "" on event cs_etm/@tmc_etr0/u with 21 (Is a directory)

I'm using perf built as an Arm binary against the same kernel source code while linked against openCSD.


> When I use the same command on my DB410 I get:

> linaro@linaro-developer:~/perf-tests$ ./perf record --per-thread -e
> cs_etm/@tmc_etr0/u -- ls
> perf                perf-save-data.bash   test_170428_1048.tgz
> test_170502_1550.tgz
> perf-200302_1356-cti-v10-lnk    perf.data
> test_170428_1107.tgz  test_170920_1514.tgz
> perf-4.11-rc1-arm64        perf.data.old          test_170501_1307.tgz
> test_171101_1726.tgz
> perf-4.9-rc1-arm64        sort              test_170501_1427.tgz
> test_190311_1240.tgz
> perf-5.0-rc6-cpu-wide        sort_3k              test_170501_1442.tgz
> trace-sysfs.bash
> perf-clean-data.bash        sort_any          test_170501_2033.tgz
> trace-sysfs.bash~
> perf-gcc5.3-4.13-cs-etr-arm64    test_170307_1304.tgz  test_170501_2116.tgz
> perf-gcc5.3-rel-ocsd-4-13-base    test_170307_1308.tgz  test_170501_2121.tgz
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.110 MB perf.data ]

This is interesting! Do you get any warning when tracing, with the same perf command, any single-threaded workload which takes several seconds to run?
How big is your perf.data file?

When running `perf report -D -i perf.data` I'm able to see the trace data correctly:

  0: I_ASYNC : Alignment Synchronisation.
  12: I_TRACE_INFO : Trace Info.
  17: I_TRACE_ON : Trace On.
  18: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000007F89F24D80; Ctxt: AArch64,EL0, NS; 
  28: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
  29: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
  30: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
  32: I_ATOM_F6 : Atom format 6.; EEEEN
  33: I_ATOM_F1 : Atom format 1.; E
  34: I_EXCEPT : Exception.;  Data Fault; Ret Addr Follows;
  36: I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000007F89F2832C; 
  45: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0xFFFFFFC000083400; Ctxt: AArch64,EL1, NS; 
  56: I_TRACE_ON : Trace On.
  57: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000007F89F2832C; Ctxt: AArch64,EL0, NS; 
  68: I_ATOM_F3 : Atom format 3.; NEE
  69: I_ATOM_F3 : Atom format 3.; NEN

Is there any reference about the format in which the trace is here represented?
I would just like to know what I_ASYNC, I_TRACE_INFO, I_TRACE_ON, I_ADDR_CTXT_L_64IS0 ,I_ATOM_F3 and all the other labels exactly mean

Thanks,
Andrea


From: Mike Leach <mike.leach@linaro.org>
Sent: 04 March 2020 11:14
To: Andrea Brunato <Andrea.Brunato@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>; leo.yan@linaro.org <leo.yan@linaro.org>; guermazi.zied@gmail.com <guermazi.zied@gmail.com>; coresight@lists.linaro.org <coresight@lists.linaro.org>
Subject: Re: perf record cs_etm: data lost
 
Hi Andrea,

On Wed, 4 Mar 2020 at 09:50, Andrea Brunato <Andrea.Brunato@arm.com> wrote:
>
> Hi,
>
> Is this a linux distribution or an android running on a linux kernel for android?
>
> I'm running a linux distribution, Debian Buster
>
> What is "sinkid=0xa6509eae" ?
>
> It's the sinkid corresponding to tmc_etr0. When casting the perf command using @tmc_etr0 unfortunately I'm displayed the following error:
>
> $ perf record --per-thread -e cs_etm/@tmc_etr/u  -- ls
> >> failed to set sink "" on event cs_etm/@tmc_etr/u with 21 (Is a directory)
>

The command above is wrong - you have missed the '0' off tmc_etr0.
However I would expect the error to be
"failed to set sink "tmc_etr" on event cs_etm/@tmc_etr/u with 2 (No
such file or directory)"
if this was the only problem.

When I use the same command on my DB410 I get:

linaro@linaro-developer:~/perf-tests$ ./perf record --per-thread -e
cs_etm/@tmc_etr0/u -- ls
perf                perf-save-data.bash   test_170428_1048.tgz
test_170502_1550.tgz
perf-200302_1356-cti-v10-lnk    perf.data
test_170428_1107.tgz  test_170920_1514.tgz
perf-4.11-rc1-arm64        perf.data.old          test_170501_1307.tgz
 test_171101_1726.tgz
perf-4.9-rc1-arm64        sort              test_170501_1427.tgz
test_190311_1240.tgz
perf-5.0-rc6-cpu-wide        sort_3k              test_170501_1442.tgz
 trace-sysfs.bash
perf-clean-data.bash        sort_any          test_170501_2033.tgz
trace-sysfs.bash~
perf-gcc5.3-4.13-cs-etr-arm64    test_170307_1304.tgz  test_170501_2116.tgz
perf-gcc5.3-rel-ocsd-4-13-base    test_170307_1308.tgz  test_170501_2121.tgz
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.110 MB perf.data ]

Note - I have a local perf version built against the same tree as the
kernel I am running. Ensure that you are doing the same otherwise odd
things can happen.

> so I found myself forced to manually specify the identifier.
> I think this problem has been solved in the latest kernel versions but I still have to double check.
>
> My CoreSight subsystem is the following:
>
> $ ls /sys/bus/coresight/devices/
> > etm0  etm1  etm2  etm3  etm4  etm5  etm6  etm7  funnel0  funnel1  funnel2  funnel3  funnel4  replicator0  stm0  tmc_etf0  tmc_etr0
>
> $ grep . /sys/devices/cs_etm/sinks/*
> /sys/devices/cs_etm/sinks/tmc_etf0:0x97979549
> /sys/devices/cs_etm/sinks/tmc_etr0:0xa6509eae
>
> To allow the "--mmap-pages" to work with big buffer, the sink needs
> to directly output trace data to DDR, you also need to take care
> what's the buffer mode you are using (flat, sg, CATU).
>
> Thanks for the suggestion! Is there a way for me to enforce this?

These are hardware dependent features. CATU is a CS-SoC600 hardware
component that takes over the physical memory mapping from the ETR -
which in SoC600 uses VA only.
The ETR in CS-Soc400 had the mapping h/w build in that could in theory
operate in Scatter-Gather (sg) or Contiguous (flat) memory mapping
mode. However, some hardware implementations got the interconnect
design wrong, and did not support certain read protocols that the SG
mode in ETR requited, so the drivers were updated to have a device
tree flag to indicate if sg was permitted. As I recall, sg was
permitted in Juno, but others were limited to flat only.

> Is it possible to check out how much data I'm losing/overwriting?

No, this is a hardware controlled circular buffer. The only
information available to software  is wrapped / not full.

> Do you know what is the best buffer mode for avoiding losing data?

In general, CS-Soc400 based devices cannot halt the capture on full,
as there is no available interrupt - hence buffers wrap and trace can
be lost.

CS-SoC600 base devices could have this feature (which would match a
similar feature that intel-pt has) assuming the hardware has the
relevant designed connections.
However, at present the drivers do not have the capability to use such
an interrupt - as devices with the hardware have not been available to
developers to date.

> How can I switch between a buffer mode and another one?

See above  - this is largely a hardware feature.

> Is `--per-thread` the best way to trace a single-threaded workload?
>
Given the limited buffer size I would say probably yes. Using CPU wide
would trace everything on the CPU(s) - including any other tasks
running concurrently with that of interest.

Regards

Mike




> Thanks everybody for the interest on this!
> Andrea
> ________________________________
> From: Mathieu Poirier <mathieu.poirier@linaro.org>
> Sent: 04 March 2020 02:43
> To: leo.yan@linaro.org <leo.yan@linaro.org>
> Cc: Andrea Brunato <Andrea.Brunato@arm.com>; coresight@lists.linaro.org <coresight@lists.linaro.org>
> Subject: Re: perf record cs_etm: data lost
>
> On Tue, 3 Mar 2020 at 18:54, Leo Yan <leo.yan@linaro.org> wrote:
> >
> > Hi Andrea,
> >
> > On Tue, Mar 03, 2020 at 11:23:29AM -0700, Mathieu Poirier wrote:
> > > Hi Andrea,
> > >
> > > On Tue, 3 Mar 2020 at 09:34, Andrea Brunato <Andrea.Brunato@arm.com> wrote:
> > > >
> > > > Good afternoon,
> > > >
> > > > When using perf to record a simple benchmark I see the following warning:
> > > >
> > > > $ perf record --per-thread -e cs_etm/sinkid=0xa6509eae/u  ~/afdo/coremark/coremark/coremark.exe  0x3415 0x3415 0x66 0 7 1 2000  > run2.log
> > >
> > > What is "sinkid=0xa6509eae" ?
> >
> > Sink might be ETF/ETR/ETB, I think this is why Mathieu asked the meaning
> > for 'sinkid', which we don't which sink you are using.
> >
> > > > [ perf record: Woken up 6 times to write data ]
> > > > Warning:
> > > > AUX data lost 6 times out of 6!
> > > >
> > > > [ perf record: Captured and wrote 0.378 MB perf.data ]
> > > >
> > > > This is on DB845c (https://www.96boards.org/product/rb3-platform/), kernel 5.5.0-rc7.
> > > >
> > > > It looks like perf is losing data and the reported perf.data file is very tiny, given that the application runs for several seconds.
> > > > Does anybody have any idea about what it may be causing this?
> >
> > Just give some info based on my understanding.
> >
> > For the log "AUX data lost 6 times out of 6!", since you are using
> > per-thread mode, so every time when the task switching out, the trace
> > data in the sink will be read out and save into aux buffer.  But
> > usually the sink's buffer is not big (maximum to MBs), so it's easily
> > to overwrite the hardware's cycle buffer and the previous trace data
> > will be lost.  For this reason, it reports "AUX data lost".
> >
> > > > I've tried adding --mmap-pages=32K,64K to the command for increasing the buffer size, but it's not actually solving the problem
> >
> > If it's the overflow in the hardware buffer in SRAM, "--mmap-pages"
> > option doesn't work, since it only enlarges aux buffer page size but
> > cannot change the SRAM size in CoreSight.
> >
> > To allow the "--mmap-pages" to work with big buffer, the sink needs
> > to directly output trace data to DDR, you also need to take care
> > what's the buffer mode you are using (flat, sg, CATU).
>
> Everything Leo said above is accurate.
>
> >
> > Thanks,
> > Leo
> IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
> _______________________________________________
> CoreSight mailing list
> CoreSight@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/coresight



--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.