Hi George,
I have looked at the trace you supplied using `perf report --stdio --dump`; the first few lines of the output are below.
My immediate view is that this trace is hopelessly corrupt. The size of the trace suggests that the buffer has not wrapped; wrapping can cause issues in the current perf record implementation, but it does not appear to be the cause here.
The small excerpt below is riddled with reserved trace packet tokens and invalid sequences. The address and context packet at index 26951 shows a VMID and ContextID entry, even though the perf setup cannot enable these features. The addresses look a little unusual too.
The trace ID is consistent, suggesting that the output from the ETR frame formatter is correct. That makes me think that something upstream (ETM, funnel, etc.) is causing the problem, though I would not rely on this. The other possibility is that the ETM is not programmed according to the settings perf enabled it with, which could cause a mismatch between what the decoder expects and the trace it receives.
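To put numbers on how bad a dump is, a rough Python sketch along these lines can count packet types, collect the trace IDs seen, and list the indices of I_BAD_SEQUENCE and I_RESERVED packets. It assumes the `index: id[NN] PACKET_NAME` layout seen in the excerpt in this mail; the function name and the regular expression are illustrative only and are not part of perf or OpenCSD.

```python
import re
from collections import Counter

# Matches e.g. "26887: id[10] I_BAD_SEQUENCE" (layout assumed from the excerpt).
PACKET_RE = re.compile(r"(\d+):\s+id\[([0-9A-Fa-f]+)\]\s+(I_[A-Z0-9_]+)")

# Packet names that indicate the decoder could not make sense of the stream.
SUSPECT = {"I_BAD_SEQUENCE", "I_RESERVED"}


def summarise_dump(text):
    """Return (packet counts, set of trace IDs, indices of suspect packets)."""
    counts = Counter()
    trace_ids = set()
    suspect_indices = []
    for match in PACKET_RE.finditer(text):
        index = int(match.group(1))
        trace_id = match.group(2)
        name = match.group(3)
        counts[name] += 1
        trace_ids.add(trace_id)
        if name in SUSPECT:
            suspect_indices.append(index)
    return counts, trace_ids, suspect_indices


# Four packets copied from the excerpt in this mail.
sample = (
    "26885: id[10] I_TRACE_INFO : Trace Info.; PCTL=0x0\n"
    "26887: id[10] I_BAD_SEQUENCE : Invalid Sequence in packet.[I_EXTENSION]\n"
    "26900: id[10] I_ATOM_F2 : Atom format 2.; EE\n"
    "26932: id[10] I_RESERVED : Reserved Packet Header\n"
)

counts, trace_ids, suspect_indices = summarise_dump(sample)
print("bad sequences:", counts["I_BAD_SEQUENCE"])
print("reserved headers:", counts["I_RESERVED"])
print("trace IDs:", sorted(trace_ids))
print("suspect packet indices:", suspect_indices)
```

A single trace ID alongside many suspect packets would match the picture described above: the formatter output looks sane while the packet stream itself does not.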
I would suggest capturing a trace session using ETB / ETF if that is possible on your system.
With bad data, any attempt to use --inject is doomed to failure.
Regards
Mike
. ... CoreSight ETM Trace data: size 121856 bytes
2868: id[10] I_NOT_SYNC : I Stream not synchronised
26872: id[10] I_ASYNC : Alignment Synchronisation.
26885: id[10] I_TRACE_INFO : Trace Info.; PCTL=0x0
26887: id[10] I_BAD_SEQUENCE : Invalid Sequence in packet.[I_EXTENSION]
26889: id[10] I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x3000017EB7F43258; Ctxt: AArch32, EL2, NS;
26900: id[10] I_ATOM_F2 : Atom format 2.; EE
26901: id[10] I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0001007FB7EB17C4;
26910: id[10] I_ATOM_F4 : Atom format 4.; ENEN
26912: id[10] I_ADDR_S_IS0 : Address, Short, IS0.; Addr=0x0001007FB7EB17D0 ~[0x1D0]
26914: id[10] I_ATOM_F3 : Atom format 3.; ENE
26915: id[10] I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEE
26916: id[10] I_ATOM_F3 : Atom format 3.; NNE
26917: id[10] I_ATOM_F3 : Atom format 3.; EEN
26918: id[10] I_ATOM_F3 : Atom format 3.; ENE
26919: id[10] I_ATOM_F1 : Atom format 1.; E
26920: id[10] I_ADDR_S_IS0 : Address, Short, IS0.; Addr=0x0001007FB7EB22E8 ~[0x122E8]
26923: id[10] I_ATOM_F3 : Atom format 3.; NEE
26924: id[10] I_ADDR_S_IS0 : Address, Short, IS0.; Addr=0x0001007FB7EB17E0 ~[0x117E0]
26928: id[10] I_ATOM_F4 : Atom format 4.; NENE
26929: id[10] I_ATOM_F3 : Atom format 3.; NNE
26930: id[10] I_EXCEPT : Exception.; Reserved;
26932: id[10] I_RESERVED : Reserved Packet Header
26933: id[10] I_ADDR_L_32IS1 : Address, Long, 32 bit, IS1.; Addr=0xB6F418F2;
26938: id[10] I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0xFE88FEFF80082400; Ctxt: AArch64,EL1, NS;
26950: id[10] I_TRACE_ON : Trace On.
26951: id[10] I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x3000007EB7F431E0; Ctxt: AArch64,EL2, S; CID=0xea0c0d95; VMID=0x009d;
26967: id[10] I_RESERVED : Reserved Packet Header
26968: id[10] I_EVENT : Trace Event.
26969: id[10] I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
26973: id[10] I_ATOM_F3 : Atom format 3.; NNN
26974: id[10] I_ATOM_F1 : Atom format 1.; N
26976: id[10] I_RESERVED : Reserved Packet Header
26977: id[10] I_ADDR_L_32IS0 : Address, Long, 32 bit, IS0.; Addr=0xF419EDE4;
26982: id[10] I_RESERVED : Reserved Packet Header
26983: id[10] I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x30FFFF8108092404; Ctxt: AArch32, EL0, S;
26994: id[10] I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x00007EB7F518F388; Ctxt: AArch32, EL0, S;
On 1 June 2017 at 05:59, George Burgess gbiv@google.com wrote:
Great to hear, Sebastian! :) The "failed to mmap" was the same error I got, as well.
Is this a good place to talk about perf inject machinery, or should I start a new thread for that?
If this is the right place, it looks like the traces from my hikey might not be getting read properly by `perf inject` when I'm trying to generate LBR traces. A bit of debugging has shown that many of the traces can be decoded, but we eventually hit one of type ETM4_PKT_I_COND_I_F2, which TrcPktDecodeEtmV4I::decodePacket currently doesn't support.
I'm reading the docs, and it looks like that kind of tracing needs to be explicitly enabled. Doing `cat /sys/bus/coresight/devices/*.etm/mode`, the bits to enable that all appear clear on my setup. So, I'm unsure how it's sneaking into the traces. I found a convenient way to dump the traces (cs_etm__dump_event), and it looks like there may be some invalid traces in perf.data, as well?
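As a small aid for that check, here is a rough Python sketch that reads each ETM's `mode` file and lists which bit positions are set. It assumes each file holds a single hexadecimal value; the helper names are made up for illustration, and which bit positions correspond to conditional tracing is not stated here, so that still has to be looked up in the driver documentation.

```python
import glob


def set_bit_positions(value):
    """Return the positions of the bits that are set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def read_etm_modes(pattern="/sys/bus/coresight/devices/*.etm/mode"):
    """Map each matching sysfs path to its mode value.

    Assumption: every file contains one hexadecimal number, with or
    without a leading 0x.
    """
    modes = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as mode_file:
            modes[path] = int(mode_file.read().strip(), 16)
    return modes


# On a machine without CoreSight sysfs entries this loop prints nothing.
for path, value in read_etm_modes().items():
    print(f"{path}: {value:#x} set bits: {set_bit_positions(value)}")
```

An empty bit list for every ETM would agree with the observation that the relevant mode bits all appear clear.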
In any case, I've attached three things in one tarball:
- perf.data from running a bubble sort program on Hikey for a few ms; recorded using `taskset 1 ./perf record -e cs_etm/@f6404000.etr/u --per-thread ./gbiv-tool`
- etm-dump.txt, which is stdout from running `perf --debug verbose=9 inject -i ./perf.data -o lbr_trace --itrace=i100usl64 --strip` (with a call to cs_etm__dump_event baked in)
- lbr_trace, the (nearly empty?) output of the trace.
When trying to convert lbr_trace to an LLVM profile, I get errors about 0% of the samples being mappable onto the executable.
I'm running perf on my x86 machine from the opencsd 4.11 branch; opencsd was freshly pulled+compiled earlier today.
I'll continue trying to figure out the root cause tomorrow.
On Wed, May 31, 2017 at 10:56 AM, Mathieu Poirier mathieu.poirier@linaro.org wrote:
On 31 May 2017 at 11:09, Sebastian Pop sebpop@gmail.com wrote:
On Tue, May 23, 2017 at 1:44 AM, George Burgess gbiv@google.com wrote:
Right; sorry, I'm tired. :) I was a bit sneaky with the ARM patch, since I'm using Android+fastboot with this kernel, as I think you are.
In particular, I modified Li Pengcheng's sysconf patch to also flip the bits that the ARM patch does. It seems that this works just as well (and is convenient, since that patch already ioremaps the region that the arm-tf patch writes to). New hi6220-sysconfig.c patch is attached.
I tried applying this patch to 4.9, as well. I was able to get the device to boot and recognize its ETMs, but perf dies seemingly because `sink_ops(sink)->alloc_buffer == NULL` (coresight-etm-perf.c; line 241ish). Trying to apply a patch to make that work gave me a build error, which I didn't look too much into; I'm perfectly fine with my kernel not being near AOSP's HEAD for the moment. :)
If you'd like, I'm happy to gather the hacks I made to make 4.11 play nicely with AOSP and put them in patch form tomorrow. It was mainly just me adding bits and pieces to Makefiles and crossing my fingers.
Thanks George for the patch. I got the AOSP linux kernel 4.9 working with CoreSight with the attached patches. I can see the ETM components:
hikey:/ # ls /sys/bus/coresight/devices/
amba:replicator@0/ f6501000.funnel/ f659f000.etm/ f65df000.etm/
f6401000.funnel/ f659c000.etm/ f65dc000.etm/
f6402000.etf/ f659d000.etm/ f65dd000.etm/
f6404000.etr/ f659e000.etm/ f65de000.etm/
However, perf does not collect traces yet: it fails with a memory allocation problem (maybe the same problem you reported above):
# perf record -e cs_etm/@f6404000.etr/u --per-thread
failed to mmap with 12 (Cannot allocate memory)
The perf interface for ETR isn't part of mainline yet. There is a driver available on github (also for kernel 4.9) that should apply cleanly on top of AOSP. ETF support is available in mainline and should work out of the box.
I will try to run perf under gdb and see where and why this fails.
Sebastian
CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight