Greetings, I was referred to this mailing list by Mathieu Poirier.
I'm recording CoreSight trace on my Dragonboard 410c board, after compiling the perf-opencsd-master kernel.
Recording seems to work on a simple program I wrote, which does nothing but print a string to the screen.
Now I use perf to decode the trace, but perf segfaults. Note that decoding the sample trace given here does work: https://github.com/Linaro/OpenCSD/blob/master/HOWTO.md
I dug into the cs-trace-disasm.py script to see why exactly it doesn't work, and I noticed that this command causes the segfault:
$ ~/linux/tools/perf/perf script --show-mmap-events
/lib/aarch64-linux-gnu/ld-2.26.so with build id 6516ef8fa13fcb739834af9e87fb5fe9df612096 not found, con>
Segmentation fault (core dumped)
This command also segfaults:
$ ~/linux/tools/perf/perf report --stdio
/lib/aarch64-linux-gnu/ld-2.26.so with build id 6516ef8fa13fcb739834af9e87fb5fe9df612096 not found, con>
(END)Segmentation fault (core dumped)
But this command works:
$ ~/linux/tools/perf/perf report --stdio --dump
And I can see the CoreSight packet information when browsing the output.
My .debug directory looks like this:
~/.debug/
|-- [kernel.kallsyms]
|   `-- 1dc43d23817467d7717b19af07463af0d9a9bd83
|       `-- kallsyms
|-- [vdso]
|   `-- 18863444e4f3e2600f53e406421b2a0edd940888
|       `-- vdso
|-- bin
|   `-- check
|       `-- 31694f29996e06da12f63d6088ec6eb23b3079c4
|           `-- elf
`-- lib
    `-- aarch64-linux-gnu
        |-- ld-2.26.so
        |   `-- 6516ef8fa13fcb739834af9e87fb5fe9df612096
        |       `-- elf
        `-- libc.so.6
            `-- 06e99d8d6acabab0643e0f525ac561cf73db6498
                `-- elf
Another thing I wanted to ask: where can I find the code that uses OpenCSD to decode the trace and output instructions? Eventually I don't want to use perf, but rather use OpenCSD directly in my own code to decode traces.
I'm not sure how to proceed and would appreciate some help. Thank you all!
Hi Mike,
On Mon, Mar 19, 2018 at 11:08:17AM +0200, Mike Bazov wrote:
Greetings, i was referred to this mailing list by Mathieu Poirier.
I'm recording Coresight using my Dragonboard 410c board, after compiling the perf-opencsd-master kernel.
I am using acme branch: https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
Recording seems to work on a simple program i did which does nothing but print a string to the screen.
Now, i use perf to hopefully decode the trace, but perf segfaults. I'll let you know that decoding using the sample trace given here: https://github.com/Linaro/OpenCSD/blob/master/HOWTO.md
Does work.
I dived into the cs-trace-disasm.py script to see why exactly it doesn't work and i noticed this command causes the segfault: $ ~/linux/tools/perf/perf script --show-mmap-events
/lib/aarch64-linux-gnu/ld-2.26.so with build id 6516ef8fa13fcb739834af9e87fb5fe9df612096 not found, con> Segmentation fault (core dumped)
This command also segfaults:
$ ~/linux/tools/perf/perf report --stdio
/lib/aarch64-linux-gnu/ld-2.26.so with build id 6516ef8fa13fcb739834af9e87fb5fe9df612096 not found, con> (END)Segmentation fault (core dumped)
I have not seen the segmentation fault myself, but here are some findings that may help:
- As I remember, if you use kallsyms for kernel symbols, you need to enable these kernel configs:
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
Also make sure userspace has permission to read the complete kernel address info; IIRC, you need to set the entry below to '0' or '1':
echo 0 > /proc/sys/kernel/kptr_restrict
- If you want to specify a kernel image for importing the kernel symbol table, as in './perf report -k ./vmlinux --stdio', you may need this patch: https://www.spinics.net/lists/linux-perf-users/msg05576.html
Even if these two methods work, I still think the segmentation fault needs to be root-caused and fixed with extra patches.
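One quick way to see whether the kptr_restrict setting is taking effect is to look at /proc/kallsyms: when kernel pointers are hidden from the reader, the address column is printed as all zeros. The helper below is only a sketch with hypothetical function names; it checks the address column of lines in the usual "address type name" layout and nothing more.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// True if a /proc/kallsyms-style line ("<hex address> <type> <name>")
// carries a real address, false if the address column is all zeros.
bool line_has_real_address(const std::string &line) {
    const std::string::size_type end = line.find(' ');
    if (end == std::string::npos || end == 0)
        return false;                       // not in the expected layout
    return line.find_first_not_of('0') < end;
}

// Scans a kallsyms-style file and reports whether any symbol has a
// non-zero address. Pass "/proc/kallsyms" on the target.
bool kallsyms_exposes_addresses(const std::string &path) {
    assert(!path.empty());
    std::ifstream in(path);                 // an unreadable file simply yields no lines
    std::string line;
    while (std::getline(in, line)) {
        if (line_has_real_address(line))
            return true;
    }
    return false;
}
```

If kallsyms_exposes_addresses("/proc/kallsyms") returns false for the user running perf, the kernel symbols perf collects will likely be unusable for that run.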
Thanks, Leo Yan
CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight
Hi, thank you for your answer. I don't think the kernel symbols are related. I can see that the segfault for the command "perf report --stdio" occurs in OpenCSD; this is the call stack:
#0 TrcPktDecodeEtmV4I::decodePacket (this=this@entry=0x555555e9a650, Complete=@0x7fffffffbb67: true) at /home/mike/repo/OpenCSD/decoder/source/etmv4/trc_pkt_decode_etmv4i.cpp:373
#1 0x00007ffff69b812b in TrcPktDecodeEtmV4I::processPacket (this=0x555555e9a650) at /home/mike/repo/OpenCSD/decoder/source/etmv4/trc_pkt_decode_etmv4i.cpp:99
#2 0x00007ffff69b8b58 in TrcPktDecodeBase<EtmV4ITrcPacket, EtmV4Config>::PacketDataIn (this=0x555555e9a650, op=OCSD_OP_DATA, index_sop=41, p_packet_in=0x555555e98cf8) at /home/mike/repo/OpenCSD/decoder/include/common/trc_pkt_decode_base.h:247
#3 0x00007ffff69b4000 in EtmV4IPktProcImpl::processData (this=0x555555e98c70, index=<optimized out>, dataBlockSize=15, pDataBlock=0x555555e94869 "\177\201\177\377\377", numBytesProcessed=0x7fffffffbda4) at /home/mike/repo/OpenCSD/decoder/source/etmv4/trc_pkt_proc_etmv4i_impl.cpp:115
#4 0x00007ffff69b036a in TrcPktProcBase<EtmV4ITrcPacket, _ocsd_etmv4_i_pkt_type, EtmV4Config>::TraceDataIn (this=0x555555e98aa0, op=<optimized out>, index=32, dataBlockSize=<optimized out>, pDataBlock=<optimized out>, numBytesProcessed=<optimized out>) at /home/mike/repo/OpenCSD/decoder/include/common/trc_pkt_proc_base.h:238
#5 0x00007ffff69a18bd in TraceFmtDcdImpl::outputFrame (this=this@entry=0x555555e93780) at /home/mike/repo/OpenCSD/decoder/source/trc_frame_deformatter.cpp:700
#6 0x00007ffff69a1c24 in TraceFmtDcdImpl::processTraceData (this=0x555555e93780, index=<optimized out>, dataBlockSize=<optimized out>, pDataBlock=<optimized out>, numBytesProcessed=0x7fffffffbee4) at /home/mike/repo/OpenCSD/decoder/source/trc_frame_deformatter.cpp:272
#7 0x00005555558284b6 in cs_etm_decoder__process_data_block (decoder=0x555555e8b2d0, indx=0, buf=buf@entry=0x7ffff7ff25aa ")", len=len@entry=16640, consumed=consumed@entry=0x7fffffffbf68) at util/cs-etm-decoder/cs-etm-decoder.c:518
#8 0x0000555555826d22 in cs_etm__run_decoder (etmq=0x555555e7b150) at util/cs-etm.c:986
#9 cs_etm__process_timeless_queues (etm=0x555555e7a070, tid=2161, time_=18446744073709551615)
As far as I can see, this is the statement that causes the segfault:
Program received signal SIGSEGV, Segmentation fault.
TrcPktDecodeEtmV4I::decodePacket (this=this@entry=0x555555e9a650, Complete=@0x7fffffffbb67: true) at /home/mike/repo/OpenCSD/decoder/source/etmv4/trc_pkt_decode_etmv4i.cpp:373
372 std::vector<uint32_t> params;
373 params[0] = m_curr_packet_in->getCC();
This segfaults because accessing an element of a vector that doesn't exist is undefined behavior; on my system it results in a segfault. Can any OpenCSD contributor help?
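To illustrate the failure mode in isolation, here is a minimal sketch. It is not the OpenCSD code and not the patch from the pull request; the function names and the cycle_count parameter are made up for the example. std::vector::operator[] does no bounds checking, so indexing an empty vector is undefined behaviour, whereas at() throws and push_back grows the vector first.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Returns true if a checked access into an empty vector is rejected.
// at() throws std::out_of_range where operator[] would be undefined behaviour.
bool empty_access_is_rejected() {
    std::vector<std::uint32_t> params;     // size() == 0, as on line 372 of the backtrace
    try {
        params.at(0) = 1;                  // checked equivalent of params[0] = ...
    } catch (const std::out_of_range &) {
        return true;
    }
    return false;
}

// One well-defined way to store a single value: grow the vector first.
std::vector<std::uint32_t> make_params(std::uint32_t cycle_count) {
    std::vector<std::uint32_t> params;
    params.push_back(cycle_count);         // the vector now has exactly one element
    assert(params.size() == 1);
    return params;
}
```

Constructing the vector with an initial size, for example std::vector<std::uint32_t> params(1), would be another way to make the indexed write valid.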
Thanks, Mike.
Hi Mike,
This issue was pointed out in a pull request on GitHub this week, though as a build issue rather than an outright crash.
As you and the spec rightly point out, the behaviour is undefined, so it will vary per compiler and build, which may be why we have got away with it for so long.
I will check the fix and apply it as soon as I can, then release a patch version of the library (though I am at a conference this week, so it may be a little while).
If you want to get ahead of the curve, I recommend trying the patch sent with the pull request.
Regards
Mike
Thank you for your answers. I fixed the issue by applying the patches in the pull request, and I'm no longer getting any segfaults, but I have a different issue now. I'm recording my own, very simple program (int main() { printf("Sleep\n"); return 0; }). The recording seems to work:
$ sudo perf record -e cs_etm/@826000.etr/u --per-thread ./check
Sleep
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data ]
But when I run perf script, I'm not getting any trace events, even though I can see there are trace packets (using perf report --dump):
$ /home/mike/repo/trace/linux/tools/perf/perf --exec-path=/home/mike/repo/trace/linux/tools/perf script --script=python:/home/mike/repo/trace/linux/tools/perf/scripts/python/cs-trace-disasm.py -- -d /usr/bin/aarch64-linux-gnu-objdump
/lib/aarch64-linux-gnu/ld-2.26.so with build id 6516ef8fa13fcb739834af9e87fb5fe9df612096 not found, continuing without symbols
/lib/aarch64-linux-gnu/libc-2.26.so with build id 06e99d8d6acabab0643e0f525ac561cf73db6498 not found, continuing without symbols
/home/linaro/check with build id 31694f29996e06da12f63d6088ec6eb23b3079c4 not found, continuing without symbols
/lib/aarch64-linux-gnu/ld-2.26.so with build id 6516ef8fa13fcb739834af9e87fb5fe9df612096 not found, continuing without symbols
/lib/aarch64-linux-gnu/libc-2.26.so with build id 06e99d8d6acabab0643e0f525ac561cf73db6498 not found, continuing without symbols
/home/linaro/check with build id 31694f29996e06da12f63d6088ec6eb23b3079c4 not found, continuing without symbols
The output gives nothing but the "not found, continuing without symbols" messages. I dug into the code a little, and I can see that the only packet type I get is CS_ETM_TRACE_ON. This switch statement (cs-etm.c line 1010):
1009 switch (etmq->packet->sample_type) {
1010 case CS_ETM_RANGE:
1011         /*
1012          * If the packet contains an instruction
1013          * range, generate instruction sequence
1014          * events.
1015          */
1016         cs_etm__sample(etmq);
1017         break;
1018 case CS_ETM_TRACE_ON:
1019         /*
1020          * Discontinuity in trace, flush
1021          * previous branch stack
1022          */
1023         cs_etm__flush(etmq);
1024         break;
1025 default:
1026         break;
1027 }
never gets to CS_ETM_RANGE, so I get no decoded instructions. perf report does show packets:
Idx:0; ID:16; I_ASYNC : Alignment Synchronisation.
Idx:12; ID:16; I_TRACE_INFO : Trace Info.; INFO=0x1; CC_THRESHOLD=0x100
Idx:19; ID:16; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
Idx:28; ID:16; I_TRACE_ON : Trace On.
Idx:29; ID:16; I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000FFFF9D3>
Idx:40; ID:16; I_ATOM_F1 : Atom format 1.; E
Idx:41; ID:16; I_CCNT_F1 : Cycle Count format 1.; Count=0x0
Idx:42; ID:16; I_TIMESTAMP : Timestamp.; Updated val = 0x70fcdee7; CC=0x1
Idx:54; ID:16; I_ATOM_F1 : Atom format 1.; E
Idx:55; ID:16; I_CCNT_F1 : Cycle Count format 1.; Count=0x1d4
Idx:58; ID:16; I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
Idx:59; ID:16; I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
Idx:60; ID:16; I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
Idx:61; ID:16; I_ATOM_F3 : Atom format 3.; EEN
Idx:62; ID:16; I_ATOM_F1 : Atom format 1.; E
Idx:64; ID:16; I_EXCEPT : Exception.; Data Fault; Ret Addr Follows, Match Prev;
Idx:66; ID:16; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000FFFF9D378770;
Idx:75; ID:16; I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0xFFFF7DFFFE7>
Idx:86; ID:16; I_TIMESTAMP : Timestamp.; Updated val = 0x70fcdf71; CC=0xf6
Idx:91; ID:16; I_TRACE_ON : Trace On.
Idx:92; ID:16; I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000FFFF9D3>
Idx:103; ID:16; I_ATOM_F1 : Atom format 1.; N
Idx:104; ID:16; I_CCNT_F1 : Cycle Count format 1.; Count=0x0
Idx:105; ID:16; I_ATOM_F6 : Atom format 6.; EEEEN
Idx:106; ID:16; I_ATOM_F2 : Atom format 2.; EE
...
This is my .debug tree:
|-- [kernel.kallsyms]
|   `-- 1dc43d23817467d7717b19af07463af0d9a9bd83
|       `-- kallsyms
|-- [vdso]
|   `-- 18863444e4f3e2600f53e406421b2a0edd940888
|       `-- vdso
|-- bin
|   `-- check
|       `-- 31694f29996e06da12f63d6088ec6eb23b3079c4
|           `-- elf
`-- lib
    `-- aarch64-linux-gnu
        |-- ld-2.26.so
        |   `-- 6516ef8fa13fcb739834af9e87fb5fe9df612096
        |       `-- elf
        `-- libc.so.6
            `-- 06e99d8d6acabab0643e0f525ac561cf73db6498
                `-- elf
This seems fine according to the HOWTO.md. The weird part is that if I record "uname" instead of my program, I do get trace events.
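For comparing the tree against the "not found" messages by hand, the layout in the listing appears to be <cache dir>/<DSO path>/<build id>/elf. The sketch below just builds that path from the two strings in a message. cache_entry_path is a hypothetical helper, not a perf function, and perf's real lookup may go through other entries that are not visible in this listing, so treat it as an aid for eyeballing names only.

```cpp
#include <cassert>
#include <string>

// Builds the cache entry path in the layout of the tree above:
//   <debug_dir>/<dso path without leading '/'>/<build id>/elf
std::string cache_entry_path(const std::string &debug_dir,
                             const std::string &dso_path,
                             const std::string &build_id) {
    assert(!build_id.empty());
    std::string rel = dso_path;
    if (!rel.empty() && rel.front() == '/')
        rel.erase(0, 1);                   // DSO paths are absolute, cache entries are relative
    return debug_dir + "/" + rel + "/" + build_id + "/elf";
}
```

With the names from the messages, /home/linaro/check would map to home/linaro/check/<build id>/elf under the cache directory, while the listing has bin/check; likewise the messages name libc-2.26.so where the listing has libc.so.6. Whether that difference matters here is not established in this thread, but it may be worth checking.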
Any suggestions? Thanks, Mike.
Hi,
On 20 March 2018 at 15:34, Mike Bazov mike@perception-point.io wrote:
Thank you for your answers. I fixed the issue by applying the patches in the pull-request. I'm not getting any segfaults.. but i have a different issue now. I'm recording my own program, super simple(int main() { printf("Sleep\n"); return 0; }). The recording seems to work:
$ sudo perf record -e cs_etm/@826000.etr/u --per-thread ./check
Sleep
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data ]
But when i try to run the perf script, i'm not getting any trace events, even though i can see there are trace packets(using perf report --dump):
$ /home/mike/repo/trace/linux/tools/perf/perf --exec-path=/home/mike/repo/trace/linux/tools/perf script --script=python:/home/mike/repo/trace/linux/tools/perf/scripts/python/cs-trace-disasm.py -- -d /usr/bin/aarch64-linux-gnu-objdump
/lib/aarch64-linux-gnu/ld-2.26.so with build id 6516ef8fa13fcb739834af9e87fb5fe9df612096 not found, continuing without symbols
/lib/aarch64-linux-gnu/libc-2.26.so with build id 06e99d8d6acabab0643e0f525ac561cf73db6498 not found, continuing without symbols
/home/linaro/check with build id 31694f29996e06da12f63d6088ec6eb23b3079c4 not found, continuing without symbols
/lib/aarch64-linux-gnu/ld-2.26.so with build id 6516ef8fa13fcb739834af9e87fb5fe9df612096 not found, continuing without symbols
/lib/aarch64-linux-gnu/libc-2.26.so with build id 06e99d8d6acabab0643e0f525ac561cf73db6498 not found, continuing without symbols
/home/linaro/check with build id 31694f29996e06da12f63d6088ec6eb23b3079c4 not found, continuing without symbols
Without the memory images loaded above, the decoder cannot decode the trace packets into meaningful trace execution ranges - the process of decode requires that the decoder follow the executed program binary to determine which instructions were traced and which branches were taken. (the 'E' atoms in the dump below represent taken branches - the decode walks the code to find these branches).
So that's why the trace is not being decoded. The dump command simply shows the raw trace packets before decode.
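The walk described above can be sketched with a toy model. This is not how OpenCSD is implemented: the Instr, Image, demo_image and walk_atoms names, the fixed 4-byte instruction size, and the demo addresses are all assumptions for illustration. It only shows why each atom needs the program image: the decoder has to read instructions to find the next branch before it can apply an E or an N.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy instruction: fixed 4-byte size, optionally a branch with a known target.
struct Instr {
    bool is_branch;
    std::uint64_t target;   // only meaningful when is_branch is true
};

using Image = std::map<std::uint64_t, Instr>;             // address -> instruction
using Range = std::pair<std::uint64_t, std::uint64_t>;    // executed range [start, end)

// A made-up five-instruction program used for the example.
Image demo_image() {
    Image img;
    img[0x1000] = Instr{false, 0};
    img[0x1004] = Instr{true, 0x2000};
    img[0x1008] = Instr{false, 0};
    img[0x100c] = Instr{true, 0x1000};
    img[0x2000] = Instr{true, 0x1008};
    return img;
}

// Walks the image from `start`, consuming one atom per branch:
// 'E' = branch taken, 'N' = branch not taken. Stops early when an address
// is not covered by the image, which is the "no memory image" case.
std::vector<Range> walk_atoms(const Image &image, std::uint64_t start,
                              const std::string &atoms) {
    std::vector<Range> ranges;
    std::uint64_t pc = start;
    for (char atom : atoms) {
        assert(atom == 'E' || atom == 'N');
        const std::uint64_t range_start = pc;
        auto it = image.find(pc);
        while (it != image.end() && !it->second.is_branch) {
            pc += 4;                        // step over non-branch instructions
            it = image.find(pc);
        }
        if (it == image.end())
            return ranges;                  // ran off the known image: cannot decode further
        ranges.push_back({range_start, pc + 4});
        pc = (atom == 'E') ? it->second.target : pc + 4;
    }
    return ranges;
}
```

With an empty image the function returns no ranges at all, however many atoms are supplied, which mirrors a dump full of packets that produces no decoded output.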
I agree that the .debug database looks valid. Not sure why perf cannot see it. I'll see if I can find out more.
Mike
On Thu, 22 Mar 2018 07:52:15 +0000 Mike Leach mike.leach@linaro.org wrote:
Hi, sorry, just jumping in with a couple of extra recommendations...
On 20 March 2018 at 15:34, Mike Bazov mike@perception-point.io wrote:
Thank you for your answers. I fixed the issue by applying the patches in the pull-request. I'm not
If this means you cherry-picked them without rebasing them onto a very late version of perf, then please do that, too. Preferably this one:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
because its perf/core branch contains some recent fixes in the area.
getting any segfaults.. but i have a different issue now. I'm recording my own program, super simple(int main() { printf("Sleep\n"); return 0; }). The recording seems to work:
That's an awfully small piece of userspace code: all that executable will do is call printf with the fixed string (which ends up as a write syscall), and nothing else. That is a very small number of instructions, so maybe it's triggering a rarely tested recording corner case.
Without the memory images loaded above, the decoder cannot decode the trace packets into meaningful trace execution ranges - the process of decode requires that the decoder follow the executed program binary to determine which instructions were traced and which branches were taken. (the 'E' atoms in the dump below represent taken branches - the decode walks the code to find these branches).
So that's why the trace is not being decoded. The dump command simply shows the raw trace packets before decode.
I didn't know this, but also think it might be fixed with a perf update to the latest acme's perf/core branch. Not sure, just saying one more thing to check.
I don't recognize these addresses, some are 32-bit printouts, some aren't, which is just confusing unless it's something like an aarch32 binary running on an aarch64 kernel. I can't tell whether they're userspace process, shared libraries, or kernelspace. The number of instructions in the example "Sleep" printing above tells me they might be kernel addresses, which might mean it needs to be run with superuser privileges? This is also needed, e.g., to read nonzero kallsyms pointers.
Idx:103; ID:16; I_ATOM_F1 : Atom format 1.; N
Idx:104; ID:16; I_CCNT_F1 : Cycle Count format 1.; Count=0x0
Idx:105; ID:16; I_ATOM_F6 : Atom format 6.; EEEEN
Idx:106; ID:16; I_ATOM_F2 : Atom format 2.; EE
...
This is my .debug tree:
|-- [kernel.kallsyms]
|   `-- 1dc43d23817467d7717b19af07463af0d9a9bd83
|       `-- kallsyms
|-- [vdso]
|   `-- 18863444e4f3e2600f53e406421b2a0edd940888
|       `-- vdso
|-- bin
|   `-- check
|       `-- 31694f29996e06da12f63d6088ec6eb23b3079c4
|           `-- elf
`-- lib
    `-- aarch64-linux-gnu
        |-- ld-2.26.so
        |   `-- 6516ef8fa13fcb739834af9e87fb5fe9df612096
        |       `-- elf
        `-- libc.so.6
            `-- 06e99d8d6acabab0643e0f525ac561cf73db6498
                `-- elf
This seems fine according to the HOWTO.md. The weird part is that if I record "uname" instead of my program, I do get trace events.
I agree that the .debug database looks valid. Not sure why perf cannot see it. I'll see if I can find out more.
uname does a lot more than the simple example above: was it meant to actually call sleep() in addition to printing "Sleep"?
Updating the perf executable may also fix bugs in the area of resolving paths to non-system binaries. Try sudo rm -fr ~/.debug, and re-run, since these may be results from a prior run.
Not sure if your target has it, but disabling address space randomization helps debugging, too (setarch linux64 -R <workload>).
Kim
On Thu, Mar 22, 2018 at 05:03:48PM +0800, Kim Phillips wrote:
On Thu, 22 Mar 2018 07:52:15 +0000 Mike Leach mike.leach@linaro.org wrote:
Hi, sorry, just jumping in with a couple of extra recommendations...
On 20 March 2018 at 15:34, Mike Bazov mike@perception-point.io wrote:
Thank you for your answers. I fixed the issue by applying the patches in the pull-request. I'm not
If this means you cherry-picked them without rebasing them onto a very late version of perf, then please do that, too. Preferably this one:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
because its perf/core branch contains some recent fixes in the area.
+1. I previously ran into quite a similar issue [1]; after I switched to the acme branch, the kernel symbol issue was fixed.
[1] https://patchwork.kernel.org/patch/10115523/
Thanks, Leo Yan
Hi,
Thank you for your answers.
1) I updated to acme's "perf/core" branch, and the problem is still not solved. I still don't get any events for my recorded traces, unless I record "uname -r".
2) I'm trying to integrate directly with OpenCSD and decode my own traces. I took the "perf.data" file from the HOWTO.md, which contains a valid trace according to the HOWTO, and extracted the trace itself from the perf file, so I have a raw trace file without any perf-related data or structures. I'm trying to open my own decoder and decode instructions, based on examples from "tools/perf/util/cs-etm-decoder/cs-etm-decoder.c" and "tools/perf/util/cs-etm.c". I found no complete documentation on how to use OpenCSD as a library, only the partial docs in the repo.
This is a very short example of what I do (in the same order as in the code) to create a decode tree and a decoder. The error checks are there in the real code; I have left them out to keep this message as short as possible:

    /* Open a decoder tree */
    flags |= OCSD_DFRMTR_FRAME_MEM_ALIGN;
    flags |= OCSD_DFRMTR_RESET_ON_4X_FSYNC;
    decoder_tree = ocsd_create_dcd_tree(OCSD_TRC_SRC_FRAME_FORMATTED,
                                        OCSD_DFRMTR_RESET_ON_4X_FSYNC);

    /* Create a __full__ decoder */
    config.arch_ver = ARCH_V8;
    config.core_prof = profile_CortexA;
    /* specific hard-coded trace ID, identical to perf's, just to check if it works */
    config.reg_traceidr = 18;
    ocsd_err = ocsd_dt_create_decoder(decoder_tree, OCSD_BUILTIN_DCD_ETMV4I,
                                      OCSD_CREATE_FLG_FULL_DECODER,
                                      &config, &csid);

    /* Register my element callback */
    ocsd_err = ocsd_dt_set_gen_elem_outfn(decoder_tree,
                                          decoder_callback, NULL);
Now I mmap() my trace file and do the following:

    /* Reset the decoder */
    val = ocsd_dt_process_data(dcd_tree, OCSD_OP_RESET, 0, 0, NULL, NULL);
    if (OCSD_DATA_RESP_IS_FATAL(val)) {
        return -1;
    }

From here I call ocsd_dt_process_data() to fully decode the trace, just like the code in cs_etm_decoder__process_data_block() in "tools/perf/util/cs-etm-decoder/cs-etm-decoder.c:518". The problem is that resetting the decoder fails fatally: the "OCSD_DATA_RESP_IS_FATAL" check fires with the response "OCSD_RESP_FATAL_NOT_INIT". I'm not sure what I did wrong, as I followed the perf code exactly.
Any suggestions? Thank you!
Cheers, Mike.
Hi Mike,
You are correct - we do need additional documentation on how to create a program using the decoder.
To create a valid decode environment you need the following:

1) Most obvious: some trace, normally formatted in 16-byte CoreSight trace frames.

2) For each core traced, the configuration of the ETM attached to that core.

If full decode is required, rather than the packet dump seen in perf report --dump, then also:

3) The memory images of the programs executed, with information on the addresses at which these images were loaded.
All the above appears in the perf.data file, and is extracted and added to the decode tree.
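As an illustration of the 16-byte frame format mentioned in item 1, here is a minimal sketch in C that lists which trace source IDs are announced in a buffer of formatted frames. It is not part of OpenCSD; the function name cs_frames_collect_ids is hypothetical. It assumes memory-aligned frames with no frame-sync packets in the stream, and relies on the CoreSight formatter convention that an even-numbered byte (0 to 14) with bit 0 set carries a new source ID in bits 7:1, while byte 15 holds auxiliary flags. Something like this can help confirm that the trace ID given to the decoder actually occurs in the data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CS_FRAME_SIZE 16u

/*
 * Mark every trace source ID announced in a buffer of 16-byte formatted
 * frames. seen[] must have 128 entries, since IDs are 7 bits wide.
 * Assumes aligned frames without frame-sync packets.
 * Returns the number of complete frames examined.
 */
size_t cs_frames_collect_ids(const uint8_t *buf, size_t len, bool seen[128])
{
    size_t frames = len / CS_FRAME_SIZE;

    for (size_t f = 0; f < frames; f++) {
        const uint8_t *frame = buf + f * CS_FRAME_SIZE;

        /* Only even bytes 0..14 can carry an ID; byte 15 is the flag byte. */
        for (size_t i = 0; i < CS_FRAME_SIZE - 1; i += 2) {
            if (frame[i] & 0x1)
                seen[frame[i] >> 1] = true;
        }
    }
    return frames;
}
```

Calling it on the mmap()ed trace with a zero-initialised bool seen[128] shows which IDs are present, which can then be compared with the value placed in the decoder's trace ID register field.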
I suggest looking at the test programs to get an idea of how to use the decoder - especially trc_pkt_lister.cpp. This program processes trace snapshots, so uses the snapshot library to create a decode tree. [CreateDcdTreeFromSnapShot::createDecodeTree() in ss_to_dcdtree.cpp ] The c_api_pkt_print_test.c is similar but for the C-API wrapper.
The principles are the same though. The packet lister will...
1) create an initial decode tree, normally with a CoreSight trace frame demux working on aligned trace data (OCSD_TRC_SRC_FRAME_FORMATTED | OCSD_DFRMTR_FRAME_MEM_ALIGN).

2) for each CPU/PE, call CreateDcdTreeFromSnapShot::createPEDecoder(), which calls a protocol-specific function.

3) e.g. createETMv4Decoder() populates an ETMv4 config structure and passes it to the decode tree.

4) for the "dump" files, i.e. memory images, these are added in the CreateDcdTreeFromSnapShot::processDumpfiles() function, using addBinFileMemAcc(). These are added to the decode tree as file images, with a load address for the first byte in the file.
For memory images, perf adds the saved ELF files that perf.data lists as loaded while the program ran, and which it stores in .debug in the user's home directory. In this case these are added to the memory accessor with both a load address and the offset of the code section within the file. Thus, if perf cannot find the programs in the .debug directory, an essential part of the decode process is missing.
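The load address plus code-section offset bookkeeping can be sketched as below. This is a self-contained illustration only, not the OpenCSD or perf implementation; the struct image_region and the function image_addr_to_file_offset are hypothetical names. It shows how a traced address is turned into a position in the saved file, and why an address outside every registered region cannot be decoded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one code region taken from a saved ELF file. */
struct image_region {
    uint64_t load_addr;   /* run-time address of the region's first byte */
    uint64_t file_offset; /* offset of that first byte within the file */
    uint64_t size;        /* length of the region in bytes */
};

/*
 * Translate a traced address into an offset within the file.
 * Returns false if the address is not covered by the region, which is the
 * situation a decoder faces when the matching image was never registered.
 */
bool image_addr_to_file_offset(const struct image_region *r,
                               uint64_t addr, uint64_t *off)
{
    if (addr < r->load_addr || addr - r->load_addr >= r->size)
        return false;

    *off = r->file_offset + (addr - r->load_addr);
    return true;
}
```

When only the raw bytes of the code section are saved to their own file, file_offset is 0 and the load address must be the address of the section's first byte.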
Finally, sinks for the output of the trace decode can be added - in this case packet printers. For full decode a single printer is used as fully decoded trace is output at a single point with the trace ID associated. When printing raw trace packets, an output sink per decoder is used.
From the code you show, I can see 1), but no evidence that 2) or 3) are occurring.
Regards
Mike
CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight
Hi Mike,
The c_api_pkt_print_test.c is probably a simpler example of how to create a valid decode tree.
This program sets up a decoder for a single trace source from a block of captured trace data. It uses the same snapshot data, but has a lot of the config parameters hard coded into the program, rather than use the snapshot library to read them from files. Start at process_trace_data() and work from there.
Mike
-- Mike Leach Principal Engineer, ARM Ltd. Blackburn Design Centre. UK
Hi,
Thank you for your detailed answers Mike! i appreciate it!
1) I went through c_api_pkt_print_test.c, starting from process_trace_data(). As far as I understand, the only thing missing from the decoder-creation steps in my previous message was registering a bin file at a certain linear address. I did not add the bin files of all the executed libraries; I assumed this would simply mean I only get the instructions executed in the given bin file. I extracted the text section from my ELF file (which is loaded at a certain virtual address) and added this:

    /* add binfile containing opcodes */
    ocsd_err = ocsd_dt_add_binfile_mem_acc(decoder_tree, 0x400280,
                                           OCSD_MEM_SPACE_ANY, "./check.bin");

    /* add mem access callback */
    ocsd_err = ocsd_dt_add_callback_mem_acc(decoder_tree, 0, (uint64_t) -1,
                                            OCSD_MEM_SPACE_ANY,
                                            decoder_mem_callback, NULL);

    /* add packet callback */
    ocsd_err = ocsd_dt_attach_packet_callback(decoder_tree, csid,
                                              OCSD_C_API_CB_PKT_MON,
                                              packet_monitor, NULL);

Now I get no error when decoding the trace, but my callback for processed elements is never called, and my packet callback is called only once, with "op=OCSD_OP_RESET". I expected an element callback with element type "OCSD_GEN_TRC_ELEM_INSTR_RANGE", but this didn't happen. I inspected the trace packets using this tool: https://github.com/hwangcc23/ptm2human and honestly the packets look valid. (I sometimes get "ADDRESS" packets that claim they ran in 32-bit state, even though the CPU is 64-bit [ARMv8] and the ELF is 64-bit; I don't think that is the issue.) I can see "ADDRESS" packets with the expected executed addresses, though.
2) To be sure the lack of elements isn't because I didn't add all the binfiles, I tried to make my perf recording as minimal as possible by filtering on userspace addresses only, and more specifically on addresses that belong to my program only, but this fails with the following error:

    sudo ./perf record -e cs_etm/@826000.etr/u --filter 'filter 0x4002b4/0x48' --per-thread ./check
    failed to set filter "filter 0x4002b4/0x48" on event cs_etm/@826000.etr/u with 22 (Invalid argument)

Kernel address filtering does work, though:

    sudo ./perf record -e cs_etm/@826000.etr/k --filter 'filter 0xffffff8008562d0c/0x48' --per-thread ./check
    Sleep[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.006 MB perf.data ]
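For reference, the start/size notation used in the two filter strings above can be sketched as follows. This is a hedged illustration, not perf's own parser: the names addr_range, parse_filter and range_contains are hypothetical, and the sketch assumes the filter describes the half-open range [start, start + size). It only handles the plain "filter <start>/<size>" form shown in the commands above.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical representation of one address filter. */
struct addr_range {
    uint64_t start; /* first address covered */
    uint64_t size;  /* number of bytes covered */
};

/* Parse "filter <start>/<size>", with both fields in hexadecimal. */
bool parse_filter(const char *spec, struct addr_range *out)
{
    uint64_t start, size;

    if (sscanf(spec, "filter %" SCNx64 "/%" SCNx64, &start, &size) != 2)
        return false;

    out->start = start;
    out->size = size;
    return true;
}

/* True if addr lies in [start, start + size); written to avoid overflow. */
bool range_contains(const struct addr_range *r, uint64_t addr)
{
    return addr >= r->start && addr - r->start < r->size;
}
```

With the userspace filter above, the covered range would run from 0x4002b4 up to, but not including, 0x4002fc.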
Any thoughts on my questions? Thank you!