Mike Leach wrote:
Hi Vincent,

PTM and ETMv4 support what is called program flow trace, where trace elements (E/N atoms) are only output on potential changes in program flow - primarily branch instructions - unlike ETMv3.x, which outputs an atom per instruction. This does not mean that the intermediate instructions have not been traced / executed; it's just that their execution is implied rather than explicitly stated.

Looking at the first few packets:

ADDRESS AND CONTEXT: addr = 0xffffff80080fbfc4 EL = 0x1 SF = 0x1 NS = NON_SECURE V = 0x0 VMID = 0x0 CONTEXT_ID = 0x0 <00f4:85 71 5f 0f 08 80 ff ff ff 31 >

This is the start address of the traced section of the program.

ATOM packet format 2: ATOMS: N, E, <00fe:da >

These are a couple of branch elements - one branch not taken, one branch taken.

The decoder will then get the program image for the code loaded at 0xffffff80080fbfc4 and examine consecutive opcodes until one matches a branch instruction (or other program flow element):

<dequeue_entity+0xa94>
  ffffff80080fbfc4: b9401b40
  ffffff80080fbfc8: 8b204301
  ffffff80080fbfcc: f10ffc3f
N ffffff80080fbfd0: 54002028 b.hi ffffff80080fc3d4 <dequeue_entity+0xea4>

Here the decoder walks four instructions and finds the B.HI instruction, with which it associates the N atom. It can disassemble this instruction to calculate the direct target address, so that address is not output in the trace. The previous three instructions are implied as executed as well; this trace client decoder is simply not providing any disassembly for them. The branch is not taken, so decode continues with the next instruction.

  ffffff80080fbfd4: 9b1c7f04
  ffffff80080fbfd8: 2a1803ea
  ffffff80080fbfdc: 52800001
  ffffff80080fbfe0: d34afc87
E ffffff80080fbfe4: b5003519 cbnz x25, ffffff80080fc684 <dequeue_entity+0x1154>

The decoder walks 5 instructions to find an indirect branch that is taken, and associates the E atom with it. Again, all 5 instructions have been executed. The branch target cannot be calculated from the instruction, so the trace has to output the target address:

ADDRESS AND CONTEXT:L_64_IS0 addr = 0xffffff80080fc684 <00ff:9d 21 63 0f 08 80 ff ff ff >

This is then used to determine where trace continues from - 0xffffff80080fc684. And so on...

The OpenCSD library itself does no disassembly, other than that necessary for detection of key instructions and calculation of target addresses. It takes in the raw trace and the program image supplied by a client (e.g. perf) and outputs information that the client can use for further processing. For example, for this snippet the OpenCSD output would be something like:

ADDR_RANGE(ffffff80080fbfc4-ffffff80080fbfd0, N)
ADDR_RANGE(ffffff80080fbfd4-ffffff80080fbfe4, E)

perf then takes this and processes it accordingly, depending on the type of information requested on the command line (flame graphs, disassembly, etc). A full-featured GUI debugger might use an .elf file to show disassembly and C source lines for the same trace.

On Mon, 10 Feb 2025 at 17:46, vincent.ernst@web.de wrote:
> Hi Mike,
> I was actually able to set up the Nvidia driver, collect some trace data and decode it with Nvidia's mem_parser (see attachment). Comparing this to the trace output in the OpenCSD how-to, it seems to me that the ETMs on my device only support branch instruction trace and do not trace every instruction. Or does using perf + OpenCSD provide additional trace information compared to what I am getting right now?

The trace information is all that is necessary and sufficient to reconstruct the execution of the program. How much of that information is used is a function of the decoder / client program.
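As a rough illustration of that reconstruction, here is a minimal Python sketch of the decode walk described earlier in this thread, fed with the raw bytes from the packets quoted there. It is not OpenCSD code: the function names, the `BRANCHES` set standing in for the program image, and the bit layouts in the two `decode_*` helpers are assumptions, checked only against the values shown in this thread.

```python
INSTR_SIZE = 4  # A64 opcodes are fixed-width

# Stand-in for the program image: addresses of the two branch instructions
# in the quoted disassembly. Every other opcode is treated as a non-branch.
BRANCHES = {0xffffff80080fbfd0, 0xffffff80080fbfe4}

def decode_addr64_is0(payload):
    """Assumed layout: addr[8:2] and addr[15:9] in the low 7 bits of the
    first two payload bytes, then one byte each for addr[23:16]..addr[63:56]."""
    addr = ((payload[0] & 0x7F) << 2) | ((payload[1] & 0x7F) << 9)
    for i, byte in enumerate(payload[2:8]):
        addr |= byte << (16 + 8 * i)
    return addr

def decode_atom_f2(header):
    """Assumed layout: two atoms in header bits [1:0], first atom in bit 0,
    1 = E (taken), 0 = N (not taken)."""
    return ["E" if (header >> i) & 1 else "N" for i in range(2)]

def walk(start, atoms, branches, max_instrs=64):
    """Associate each atom with the next branch found from the current
    address. Returns (first, branch_addr, atom) tuples, end inclusive."""
    ranges = []
    addr = start
    for atom in atoms:
        first = addr
        steps = 0
        while addr not in branches:
            addr += INSTR_SIZE
            steps += 1
            if steps > max_instrs:
                raise ValueError("no branch found in program image")
        ranges.append((first, addr, atom))
        if atom == "E":
            break  # taken: continue from an address supplied separately
        addr += INSTR_SIZE  # not taken: fall through to the next opcode
    return ranges

# Payload bytes of the first address packet (header 0x85 and the trailing
# context byte 0x31 are left out), then the atom packet header 0xda.
start = decode_addr64_is0(bytes.fromhex("715f0f0880ffffff"))
atoms = decode_atom_f2(0xDA)
ranges = walk(start, atoms, BRANCHES)
# Payload of the second address packet (header 0x9d left out).
resume = decode_addr64_is0(bytes.fromhex("21630f0880ffffff"))

for first, last, atom in ranges:
    print(f"ADDR_RANGE({first:x}-{last:x}, {atom})")
print(f"continue from {resume:x}")
```

The sketch stops at the first E atom and takes the continuation address from the following address packet, as in the snippet above. A real decoder also handles direct branch targets, exceptions, other packet types and instruction sets with variable opcode sizes.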
Since the Nvidia driver + decoder seem to work now, I think I will stick to them and not try to backport perf. Still no clue why tracing with the Linaro drivers does not work, though.
> If not, trying to backport perf would not be necessary for me anymore.

We discussed the kernel versions that support CoreSight in one of our internal meetings this week. A colleague pointed out that Nvidia appears to have released a 5.15-based kernel for Jetson.
Yes, newer Jetson devices have kernel versions up to 5.15. But the device that I use (Jetson Nano) is stuck at kernel 4.9, unfortunately.
I think I have all the trace data that I need now. Thank you very much for your help and explanations!
Best regards, Vincent