Hi all,
I have been using the CoreSight PTM component on Zynq for more than two years. I started out by writing a simple library to program these components on a “bare-metal” system (without an OS). Then I moved to Linux, and Mathieu Poirier (I can’t thank him enough) helped me a lot during this phase. So far, I have been tracing small portions of my applications, and the amount of trace generated was small. I was getting the expected trace, i.e. for each branch (direct or indirect), I was getting a branch address packet in my trace. Now I have started tracing the whole .text section of binaries, and I do not understand the trace I obtain.
Here is how I configure Linux kernel driver (Kernel v4.9):
cd /sys/bus/coresight/devices/f889c000.ptm0
echo 1 > addr_idx
echo 0 > addr_acctype
echo 0 > addr_idx
echo 0 > addr_acctype
echo 20 > mode
echo 100e0 104b4 > addr_range # These two addresses represent the beginning and end of .text section
Then, I enable the trace sink component (either ETB or TPIU) and trace source (PTM) component.
cd /sys/bus/coresight/devices/
echo 1 > f8801000.etb/enable_sink
echo 1 > f889c000.ptm0/enable_source
Then, I run my application and stop tracing.
./application.elf
./disable # simply writes 0 to each enabled component (source and sink)
Then, I recover the trace using dd.
When I trace small portions of my application, the obtained trace gives the right behavior. I check it manually by looking at objdump of the binary.
However, when I trace the whole .text section of the application, the amount of trace obtained is very small (even smaller than when I trace only the main function of the application), which is quite strange to me. Basically, the obtained trace goes through the libc functions that call the main function, and it stops while it is still in libc. I don’t understand why I am getting this strange behavior. Do you have any idea what I am doing wrong?
I have attached the source code of the binary that I am trying to trace.
Thank you for your help and time.
Best regards,
Muhammad
PS: Sorry, I formatted my previous mail in markdown rather than HTML. This should be more readable than the previous mail.
On 5 April 2018 at 02:38, Muhammad Abdul WAHAB muhammadabdul.wahab@centralesupelec.fr wrote:
Then, I enable the trace sink component (either ETB or TPIU) and trace source (PTM) component.
You have a TPIU driver?
cd /sys/bus/coresight/devices/
echo 1 > f8801000.etb/enable_sink
echo 1 > f889c000.ptm0/enable_source
How many CPUs on this system? Are you sure the application runs on PTM0? Could it be running on another processor?
I will offer (without looking at your above configuration) that by default the tracers are configured to trace the whole address range. As such can you tell me what you're getting if you enable the tracers without doing any configuration of your own? It could also be that the default configuration is clashing with your new configuration but that's a conversation we can have in an upcoming email once you've answered the above questions.
Mathieu
CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight
Hi Mathieu,
On 05/04/2018 at 17:27, Mathieu Poirier wrote:
You have a TPIU driver?
Yes, I slightly modified the existing TPIU driver and it worked. Luckily, the TPIU does not require much configuration. I am not enabling the TPIU formatter because I am tracing only a single CPU core. As the driver is incomplete (no formatter support), I did not submit it as a patch.
How many CPUs on this system? Are you sure the application runs on PTM0? Could it be running on another processor?
I have two CPUs on my system. I made sure that the program runs on the CPU that I am tracing by turning off the second core through sysfs.
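As an aside, taking the second core offline is one way to guarantee where the program runs; pinning the process to the traced core is another. The following is a minimal sketch, assuming a Linux system and Python 3; the helper name `pin_to_cpu` is hypothetical and is not part of any CoreSight tooling.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU and return the new mask.

    Child processes started afterwards inherit the same affinity, so an
    application launched from this process stays on the traced core.
    """
    os.sched_setaffinity(0, {cpu})  # pid 0 means "the calling process"
    return os.sched_getaffinity(0)
```

Calling `pin_to_cpu(0)` before starting `./application.elf` (for example with `subprocess.run`) would keep it on the core attached to `f889c000.ptm0`, assuming that PTM is indeed the one wired to CPU 0.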
I will offer (without looking at your above configuration) that by default the tracers are configured to trace the whole address range. As such can you tell me what you're getting if you enable the tracers without doing any configuration of your own? It could also be that the default configuration is clashing with your new configuration but that's a conversation we can have in an upcoming email once you've answered the above questions.
If I enable the trace sink and source components without any configuration of my own, I get some trace that consists mainly of kernel code (addresses above 0xc0000000). For example, the output of dd followed by hexdump looks like the trace below.
00000000 ab e3 db 0d 72 37 85 80 fe ff 4f 14 d1 f6 81 80 |....r7....O.....|
00000010 0e 77 86 9f 86 24 86 a9 05 8e c5 99 a0 01 c3 d9 |.w...$..........|
00000020 88 00 8e 95 1a 86 f1 bc 0c 86 9f da 08 53 f1 bc |.............S..|
00000030 0c 85 3d d9 da 08 23 cd 99 a0 01 86 69 d9 85 a4 |..=...#.....i...|
00000040 00 f5 f6 23 e9 38 05 df 85 24 9e d5 f5 23 9e bf |...#.8...$...#..|
00000050 36 2d fd 9b a0 01 f1 da 88 00 9e bf 1b 8f 9c a0 |6-..............|
00000060 01 86 b1 f6 a3 00 86 b7 82 20 86 51 86 ff 83 28 |......... .Q...(|
00000070 dd 82 20 e1 a9 29 e3 82 20 c1 c0 20 86 e7 82 20 |.. ..).. .. ... |
00000080 bb f6 23 e1 f5 01 86 b9 ab e3 db 0d 86 cf 89 1c |..#.............|
00000090 fb d3 82 80 08 d1 97 02 e1 f2 db db 0d ff d3 82 |................|
000000a0 80 08 b7 98 02 e9 aa e3 db 0d 9d f9 1a 33 f1 aa |.............3..|
000000b0 23 e7 9e 22 d3 2e 73 89 22 86 72 1b 85 80 fe ff |#.."..s.".r.....|
000000c0 4f 14 d1 f6 81 80 0e 77 86 f5 82 20 e1 81 24 c5 |O......w... ..$.|
000000d0 99 a0 01 c3 d9 88 00 8e 95 1a 86 f1 bc 0c 86 9f |................|
000000e0 da 08 53 f1 bc 0c 85 3d d9 da 08 23 cd 99 a0 01 |..S....=...#....|
000000f0 86 69 ed 81 a4 00 8e a3 02 b1 f2 23 a9 82 24 fd |.i.........#..$.|
00000100 9b a0 01 f1 da 88 00 9e bf 1b 8f 9c a0 01 86 ad |................|
00000110 82 a4 00 b7 82 20 86 51 86 ff 83 28 dd 82 20 e1 |..... .Q...(.. .|
00000120 a9 29 e3 82 20 c1 c0 20 86 e7 82 20 85 03 86 e1 |.).. .. ... ....|
00000130 f5 01 86 9d a2 e2 db 0d 83 2f 2b 83 ab 23 cf 89 |........./+..#..|
00000140 1c 87 d4 82 80 08 d1 97 02 e1 f2 db db 0d 89 d4 |................|
00000150 82 80 08 a5 03 86 c3 9e 02 86 99 1f 8f d4 02 b3 |................|
00000160 94 02 bd 88 e3 db 0d 86 8d 09 ff 89 1c 95 0a 86 |................|
00000170 13 86 13 86 13 86 13 86 13 86 13 86 13 86 13 86 |................|
... and so on.
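For anyone who wants to feed the dump above to a packet decoder, the text has to be turned back into bytes first. Here is a minimal sketch, assuming the text is in `hexdump -C` layout (offset, hex bytes, then an ASCII column between `|` characters); the function name `parse_hexdump` is hypothetical.

```python
def parse_hexdump(text):
    """Convert `hexdump -C` style text back into the bytes it represents."""
    data = bytearray()
    for line in text.splitlines():
        # Drop the ASCII column, then split into offset + hex byte tokens.
        fields = line.split("|", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, "*" repeat marker, or a bare final offset
        try:
            int(fields[0], 16)  # first field must be a hexadecimal offset
            row = bytes(int(tok, 16) for tok in fields[1:])
        except ValueError:
            continue  # not a data row, e.g. "... and so on."
        data.extend(row)
    return bytes(data)
```

Rows collapsed by a `*` repeat marker are not expanded in this sketch, so the result is only complete for dumps that list every row.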
How can I make sure that there are no conflicts in the kernel configuration?
Best regards,
Muhammad
On 5 April 2018 at 10:57, Muhammad Abdul WAHAB muhammadabdul.wahab@centralesupelec.fr wrote:
How can I make sure that there are no conflicts of kernel configuration ?
You will likely have to instrument the code. When tracers are enabled I would look at the final configuration (via sysFS), to make sure what you have there is really what you wrote in the configuration registers.
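The read-back check suggested here can be sketched as follows. This is only an illustration: the helper name `config_mismatches` is hypothetical, the attribute names are the ones used earlier in this thread (`mode`, `addr_range`, ...), and it assumes the driver reports these values in hexadecimal. Attributes such as `addr_range` appear to report the comparator pair currently selected by `addr_idx`, so that index should be set before reading.

```python
from pathlib import Path


def _as_hex_words(value):
    """Parse whitespace-separated hex words; fall back to the raw string."""
    try:
        return [int(tok, 16) for tok in value.split()]
    except ValueError:
        return value.strip()


def config_mismatches(device_dir, expected):
    """Compare sysfs attributes with the values that were written.

    `expected` maps attribute name -> string written with echo.
    Returns {attribute: (expected, actual)} for every attribute that differs.
    """
    mismatches = {}
    for attr, want in expected.items():
        got = (Path(device_dir) / attr).read_text().strip()
        if _as_hex_words(got) != _as_hex_words(want):
            mismatches[attr] = (want, got)
    return mismatches
```

For the configuration shown at the start of the thread, a call might look like `config_mismatches("/sys/bus/coresight/devices/f889c000.ptm0", {"mode": "20", "addr_range": "100e0 104b4"})`; an empty result means the values read back match what was written.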
Hi,
How are you determining that the trace addresses are mainly in the kernel? Do you have a dump of the decoded trace packets?
If you are getting the trace from the ETB, then the coresight frame formatter will be running, so any trace will have to be first extracted from the frame format before any decode.
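To illustrate what extracting trace from the frame format involves, here is a minimal sketch of a deformatter for 16-byte CoreSight formatter frames, written from my understanding of the frame layout: even-numbered bytes carry either a 7-bit trace source ID (bit 0 set) or data (bit 0 clear, with the real bit 0 held in byte 15), and odd-numbered bytes are always data. The function name `deformat` is hypothetical, and frame synchronisation packets and other corner cases are not handled.

```python
def deformat(raw):
    """Split CoreSight formatter frames into per-source-ID byte streams."""
    streams = {}
    current = None  # trace source ID the following data bytes belong to
    for off in range(0, len(raw) - len(raw) % 16, 16):
        frame = raw[off:off + 16]
        aux = frame[15]      # one auxiliary bit per even-numbered byte
        pending = None       # ID change that is delayed by one data byte
        for i in range(15):
            byte = frame[i]
            if i % 2 == 0:
                flag = (aux >> (i // 2)) & 1
                if byte & 1:
                    # ID byte: flag=1 means the next byte still belongs
                    # to the old ID, flag=0 means switch immediately.
                    if flag and i < 14:
                        pending = byte >> 1
                    else:
                        current = byte >> 1
                    continue
                # Data byte: its least significant bit lives in byte 15.
                byte = (byte & 0xFE) | flag
            if current is not None:
                streams.setdefault(current, bytearray()).append(byte)
            if pending is not None:
                current, pending = pending, None
    return streams
```

With the formatter disabled, as in the modified ETB driver discussed in this thread, the buffer already holds the raw PTM byte stream and this step does not apply.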
Looking at the source code you sent, if you limit the traced region to the code area in your application, then you might expect the trace to contain addresses outside this area as the PTM trace consists of "waypoint" addresses, not individual instructions executed. The decoder determines the executed trace by following the source between waypoint addresses.
A waypoint address will be generated whenever you move out of the specified trace region, so I would guess that each time you call the sqrt() function, an address outside the main program will appear in the trace. Once this function returns you may see a second address output as the trace restarts - this time this should be at the return address for the function.
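One quick way to check this against a decoded trace is to separate the waypoint addresses that fall inside the configured range from those that do not. The sketch below assumes the addresses have already been decoded into integers; the bounds are the `addr_range` values quoted earlier in the thread, the end address is treated as inclusive (an assumption), and the name `split_by_region` is hypothetical.

```python
TEXT_START = 0x100E0  # start of .text, from the addr_range setting
TEXT_END = 0x104B4    # end of .text, treated as inclusive here


def split_by_region(addresses, start=TEXT_START, end=TEXT_END):
    """Return (inside, outside) lists of addresses, preserving trace order."""
    inside = [a for a in addresses if start <= a <= end]
    outside = [a for a in addresses if not (start <= a <= end)]
    return inside, outside
```

Addresses in the `outside` list are not necessarily errors: they can be the targets of calls that leave the traced region, as described above.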
It may be worthwhile trying to run your data through the trc_pkt_lister test program. To do this you can adapt one of the tests\snapshots to your data. If you look at the TC2 snapshot this has PTM trace in it - the .ini files here could be adapted.
Regards
Mike
On 09/04/2018 at 15:03, Mike Leach wrote:
Hi,
Hi Mike,
How are you determining that the trace addresses are mainly in the kernel? Do you have a dump of the decoded trace packets?
Actually, I enabled tracing and did not launch any program. I just wanted to make sure that the default configuration generates trace. In the decoded trace, the addresses I received are for the most part kernel addresses, which seems right.
Here is how I configure CoreSight components.
The first thing I do is disable the second CPU so that everything runs on the CPU that I am tracing. For the ETB, I disabled formatting (by clearing the ETB_FFCR_EN_FTC bit in the kernel driver) because I am only interested in raw trace. For the PTM, I did not change the default configuration. I just enabled the ETB and the PTM for CPU 0. I found a decoder at https://www.spinics.net/lists/arm-kernel/msg207080.html that only decodes raw trace, and I use it to decode the raw trace.
I have the decoded trace in simple text format. If you want to have a look, I can send it to you.
If you are getting the trace from the ETB, then the coresight frame formatter will be running, so any trace will have to be first extracted from the frame format before any decode.
In the coresight-etb10.c file, I replaced the following line:
writel_relaxed(ETB_FFCR_EN_FTC | ETB_FFCR_STOP_TRIGGER, drvdata->base + ETB_FFCR);
with
writel_relaxed(ETB_FFCR_STOP_TRIGGER, drvdata->base + ETB_FFCR);
so that the formatter is not enabled.
Looking at the source code you sent, if you limit the traced region to the code area in your application, then you might expect the trace to contain addresses outside this area as the PTM trace consists of "waypoint" addresses, not individual instructions executed. The decoder determines the executed trace by following the source between waypoint addresses.
Actually, I am compiling the application statically to make sure that I can get all the trace for the application (even for the libraries).
A waypoint address will be generated whenever you move out of the specified trace region, so I would guess that each time you call the sqrt() function, an address outside the main program will appear in the trace. Once this function returns you may see a second address output as the trace restarts - this time this should be at the return address for the function.
Yes, I observe this behaviour when I trace a small portion of an application. Each time a BL instruction is executed, I get the function address and the return address. The strange thing is that the trace stops before I have received all of it.
It may be worthwhile trying to run your data through the trc_pkt_lister test program. To do this you can adapt one of the tests\snapshots to your data. If you look at the TC2 snapshot this has PTM trace in it - the .ini files here could be adapted.
I think I have to trace it using perf, see what I get, and then adapt my configuration accordingly. Is there another way to disable the formatter from sysfs?
I have developed a decoder for the PFT protocol in the FPGA part of the Zynq SoC and have been using it so far with the TPIU to decode traces on the FPGA. The decoder expects raw trace. This is why I disable formatting.
Regards Mike
Thank you so much for your help.
Best regards,
Muhammad
PS: Sorry for my late reply.
On 13 April 2018 at 10:54, Muhammad Abdul WAHAB muhammadabdul.wahab@centralesupelec.fr wrote:
The first thing I do is disable the second CPU so that everything runs on the CPU that I am tracing. For the ETB, I disabled formatting (by clearing the ETB_FFCR_EN_FTC bit in the kernel driver) because I am only interested in raw trace. For the PTM, I did not change the default configuration. I just enabled the ETB and the PTM for CPU 0. I found a decoder at https://www.spinics.net/lists/arm-kernel/msg207080.html that only decodes raw trace, and I use it to decode the raw trace.
Did you make sure the options the PTM tracer has been configured with are the same as what the decoder thinks it is getting? I looked at the decoder you've referenced very early on in the project and remember that it hard-codes the configuration options, which almost guarantees it won't decode the traces you're giving it without modifications.
I think I have to trace it using perf, see what I get, and then adapt my configuration accordingly. Is there another way to disable the formatter from sysfs?
Unfortunately trace decoding isn't supported yet for ARMv7, so trying to decode traces with perf won't work. Fixing this is very high on my priority list, but I still can't give you an estimate.
I have developed a decoder for the PFT protocol in the FPGA part of the Zynq SoC and have been using it so far with the TPIU to decode traces on the FPGA. The decoder expects raw trace. This is why I disable formatting.
This sounds interesting and I'd like a little more clarification if you don't mind.
1) Are you doing trace decoding in the FPGA itself?
2) Can you give me the name of the board you're working with? I'm interested in the TPIU.
3) What decoding box is externally connected to the TPIU port?
Regards Mike
Thank you so much for your help.
Best regards, Muhammad PS: Sorry for my late reply.
Le 09/04/2018 à 15:03, Mike Leach a écrit :
Hi,
How are you determining that the trace addresses are mainly in the kernel? Do you have a dump of the decoded trace packets?
If you are getting the trace from the ETB, then the coresight frame formatter will be running, so any trace will have to be first extracted from the frame format before any decode.
Looking at the source code you sent, if you limit the traced region to the code area in your application, then you might expect the trace to contain addresses outside this area as the PTM trace consists of "waypoint" addresses, not individual instructions executed. The decoder determines the executed trace by following the source between waypoint addresses.
A waypoint address will be generated whenever you move out of the specified trace region, so I would guess that each time you call the sqrt() function, an address outside the main program will appear in the trace. Once this function returns you may see a second address output as the trace restarts - this time this should be at the return address for the function.
It may be worthwhile trying to run your data through the trc_pkt_lister test program. To do this you can adapt one of the tests\snapshots to your data. If you look at the TC2 snapshot this has PTM trace in it - the .ini files here could be adapted.
Regards
Mike
On 6 April 2018 at 15:37, Mathieu Poirier mathieu.poirier@linaro.org wrote:
On 5 April 2018 at 10:57, Muhammad Abdul WAHAB muhammadabdul.wahab@centralesupelec.fr wrote:
Hi Mathieu,
Le 05/04/2018 à 17:27, Mathieu Poirier a écrit :
You have a TPIU driver?
Yes, I slightly modified the existing TPIU driver and it worked. Luckily, the TPIU does not require much configuration. I am not enabling the formatter of TPIU because I am tracing only single CPU core. As it is incomplete (no formatter), I did not submitted it as a patch.
How many CPUs on this system? Are you sure the application runs on PTM0? Could it be running on another processor?
I have two CPUs on my system. I made sure that the program runs on the CPU that I am tracing by turning off the second core through sysfs.
I will offer (without looking at your above configuration) that by default the tracers are configured to trace the whole address range. As such can you tell me what you're getting if you enable the tracers without doing any configuration of your own? It could also be that the default configuration is clashing with your new configuration but that's a conversation we can have in an upcoming email once you've answered the above questions.
If I enable the trace sink and source components without any configuration of my own, I get some trace that consists mainly of kernel code (addresses above 0xc0000000). For example, the output of dd followed by hexdump looks like the trace below.
00000000 ab e3 db 0d 72 37 85 80 fe ff 4f 14 d1 f6 81 80 |....r7....O.....|
00000010 0e 77 86 9f 86 24 86 a9 05 8e c5 99 a0 01 c3 d9 |.w...$..........|
00000020 88 00 8e 95 1a 86 f1 bc 0c 86 9f da 08 53 f1 bc |.............S..|
00000030 0c 85 3d d9 da 08 23 cd 99 a0 01 86 69 d9 85 a4 |..=...#.....i...|
00000040 00 f5 f6 23 e9 38 05 df 85 24 9e d5 f5 23 9e bf |...#.8...$...#..|
00000050 36 2d fd 9b a0 01 f1 da 88 00 9e bf 1b 8f 9c a0 |6-..............|
00000060 01 86 b1 f6 a3 00 86 b7 82 20 86 51 86 ff 83 28 |......... .Q...(|
00000070 dd 82 20 e1 a9 29 e3 82 20 c1 c0 20 86 e7 82 20 |.. ..).. .. ... |
00000080 bb f6 23 e1 f5 01 86 b9 ab e3 db 0d 86 cf 89 1c |..#.............|
00000090 fb d3 82 80 08 d1 97 02 e1 f2 db db 0d ff d3 82 |................|
000000a0 80 08 b7 98 02 e9 aa e3 db 0d 9d f9 1a 33 f1 aa |.............3..|
000000b0 23 e7 9e 22 d3 2e 73 89 22 86 72 1b 85 80 fe ff |#.."..s.".r.....|
000000c0 4f 14 d1 f6 81 80 0e 77 86 f5 82 20 e1 81 24 c5 |O......w... ..$.|
000000d0 99 a0 01 c3 d9 88 00 8e 95 1a 86 f1 bc 0c 86 9f |................|
000000e0 da 08 53 f1 bc 0c 85 3d d9 da 08 23 cd 99 a0 01 |..S....=...#....|
000000f0 86 69 ed 81 a4 00 8e a3 02 b1 f2 23 a9 82 24 fd |.i.........#..$.|
00000100 9b a0 01 f1 da 88 00 9e bf 1b 8f 9c a0 01 86 ad |................|
00000110 82 a4 00 b7 82 20 86 51 86 ff 83 28 dd 82 20 e1 |..... .Q...(.. .|
00000120 a9 29 e3 82 20 c1 c0 20 86 e7 82 20 85 03 86 e1 |.).. .. ... ....|
00000130 f5 01 86 9d a2 e2 db 0d 83 2f 2b 83 ab 23 cf 89 |........./+..#..|
00000140 1c 87 d4 82 80 08 d1 97 02 e1 f2 db db 0d 89 d4 |................|
00000150 82 80 08 a5 03 86 c3 9e 02 86 99 1f 8f d4 02 b3 |................|
00000160 94 02 bd 88 e3 db 0d 86 8d 09 ff 89 1c 95 0a 86 |................|
00000170 13 86 13 86 13 86 13 86 13 86 13 86 13 86 13 86 |................|
... and so on.
How can I make sure that my settings do not conflict with the kernel's default configuration?
You will likely have to instrument the code. When the tracers are enabled, I would look at the final configuration (via sysfs) to make sure that what you have there is really what you wrote to the configuration registers.
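The read-back check suggested above could be scripted roughly as follows. This is only a sketch: the helper names `parse_hex_fields` and `config_mismatches` are made up for illustration, and it assumes that each sysfs attribute reads back as whitespace-separated hexadecimal fields (with or without a `0x` prefix), which should be confirmed on the target kernel.

```python
from pathlib import Path


def parse_hex_fields(text):
    """Parse whitespace-separated hex fields, e.g. '100e0 104b4' or '0x100e0 0x104b4'."""
    return [int(token, 16) for token in text.split()]


def config_mismatches(device_dir, expected):
    """Compare sysfs attributes under device_dir with the values that were written.

    expected maps an attribute name to the string that was echoed into it.
    Returns {name: (written, read_back)} for every attribute that differs.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            got = (Path(device_dir) / name).read_text().strip()
        except OSError:
            got = "<unreadable>"
        try:
            same = parse_hex_fields(got) == parse_hex_fields(want)
        except ValueError:
            # Covers unreadable files and values that are not plain hex fields.
            same = False
        if not same:
            mismatches[name] = (want, got)
    return mismatches
```

With the settings from the start of this thread, a call might look like `config_mismatches("/sys/bus/coresight/devices/f889c000.ptm0", {"mode": "20", "addr_range": "100e0 104b4"})`. An empty result would mean the values read back match what was written. Note that `addr_range` presumably reports the comparator pair currently selected by `addr_idx`, so that index has to be set before reading.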
Best regards, Muhammad
CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight
Hi Mathieu,
Did you make sure the options the PTM tracer has been configured with are the same as what the decoder thinks it is getting? I looked at the decoder you've referenced very early on in the project and remember that it hard-codes the configuration options, which almost guarantees it won't decode the traces you're giving it without modifications.
Yes, I made sure that the same options are enabled for capturing the trace and for decoding it. I had to modify the decoder a bit to make it work. It now works well for raw trace: I tested it on simple programs and the result is correct. The problem is in the tracing configuration. I am starting to look into how the trace is recovered by the perf interface in order to find the error in my configuration.
This sounds interesting and I'd like a little more clarification if you don't mind.
1) Are you doing trace decoding in the FPGA itself?
2) Can you give me the name of the board you're working with? I'm interested in the TPIU.
3) What decoding box is externally connected to the TPIU port?
1) Yes, I am decoding traces in the FPGA itself.
2) I am working on a Zedboard (Zynq SoC). I submitted patches to add support for CoreSight components last year. Basically, they contain the modifications to the device tree file (https://lkml.org/lkml/2016/9/30/52).
3) In the Zynq SoC, you can recover the trace output either in the FPGA part (EMIO interface) or on the pins, in order to use an external decoder. As I don't have a decoding box, I use the EMIO interface to get the trace in the FPGA part and then decode it there. I tested it on simple programs and it works without any issues.
Best regards, Muhammad
On 13/04/2018 at 19:07, Mathieu Poirier wrote:
On 13 April 2018 at 10:54, Muhammad Abdul WAHAB muhammadabdul.wahab@centralesupelec.fr wrote:
On 09/04/2018 at 15:03, Mike Leach wrote:
Hi,
Hi Mike,
How are you determining that the trace addresses are mainly in the kernel? Do you have a dump of the decoded trace packets?
Actually, I enabled tracing and did not launch any program. I just wanted to make sure that the default configuration generates trace. In the decoded trace, I received addresses that are for the most part kernel addresses, which seems right.
Here is how I configure CoreSight components.
The first thing I do is disable the second CPU so that everything runs on the CPU that I am tracing. For the ETB, I disabled formatting (by clearing the ETB_FFCR_EN_FTC bit in the kernel driver) because I am only interested in raw trace. For the PTM, I did not change the default configuration. I just enabled the ETB and the PTM for CPU 0. I found a decoder that only handles raw trace at https://www.spinics.net/lists/arm-kernel/msg207080.html and I use it to decode the raw trace.
Did you make sure the options the PTM tracer has been configured with are the same as what the decoder thinks it is getting? I looked at the decoder you've referenced very early on in the project and remember that it hard-codes the configuration options, which almost guarantees it won't decode the traces you're giving it without modifications.
I have the decoded trace in simple text format. If you want to have a look, I can send it to you.
If you are getting the trace from the ETB, then the coresight frame formatter will be running, so any trace will have to be first extracted from the frame format before any decode.
In the coresight-etb10.c file, I replaced the following line:
    writel_relaxed(ETB_FFCR_EN_FTC | ETB_FFCR_STOP_TRIGGER, drvdata->base + ETB_FFCR);
with
    writel_relaxed(ETB_FFCR_STOP_TRIGGER, drvdata->base + ETB_FFCR);
so that the formatter is not enabled.
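For readers who leave the formatter enabled instead of patching the driver, the extraction step mentioned above can be sketched as follows. This is a simplified illustration based on my understanding of the 16-byte CoreSight formatter frame layout, not the code used in this thread: the function name `deformat_frame` is hypothetical, and special IDs (null and trigger) and frame synchronisation packets are deliberately not handled.

```python
FRAME_SIZE = 16


def deformat_frame(frame, current_id=None):
    """Split one 16-byte formatter frame into (source_id, data_byte) pairs.

    Even-numbered bytes 0..14 carry either a new source ID (bit 0 set) or a
    data byte whose real bit 0 is stored in the auxiliary byte (byte 15).
    Odd-numbered bytes 1..13 are always data.
    Returns the list of pairs and the ID in effect at the end of the frame.
    """
    if len(frame) != FRAME_SIZE:
        raise ValueError("a formatter frame is exactly 16 bytes")
    aux = frame[15]
    out = []
    for i in range(8):
        even = frame[2 * i]
        aux_bit = (aux >> i) & 1
        pending_id = None
        if even & 1:
            new_id = even >> 1
            if aux_bit and i < 7:
                # Delayed change: the next data byte still belongs to the old ID.
                pending_id = new_id
            else:
                current_id = new_id
        else:
            # Data byte: restore its least significant bit from the aux byte.
            out.append((current_id, (even & 0xFE) | aux_bit))
        if i < 7:
            out.append((current_id, frame[2 * i + 1]))
            if pending_id is not None:
                current_id = pending_id
    return out, current_id
```

Feeding each 16-byte frame of the dump through this function, carrying `current_id` from one frame to the next, and keeping only the bytes tagged with the PTM's trace ID would give the raw stream a packet decoder expects.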
Looking at the source code you sent, if you limit the traced region to the code area in your application, then you might expect the trace to contain addresses outside this area as the PTM trace consists of "waypoint" addresses, not individual instructions executed. The decoder determines the executed trace by following the source between waypoint addresses.
Actually, I am compiling the application statically to make sure that I can get all the trace for the application (even for the libraries).
A waypoint address will be generated whenever you move out of the specified trace region, so I would guess that each time you call the sqrt() function, an address outside the main program will appear in the trace. Once this function returns you may see a second address output as the trace restarts - this time this should be at the return address for the function.
Yes, I observe this behaviour when I trace a small portion of an application. Each time a BL instruction is executed, I get the function address and the return address. The strange thing is that the trace stops before I have received all of it.
It may be worthwhile trying to run your data through the trc_pkt_lister test program. To do this you can adapt one of the tests\snapshots to your data. If you look at the TC2 snapshot this has PTM trace in it - the .ini files here could be adapted.
I think I have to trace it using perf, see what I get, and then adapt my configuration accordingly. Is there another way to disable the formatter from sysfs?
Unfortunately, trace decoding isn't supported yet for ARMv7, so trying to decode traces with perf won't work. Fixing this is very high on my priority list, but I still can't give you an estimate.
I have developed a decoder for the PFT protocol in the FPGA part of the Zynq SoC and have been using it so far with the TPIU to decode traces on the FPGA. The decoder expects raw trace. This is why I disable formatting.
This sounds interesting and I'd like a little more clarification if you don't mind.
1) Are you doing trace decoding in the FPGA itself?
2) Can you give me the name of the board you're working with? I'm interested in the TPIU.
3) What decoding box is externally connected to the TPIU port?
Regards Mike
Thank you so much for your help.
Best regards, Muhammad PS: Sorry for my late reply.
Hi Muhammed,
I have looked at the decoder you indicate - for PTM trace it resolves to individual trace packets, but does not fully decode the trace.
PTM uses program flow trace - that is, only trace waypoints result in packets being output. Thus not all address values are explicitly traced if they can be deduced from the code. The decoder counts the branches when it sees E/N atoms, but does not resolve them - which can only be done by examining the code image that was traced.*
This could be why you are not seeing all the addresses you are expecting.
Regards
Mike
*For example, if you have a loop as follows:-
label:
    <loop code>
    BNZ label:
Then this may well be traced as a series of E atoms as the code loops back, followed by a single N atom on loop termination. Assuming the value of label: is fully encoded into the BNZ instruction (i.e. it is not BNZ rN) then no address value for label: will ever appear in the trace. The only way to decode this trace is to use a code follower and look for the BNZ label: within the code and resolve for each E/N atom seen. The decoder can then assume that all instructions from label: to BNZ have been executed.
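A toy code follower for the loop above might look like the sketch below. It is purely illustrative and makes several assumptions that are not from this thread: the program image is modelled as a dictionary from instruction address to direct branch target (`None` for non-branch instructions), every instruction is 4 bytes long, and only direct branches occur. The function name `follow` is hypothetical.

```python
INSTRUCTION_SIZE = 4  # assumption: fixed-width ARM instructions


def follow(program, start, atoms):
    """Reconstruct executed addresses from a start address and E/N atoms.

    program maps each instruction address to its direct branch target,
    or to None if the instruction is not a branch (not a waypoint).
    Each atom resolves exactly one waypoint: 'E' means the branch was
    taken, 'N' means execution fell through to the next instruction.
    Returns the executed addresses and the address where decoding stopped.
    """
    pc = start
    executed = []
    for atom in atoms:
        # Walk straight-line code up to and including the next waypoint.
        while True:
            executed.append(pc)
            target = program[pc]
            if target is not None:
                break
            pc += INSTRUCTION_SIZE
        pc = target if atom == "E" else pc + INSTRUCTION_SIZE
    return executed, pc


# label at 0x100, two instructions of loop code, then the branch back at 0x108.
loop_image = {0x100: None, 0x104: None, 0x108: 0x100, 0x10C: None}
trace, next_pc = follow(loop_image, 0x100, "EEN")
```

Here the atoms `E E N` expand to three passes through `0x100..0x108`, after which decoding continues at `0x10C`, even though the address of the label never appears in the trace itself.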
Hi Mike,
On 17/04/2018 at 10:33, Mike Leach wrote:
Hi Muhammed,
I have looked at the decoder you indicate - for PTM trace it resolves to individual trace packets, but does not fully decode the trace.
PTM uses program flow trace - that is only trace waypoints result in packets output. Thus not all address values are explicitly traced if they can be deduced from the code. The decoder counts the branches when it sees E/N atoms, but does not resolve them - which can only be done by examination of the code image traced.*
Yes, that is the case if the branch broadcast feature is not enabled: only E/N atom packets are then sent in the trace. But if branch broadcast is enabled, all branch addresses (direct or indirect) are sent by the PTM. I am using this feature to get all branch addresses. It works well if I configure the PTM with a small address range. But as soon as I specify the whole .text section, I don't get the full trace of the application. I suspected that a FIFOFULL event occurs inside the PTM, but I don't know how to check whether one has happened. Do you know what the expected behavior of the PTM is if a FIFOFULL event occurs?
*For example, if you have a loop as follows:-
label:
    <loop code>
    BNZ label:
Then this may well be traced as a series of E atoms as the code loops back, followed by a single N atom on loop termination. Assuming the value of label: is fully encoded into the BNZ instruction (i.e. it is not BNZ rN) then no address value for label: will ever appear in the trace. The only way to decode this trace is to use a code follower and look for the BNZ label: within the code and resolve for each E/N atom seen. The decode can then assume that all instructions from label: to BNZ have been executed.
If we enable branch broadcast, then the PTM sends a branch address packet each time it encounters a branch instruction, which allows us to compute the exact address without looking at the program image.
This could be why you are not seeing all the addresses you are expecting.
Regards
Mike
Thank you so much for your help. Best regards, Muhammad
Hi Muhammad,
On 17 April 2018 at 09:58, Muhammad Abdul WAHAB muhammadabdul.wahab@centralesupelec.fr wrote:
Hi Mike,
On 17/04/2018 at 10:33, Mike Leach wrote:
Hi Muhammed,
I have looked at the decoder you indicate - for PTM trace it resolves to individual trace packets, but does not fully decode the trace.
PTM uses program flow trace - that is only trace waypoints result in packets output. Thus not all address values are explicitly traced if they can be deduced from the code. The decoder counts the branches when it sees E/N atoms, but does not resolve them - which can only be done by examination of the code image traced.*
Yes, that is the case if the branch broadcast feature is not enabled: only E/N atom packets are then sent in the trace. But if branch broadcast is enabled, all branch addresses (direct or indirect) are sent by the PTM. I am using this feature to get all branch addresses. It works well if I configure the PTM with a small address range. But as soon as I specify the whole .text section, I don't get the full trace of the application. I suspected that a FIFOFULL event occurs inside the PTM, but I don't know how to check whether one has happened. Do you know what the expected behavior of the PTM is if a FIFOFULL event occurs?
If the PTM overflows, then once the error condition is cleared - i.e. the trace sink can accept trace again - the PTM will restart with an ISYNC packet, with the reason set to overflow. Using branch broadcast mode is likely to considerably increase the chances of an overflow occurring. Normally we would recommend that branch broadcast be used only in restricted circumstances - i.e. address filtered trace as you were using - due to the large increase in output trace bandwidth required.
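To check for such overflows in a decoded packet stream, the reason field of each ISYNC packet can be inspected. The sketch below assumes that the reason code sits in bits [6:5] of the ISYNC information byte, with the four encodings listed in the table. That layout is my reading of the PFT packet format and should be verified against the architecture specification before relying on it; the names `ISYNC_REASONS` and `isync_reason` are hypothetical.

```python
# Assumed encoding of the ISYNC reason code (bits [6:5] of the information byte).
ISYNC_REASONS = {
    0b00: "periodic",
    0b01: "trace enabled",
    0b10: "restart after overflow",
    0b11: "exit from debug state",
}


def isync_reason(info_byte):
    """Return the textual reason carried by an ISYNC information byte."""
    return ISYNC_REASONS[(info_byte >> 5) & 0x3]


def overflow_count(info_bytes):
    """Count ISYNC packets that report a restart after overflow."""
    return sum(1 for b in info_bytes if isync_reason(b) == "restart after overflow")
```

A non-zero `overflow_count` over the information bytes of all ISYNC packets in a capture would support the overflow hypothesis. A count of zero would point back towards the trace configuration or the sink.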
Regards
Mike