On Wed, May 17, 2017 at 11:35 AM, Mike Leach <mike.leach@linaro.org> wrote:
Hi,
The OpenCSD decoder outputs executed instruction ranges, which do look very much like the branch stack ranges below. Each range runs from an inclusive start address to an exclusive end address.
I'm not familiar with the required format of the branch stack, but assuming this is a range output by the decoder....
..... 29: 00000000004005a0 -> 00000000004005b0 0 cycles P 0
it means that we traced execution starting at the instruction @ 4005a0 to the instruction _before_ address 4005b0 (i.e. the instruction @ 4005ac if aarch32 / aarch64, or the instruction @ 4005ae if Thumb16).
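As a minimal sketch of that arithmetic (not OpenCSD code; the table and function names are made up for illustration, and it assumes the last instruction in the range has the fixed size shown, which for Thumb only holds when that instruction is a 16-bit encoding):

```python
# Assumed instruction sizes in bytes; illustrative names, not an OpenCSD API.
INSN_SIZE = {"a64": 4, "a32": 4, "thumb16": 2}

def last_insn_addr(range_start, range_end, isa="a64"):
    """Address of the last instruction in the half-open range [range_start, range_end)."""
    size = INSN_SIZE[isa]
    if range_end - range_start < size:
        raise ValueError("range holds no complete instruction")
    # The end address is exclusive, so step back one instruction.
    return range_end - size

print(hex(last_insn_addr(0x4005a0, 0x4005b0, "a64")))      # 0x4005ac
print(hex(last_insn_addr(0x4005a0, 0x4005b0, "thumb16")))  # 0x4005ae
```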
On 17 May 2017 at 16:43, Dehao Chen <dehao@google.com> wrote:
I think there is something wrong with the branch_stack:
e.g.
..... 29: 00000000004005a0 -> 00000000004005b0 0 cycles P 0
..... 30: 0000000000400888 -> 000000000040088c 0 cycles P 0
..... 31: 000000000040088c -> 0000000000400898 0 cycles P 0
from the objdump:
400884: 97ffff3f bl 400580 <__printf_chk@plt>
400888: 97ffff46 bl 4005a0 <rand@plt>
40088c: b8004660 str w0, [x19],#4
400890: eb14027f cmp x19, x20
400894: 54ffffa1 b.ne 400888 <sort_array+0x50>
400898: d285e280 mov x0, #0x2f14
So taking a bit more of the stack from above e-mail.....
..... 30: 0000000000400888 -> 000000000040088c 0 cycles P 0
..... 31: 000000000040088c -> 0000000000400898 0 cycles P 0
(snip the stuff in high memory ....)
..... 45: 00000000004005a0 -> 00000000004005b0 0 cycles P 0
..... 46: 0000000000400888 -> 000000000040088c 0 cycles P 0
..... 47: 000000000040088c -> 0000000000400898 0 cycles P 0
interpreting this as I would the decoder output....
@29 is the execution of the code @4005a0
@30 executes the bl 4005a0
@31 runs from 40088c to the b.ne 400888 @ 400894, thus looping round again.
So perhaps the interpretation of the output from the decoder in building the branch stack could be the issue here?
Mike, I think you are right: my interpretation of the ETM trace was wrong. The ETM trace contains the boundaries of the executed basic blocks, not the source and target addresses of the branch instructions as in the LBR branch stack. Converting ETM ranges to LBR events is therefore not the 1-to-1 translation that is currently implemented. I will submit a patch to fix this in perf inject.
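To illustrate the difference, here is a rough sketch of how consecutive executed ranges could be turned into branch (from, to) pairs. This is not the perf inject patch; the function name is hypothetical, and it assumes fixed 4-byte A64 instructions and ranges supplied in execution order (oldest first), i.e. the reverse of the branch stack listing above:

```python
A64_INSN_SIZE = 4  # assumption: fixed-size AArch64 instructions

def ranges_to_branches(ranges):
    """Turn executed ranges [start, end) into (from, to) branch pairs.

    The branch source is the last instruction of one range and the
    branch target is the start address of the range executed next.
    """
    branches = []
    for (_start, end), (next_start, _next_end) in zip(ranges, ranges[1:]):
        branches.append((end - A64_INSN_SIZE, next_start))
    return branches

# Entries 47, 46 and 45 from the listing above, in execution order.
ranges = [
    (0x40088c, 0x400898),  # str .. b.ne
    (0x400888, 0x40088c),  # bl 4005a0
    (0x4005a0, 0x4005b0),  # code at 4005a0
]
for src, dst in ranges_to_branches(ranges):
    print(f"{src:#x} -> {dst:#x}")
# 0x400894 -> 0x400888
# 0x400888 -> 0x4005a0
```

With this reading, the pairs line up with the objdump: the b.ne at 400894 branches to 400888, and the bl at 400888 branches to 4005a0. Note that N ranges yield N-1 branches, which is why the translation cannot be 1-to-1.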
Thanks Kim, Dehao, and Mike for your help in identifying this issue.
Sebastian