I see. In this output, branch_stack is really more of a "range_list". I think the two provide equivalent information, but the profile processing tool can only interpret a branch_stack. So we either need to teach the profile processing tool to understand the range_list, or make the profiler convert the range_list to a branch_stack.

On Wed, May 17, 2017 at 9:35 AM, Mike Leach <mike.leach@linaro.org> wrote:
Hi,

The OpenCSD decoder outputs executed instruction ranges - which do
look very much like the branch stack ranges below.
These ranges are inclusive to exclusive addresses.

I'm not familiar with the required format of the branch stack,
but assuming this is a range output by the decoder....

..... 29: 00000000004005a0 -> 00000000004005b0 0 cycles  P   0

it means that we traced execution starting at the instruction @
4005a0 to the instruction _before_ address 4005b0 (i.e. the
instruction @ 4005ac if aarch32 / aarch64, or the instruction @ 4005ae
if Thumb16).
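
As a rough illustration of that arithmetic (a sketch only, not code
taken from OpenCSD; the helper name last_insn_addr is made up here, and
it assumes the size of the final instruction in the range is known):

```python
def last_insn_addr(range_end: int, insn_size: int) -> int:
    """Address of the last instruction executed in an
    inclusive-to-exclusive range that ends at range_end."""
    return range_end - insn_size


# 4-byte instruction (aarch32 / aarch64 case above) vs 2-byte Thumb16.
print(hex(last_insn_addr(0x4005b0, 4)))  # 0x4005ac
print(hex(last_insn_addr(0x4005b0, 2)))  # 0x4005ae
```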


On 17 May 2017 at 16:43, Dehao Chen <dehao@google.com> wrote:
> I think there is something wrong with the branch_stack:
>
> e.g.
> ..... 29: 00000000004005a0 -> 00000000004005b0 0 cycles  P   0
> ..... 30: 0000000000400888 -> 000000000040088c 0 cycles  P   0
> ..... 31: 000000000040088c -> 0000000000400898 0 cycles  P   0
>
> from the objdump:
>   400884:       97ffff3f        bl      400580 <__printf_chk@plt>
>   400888:       97ffff46        bl      4005a0 <rand@plt>
>   40088c:       b8004660        str     w0, [x19],#4
>   400890:       eb14027f        cmp     x19, x20
>   400894:       54ffffa1        b.ne    400888 <sort_array+0x50>
>   400898:       d285e280        mov     x0, #0x2f14
>

So, taking a bit more of the stack from the e-mail above.....


..... 30: 0000000000400888 -> 000000000040088c 0 cycles  P   0
..... 31: 000000000040088c -> 0000000000400898 0 cycles  P   0
(snip the stuff in high memory ....)
..... 45: 00000000004005a0 -> 00000000004005b0 0 cycles  P   0
..... 46: 0000000000400888 -> 000000000040088c 0 cycles  P   0
..... 47: 000000000040088c -> 0000000000400898 0 cycles  P   0

interpreting this as I would the decoder output....

@29 is the execution of the code @4005a0
@30 executes the bl 4005a0
@31 runs from 40088c to the b.ne 400888 @ 400894 thus looping round again.

So perhaps the interpretation of the output from the decoder in
building the branch stack could be the issue here?
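
To make that reading concrete, here is a minimal sketch of how
consecutive ranges could be turned into from -> to branch pairs. It is
not the perf or OpenCSD implementation. The name ranges_to_branches is
hypothetical, and it assumes fixed 4-byte instructions, ranges passed
oldest first (the dump above reads as newest first, so 31, 30, 29 is
execution order), and no gaps in the trace between adjacent ranges.

```python
from typing import List, Tuple

Range = Tuple[int, int]   # (start, end), end address is exclusive
Branch = Tuple[int, int]  # (branch source, branch target)


def ranges_to_branches(ranges_oldest_first: List[Range],
                       insn_size: int = 4) -> List[Branch]:
    """Pair the last instruction of each range with the start of the
    range executed next. Assumes fixed-size instructions."""
    branches = []
    pairs = zip(ranges_oldest_first, ranges_oldest_first[1:])
    for (_, end), (next_start, _) in pairs:
        branches.append((end - insn_size, next_start))
    return branches


# Entries 31, 30 and 29 from the dump, in the order they executed.
ranges = [(0x40088c, 0x400898), (0x400888, 0x40088c), (0x4005a0, 0x4005b0)]
for src, dst in ranges_to_branches(ranges):
    print(f"{src:#x} -> {dst:#x}")
# 0x400894 -> 0x400888
# 0x400888 -> 0x4005a0
```

Both pairs agree with the objdump: the b.ne at 400894 targets 400888,
and the bl at 400888 targets 4005a0.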

Mike


> It looks like 400888 does not jump to 40088c, but to 4005a0 instead, and
> 40088c is not even a jump instruction.
>
> Dehao
>
>
> On Wed, May 17, 2017 at 7:39 AM, Sebastian Pop <sebpop@gmail.com> wrote:
>>
>> On Wed, May 17, 2017 at 9:35 AM, Dehao Chen <dehao@google.com> wrote:
>> > On Wed, May 17, 2017 at 7:31 AM, Sebastian Pop <sebpop@gmail.com> wrote:
>> >> Could you please provide the full command line I should be using?
>> >> I tried the following with no meaningful output:
>> >>
>> >> # sample_merger -profile inj
>> >> # cat data.txt
>> >> 0
>> >> 0
>> >> 0
>> >
>> >
>> > You also need to have -binary point to the profiling binary.
>> >
>>
>> Thanks!  I used the following command:
>>
>> # sample_merger -profile ./inj -binary ./sort_3k
>>
>> The output is data.txt attached.
>
>
>



--
Mike Leach
Principal Engineer, ARM Ltd.
Blackburn Design Centre. UK