Hi Sebastian,

I agree that this is either a decoder or inject issue.

First, can you confirm you are using version 0.7.3 of the decoder? I recently fixed an addressing bug in there.

Otherwise there are a few possibilities:
A) an as yet undiscovered decoder bug.
B) gaps in the trace that are not being communicated correctly, making inject assume the trace is continuous when it is not.
C) some other misunderstanding/misinterpretation between decoder and perf inject.

Whichever it is, I need to look at the raw trace data alongside the inject output at one of the A/B points and follow the packet => decode => inject flow to see why we get a bad address value.

Can you send me the capture you are using to create the examples you quoted, plus instructions on how to reproduce the output you were getting? I'll then turn up the logging in the decoder and walk through the trace decode path.

Regards

Mike

On Tue, 19 Sep 2017 at 23:05, Dehao Chen <dehao@google.com> wrote:
I agree. If we can fix the issue upstream, we don't want to have hacks downstream to patch the issue.

Dehao

On Tue, Sep 19, 2017 at 2:59 PM, Sebastian Pop <sebpop@gmail.com> wrote:


On Tue, Sep 19, 2017 at 3:16 PM, Dehao Chen <dehao@google.com> wrote:


On Tue, Sep 19, 2017 at 1:02 PM, Sebastian Pop <sebpop@gmail.com> wrote:
By popular demand, I started debugging this problem again.

With the two patches that I posted earlier,
the traces seem correct with the exception of a few "holes"
where the trace seems to jump over a few instructions that
are not reported in the trace, creating jumps that do not exist
in the control flow graph of the code.

The nested loop for bubble sort is:

  4008a0:       9100e3a0        add     x0, x29, #0x38
  4008a4:       52800004        mov     w4, #0x0                        // #0
  4008a8:       29400402        ldp     w2, w1, [x0]
  4008ac:       6b02003f        cmp     w1, w2
  4008b0:       5400006a        b.ge    4008bc <sort_array+0x84>
  4008b4:       52800024        mov     w4, #0x1                        // #1
  4008b8:       29000801        stp     w1, w2, [x0]
  4008bc:       91001000        add     x0, x0, #0x4
  4008c0:       eb00007f        cmp     x3, x0
  4008c4:       54ffff21        b.ne    4008a8 <sort_array+0x70>
  4008c8:       35fffec4        cbnz    w4, 4008a0 <sort_array+0x68>


..... 34: 00000000004008b0 -> 00000000004008b4 0 cycles  P   0
..... 35: 00000000004008c4 -> 00000000004008a8 0 cycles  P   0
..... 36: 00000000004008b0 -> 00000000004008a8 0 cycles  P   0

Edge #36 does not exist in the code: the trace is not correct here.
4008b0 is "b.ge    4008bc" and should either jump to 4008bc or
fall through to the next instruction 4008b4, but the trace wrongly
jumps to 4008a8.

Several hundred jumps later, we see the following sequence:

..... 40: 00000000004008c4 -> 00000000004008a8 0 cycles  P   0
..... 41: 00000000004008b0 -> 00000000004008b4 0 cycles  P   0
..... 42: 00000000004008c4 -> 00000000004008b4 0 cycles  P   0
..... 43: 00000000004008c4 -> 00000000004008a8 0 cycles  P   0

where edge #42 is not correct either: 4008c4 should either branch to
4008a8 or fall through to 4008c8.
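
The check applied by hand to edges #36 and #42 can be sketched in Python. This is only an illustration, assuming fixed 4-byte AArch64 instructions and a branch table copied by hand from the listing above. The names BRANCHES and is_valid_edge are made up for this example and are not part of perf or the decoder.

```python
# Conditional branches from the sort_array listing: address -> taken target.
# Hand-copied from the disassembly above; not generated by any tool.
BRANCHES = {
    0x4008b0: 0x4008bc,  # b.ge 4008bc
    0x4008c4: 0x4008a8,  # b.ne 4008a8
    0x4008c8: 0x4008a0,  # cbnz w4, 4008a0
}

INSN_SIZE = 4  # AArch64 instructions are 4 bytes wide


def is_valid_edge(src, dst):
    """Return True if src -> dst is the taken target or the fall-through."""
    if src not in BRANCHES:
        return False
    return dst in (BRANCHES[src], src + INSN_SIZE)


# (edge number, source, destination) as reported in the trace output above.
edges = [
    (34, 0x4008b0, 0x4008b4),
    (35, 0x4008c4, 0x4008a8),
    (36, 0x4008b0, 0x4008a8),
    (42, 0x4008c4, 0x4008b4),
]
bad = [n for n, src, dst in edges if not is_valid_edge(src, dst)]
print(bad)  # [36, 42]
```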

Maybe these inconsistencies are due to interruptions in the trace recording?
I think that such interruptions cannot be avoided during trace collection.

Dehao, could these wrong edges be fixed in the compiler when reading
the coverage file?

I cannot see an easy way for the compiler/create_gcov tool to cover these issues. Why can't the trace collection tool fix them? It looks like a bug to me.


My thinking was that the compiler knows that there are no edges
between the blocks at these addresses and may just ignore the counts
at these addresses.
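
As a rough sketch of that idea, the counts could be filtered against the set of edges the compiler knows about. Everything here is hypothetical: CFG_EDGES is hand-copied from the listing (0x4008cc is assumed to be the instruction following the cbnz), and filter_counts is not an existing create_gcov or compiler function.

```python
from collections import Counter

# Edges that exist in the control flow graph of the inner loop.
# Hand-copied from the listing; a real tool would take these from the
# compiler's own CFG. 0x4008cc is the assumed fall-through of the cbnz.
CFG_EDGES = {
    (0x4008b0, 0x4008bc), (0x4008b0, 0x4008b4),
    (0x4008c4, 0x4008a8), (0x4008c4, 0x4008c8),
    (0x4008c8, 0x4008a0), (0x4008c8, 0x4008cc),
}


def filter_counts(samples):
    """Count samples per edge, ignoring edges that are not in CFG_EDGES."""
    kept = Counter()
    dropped = 0
    for edge in samples:
        if edge in CFG_EDGES:
            kept[edge] += 1
        else:
            dropped += 1
    return kept, dropped


# Edges #40 to #43 from the trace output above.
samples = [
    (0x4008c4, 0x4008a8),
    (0x4008b0, 0x4008b4),
    (0x4008c4, 0x4008b4),
    (0x4008c4, 0x4008a8),
]
kept, dropped = filter_counts(samples)
print(dropped)  # 1
```

Dropping counts this way only hides the bad edges; it does not recover the branches that were actually executed.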

Maybe we can figure out why this pattern occurs and try to solve
it in either perf inject or in the decoder? The pattern looks very regular.

These first two errors occur at a distance of 389 branches,
and the next error occurs after another 389 branches.
If we call the first incorrect jump "A" and the second incorrect jump "B",
we have this pattern:

A,
389 correct jumps,
B,
389 correct jumps,
A,
389 correct jumps,
B,

There are 343 occurrences of A and 322 of B in a trace of sorting 3000 elements.
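
The spacing between incorrect jumps could be measured with a small script along these lines. It is a sketch only: gaps_between_bad is a made-up helper, and the input below is synthetic, shaped like the A/B pattern described above rather than taken from the real trace.

```python
def gaps_between_bad(flags):
    """Return the number of valid edges between consecutive invalid ones.

    flags holds one boolean per trace edge, True meaning the edge is valid.
    """
    gaps = []
    last_bad = None
    for i, ok in enumerate(flags):
        if ok:
            continue
        if last_bad is not None:
            gaps.append(i - last_bad - 1)
        last_bad = i
    return gaps


# Synthetic input: A, 389 correct jumps, B, 389 correct jumps, A.
flags = [False] + [True] * 389 + [False] + [True] * 389 + [False]
print(gaps_between_bad(flags))  # [389, 389]
```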