Perf record format portability

Dmitry Antipov

15 May 2012

3:27 p.m.

Hello,

Are there any thoughts on how much of perf.data is portable and how much it should be? I'm interested in recording scheduler activity on one machine and then replaying it on another. As far as I can see, replaying x86 perf.data on ARM doesn't work. At the least, should it work with a small subset of recorded events (for example, sched:sched_switch, sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup and sched:sched_migrate_task) on the same architecture?

Thanks in advance, Dmitry

Arnaldo Carvalho de Melo

15 May

3:51 p.m.

Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu:

...

Hello,

Are there any thoughts on how much of perf.data is portable and how much it should be? I'm interested in recording scheduler activity on one machine and then replaying it on another. As far as I can see, replaying x86 perf.data on ARM doesn't work. At the least, should it work with a small subset of recorded events (for example, sched:sched_switch, sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup and sched:sched_migrate_task) on the same architecture?

Endianness issues? ARM EB? There are some patches by Jiri Olsa that may help you if that is the case.

It should be portable, are you using 'perf archive' too?

What exactly is the error experienced?

- Arnaldo

Dmitry Antipov

16 May

10:50 a.m.

On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote:

...

Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu:

...
Hello,

Are there any thoughts on how much of perf.data is portable and how much it should be? I'm interested in recording scheduler activity on one machine and then replaying it on another. As far as I can see, replaying x86 perf.data on ARM doesn't work. At the least, should it work with a small subset of recorded events (for example, sched:sched_switch, sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup and sched:sched_migrate_task) on the same architecture?

Endianness issues? ARM EB? There are some patches by Jiri Olsa that may help you if that is the case.

Thanks, I will take a look.

...

It should be portable, are you using 'perf archive' too?

It doesn't work, and fails with cryptic messages like:

tar: .build-id/17/d6ca02b2c31df54bf62a4142c47e3c99a9eedf: Cannot stat: No such file or directory

creating empty archive.

...

What exactly is the error experienced?

Now I'm facing the simple problem with event IDs, which may be different from machine to machine. For example, /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM board and 279 on my PC host, so 'perf report' displays all event names like "unknown:unknown", even with --kallsyms=XXX where XXX is 'cat /proc/kallsyms > XXX' from PC host.
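The failure described here can be sketched as a plain table lookup. This is only an illustration in Python: the two tables and the function name are hypothetical, and the only values taken from the thread are the IDs 55 and 279 and the "unknown:unknown" fallback. It is not how perf itself is implemented.

```python
# Hypothetical id -> name tables, one per machine, mirroring what
# /sys/kernel/debug/tracing/events/sched/sched_switch/id reports on each.
ARM_BOARD_IDS = {55: "sched:sched_switch"}
PC_HOST_IDS = {279: "sched:sched_switch"}


def resolve_event_name(event_id, id_table):
    """Look up a tracepoint name by id, falling back to the name perf report printed."""
    return id_table.get(event_id, "unknown:unknown")


# A sample recorded on the PC host carries the PC host's id.
recorded_id = 279

# Resolving it against the analysis machine's table misses...
name_via_analysis_host = resolve_event_name(recorded_id, ARM_BOARD_IDS)
# ...while the recording machine's table finds it.
name_via_recording_host = resolve_event_name(recorded_id, PC_HOST_IDS)
```

The point of the sketch is that the ID only has meaning together with the table of the machine that produced it.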

Dmitry

Arnaldo Carvalho de Melo

2:59 p.m.

Adding Jiri and Steven to the CC list.

Em Wed, May 16, 2012 at 02:50:31PM +0400, Dmitry Antipov escreveu:

...

On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote:

...
Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu:

...
Are there any thoughts on how much of perf.data is portable and how much it should be? I'm interested in recording scheduler activity on one machine and then replaying it on another. As far as I can see, replaying x86 perf.data on ARM doesn't work. At the least, should it work with a small subset of recorded events (for example, sched:sched_switch, sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup and sched:sched_migrate_task) on the same architecture?

Endianness issues? ARM EB? There are some patches by Jiri Olsa that may help you if that is the case.

Thanks, I will take a look.

...
It should be portable, are you using 'perf archive' too?

It doesn't work, and fails with cryptic messages like:

tar: .build-id/17/d6ca02b2c31df54bf62a4142c47e3c99a9eedf: Cannot stat: No such file or directory

It is a shell script, basically. After you collect your events with something like:

    [acme@sandy ~]$ perf record -F 10000 sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.021 MB perf.data (~917 samples) ]

The resulting perf.data file will have samples taken on these DSOs, with those respective hashes identifying each one:

    [acme@sandy ~]$ perf buildid-list
    4390a3d2dc84c37a8923ba4c910d6766abc42cbf [kernel.kallsyms]
    ceb82e745b0ab8bb7ea28c068327be1fb068c923 /lib64/ld-2.12.so
    e731c64000993d1fd1b443e6d5d6972d149440e8 /lib64/libc-2.12.so
    [acme@sandy ~]$

In your case we can see that it is looking for build id 17d6ca02b2c31df54bf62a4142c47e3c99a9eedf on the build id cache.
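The relation between the build id and the path in the tar error can be sketched as follows. This is a Python illustration, not the 'perf archive' script: the function names are hypothetical, the path layout is inferred only from the error message quoted above, and the cache root is left as a parameter because the right root on a given system is an assumption.

```python
import os
import posixpath
import tempfile


def buildid_cache_relpath(build_id):
    """Split a build id as the tar error shows it: first two hex digits, then the rest."""
    return posixpath.join(".build-id", build_id[:2], build_id[2:])


def missing_from_cache(build_ids, cache_dir):
    """Return the build ids that have no entry under cache_dir."""
    return [b for b in build_ids
            if not os.path.lexists(os.path.join(cache_dir, buildid_cache_relpath(b)))]


relpath = buildid_cache_relpath("17d6ca02b2c31df54bf62a4142c47e3c99a9eedf")

# An empty directory stands in for a cache that does not hold the recording,
# e.g. a different machine or a different user's home directory.
with tempfile.TemporaryDirectory() as empty_cache:
    missing = missing_from_cache(["17d6ca02b2c31df54bf62a4142c47e3c99a9eedf"], empty_cache)
```

Running a check like this before archiving would turn the "Cannot stat" message into a list of the build ids that are actually absent.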

Probably you either are running 'perf archive' on a different machine than the one where you ran 'perf record' or using a different user on the same machine, or, unlikely, perhaps you removed ~/.debug/ after 'record'.

The 'perf archive' tool was done quickly, just as a proof of concept; admittedly it needs to be improved to help diagnose these problems.

...

creating empty archive.

...
What exactly is the error experienced?

Now I'm facing the simple problem with event IDs, which may be different from machine to machine. For example, /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM board and 279 on my PC host, so 'perf report' displays all event names like "unknown:unknown", even with --kallsyms=XXX where XXX is 'cat /proc/kallsyms > XXX' from PC host.

With build-ids and 'perf archive' you shouldn't need to specify kallsyms: it has a build-id and will be collected (record + archive) and then transferred and expanded on the analysis machine (scp + tar xvf).

The tracing part even stashes a copy of kallsyms in perf.data (not needed, but there for historical reasons). The problem is in translating the perf_event_attr.config to the same name and format as in the machine where you collected the events.

Steve,

Was the kernel trace events infrastructure designed with that in mind? I.e. cross analysis? I must be missing something here, still ENOCOFFEE :-\

When doing cross-arch event analysis I tested:

    PERF_TYPE_HARDWARE                      = 0,
    PERF_TYPE_SOFTWARE                      = 1,
    PERF_TYPE_HW_CACHE                      = 3,

Not:

    PERF_TYPE_TRACEPOINT                    = 2,
    PERF_TYPE_RAW                           = 4,
    PERF_TYPE_BREAKPOINT                    = 5,
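As a compact restatement of the two lists, here is a small Python sketch. The enum values are the ones quoted in this message; the class name, the set, and the helper are hypothetical names introduced only for illustration.

```python
from enum import IntEnum


class PerfType(IntEnum):
    """The perf_event_attr.type values listed in this message."""
    HARDWARE = 0
    SOFTWARE = 1
    TRACEPOINT = 2
    HW_CACHE = 3
    RAW = 4
    BREAKPOINT = 5


# The types this message reports as tested for cross-arch analysis.
CROSS_ARCH_TESTED = {PerfType.HARDWARE, PerfType.SOFTWARE, PerfType.HW_CACHE}


def cross_arch_tested(attr_type):
    """True if the given numeric type is in the tested set."""
    return PerfType(attr_type) in CROSS_ARCH_TESTED
```

The sched:* events in this thread are tracepoints (type 2), so they fall outside the tested set.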

- Arnaldo

Jiri Olsa

3:16 p.m.

On Wed, May 16, 2012 at 11:59:27AM -0300, Arnaldo Carvalho de Melo wrote:

...

Adding Jiri and Steven to the CC list.

Em Wed, May 16, 2012 at 02:50:31PM +0400, Dmitry Antipov escreveu:

...
On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote:

...
Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu:

...
Are there any thoughts on how much of perf.data is portable and how much it should be? I'm interested in recording scheduler activity on one machine and then replaying it on another. As far as I can see, replaying x86 perf.data on ARM doesn't work. At the least, should it work with a small subset of recorded events (for example, sched:sched_switch, sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup and sched:sched_migrate_task) on the same architecture?

Endianness issues? ARM EB? There are some patches by Jiri Olsa that may help you if that is the case.

The latest version was sent today; there's a description of the tests I did: http://marc.info/?l=linux-kernel&m=133715172512742&w=2

Each time I run a new sort of test, another endianness issue is hit. So, tracepoints... I'll check ;)

jirka

Arnaldo Carvalho de Melo

3:50 p.m.

Em Wed, May 16, 2012 at 05:16:55PM +0200, Jiri Olsa escreveu:

...

On Wed, May 16, 2012 at 11:59:27AM -0300, Arnaldo Carvalho de Melo wrote:

...
Adding Jiri and Steven to the CC list.

Em Wed, May 16, 2012 at 02:50:31PM +0400, Dmitry Antipov escreveu:

...
On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote:

...
Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu:

...
Are there any thoughts on how much of perf.data is portable and how much it should be? I'm interested in recording scheduler activity on one machine and then replaying it on another. As far as I can see, replaying x86 perf.data on ARM doesn't work. At the least, should it work with a small subset of recorded events (for example, sched:sched_switch, sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup and sched:sched_migrate_task) on the same architecture?

Endianness issues? ARM EB? There are some patches by Jiri Olsa that may help you if that is the case.

The latest version was sent today; there's a description of the tests I did: http://marc.info/?l=linux-kernel&m=133715172512742&w=2

Each time I run a new sort of test, another endianness issue is hit. So, tracepoints... I'll check ;)

The tracepoints part is a different problem, I think, but take a look anyway ;-)

- Arnaldo

Steven Rostedt

4:58 p.m.

On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:

...

Steve,

Was the kernel trace events infrastructure designed with that in mind? I.e. cross analysis? I must be missing something here, still ENOCOFFEE :-\

Yes, the libparsevents library was designed for this from day one. That's why a trace-cmd data file can be recorded on ARM and read on x86, or PPC, or whatever. I did all my development testing against 32-bit, 64-bit, and big- and little-endian machines. This was the case from the beginning.

-- Steve

...

When doing cross-arch event analysis I tested:

    PERF_TYPE_HARDWARE                      = 0,
    PERF_TYPE_SOFTWARE                      = 1,
    PERF_TYPE_HW_CACHE                      = 3,

Not:

    PERF_TYPE_TRACEPOINT                    = 2,
    PERF_TYPE_RAW                           = 4,
    PERF_TYPE_BREAKPOINT                    = 5,

Arnaldo

Jiri Olsa

5:48 p.m.

On Wed, May 16, 2012 at 12:58:23PM -0400, Steven Rostedt wrote:

...

On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:

...
Steve,

Was the kernel trace events infrastructure designed with that in mind? I.e. cross analysis? I must be missing something here, still ENOCOFFEE :-\

Yes, the libparsevents library was designed for this from day one. That's why a trace-cmd data file can be recorded on ARM and read on x86, or PPC, or whatever. I did all my development testing against 32-bit, 64-bit, and big- and little-endian machines. This was the case from the beginning.

For ppc64 (record) vs x86_64 (report) I got the following report on the latest tip:

    [jolsa@dhcp-26-214 test]$ ../perf report > report.target
    Endianness of raw data not corrected!
    Warning: 718 samples with id not present in the header
    Warning: The perf.data file has no samples!

for the following record:

    perf record -a -e sched:sched_switch -e sched:sched_process_exit -e sched:sched_process_fork -e sched:sched_wakeup -- sleep 10
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.178 MB perf.data (~7781 samples) ]

I haven't tried trace-cmd, but I guess let's wait for libparsevents perf integration then.. ;)

jirka

Steven Rostedt

7:32 p.m.

On Wed, 2012-05-16 at 19:48 +0200, Jiri Olsa wrote:

...

For ppc64 (record) vs x86_64 (report) I got the following report on the latest tip:

    [jolsa@dhcp-26-214 test]$ ../perf report > report.target
    Endianness of raw data not corrected!
    Warning: 718 samples with id not present in the header
    Warning: The perf.data file has no samples!

for the following record:

    perf record -a -e sched:sched_switch -e sched:sched_process_exit -e sched:sched_process_fork -e sched:sched_wakeup -- sleep 10
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.178 MB perf.data (~7781 samples) ]

I haven't tried trace-cmd, but I guess let's wait for libparsevents perf integration then.. ;)

It's in perf. It just needs to be set up.

Look at tools/perf/util/trace-event.h

There's a bigendian() function, a "file_bigendian" and a "host_bigendian". If perf recorded which endianness was used on the target and saved that in the perf.data file, all it would need to do is update the two variables:

    file_bigendian = recorded_endian;
    host_bigendian = bigendian();

1 for big endian, 0 for little endian.

Where host is the machine that is running the perf report or script. After that, all reads of the data in events use one of the __data2host() macros to convert if necessary.
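The idea behind the two variables and the conversion macros can be sketched in Python. This is an illustration of the technique only, not the code in tools/perf/util/trace-event.h: data2host4 here is a hypothetical stand-in for one of the __data2host() macros, and the 0x22 test value is simply borrowed from elsewhere in this thread.

```python
import struct
import sys


def bigendian():
    """Return 1 if the machine running this code is big endian, else 0."""
    return 1 if sys.byteorder == "big" else 0


def data2host4(raw, file_bigendian):
    """Decode 4 bytes read from the file into an unsigned host integer.

    The explicit '>' / '<' prefix tells struct to interpret the bytes in
    the file's byte order, whatever the host's own byte order is.
    """
    return struct.unpack(">I" if file_bigendian else "<I", raw)[0]


host_bigendian = bigendian()

# The same value as a big-endian writer and a little-endian writer would store it.
from_big = data2host4(b"\x00\x00\x00\x22", file_bigendian=1)
from_little = data2host4(b"\x22\x00\x00\x00", file_bigendian=0)
```

Both reads yield the same number once the recorded byte order is known, which is why storing that one flag in the data file is enough.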

Note, latest trace-cmd has put all these in a pevent struct descriptor, so that different files can be read at the same time, and these files can be from different endian (and bit size) machines. The global variables no longer exist.

The patches that Frederic and I posted previously convert perf to use this descriptor, so that perf could benefit and read multiple files too.

-- Steve

Steven Rostedt

7:39 p.m.

On Wed, 2012-05-16 at 15:32 -0400, Steven Rostedt wrote:

...

On Wed, 2012-05-16 at 19:48 +0200, Jiri Olsa wrote:

...
For ppc64 (record) vs x86_64 (report) I got the following report on the latest tip:

    [jolsa@dhcp-26-214 test]$ ../perf report > report.target
    Endianness of raw data not corrected!
    Warning: 718 samples with id not present in the header
    Warning: The perf.data file has no samples!

What does perf script give you? It looks like Frederic took my code for this when he ported the original parse-events over to perf. I see the setup of these variables in tools/perf/util/trace-event-read.c

If you run 'perf script' on x86 on a ppc perf.data file, do you still get the same errors?

-- Steve

Jiri Olsa

17 May

8:51 a.m.

On Wed, May 16, 2012 at 03:39:14PM -0400, Steven Rostedt wrote:

...

On Wed, 2012-05-16 at 15:32 -0400, Steven Rostedt wrote:

...
On Wed, 2012-05-16 at 19:48 +0200, Jiri Olsa wrote:

...
For ppc64 (record) vs x86_64 (report) I got the following report on the latest tip:

    [jolsa@dhcp-26-214 test]$ ../perf report > report.target
    Endianness of raw data not corrected!
    Warning: 718 samples with id not present in the header
    Warning: The perf.data file has no samples!

What does perf script give you? It looks like Frederic took my code for this when he ported the original parse-events over to perf. I see the setup of these variables in tools/perf/util/trace-event-read.c

If you run 'perf script' on x86 on a ppc perf.data file, do you still get the same errors?

yes

---
[jolsa@dhcp-26-214 test]$ ../perf script
Endianness of raw data not corrected!
Warning: 718 samples with id not present in the header
# ========
# captured on: Wed May 16 19:53:13 2012
# hostname : ibm-js22-vios-02-lp1.rhts.eng.bos.redhat.com
# os release : 2.6.32-270.el6.ppc64
# perf version : 2.6.32-270.el6.ppc64.debug
# arch : ppc64
# nrcpus online : 8
# nrcpus avail : 8
# cpudesc : POWER6 (architected), altivec supported
# cpuid : 62,769
# total memory : 6236992 kB
# cmdline : /usr/bin/perf record -a -e sched:sched_switch -e sched:sched_process_exit -e sched:sched_process_fork -e sched:sched_wakeup -- sleep 10
# event : name = sched:sched_switch, type = 2, config = 0x22, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 97, 98, 99, 100, 101, 102, 103, 104 }
# event : name = sched:sched_process_exit, type = 2, config = 0x1b, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 105, 106, 107, 108, 109, 110, 111, 112 }
# event : name = sched:sched_process_fork, type = 2, config = 0x1d, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 113, 114, 115, 116, 117, 118, 119, 120 }
# event : name = sched:sched_wakeup, type = 2, config = 0x17, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 121, 122, 123, 124, 125, 126, 127, 128 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
---
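The header dump lists each event's name next to its type, config, and sample ids. As an aid to reading it, here is a Python sketch that pulls those fields out of one "# event :" line. The parser and its regular expressions are hypothetical and written only against the line layout shown in this message; other perf versions may print the header differently.

```python
import re

# One event line, copied from the header dump in this message.
EVENT_LINE = ("# event : name = sched:sched_switch, type = 2, config = 0x22, "
              "config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, "
              "id = { 97, 98, 99, 100, 101, 102, 103, 104 }")


def parse_event_line(line):
    """Extract name, type, config and the sample id list from one header line."""
    name = re.search(r"name = ([^,]+),", line).group(1)
    attr_type = int(re.search(r"type = (\d+)", line).group(1))
    config = int(re.search(r"config = (0x[0-9a-fA-F]+)", line).group(1), 16)
    id_text = re.search(r"id = \{([^}]*)\}", line).group(1)
    ids = [int(x) for x in id_text.split(",")]
    return {"name": name, "type": attr_type, "config": config, "ids": ids}


event = parse_event_line(EVENT_LINE)
```

For this line the result pairs the name sched:sched_switch with type 2 and config 0x22, and lists eight sample ids.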

jirka

Arnaldo Carvalho de Melo

16 May

6:08 p.m.

Em Wed, May 16, 2012 at 12:58:23PM -0400, Steven Rostedt escreveu:

...

On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:

...
Was the kernel trace events infrastructure designed with that in mind? I.e. cross analysis? I must be missing something here, still ENOCOFFEE :-\

Yes, the libparsevents library was designed for this from day one. That's why a trace-cmd data file can be recorded on ARM and read on x86, or PPC, or whatever. I did all my development testing against 32-bit, 64-bit, and big- and little-endian machines. This was the case from the beginning.

I need to look at the code, but how does it do this? Copy the relevant /sys/kernel/debug/events formats in the header and then instead of looking at /sys/... look at those?

Does it still copy /proc/kallsyms?

- Arnaldo

Steven Rostedt

6:17 p.m.

On Wed, 2012-05-16 at 15:08 -0300, Arnaldo Carvalho de Melo wrote:

...

Em Wed, May 16, 2012 at 12:58:23PM -0400, Steven Rostedt escreveu:

...
On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:

...
Was the kernel trace events infrastructure designed with that in mind? I.e. cross analysis? I must be missing something here, still ENOCOFFEE :-\

Yes, the libparsevents library was designed for this from day one. That's why a trace-cmd data file can be recorded on ARM and read on x86, or PPC, or whatever. I did all my development testing against 32-bit, 64-bit, and big- and little-endian machines. This was the case from the beginning.

I need to look at the code, but how does it do this? Copy the relevant /sys/kernel/debug/events formats in the header and then instead of looking at /sys/... look at those?

It does copy the events from .../debug/tracing/events. But it does cheat about the bits. To determine the size, it looks at /sys/kernel/debug/tracing/events/header_page and the "commit" field. On 32-bit machines, that's 4 bytes, and on 64-bit, that's 8 bytes.

For endianness, that is calculated on the machine that the recording is running on and stored in the file.

The parse-events structure has a way to record the endianness and long size, for later retrieval.
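The long-size trick described above can be sketched in Python. The sample text below is illustrative only, modeled on the general "field: ...; offset:N; size:N;" layout of tracing format files, and is an assumption rather than a copy from a real machine; the function name and regular expression are likewise hypothetical and are not trace-cmd's code.

```python
import re

# Matches one "field: <declaration>; offset:<n>; size:<n>;" entry.
FIELD_RE = re.compile(
    r"field:\s*(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);\s*size:(?P<size>\d+);")

# Illustrative header_page contents for a 64-bit and a 32-bit recorder.
SAMPLE_64 = ("\tfield: u64 timestamp;\toffset:0;\tsize:8;\tsigned:0;\n"
             "\tfield: local_t commit;\toffset:8;\tsize:8;\tsigned:1;\n")
SAMPLE_32 = ("\tfield: u64 timestamp;\toffset:0;\tsize:8;\tsigned:0;\n"
             "\tfield: local_t commit;\toffset:8;\tsize:4;\tsigned:1;\n")


def long_size_from_header_page(text):
    """Return the size of the 'commit' field, taken as the recorder's long size."""
    for match in FIELD_RE.finditer(text):
        field_name = match.group("decl").split()[-1]
        if field_name == "commit":
            return int(match.group("size"))
    raise ValueError("no 'commit' field found")


long_size_64 = long_size_from_header_page(SAMPLE_64)
long_size_32 = long_size_from_header_page(SAMPLE_32)
```

Because the size is read from the copied text rather than from the analysis host, the reader learns the recorder's word size without any extra header field.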

...

Does it still copy /proc/kallsyms?

Yes it does.

-- Steve

Dmitry Antipov

17 May

5:10 a.m.

On 05/16/2012 08:58 PM, Steven Rostedt wrote:

...

On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:

...
Steve,

Was the kernel trace events infrastructure designed with that in mind? I.e. cross analysis? I must be missing something here, still ENOCOFFEE :-\

Yes, the libparsevents library was designed for this from day one. That's why a trace-cmd data file can be recorded on ARM and read on x86, or PPC, or whatever. I did all my development testing against 32-bit, 64-bit, and big- and little-endian machines. This was the case from the beginning.

I didn't run into big/little endian conversion issues; most probably both x86 and my ARM board are the same (little) endian :-).

But the original question was about event IDs. For example, /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM board and 279 on my PC host, so 'perf report' displays "unknown:unknown" instead of expected "sched:sched_switch" when attempting to do some cross-analysis. I suppose that original event IDs should be preserved, either within perf.data or by providing the copy of original /sys/kernel/debug/tracing/*, much like it's done with --kallsyms to resolve kernel symbols.

Dmitry

Steven Rostedt

11:48 a.m.

On Thu, 2012-05-17 at 09:10 +0400, Dmitry Antipov wrote:

...

On 05/16/2012 08:58 PM, Steven Rostedt wrote:

...
On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:

...
Steve,

Was the kernel trace events infrastructure designed with that in mind? I.e. cross analysis? I must be missing something here, still ENOCOFFEE :-\

Yes, the libparsevents library was designed for this from day one. That's why a trace-cmd data file can be recorded on ARM and read on x86, or PPC, or whatever. I did all my development testing against 32-bit, 64-bit, and big- and little-endian machines. This was the case from the beginning.

I didn't run into big/little endian conversion issues; most probably both x86 and my ARM board are the same (little) endian :-).

But the original question was about event IDs. For example, /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM board and 279 on my PC host, so 'perf report' displays "unknown:unknown" instead of expected "sched:sched_switch" when attempting to do some cross-analysis. I suppose that original event IDs should be preserved, either within perf.data or by providing the copy of original /sys/kernel/debug/tracing/*, much like it's done with --kallsyms to resolve kernel symbols.

trace-cmd copies the entire /sys/kernel/debug/tracing/events directory into the data file (well, it copies only the events you specify). I thought perf did the same. It should be using what's in the perf.data file and not what's on the host.
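A much-simplified sketch of the "copy at record time" idea, in Python: it snapshots only the id file of each requested event, whereas trace-cmd is described here as copying the event directories themselves. The function name is hypothetical, and the demo runs against a throwaway directory tree rather than a real tracing mount.

```python
import os
import tempfile


def collect_event_ids(events_dir, event_names):
    """Read <events_dir>/<system>/<event>/id for each 'system:event' name."""
    ids = {}
    for full_name in event_names:
        system, event = full_name.split(":", 1)
        with open(os.path.join(events_dir, system, event, "id")) as f:
            ids[int(f.read().strip())] = full_name
    return ids


# Demo: a temporary tree stands in for the recording machine's events directory.
with tempfile.TemporaryDirectory() as fake_events:
    event_dir = os.path.join(fake_events, "sched", "sched_switch")
    os.makedirs(event_dir)
    with open(os.path.join(event_dir, "id"), "w") as f:
        f.write("279\n")
    snapshot = collect_event_ids(fake_events, ["sched:sched_switch"])
```

A reader that consults a snapshot like this, stored alongside the samples, never needs the analysis host's own event IDs.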

Again, perf report is not what uses the events from trace-cmd. It's perf script that does. If perf script works, then perf report needs to be fixed. But that should happen after it gets updated to use the latest libparse-events, and I have no idea when that will ever happen.

-- Steve

Dmitry Antipov

18 May

5:48 a.m.

On 05/17/2012 03:48 PM, Steven Rostedt wrote:

...

trace-cmd copies the entire /sys/kernel/debug/tracing/events directory into the data file (well, it copies only the events you specify). I thought perf did the same. It should be using what's in the perf.data file and not what's on the host.

I found that 'perf script' and 'perf report' work differently, and I suppose 'perf script' is correct and 'perf report' isn't.

What I'm doing on the PC host is:

1) Collect data with:

    perf record -a -R -f -m 8192 -c 1 -e sched:sched_switch \
        -e sched:sched_process_exit -e sched:sched_process_fork \
        -e sched:sched_wakeup -e sched:sched_migrate_task [task]

2) Collect the output from 'perf script' and 'perf report'; both look great.

3) Copy perf.data and the contents of /proc/kallsyms to the ARM target.

4) Next, on the ARM target:

    perf script --kallsyms=[kallsyms from PC host] -i [perf.data from PC host]

   Looks good, all event names like 'sched_wakeup' or 'sched_switch' are shown.

5) Try:

    perf report --kallsyms=[kallsyms from PC host] -i [perf.data from PC host] --stdio

   All event names are shown as 'unknown:unknown'.

"Cross-replaying" (perf sched replay) looks broken too. Host results are:

    run measurement overhead: 260 nsecs
    sleep measurement overhead: 56109 nsecs
    the run test took 1000054 nsecs
    the sleep test took 1076170 nsecs
    nr_run_events: 246
    nr_sleep_events: 257
    nr_wakeup_events: 123
    target-less wakeups: 27
    task 0 ( <unknown>: 3440), nr_events: 33
    task 1 ( kworker/0:0: 3227), nr_events: 15
    task 2 ( <unknown>: 0), nr_events: 125
    task 3 ( plugin-containe: 1769), nr_events: 13
    task 4 ( ksoftirqd/0: 3), nr_events: 5
    task 5 ( kworker/2:2: 2023), nr_events: 3
    task 6 ( perf: 3441), nr_events: 200
    task 7 ( migration/2: 3091), nr_events: 3
    task 8 ( kworker/1:0: 3104), nr_events: 158
    task 9 ( urxvt: 2952), nr_events: 95
    task 10 ( ksoftirqd/2: 3093), nr_events: 3
    ------------------------------------------------------------
    #1 : 70.193, ravg: 70.19, cpu: 116.57 / 116.57
    #2 : 70.607, ravg: 70.23, cpu: 116.61 / 116.58
    #3 : 70.411, ravg: 70.25, cpu: 116.69 / 116.59
    #4 : 70.386, ravg: 70.27, cpu: 116.72 / 116.60
    #5 : 70.222, ravg: 70.26, cpu: 116.39 / 116.58
    #6 : 70.361, ravg: 70.27, cpu: 116.40 / 116.56
    #7 : 70.409, ravg: 70.28, cpu: 116.43 / 116.55
    #8 : 70.368, ravg: 70.29, cpu: 116.50 / 116.55
    #9 : 70.604, ravg: 70.32, cpu: 116.75 / 116.57
    #10 : 70.578, ravg: 70.35, cpu: 116.79 / 116.59

The cross-replaying attempt ('perf sched -i [perf.data from PC host] replay') gives:

    run measurement overhead: 8099 nsecs
    sleep measurement overhead: 159428 nsecs
    the run test took 998913 nsecs
    the sleep test took 1188048 nsecs
    nr_run_events: 0
    nr_sleep_events: 0
    nr_wakeup_events: 0
    ------------------------------------------------------------
    #1 : 0.058, ravg: 0.06, cpu: 0.00 / 0.00
    #2 : 0.105, ravg: 0.06, cpu: 0.00 / 0.00
    #3 : 0.027, ravg: 0.06, cpu: 0.00 / 0.00
    #4 : 0.026, ravg: 0.06, cpu: 0.00 / 0.00
    #5 : 0.035, ravg: 0.05, cpu: 0.00 / 0.00
    #6 : 0.027, ravg: 0.05, cpu: 0.00 / 0.00
    #7 : 0.027, ravg: 0.05, cpu: 0.00 / 0.00
    #8 : 0.028, ravg: 0.05, cpu: 0.00 / 0.00
    #9 : 0.029, ravg: 0.04, cpu: 0.00 / 0.00
    #10 : 0.028, ravg: 0.04, cpu: 0.00 / 0.00

Dmitry

Arnaldo Carvalho de Melo

29 May

3:10 p.m.

Em Fri, May 18, 2012 at 09:48:26AM +0400, Dmitry Antipov escreveu:

...

On 05/17/2012 03:48 PM, Steven Rostedt wrote:

...
trace-cmd copies the entire /sys/kernel/debug/tracing/events directory into the data file (well, it copies only the events you specify). I thought perf did the same. It should be using what's in the perf.data file and not what's on the host.

I found that 'perf script' and 'perf report' work differently, and I suppose 'perf script' is correct and 'perf report' isn't.

What I'm doing on PC host is:

I haven't tested this, but libtraceevent is now in, perhaps it works for you now? Can you check?

- Arnaldo

...

Dmitry Antipov

31 May

8:28 a.m.

On 05/29/2012 07:10 PM, Arnaldo Carvalho de Melo wrote:

...

I haven't tested this, but libtraceevent is now in, perhaps it works for you now? Can you check?

It doesn't work. An attempt to run 'perf report' on ARM for the data collected on x86 shows 'unknown:unknown' for event names (see report_x86_on_ARM.txt), and 'perf report' on x86 for the data collected on ARM shows invalid event names (see report_ARM_on_x86.txt).

Dmitry

linaro-dev@lists.linaro.org

17 comments

participants

tags (0)

participants (4)

Arnaldo Carvalho de Melo
Dmitry Antipov
Jiri Olsa
Steven Rostedt