Hi Suzuki,
On Tue, Nov 10, 2020 at 8:27 PM Suzuki K Poulose suzuki.poulose@arm.com wrote:
Hi Linu
On 11/10/20 12:57 PM, Linu Cherian wrote:
Hi Suzuki,
...
We are facing some issues while trying out perf. This doesn't appear to be related to your patch though. Will share the details once we do some initial analysis on it.
Thanks.
# ./perf record -vvv -e cs_etm// --per-thread uname -a Using CPUID 0x00000000430f0b40 Attempting to add event pmu 'cs_etm' with '' that may result in non-fatal errors nr_cblocks: 0 affinity: SYS mmap flush: 1 comp level: 0 maps__set_modules_path_dir: cannot open /lib/modules/5.9.0-rc5-00116-g91c9ea890e1a dir Problems setting modules path maps, continuing anyway...
perf_event_attr: type 8 size 120 { sample_period, sample_freq } 1 sample_type IP|TID|IDENTIFIER read_format ID disabled 1 enable_on_exec 1 sample_id_all 1 exclude_guest 1
sys_perf_event_open: pid 3333 cpu -1 group_fd -1 flags 0x8 = 5
perf_event_attr: type 1 size 120 config 0x9 { sample_period, sample_freq } 1 sample_type IP|TID|IDENTIFIER read_format ID disabled 1 exclude_kernel 1 exclude_hv 1 mmap 1 comm 1 enable_on_exec 1 task 1 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 context_switch 1 ksymbol 1 bpf_event 1
sys_perf_event_open: pid 3333 cpu -1 group_fd -1 flags 0x8 = 6 mmap size 589824B AUX area mmap length 4194304
perf_event_attr: type 1 size 120 config 0x9 watermark 1 sample_id_all 1 bpf_event 1 { wakeup_events, wakeup_watermark } 1
sys_perf_event_open: pid -1 cpu -1 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off bpf_event
perf_event_attr: type 1 size 120 config 0x9 watermark 1 sample_id_all 1 { wakeup_events, wakeup_watermark } 1
sys_perf_event_open: pid -1 cpu -1 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off cloexec flag
perf_event_attr: type 1 size 120 config 0x9 watermark 1 sample_id_all 1 { wakeup_events, wakeup_watermark } 1
sys_perf_event_open: pid -1 cpu -1 group_fd -1 flags 0 sys_perf_event_open failed, error -22 switching off sample_id_all
perf_event_attr: type 1 size 120 config 0x9 watermark 1 { wakeup_events, wakeup_watermark } 1
sys_perf_event_open: pid -1 cpu -1 group_fd -1 flags 0 sys_perf_event_open failed, error -22 Couldn't start the BPF side band thread: BPF programs starting from now on won't be annotatable Synthesizing auxtrace information cannot find cgroup mount point Couldn't synthesize cgroup events. Control descriptor is not initialized Linux marvell 5.9.0-rc5-00116-g91c9ea890e1a #823 SMP PREEMPT Tue Nov 10 10:49:15 IST 2020 aarch64 aarch64 aarch64 GNU/Linux
auxtrace idx 0 old 0 head 0xdd50 diff 0xdd50
I haven't seen this in the normal verbose output.
[ perf record: Woken up 1 times to write data ] symbol:init_start file:(null) line:0 offset:0 return:0 lazy:(null) snip .. symbol:memory_mallopt file:(null) line:0 offset:0 return:0 lazy:(null) failed to write feature CPUDESC failed to write feature MEM_TOPOLOGY failed to write feature CPU_PMU_CAPS [ perf record: Captured and wrote 0.056 MB perf.data ]
# ./perf report 0x368 [0x50]: failed to process type: 1 [Cannot allocate memory] Error: failed to process sample
I have no clue about it. Are you able to run it under GDB ? (Looks like you have built the perf, so if you have sources, it may be a good idea to run under the GDB and figure out where that error is coming from).
Yeah gdb helped figuring out the issue. The issue is in the opencsd, where it doesn't seem to support multiple streams when the formatter is not enabled. . Note:Our Silicon has formatter disabled and we already had changes in perf tool to take care of the formatter status.
The below hack helped.
diff --git a/decoder/source/ocsd_dcd_tree.cpp b/decoder/source/ocsd_dcd_tree.cpp index be15e36..0210dec 100644 --- a/decoder/source/ocsd_dcd_tree.cpp +++ b/decoder/source/ocsd_dcd_tree.cpp @@ -401,7 +401,7 @@ ocsd_err_t DecodeTree::createDecoder(const std::string &decoderName, const int c int crtFlags = createFlags;
uint8_t CSID = 0; // default for single stream decoder (no deformatter) - we ignore the ID - if(usingFormatter()) + //if(usingFormatter()) { CSID = pConfig->getTraceID(); crtFlags |= OCSD_CREATE_FLG_INST_ID;
Not sure if this is the right fix though.
This is how i tested,
1. # taskset 0x2 ./perf record -e cs_etm//u -F 10 --per-thread ping -c 30 127.0.0.1
2. # Ctrl-Z // Put the process in background
3. # taskset -p 0x4 <pid of ping process> // Move the ping process to core 2
4. # fg // Get the process to foreground
5. ./perf report snip ...
# Samples: 66K of event 'branches:uH' # Event count (approx.): 66953 # # Children Self Command Shared Object Symbol # ........ ........ ....... ..................... ........................................ # 15.94% 15.94% ping ld-2.31.so [.] _dl_lookup_symbol_x 14.93% 14.93% ping ld-2.31.so [.] do_lookup_x 10.68% 10.68% ping libc-2.31.so [.] _dl_addr 9.87% 9.87% ping ld-2.31.so [.] _dl_relocate_object 6.75% 6.75% ping ld-2.31.so [.] strcmp 3.62% 3.62% ping ld-2.31.so [.] check_match 2.72% 2.72% ping libc-2.31.so [.] __vfprintf_internal 1.90% 1.90% ping libc-2.31.so [.] _int_malloc 1.29% 1.29% ping libc-2.31.so [.] getenv 1.28% 1.28% ping libc-2.31.so [.] strcmp 1.17% 1.17% ping libc-2.31.so [.] _IO_file_xsputn@@GLIBC_2.17 1.16% 1.16% ping ld-2.31.so [.] _dl_name_match_p
snip ...
Also i could verify using prints in the tmc-etr-driver that the trace buffer gets reused across cores as well.
Also, what is
perf --version ?
perf version 5.9.0-rc5
# To display the perf.data header info, please use --header/--header-only option
============================================================================
Appreciate your help on getting some debug hints on what is going wrong.
One strange thing noted here is sys_perf_event_open, passing cpu = -1 and pid = -1, which doesnt appear to be valid as per tools/perf/design.txt
I see that on my Juno, but it still works. I believe that is for the generic PMU (pmu.type == 1) and not the coresight PMU, which I believe is (type == 8) in your case (the first event).
Suzuki