Hello,
I've found an issue that I believe is regression in the user space part of perf. I'm trying to use CoreSight with Linux 5.5.0-rc7 with perf application built from the same source. Previously, I have been using 5.1-rc4 for a couple of months successfully. The platform I use is Xilinx Zynq Ultrascale+, but I believe is not related to the problem. I managed to find a hacky way to fix the issue, but I don't fully understand the nature of the problem, so I decided to share my findings here with hope I can gain more knowledge or attract attention of somebody who already knows how to fix it.
I've noticed it's not possible to start tracing session with perf-record anymore: root@zynq:/# /linux/tools/perf/perf record -e cs_etm/@tmc_etf0/u --per-thread true failed to set sink "" on event cs_etm/@tmc_etf0/u with 21 (Is a directory)
Sink tmc_etf0 is present in the system.: root@zynq:/# ls -l /sys/bus/event_source/devices/cs_etm/sinks/ total 0 -r--r--r-- 1 root root 4096 Jan 27 14:31 tmc_etf0 -r--r--r-- 1 root root 4096 Jan 27 17:00 tmc_etf1 -r--r--r-- 1 root root 4096 Jan 27 13:24 tmc_etr0 -r--r--r-- 1 root root 4096 Jan 27 17:00 tpiu0
Regardless the sink string I provide, the error is always the same, always with the empty quote and "Is a directory" errno: root@zynq:/# /linux/tools/perf/perf record -e cs_etm/@this_obviously_does_not_exist/u --per-thread true failed to set sink "" on event cs_etm/@this_obviously_does_not_exist/u with 21 (Is a directory)
I discovered that the error is generated by this function in tools/perf/arch/arm/util/cs-etm.c: static int cs_etm_set_sink_attr(struct perf_pmu *pmu, struct evsel *evsel) { char msg[BUFSIZ], path[PATH_MAX], *sink; struct perf_evsel_config_term *term; int ret = -EINVAL; u32 hash;
if (evsel->core.attr.config2 & GENMASK(31, 0)) return 0;
list_for_each_entry(term, &evsel->config_terms, list) { if (term->type != PERF_EVSEL__CONFIG_TERM_DRV_CFG) continue;
sink = term->val.drv_cfg; snprintf(path, PATH_MAX, "sinks/%s", sink);
ret = perf_pmu__scan_file(pmu, path, "%x", &hash); if (ret != 1) { pr_err("failed to set sink "%s" on event %s with %d (%s)\n", sink, perf_evsel__name(evsel), errno, str_error_r(errno, msg, sizeof(msg))); return ret; }
evsel->core.attr.config2 |= hash; return 0; }
The problem occurs because perf_pmu__scan_file() cannot open the file path built by snprintf(). Specifically, char* sink points to an empty string, resulting in sink path leading to the sinks/ directory in sysfs, skipping the file name of the sink. With git-bisect I narrowed the case down to the specific commit that might have introduced the problem: commit 1dc925568f015edfdbb89e20ad41755bb70538b9 Author: Ian Rogers irogers@google.com Date: Wed Oct 30 15:34:47 2019 -0700
perf parse: Add a deep delete for parse event terms
Add a parse_events_term deep delete function so that owned strings and arrays are freed.
After I revert this commit, tracing works properly. What the commit does, is to improve memory freeing of perf configuration. Specifically, it adds the following function that frees strings associated with specific config terms: +void parse_events_term__delete(struct parse_events_term *term) +{ + if (term->array.nr_ranges) + zfree(&term->array.ranges); + + if (term->type_val != PARSE_EVENTS__TERM_TYPE_NUM) + zfree(&term->val.str); + + zfree(&term->config); + free(term); +} Since the zfree(&term->val.str) part didn't exist prior to this commit, the cs_etm_set_sink_attr() function quoted above was able to determine proper sink file in sysfs. However, I analyzed the code further and I see nothing wrong with this part, as there is no other part that frees the string allocated memory. I believe then, that this commit revealed the already existing problem, rather than introduced it.
I also noticed, that, before the CoreSight-specific code is looking for the sysfs sink file, terms objectsare duplicated along with their associated strings (in pmu.c:perf_pmu__check_alias()). I failed to understand what is the purpose of this behavior, however at later stage, still before CS-specific part, both the original lists of terms, the original and the duplicated one, are freed by this snippet in parse-events.y: parse_events_terms__delete($2); parse_events_terms__delete(orig_terms); free($1); $$ = list; #undef CLEANUP_YYABORT
By analyzing pointers in gdb, I determined that the first parse_events_terms__delete() in this snippet frees the structure that is later retrieved by cs_etm_set_sink_attr(). In fact, instead of reverting the previously mentioned commit, removing the `parse_events_terms__delete($2);` makes perf-record work again. This line was introduced by: commit 4a35a9027f64d588d2fd9436dda4126e8d5647d7 Author: Arnaldo Carvalho de Melo acme@redhat.com Date: Mon May 7 15:27:01 2018 -0300
Revert "perf pmu: Fix pmu events parsing rule"
At this point I'm not sure what actually is the nature of the problem: 1. It may be a problem with cs_etm_set_sink_attr(), as it shouldn't use the config_term list but obtain the name in a different manner. I investigated this path, but failed to find a smart way to do this other that to parse the string pointer by perf_evsel__name(evsel). 2. It may be a problem with the superfluous free in the parse-events.y file, and the removal of the first _delete() actually solves the issue. 3. The root cause is actually something else and I'm digging in the wrong direction.
I'll be grateful for any suggestions that may help to understand and solve this issue.
Best regards, Wojciech