6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
[ Upstream commit 7b6dd7a923281a7ccb980a0f768d6926721eb3cc ]
Perf event names aren't case sensitive. For sysfs events the entire directory of events is read then iterated comparing names in a case insensitive way, most often to see if an event is present.
Consider:
$ perf stat -e inst_retired.any true
The event inst_retired.any may be present in any PMU, so every PMU's sysfs events are loaded and then searched with strcasecmp to see if any match. This event is only present on the cpu PMU as a JSON event so a lot of events were loaded from sysfs unnecessarily just to prove an event didn't exist there.
This change avoids loading all the events by assuming sysfs event names are always either lower or uppercase. It uses file exists and only loads the events when the desired event is present.
For the example above, the number of openat calls measured by 'perf trace' on a tigerlake laptop goes from 325 down to 255. The reduction will be larger for machines with many PMUs, particularly replicated uncore PMUs.
Ensure pmu_aliases_parse() is called before all uses of the aliases list, but remove some "pmu->sysfs_aliases_loaded" tests as they are now part of the function.
Reviewed-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Ian Rogers irogers@google.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Bjorn Helgaas bhelgaas@google.com Cc: Ingo Molnar mingo@redhat.com Cc: James Clark james.clark@arm.com Cc: Jing Zhang renyu.zj@linux.alibaba.com Cc: Jiri Olsa jolsa@kernel.org Cc: Jonathan Corbet corbet@lwn.net Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Randy Dunlap rdunlap@infradead.org Cc: Ravi Bangoria ravi.bangoria@amd.com Cc: Thomas Richter tmricht@linux.ibm.com Link: https://lore.kernel.org/r/20240502213507.2339733-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Stable-dep-of: d9c5f5f94c2d ("perf pmu: Count sys and cpuid JSON events separately") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/pmu.c | 31 ++++++++++++++++++++++++++----- 1 file changed, 26 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index 0b1c380fce901..f767f43fd3c79 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -425,9 +425,30 @@ static struct perf_pmu_alias *perf_pmu__find_alias(struct perf_pmu *pmu, { struct perf_pmu_alias *alias;
- if (load && !pmu->sysfs_aliases_loaded) - pmu_aliases_parse(pmu); + if (load && !pmu->sysfs_aliases_loaded) { + bool has_sysfs_event; + char event_file_name[FILENAME_MAX + 8];
+ /* + * Test if alias/event 'name' exists in the PMU's sysfs/events + * directory. If not skip parsing the sysfs aliases. Sysfs event + * name must be all lower or all upper case. + */ + scnprintf(event_file_name, sizeof(event_file_name), "events/%s", name); + for (size_t i = 7, n = 7 + strlen(name); i < n; i++) + event_file_name[i] = tolower(event_file_name[i]); + + has_sysfs_event = perf_pmu__file_exists(pmu, event_file_name); + if (!has_sysfs_event) { + for (size_t i = 7, n = 7 + strlen(name); i < n; i++) + event_file_name[i] = toupper(event_file_name[i]); + + has_sysfs_event = perf_pmu__file_exists(pmu, event_file_name); + } + if (has_sysfs_event) + pmu_aliases_parse(pmu); + + } list_for_each_entry(alias, &pmu->aliases, list) { if (!strcasecmp(alias->name, name)) return alias; @@ -1627,9 +1648,7 @@ size_t perf_pmu__num_events(struct perf_pmu *pmu) { size_t nr;
- if (!pmu->sysfs_aliases_loaded) - pmu_aliases_parse(pmu); - + pmu_aliases_parse(pmu); nr = pmu->sysfs_aliases;
if (pmu->cpu_aliases_added) @@ -1688,6 +1707,7 @@ int perf_pmu__for_each_event(struct perf_pmu *pmu, bool skip_duplicate_pmus, struct strbuf sb;
strbuf_init(&sb, /*hint=*/ 0); + pmu_aliases_parse(pmu); pmu_add_cpu_aliases(pmu); list_for_each_entry(event, &pmu->aliases, list) { size_t buf_used; @@ -2089,6 +2109,7 @@ const char *perf_pmu__name_from_config(struct perf_pmu *pmu, u64 config) if (!pmu) return NULL;
+ pmu_aliases_parse(pmu); pmu_add_cpu_aliases(pmu); list_for_each_entry(event, &pmu->aliases, list) { struct perf_event_attr attr = {.config = 0,};