Hi Ilpo,
On 7/14/2023 3:35 AM, Ilpo Järvinen wrote:
On Thu, 13 Jul 2023, Reinette Chatre wrote:
On 7/13/2023 6:19 AM, Ilpo Järvinen wrote:
Perf event fd (fd_lm) is not closed on some error paths.
Always close fd_lm in get_llc_perf() and add close into an error handling block in cat_val().
Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 tools/testing/selftests/resctrl/cache.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 8a4fe8693be6..ced47b445d1e 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
 static int get_llc_perf(unsigned long *llc_perf_miss)
 {
 	__u64 total_misses;
+	int ret;
 
 	/* Stop counters after one span to get miss rate */
 
 	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
 
-	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
+	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
+	close(fd_lm);
+	if (ret == -1) {
 		perror("Could not get llc misses through perf");
-
 		return -1;
 	}
 
 	total_misses = rf_cqm.values[0].value;
-
-	close(fd_lm);
-
 	*llc_perf_miss = total_misses;
 
 	return 0;
@@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
 					 memflush, operation, resctrl_val)) {
 				fprintf(stderr, "Error-running fill buffer\n");
 				ret = -1;
+				close(fd_lm);
 				break;
 			}
Instead of fixing these existing patterns, I think the code would be easier to understand and maintain if it were made symmetrical. Having the perf event fd opened in one place but its close() scattered elsewhere invites confusion and makes later mistakes easy to miss.
What if perf event fd is closed in a new "disable_llc_perf()" that is matched with "reset_enable_llc_perf()" and called from cat_val()?
I think this raises another issue with the test trickery where measure_cache_vals() has some assumptions about state based on the test name.
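The open/close pairing suggested above (a "disable_llc_perf()" matched with "reset_enable_llc_perf()") could be sketched roughly as follows. This is an illustration only, not code from the series: reset_enable_llc_perf_sketch() is a hypothetical stand-in that opens /dev/null instead of calling perf_event_open(), so that it runs without perf privileges, and the body of disable_llc_perf() is an assumption about what such a helper might do.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-in for the perf event fd; -1 means "not open". */
static int fd_lm = -1;

/*
 * Hypothetical stand-in for reset_enable_llc_perf(). The real function
 * sets up the event with perf_event_open(); /dev/null is opened here
 * only so that the sketch runs anywhere.
 */
static int reset_enable_llc_perf_sketch(void)
{
	fd_lm = open("/dev/null", O_RDONLY);
	if (fd_lm == -1) {
		perror("open");
		return -1;
	}
	return 0;
}

/*
 * Hypothetical disable_llc_perf(): the single place where the fd is
 * closed. Calling it when nothing is open is a harmless no-op.
 */
static void disable_llc_perf(void)
{
	if (fd_lm == -1)
		return;
	close(fd_lm);
	fd_lm = -1;
}
```

With this shape the caller pairs one reset_enable_llc_perf_sketch() with one disable_llc_perf(), and a second disable_llc_perf() on an error path cannot close an unrelated fd.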
I very much agree with the principle here, and I have already created patches which do a major cleanup in this area. The cleaned-up code makes pe_fd a local variable of cat_val() and handles closing it in cat_val() with the usual patterns.
However, that patch currently sits after the L3 CAT test rewrite. Backporting the cleanups/refactors into this series would require considerable effort because of how convoluted all those n-step cleanup patches and the L3 CAT test rewrite are in this area. There's just a lot to clean up here, and the L3 rewrite touches the same areas, so it's a net full of conflicts.
Do you want me to spend the effort to backport them into this series (I expect it will take some time)?
Considering the "Fixes" tag, having a smaller fix that can easily be backported would be ideal so I am ok with deferring a bigger rework.
I do think this fix can be made more robust with a couple of small changes that should not introduce significant conflicts:
* initialize fd_lm to -1
* do not close() fd_lm in get_llc_perf(); instead, move its close() to the exit of cat_val()
* add a check in get_llc_perf() so that it does not attempt ioctl() when "fd_lm == -1" (a later addition would be error checking of the ioctl())
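A minimal sketch of how the three changes listed above might fit together is shown below. It is not the actual patch: get_llc_perf_sketch() and cat_val_sketch() are simplified, hypothetical versions of the real functions, /dev/zero stands in for the perf event fd so the code runs without perf privileges, and the fail_fill_buf parameter exists only to exercise the error path.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int fd_lm = -1;	/* -1 marks "no perf event fd open" */

/* Simplified get_llc_perf(): reads one value and never closes fd_lm. */
static int get_llc_perf_sketch(unsigned long *llc_perf_miss)
{
	unsigned long value;

	if (fd_lm == -1) {
		fprintf(stderr, "perf event fd is not open\n");
		return -1;
	}

	/* The real code would ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0) here. */
	if (read(fd_lm, &value, sizeof(value)) != (ssize_t)sizeof(value)) {
		perror("Could not get llc misses through perf");
		return -1;
	}

	*llc_perf_miss = value;
	return 0;
}

/* Simplified cat_val(): every path, error or not, leaves through "out". */
static int cat_val_sketch(int fail_fill_buf)
{
	unsigned long misses;
	int ret;

	/* Stand-in for perf_event_open(); /dev/zero always reads as zeroes. */
	fd_lm = open("/dev/zero", O_RDONLY);
	if (fd_lm == -1)
		return -1;

	if (fail_fill_buf) {
		fprintf(stderr, "Error-running fill buffer\n");
		ret = -1;
		goto out;
	}

	ret = get_llc_perf_sketch(&misses);

out:
	close(fd_lm);
	fd_lm = -1;
	return ret;
}
```

Because the close() lives only at the exit of cat_val_sketch(), new error paths added later just need to jump to "out", and the guard in get_llc_perf_sketch() turns a call with no open fd into a reported error.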
I currently have these items pending besides this series (in order):
- L3 CAT test rewrite and its preparatory patches
- More cleanups (including the pe_fd cleanup)
- New generalized test framework
- L2 CAT test
Thank you very much for taking this on.
Reinette