Non-contiguous CBM support for Intel CAT has been merged into the kernel with Commit 0e3cd31f6e90 ("x86/resctrl: Enable non-contiguous CBMs in Intel CAT") but there is no selftest that would validate if this feature works correctly. The selftest needs to verify if writing non-contiguous CBMs to the schemata file behaves as expected in comparison to the information about non-contiguous CBMs support.
The patch series is based on a rework of resctrl selftests that's currently in review [1]. The patch also implements a similiar functionality presented in the bash script included in the cover letter of the original non-contiguous CBMs in Intel CAT series [2].
Changelog v2: - Rebase onto v3 of [1] series. - Add two patches that prepare helpers for the new test. - Move Ilpo's patch that adds test grouping to this series. - Apply Ilpo's suggestion to the patch that adds a new test.
[1] https://lore.kernel.org/all/20231211121826.14392-1-ilpo.jarvinen@linux.intel... [2] https://lore.kernel.org/all/cover.1696934091.git.maciej.wieczor-retman@intel...
Ilpo Järvinen (1): selftests/resctrl: Add test groups and name L3 CAT test L3_CAT
Maciej Wieczor-Retman (3): selftests/resctrl: Add helpers for the non-contiguous test selftests/resctrl: Split validate_resctrl_feature_request() selftests/resctrl: Add non-contiguous CBMs CAT test
tools/testing/selftests/resctrl/cat_test.c | 80 ++++++++++++++++- tools/testing/selftests/resctrl/cmt_test.c | 4 +- tools/testing/selftests/resctrl/mba_test.c | 5 +- tools/testing/selftests/resctrl/mbm_test.c | 6 +- tools/testing/selftests/resctrl/resctrl.h | 12 ++- .../testing/selftests/resctrl/resctrl_tests.c | 18 ++-- tools/testing/selftests/resctrl/resctrlfs.c | 86 ++++++++++++++++--- 7 files changed, 185 insertions(+), 26 deletions(-)
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
To select test to run -t parameter can be used. However, -t cat currently maps to L3 CAT test which is confusing after more CAT related tests are added.
Allow selecting tests as groups and call L3 CAT test "L3_CAT", "CAT" group will enable all CAT related tests.
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com --- Changelog v2: - Move this patch from Ilpo's series here (Ilpo).
tools/testing/selftests/resctrl/cat_test.c | 3 ++- tools/testing/selftests/resctrl/resctrl.h | 2 ++ tools/testing/selftests/resctrl/resctrl_tests.c | 16 +++++++++++----- 3 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 24af8310288a..39fc9303b8e8 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -295,7 +295,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param }
struct resctrl_test l3_cat_test = { - .name = "CAT", + .name = "L3_CAT", + .group = "CAT", .resource = "L3", .feature_check = test_resource_feature_check, .run_test = cat_run_test, diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index c54efcf1412a..739e16d08a7b 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -65,6 +65,7 @@ struct user_params { /* * resctrl_test: resctrl test definition * @name: Test name + * @group: Test group (e.g., L2 and L3 CAT test belong to CAT group), can be NULL * @resource: Resource to test (e.g., MB, L3, L2, etc.) * @vendor_specific: Bitmask for vendor-specific tests (can be 0 for universal tests) * @disabled: Test is disabled @@ -73,6 +74,7 @@ struct user_params { */ struct resctrl_test { const char *name; + const char *group; const char *resource; unsigned int vendor_specific; bool disabled; diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 75fc49ba3efb..3044179ee6e9 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -65,11 +65,15 @@ static void cmd_help(void) printf("usage: resctrl_tests [-h] [-t test list] [-n no_of_bits] [-b benchmark_cmd [option]...]\n"); printf("\t-b benchmark_cmd [option]...: run specified benchmark for MBM, MBA and CMT\n"); printf("\t default benchmark is builtin fill_buf\n"); - printf("\t-t test list: run tests specified in the test list, "); + printf("\t-t test list: run tests/groups specified by the list, "); printf("e.g. -t mbm,mba,cmt,cat\n"); - printf("\t\tSupported tests:\n"); - for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++) - printf("\t\t\t%s\n", resctrl_tests[i]->name); + printf("\t\tSupported tests (group):\n"); + for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++) { + if (resctrl_tests[i]->group) + printf("\t\t\t%s (%s)\n", resctrl_tests[i]->name, resctrl_tests[i]->group); + else + printf("\t\t\t%s\n", resctrl_tests[i]->name); + } printf("\t-n no_of_bits: run cache tests using specified no of bits in cache bit mask\n"); printf("\t-p cpu_no: specify CPU number to run the test. 1 is default\n"); printf("\t-h: help\n"); @@ -199,7 +203,9 @@ int main(int argc, char **argv) bool found = false;
for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++) { - if (!strcasecmp(token, resctrl_tests[i]->name)) { + if (!strcasecmp(token, resctrl_tests[i]->name) || + (resctrl_tests[i]->group && + !strcasecmp(token, resctrl_tests[i]->group))) { if (resctrl_tests[i]->disabled) tests++; resctrl_tests[i]->disabled = false;
Hi Maciej,
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
To select test to run -t parameter can be used. However, -t cat currently maps to L3 CAT test which is confusing after more CAT related tests are added.
Allow selecting tests as groups and call L3 CAT test "L3_CAT", "CAT" group will enable all CAT related tests.
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v2:
- Move this patch from Ilpo's series here (Ilpo).
tools/testing/selftests/resctrl/cat_test.c | 3 ++- tools/testing/selftests/resctrl/resctrl.h | 2 ++ tools/testing/selftests/resctrl/resctrl_tests.c | 16 +++++++++++----- 3 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 24af8310288a..39fc9303b8e8 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -295,7 +295,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param } struct resctrl_test l3_cat_test = {
- .name = "CAT",
- .name = "L3_CAT",
- .group = "CAT", .resource = "L3", .feature_check = test_resource_feature_check, .run_test = cat_run_test,
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index c54efcf1412a..739e16d08a7b 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -65,6 +65,7 @@ struct user_params { /*
- resctrl_test: resctrl test definition
- @name: Test name
- @group: Test group (e.g., L2 and L3 CAT test belong to CAT group), can be NULL
Could you please remove references to L2 CAT test that is not available yet? A detailed description of what a test group is will be helpful.
Reinette
Hi,
On 2024-01-08 at 14:31:50 -0800, Reinette Chatre wrote:
Hi Maciej,
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
To select test to run -t parameter can be used. However, -t cat currently maps to L3 CAT test which is confusing after more CAT related tests are added.
Allow selecting tests as groups and call L3 CAT test "L3_CAT", "CAT" group will enable all CAT related tests.
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v2:
- Move this patch from Ilpo's series here (Ilpo).
tools/testing/selftests/resctrl/cat_test.c | 3 ++- tools/testing/selftests/resctrl/resctrl.h | 2 ++ tools/testing/selftests/resctrl/resctrl_tests.c | 16 +++++++++++----- 3 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 24af8310288a..39fc9303b8e8 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -295,7 +295,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param } struct resctrl_test l3_cat_test = {
- .name = "CAT",
- .name = "L3_CAT",
- .group = "CAT", .resource = "L3", .feature_check = test_resource_feature_check, .run_test = cat_run_test,
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index c54efcf1412a..739e16d08a7b 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -65,6 +65,7 @@ struct user_params { /*
- resctrl_test: resctrl test definition
- @name: Test name
- @group: Test group (e.g., L2 and L3 CAT test belong to CAT group), can be NULL
Could you please remove references to L2 CAT test that is not available yet? A detailed description of what a test group is will be helpful.
Sure, thanks for catching this!
Reinette
The CAT non-contiguous selftests have to read the file responsible for reporting support of non-contiguous CBMs in Intel CAT. Then the test compares if that information matches what is reported by CPUID output.
Add a generic helper function to read a chosen functionality support information.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com --- Changelog v2: - Added this patch.
tools/testing/selftests/resctrl/resctrl.h | 1 + tools/testing/selftests/resctrl/resctrlfs.c | 25 +++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 739e16d08a7b..8f72d94b9cbe 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -161,6 +161,7 @@ unsigned int count_contiguous_bits(unsigned long val, unsigned int *start); int get_full_cbm(const char *cache_type, unsigned long *mask); int get_mask_no_shareable(const char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size); +int read_info_res_file(const char *resource, const char *filename); void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); void signal_handler_unregister(void); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 0e97036a64b8..70333440ff2f 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -249,6 +249,31 @@ static int get_bit_mask(const char *filename, unsigned long *mask) return 0; }
+int read_info_res_file(const char *resource, const char *filename) +{ + char file_path[PATH_MAX]; + FILE *fp; + int ret; + + snprintf(file_path, sizeof(file_path), "%s/%s/%s", INFO_PATH, resource, + filename); + + fp = fopen(file_path, "r"); + if (!fp) { + perror("Error in opening sparse_masks file\n"); + return -1; + } + + if (fscanf(fp, "%u", &ret) <= 0) { + perror("Could not get sparse_masks contents\n"); + fclose(fp); + return -1; + } + + fclose(fp); + return ret; +} + /* * create_bit_mask- Create bit mask from start, len pair * @start: LSB of the mask
Hi Maciej,
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
The CAT non-contiguous selftests have to read the file responsible for reporting support of non-contiguous CBMs in Intel CAT. Then the test
"in Intel CAT" -> "in kernel (resctrl)"
compares if that information matches what is reported by CPUID output.
Add a generic helper function to read a chosen functionality support information.
Since this is a generic function that just reads a value from a file it cannot be assumed that the value represents functionality support.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v2:
- Added this patch.
tools/testing/selftests/resctrl/resctrl.h | 1 + tools/testing/selftests/resctrl/resctrlfs.c | 25 +++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 739e16d08a7b..8f72d94b9cbe 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -161,6 +161,7 @@ unsigned int count_contiguous_bits(unsigned long val, unsigned int *start); int get_full_cbm(const char *cache_type, unsigned long *mask); int get_mask_no_shareable(const char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size); +int read_info_res_file(const char *resource, const char *filename); void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); void signal_handler_unregister(void); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 0e97036a64b8..70333440ff2f 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -249,6 +249,31 @@ static int get_bit_mask(const char *filename, unsigned long *mask) return 0; } +int read_info_res_file(const char *resource, const char *filename)
Considering that this is intended to be a new generic utility, could you please add some function documentation?
+{
- char file_path[PATH_MAX];
- FILE *fp;
- int ret;
- snprintf(file_path, sizeof(file_path), "%s/%s/%s", INFO_PATH, resource,
filename);
- fp = fopen(file_path, "r");
- if (!fp) {
perror("Error in opening sparse_masks file\n");
The error messages do not match the goal of this function to be generic. Also, please note the recent cleanup done by Ilpo to replace the perror() by ksft_perror().
return -1;
- }
- if (fscanf(fp, "%u", &ret) <= 0) {
I find this to be potentially confusing. The function claims to be a generic utility to read a value from a resctrl file ... but hidden within is that the value is required to be unsigned, which is then cast into an int. This could be made more specific and robust with something like below: int resource_info_unsigned_get(const char *resource, const char *filename, unsigned int *val)
The return value will be the result of the request. If resource_info_unsigned_get() returns 0 then @val will contain the value read.
perror("Could not get sparse_masks contents\n");
fclose(fp);
return -1;
- }
- fclose(fp);
- return ret;
+}
/*
- create_bit_mask- Create bit mask from start, len pair
- @start: LSB of the mask
Reinette
Hi!
On 2024-01-08 at 14:36:36 -0800, Reinette Chatre wrote:
Hi Maciej,
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
The CAT non-contiguous selftests have to read the file responsible for reporting support of non-contiguous CBMs in Intel CAT. Then the test
"in Intel CAT" -> "in kernel (resctrl)"
Sure, will fix.
compares if that information matches what is reported by CPUID output.
Add a generic helper function to read a chosen functionality support information.
Since this is a generic function that just reads a value from a file it cannot be assumed that the value represents functionality support.
Right, I'll rewrite this.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v2:
- Added this patch.
tools/testing/selftests/resctrl/resctrl.h | 1 + tools/testing/selftests/resctrl/resctrlfs.c | 25 +++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 739e16d08a7b..8f72d94b9cbe 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -161,6 +161,7 @@ unsigned int count_contiguous_bits(unsigned long val, unsigned int *start); int get_full_cbm(const char *cache_type, unsigned long *mask); int get_mask_no_shareable(const char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size); +int read_info_res_file(const char *resource, const char *filename); void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); void signal_handler_unregister(void); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 0e97036a64b8..70333440ff2f 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -249,6 +249,31 @@ static int get_bit_mask(const char *filename, unsigned long *mask) return 0; } +int read_info_res_file(const char *resource, const char *filename)
Considering that this is intended to be a new generic utility, could you please add some function documentation?
Sure
+{
- char file_path[PATH_MAX];
- FILE *fp;
- int ret;
- snprintf(file_path, sizeof(file_path), "%s/%s/%s", INFO_PATH, resource,
filename);
- fp = fopen(file_path, "r");
- if (!fp) {
perror("Error in opening sparse_masks file\n");
The error messages do not match the goal of this function to be generic. Also, please note the recent cleanup done by Ilpo to replace the perror() by ksft_perror().
Thanks for catching this, will fix.
return -1;
- }
- if (fscanf(fp, "%u", &ret) <= 0) {
I find this to be potentially confusing. The function claims to be a generic utility to read a value from a resctrl file ... but hidden within is that the value is required to be unsigned, which is then cast into an int. This could be made more specific and robust with something like below: int resource_info_unsigned_get(const char *resource, const char *filename, unsigned int *val)
The return value will be the result of the request. If resource_info_unsigned_get() returns 0 then @val will contain the value read.
Right, that might be confusing. Will fix according to your comment, thanks!
perror("Could not get sparse_masks contents\n");
fclose(fp);
return -1;
- }
- fclose(fp);
- return ret;
+}
/*
- create_bit_mask- Create bit mask from start, len pair
- @start: LSB of the mask
Reinette
validate_resctrl_feature_request() is used to test both if a resource is present in the info directory, and if a passed monitoring feature is present in the mon_features file. There exists a different way to represent feature support and that is by the presence of 0 or 1 in single file in the info/resource directory. In this case the filename represents what feature support is being indicated.
Split validate_resctrl_feature_request() into three smaller functions that each accomplish one check: - Resource directory presence in the /sys/fs/resctrl/info directory. - Feature name presence in the /sys/fs/resctrl/info/RESOURCE/mon_features file. - Feature file presence in a given /sys/fs/resctrl/info/RESOURCE directory.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com --- Changelog v2: - Added this patch.
tools/testing/selftests/resctrl/cat_test.c | 2 - tools/testing/selftests/resctrl/cmt_test.c | 4 +- tools/testing/selftests/resctrl/mba_test.c | 5 +- tools/testing/selftests/resctrl/mbm_test.c | 6 +-- tools/testing/selftests/resctrl/resctrl.h | 6 ++- tools/testing/selftests/resctrl/resctrlfs.c | 59 +++++++++++++++++---- 6 files changed, 63 insertions(+), 19 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 39fc9303b8e8..7dc7206b3b99 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -1,9 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* * Cache Allocation Technology (CAT) test - * * Copyright (C) 2018 Intel Corporation - * * Authors: * Sai Praneeth Prakhya sai.praneeth.prakhya@intel.com, * Fenghua Yu fenghua.yu@intel.com diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index dd5ca343c469..7b63aec8e2c4 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -169,8 +169,8 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param
static bool cmt_feature_check(const struct resctrl_test *test) { - return test_resource_feature_check(test) && - validate_resctrl_feature_request("L3_MON", "llc_occupancy"); + return resctrl_mon_feature_exists("L3_MON", "llc_occupancy") && + resctrl_resource_exists("L3"); }
struct resctrl_test cmt_test = { diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c index da256d2dbe5c..ecf1c186448d 100644 --- a/tools/testing/selftests/resctrl/mba_test.c +++ b/tools/testing/selftests/resctrl/mba_test.c @@ -170,8 +170,9 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param
static bool mba_feature_check(const struct resctrl_test *test) { - return test_resource_feature_check(test) && - validate_resctrl_feature_request("L3_MON", "mbm_local_bytes"); + return resctrl_resource_exists(test->resource) && + resctrl_mon_feature_exists("L3_MON", + "mbm_local_bytes"); }
struct resctrl_test mba_test = { diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c index 34879e7b71a0..d67ffa3ec63a 100644 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -97,7 +97,7 @@ static int mbm_setup(const struct resctrl_test *test, return END_OF_TESTS;
/* Set up shemata with 100% allocation on the first run. */ - if (p->num_of_runs == 0 && validate_resctrl_feature_request("MB", NULL)) + if (p->num_of_runs == 0 && resctrl_resource_exists("MB")) ret = write_schemata(p->ctrlgrp, "100", uparams->cpu, test->resource);
p->num_of_runs++; @@ -140,8 +140,8 @@ static int mbm_run_test(const struct resctrl_test *test, const struct user_param
static bool mbm_feature_check(const struct resctrl_test *test) { - return validate_resctrl_feature_request("L3_MON", "mbm_total_bytes") && - validate_resctrl_feature_request("L3_MON", "mbm_local_bytes"); + return resctrl_mon_feature_exists("L3_MON", "mbm_total_bytes") && + resctrl_mon_feature_exists("L3_MON", "mbm_local_bytes"); }
struct resctrl_test mbm_test = { diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 8f72d94b9cbe..74041a35d4ba 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -135,7 +135,11 @@ int get_domain_id(const char *resource, int cpu_no, int *domain_id); int mount_resctrlfs(void); int umount_resctrlfs(void); int validate_bw_report_request(char *bw_report); -bool validate_resctrl_feature_request(const char *resource, const char *feature); +bool resctrl_resource_exists(const char *resource); +bool resctrl_mon_feature_exists(const char *resource, + const char *feature); +bool resctrl_cache_feature_exists(const char *resource, + const char *feature); bool test_resource_feature_check(const struct resctrl_test *test); char *fgrep(FILE *inf, const char *str); int taskset_benchmark(pid_t bm_pid, int cpu_no, cpu_set_t *old_affinity); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 70333440ff2f..8546421f0940 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -697,20 +697,16 @@ char *fgrep(FILE *inf, const char *str) }
/* - * validate_resctrl_feature_request - Check if requested feature is valid. - * @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.) - * @feature: Required monitor feature (in mon_features file). Can only be - * set for L3_MON. Must be NULL for all other resources. + * resctrl_resource_exists - Check if a resource is supported. + * @resource: Resctrl resource (e.g., MB, L3, L2, L3_MON, etc.) * - * Return: True if the resource/feature is supported, else false. False is + * Return: True if the resource is supported, else false. False is * also returned if resctrl FS is not mounted. */ -bool validate_resctrl_feature_request(const char *resource, const char *feature) +bool resctrl_resource_exists(const char *resource) { char res_path[PATH_MAX]; struct stat statbuf; - char *res; - FILE *inf; int ret;
if (!resource) @@ -725,6 +721,25 @@ bool validate_resctrl_feature_request(const char *resource, const char *feature) if (stat(res_path, &statbuf)) return false;
+ return true; +} + +/* + * resctrl_mon_feature_exists - Check if requested feature is valid. + * @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.) + * @feature: Required monitor feature (in mon_features file). Can only be + * set for L3_MON. Must be NULL for all other resources. + * + * Return: True if the resource/feature is supported, else false. False is + * also returned if resctrl FS is not mounted. + */ +bool resctrl_mon_feature_exists(const char *resource, + const char *feature) +{ + char res_path[PATH_MAX]; + char *res; + FILE *inf; + if (!feature) return true;
@@ -740,9 +755,35 @@ bool validate_resctrl_feature_request(const char *resource, const char *feature) return !!res; }
+/* + * resctrl_cache_feature_exists - Check if a file that indicates a + * cache related feature support is present. + * @resource: Required cache resource (L3 or L2) + * @feature: Required cache feature. + * + * Return: True if the feature is supported, else false. + */ +bool resctrl_cache_feature_exists(const char *resource, + const char *feature) +{ + char res_path[PATH_MAX]; + struct stat statbuf; + + if (!feature) + return true; + + snprintf(res_path, sizeof(res_path), "%s/%s/%s", INFO_PATH, resource, + feature); + + if (stat(res_path, &statbuf)) + return false; + + return true; +} + bool test_resource_feature_check(const struct resctrl_test *test) { - return validate_resctrl_feature_request(test->resource, NULL); + return resctrl_resource_exists(test->resource); }
int filter_dmesg(void)
Hi Maciej,
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
validate_resctrl_feature_request() is used to test both if a resource is present in the info directory, and if a passed monitoring feature is present in the mon_features file. There exists a different way to represent feature support and that is by the presence of 0 or 1 in single file in the info/resource directory. In this case the filename represents what feature support is being indicated.
Split validate_resctrl_feature_request() into three smaller functions that each accomplish one check:
- Resource directory presence in the /sys/fs/resctrl/info directory.
- Feature name presence in the /sys/fs/resctrl/info/RESOURCE/mon_features file.
- Feature file presence in a given /sys/fs/resctrl/info/RESOURCE directory.
Please present refactoring of existing code and introduction of new feature as separate patches.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v2:
- Added this patch.
tools/testing/selftests/resctrl/cat_test.c | 2 - tools/testing/selftests/resctrl/cmt_test.c | 4 +- tools/testing/selftests/resctrl/mba_test.c | 5 +- tools/testing/selftests/resctrl/mbm_test.c | 6 +-- tools/testing/selftests/resctrl/resctrl.h | 6 ++- tools/testing/selftests/resctrl/resctrlfs.c | 59 +++++++++++++++++---- 6 files changed, 63 insertions(+), 19 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 39fc9303b8e8..7dc7206b3b99 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -1,9 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /*
- Cache Allocation Technology (CAT) test
- Copyright (C) 2018 Intel Corporation
- Authors:
- Sai Praneeth Prakhya sai.praneeth.prakhya@intel.com,
- Fenghua Yu fenghua.yu@intel.com
Some unrelated changes here.
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index dd5ca343c469..7b63aec8e2c4 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -169,8 +169,8 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param static bool cmt_feature_check(const struct resctrl_test *test) {
- return test_resource_feature_check(test) &&
validate_resctrl_feature_request("L3_MON", "llc_occupancy");
- return resctrl_mon_feature_exists("L3_MON", "llc_occupancy") &&
resctrl_resource_exists("L3");
} struct resctrl_test cmt_test = { diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c index da256d2dbe5c..ecf1c186448d 100644 --- a/tools/testing/selftests/resctrl/mba_test.c +++ b/tools/testing/selftests/resctrl/mba_test.c @@ -170,8 +170,9 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param static bool mba_feature_check(const struct resctrl_test *test) {
- return test_resource_feature_check(test) &&
validate_resctrl_feature_request("L3_MON", "mbm_local_bytes");
- return resctrl_resource_exists(test->resource) &&
resctrl_mon_feature_exists("L3_MON",
"mbm_local_bytes");
} struct resctrl_test mba_test = { diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c index 34879e7b71a0..d67ffa3ec63a 100644 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -97,7 +97,7 @@ static int mbm_setup(const struct resctrl_test *test, return END_OF_TESTS; /* Set up shemata with 100% allocation on the first run. */
- if (p->num_of_runs == 0 && validate_resctrl_feature_request("MB", NULL))
- if (p->num_of_runs == 0 && resctrl_resource_exists("MB")) ret = write_schemata(p->ctrlgrp, "100", uparams->cpu, test->resource);
p->num_of_runs++; @@ -140,8 +140,8 @@ static int mbm_run_test(const struct resctrl_test *test, const struct user_param static bool mbm_feature_check(const struct resctrl_test *test) {
- return validate_resctrl_feature_request("L3_MON", "mbm_total_bytes") &&
validate_resctrl_feature_request("L3_MON", "mbm_local_bytes");
- return resctrl_mon_feature_exists("L3_MON", "mbm_total_bytes") &&
resctrl_mon_feature_exists("L3_MON", "mbm_local_bytes");
} struct resctrl_test mbm_test = { diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 8f72d94b9cbe..74041a35d4ba 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -135,7 +135,11 @@ int get_domain_id(const char *resource, int cpu_no, int *domain_id); int mount_resctrlfs(void); int umount_resctrlfs(void); int validate_bw_report_request(char *bw_report); -bool validate_resctrl_feature_request(const char *resource, const char *feature); +bool resctrl_resource_exists(const char *resource); +bool resctrl_mon_feature_exists(const char *resource,
const char *feature);
+bool resctrl_cache_feature_exists(const char *resource,
const char *feature);
bool test_resource_feature_check(const struct resctrl_test *test); char *fgrep(FILE *inf, const char *str); int taskset_benchmark(pid_t bm_pid, int cpu_no, cpu_set_t *old_affinity); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 70333440ff2f..8546421f0940 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -697,20 +697,16 @@ char *fgrep(FILE *inf, const char *str) } /*
- validate_resctrl_feature_request - Check if requested feature is valid.
- @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.)
- @feature: Required monitor feature (in mon_features file). Can only be
set for L3_MON. Must be NULL for all other resources.
- resctrl_resource_exists - Check if a resource is supported.
- @resource: Resctrl resource (e.g., MB, L3, L2, L3_MON, etc.)
- Return: True if the resource/feature is supported, else false. False is
*/
- Return: True if the resource is supported, else false. False is
also returned if resctrl FS is not mounted.
-bool validate_resctrl_feature_request(const char *resource, const char *feature) +bool resctrl_resource_exists(const char *resource) { char res_path[PATH_MAX]; struct stat statbuf;
- char *res;
- FILE *inf; int ret;
if (!resource) @@ -725,6 +721,25 @@ bool validate_resctrl_feature_request(const char *resource, const char *feature) if (stat(res_path, &statbuf)) return false;
- return true;
+}
+/*
- resctrl_mon_feature_exists - Check if requested feature is valid.
- @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.)
If this is intended for the monitoring resource and L3_MON (per below) is the only valid resource then @resource cannot be all of the examples shown. Why is the @resource argument needed?
- @feature: Required monitor feature (in mon_features file). Can only be
set for L3_MON. Must be NULL for all other resources.
Which other resources?
- Return: True if the resource/feature is supported, else false. False is
also returned if resctrl FS is not mounted.
- */
+bool resctrl_mon_feature_exists(const char *resource,
const char *feature)
+{
- char res_path[PATH_MAX];
- char *res;
- FILE *inf;
- if (!feature) return true;
Doesn't this mean that resctrl_mon_feature_exists(NULL, NULL) will return true?
@@ -740,9 +755,35 @@ bool validate_resctrl_feature_request(const char *resource, const char *feature) return !!res; } +/*
- resctrl_cache_feature_exists - Check if a file that indicates a
- cache related feature support is present.
Seems like this is not really specific to a cache ... it can check for any info file related to any resource.
- @resource: Required cache resource (L3 or L2)
- @feature: Required cache feature.
This seems to assume some usage of this utility. What if it is, for example, resource_info_file_exists() or resource_info_file_readable()?
- Return: True if the feature is supported, else false.
- */
+bool resctrl_cache_feature_exists(const char *resource,
const char *feature)
+{
- char res_path[PATH_MAX];
- struct stat statbuf;
- if (!feature)
return true;
resctrl_cache_feature_exists(NULL, NULL) will return true, no?
- snprintf(res_path, sizeof(res_path), "%s/%s/%s", INFO_PATH, resource,
feature);
- if (stat(res_path, &statbuf))
return false;
I think it will be more robust to look at statbuf to learn if the file type is correct and the file is actually readable.
- return true;
+}
bool test_resource_feature_check(const struct resctrl_test *test) {
- return validate_resctrl_feature_request(test->resource, NULL);
- return resctrl_resource_exists(test->resource);
} int filter_dmesg(void)
Reinette
Hi!
On 2024-01-08 at 14:38:45 -0800, Reinette Chatre wrote:
Hi Maciej,
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
validate_resctrl_feature_request() is used to test both if a resource is present in the info directory, and if a passed monitoring feature is present in the mon_features file. There exists a different way to represent feature support and that is by the presence of 0 or 1 in single file in the info/resource directory. In this case the filename represents what feature support is being indicated.
Split validate_resctrl_feature_request() into three smaller functions that each accomplish one check:
- Resource directory presence in the /sys/fs/resctrl/info directory.
- Feature name presence in the /sys/fs/resctrl/info/RESOURCE/mon_features file.
- Feature file presence in a given /sys/fs/resctrl/info/RESOURCE directory.
Please present refactoring of existing code and introduction of new feature as separate patches.
Sure, will do.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v2:
- Added this patch.
tools/testing/selftests/resctrl/cat_test.c | 2 - tools/testing/selftests/resctrl/cmt_test.c | 4 +- tools/testing/selftests/resctrl/mba_test.c | 5 +- tools/testing/selftests/resctrl/mbm_test.c | 6 +-- tools/testing/selftests/resctrl/resctrl.h | 6 ++- tools/testing/selftests/resctrl/resctrlfs.c | 59 +++++++++++++++++---- 6 files changed, 63 insertions(+), 19 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 39fc9303b8e8..7dc7206b3b99 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -1,9 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /*
- Cache Allocation Technology (CAT) test
- Copyright (C) 2018 Intel Corporation
- Authors:
- Sai Praneeth Prakhya sai.praneeth.prakhya@intel.com,
- Fenghua Yu fenghua.yu@intel.com
Some unrelated changes here.
Oops, sorry, I'll remove these.
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 70333440ff2f..8546421f0940 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -725,6 +721,25 @@ bool validate_resctrl_feature_request(const char *resource, const char *feature) if (stat(res_path, &statbuf)) return false;
- return true;
+}
+/*
- resctrl_mon_feature_exists - Check if requested feature is valid.
- @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.)
If this is intended for the monitoring resource and L3_MON (per below) is the only valid resource then @resource cannot be all of the examples shown. Why is the @resource argument needed?
I see now that these functions turned out rather silly. I'll redo them according to all your other comments. Thanks for pointing these out!
- @feature: Required monitor feature (in mon_features file). Can only be
set for L3_MON. Must be NULL for all other resources.
Which other resources?
I'll remove it and just put L3_MON in the path.
- Return: True if the resource/feature is supported, else false. False is
also returned if resctrl FS is not mounted.
- */
+bool resctrl_mon_feature_exists(const char *resource,
const char *feature)
+{
- char res_path[PATH_MAX];
- char *res;
- FILE *inf;
- if (!feature) return true;
Doesn't this mean that resctrl_mon_feature_exists(NULL, NULL) will return true?
I'll get rid of this check.
@@ -740,9 +755,35 @@ bool validate_resctrl_feature_request(const char *resource, const char *feature) return !!res; } +/*
- resctrl_cache_feature_exists - Check if a file that indicates a
- cache related feature support is present.
Seems like this is not really specific to a cache ... it can check for any info file related to any resource.
Right, I'll rewrite this.
- @resource: Required cache resource (L3 or L2)
- @feature: Required cache feature.
This seems to assume some usage of this utility. What if it is, for example, resource_info_file_exists() or resource_info_file_readable()?
Yes, I think resource_info_file_exists() fits better.
- Return: True if the feature is supported, else false.
- */
+bool resctrl_cache_feature_exists(const char *resource,
const char *feature)
+{
- char res_path[PATH_MAX];
- struct stat statbuf;
- if (!feature)
return true;
resctrl_cache_feature_exists(NULL, NULL) will return true, no?
I'll remove this check.
- snprintf(res_path, sizeof(res_path), "%s/%s/%s", INFO_PATH, resource,
feature);
- if (stat(res_path, &statbuf))
return false;
I think it will be more robust to look at statbuf to learn if the file type is correct and the file is actually readable.
Could that file be unreadable or of wrong type?
Also I thought that since read_info_res_file() opens and reads that file any errors should be handled there. Shouldn't this part of the test only return whether the file is there or not since that indicates if something is supported in the kernel?
- return true;
+}
bool test_resource_feature_check(const struct resctrl_test *test) {
- return validate_resctrl_feature_request(test->resource, NULL);
- return resctrl_resource_exists(test->resource);
} int filter_dmesg(void)
Reinette
Hi Maciej,
On 1/17/2024 1:49 AM, Maciej Wieczór-Retman wrote:
On 2024-01-08 at 14:38:45 -0800, Reinette Chatre wrote:
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
- snprintf(res_path, sizeof(res_path), "%s/%s/%s", INFO_PATH, resource,
feature);
- if (stat(res_path, &statbuf))
return false;
I think it will be more robust to look at statbuf to learn if the file type is correct and the file is actually readable.
Could that file be unreadable or of wrong type?
It should be readable and the correct type when all goes well. Hence the term "more robust".
Also I thought that since read_info_res_file() opens and reads that file any errors should be handled there. Shouldn't this part of the test only return whether the file is there or not since that indicates if something is supported in the kernel?
ok.
Reinette
Add tests for both L2 and L3 CAT to verify the return values generated by writing non-contiguous CBMs don't contradict the reported non-contiguous support information.
Use a logical XOR to confirm return value of write_schemata() and non-contiguous CBMs support information match.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com --- Changelog v2: - Redo the patch message. (Ilpo) - Tidy up __cpuid_count calls. (Ilpo) - Remove redundant AND in noncont_mask calculations (Ilpo) - Fix bit_center offset. - Add newline before function return. (Ilpo) - Group non-contiguous tests with CAT tests. (Ilpo) - Use a helper for reading sparse_masks file. (Ilpo) - Make get_cache_level() available in other source files. (Ilpo)
tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++ tools/testing/selftests/resctrl/resctrl.h | 3 + .../testing/selftests/resctrl/resctrl_tests.c | 2 + tools/testing/selftests/resctrl/resctrlfs.c | 2 +- 4 files changed, 81 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 7dc7206b3b99..ecf553a89aae 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param return ret; }
+static int noncont_cat_run_test(const struct resctrl_test *test, + const struct user_params *uparams) +{ + unsigned long full_cache_mask, cont_mask, noncont_mask; + unsigned int eax, ebx, ecx, edx, ret; + int level, bit_center, sparse_masks; + char schemata[64]; + + /* Check to compare sparse_masks content to cpuid output. */ + sparse_masks = read_info_res_file(test->resource, "sparse_masks"); + if (sparse_masks < 0) + return sparse_masks; + + level = get_cache_level(test->resource); + if (level < 0) + return -EINVAL; + __cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx); + + if (sparse_masks != ((ecx >> 3) & 1)) + return -1; + + /* Write checks initialization. */ + ret = get_full_cbm(test->resource, &full_cache_mask); + if (ret < 0) + return ret; + bit_center = count_bits(full_cache_mask) / 2; + cont_mask = full_cache_mask >> bit_center; + + /* Contiguous mask write check. */ + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); + ret = write_schemata("", schemata, uparams->cpu, test->resource); + if (ret) + return ret; + + /* + * Non-contiguous mask write check. CBM has a 0xf hole approximately in the middle. + * Output is compared with support information to catch any edge case errors. + */ + noncont_mask = ~(0xf << (bit_center - 2)) & full_cache_mask; + snprintf(schemata, sizeof(schemata), "%lx", noncont_mask); + ret = write_schemata("", schemata, uparams->cpu, test->resource); + if (ret && sparse_masks) + ksft_print_msg("Non-contiguous CBMs supported but write failed\n"); + else if (ret && !sparse_masks) + ksft_print_msg("Non-contiguous CBMs not supported and write failed as expected\n"); + else if (!ret && !sparse_masks) + ksft_print_msg("Non-contiguous CBMs not supported but write succeeded\n"); + + return !ret == !sparse_masks; +} + +static bool noncont_cat_feature_check(const struct resctrl_test *test) +{ + if (!resctrl_resource_exists(test->resource)) + return false; + + return resctrl_cache_feature_exists(test->resource, "sparse_masks"); +} + struct resctrl_test l3_cat_test = { .name = "L3_CAT", .group = "CAT", @@ -299,3 +358,19 @@ struct resctrl_test l3_cat_test = { .feature_check = test_resource_feature_check, .run_test = cat_run_test, }; + +struct resctrl_test l3_noncont_cat_test = { + .name = "L3_NONCONT_CAT", + .group = "CAT", + .resource = "L3", + .feature_check = noncont_cat_feature_check, + .run_test = noncont_cat_run_test, +}; + +struct resctrl_test l2_noncont_cat_test = { + .name = "L2_NONCONT_CAT", + .group = "CAT", + .resource = "L2", + .feature_check = noncont_cat_feature_check, + .run_test = noncont_cat_run_test, +}; diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 74041a35d4ba..7b7a48d1fddd 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -165,6 +165,7 @@ unsigned int count_contiguous_bits(unsigned long val, unsigned int *start); int get_full_cbm(const char *cache_type, unsigned long *mask); int get_mask_no_shareable(const char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size); +int get_cache_level(const char *cache_type); int read_info_res_file(const char *resource, const char *filename); void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); @@ -201,5 +202,7 @@ extern struct resctrl_test mbm_test; extern struct resctrl_test mba_test; extern struct resctrl_test cmt_test; extern struct resctrl_test l3_cat_test; +extern struct resctrl_test l3_noncont_cat_test; +extern struct resctrl_test l2_noncont_cat_test;
#endif /* RESCTRL_H */ diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 3044179ee6e9..f3dc1b9696e7 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -19,6 +19,8 @@ static struct resctrl_test *resctrl_tests[] = { &mba_test, &cmt_test, &l3_cat_test, + &l3_noncont_cat_test, + &l2_noncont_cat_test, };
static int detect_vendor(void) diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 8546421f0940..8bd30973fec3 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -100,7 +100,7 @@ int umount_resctrlfs(void) * * Return: cache level as integer or -1 if @cache_type is invalid. */ -static int get_cache_level(const char *cache_type) +int get_cache_level(const char *cache_type) { if (!strcmp(cache_type, "L3")) return 3;
Hi Maciej,
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
Add tests for both L2 and L3 CAT to verify the return values generated by writing non-contiguous CBMs don't contradict the reported non-contiguous support information.
Use a logical XOR to confirm return value of write_schemata() and non-contiguous CBMs support information match.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v2:
- Redo the patch message. (Ilpo)
- Tidy up __cpuid_count calls. (Ilpo)
- Remove redundant AND in noncont_mask calculations (Ilpo)
- Fix bit_center offset.
- Add newline before function return. (Ilpo)
- Group non-contiguous tests with CAT tests. (Ilpo)
- Use a helper for reading sparse_masks file. (Ilpo)
- Make get_cache_level() available in other source files. (Ilpo)
tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++ tools/testing/selftests/resctrl/resctrl.h | 3 + .../testing/selftests/resctrl/resctrl_tests.c | 2 + tools/testing/selftests/resctrl/resctrlfs.c | 2 +- 4 files changed, 81 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 7dc7206b3b99..ecf553a89aae 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param return ret; } +static int noncont_cat_run_test(const struct resctrl_test *test,
const struct user_params *uparams)
+{
- unsigned long full_cache_mask, cont_mask, noncont_mask;
- unsigned int eax, ebx, ecx, edx, ret;
- int level, bit_center, sparse_masks;
- char schemata[64];
- /* Check to compare sparse_masks content to cpuid output. */
"cpuid" -> "CPUID" (to note it is an instruction)
- sparse_masks = read_info_res_file(test->resource, "sparse_masks");
- if (sparse_masks < 0)
return sparse_masks;
- level = get_cache_level(test->resource);
- if (level < 0)
return -EINVAL;
- __cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx);
Please do not invent relationships. Please replace the "4 - level" with specific index used that depends on particular cache. The cache level may not even be needed, just use the resource to determine the correct index.
- if (sparse_masks != ((ecx >> 3) & 1))
return -1;
Can a message be displayed to support the debugging this test failure?
- /* Write checks initialization. */
- ret = get_full_cbm(test->resource, &full_cache_mask);
- if (ret < 0)
return ret;
I assume this test failure relies on the error message from get_bit_mask() that is called via get_full_cbm()?
- bit_center = count_bits(full_cache_mask) / 2;
- cont_mask = full_cache_mask >> bit_center;
- /* Contiguous mask write check. */
- snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
- ret = write_schemata("", schemata, uparams->cpu, test->resource);
- if (ret)
return ret;
How will user know what failed? I am seeing this single test exercise a few scenarios and it is not obvious to me if the issue will be clear if this test, noncont_cat_run_test(), fails.
- /*
* Non-contiguous mask write check. CBM has a 0xf hole approximately in the middle.
* Output is compared with support information to catch any edge case errors.
*/
- noncont_mask = ~(0xf << (bit_center - 2)) & full_cache_mask;
- snprintf(schemata, sizeof(schemata), "%lx", noncont_mask);
- ret = write_schemata("", schemata, uparams->cpu, test->resource);
- if (ret && sparse_masks)
ksft_print_msg("Non-contiguous CBMs supported but write failed\n");
- else if (ret && !sparse_masks)
ksft_print_msg("Non-contiguous CBMs not supported and write failed as expected\n");
- else if (!ret && !sparse_masks)
ksft_print_msg("Non-contiguous CBMs not supported but write succeeded\n");
Can these messages be made more specific with a "write" changed to "write of non-contiguous CBM"
- return !ret == !sparse_masks;
Please return negative on error. Ilpo just did a big cleanup to address this.
+}
+static bool noncont_cat_feature_check(const struct resctrl_test *test) +{
- if (!resctrl_resource_exists(test->resource))
return false;
- return resctrl_cache_feature_exists(test->resource, "sparse_masks");
+}
struct resctrl_test l3_cat_test = { .name = "L3_CAT", .group = "CAT", @@ -299,3 +358,19 @@ struct resctrl_test l3_cat_test = { .feature_check = test_resource_feature_check, .run_test = cat_run_test, };
+struct resctrl_test l3_noncont_cat_test = {
- .name = "L3_NONCONT_CAT",
- .group = "CAT",
- .resource = "L3",
- .feature_check = noncont_cat_feature_check,
- .run_test = noncont_cat_run_test,
+};
+struct resctrl_test l2_noncont_cat_test = {
- .name = "L2_NONCONT_CAT",
- .group = "CAT",
- .resource = "L2",
- .feature_check = noncont_cat_feature_check,
- .run_test = noncont_cat_run_test,
+}; diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 74041a35d4ba..7b7a48d1fddd 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -165,6 +165,7 @@ unsigned int count_contiguous_bits(unsigned long val, unsigned int *start); int get_full_cbm(const char *cache_type, unsigned long *mask); int get_mask_no_shareable(const char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size); +int get_cache_level(const char *cache_type); int read_info_res_file(const char *resource, const char *filename); void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); @@ -201,5 +202,7 @@ extern struct resctrl_test mbm_test; extern struct resctrl_test mba_test; extern struct resctrl_test cmt_test; extern struct resctrl_test l3_cat_test; +extern struct resctrl_test l3_noncont_cat_test; +extern struct resctrl_test l2_noncont_cat_test; #endif /* RESCTRL_H */ diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 3044179ee6e9..f3dc1b9696e7 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -19,6 +19,8 @@ static struct resctrl_test *resctrl_tests[] = { &mba_test, &cmt_test, &l3_cat_test,
- &l3_noncont_cat_test,
- &l2_noncont_cat_test,
}; static int detect_vendor(void) diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 8546421f0940..8bd30973fec3 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -100,7 +100,7 @@ int umount_resctrlfs(void)
- Return: cache level as integer or -1 if @cache_type is invalid.
*/ -static int get_cache_level(const char *cache_type) +int get_cache_level(const char *cache_type) { if (!strcmp(cache_type, "L3")) return 3;
Reinette
On Mon, 8 Jan 2024, Reinette Chatre wrote:
Hi Maciej,
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
Add tests for both L2 and L3 CAT to verify the return values generated by writing non-contiguous CBMs don't contradict the reported non-contiguous support information.
Use a logical XOR to confirm return value of write_schemata() and non-contiguous CBMs support information match.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v2:
- Redo the patch message. (Ilpo)
- Tidy up __cpuid_count calls. (Ilpo)
- Remove redundant AND in noncont_mask calculations (Ilpo)
- Fix bit_center offset.
- Add newline before function return. (Ilpo)
- Group non-contiguous tests with CAT tests. (Ilpo)
- Use a helper for reading sparse_masks file. (Ilpo)
- Make get_cache_level() available in other source files. (Ilpo)
tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++ tools/testing/selftests/resctrl/resctrl.h | 3 + .../testing/selftests/resctrl/resctrl_tests.c | 2 + tools/testing/selftests/resctrl/resctrlfs.c | 2 +- 4 files changed, 81 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 7dc7206b3b99..ecf553a89aae 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param return ret; } +static int noncont_cat_run_test(const struct resctrl_test *test,
const struct user_params *uparams)
+{
- unsigned long full_cache_mask, cont_mask, noncont_mask;
- unsigned int eax, ebx, ecx, edx, ret;
- int level, bit_center, sparse_masks;
- char schemata[64];
- /* Check to compare sparse_masks content to cpuid output. */
"cpuid" -> "CPUID" (to note it is an instruction)
- sparse_masks = read_info_res_file(test->resource, "sparse_masks");
- if (sparse_masks < 0)
return sparse_masks;
- level = get_cache_level(test->resource);
- if (level < 0)
return -EINVAL;
- __cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx);
Please do not invent relationships. Please replace the "4 - level" with specific index used that depends on particular cache. The cache level may not even be needed, just use the resource to determine the correct index.
This is actually my fault, I suggested Maciej could use arithmetics there.
- return !ret == !sparse_masks;
Please return negative on error. Ilpo just did a big cleanup to address this.
Test failure is not same as an error. So tests should return negative for errors which prevent even running test at all, and 0/1 for test success/fail.
Hi Ilpo,
On 1/9/2024 1:13 AM, Ilpo Järvinen wrote:
On Mon, 8 Jan 2024, Reinette Chatre wrote:
Hi Maciej,
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
Add tests for both L2 and L3 CAT to verify the return values generated by writing non-contiguous CBMs don't contradict the reported non-contiguous support information.
Use a logical XOR to confirm return value of write_schemata() and non-contiguous CBMs support information match.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v2:
- Redo the patch message. (Ilpo)
- Tidy up __cpuid_count calls. (Ilpo)
- Remove redundant AND in noncont_mask calculations (Ilpo)
- Fix bit_center offset.
- Add newline before function return. (Ilpo)
- Group non-contiguous tests with CAT tests. (Ilpo)
- Use a helper for reading sparse_masks file. (Ilpo)
- Make get_cache_level() available in other source files. (Ilpo)
tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++ tools/testing/selftests/resctrl/resctrl.h | 3 + .../testing/selftests/resctrl/resctrl_tests.c | 2 + tools/testing/selftests/resctrl/resctrlfs.c | 2 +- 4 files changed, 81 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 7dc7206b3b99..ecf553a89aae 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param return ret; } +static int noncont_cat_run_test(const struct resctrl_test *test,
const struct user_params *uparams)
+{
- unsigned long full_cache_mask, cont_mask, noncont_mask;
- unsigned int eax, ebx, ecx, edx, ret;
- int level, bit_center, sparse_masks;
- char schemata[64];
- /* Check to compare sparse_masks content to cpuid output. */
"cpuid" -> "CPUID" (to note it is an instruction)
- sparse_masks = read_info_res_file(test->resource, "sparse_masks");
- if (sparse_masks < 0)
return sparse_masks;
- level = get_cache_level(test->resource);
- if (level < 0)
return -EINVAL;
- __cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx);
Please do not invent relationships. Please replace the "4 - level" with specific index used that depends on particular cache. The cache level may not even be needed, just use the resource to determine the correct index.
This is actually my fault, I suggested Maciej could use arithmetics there.
No problem. The math works for the current values but there is no such relationship. If hypothetically a new cache level needs to be supported then this computation cannot be relied upon to continue to be correct.
- return !ret == !sparse_masks;
Please return negative on error. Ilpo just did a big cleanup to address this.
Test failure is not same as an error. So tests should return negative for errors which prevent even running test at all, and 0/1 for test success/fail.
Thanks for catching this. I missed this subtlety in the framework.
Reinette
Hi, thanks for the review!
On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote:
Hi Maciej,
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
Add tests for both L2 and L3 CAT to verify the return values generated by writing non-contiguous CBMs don't contradict the reported non-contiguous support information.
Use a logical XOR to confirm return value of write_schemata() and non-contiguous CBMs support information match.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v2:
- Redo the patch message. (Ilpo)
- Tidy up __cpuid_count calls. (Ilpo)
- Remove redundant AND in noncont_mask calculations (Ilpo)
- Fix bit_center offset.
- Add newline before function return. (Ilpo)
- Group non-contiguous tests with CAT tests. (Ilpo)
- Use a helper for reading sparse_masks file. (Ilpo)
- Make get_cache_level() available in other source files. (Ilpo)
tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++ tools/testing/selftests/resctrl/resctrl.h | 3 + .../testing/selftests/resctrl/resctrl_tests.c | 2 + tools/testing/selftests/resctrl/resctrlfs.c | 2 +- 4 files changed, 81 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 7dc7206b3b99..ecf553a89aae 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param return ret; } +static int noncont_cat_run_test(const struct resctrl_test *test,
const struct user_params *uparams)
+{
- unsigned long full_cache_mask, cont_mask, noncont_mask;
- unsigned int eax, ebx, ecx, edx, ret;
- int level, bit_center, sparse_masks;
- char schemata[64];
- /* Check to compare sparse_masks content to cpuid output. */
"cpuid" -> "CPUID" (to note it is an instruction)
Thanks, will change
- sparse_masks = read_info_res_file(test->resource, "sparse_masks");
- if (sparse_masks < 0)
return sparse_masks;
- level = get_cache_level(test->resource);
- if (level < 0)
return -EINVAL;
- __cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx);
Please do not invent relationships. Please replace the "4 - level" with specific index used that depends on particular cache. The cache level may not even be needed, just use the resource to determine the correct index.
I'll move this back to 'if (!strcmp(test->resource, "L3") ... ' structure then.
- if (sparse_masks != ((ecx >> 3) & 1))
return -1;
Can a message be displayed to support the debugging this test failure?
Sure, that is a very good idea. I'll add ksft_perror() here.
- /* Write checks initialization. */
- ret = get_full_cbm(test->resource, &full_cache_mask);
- if (ret < 0)
return ret;
I assume this test failure relies on the error message from get_bit_mask() that is called via get_full_cbm()?
Yes, I thought adding more prints here might look redundant.
- bit_center = count_bits(full_cache_mask) / 2;
- cont_mask = full_cache_mask >> bit_center;
- /* Contiguous mask write check. */
- snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
- ret = write_schemata("", schemata, uparams->cpu, test->resource);
- if (ret)
return ret;
How will user know what failed? I am seeing this single test exercise a few scenarios and it is not obvious to me if the issue will be clear if this test, noncont_cat_run_test(), fails.
write_schemata() either succeeds with '0' or errors out with a negative value. If the contiguous mask write fails, write_schemata should print out what was wrong and I believe that the test will report an error rather than failure.
- /*
* Non-contiguous mask write check. CBM has a 0xf hole approximately in the middle.
* Output is compared with support information to catch any edge case errors.
*/
- noncont_mask = ~(0xf << (bit_center - 2)) & full_cache_mask;
- snprintf(schemata, sizeof(schemata), "%lx", noncont_mask);
- ret = write_schemata("", schemata, uparams->cpu, test->resource);
- if (ret && sparse_masks)
ksft_print_msg("Non-contiguous CBMs supported but write failed\n");
- else if (ret && !sparse_masks)
ksft_print_msg("Non-contiguous CBMs not supported and write failed as expected\n");
- else if (!ret && !sparse_masks)
ksft_print_msg("Non-contiguous CBMs not supported but write succeeded\n");
Can these messages be made more specific with a "write" changed to "write of non-contiguous CBM"
Sure, will fix it.
- return !ret == !sparse_masks;
Please return negative on error. Ilpo just did a big cleanup to address this.
I believe this is resolved now.
+}
+static bool noncont_cat_feature_check(const struct resctrl_test *test) +{
- if (!resctrl_resource_exists(test->resource))
return false;
- return resctrl_cache_feature_exists(test->resource, "sparse_masks");
+}
struct resctrl_test l3_cat_test = { .name = "L3_CAT", .group = "CAT", @@ -299,3 +358,19 @@ struct resctrl_test l3_cat_test = { .feature_check = test_resource_feature_check, .run_test = cat_run_test, };
+struct resctrl_test l3_noncont_cat_test = {
- .name = "L3_NONCONT_CAT",
- .group = "CAT",
- .resource = "L3",
- .feature_check = noncont_cat_feature_check,
- .run_test = noncont_cat_run_test,
+};
+struct resctrl_test l2_noncont_cat_test = {
- .name = "L2_NONCONT_CAT",
- .group = "CAT",
- .resource = "L2",
- .feature_check = noncont_cat_feature_check,
- .run_test = noncont_cat_run_test,
+}; diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 74041a35d4ba..7b7a48d1fddd 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -165,6 +165,7 @@ unsigned int count_contiguous_bits(unsigned long val, unsigned int *start); int get_full_cbm(const char *cache_type, unsigned long *mask); int get_mask_no_shareable(const char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size); +int get_cache_level(const char *cache_type); int read_info_res_file(const char *resource, const char *filename); void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); @@ -201,5 +202,7 @@ extern struct resctrl_test mbm_test; extern struct resctrl_test mba_test; extern struct resctrl_test cmt_test; extern struct resctrl_test l3_cat_test; +extern struct resctrl_test l3_noncont_cat_test; +extern struct resctrl_test l2_noncont_cat_test; #endif /* RESCTRL_H */ diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 3044179ee6e9..f3dc1b9696e7 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -19,6 +19,8 @@ static struct resctrl_test *resctrl_tests[] = { &mba_test, &cmt_test, &l3_cat_test,
- &l3_noncont_cat_test,
- &l2_noncont_cat_test,
}; static int detect_vendor(void) diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 8546421f0940..8bd30973fec3 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -100,7 +100,7 @@ int umount_resctrlfs(void)
- Return: cache level as integer or -1 if @cache_type is invalid.
*/ -static int get_cache_level(const char *cache_type) +int get_cache_level(const char *cache_type) { if (!strcmp(cache_type, "L3")) return 3;
Reinette
Hi Maciej,
On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote:
On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote:
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
- if (sparse_masks != ((ecx >> 3) & 1))
return -1;
Can a message be displayed to support the debugging this test failure?
Sure, that is a very good idea. I'll add ksft_perror() here.
I do not think ksft_perror() is appropriate since perror() is for system error messages (something that sets errno). Perhaps just ksft_print_msg().
- bit_center = count_bits(full_cache_mask) / 2;
- cont_mask = full_cache_mask >> bit_center;
- /* Contiguous mask write check. */
- snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
- ret = write_schemata("", schemata, uparams->cpu, test->resource);
- if (ret)
return ret;
How will user know what failed? I am seeing this single test exercise a few scenarios and it is not obvious to me if the issue will be clear if this test, noncont_cat_run_test(), fails.
write_schemata() either succeeds with '0' or errors out with a negative value. If the contiguous mask write fails, write_schemata should print out what was wrong and I believe that the test will report an error rather than failure.
Right. I am trying to understand whether the user will be able to decipher what failed in case there is an error. Seems like in this case the user is expected to look at the source code of the test to understand what the test was trying to do at the time it encountered the failure. In this case user may be "lucky" that this test only has one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that reasoning to figure out which write_schemata() failed to further dig what test was trying to do.
Reinette
On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote:
On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote:
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
- if (sparse_masks != ((ecx >> 3) & 1))
return -1;
Can a message be displayed to support the debugging this test failure?
Sure, that is a very good idea. I'll add ksft_perror() here.
I do not think ksft_perror() is appropriate since perror() is for system error messages (something that sets errno). Perhaps just ksft_print_msg().
Thanks for the suggestion!
- bit_center = count_bits(full_cache_mask) / 2;
- cont_mask = full_cache_mask >> bit_center;
- /* Contiguous mask write check. */
- snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
- ret = write_schemata("", schemata, uparams->cpu, test->resource);
- if (ret)
return ret;
How will user know what failed? I am seeing this single test exercise a few scenarios and it is not obvious to me if the issue will be clear if this test, noncont_cat_run_test(), fails.
write_schemata() either succeeds with '0' or errors out with a negative value. If the contiguous mask write fails, write_schemata should print out what was wrong and I believe that the test will report an error rather than failure.
Right. I am trying to understand whether the user will be able to decipher what failed in case there is an error. Seems like in this case the user is expected to look at the source code of the test to understand what the test was trying to do at the time it encountered the failure. In this case user may be "lucky" that this test only has one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that reasoning to figure out which write_schemata() failed to further dig what test was trying to do.
When a write_schemata() is executed the string that is being written gets printed. If there are multiple calls in a single tests and one fails I'd imagine it would be easy for the user to figure out which one failed.
On a side note I'm not sure if that's true but I'm getting a feeling that the harder errors (not just test failures) are more of a clue for developers working on the tests. Would you agree that it seems like users probably won't see write_schemata() fail here (if the test execution managed to get to this point without erroring out due to bad parameters or kernel compiled without required options)?
Reinette
Hi Maciej,
On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote:
On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote:
On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote:
On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote:
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
- bit_center = count_bits(full_cache_mask) / 2;
- cont_mask = full_cache_mask >> bit_center;
- /* Contiguous mask write check. */
- snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
- ret = write_schemata("", schemata, uparams->cpu, test->resource);
- if (ret)
return ret;
How will user know what failed? I am seeing this single test exercise a few scenarios and it is not obvious to me if the issue will be clear if this test, noncont_cat_run_test(), fails.
write_schemata() either succeeds with '0' or errors out with a negative value. If the contiguous mask write fails, write_schemata should print out what was wrong and I believe that the test will report an error rather than failure.
Right. I am trying to understand whether the user will be able to decipher what failed in case there is an error. Seems like in this case the user is expected to look at the source code of the test to understand what the test was trying to do at the time it encountered the failure. In this case user may be "lucky" that this test only has one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that reasoning to figure out which write_schemata() failed to further dig what test was trying to do.
When a write_schemata() is executed the string that is being written gets printed. If there are multiple calls in a single tests and one fails I'd imagine it would be easy for the user to figure out which one failed.
It would be easy for the user the figure out if (a) it is obvious to the user what schema a particular write_schema() call attempted to write and (b) all the write_schema() calls attempt to write different schema.
On a side note I'm not sure if that's true but I'm getting a feeling that the harder errors (not just test failures) are more of a clue for developers working on the tests. Would you agree that it seems like users probably won't see write_schemata() fail here (if the test execution managed to get to this point without erroring out due to bad parameters or kernel compiled without required options)?
I do agree that users probably won't see such failures. I do not think these errors are clues to developers working on the tests though, but instead clues to resctrl developers or kernel development CI systems.
Reinette
Hello!
On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote:
On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote:
On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote:
On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote:
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
- bit_center = count_bits(full_cache_mask) / 2;
- cont_mask = full_cache_mask >> bit_center;
- /* Contiguous mask write check. */
- snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
- ret = write_schemata("", schemata, uparams->cpu, test->resource);
- if (ret)
return ret;
How will user know what failed? I am seeing this single test exercise a few scenarios and it is not obvious to me if the issue will be clear if this test, noncont_cat_run_test(), fails.
write_schemata() either succeeds with '0' or errors out with a negative value. If the contiguous mask write fails, write_schemata should print out what was wrong and I believe that the test will report an error rather than failure.
Right. I am trying to understand whether the user will be able to decipher what failed in case there is an error. Seems like in this case the user is expected to look at the source code of the test to understand what the test was trying to do at the time it encountered the failure. In this case user may be "lucky" that this test only has one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that reasoning to figure out which write_schemata() failed to further dig what test was trying to do.
When a write_schemata() is executed the string that is being written gets printed. If there are multiple calls in a single tests and one fails I'd imagine it would be easy for the user to figure out which one failed.
It would be easy for the user the figure out if (a) it is obvious to the user what schema a particular write_schema() call attempted to write and (b) all the write_schema() calls attempt to write different schema.
Okay, your comment made me wonder if on error the schemata still is printed. I double checked in the code and whether write_schemata() fails or not it has a goto path where before returning it will print out the schema. So I believe that satisfies your (a) condition.
As for (b) depends on what you meant. Other tests that run more than one write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest that the non-contiguous test should attempt more schematas? For example shift the bit hole from one side to the other? I assumed one CBM with a centered bit hole would be enough to check if non-contiguous CBM feature works properly and more CBMs would be redundant.
On a side note I'm not sure if that's true but I'm getting a feeling that the harder errors (not just test failures) are more of a clue for developers working on the tests. Would you agree that it seems like users probably won't see write_schemata() fail here (if the test execution managed to get to this point without erroring out due to bad parameters or kernel compiled without required options)?
I do agree that users probably won't see such failures. I do not think these errors are clues to developers working on the tests though, but instead clues to resctrl developers or kernel development CI systems.
Right, I agree, the target group I mentioned was too narrow.
Reinette
Hi Maciej,
On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote:
On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote:
On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote:
On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote:
On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote:
On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote:
On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
> + bit_center = count_bits(full_cache_mask) / 2; > + cont_mask = full_cache_mask >> bit_center; > + > + /* Contiguous mask write check. */ > + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); > + ret = write_schemata("", schemata, uparams->cpu, test->resource); > + if (ret) > + return ret;
How will user know what failed? I am seeing this single test exercise a few scenarios and it is not obvious to me if the issue will be clear if this test, noncont_cat_run_test(), fails.
write_schemata() either succeeds with '0' or errors out with a negative value. If the contiguous mask write fails, write_schemata should print out what was wrong and I believe that the test will report an error rather than failure.
Right. I am trying to understand whether the user will be able to decipher what failed in case there is an error. Seems like in this case the user is expected to look at the source code of the test to understand what the test was trying to do at the time it encountered the failure. In this case user may be "lucky" that this test only has one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that reasoning to figure out which write_schemata() failed to further dig what test was trying to do.
When a write_schemata() is executed the string that is being written gets printed. If there are multiple calls in a single tests and one fails I'd imagine it would be easy for the user to figure out which one failed.
It would be easy for the user the figure out if (a) it is obvious to the user what schema a particular write_schema() call attempted to write and (b) all the write_schema() calls attempt to write different schema.
Okay, your comment made me wonder if on error the schemata still is printed. I double checked in the code and whether write_schemata() fails or not it has a goto path where before returning it will print out the schema. So I believe that satisfies your (a) condition.
Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ...
Scenario 2: The test has the following code: ... write_schemata(..., schemata, ...); ... write_schemata(..., schemata, ...); ...
Any failure of write_schemata() in scenario 1 will be easy to trace. As you state, write_schemata() prints the schemata attempted and it will thus be easy to look at the code to see which write_schemata() call failed since it is obvious from the code which schemata was attempted. A failure of one of the write_schemata() in scenario 2 will not be as easy to trace since the user first needs to determine what the value of "schemata" is at each call and that may depend on the platform, bit shifting done in test, and state of system state at time of test.
As for (b) depends on what you meant. Other tests that run more than one write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest that the non-contiguous test should attempt more schematas? For example shift the bit hole from one side to the other? I assumed one CBM with a centered bit hole would be enough to check if non-contiguous CBM feature works properly and more CBMs would be redundant.
Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ...
Scenario 2: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xfff", ...); ...
A failure of either write_schemata() in scenario 1 will be easy to trace since the schemata attempted is different in each case. The schemata printed by the write_schemata() error message can thus easily be connected to the specific write_schemata() call. A failure of either write_schemata() in scenario 2 is not so obvious since they both attempted the same schemata so the error message printed by write_schemata() could belong to either.
Reinette
Hi!
On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote:
On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote:
On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote:
On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote:
On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote:
On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: > On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
>> + bit_center = count_bits(full_cache_mask) / 2; >> + cont_mask = full_cache_mask >> bit_center; >> + >> + /* Contiguous mask write check. */ >> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >> + if (ret) >> + return ret; > > How will user know what failed? I am seeing this single test exercise a few scenarios > and it is not obvious to me if the issue will be clear if this test, > noncont_cat_run_test(), fails.
write_schemata() either succeeds with '0' or errors out with a negative value. If the contiguous mask write fails, write_schemata should print out what was wrong and I believe that the test will report an error rather than failure.
Right. I am trying to understand whether the user will be able to decipher what failed in case there is an error. Seems like in this case the user is expected to look at the source code of the test to understand what the test was trying to do at the time it encountered the failure. In this case user may be "lucky" that this test only has one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that reasoning to figure out which write_schemata() failed to further dig what test was trying to do.
When a write_schemata() is executed the string that is being written gets printed. If there are multiple calls in a single tests and one fails I'd imagine it would be easy for the user to figure out which one failed.
It would be easy for the user the figure out if (a) it is obvious to the user what schema a particular write_schema() call attempted to write and (b) all the write_schema() calls attempt to write different schema.
Okay, your comment made me wonder if on error the schemata still is printed. I double checked in the code and whether write_schemata() fails or not it has a goto path where before returning it will print out the schema. So I believe that satisfies your (a) condition.
Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ...
Scenario 2: The test has the following code: ... write_schemata(..., schemata, ...); ... write_schemata(..., schemata, ...); ...
Any failure of write_schemata() in scenario 1 will be easy to trace. As you state, write_schemata() prints the schemata attempted and it will thus be easy to look at the code to see which write_schemata() call failed since it is obvious from the code which schemata was attempted. A failure of one of the write_schemata() in scenario 2 will not be as easy to trace since the user first needs to determine what the value of "schemata" is at each call and that may depend on the platform, bit shifting done in test, and state of system state at time of test.
Doing things similar to scenario 1 would be great from a debugging perspective but since the masks can have different sizes putting literals there seems impossible.
Maybe the code could be improved by putting an example CBM in the comment above a write_schemata() call? "For a 12 bit maximum CBM value, the contiguous schemata will look like '0x3f'" and "For a 12 bit maximum CBM value, the non-contiguous schemata will look like '0xf0f'"
This seems like the closest I could get to what you're showing in scenario 1 (which I assume would be the best).
As for (b) depends on what you meant. Other tests that run more than one write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest that the non-contiguous test should attempt more schematas? For example shift the bit hole from one side to the other? I assumed one CBM with a centered bit hole would be enough to check if non-contiguous CBM feature works properly and more CBMs would be redundant.
Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ...
Scenario 2: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xfff", ...); ...
A failure of either write_schemata() in scenario 1 will be easy to trace since the schemata attempted is different in each case. The schemata printed by the write_schemata() error message can thus easily be connected to the specific write_schemata() call. A failure of either write_schemata() in scenario 2 is not so obvious since they both attempted the same schemata so the error message printed by write_schemata() could belong to either.
I believe my code follows the first scenario example (since one schemata is half the full CBM, and the other one is the full CBM with a hole in the middle).
I'm sorry to drag this thread out but I want to be sure if I'm right or are you suggesting something and I missed it?
Reinette
Hi Maciej,
On 1/21/2024 11:56 PM, Maciej Wieczór-Retman wrote:
Hi!
On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote:
On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote:
On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote:
On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote:
On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: > On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
>>> + bit_center = count_bits(full_cache_mask) / 2; >>> + cont_mask = full_cache_mask >> bit_center; >>> + >>> + /* Contiguous mask write check. */ >>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>> + if (ret) >>> + return ret; >> >> How will user know what failed? I am seeing this single test exercise a few scenarios >> and it is not obvious to me if the issue will be clear if this test, >> noncont_cat_run_test(), fails. > > write_schemata() either succeeds with '0' or errors out with a negative value. If > the contiguous mask write fails, write_schemata should print out what was wrong > and I believe that the test will report an error rather than failure.
Right. I am trying to understand whether the user will be able to decipher what failed in case there is an error. Seems like in this case the user is expected to look at the source code of the test to understand what the test was trying to do at the time it encountered the failure. In this case user may be "lucky" that this test only has one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that reasoning to figure out which write_schemata() failed to further dig what test was trying to do.
When a write_schemata() is executed the string that is being written gets printed. If there are multiple calls in a single tests and one fails I'd imagine it would be easy for the user to figure out which one failed.
It would be easy for the user the figure out if (a) it is obvious to the user what schema a particular write_schema() call attempted to write and (b) all the write_schema() calls attempt to write different schema.
Okay, your comment made me wonder if on error the schemata still is printed. I double checked in the code and whether write_schemata() fails or not it has a goto path where before returning it will print out the schema. So I believe that satisfies your (a) condition.
Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ...
Scenario 2: The test has the following code: ... write_schemata(..., schemata, ...); ... write_schemata(..., schemata, ...); ...
Any failure of write_schemata() in scenario 1 will be easy to trace. As you state, write_schemata() prints the schemata attempted and it will thus be easy to look at the code to see which write_schemata() call failed since it is obvious from the code which schemata was attempted. A failure of one of the write_schemata() in scenario 2 will not be as easy to trace since the user first needs to determine what the value of "schemata" is at each call and that may depend on the platform, bit shifting done in test, and state of system state at time of test.
Doing things similar to scenario 1 would be great from a debugging perspective but since the masks can have different sizes putting literals there seems impossible.
Maybe the code could be improved by putting an example CBM in the comment above a write_schemata() call? "For a 12 bit maximum CBM value, the contiguous schemata will look like '0x3f'" and "For a 12 bit maximum CBM value, the non-contiguous schemata will look like '0xf0f'"
This seems like the closest I could get to what you're showing in scenario 1 (which I assume would be the best).
I am not asking you to use literals. I am trying to demonstrate that the only way it would be obvious to the user where a failure is is when the test uses literals. I continue to try to motivate for clear indication to user/developer what failed when this test failed ... this could just be a ksft_print_msg() when the write_schemata() call we are talking about fails.
As for (b) depends on what you meant. Other tests that run more than one write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest that the non-contiguous test should attempt more schematas? For example shift the bit hole from one side to the other? I assumed one CBM with a centered bit hole would be enough to check if non-contiguous CBM feature works properly and more CBMs would be redundant.
Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ...
Scenario 2: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xfff", ...); ...
A failure of either write_schemata() in scenario 1 will be easy to trace since the schemata attempted is different in each case. The schemata printed by the write_schemata() error message can thus easily be connected to the specific write_schemata() call. A failure of either write_schemata() in scenario 2 is not so obvious since they both attempted the same schemata so the error message printed by write_schemata() could belong to either.
I believe my code follows the first scenario example (since one schemata is half the full CBM, and the other one is the full CBM with a hole in the middle).
I know. This thread digressed into discussion about when it would be ok to omit error message from caller of write_schemata().
I'm sorry to drag this thread out but I want to be sure if I'm right or are you suggesting something and I missed it?
Please just add a ksft_print_msg() to noncont_cat_run_test() when this write_schemata() fails.
Reinette
On 2024-01-22 at 08:32:36 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/21/2024 11:56 PM, Maciej Wieczór-Retman wrote:
Hi!
On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote:
On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote:
On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote:
On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: > On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>> + cont_mask = full_cache_mask >> bit_center; >>>> + >>>> + /* Contiguous mask write check. */ >>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>> + if (ret) >>>> + return ret; >>> >>> How will user know what failed? I am seeing this single test exercise a few scenarios >>> and it is not obvious to me if the issue will be clear if this test, >>> noncont_cat_run_test(), fails. >> >> write_schemata() either succeeds with '0' or errors out with a negative value. If >> the contiguous mask write fails, write_schemata should print out what was wrong >> and I believe that the test will report an error rather than failure. > > Right. I am trying to understand whether the user will be able to decipher what failed > in case there is an error. Seems like in this case the user is expected to look at the > source code of the test to understand what the test was trying to do at the time it > encountered the failure. In this case user may be "lucky" that this test only has > one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that > reasoning to figure out which write_schemata() failed to further dig what test was > trying to do.
When a write_schemata() is executed the string that is being written gets printed. If there are multiple calls in a single tests and one fails I'd imagine it would be easy for the user to figure out which one failed.
It would be easy for the user the figure out if (a) it is obvious to the user what schema a particular write_schema() call attempted to write and (b) all the write_schema() calls attempt to write different schema.
As for (b) depends on what you meant. Other tests that run more than one write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest that the non-contiguous test should attempt more schematas? For example shift the bit hole from one side to the other? I assumed one CBM with a centered bit hole would be enough to check if non-contiguous CBM feature works properly and more CBMs would be redundant.
Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ...
Scenario 2: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xfff", ...); ...
A failure of either write_schemata() in scenario 1 will be easy to trace since the schemata attempted is different in each case. The schemata printed by the write_schemata() error message can thus easily be connected to the specific write_schemata() call. A failure of either write_schemata() in scenario 2 is not so obvious since they both attempted the same schemata so the error message printed by write_schemata() could belong to either.
I'm sorry to drag this thread out but I want to be sure if I'm right or are you suggesting something and I missed it?
Please just add a ksft_print_msg() to noncont_cat_run_test() when this write_schemata() fails.
My point all along was that if write_schemata() fails it already prints out all the necessary information. I'd like to avoid adding redundant messages so please take a look at how it looks now:
I injected write_schemata() with an error so it will take a path as if write() failed with 'Permission denied' as a reason. Here is the output for L3 non-contiguous CAT test:
[root@spr1 ~]# ./resctrl_tests -t L3_NONCONT_CAT TAP version 13 # Pass: Check kernel supports resctrl filesystem # Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists # resctrl filesystem not mounted # dmesg: [ 18.579861] resctrl: L3 allocation detected # dmesg: [ 18.590395] resctrl: L2 allocation detected # dmesg: [ 18.595181] resctrl: MB allocation detected # dmesg: [ 18.599963] resctrl: L3 monitoring detected 1..1 # Starting L3_NONCONT_CAT test ... # Mounting resctrl to "/sys/fs/resctrl" # Write schema "L3:0=ff" to resctrl FS # write() failed : Permission denied not ok 1 L3_NONCONT_CAT: test # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
Of course if you still think adding a ksft_print_msg() there would be meaningful I'll try to write a sensible message. But I hope you can see what I meant when I wrote that the user could already easily see what failed.
Reinette
Hi Maciej,
On 1/22/2024 11:58 PM, Maciej Wieczór-Retman wrote:
On 2024-01-22 at 08:32:36 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/21/2024 11:56 PM, Maciej Wieczór-Retman wrote:
Hi!
On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote:
On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote:
On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote: > On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
>>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>>> + cont_mask = full_cache_mask >> bit_center; >>>>> + >>>>> + /* Contiguous mask write check. */ >>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>>> + if (ret) >>>>> + return ret; >>>> >>>> How will user know what failed? I am seeing this single test exercise a few scenarios >>>> and it is not obvious to me if the issue will be clear if this test, >>>> noncont_cat_run_test(), fails. >>> >>> write_schemata() either succeeds with '0' or errors out with a negative value. If >>> the contiguous mask write fails, write_schemata should print out what was wrong >>> and I believe that the test will report an error rather than failure. >> >> Right. I am trying to understand whether the user will be able to decipher what failed >> in case there is an error. Seems like in this case the user is expected to look at the >> source code of the test to understand what the test was trying to do at the time it >> encountered the failure. In this case user may be "lucky" that this test only has >> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >> reasoning to figure out which write_schemata() failed to further dig what test was >> trying to do. > > When a write_schemata() is executed the string that is being written gets > printed. If there are multiple calls in a single tests and one fails I'd imagine > it would be easy for the user to figure out which one failed.
It would be easy for the user the figure out if (a) it is obvious to the user what schema a particular write_schema() call attempted to write and (b) all the write_schema() calls attempt to write different schema.
As for (b) depends on what you meant. Other tests that run more than one write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest that the non-contiguous test should attempt more schematas? For example shift the bit hole from one side to the other? I assumed one CBM with a centered bit hole would be enough to check if non-contiguous CBM feature works properly and more CBMs would be redundant.
Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ...
Scenario 2: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xfff", ...); ...
A failure of either write_schemata() in scenario 1 will be easy to trace since the schemata attempted is different in each case. The schemata printed by the write_schemata() error message can thus easily be connected to the specific write_schemata() call. A failure of either write_schemata() in scenario 2 is not so obvious since they both attempted the same schemata so the error message printed by write_schemata() could belong to either.
I'm sorry to drag this thread out but I want to be sure if I'm right or are you suggesting something and I missed it?
Please just add a ksft_print_msg() to noncont_cat_run_test() when this write_schemata() fails.
My point all along was that if write_schemata() fails it already prints out all the necessary information. I'd like to avoid adding redundant messages so please take a look at how it looks now:
Please consider that there may be different perspectives of "necessary information".
I injected write_schemata() with an error so it will take a path as if write() failed with 'Permission denied' as a reason. Here is the output for L3 non-contiguous CAT test:
[root@spr1 ~]# ./resctrl_tests -t L3_NONCONT_CAT TAP version 13 # Pass: Check kernel supports resctrl filesystem # Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists # resctrl filesystem not mounted # dmesg: [ 18.579861] resctrl: L3 allocation detected # dmesg: [ 18.590395] resctrl: L2 allocation detected # dmesg: [ 18.595181] resctrl: MB allocation detected # dmesg: [ 18.599963] resctrl: L3 monitoring detected 1..1 # Starting L3_NONCONT_CAT test ... # Mounting resctrl to "/sys/fs/resctrl" # Write schema "L3:0=ff" to resctrl FS # write() failed : Permission denied not ok 1 L3_NONCONT_CAT: test # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
Understood.
Of course if you still think adding a ksft_print_msg() there would be meaningful I'll try to write a sensible message. But I hope you can see what I meant when I wrote that the user could already easily see what failed.
I do still believe that it will be helpful if there is a ksft_print_msg() with something like "Unable to write contiguous CBM" or "Write of contiguous CBM failed" or ... ?
Reinette
Hi Reinette!
On 2024-01-23 at 09:42:07 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/22/2024 11:58 PM, Maciej Wieczór-Retman wrote:
On 2024-01-22 at 08:32:36 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/21/2024 11:56 PM, Maciej Wieczór-Retman wrote:
Hi!
On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote:
Hi Maciej,
On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote:
On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote: > On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote: >> On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >>> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >>>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: > >>>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>>>> + cont_mask = full_cache_mask >> bit_center; >>>>>> + >>>>>> + /* Contiguous mask write check. */ >>>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>>>> + if (ret) >>>>>> + return ret; >>>>> >>>>> How will user know what failed? I am seeing this single test exercise a few scenarios >>>>> and it is not obvious to me if the issue will be clear if this test, >>>>> noncont_cat_run_test(), fails. >>>> >>>> write_schemata() either succeeds with '0' or errors out with a negative value. If >>>> the contiguous mask write fails, write_schemata should print out what was wrong >>>> and I believe that the test will report an error rather than failure. >>> >>> Right. I am trying to understand whether the user will be able to decipher what failed >>> in case there is an error. Seems like in this case the user is expected to look at the >>> source code of the test to understand what the test was trying to do at the time it >>> encountered the failure. In this case user may be "lucky" that this test only has >>> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >>> reasoning to figure out which write_schemata() failed to further dig what test was >>> trying to do. >> >> When a write_schemata() is executed the string that is being written gets >> printed. If there are multiple calls in a single tests and one fails I'd imagine >> it would be easy for the user to figure out which one failed. > > It would be easy for the user the figure out if (a) it is obvious to the user > what schema a particular write_schema() call attempted to write and (b) all the > write_schema() calls attempt to write different schema.
As for (b) depends on what you meant. Other tests that run more than one write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest that the non-contiguous test should attempt more schematas? For example shift the bit hole from one side to the other? I assumed one CBM with a centered bit hole would be enough to check if non-contiguous CBM feature works properly and more CBMs would be redundant.
Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ...
Scenario 2: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xfff", ...); ...
A failure of either write_schemata() in scenario 1 will be easy to trace since the schemata attempted is different in each case. The schemata printed by the write_schemata() error message can thus easily be connected to the specific write_schemata() call. A failure of either write_schemata() in scenario 2 is not so obvious since they both attempted the same schemata so the error message printed by write_schemata() could belong to either.
I'm sorry to drag this thread out but I want to be sure if I'm right or are you suggesting something and I missed it?
Please just add a ksft_print_msg() to noncont_cat_run_test() when this write_schemata() fails.
My point all along was that if write_schemata() fails it already prints out all the necessary information. I'd like to avoid adding redundant messages so please take a look at how it looks now:
Please consider that there may be different perspectives of "necessary information".
Oh of course. By that I meant the failed schemata which I assumed is what you were looking for in this error handling here.
I injected write_schemata() with an error so it will take a path as if write() failed with 'Permission denied' as a reason. Here is the output for L3 non-contiguous CAT test:
[root@spr1 ~]# ./resctrl_tests -t L3_NONCONT_CAT TAP version 13 # Pass: Check kernel supports resctrl filesystem # Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists # resctrl filesystem not mounted # dmesg: [ 18.579861] resctrl: L3 allocation detected # dmesg: [ 18.590395] resctrl: L2 allocation detected # dmesg: [ 18.595181] resctrl: MB allocation detected # dmesg: [ 18.599963] resctrl: L3 monitoring detected 1..1 # Starting L3_NONCONT_CAT test ... # Mounting resctrl to "/sys/fs/resctrl" # Write schema "L3:0=ff" to resctrl FS # write() failed : Permission denied not ok 1 L3_NONCONT_CAT: test # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
Understood.
Of course if you still think adding a ksft_print_msg() there would be meaningful I'll try to write a sensible message. But I hope you can see what I meant when I wrote that the user could already easily see what failed.
I do still believe that it will be helpful if there is a ksft_print_msg() with something like "Unable to write contiguous CBM" or "Write of contiguous CBM failed" or ... ?
Sure, I can see how that can be helpful, I'll add "Write of contiguous CBM failed", thanks!
Reinette
linux-kselftest-mirror@lists.linaro.org