[PATCH v2 00/24] selftests/resctrl: Fixes, cleanups, and rewritten CAT test

List overview All Threads
Download

newer

older

[PATCH] kselftest/alsa: Increase...

[PATCH] selftests/nolibc: Remove...

Ilpo Järvinen

18 Apr 2023 18 Apr '23

11:44 a.m.

Here is a series with some fixes and cleanups to resctrl selftests and rewrite of CAT test into something that really tests CAT working or not condition.

v2: - Rebased on top of next to solve the conflicts - Added 2 patches related to resctrl FS mount/umount (fix + cleanup) - Consistently use "alloc" in cache_alloc_size() - CAT test error handling tweaked - Remove a spurious newline change from the CAT patch - Small improvements to changelogs

Ilpo Järvinen (24): selftests/resctrl: Add resctrl.h into build deps selftests/resctrl: Check also too low values for CBM bits selftests/resctrl: Move resctrl FS mount/umount to higher level selftests/resctrl: Remove mum_resctrlfs selftests/resctrl: Make span unsigned long everywhere selftests/resctrl: Express span in bytes selftests/resctrl: Remove duplicated preparation for span arg selftests/resctrl: Don't use variable argument list for ->setup() selftests/resctrl: Remove "malloc_and_init_memory" param from run_fill_buf() selftests/resctrl: Split run_fill_buf() to alloc, work, and dealloc helpers selftests/resctrl: Remove start_buf local variable from buffer alloc func selftests/resctrl: Don't pass test name to fill_buf selftests/resctrl: Add flush_buffer() to fill_buf selftests/resctrl: Remove test type checks from cat_val() selftests/resctrl: Refactor get_cbm_mask() selftests/resctrl: Create cache_alloc_size() helper selftests/resctrl: Replace count_bits with count_consecutive_bits() selftests/resctrl: Exclude shareable bits from schemata in CAT test selftests/resctrl: Pass the real number of tests to show_cache_info() selftests/resctrl: Move CAT/CMT test global vars to func they are used selftests/resctrl: Read in less obvious order to defeat prefetch optimizations selftests/resctrl: Split measure_cache_vals() function selftests/resctrl: Split show_cache_info() to test specific and generic parts selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

tools/testing/selftests/resctrl/Makefile | 2 +- tools/testing/selftests/resctrl/cache.c | 154 ++++++------ tools/testing/selftests/resctrl/cat_test.c | 235 ++++++++---------- tools/testing/selftests/resctrl/cmt_test.c | 65 +++-- tools/testing/selftests/resctrl/fill_buf.c | 105 ++++---- tools/testing/selftests/resctrl/mba_test.c | 9 +- tools/testing/selftests/resctrl/mbm_test.c | 17 +- tools/testing/selftests/resctrl/resctrl.h | 32 +-- .../testing/selftests/resctrl/resctrl_tests.c | 82 ++++-- tools/testing/selftests/resctrl/resctrl_val.c | 9 +- tools/testing/selftests/resctrl/resctrlfs.c | 187 ++++++++++---- 11 files changed, 499 insertions(+), 398 deletions(-)

-- 2.30.2

Show replies by date

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 01/24] selftests/resctrl: Add resctrl.h into build deps

Makefile only lists *.c as build dependecies for the restctrl_tests executable which excludes resctrl.h.

Add *.h to wildcard() cover also resctrl.h.

Fixes: 591a6e8588fc ("selftests/resctrl: Add basic resctrl file system operations and data") Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/Makefile b/tools/testing/selftests/resctrl/Makefile index 73d53257df42..2dc7da221795 100644 --- a/tools/testing/selftests/resctrl/Makefile +++ b/tools/testing/selftests/resctrl/Makefile @@ -7,4 +7,4 @@ TEST_GEN_PROGS := resctrl_tests

include ../lib.mk

-$(OUTPUT)/resctrl_tests: $(wildcard *.c) +$(OUTPUT)/resctrl_tests: $(wildcard *.c *.h)

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:07 a.m.

New subject: [PATCH v2 01/24] selftests/resctrl: Add resctrl.h into build deps

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

Makefile only lists *.c as build dependecies for the restctrl_tests

dependecies -> dependencies

restctrl_tests -> resctrl_tests

...

executable which excludes resctrl.h.

Add *.h to wildcard() cover also resctrl.h.

I find this a bit hard to parse. How about "Add *.h to wildcard() to include resctrl.h."

(considering the problem statement indicates that resctrl.h was "excluded", having it now "included" seems to match)

...

Fixes: 591a6e8588fc ("selftests/resctrl: Add basic resctrl file system operations and data") Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/Makefile b/tools/testing/selftests/resctrl/Makefile index 73d53257df42..2dc7da221795 100644 --- a/tools/testing/selftests/resctrl/Makefile +++ b/tools/testing/selftests/resctrl/Makefile @@ -7,4 +7,4 @@ TEST_GEN_PROGS := resctrl_tests include ../lib.mk -$(OUTPUT)/resctrl_tests: $(wildcard *.c) +$(OUTPUT)/resctrl_tests: $(wildcard *.c *.h)

How about a simpler *.[ch]? Seems like this pattern is popular in selftest code.

Reinette

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 02/24] selftests/resctrl: Check also too low values for CBM bits

CAT test only validates that the number of CBM bits is not too large but it could be too small too.

Check and return error before starting the CAT test if the number of CBM bits is too small.

Fixes: 09a67934625a ("selftests/resctrl: Don't hard code value of "no_of_bits" variable") Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cat_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index fb1443f888c4..722c9cd4120d 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -129,7 +129,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) if (!n) n = count_of_bits / 2;

- if (n > count_of_bits - 1) { + if (n < 1 || n > count_of_bits - 1) { ksft_print_msg("Invalid input value for no_of_bits n!\n"); ksft_print_msg("Please enter value in range 1 to %d\n", count_of_bits - 1);

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:08 a.m.

New subject: [PATCH v2 02/24] selftests/resctrl: Check also too low values for CBM bits

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

CAT test only validates that the number of CBM bits is not too large but it could be too small too.

Could you please elaborate how this scenario could occur?

...

Check and return error before starting the CAT test if the number of CBM bits is too small.

Fixes: 09a67934625a ("selftests/resctrl: Don't hard code value of "no_of_bits" variable")

This fix is not clear to me. This commit being fixed already contains an explicit test that will bail out of no_of_bits <= 0. It is not clear to me why it is necessary to adding a test for < 1 as a fix for this commit.

...

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cat_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index fb1443f888c4..722c9cd4120d 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -129,7 +129,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) if (!n) n = count_of_bits / 2;

if (n > count_of_bits - 1) {

if (n < 1 || n > count_of_bits - 1) { ksft_print_msg("Invalid input value for no_of_bits n!\n"); ksft_print_msg("Please enter value in range 1 to %d\n", count_of_bits - 1);

Reinette

Ilpo Järvinen

24 Apr 24 Apr

10:46 a.m.

New subject: [PATCH v2 02/24] selftests/resctrl: Check also too low values for CBM bits

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...
CAT test only validates that the number of CBM bits is not too large but it could be too small too.

Could you please elaborate how this scenario could occur?

...
Check and return error before starting the CAT test if the number of CBM bits is too small.

Fixes: 09a67934625a ("selftests/resctrl: Don't hard code value of "no_of_bits" variable")

This fix is not clear to me. This commit being fixed already contains an explicit test that will bail out of no_of_bits <= 0. It is not clear to me why it is necessary to adding a test for < 1 as a fix for this commit.

Ah, indeed, it's checked on the higher level so this fix is unnecessary. I missed it entirely while taking this change out from a more complex change and even failed to make the connection when I stared at the very if <= 0 check not so many days ago.

-- i. > > Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com > > --- > > tools/testing/selftests/resctrl/cat_test.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c > > index fb1443f888c4..722c9cd4120d 100644 > > --- a/tools/testing/selftests/resctrl/cat_test.c > > +++ b/tools/testing/selftests/resctrl/cat_test.c > > @@ -129,7 +129,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) > > if (!n) > > n = count_of_bits / 2; > > > > - if (n > count_of_bits - 1) { > > + if (n < 1 || n > count_of_bits - 1) { > > ksft_print_msg("Invalid input value for no_of_bits n!\n"); > > ksft_print_msg("Please enter value in range 1 to %d\n", > > count_of_bits - 1);

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 03/24] selftests/resctrl: Move resctrl FS mount/umount to higher level

A few places currently lack umounting resctrl FS on error paths.

Each and every test does require resctrl FS to be present already for feature check. Thus, it makes sense to just mount it on higher level in resctrl_tests.c.

Move resctrl FS mount/unmount into each test function in resctrl_tests.c. Make feature validation to simply check that resctrl FS is mounted.

Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest") Fixes: 78941183d1b1 ("selftests/resctrl: Add Cache QoS Monitoring (CQM) selftest") Fixes: f1dd71982d19 ("selftests/resctrl: Skip the test if requested resctrl feature is not supported") Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cat_test.c | 6 --- tools/testing/selftests/resctrl/cmt_test.c | 4 -- .../testing/selftests/resctrl/resctrl_tests.c | 48 ++++++++++++++++--- tools/testing/selftests/resctrl/resctrl_val.c | 5 -- tools/testing/selftests/resctrl/resctrlfs.c | 7 ++- 5 files changed, 47 insertions(+), 23 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 722c9cd4120d..dec3151cf888 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -106,10 +106,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)

cache_size = 0;

- ret = remount_resctrlfs(true); - if (ret) - return ret; - /* Get default cbm mask for L3/L2 cache */ ret = get_cbm_mask(cache_type, cbm_mask); if (ret) @@ -227,8 +223,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)

out: cat_test_cleanup(); - if (bm_pid) - umount_resctrlfs();

return ret; } diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index af71b2141271..426d11189a99 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -86,10 +86,6 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd)

cache_size = 0;

- ret = remount_resctrlfs(true); - if (ret) - return ret; - if (!validate_resctrl_feature_request(CMT_STR)) return -1;

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 9b9751206e1c..5c9ed52b69f2 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -77,9 +77,15 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span,

ksft_print_msg("Starting MBM BW change ...\n");

+ res = remount_resctrlfs(false); + if (res) { + ksft_exit_fail_msg("Failed to mount resctrl FS\n"); + return; + } + if (!validate_resctrl_feature_request(MBM_STR) || (get_vendor() != ARCH_INTEL)) { ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n"); - return; + goto umount; }

if (!has_ben) @@ -88,6 +94,9 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span, ksft_test_result(!res, "MBM: bw change\n"); if ((get_vendor() == ARCH_INTEL) && res) ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n"); + +umount: + umount_resctrlfs(); }

static void run_mba_test(bool has_ben, char **benchmark_cmd, int span, @@ -97,15 +106,24 @@ static void run_mba_test(bool has_ben, char **benchmark_cmd, int span,

ksft_print_msg("Starting MBA Schemata change ...\n");

+ res = remount_resctrlfs(false); + if (res) { + ksft_exit_fail_msg("Failed to mount resctrl FS\n"); + return; + } + if (!validate_resctrl_feature_request(MBA_STR) || (get_vendor() != ARCH_INTEL)) { ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n"); - return; + goto umount; }

if (!has_ben) sprintf(benchmark_cmd[1], "%d", span); res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd); ksft_test_result(!res, "MBA: schemata change\n"); + +umount: + umount_resctrlfs(); }

static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no) @@ -113,9 +131,16 @@ static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no) int res;

ksft_print_msg("Starting CMT test ...\n"); + + res = remount_resctrlfs(false); + if (res) { + ksft_exit_fail_msg("Failed to mount resctrl FS\n"); + return; + } + if (!validate_resctrl_feature_request(CMT_STR)) { ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n"); - return; + goto umount; }

if (!has_ben) @@ -124,6 +149,9 @@ static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no) ksft_test_result(!res, "CMT: test\n"); if ((get_vendor() == ARCH_INTEL) && res) ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n"); + +umount: + umount_resctrlfs(); }

static void run_cat_test(int cpu_no, int no_of_bits) @@ -132,13 +160,23 @@ static void run_cat_test(int cpu_no, int no_of_bits)

ksft_print_msg("Starting CAT test ...\n");

+ res = remount_resctrlfs(false); + if (res) { + ksft_exit_fail_msg("Failed to mount resctrl FS\n"); + return; + } + if (!validate_resctrl_feature_request(CAT_STR)) { ksft_test_result_skip("Hardware does not support CAT or CAT is disabled\n"); - return; + goto umount; }

res = cat_perf_miss_val(cpu_no, no_of_bits, "L3"); ksft_test_result(!res, "CAT: test\n"); + + +umount: + umount_resctrlfs(); }

int main(int argc, char **argv) @@ -266,7 +304,5 @@ int main(int argc, char **argv) if (cat_test) run_cat_test(cpu_no, no_of_bits);

- umount_resctrlfs(); - ksft_finished(); } diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c index ab1eab1e7ff6..e8f1e6a59f4a 100644 --- a/tools/testing/selftests/resctrl/resctrl_val.c +++ b/tools/testing/selftests/resctrl/resctrl_val.c @@ -648,10 +648,6 @@ int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param) return ret; }

- ret = remount_resctrlfs(param->mum_resctrlfs); - if (ret) - return ret; - /* * If benchmark wasn't successfully started by child, then child should * kill parent, so save parent's pid @@ -788,7 +784,6 @@ int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param) signal_handler_unregister(); out: kill(bm_pid, SIGKILL); - umount_resctrlfs();

return ret; } diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index fb00245dee92..42f547a81e25 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -611,7 +611,8 @@ char *fgrep(FILE *inf, const char *str) * validate_resctrl_feature_request - Check if requested feature is valid. * @resctrl_val: Requested feature * - * Return: True if the feature is supported, else false + * Return: True if the feature is supported, else false. False is also + * returned if resctrl FS is not mounted. */ bool validate_resctrl_feature_request(const char *resctrl_val) { @@ -619,11 +620,13 @@ bool validate_resctrl_feature_request(const char *resctrl_val) bool found = false; char *res; FILE *inf; + int ret;

if (!resctrl_val) return false;

- if (remount_resctrlfs(false)) + ret = find_resctrl_mount(NULL); + if (ret) return false;

if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) {

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:09 a.m.

New subject: [PATCH v2 03/24] selftests/resctrl: Move resctrl FS mount/umount to higher level

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

A few places currently lack umounting resctrl FS on error paths.

You mention "A few places" (plural). In the patch I do see that cmt_resctrl_val() is missing an unmount. Where are the other places?

...

Each and every test does require resctrl FS to be present already for feature check. Thus, it makes sense to just mount it on higher level in resctrl_tests.c.

Move resctrl FS mount/unmount into each test function in resctrl_tests.c. Make feature validation to simply check that resctrl FS is mounted.

...

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index af71b2141271..426d11189a99 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -86,10 +86,6 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) cache_size = 0;
ret = remount_resctrlfs(true);

if (ret)
return ret;
if (!validate_resctrl_feature_request(CMT_STR)) return -1;
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 9b9751206e1c..5c9ed52b69f2 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -77,9 +77,15 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span, ksft_print_msg("Starting MBM BW change ...\n");

res = remount_resctrlfs(false);

I think that should be remount_resctrlfs(true). Please note that any of the tests could be run separately from the command line and thus each test need to ensure a clean environment, it cannot assume that (a) user space provided it with a clean and/or unmounted resctrl or (b) that any test was run before it.

...

if (res) {
ksft_exit_fail_msg("Failed to mount resctrl FS\n");
return;
}

if (!validate_resctrl_feature_request(MBM_STR) || (get_vendor() != ARCH_INTEL)) { ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
return;
goto umount;
}

Reinette

Ilpo Järvinen

24 Apr 24 Apr

2:58 p.m.

New subject: [PATCH v2 03/24] selftests/resctrl: Move resctrl FS mount/umount to higher level

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...
A few places currently lack umounting resctrl FS on error paths.

You mention "A few places" (plural). In the patch I do see that cmt_resctrl_val() is missing an unmount. Where are the other places?

- cmt_resctrl_val() has multiple error paths with direct return. - cat_perf_miss_val() has multiple error paths with direct return.

In addition, validate_resctrl_feature_request() is called by run_mbm_test() and run_mba_test(). Neither MBA nor MBM test tries to umount resctrl FS.

I improved the changelog accordingly.

While doing this, I took a more careful look into how it can result in problems and I think the only way is through PARENT_EXIT() because main has the umount in the end (and the remounting trickery kinda seems to work even if it was hard to track).

Fixing the PARENT_EXIT() problem required yet another change which I add in v3.

As the only failure I could think of is because of PARENT_EXIT(), I removed Fixes tags from this change and put one into the PARENT_EXIT() umount fix. So this change will just be part of the move towards more tractable resctrl FS handling, not a fix anymore.

In the end, after some reshuffling, I ended up having 5 changes related to this:

selftests/resctrl: Remove mum_resctrlfs from struct resctrl_val_param selftests/resctrl: Refactor remount_resctrl(bool mum_resctrlfs) to mount_resctrl() selftests/resctrl: Move resctrl FS mount/umount to higher level selftests/resctrl: Unmount resctrl FS before starting the first test selftests/resctrl: Unmount resctrl FS if child fails to run benchmark

...

...
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index af71b2141271..426d11189a99 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -86,10 +86,6 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) cache_size = 0;
ret = remount_resctrlfs(true);

if (ret)
return ret;
if (!validate_resctrl_feature_request(CMT_STR)) return -1;
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 9b9751206e1c..5c9ed52b69f2 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -77,9 +77,15 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span, ksft_print_msg("Starting MBM BW change ...\n");

res = remount_resctrlfs(false);
I think that should be remount_resctrlfs(true).

...

Please note that any of the tests could be run separately from the command line and thus each test need to ensure a clean environment, it cannot assume that (a) user space provided it with a clean and/or unmounted resctrl or (b) that any test was run before it.

I think I got tripped by the level of complexity here and trying to split patch to minimal parts. I was somehow thinking that remount_resctrlfs(false) would return error if resctrl fs is already mounted.

I've now changed this to pass true instead even if the argument is removed by the other change.

-- i.

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 04/24] selftests/resctrl: Remove mum_resctrlfs

Resctrl FS mount/umount are now cleanly paired.

Removed mum_resctrlfs that is what is left of the earlier remount trickery to make the code cleaner. Rename remount_resctrlfs() to mount_resctrlfs() to match the reduced functionality.

While at it, group the mount/umount prototypes in the header.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cat_test.c | 1 - tools/testing/selftests/resctrl/cmt_test.c | 1 - tools/testing/selftests/resctrl/mba_test.c | 1 - tools/testing/selftests/resctrl/mbm_test.c | 1 - tools/testing/selftests/resctrl/resctrl.h | 4 +--- .../testing/selftests/resctrl/resctrl_tests.c | 8 ++++---- tools/testing/selftests/resctrl/resctrlfs.c | 20 +++++-------------- 7 files changed, 10 insertions(+), 26 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index dec3151cf888..86c62dfd8e78 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -140,7 +140,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) struct resctrl_val_param param = { .resctrl_val = CAT_STR, .cpu_no = cpu_no, - .mum_resctrlfs = false, .setup = cat_setup, };

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 426d11189a99..d31e28416bb7 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -113,7 +113,6 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) .ctrlgrp = "c1", .mongrp = "m1", .cpu_no = cpu_no, - .mum_resctrlfs = false, .filename = RESULT_FILE_NAME, .mask = ~(long_mask << n) & long_mask, .span = cache_size * n / count_of_bits, diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c index cde3781a9ab0..3d53c6c9b9ce 100644 --- a/tools/testing/selftests/resctrl/mba_test.c +++ b/tools/testing/selftests/resctrl/mba_test.c @@ -154,7 +154,6 @@ int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd) .ctrlgrp = "c1", .mongrp = "m1", .cpu_no = cpu_no, - .mum_resctrlfs = true, .filename = RESULT_FILE_NAME, .bw_report = bw_report, .setup = mba_setup diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c index 538d35a6485a..24326cb7bc21 100644 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -123,7 +123,6 @@ int mbm_bw_change(int span, int cpu_no, char *bw_report, char **benchmark_cmd) .mongrp = "m1", .span = span, .cpu_no = cpu_no, - .mum_resctrlfs = true, .filename = RESULT_FILE_NAME, .bw_report = bw_report, .setup = mbm_setup diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 87e39456dee0..c737eb47eacc 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -53,7 +53,6 @@ * @mongrp: Name of the monitor group (mon grp) * @cpu_no: CPU number to which the benchmark would be binded * @span: Memory bytes accessed in each benchmark iteration - * @mum_resctrlfs: Should the resctrl FS be remounted? * @filename: Name of file to which the o/p should be written * @bw_report: Bandwidth report type (reads vs writes) * @setup: Call back function to setup test environment @@ -64,7 +63,6 @@ struct resctrl_val_param { char mongrp[64]; int cpu_no; unsigned long span; - bool mum_resctrlfs; char filename[64]; char *bw_report; unsigned long mask; @@ -84,8 +82,8 @@ extern char llc_occup_path[1024]; int get_vendor(void); bool check_resctrlfs_support(void); int filter_dmesg(void); -int remount_resctrlfs(bool mum_resctrlfs); int get_resource_id(int cpu_no, int *resource_id); +int mount_resctrlfs(void); int umount_resctrlfs(void); int validate_bw_report_request(char *bw_report); bool validate_resctrl_feature_request(const char *resctrl_val); diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 5c9ed52b69f2..f3303459136d 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -77,7 +77,7 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span,

ksft_print_msg("Starting MBM BW change ...\n");

- res = remount_resctrlfs(false); + res = mount_resctrlfs(); if (res) { ksft_exit_fail_msg("Failed to mount resctrl FS\n"); return; @@ -106,7 +106,7 @@ static void run_mba_test(bool has_ben, char **benchmark_cmd, int span,

ksft_print_msg("Starting MBA Schemata change ...\n");

- res = remount_resctrlfs(false); + res = mount_resctrlfs(); if (res) { ksft_exit_fail_msg("Failed to mount resctrl FS\n"); return; @@ -132,7 +132,7 @@ static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no)

ksft_print_msg("Starting CMT test ...\n");

- res = remount_resctrlfs(false); + res = mount_resctrlfs(); if (res) { ksft_exit_fail_msg("Failed to mount resctrl FS\n"); return; @@ -160,7 +160,7 @@ static void run_cat_test(int cpu_no, int no_of_bits)

ksft_print_msg("Starting CAT test ...\n");

- res = remount_resctrlfs(false); + res = mount_resctrlfs(); if (res) { ksft_exit_fail_msg("Failed to mount resctrl FS\n"); return; diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 42f547a81e25..45f785213143 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -48,29 +48,19 @@ static int find_resctrl_mount(char *buffer) }

/* - * remount_resctrlfs - Remount resctrl FS at /sys/fs/resctrl - * @mum_resctrlfs: Should the resctrl FS be remounted? + * mount_resctrlfs - Mount resctrl FS at /sys/fs/resctrl * * If not mounted, mount it. - * If mounted and mum_resctrlfs then remount resctrl FS. - * If mounted and !mum_resctrlfs then noop * * Return: 0 on success, non-zero on failure */ -int remount_resctrlfs(bool mum_resctrlfs) +int mount_resctrlfs(void) { - char mountpoint[256]; int ret;

- ret = find_resctrl_mount(mountpoint); - if (ret) - strcpy(mountpoint, RESCTRL_PATH); - - if (!ret && mum_resctrlfs && umount(mountpoint)) - ksft_print_msg("Fail: unmounting "%s"\n", mountpoint); - - if (!ret && !mum_resctrlfs) - return 0; + ret = find_resctrl_mount(NULL); + if (!ret) + return -1;

ksft_print_msg("Mounting resctrl to "%s"\n", RESCTRL_PATH); ret = mount("resctrl", RESCTRL_PATH, "resctrl", 0, NULL);

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:11 a.m.

New subject: [PATCH v2 04/24] selftests/resctrl: Remove mum_resctrlfs

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

Resctrl FS mount/umount are now cleanly paired.

Removed mum_resctrlfs that is what is left of the earlier remount

Removed -> Remove? I am not sure what is meant with "that is what is left" ...

...

trickery to make the code cleaner. Rename remount_resctrlfs() to mount_resctrlfs() to match the reduced functionality.

These two logical changes would be easier to review if done in two patches.

...

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 5c9ed52b69f2..f3303459136d 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -77,7 +77,7 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span, ksft_print_msg("Starting MBM BW change ...\n");

res = remount_resctrlfs(false);

res = mount_resctrlfs();

As mentioned in previous patch the way I see it remount is still needed.

...

if (res) { ksft_exit_fail_msg("Failed to mount resctrl FS\n"); return; @@ -106,7 +106,7 @@ static void run_mba_test(bool has_ben, char **benchmark_cmd, int span, ksft_print_msg("Starting MBA Schemata change ...\n");

res = remount_resctrlfs(false);

res = mount_resctrlfs(); if (res) { ksft_exit_fail_msg("Failed to mount resctrl FS\n"); return;

@@ -132,7 +132,7 @@ static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no) ksft_print_msg("Starting CMT test ...\n");

res = remount_resctrlfs(false);

res = mount_resctrlfs(); if (res) { ksft_exit_fail_msg("Failed to mount resctrl FS\n"); return;

@@ -160,7 +160,7 @@ static void run_cat_test(int cpu_no, int no_of_bits) ksft_print_msg("Starting CAT test ...\n");

res = remount_resctrlfs(false);

res = mount_resctrlfs(); if (res) { ksft_exit_fail_msg("Failed to mount resctrl FS\n"); return;

diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 42f547a81e25..45f785213143 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -48,29 +48,19 @@ static int find_resctrl_mount(char *buffer) } /*

remount_resctrlfs - Remount resctrl FS at /sys/fs/resctrl

@mum_resctrlfs: Should the resctrl FS be remounted?

mount_resctrlfs - Mount resctrl FS at /sys/fs/resctrl

If not mounted, mount it.

If mounted and mum_resctrlfs then remount resctrl FS.

If mounted and !mum_resctrlfs then noop

Return: 0 on success, non-zero on failure

*/

-int remount_resctrlfs(bool mum_resctrlfs) +int mount_resctrlfs(void) {

char mountpoint[256]; int ret;
ret = find_resctrl_mount(mountpoint);

if (ret)
strcpy(mountpoint, RESCTRL_PATH);
if (!ret && mum_resctrlfs && umount(mountpoint))
ksft_print_msg("Fail: unmounting \"%s\"\n", mountpoint);
if (!ret && !mum_resctrlfs)
return 0;
ret = find_resctrl_mount(NULL);

if (!ret)
return -1;

This seems to assume that resctrl is always unmounted. Should the main program thus start by unmounting resctrl before it runs any test in case it is mounted when user space starts the tests?

Reinette

Ilpo Järvinen

24 Apr 24 Apr

2:55 p.m.

New subject: [PATCH v2 04/24] selftests/resctrl: Remove mum_resctrlfs

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...

...
-int remount_resctrlfs(bool mum_resctrlfs) +int mount_resctrlfs(void) {

char mountpoint[256]; int ret;
ret = find_resctrl_mount(mountpoint);

if (ret)
strcpy(mountpoint, RESCTRL_PATH);
if (!ret && mum_resctrlfs && umount(mountpoint))
ksft_print_msg("Fail: unmounting \"%s\"\n", mountpoint);
if (!ret && !mum_resctrlfs)
return 0;
ret = find_resctrl_mount(NULL);

if (!ret)
return -1;
This seems to assume that resctrl is always unmounted. Should the main program thus start by unmounting resctrl before it runs any test in case it is mounted when user space starts the tests?

I thought that was the wanted functionality. I've now added a change to the series which does this umount before starting a tests.

-- i.

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 05/24] selftests/resctrl: Make span unsigned long everywhere

fill_buf(), show_bw_info(), and resctrl_val_param.span define span as unsigned long.

Consistently use unsigned long elsewhere too for span parameters.

diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c index 24326cb7bc21..3c389ccfea41 100644 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -15,7 +15,7 @@ #define NUM_OF_RUNS 5

static int -show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, int span) +show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, unsigned long span) { unsigned long avg_bw_imc = 0, avg_bw_resc = 0; unsigned long sum_bw_imc = 0, sum_bw_resc = 0; @@ -40,14 +40,14 @@ show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, int span) ksft_print_msg("%s Check MBM diff within %d%%\n", ret ? "Fail:" : "Pass:", MAX_DIFF_PERCENT); ksft_print_msg("avg_diff_per: %d%%\n", avg_diff_per); - ksft_print_msg("Span (MB): %d\n", span); + ksft_print_msg("Span (MB): %lu\n", span); ksft_print_msg("avg_bw_imc: %lu\n", avg_bw_imc); ksft_print_msg("avg_bw_resc: %lu\n", avg_bw_resc);

return ret; }

-static int check_results(int span) +static int check_results(unsigned long span) { unsigned long bw_imc[NUM_OF_RUNS], bw_resc[NUM_OF_RUNS]; char temp[1024], *token_array[8]; @@ -115,7 +115,7 @@ void mbm_test_cleanup(void) remove(RESULT_FILE_NAME); }

-int mbm_bw_change(int span, int cpu_no, char *bw_report, char **benchmark_cmd) +int mbm_bw_change(unsigned long span, int cpu_no, char *bw_report, char **benchmark_cmd) { struct resctrl_val_param param = { .resctrl_val = MBM_STR, diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index c737eb47eacc..fe54c2b4f014 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -99,7 +99,7 @@ int perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int run_fill_buf(unsigned long span, int malloc_and_init_memory, int memflush, int op, char *resctrl_va); int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param); -int mbm_bw_change(int span, int cpu_no, char *bw_report, char **benchmark_cmd); +int mbm_bw_change(unsigned long span, int cpu_no, char *bw_report, char **benchmark_cmd); void tests_cleanup(void); void mbm_test_cleanup(void); int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd); diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index f3303459136d..6bc0eda25e5d 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -70,7 +70,7 @@ void tests_cleanup(void) cat_test_cleanup(); }

-static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span, +static void run_mbm_test(bool has_ben, char **benchmark_cmd, unsigned long span, int cpu_no, char *bw_report) { int res; @@ -99,7 +99,7 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span, umount_resctrlfs(); }

-static void run_mba_test(bool has_ben, char **benchmark_cmd, int span, +static void run_mba_test(bool has_ben, char **benchmark_cmd, unsigned long span, int cpu_no, char *bw_report) { int res; @@ -118,7 +118,7 @@ static void run_mba_test(bool has_ben, char **benchmark_cmd, int span, }

if (!has_ben) - sprintf(benchmark_cmd[1], "%d", span); + sprintf(benchmark_cmd[1], "%lu", span); res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd); ksft_test_result(!res, "MBA: schemata change\n");

@@ -182,9 +182,10 @@ static void run_cat_test(int cpu_no, int no_of_bits) int main(int argc, char **argv) { bool has_ben = false, mbm_test = true, mba_test = true, cmt_test = true; - int c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 0; char *benchmark_cmd[BENCHMARK_ARGS], bw_report[64], bm_type[64]; char benchmark_cmd_area[BENCHMARK_ARGS][BENCHMARK_ARG_SIZE]; + int c, cpu_no = 1, argc_new = argc, i, no_of_bits = 0; + unsigned long span = 250; int ben_ind, ben_count, tests = 0; bool cat_test = true;

@@ -274,7 +275,7 @@ int main(int argc, char **argv) benchmark_cmd[i] = benchmark_cmd_area[i];

strcpy(benchmark_cmd[0], "fill_buf"); - sprintf(benchmark_cmd[1], "%d", span); + sprintf(benchmark_cmd[1], "%lu", span); strcpy(benchmark_cmd[2], "1"); strcpy(benchmark_cmd[3], "1"); strcpy(benchmark_cmd[4], "0");

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:12 a.m.

New subject: [PATCH v2 05/24] selftests/resctrl: Make span unsigned long everywhere

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

fill_buf(), show_bw_info(), and resctrl_val_param.span define span as unsigned long.

There is no fill_buf() in the code and show_bw_info() does not define span as unsigned long (it is even the first function changed in this patch).

...

Consistently use unsigned long elsewhere too for span parameters.

Is unsigned long the right type to use? Tracing through all the indirections I do not see how making all usages unsigned long achieves consistency ... have you considered size_t?

Reinette

Ilpo Järvinen

24 Apr 24 Apr

3:05 p.m.

New subject: [PATCH v2 05/24] selftests/resctrl: Make span unsigned long everywhere

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...
fill_buf(), show_bw_info(), and resctrl_val_param.span define span as unsigned long.

There is no fill_buf() in the code and show_bw_info() does not define span as unsigned long (it is even the first function changed in this patch).

Shuffling a large number of patches around seems detrimental for the quality of the commit messages no matter how hard I try to maintain them up to date. Thanks for noticing this.

...

...
Consistently use unsigned long elsewhere too for span parameters.

Is unsigned long the right type to use? Tracing through all the indirections I do not see how making all usages unsigned long achieves consistency ... have you considered size_t?

I'll change to size_t as it refers to the size of the memory block.

-- i.

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 06/24] selftests/resctrl: Express span in bytes

Make MBA and MBM tests to use megabytes to represent span. CMT test uses bytes.

Convert MBA and MBM tests to use bytes like CMT test to remove the inconsistency between the tests. This also allows removing test dependent buffer sizing from run_benchmark().

diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c index 3c389ccfea41..ea2ea5721e76 100644 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -40,7 +40,7 @@ show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, unsigned long span) ksft_print_msg("%s Check MBM diff within %d%%\n", ret ? "Fail:" : "Pass:", MAX_DIFF_PERCENT); ksft_print_msg("avg_diff_per: %d%%\n", avg_diff_per); - ksft_print_msg("Span (MB): %lu\n", span); + ksft_print_msg("Span in bytes: %lu\n", span); ksft_print_msg("avg_bw_imc: %lu\n", avg_bw_imc); ksft_print_msg("avg_bw_resc: %lu\n", avg_bw_resc);

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 6bc0eda25e5d..f1ed2c89f228 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -185,7 +185,7 @@ int main(int argc, char **argv) char *benchmark_cmd[BENCHMARK_ARGS], bw_report[64], bm_type[64]; char benchmark_cmd_area[BENCHMARK_ARGS][BENCHMARK_ARG_SIZE]; int c, cpu_no = 1, argc_new = argc, i, no_of_bits = 0; - unsigned long span = 250; + unsigned long span = 250 * MB; int ben_ind, ben_count, tests = 0; bool cat_test = true;

diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 45f785213143..09e0a69c81c4 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -296,7 +296,7 @@ int taskset_benchmark(pid_t bm_pid, int cpu_no) void run_benchmark(int signum, siginfo_t *info, void *ucontext) { int operation, ret, malloc_and_init_memory, memflush; - unsigned long span, buffer_span; + unsigned long span; char **benchmark_cmd; char resctrl_val[64]; FILE *fp; @@ -319,12 +319,7 @@ void run_benchmark(int signum, siginfo_t *info, void *ucontext) operation = atoi(benchmark_cmd[4]); sprintf(resctrl_val, "%s", benchmark_cmd[5]);

- if (strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) - buffer_span = span * MB; - else - buffer_span = span; - - if (run_fill_buf(buffer_span, malloc_and_init_memory, memflush, + if (run_fill_buf(span, malloc_and_init_memory, memflush, operation, resctrl_val)) fprintf(stderr, "Error in running fill buffer\n"); } else {

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:12 a.m.

New subject: [PATCH v2 06/24] selftests/resctrl: Express span in bytes

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

Make MBA and MBM tests to use megabytes to represent span. CMT test uses bytes.

Convert MBA and MBM tests to use bytes like CMT test to remove the inconsistency between the tests. This also allows removing test dependent buffer sizing from run_benchmark().

It is not clear to me how this patch achieves this goal since after it show_mba_info() still displays results in MB.

Reinette

Ilpo Järvinen

24 Apr 24 Apr

3:31 p.m.

New subject: [PATCH v2 06/24] selftests/resctrl: Express span in bytes

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...
Make MBA and MBM tests to use megabytes to represent span. CMT test uses bytes.

Convert MBA and MBM tests to use bytes like CMT test to remove the inconsistency between the tests. This also allows removing test dependent buffer sizing from run_benchmark().

It is not clear to me how this patch achieves this goal since after it show_mba_info() still displays results in MB.

This was more for internal consistency as there was the test type dependent span calculation in run_benchmark(). I can fix the changelog to reflect that, however, what you think would be the best approach in show_bw_info(), should I leave the print to use MB (converting the internal representation back from bytes to MB there)?

-- i.

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 07/24] selftests/resctrl: Remove duplicated preparation for span arg

When no benchmark_cmd is given, benchmark_cmd[1] is set to span in main(). There's no need to do it again in run_mba_test().

Remove the duplicated preparation for span argument into benchmark_cmd[1] from run_mba_test(). It enables also removing has_ben argument from run_mba_test() as unnecessary.

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index f1ed2c89f228..3c8ec68eb507 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -99,8 +99,8 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, unsigned long span, umount_resctrlfs(); }

-static void run_mba_test(bool has_ben, char **benchmark_cmd, unsigned long span, - int cpu_no, char *bw_report) +static void run_mba_test(char **benchmark_cmd, unsigned long span, int cpu_no, + char *bw_report) { int res;

@@ -117,8 +117,6 @@ static void run_mba_test(bool has_ben, char **benchmark_cmd, unsigned long span, goto umount; }

- if (!has_ben) - sprintf(benchmark_cmd[1], "%lu", span); res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd); ksft_test_result(!res, "MBA: schemata change\n");

@@ -297,7 +295,7 @@ int main(int argc, char **argv) run_mbm_test(has_ben, benchmark_cmd, span, cpu_no, bw_report);

if (mba_test) - run_mba_test(has_ben, benchmark_cmd, span, cpu_no, bw_report); + run_mba_test(benchmark_cmd, span, cpu_no, bw_report);

if (cmt_test) run_cmt_test(has_ben, benchmark_cmd, cpu_no);

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:14 a.m.

New subject: [PATCH v2 07/24] selftests/resctrl: Remove duplicated preparation for span arg

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

When no benchmark_cmd is given, benchmark_cmd[1] is set to span in main(). There's no need to do it again in run_mba_test().

Remove the duplicated preparation for span argument into benchmark_cmd[1] from run_mba_test(). It enables also removing has_ben argument from run_mba_test() as unnecessary.

I find the last sentence a bit hard to read. How about "After this, the has_ben argument to run_mba_test() can be removed.".

...

Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/resctrl_tests.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index f1ed2c89f228..3c8ec68eb507 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -99,8 +99,8 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, unsigned long span, umount_resctrlfs(); } -static void run_mba_test(bool has_ben, char **benchmark_cmd, unsigned long span,
	 int cpu_no, char *bw_report)
+static void run_mba_test(char **benchmark_cmd, unsigned long span, int cpu_no,
	 char *bw_report)
{ int res; @@ -117,8 +117,6 @@ static void run_mba_test(bool has_ben, char **benchmark_cmd, unsigned long span, goto umount; }
if (!has_ben)
sprintf(benchmark_cmd[1], "%lu", span);

Can "span" also be removed?

...

res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd); ksft_test_result(!res, "MBA: schemata change\n"); @@ -297,7 +295,7 @@ int main(int argc, char **argv) run_mbm_test(has_ben, benchmark_cmd, span, cpu_no, bw_report); if (mba_test)
run_mba_test(has_ben, benchmark_cmd, span, cpu_no, bw_report);
run_mba_test(benchmark_cmd, span, cpu_no, bw_report);
if (cmt_test) run_cmt_test(has_ben, benchmark_cmd, cpu_no);

Reinette

Ilpo Järvinen

24 Apr 24 Apr

3:41 p.m.

New subject: [PATCH v2 07/24] selftests/resctrl: Remove duplicated preparation for span arg

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...
When no benchmark_cmd is given, benchmark_cmd[1] is set to span in main(). There's no need to do it again in run_mba_test().

Remove the duplicated preparation for span argument into benchmark_cmd[1] from run_mba_test(). It enables also removing has_ben argument from run_mba_test() as unnecessary.

I find the last sentence a bit hard to read. How about "After this, the has_ben argument to run_mba_test() can be removed.".

...
Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/resctrl_tests.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index f1ed2c89f228..3c8ec68eb507 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -99,8 +99,8 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, unsigned long span, umount_resctrlfs(); } -static void run_mba_test(bool has_ben, char **benchmark_cmd, unsigned long span,
	 int cpu_no, char *bw_report)
+static void run_mba_test(char **benchmark_cmd, unsigned long span, int cpu_no,
	 char *bw_report)
{ int res; @@ -117,8 +117,6 @@ static void run_mba_test(bool has_ben, char **benchmark_cmd, unsigned long span, goto umount; }
if (!has_ben)
sprintf(benchmark_cmd[1], "%lu", span);
Can "span" also be removed?

Yes.

-- i.

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 08/24] selftests/resctrl: Don't use variable argument list for ->setup()

struct resctrl_val_param has ->setup() function that accepts variable argument list. All test cases use only 1 argument as input and it's the struct resctrl_val_param pointer.

Instead of variable argument list, directly pass struct resctrl_val_param pointer as the only parameter to ->setup().

Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cache.c | 2 +- tools/testing/selftests/resctrl/cat_test.c | 8 +------- tools/testing/selftests/resctrl/cmt_test.c | 9 +-------- tools/testing/selftests/resctrl/mba_test.c | 8 +------- tools/testing/selftests/resctrl/mbm_test.c | 8 +------- tools/testing/selftests/resctrl/resctrl.h | 3 +-- tools/testing/selftests/resctrl/resctrl_val.c | 2 +- 7 files changed, 7 insertions(+), 33 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index 8a4fe8693be6..11105ba30eff 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -238,7 +238,7 @@ int cat_val(struct resctrl_val_param *param) /* Test runs until the callback setup() tells the test to stop. */ while (1) { if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) { - ret = param->setup(1, param); + ret = param->setup(param); if (ret == END_OF_TESTS) { ret = 0; break; diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 86c62dfd8e78..a998e6397518 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -27,17 +27,11 @@ static unsigned long cache_size; * con_mon grp, mon_grp in resctrl FS. * Run 5 times in order to get average values. */ -static int cat_setup(int num, ...) +static int cat_setup(struct resctrl_val_param *p) { - struct resctrl_val_param *p; char schemata[64]; - va_list param; int ret = 0;

- va_start(param, num); - p = va_arg(param, struct resctrl_val_param *); - va_end(param); - /* Run NUM_OF_RUNS times */ if (p->num_of_runs >= NUM_OF_RUNS) return END_OF_TESTS; diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index d31e28416bb7..2d434c03cbba 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -21,15 +21,8 @@ static char cbm_mask[256]; static unsigned long long_mask; static unsigned long cache_size;

-static int cmt_setup(int num, ...) +static int cmt_setup(struct resctrl_val_param *p) { - struct resctrl_val_param *p; - va_list param; - - va_start(param, num); - p = va_arg(param, struct resctrl_val_param *); - va_end(param); - /* Run NUM_OF_RUNS times */ if (p->num_of_runs >= NUM_OF_RUNS) return END_OF_TESTS; diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c index 3d53c6c9b9ce..4d2f145804b8 100644 --- a/tools/testing/selftests/resctrl/mba_test.c +++ b/tools/testing/selftests/resctrl/mba_test.c @@ -22,18 +22,12 @@ * con_mon grp, mon_grp in resctrl FS. * For each allocation, run 5 times in order to get average values. */ -static int mba_setup(int num, ...) +static int mba_setup(struct resctrl_val_param *p) { static int runs_per_allocation, allocation = 100; - struct resctrl_val_param *p; char allocation_str[64]; - va_list param; int ret;

- va_start(param, num); - p = va_arg(param, struct resctrl_val_param *); - va_end(param); - if (runs_per_allocation >= NUM_OF_RUNS) runs_per_allocation = 0;

diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c index ea2ea5721e76..90ee57e5b94b 100644 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -86,16 +86,10 @@ static int check_results(unsigned long span) return ret; }

-static int mbm_setup(int num, ...) +static int mbm_setup(struct resctrl_val_param *p) { - struct resctrl_val_param *p; - va_list param; int ret = 0;

- va_start(param, num); - p = va_arg(param, struct resctrl_val_param *); - va_end(param); - /* Run NUM_OF_RUNS times */ if (p->num_of_runs >= NUM_OF_RUNS) return END_OF_TESTS; diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index fe54c2b4f014..4fd2eaf641e0 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -3,7 +3,6 @@ #ifndef RESCTRL_H #define RESCTRL_H #include <stdio.h> -#include <stdarg.h> #include <math.h> #include <errno.h> #include <sched.h> @@ -67,7 +66,7 @@ struct resctrl_val_param { char *bw_report; unsigned long mask; int num_of_runs; - int (*setup)(int num, ...); + int (*setup)(struct resctrl_val_param *param); };

#define MBM_STR "mbm" diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c index e8f1e6a59f4a..f0f6c5f6e98b 100644 --- a/tools/testing/selftests/resctrl/resctrl_val.c +++ b/tools/testing/selftests/resctrl/resctrl_val.c @@ -759,7 +759,7 @@ int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param)

/* Test runs until the callback setup() tells the test to stop. */ while (1) { - ret = param->setup(1, param); + ret = param->setup(param); if (ret == END_OF_TESTS) { ret = 0; break;

-- 2.30.2

Ilpo Järvinen

11:44 a.m.

New subject: [PATCH v2 09/24] selftests/resctrl: Remove "malloc_and_init_memory" param from run_fill_buf()

run_fill_buf()'s malloc_and_init_memory parameter is always 1. There's also duplicated memory init code for malloc_and_init_memory == 0 case in fill_buf() which is unused.

Remove the malloc_and_init_memory parameter and the duplicated mem init code.

While at it, fix also a typo in run_fill_buf() prototype's argument.

Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cache.c | 6 ++--- tools/testing/selftests/resctrl/fill_buf.c | 27 +++---------------- tools/testing/selftests/resctrl/resctrl.h | 3 +-- .../testing/selftests/resctrl/resctrl_tests.c | 13 +++++---- tools/testing/selftests/resctrl/resctrlfs.c | 12 ++++----- 5 files changed, 19 insertions(+), 42 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index 11105ba30eff..ed5ec8a78c30 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -212,7 +212,7 @@ int measure_cache_vals(struct resctrl_val_param *param, int bm_pid) */ int cat_val(struct resctrl_val_param *param) { - int malloc_and_init_memory = 1, memflush = 1, operation = 0, ret = 0; + int memflush = 1, operation = 0, ret = 0; char *resctrl_val = param->resctrl_val; pid_t bm_pid;

@@ -249,8 +249,8 @@ int cat_val(struct resctrl_val_param *param) if (ret) break;

- if (run_fill_buf(param->span, malloc_and_init_memory, - memflush, operation, resctrl_val)) { + if (run_fill_buf(param->span, memflush, operation, + resctrl_val)) { fprintf(stderr, "Error-running fill buffer\n"); ret = -1; break; diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 341cc93ca84c..5cdb421a2f6c 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -139,35 +139,18 @@ static int fill_cache_write(unsigned char *start_ptr, unsigned char *end_ptr, }

static int -fill_cache(unsigned long long buf_size, int malloc_and_init, int memflush, - int op, char *resctrl_val) +fill_cache(unsigned long long buf_size, int memflush, int op, char *resctrl_val) { unsigned char *start_ptr, *end_ptr; - unsigned long long i; int ret;

- if (malloc_and_init) - start_ptr = malloc_and_init_memory(buf_size); - else - start_ptr = malloc(buf_size); - + start_ptr = malloc_and_init_memory(buf_size); if (!start_ptr) return -1;

startptr = start_ptr; end_ptr = start_ptr + buf_size;

- /* - * It's better to touch the memory once to avoid any compiler - * optimizations - */ - if (!malloc_and_init) { - for (i = 0; i < buf_size; i++) - *start_ptr++ = (unsigned char)rand(); - } - - start_ptr = startptr; - /* Flush the memory before using to avoid "cache hot pages" effect */ if (memflush) mem_flush(start_ptr, buf_size); @@ -187,14 +170,12 @@ fill_cache(unsigned long long buf_size, int malloc_and_init, int memflush, return 0; }

-int run_fill_buf(unsigned long span, int malloc_and_init_memory, - int memflush, int op, char *resctrl_val) +int run_fill_buf(unsigned long span, int memflush, int op, char *resctrl_val) { unsigned long long cache_size = span; int ret;

- ret = fill_cache(cache_size, malloc_and_init_memory, memflush, op, - resctrl_val); + ret = fill_cache(cache_size, memflush, op, resctrl_val); if (ret) { printf("\n Error in fill cache\n"); return -1; diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 4fd2eaf641e0..574adac8932d 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -95,8 +95,7 @@ int write_bm_pid_to_resctrl(pid_t bm_pid, char *ctrlgrp, char *mongrp, char *resctrl_val); int perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long flags); -int run_fill_buf(unsigned long span, int malloc_and_init_memory, int memflush, - int op, char *resctrl_va); +int run_fill_buf(unsigned long span, int memflush, int op, char *resctrl_val); int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param); int mbm_bw_change(unsigned long span, int cpu_no, char *bw_report, char **benchmark_cmd); void tests_cleanup(void); diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index 3c8ec68eb507..ef2977b5ecd4 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -89,7 +89,7 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, unsigned long span, }

if (!has_ben) - sprintf(benchmark_cmd[5], "%s", MBA_STR); + sprintf(benchmark_cmd[4], "%s", MBA_STR); res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd); ksft_test_result(!res, "MBM: bw change\n"); if ((get_vendor() == ARCH_INTEL) && res) @@ -142,7 +142,7 @@ static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no) }

if (!has_ben) - sprintf(benchmark_cmd[5], "%s", CMT_STR); + sprintf(benchmark_cmd[4], "%s", CMT_STR); res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd); ksft_test_result(!res, "CMT: test\n"); if ((get_vendor() == ARCH_INTEL) && res) @@ -269,16 +269,15 @@ int main(int argc, char **argv) benchmark_cmd[ben_count] = NULL; } else { /* If no benchmark is given by "-b" argument, use fill_buf. */ - for (i = 0; i < 6; i++) + for (i = 0; i < 5; i++) benchmark_cmd[i] = benchmark_cmd_area[i];

strcpy(benchmark_cmd[0], "fill_buf"); sprintf(benchmark_cmd[1], "%lu", span); strcpy(benchmark_cmd[2], "1"); - strcpy(benchmark_cmd[3], "1"); - strcpy(benchmark_cmd[4], "0"); - strcpy(benchmark_cmd[5], ""); - benchmark_cmd[6] = NULL; + strcpy(benchmark_cmd[3], "0"); + strcpy(benchmark_cmd[4], ""); + benchmark_cmd[5] = NULL; }

sprintf(bw_report, "reads"); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 09e0a69c81c4..79e030065da8 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -295,7 +295,7 @@ int taskset_benchmark(pid_t bm_pid, int cpu_no) */ void run_benchmark(int signum, siginfo_t *info, void *ucontext) { - int operation, ret, malloc_and_init_memory, memflush; + int operation, ret, memflush; unsigned long span; char **benchmark_cmd; char resctrl_val[64]; @@ -314,13 +314,11 @@ void run_benchmark(int signum, siginfo_t *info, void *ucontext) if (strcmp(benchmark_cmd[0], "fill_buf") == 0) { /* Execute default fill_buf benchmark */ span = strtoul(benchmark_cmd[1], NULL, 10); - malloc_and_init_memory = atoi(benchmark_cmd[2]); - memflush = atoi(benchmark_cmd[3]); - operation = atoi(benchmark_cmd[4]); - sprintf(resctrl_val, "%s", benchmark_cmd[5]); + memflush = atoi(benchmark_cmd[2]); + operation = atoi(benchmark_cmd[3]); + sprintf(resctrl_val, "%s", benchmark_cmd[4]);

- if (run_fill_buf(span, malloc_and_init_memory, memflush, - operation, resctrl_val)) + if (run_fill_buf(span, memflush, operation, resctrl_val)) fprintf(stderr, "Error in running fill buffer\n"); } else { /* Execute specified benchmark */

-- 2.30.2

Ilpo Järvinen

11:44 a.m.

New subject: [PATCH v2 10/24] selftests/resctrl: Split run_fill_buf() to alloc, work, and dealloc helpers

MBM, MBA and CMT test cases use run_fill_buf() to loop indefinitely loop around the buffer. CAT test case is different and doesn't want to loop around the buffer continuously.

Split run_fill_buf() into helpers so that both the use cases are easier to control by creating separate functions for buffer allocation, looping around the buffer and the deallocation. Make those functions available for tests.

Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/fill_buf.c | 46 ++++++++++++++++------ tools/testing/selftests/resctrl/resctrl.h | 3 ++ 2 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 5cdb421a2f6c..6f0438aa71a6 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -24,6 +24,11 @@

static unsigned char *startptr;

+void free_buffer(void) +{ + free(startptr); +} + static void sb(void) { #if defined(__i386) || defined(__x86_64) @@ -138,36 +143,53 @@ static int fill_cache_write(unsigned char *start_ptr, unsigned char *end_ptr, return 0; }

-static int -fill_cache(unsigned long long buf_size, int memflush, int op, char *resctrl_val) +int alloc_buffer(unsigned long long buf_size, int memflush) { - unsigned char *start_ptr, *end_ptr; - int ret; + unsigned char *start_ptr;

start_ptr = malloc_and_init_memory(buf_size); if (!start_ptr) return -1;

startptr = start_ptr; - end_ptr = start_ptr + buf_size;

/* Flush the memory before using to avoid "cache hot pages" effect */ if (memflush) mem_flush(start_ptr, buf_size);

+ return 0; +} + +int use_buffer(unsigned long long buf_size, int op, char *resctrl_val) +{ + unsigned char *end_ptr; + int ret; + + end_ptr = startptr + buf_size; if (op == 0) - ret = fill_cache_read(start_ptr, end_ptr, resctrl_val); + ret = fill_cache_read(startptr, end_ptr, resctrl_val); else - ret = fill_cache_write(start_ptr, end_ptr, resctrl_val); + ret = fill_cache_write(startptr, end_ptr, resctrl_val);

- if (ret) { + if (ret) printf("\n Error in fill cache read/write...\n"); - return -1; - }

- free(startptr); + return ret; +}

- return 0; +static int +fill_cache(unsigned long long buf_size, int memflush, int op, char *resctrl_val) +{ + int ret; + + ret = alloc_buffer(buf_size, memflush); + if (ret) + return ret; + + ret = use_buffer(buf_size, op, resctrl_val); + free_buffer(); + + return ret; }

int run_fill_buf(unsigned long span, int memflush, int op, char *resctrl_val) diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 574adac8932d..75bfa2b2746d 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -95,6 +95,9 @@ int write_bm_pid_to_resctrl(pid_t bm_pid, char *ctrlgrp, char *mongrp, char *resctrl_val); int perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long flags); +void free_buffer(void); +int alloc_buffer(unsigned long long buf_size, int memflush); +int use_buffer(unsigned long long buf_size, int op, char *resctrl_val); int run_fill_buf(unsigned long span, int memflush, int op, char *resctrl_val); int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param); int mbm_bw_change(unsigned long span, int cpu_no, char *bw_report, char **benchmark_cmd);

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:16 a.m.

New subject: [PATCH v2 10/24] selftests/resctrl: Split run_fill_buf() to alloc, work, and dealloc helpers

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

MBM, MBA and CMT test cases use run_fill_buf() to loop indefinitely loop around the buffer. CAT test case is different and doesn't want to

"loop indefinitely loop" -> "loop indefinitely" ?

...

loop around the buffer continuously.

Split run_fill_buf() into helpers so that both the use cases are easier to control by creating separate functions for buffer allocation, looping around the buffer and the deallocation. Make those functions available for tests.

Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/fill_buf.c | 46 ++++++++++++++++------ tools/testing/selftests/resctrl/resctrl.h | 3 ++ 2 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 5cdb421a2f6c..6f0438aa71a6 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -24,6 +24,11 @@ static unsigned char *startptr; +void free_buffer(void) +{

free(startptr);

+}

From what I understand startptr is a global variable because there used to be a signal handler that attempted to free the buffer as part of its cleanup. This was not necessary and this behavior no longer exists, yet the global buffer pointer remains. See commit 73c55fa5ab55 ("selftests/resctrl: Commonize the signal handler register/unregister for all tests")

I do not see why a global buffer pointer with all these indirections are needed. Why not just use a local pointer and pass it to functions as needed? In the above case, just call free(pointer) directly from the test.

...

static void sb(void) { #if defined(__i386) || defined(__x86_64) @@ -138,36 +143,53 @@ static int fill_cache_write(unsigned char *start_ptr, unsigned char *end_ptr, return 0; } -static int -fill_cache(unsigned long long buf_size, int memflush, int op, char *resctrl_val) +int alloc_buffer(unsigned long long buf_size, int memflush) {

This can be an allocation function that returns a pointer to allocated buffer, NULL if error.

...

unsigned char *start_ptr, *end_ptr;

int ret;

unsigned char *start_ptr;

start_ptr = malloc_and_init_memory(buf_size); if (!start_ptr) return -1; startptr = start_ptr;

end_ptr = start_ptr + buf_size;

/* Flush the memory before using to avoid "cache hot pages" effect */ if (memflush) mem_flush(start_ptr, buf_size);

return 0;

+}

+int use_buffer(unsigned long long buf_size, int op, char *resctrl_val) +{

unsigned char *end_ptr;

int ret;

end_ptr = startptr + buf_size; if (op == 0)
ret = fill_cache_read(start_ptr, end_ptr, resctrl_val);
ret = fill_cache_read(startptr, end_ptr, resctrl_val);
else
ret = fill_cache_write(start_ptr, end_ptr, resctrl_val);
ret = fill_cache_write(startptr, end_ptr, resctrl_val);
if (ret) {

if (ret) printf("\n Error in fill cache read/write...\n");
return -1;
}
free(startptr);

return ret;

+}

This seems like an unnecessary level of abstraction to me. Could callers not just call fill_cache_read()/fill_cache_write() directly? I think doing so will make tests easier to understand. Looking ahead at how cat_val() turns out in the final patch I do think a call to fill_cache_read() is easier to follow than this abstraction.

...

return 0;

+static int +fill_cache(unsigned long long buf_size, int memflush, int op, char *resctrl_val) +{
int ret;

ret = alloc_buffer(buf_size, memflush);

if (ret)
return ret;
ret = use_buffer(buf_size, op, resctrl_val);

free_buffer();

return ret;
} int run_fill_buf(unsigned long span, int memflush, int op, char *resctrl_val) diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 574adac8932d..75bfa2b2746d 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -95,6 +95,9 @@ int write_bm_pid_to_resctrl(pid_t bm_pid, char *ctrlgrp, char *mongrp, char *resctrl_val); int perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long flags); +void free_buffer(void); +int alloc_buffer(unsigned long long buf_size, int memflush); +int use_buffer(unsigned long long buf_size, int op, char *resctrl_val); int run_fill_buf(unsigned long span, int memflush, int op, char *resctrl_val); int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param); int mbm_bw_change(unsigned long span, int cpu_no, char *bw_report, char **benchmark_cmd);

Reinette

Ilpo Järvinen

24 Apr 24 Apr

4:01 p.m.

New subject: [PATCH v2 10/24] selftests/resctrl: Split run_fill_buf() to alloc, work, and dealloc helpers

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...
diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 5cdb421a2f6c..6f0438aa71a6 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -24,6 +24,11 @@ static unsigned char *startptr; +void free_buffer(void) +{

free(startptr);

+}

...
From what I understand startptr is a global variable because there used

to be a signal handler that attempted to free the buffer as part of its cleanup. This was not necessary and this behavior no longer exists, yet the global buffer pointer remains. See commit 73c55fa5ab55 ("selftests/resctrl: Commonize the signal handler register/unregister for all tests")

I do not see why a global buffer pointer with all these indirections are needed. Why not just use a local pointer and pass it to functions as needed? In the above case, just call free(pointer) directly from the test.

OK, I'll try to convert all this into using non-global pointers then. It requires a bit refactoring but, IIRC, it is doable.

...

...
static void sb(void) { #if defined(__i386) || defined(__x86_64) @@ -138,36 +143,53 @@ static int fill_cache_write(unsigned char *start_ptr, unsigned char *end_ptr, return 0; } -static int -fill_cache(unsigned long long buf_size, int memflush, int op, char *resctrl_val) +int alloc_buffer(unsigned long long buf_size, int memflush) {

This can be an allocation function that returns a pointer to allocated buffer, NULL if error.

...
unsigned char *start_ptr, *end_ptr;

int ret;

unsigned char *start_ptr;

start_ptr = malloc_and_init_memory(buf_size); if (!start_ptr) return -1; startptr = start_ptr;

end_ptr = start_ptr + buf_size;

/* Flush the memory before using to avoid "cache hot pages" effect */ if (memflush) mem_flush(start_ptr, buf_size);

return 0;

+}

+int use_buffer(unsigned long long buf_size, int op, char *resctrl_val) +{

unsigned char *end_ptr;

int ret;

end_ptr = startptr + buf_size; if (op == 0)
ret = fill_cache_read(start_ptr, end_ptr, resctrl_val);
ret = fill_cache_read(startptr, end_ptr, resctrl_val);
else
ret = fill_cache_write(start_ptr, end_ptr, resctrl_val);
ret = fill_cache_write(startptr, end_ptr, resctrl_val);
if (ret) {

if (ret) printf("\n Error in fill cache read/write...\n");
return -1;
}
free(startptr);

return ret;

+}
This seems like an unnecessary level of abstraction to me. Could callers not just call fill_cache_read()/fill_cache_write() directly? I think doing so will make tests easier to understand. Looking ahead at how cat_val() turns out in the final patch I do think a call to fill_cache_read() is easier to follow than this abstraction.

Passing a custom benchmark command with -b would lose some functionality if this abstraction is removed. CAT test could make a direct call though as it doesn't care about the benchmark command.

How useful that -b functionality is for selftesting is somewhat questionable though.

-- i.

Reinette Chatre

4:32 p.m.

New subject: [PATCH v2 10/24] selftests/resctrl: Split run_fill_buf() to alloc, work, and dealloc helpers

Hi Ilpo,

On 4/24/2023 9:01 AM, Ilpo Järvinen wrote:

...

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...
On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

...

...
...
static void sb(void) { #if defined(__i386) || defined(__x86_64) @@ -138,36 +143,53 @@ static int fill_cache_write(unsigned char *start_ptr, unsigned char *end_ptr, return 0; } -static int -fill_cache(unsigned long long buf_size, int memflush, int op, char *resctrl_val) +int alloc_buffer(unsigned long long buf_size, int memflush) {

This can be an allocation function that returns a pointer to allocated buffer, NULL if error.

...
unsigned char *start_ptr, *end_ptr;

int ret;

unsigned char *start_ptr;

start_ptr = malloc_and_init_memory(buf_size); if (!start_ptr) return -1; startptr = start_ptr;

end_ptr = start_ptr + buf_size;

/* Flush the memory before using to avoid "cache hot pages" effect */ if (memflush) mem_flush(start_ptr, buf_size);

return 0;

+}

+int use_buffer(unsigned long long buf_size, int op, char *resctrl_val) +{

unsigned char *end_ptr;

int ret;

end_ptr = startptr + buf_size; if (op == 0)
ret = fill_cache_read(start_ptr, end_ptr, resctrl_val);
ret = fill_cache_read(startptr, end_ptr, resctrl_val);
else
ret = fill_cache_write(start_ptr, end_ptr, resctrl_val);
ret = fill_cache_write(startptr, end_ptr, resctrl_val);
if (ret) {

if (ret) printf("\n Error in fill cache read/write...\n");
return -1;
}
free(startptr);

return ret;

+}
This seems like an unnecessary level of abstraction to me. Could callers not just call fill_cache_read()/fill_cache_write() directly? I think doing so will make tests easier to understand. Looking ahead at how cat_val() turns out in the final patch I do think a call to fill_cache_read() is easier to follow than this abstraction.
Passing a custom benchmark command with -b would lose some functionality if this abstraction is removed. CAT test could make a direct call though as it doesn't care about the benchmark command.

How useful that -b functionality is for selftesting is somewhat questionable though.

I do not think we are speaking about the same thing here. I think that use_buffer() is unnecessary. fill_cache() can just call fill_cache_read() or fill_cache_write() directly, depending on the op value. Could you please elaborate how that impacts the custom benchmark?

Looking ahead at patch 24/24: "selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test" I feel more strongly that use_buffer() is unnecessary since it adds an unnecessary layer and makes it harder to see what the test does.

Reinette

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 11/24] selftests/resctrl: Remove start_buf local variable from buffer alloc func

alloc_buffer() has local start_ptr variable which is unnecessary.

Assign the pointer of the allocated buffer directly to startptr that is a global variable in fill_buf to simplify the code in alloc_buffer().

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/fill_buf.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 6f0438aa71a6..3a10b554d778 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -145,17 +145,13 @@ static int fill_cache_write(unsigned char *start_ptr, unsigned char *end_ptr,

int alloc_buffer(unsigned long long buf_size, int memflush) { - unsigned char *start_ptr; - - start_ptr = malloc_and_init_memory(buf_size); - if (!start_ptr) + startptr = malloc_and_init_memory(buf_size); + if (!startptr) return -1;

- startptr = start_ptr; - /* Flush the memory before using to avoid "cache hot pages" effect */ if (memflush) - mem_flush(start_ptr, buf_size); + mem_flush(startptr, buf_size);

return 0; }

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:16 a.m.

New subject: [PATCH v2 11/24] selftests/resctrl: Remove start_buf local variable from buffer alloc func

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

alloc_buffer() has local start_ptr variable which is unnecessary.

Assign the pointer of the allocated buffer directly to startptr that is a global variable in fill_buf to simplify the code in alloc_buffer().

I think the opposite (removing the global variable) would make the code more manageable. Tests manage their own buffer pointers, there is no need for a global buffer pointer (that I can see).

Reinette

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 12/24] selftests/resctrl: Don't pass test name to fill_buf

Test name is passed to fill_buf functions so that they can loop around buffer only once. This is required for CAT test case.

To loop around buffer only once, we don't need to let fill_buf know which test case it is. Instead, use a boolean argument 'once' which makes fill_buf more generic.

As the benchmark_cmd no longer needs to include the test name, a few test running functions can be simplified to not write the test name into the default benchmark_cmd which allows dropping has_ben argument from the functions too.

Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cache.c | 3 +-- tools/testing/selftests/resctrl/fill_buf.c | 22 +++++++++---------- tools/testing/selftests/resctrl/resctrl.h | 4 ++-- .../testing/selftests/resctrl/resctrl_tests.c | 14 +++++------- tools/testing/selftests/resctrl/resctrlfs.c | 11 +++++++--- 5 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index ed5ec8a78c30..a15f1f2715cd 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -249,8 +249,7 @@ int cat_val(struct resctrl_val_param *param) if (ret) break;

- if (run_fill_buf(param->span, memflush, operation, - resctrl_val)) { + if (run_fill_buf(param->span, memflush, operation, true)) { fprintf(stderr, "Error-running fill buffer\n"); ret = -1; break; diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 3a10b554d778..677e1a113629 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -108,14 +108,14 @@ void fill_one_span_write(unsigned char *start_ptr, unsigned char *end_ptr) }

static int fill_cache_read(unsigned char *start_ptr, unsigned char *end_ptr, - char *resctrl_val) + bool once) { int ret = 0; FILE *fp;

while (1) { ret = fill_one_span_read(start_ptr, end_ptr); - if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) + if (once) break; }

@@ -132,11 +132,11 @@ static int fill_cache_read(unsigned char *start_ptr, unsigned char *end_ptr, }

static int fill_cache_write(unsigned char *start_ptr, unsigned char *end_ptr, - char *resctrl_val) + bool once) { while (1) { fill_one_span_write(start_ptr, end_ptr); - if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) + if (once) break; }

@@ -156,16 +156,16 @@ int alloc_buffer(unsigned long long buf_size, int memflush) return 0; }

-int use_buffer(unsigned long long buf_size, int op, char *resctrl_val) +int use_buffer(unsigned long long buf_size, int op, bool once) { unsigned char *end_ptr; int ret;

end_ptr = startptr + buf_size; if (op == 0) - ret = fill_cache_read(startptr, end_ptr, resctrl_val); + ret = fill_cache_read(startptr, end_ptr, once); else - ret = fill_cache_write(startptr, end_ptr, resctrl_val); + ret = fill_cache_write(startptr, end_ptr, once);

if (ret) printf("\n Error in fill cache read/write...\n"); @@ -174,7 +174,7 @@ int use_buffer(unsigned long long buf_size, int op, char *resctrl_val) }

static int -fill_cache(unsigned long long buf_size, int memflush, int op, char *resctrl_val) +fill_cache(unsigned long long buf_size, int memflush, int op, bool once) { int ret;

@@ -182,18 +182,18 @@ fill_cache(unsigned long long buf_size, int memflush, int op, char *resctrl_val) if (ret) return ret;

- ret = use_buffer(buf_size, op, resctrl_val); + ret = use_buffer(buf_size, op, once); free_buffer();

return ret; }

-int run_fill_buf(unsigned long span, int memflush, int op, char *resctrl_val) +int run_fill_buf(unsigned long span, int memflush, int op, bool once) { unsigned long long cache_size = span; int ret;

- ret = fill_cache(cache_size, memflush, op, resctrl_val); + ret = fill_cache(cache_size, memflush, op, once); if (ret) { printf("\n Error in fill cache\n"); return -1; diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 75bfa2b2746d..8748121345f3 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -97,8 +97,8 @@ int perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long flags); void free_buffer(void); int alloc_buffer(unsigned long long buf_size, int memflush); -int use_buffer(unsigned long long buf_size, int op, char *resctrl_val); -int run_fill_buf(unsigned long span, int memflush, int op, char *resctrl_val); +int use_buffer(unsigned long long buf_size, int op, bool once); +int run_fill_buf(unsigned long span, int memflush, int op, bool once); int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param); int mbm_bw_change(unsigned long span, int cpu_no, char *bw_report, char **benchmark_cmd); void tests_cleanup(void); diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index ef2977b5ecd4..6e710989f368 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -70,7 +70,7 @@ void tests_cleanup(void) cat_test_cleanup(); }

-static void run_mbm_test(bool has_ben, char **benchmark_cmd, unsigned long span, +static void run_mbm_test(char **benchmark_cmd, unsigned long span, int cpu_no, char *bw_report) { int res; @@ -88,8 +88,6 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, unsigned long span, goto umount; }

- if (!has_ben) - sprintf(benchmark_cmd[4], "%s", MBA_STR); res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd); ksft_test_result(!res, "MBM: bw change\n"); if ((get_vendor() == ARCH_INTEL) && res) @@ -124,7 +122,7 @@ static void run_mba_test(char **benchmark_cmd, unsigned long span, int cpu_no, umount_resctrlfs(); }

-static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no) +static void run_cmt_test(char **benchmark_cmd, int cpu_no) { int res;

@@ -141,8 +139,6 @@ static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no) goto umount; }

- if (!has_ben) - sprintf(benchmark_cmd[4], "%s", CMT_STR); res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd); ksft_test_result(!res, "CMT: test\n"); if ((get_vendor() == ARCH_INTEL) && res) @@ -276,7 +272,7 @@ int main(int argc, char **argv) sprintf(benchmark_cmd[1], "%lu", span); strcpy(benchmark_cmd[2], "1"); strcpy(benchmark_cmd[3], "0"); - strcpy(benchmark_cmd[4], ""); + strcpy(benchmark_cmd[4], "false"); benchmark_cmd[5] = NULL; }

@@ -291,13 +287,13 @@ int main(int argc, char **argv) ksft_set_plan(tests ? : 4);

if (mbm_test) - run_mbm_test(has_ben, benchmark_cmd, span, cpu_no, bw_report); + run_mbm_test(benchmark_cmd, span, cpu_no, bw_report);

if (mba_test) run_mba_test(benchmark_cmd, span, cpu_no, bw_report);

if (cmt_test) - run_cmt_test(has_ben, benchmark_cmd, cpu_no); + run_cmt_test(benchmark_cmd, cpu_no);

if (cat_test) run_cat_test(cpu_no, no_of_bits); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 79e030065da8..7fef9068d7fd 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -298,7 +298,7 @@ void run_benchmark(int signum, siginfo_t *info, void *ucontext) int operation, ret, memflush; unsigned long span; char **benchmark_cmd; - char resctrl_val[64]; + bool once; FILE *fp;

benchmark_cmd = info->si_ptr; @@ -316,9 +316,14 @@ void run_benchmark(int signum, siginfo_t *info, void *ucontext) span = strtoul(benchmark_cmd[1], NULL, 10); memflush = atoi(benchmark_cmd[2]); operation = atoi(benchmark_cmd[3]); - sprintf(resctrl_val, "%s", benchmark_cmd[4]); + if (!strcmp(benchmark_cmd[4], "true")) + once = true; + else if (!strcmp(benchmark_cmd[4], "false")) + once = false; + else + PARENT_EXIT("Invalid once parameter");

- if (run_fill_buf(span, memflush, operation, resctrl_val)) + if (run_fill_buf(span, memflush, operation, once)) fprintf(stderr, "Error in running fill buffer\n"); } else { /* Execute specified benchmark */

-- 2.30.2

Ilpo Järvinen

11:44 a.m.

New subject: [PATCH v2 13/24] selftests/resctrl: Add flush_buffer() to fill_buf

Currently, flushing is only done after allocating and filling the buffer and cannot be controlled by the test cases.

The new CAT test will want to control flushing within a test so introduce flush_buffer() for that purpose.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/fill_buf.c | 5 +++++ tools/testing/selftests/resctrl/resctrl.h | 1 + 2 files changed, 6 insertions(+)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 677e1a113629..7e0d3a1ea555 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -58,6 +58,11 @@ static void mem_flush(void *p, size_t s) sb(); }

+void flush_buffer(unsigned long long span) +{ + mem_flush(startptr, span); +} + static void *malloc_and_init_memory(size_t s) { void *p = NULL; diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 8748121345f3..ba36eb5fdf0d 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -97,6 +97,7 @@ int perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long flags); void free_buffer(void); int alloc_buffer(unsigned long long buf_size, int memflush); +void flush_buffer(unsigned long long span); int use_buffer(unsigned long long buf_size, int op, bool once); int run_fill_buf(unsigned long span, int memflush, int op, bool once); int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param);

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:16 a.m.

New subject: [PATCH v2 13/24] selftests/resctrl: Add flush_buffer() to fill_buf

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

Currently, flushing is only done after allocating and filling the buffer and cannot be controlled by the test cases.

The new CAT test will want to control flushing within a test so introduce flush_buffer() for that purpose.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/fill_buf.c | 5 +++++ tools/testing/selftests/resctrl/resctrl.h | 1 + 2 files changed, 6 insertions(+)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 677e1a113629..7e0d3a1ea555 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -58,6 +58,11 @@ static void mem_flush(void *p, size_t s) sb(); } +void flush_buffer(unsigned long long span) +{

mem_flush(startptr, span);

+}

I do not think this indirection is needed. In the same spirit of feedback to previous patches a test can manage its own buffer pointer and flush it by calling mem_flush() directly.

Reinette

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 14/24] selftests/resctrl: Remove test type checks from cat_val()

cat_val() is only used during CAT test but it checks for test type.

Remove test type checks and the unused else branch from cat_val().

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cache.c | 45 +++++++++++-------------- 1 file changed, 20 insertions(+), 25 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index a15f1f2715cd..6bc912de38be 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -232,36 +232,31 @@ int cat_val(struct resctrl_val_param *param) if (ret) return ret;

- if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) - initialize_llc_perf(); + initialize_llc_perf();

/* Test runs until the callback setup() tells the test to stop. */ while (1) { - if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) { - ret = param->setup(param); - if (ret == END_OF_TESTS) { - ret = 0; - break; - } - if (ret < 0) - break; - ret = reset_enable_llc_perf(bm_pid, param->cpu_no); - if (ret) - break; - - if (run_fill_buf(param->span, memflush, operation, true)) { - fprintf(stderr, "Error-running fill buffer\n"); - ret = -1; - break; - } - - sleep(1); - ret = measure_cache_vals(param, bm_pid); - if (ret) - break; - } else { + ret = param->setup(param); + if (ret == END_OF_TESTS) { + ret = 0; break; } + if (ret < 0) + break; + ret = reset_enable_llc_perf(bm_pid, param->cpu_no); + if (ret) + break; + + if (run_fill_buf(param->span, memflush, operation, true)) { + fprintf(stderr, "Error-running fill buffer\n"); + ret = -1; + break; + } + + sleep(1); + ret = measure_cache_vals(param, bm_pid); + if (ret) + break; }

return ret;

-- 2.30.2

Ilpo Järvinen

11:44 a.m.

New subject: [PATCH v2 15/24] selftests/resctrl: Refactor get_cbm_mask()

Callers of get_cbm_mask() are required to pass a string into which the CBM bit mask is read into. Neither CAT nor CMT tests need the mask as string but just convert it into an unsigned long value.

The bit mask reader can only read .../cbm_mask files.

Generalize the bit mask reading function into get_bit_mask() such that it can be used to handle other files besides the .../cbm_mask and handle the unsigned long conversion within within get_bit_mask() using fscanf(). Alter get_cbm_mask() to construct the filename for get_bit_mask().

Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cat_test.c | 5 +-- tools/testing/selftests/resctrl/cmt_test.c | 5 +-- tools/testing/selftests/resctrl/resctrl.h | 2 +- tools/testing/selftests/resctrl/resctrlfs.c | 50 +++++++++++++++------ 4 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index a998e6397518..9bf5d05d9e74 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -18,7 +18,6 @@ #define MAX_DIFF 1000000

static int count_of_bits; -static char cbm_mask[256]; static unsigned long long_mask; static unsigned long cache_size;

@@ -101,12 +100,10 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) cache_size = 0;

/* Get default cbm mask for L3/L2 cache */ - ret = get_cbm_mask(cache_type, cbm_mask); + ret = get_cbm_mask(cache_type, &long_mask); if (ret) return ret;

- long_mask = strtoul(cbm_mask, NULL, 16); - /* Get L3/L2 cache size */ ret = get_cache_size(cpu_no, cache_type, &cache_size); if (ret) diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 2d434c03cbba..ae54bbabbd91 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -17,7 +17,6 @@ #define MAX_DIFF_PERCENT 15

static int count_of_bits; -static char cbm_mask[256]; static unsigned long long_mask; static unsigned long cache_size;

@@ -82,12 +81,10 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) if (!validate_resctrl_feature_request(CMT_STR)) return -1;

- ret = get_cbm_mask("L3", cbm_mask); + ret = get_cbm_mask("L3", &long_mask); if (ret) return ret;

- long_mask = strtoul(cbm_mask, NULL, 16); - ret = get_cache_size(cpu_no, "L3", &cache_size); if (ret) return ret; diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index ba36eb5fdf0d..bcc95faa5b4e 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -106,7 +106,7 @@ void tests_cleanup(void); void mbm_test_cleanup(void); int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd); void mba_test_cleanup(void); -int get_cbm_mask(char *cache_type, char *cbm_mask); +int get_cbm_mask(char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size); void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 7fef9068d7fd..f01ecfa64063 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -186,30 +186,29 @@ int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size) #define CORE_SIBLINGS_PATH "/sys/bus/cpu/devices/cpu"

/* - * get_cbm_mask - Get cbm mask for given cache - * @cache_type: Cache level L2/L3 - * @cbm_mask: cbm_mask returned as a string + * get_bit_mask - Get bit mask from given file + * @filename: File containing the mask + * @mask: The bit mask returned as unsigned long * * Return: = 0 on success, < 0 on failure. */ -int get_cbm_mask(char *cache_type, char *cbm_mask) +static int get_bit_mask(char *filename, unsigned long *mask) { - char cbm_mask_path[1024]; FILE *fp;

- if (!cbm_mask) + if (!filename || !mask) return -1;

- sprintf(cbm_mask_path, "%s/%s/cbm_mask", INFO_PATH, cache_type); - - fp = fopen(cbm_mask_path, "r"); + fp = fopen(filename, "r"); if (!fp) { - perror("Failed to open cache level"); - + fprintf(stderr, "Failed to open bit mask file '%s': %s\n", + filename, strerror(errno)); return -1; } - if (fscanf(fp, "%s", cbm_mask) <= 0) { - perror("Could not get max cbm_mask"); + + if (fscanf(fp, "%lx", mask) <= 0) { + fprintf(stderr, "Could not read bit mask file '%s': %s\n", + filename, strerror(errno)); fclose(fp);

return -1; @@ -219,6 +218,31 @@ int get_cbm_mask(char *cache_type, char *cbm_mask) return 0; }

+/* + * get_cbm_bits - Get number of bits in cbm mask + * @cache_type: Cache level L2/L3 + * @mask: cbm_mask returned as unsigned long + * + * Return: = 0 on success, < 0 on failure. + */ +int get_cbm_mask(char *cache_type, unsigned long *mask) +{ + char cbm_mask_path[1024]; + int ret; + + if (!cache_type) + return -1; + + snprintf(cbm_mask_path, sizeof(cbm_mask_path), "%s/%s/cbm_mask", + INFO_PATH, cache_type); + + ret = get_bit_mask(cbm_mask_path, mask); + if (ret) + return -1; + + return 0; +} + /* * get_core_sibling - Get sibling core id from the same socket for given CPU * @cpu_no: CPU number

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:17 a.m.

New subject: [PATCH v2 15/24] selftests/resctrl: Refactor get_cbm_mask()

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

Callers of get_cbm_mask() are required to pass a string into which the CBM bit mask is read into. Neither CAT nor CMT tests need the

There is a double "into" above. Perhaps the second can be dropped?

...

mask as string but just convert it into an unsigned long value.

The bit mask reader can only read .../cbm_mask files.

Generalize the bit mask reading function into get_bit_mask() such that it can be used to handle other files besides the .../cbm_mask and handle the unsigned long conversion within within get_bit_mask() using fscanf(). Alter get_cbm_mask() to construct the filename for get_bit_mask().

Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cat_test.c | 5 +-- tools/testing/selftests/resctrl/cmt_test.c | 5 +-- tools/testing/selftests/resctrl/resctrl.h | 2 +- tools/testing/selftests/resctrl/resctrlfs.c | 50 +++++++++++++++------ 4 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index a998e6397518..9bf5d05d9e74 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -18,7 +18,6 @@ #define MAX_DIFF 1000000 static int count_of_bits; -static char cbm_mask[256]; static unsigned long long_mask; static unsigned long cache_size; @@ -101,12 +100,10 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) cache_size = 0; /* Get default cbm mask for L3/L2 cache */

ret = get_cbm_mask(cache_type, cbm_mask);

ret = get_cbm_mask(cache_type, &long_mask); if (ret) return ret;

long_mask = strtoul(cbm_mask, NULL, 16);

/* Get L3/L2 cache size */ ret = get_cache_size(cpu_no, cache_type, &cache_size); if (ret)

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 2d434c03cbba..ae54bbabbd91 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -17,7 +17,6 @@ #define MAX_DIFF_PERCENT 15 static int count_of_bits; -static char cbm_mask[256]; static unsigned long long_mask; static unsigned long cache_size; @@ -82,12 +81,10 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) if (!validate_resctrl_feature_request(CMT_STR)) return -1;

ret = get_cbm_mask("L3", cbm_mask);

ret = get_cbm_mask("L3", &long_mask); if (ret) return ret;

I think this is a good change. It does raise the question why long_mask is a global variable so I think it may make things go smoother if the patch making long_mask local is moved to be before this patch.

Reinette

Shaopeng Tan (Fujitsu)

31 May 31 May

6:03 a.m.

New subject: [PATCH v2 15/24] selftests/resctrl: Refactor get_cbm_mask()

Hi Ilpo,

...

Callers of get_cbm_mask() are required to pass a string into which the CBM bit mask is read into. Neither CAT nor CMT tests need the mask as string but just convert it into an unsigned long value.

The bit mask reader can only read .../cbm_mask files.

Generalize the bit mask reading function into get_bit_mask() such that it can be used to handle other files besides the .../cbm_mask and handle the unsigned long conversion within within get_bit_mask() using fscanf(). Alter get_cbm_mask() to construct the filename for get_bit_mask().

Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cat_test.c | 5 +-- tools/testing/selftests/resctrl/cmt_test.c | 5 +-- tools/testing/selftests/resctrl/resctrl.h | 2 +- tools/testing/selftests/resctrl/resctrlfs.c | 50 +++++++++++++++------ 4 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index a998e6397518..9bf5d05d9e74 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -18,7 +18,6 @@ #define MAX_DIFF 1000000

static int count_of_bits; -static char cbm_mask[256]; static unsigned long long_mask; static unsigned long cache_size;

@@ -101,12 +100,10 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) cache_size = 0;

/* Get default cbm mask for L3/L2 cache */

ret = get_cbm_mask(cache_type, cbm_mask);

ret = get_cbm_mask(cache_type, &long_mask); if (ret) return ret;

long_mask = strtoul(cbm_mask, NULL, 16);

/* Get L3/L2 cache size */ ret = get_cache_size(cpu_no, cache_type, &cache_size); if (ret)

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 2d434c03cbba..ae54bbabbd91 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -17,7 +17,6 @@ #define MAX_DIFF_PERCENT 15

static int count_of_bits; -static char cbm_mask[256]; static unsigned long long_mask; static unsigned long cache_size;

@@ -82,12 +81,10 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) if (!validate_resctrl_feature_request(CMT_STR)) return -1;

ret = get_cbm_mask("L3", cbm_mask);

ret = get_cbm_mask("L3", &long_mask); if (ret) return ret;

long_mask = strtoul(cbm_mask, NULL, 16);

ret = get_cache_size(cpu_no, "L3", &cache_size); if (ret) return ret;

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index ba36eb5fdf0d..bcc95faa5b4e 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -106,7 +106,7 @@ void tests_cleanup(void); void mbm_test_cleanup(void); int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd); void mba_test_cleanup(void); -int get_cbm_mask(char *cache_type, char *cbm_mask); +int get_cbm_mask(char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size); void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 7fef9068d7fd..f01ecfa64063 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -186,30 +186,29 @@ int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size) #define CORE_SIBLINGS_PATH "/sys/bus/cpu/devices/cpu"

/*

get_cbm_mask - Get cbm mask for given cache

@cache_type: Cache level L2/L3

@cbm_mask: cbm_mask returned as a string

get_bit_mask - Get bit mask from given file

@filename: File containing the mask

@mask: The bit mask returned as unsigned long

Return: = 0 on success, < 0 on failure.

*/

-int get_cbm_mask(char *cache_type, char *cbm_mask) +static int get_bit_mask(char *filename, unsigned long *mask) {

char cbm_mask_path[1024]; FILE *fp;

if (!cbm_mask)

if (!filename || !mask) return -1;

sprintf(cbm_mask_path, "%s/%s/cbm_mask", INFO_PATH,

cache_type);

fp = fopen(cbm_mask_path, "r");

fp = fopen(filename, "r"); if (!fp) {
perror("Failed to open cache level");
fprintf(stderr, "Failed to open bit mask file '%s': %s\n",
	filename, strerror(errno));
return -1; }
if (fscanf(fp, "%s", cbm_mask) <= 0) {
perror("Could not get max cbm_mask");
if (fscanf(fp, "%lx", mask) <= 0) {
fprintf(stderr, "Could not read bit mask file '%s': %s\n",
	filename, strerror(errno));
fclose(fp);

return -1;
@@ -219,6 +218,31 @@ int get_cbm_mask(char *cache_type, char *cbm_mask) return 0; }

+/*

get_cbm_bits - Get number of bits in cbm mask

Is it get_cbm_mask?

Best regards, Shaopeng TAN

...

@cache_type: Cache level L2/L3

@mask: cbm_mask returned as unsigned long

Return: = 0 on success, < 0 on failure.

*/

+int get_cbm_mask(char *cache_type, unsigned long *mask) {
char cbm_mask_path[1024];

int ret;

if (!cache_type)
return -1;
snprintf(cbm_mask_path, sizeof(cbm_mask_path),
"%s/%s/cbm_mask",
 INFO_PATH, cache_type);
ret = get_bit_mask(cbm_mask_path, mask);

if (ret)
return -1;
return 0;
+}

/*

get_core_sibling - Get sibling core id from the same socket for given CPU

@cpu_no: CPU number

-- 2.30.2

Ilpo Järvinen

9:24 a.m.

New subject: [PATCH v2 15/24] selftests/resctrl: Refactor get_cbm_mask()

On Wed, 31 May 2023, Shaopeng Tan (Fujitsu) wrote:

...

Hi Ilpo,

...
Callers of get_cbm_mask() are required to pass a string into which the CBM bit mask is read into. Neither CAT nor CMT tests need the mask as string but just convert it into an unsigned long value.

The bit mask reader can only read .../cbm_mask files.

Generalize the bit mask reading function into get_bit_mask() such that it can be used to handle other files besides the .../cbm_mask and handle the unsigned long conversion within within get_bit_mask() using fscanf(). Alter get_cbm_mask() to construct the filename for get_bit_mask().

Co-developed-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cat_test.c | 5 +-- tools/testing/selftests/resctrl/cmt_test.c | 5 +-- tools/testing/selftests/resctrl/resctrl.h | 2 +- tools/testing/selftests/resctrl/resctrlfs.c | 50 +++++++++++++++------ 4 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index a998e6397518..9bf5d05d9e74 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -18,7 +18,6 @@ #define MAX_DIFF 1000000

static int count_of_bits; -static char cbm_mask[256]; static unsigned long long_mask; static unsigned long cache_size;

@@ -101,12 +100,10 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) cache_size = 0;

/* Get default cbm mask for L3/L2 cache */

ret = get_cbm_mask(cache_type, cbm_mask);

ret = get_cbm_mask(cache_type, &long_mask); if (ret) return ret;

long_mask = strtoul(cbm_mask, NULL, 16);

/* Get L3/L2 cache size */ ret = get_cache_size(cpu_no, cache_type, &cache_size); if (ret)

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 2d434c03cbba..ae54bbabbd91 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -17,7 +17,6 @@ #define MAX_DIFF_PERCENT 15

static int count_of_bits; -static char cbm_mask[256]; static unsigned long long_mask; static unsigned long cache_size;

@@ -82,12 +81,10 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) if (!validate_resctrl_feature_request(CMT_STR)) return -1;

ret = get_cbm_mask("L3", cbm_mask);

ret = get_cbm_mask("L3", &long_mask); if (ret) return ret;

long_mask = strtoul(cbm_mask, NULL, 16);

ret = get_cache_size(cpu_no, "L3", &cache_size); if (ret) return ret;

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index ba36eb5fdf0d..bcc95faa5b4e 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -106,7 +106,7 @@ void tests_cleanup(void); void mbm_test_cleanup(void); int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd); void mba_test_cleanup(void); -int get_cbm_mask(char *cache_type, char *cbm_mask); +int get_cbm_mask(char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size); void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 7fef9068d7fd..f01ecfa64063 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -186,30 +186,29 @@ int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size) #define CORE_SIBLINGS_PATH "/sys/bus/cpu/devices/cpu"

/*

get_cbm_mask - Get cbm mask for given cache

@cache_type: Cache level L2/L3

@cbm_mask: cbm_mask returned as a string

get_bit_mask - Get bit mask from given file

@filename: File containing the mask

@mask: The bit mask returned as unsigned long

Return: = 0 on success, < 0 on failure.

*/

-int get_cbm_mask(char *cache_type, char *cbm_mask) +static int get_bit_mask(char *filename, unsigned long *mask) {

char cbm_mask_path[1024]; FILE *fp;

if (!cbm_mask)

if (!filename || !mask) return -1;

sprintf(cbm_mask_path, "%s/%s/cbm_mask", INFO_PATH,

cache_type);

fp = fopen(cbm_mask_path, "r");

fp = fopen(filename, "r"); if (!fp) {
perror("Failed to open cache level");
fprintf(stderr, "Failed to open bit mask file '%s': %s\n",
	filename, strerror(errno));
return -1; }
if (fscanf(fp, "%s", cbm_mask) <= 0) {
perror("Could not get max cbm_mask");
if (fscanf(fp, "%lx", mask) <= 0) {
fprintf(stderr, "Could not read bit mask file '%s': %s\n",
	filename, strerror(errno));
fclose(fp);

return -1;
@@ -219,6 +218,31 @@ int get_cbm_mask(char *cache_type, char *cbm_mask) return 0; }

+/*

get_cbm_bits - Get number of bits in cbm mask
Is it get_cbm_mask?

Sure, thanks for noticing this. While correcting it, I also improved the description to match what the function now does.

-- i.

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 16/24] selftests/resctrl: Create cache_alloc_size() helper

CAT and CMT tests calculate the span size from the n-bits cache allocation on their own.

Add cache_alloc_size() helper which calculates size of the cache allocation for the given number of bits to avoid duplicating code.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cache.c | 27 ++++++++++++++++++++++ tools/testing/selftests/resctrl/cat_test.c | 8 +++++-- tools/testing/selftests/resctrl/cmt_test.c | 4 +++- tools/testing/selftests/resctrl/resctrl.h | 2 ++ 4 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index 6bc912de38be..b983af394e33 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -15,6 +15,33 @@ static struct read_format rf_cqm; static int fd_lm; char llc_occup_path[1024];

+/* + * cache_alloc_size - Calculate alloc size for given cache alloc mask + * @cpu_no: CPU number + * @cache_type: Cache level L2/L3 + * @alloc_mask: Cache alloc mask + * @alloc_size: Alloc size returned on success + * + * Returns: 0 on success with @alloc_size filled, non-zero on error. + */ +int cache_alloc_size(int cpu_no, char *cache_type, unsigned long alloc_mask, + unsigned long *alloc_size) +{ + unsigned long cache_size, full_mask; + int ret; + + ret = get_cbm_mask(cache_type, &full_mask); + if (ret) + return ret; + + ret = get_cache_size(cpu_no, cache_type, &cache_size); + if (ret) + return ret; + + *alloc_size = cache_size * count_bits(alloc_mask) / count_bits(full_mask); + return 0; +} + static void initialize_perf_event_attr(void) { pea_llc_miss.type = PERF_TYPE_HARDWARE; diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 9bf5d05d9e74..d3fbd4de9f8a 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -140,7 +140,9 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) /* Set param values for parent thread which will be allocated bitmask * with (max_bits - n) bits */ - param.span = cache_size * (count_of_bits - n) / count_of_bits; + ret = cache_alloc_size(cpu_no, cache_type, l_mask, &param.span); + if (ret) + return ret; strcpy(param.ctrlgrp, "c2"); strcpy(param.mongrp, "m2"); strcpy(param.filename, RESULT_FILE_NAME2); @@ -162,7 +164,9 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) param.mask = l_mask_1; strcpy(param.ctrlgrp, "c1"); strcpy(param.mongrp, "m1"); - param.span = cache_size * n / count_of_bits; + ret = cache_alloc_size(cpu_no, cache_type, l_mask_1, &param.span); + if (ret) + exit(-1); strcpy(param.filename, RESULT_FILE_NAME1); param.num_of_runs = 0; param.cpu_no = sibling_cpu_no; diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index ae54bbabbd91..efe77e0f1d4c 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -105,10 +105,12 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) .cpu_no = cpu_no, .filename = RESULT_FILE_NAME, .mask = ~(long_mask << n) & long_mask, - .span = cache_size * n / count_of_bits, .num_of_runs = 0, .setup = cmt_setup, }; + ret = cache_alloc_size(cpu_no, "L3", param.mask, &param.span); + if (ret) + return ret;

if (strcmp(benchmark_cmd[0], "fill_buf") == 0) sprintf(benchmark_cmd[1], "%lu", param.span); diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index bcc95faa5b4e..65425d92684e 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -108,6 +108,8 @@ int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd); void mba_test_cleanup(void); int get_cbm_mask(char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size); +int cache_alloc_size(int cpu_no, char *cache_type, unsigned long alloc_mask, + unsigned long *alloc_size); void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); void signal_handler_unregister(void);

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:19 a.m.

New subject: [PATCH v2 16/24] selftests/resctrl: Create cache_alloc_size() helper

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

CAT and CMT tests calculate the span size from the n-bits cache allocation on their own.

Add cache_alloc_size() helper which calculates size of the cache allocation for the given number of bits to avoid duplicating code.

This patch is very heavy on the usage of allocation when I think it only refers to the cache size ... how that size is used by the caller is independent from this.

Compare to how it sounds with some small changes to changelog:

CAT and CMT tests calculate the span size from the capacity bitmask independently. Add cache_size() helper which calculates the size of the cache for the given number of bits to avoid duplicating code.

I think removing "alloc" helps to convey what this code actually does.

...

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cache.c | 27 ++++++++++++++++++++++ tools/testing/selftests/resctrl/cat_test.c | 8 +++++-- tools/testing/selftests/resctrl/cmt_test.c | 4 +++- tools/testing/selftests/resctrl/resctrl.h | 2 ++ 4 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index 6bc912de38be..b983af394e33 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -15,6 +15,33 @@ static struct read_format rf_cqm; static int fd_lm; char llc_occup_path[1024]; +/*

cache_alloc_size - Calculate alloc size for given cache alloc mask

"cache_size - Calculate number of bytes represented by bitmask" ? Please feel free to improve.

...

@cpu_no: CPU number

@cache_type: Cache level L2/L3

@alloc_mask: Cache alloc mask

The description is mostly a rewrite of the variable name. Can it be more descriptive?

...

@alloc_size: Alloc size returned on success

I do not think the utility should assume anything about how the value it provides should be used. Instead it should just reflect what the value is.

...

Returns: 0 on success with @alloc_size filled, non-zero on error.

*/

+int cache_alloc_size(int cpu_no, char *cache_type, unsigned long alloc_mask,
     unsigned long *alloc_size)
+{
unsigned long cache_size, full_mask;

int ret;

ret = get_cbm_mask(cache_type, &full_mask);

if (ret)
return ret;
ret = get_cache_size(cpu_no, cache_type, &cache_size);

if (ret)
return ret;
*alloc_size = cache_size * count_bits(alloc_mask) / count_bits(full_mask);

return 0;
+}

static void initialize_perf_event_attr(void) { pea_llc_miss.type = PERF_TYPE_HARDWARE; diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 9bf5d05d9e74..d3fbd4de9f8a 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -140,7 +140,9 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) /* Set param values for parent thread which will be allocated bitmask * with (max_bits - n) bits */

param.span = cache_size * (count_of_bits - n) / count_of_bits;
ret = cache_alloc_size(cpu_no, cache_type, l_mask, &param.span);

if (ret)
return ret;
strcpy(param.ctrlgrp, "c2"); strcpy(param.mongrp, "m2"); strcpy(param.filename, RESULT_FILE_NAME2);
@@ -162,7 +164,9 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) param.mask = l_mask_1; strcpy(param.ctrlgrp, "c1"); strcpy(param.mongrp, "m1");
param.span = cache_size * n / count_of_bits;
ret = cache_alloc_size(cpu_no, cache_type, l_mask_1, &param.span);
if (ret)
	exit(-1);
strcpy(param.filename, RESULT_FILE_NAME1); param.num_of_runs = 0; param.cpu_no = sibling_cpu_no;

Did this change intend to remove the duplicate code mentioned in the changelog? I was expecting the calls to get_cbm_mask() and get_cache_size() within cat_perf_miss_val() to be removed.

...

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index ae54bbabbd91..efe77e0f1d4c 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -105,10 +105,12 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) .cpu_no = cpu_no, .filename = RESULT_FILE_NAME, .mask = ~(long_mask << n) & long_mask,
.span		= cache_size * n / count_of_bits,
.num_of_runs = 0, .setup = cmt_setup, };
ret = cache_alloc_size(cpu_no, "L3", param.mask, &param.span);

if (ret)
return ret;
if (strcmp(benchmark_cmd[0], "fill_buf") == 0) sprintf(benchmark_cmd[1], "%lu", param.span);

Same here regarding removal of code.

...

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index bcc95faa5b4e..65425d92684e 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -108,6 +108,8 @@ int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd); void mba_test_cleanup(void); int get_cbm_mask(char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size); +int cache_alloc_size(int cpu_no, char *cache_type, unsigned long alloc_mask,
     unsigned long *alloc_size);
void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); void signal_handler_unregister(void);

Reinette

Ilpo Järvinen

24 Apr 24 Apr

4:28 p.m.

New subject: [PATCH v2 16/24] selftests/resctrl: Create cache_alloc_size() helper

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...
CAT and CMT tests calculate the span size from the n-bits cache allocation on their own.

Add cache_alloc_size() helper which calculates size of the cache allocation for the given number of bits to avoid duplicating code.

This patch is very heavy on the usage of allocation when I think it only refers to the cache size ... how that size is used by the caller is independent from this.

Compare to how it sounds with some small changes to changelog:

CAT and CMT tests calculate the span size from the capacity bitmask independently. Add cache_size() helper which calculates the size of the cache for the given number of bits to avoid duplicating code.

I think removing "alloc" helps to convey what this code actually does.

Does it? Without something to indicate its not the full cache size, there's possiblity for confusion. While the tests are mostly interested in the allocated size, the full cache size is also collected (solely for printing it out, IIRC). Maybe I should rename those variable to total_cache_size or something like that to mitigate the confusion?

...

...
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cache.c | 27 ++++++++++++++++++++++ tools/testing/selftests/resctrl/cat_test.c | 8 +++++-- tools/testing/selftests/resctrl/cmt_test.c | 4 +++- tools/testing/selftests/resctrl/resctrl.h | 2 ++ 4 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index 6bc912de38be..b983af394e33 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -15,6 +15,33 @@ static struct read_format rf_cqm; static int fd_lm; char llc_occup_path[1024]; +/*

cache_alloc_size - Calculate alloc size for given cache alloc mask

"cache_size - Calculate number of bytes represented by bitmask" ? Please feel free to improve.

...

@cpu_no: CPU number

@cache_type: Cache level L2/L3

@alloc_mask: Cache alloc mask

The description is mostly a rewrite of the variable name. Can it be more descriptive?

...

@alloc_size: Alloc size returned on success

I do not think the utility should assume anything about how the value it provides should be used. Instead it should just reflect what the value is.

I was just referring to that the value is filled only on success.

...

...
Returns: 0 on success with @alloc_size filled, non-zero on error.

*/

+int cache_alloc_size(int cpu_no, char *cache_type, unsigned long alloc_mask,
     unsigned long *alloc_size)
+{
unsigned long cache_size, full_mask;

int ret;

ret = get_cbm_mask(cache_type, &full_mask);

if (ret)
return ret;
ret = get_cache_size(cpu_no, cache_type, &cache_size);

if (ret)
return ret;
*alloc_size = cache_size * count_bits(alloc_mask) / count_bits(full_mask);

return 0;
+}

static void initialize_perf_event_attr(void) { pea_llc_miss.type = PERF_TYPE_HARDWARE; diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 9bf5d05d9e74..d3fbd4de9f8a 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -140,7 +140,9 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) /* Set param values for parent thread which will be allocated bitmask * with (max_bits - n) bits */

param.span = cache_size * (count_of_bits - n) / count_of_bits;
ret = cache_alloc_size(cpu_no, cache_type, l_mask, &param.span);

if (ret)
return ret;
strcpy(param.ctrlgrp, "c2"); strcpy(param.mongrp, "m2"); strcpy(param.filename, RESULT_FILE_NAME2);
@@ -162,7 +164,9 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) param.mask = l_mask_1; strcpy(param.ctrlgrp, "c1"); strcpy(param.mongrp, "m1");
param.span = cache_size * n / count_of_bits;
ret = cache_alloc_size(cpu_no, cache_type, l_mask_1, &param.span);
if (ret)
	exit(-1);
strcpy(param.filename, RESULT_FILE_NAME1); param.num_of_runs = 0; param.cpu_no = sibling_cpu_no;
Did this change intend to remove the duplicate code mentioned in the changelog?

It removes n CBM bits -> cache size calculations by collecting the calculation into one place.

cache_alloc_size() takes mask instead of n (CBM bits) as input which makes things easier down the line when the new CAT test starts to tweak the alloc size. The new CAT test would otherwise need to track both the mask and n.

cache_alloc_size() is independent of what caller requires so the full mask is not passed from the caller.

...

I was expecting the calls to get_cbm_mask() and get_cache_size() within cat_perf_miss_val() to be removed.

I would have wanted to remove get_cache_size() but it would mean removing cache size print or moving it to elsewhere.

get_cbm_mask() cannot be removed as it's used by the test to calculate the mask the test wants (but it no longer has to determine the size itself but uses this new helper instead).

I can try to amend the changelog to explain things better.

...

...
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index ae54bbabbd91..efe77e0f1d4c 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -105,10 +105,12 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) .cpu_no = cpu_no, .filename = RESULT_FILE_NAME, .mask = ~(long_mask << n) & long_mask,
.span		= cache_size * n / count_of_bits,
.num_of_runs = 0, .setup = cmt_setup, };
ret = cache_alloc_size(cpu_no, "L3", param.mask, &param.span);

if (ret)
return ret;
if (strcmp(benchmark_cmd[0], "fill_buf") == 0) sprintf(benchmark_cmd[1], "%lu", param.span);
Same here regarding removal of code.

...
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index bcc95faa5b4e..65425d92684e 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -108,6 +108,8 @@ int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd); void mba_test_cleanup(void); int get_cbm_mask(char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size); +int cache_alloc_size(int cpu_no, char *cache_type, unsigned long alloc_mask,
     unsigned long *alloc_size);
void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(void); void signal_handler_unregister(void);
Reinette

-- i.

Reinette Chatre

4:45 p.m.

New subject: [PATCH v2 16/24] selftests/resctrl: Create cache_alloc_size() helper

Hi Ilpo,

On 4/24/2023 9:28 AM, Ilpo Järvinen wrote:

...

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...
On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...
CAT and CMT tests calculate the span size from the n-bits cache allocation on their own.

Add cache_alloc_size() helper which calculates size of the cache allocation for the given number of bits to avoid duplicating code.

This patch is very heavy on the usage of allocation when I think it only refers to the cache size ... how that size is used by the caller is independent from this.

Compare to how it sounds with some small changes to changelog:

CAT and CMT tests calculate the span size from the capacity bitmask independently. Add cache_size() helper which calculates the size of the cache for the given number of bits to avoid duplicating code.

I think removing "alloc" helps to convey what this code actually does.

Does it? Without something to indicate its not the full cache size, there's possiblity for confusion. While the tests are mostly interested in the allocated size, the full cache size is also collected (solely for printing it out, IIRC). Maybe I should rename those variable to total_cache_size or something like that to mitigate the confusion?

This patch adds and use a utility that converts a bitmask into bytes. I do not think it should dictate what the meaning or usage of the bitmask is.

...

...
...
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cache.c | 27 ++++++++++++++++++++++ tools/testing/selftests/resctrl/cat_test.c | 8 +++++-- tools/testing/selftests/resctrl/cmt_test.c | 4 +++- tools/testing/selftests/resctrl/resctrl.h | 2 ++ 4 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index 6bc912de38be..b983af394e33 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -15,6 +15,33 @@ static struct read_format rf_cqm; static int fd_lm; char llc_occup_path[1024]; +/*

cache_alloc_size - Calculate alloc size for given cache alloc mask

"cache_size - Calculate number of bytes represented by bitmask" ? Please feel free to improve.

...

@cpu_no: CPU number

@cache_type: Cache level L2/L3

@alloc_mask: Cache alloc mask

The description is mostly a rewrite of the variable name. Can it be more descriptive?

...

@alloc_size: Alloc size returned on success

I do not think the utility should assume anything about how the value it provides should be used. Instead it should just reflect what the value is.

I was just referring to that the value is filled only on success.

I understand. My comment was about the naming of the parameter, which can, for example, just be "size".

...

...
...
Returns: 0 on success with @alloc_size filled, non-zero on error.

*/

+int cache_alloc_size(int cpu_no, char *cache_type, unsigned long alloc_mask,
     unsigned long *alloc_size)
+{
unsigned long cache_size, full_mask;

int ret;

ret = get_cbm_mask(cache_type, &full_mask);

if (ret)
return ret;
ret = get_cache_size(cpu_no, cache_type, &cache_size);

if (ret)
return ret;
*alloc_size = cache_size * count_bits(alloc_mask) / count_bits(full_mask);

return 0;
+}

static void initialize_perf_event_attr(void) { pea_llc_miss.type = PERF_TYPE_HARDWARE; diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 9bf5d05d9e74..d3fbd4de9f8a 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -140,7 +140,9 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) /* Set param values for parent thread which will be allocated bitmask * with (max_bits - n) bits */

param.span = cache_size * (count_of_bits - n) / count_of_bits;
ret = cache_alloc_size(cpu_no, cache_type, l_mask, &param.span);

if (ret)
return ret;
strcpy(param.ctrlgrp, "c2"); strcpy(param.mongrp, "m2"); strcpy(param.filename, RESULT_FILE_NAME2);
@@ -162,7 +164,9 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) param.mask = l_mask_1; strcpy(param.ctrlgrp, "c1"); strcpy(param.mongrp, "m1");
param.span = cache_size * n / count_of_bits;
ret = cache_alloc_size(cpu_no, cache_type, l_mask_1, &param.span);
if (ret)
	exit(-1);
strcpy(param.filename, RESULT_FILE_NAME1); param.num_of_runs = 0; param.cpu_no = sibling_cpu_no;
Did this change intend to remove the duplicate code mentioned in the changelog?
It removes n CBM bits -> cache size calculations by collecting the calculation into one place.

cache_alloc_size() takes mask instead of n (CBM bits) as input which makes things easier down the line when the new CAT test starts to tweak the alloc size. The new CAT test would otherwise need to track both the mask and n.

cache_alloc_size() is independent of what caller requires so the full mask is not passed from the caller.

...
I was expecting the calls to get_cbm_mask() and get_cache_size() within cat_perf_miss_val() to be removed.

I would have wanted to remove get_cache_size() but it would mean removing cache size print or moving it to elsewhere.

get_cbm_mask() cannot be removed as it's used by the test to calculate the mask the test wants (but it no longer has to determine the size itself but uses this new helper instead).

I can try to amend the changelog to explain things better.

The current motivation for this patch is to avoid duplicating code but as I see it it introduces more duplicated code.

Reinette

Ilpo Järvinen

18 Apr 18 Apr

11:44 a.m.

New subject: [PATCH v2 17/24] selftests/resctrl: Replace count_bits with count_consecutive_bits()

CAT and CMT tests depends on masks being continuous.

Replace count_bits with more appropriate variant that counts consecutive bits.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cat_test.c | 6 ++--- tools/testing/selftests/resctrl/cmt_test.c | 3 +-- tools/testing/selftests/resctrl/resctrl.h | 1 + tools/testing/selftests/resctrl/resctrlfs.c | 30 +++++++++++++++++++++ 4 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index d3fbd4de9f8a..a1834dd5ad9a 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -78,7 +78,7 @@ static int check_results(struct resctrl_val_param *param) }

fclose(fp); - no_of_bits = count_bits(param->mask); + no_of_bits = count_consecutive_bits(param->mask, NULL);

return show_cache_info(sum_llc_perf_miss, no_of_bits, param->span / 64, MAX_DIFF, MAX_DIFF_PERCENT, NUM_OF_RUNS, @@ -103,6 +103,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) ret = get_cbm_mask(cache_type, &long_mask); if (ret) return ret; + count_of_bits = count_consecutive_bits(long_mask, NULL);

/* Get L3/L2 cache size */ ret = get_cache_size(cpu_no, cache_type, &cache_size); @@ -110,9 +111,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) return ret; ksft_print_msg("Cache size :%lu\n", cache_size);

- /* Get max number of bits from default-cabm mask */ - count_of_bits = count_bits(long_mask); - if (!n) n = count_of_bits / 2;

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index efe77e0f1d4c..98e7d3accd73 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -84,14 +84,13 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) ret = get_cbm_mask("L3", &long_mask); if (ret) return ret; + count_of_bits = count_consecutive_bits(long_mask, NULL);

ret = get_cache_size(cpu_no, "L3", &cache_size); if (ret) return ret; ksft_print_msg("Cache size :%lu\n", cache_size);

- count_of_bits = count_bits(long_mask); - if (n < 1 || n > count_of_bits) { ksft_print_msg("Invalid input value for numbr_of_bits n!\n"); ksft_print_msg("Please enter value in range 1 to %d\n", count_of_bits); diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 65425d92684e..aa5dc8b95a06 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -106,6 +106,7 @@ void tests_cleanup(void); void mbm_test_cleanup(void); int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd); void mba_test_cleanup(void); +unsigned int count_consecutive_bits(unsigned long val, unsigned int *start); int get_cbm_mask(char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size); int cache_alloc_size(int cpu_no, char *cache_type, unsigned long alloc_mask, diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index f01ecfa64063..4efaf69c8152 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -10,6 +10,8 @@ */ #include "resctrl.h"

+#include <strings.h> + static int find_resctrl_mount(char *buffer) { FILE *mounts; @@ -218,6 +220,34 @@ static int get_bit_mask(char *filename, unsigned long *mask) return 0; }

+/* + * count_consecutive_bits - Returns the longest train of bits in a bit mask + * @val A bit mask + * @start The location of the least-significant bit of the longest train + * + * Return: The length of the consecutive bits in the longest train of bits + */ +unsigned int count_consecutive_bits(unsigned long val, unsigned int *start) +{ + unsigned long last_val; + int count = 0; + + while (val) { + last_val = val; + val &= (val >> 1); + count++; + } + + if (start) { + if (count) + *start = ffsl(last_val) - 1; + else + *start = 0; + } + + return count; +} + /* * get_cbm_bits - Get number of bits in cbm mask * @cache_type: Cache level L2/L3

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:20 a.m.

New subject: [PATCH v2 17/24] selftests/resctrl: Replace count_bits with count_consecutive_bits()

Hi Ilpo,

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...

CAT and CMT tests depends on masks being continuous.

The term used in the spec is "contiguous", using it here may help to convey the goal.

...

Replace count_bits with more appropriate variant that counts consecutive bits.

Could you please elaborate why this is more appropriate and why this is necessary? What is wrong with current solution?

Please note that cbm_mask in resctrl will always be contiguous.

Reinette

Ilpo Järvinen

25 Apr 25 Apr

11:41 a.m.

New subject: [PATCH v2 17/24] selftests/resctrl: Replace count_bits with count_consecutive_bits()

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...

On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...
CAT and CMT tests depends on masks being continuous.

The term used in the spec is "contiguous", using it here may help to convey the goal.

...
Replace count_bits with more appropriate variant that counts consecutive bits.

Could you please elaborate why this is more appropriate and why this is necessary? What is wrong with current solution?

Please note that cbm_mask in resctrl will always be contiguous.

Hi,

It's good that you asked this question.

This comes from a change (not by me originally) that also excluded the shareable bits from the mask the CAT test uses. I assumed (w/o better knowledge) that those shareable bits could create a hole into the mask.

It could be wrong assumption and the shareable bits are always at one end of the CBM mask.

Do you happen to know how the shareable bits are positioned within the mask?

-- i.

Reinette Chatre

2:28 p.m.

New subject: [PATCH v2 17/24] selftests/resctrl: Replace count_bits with count_consecutive_bits()

Hi Ilpo,

On 4/25/2023 4:41 AM, Ilpo Järvinen wrote:

...

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...
On 4/18/2023 4:44 AM, Ilpo Järvinen wrote:

...
CAT and CMT tests depends on masks being continuous.

The term used in the spec is "contiguous", using it here may help to convey the goal.

...
Replace count_bits with more appropriate variant that counts consecutive bits.

Could you please elaborate why this is more appropriate and why this is necessary? What is wrong with current solution?

Please note that cbm_mask in resctrl will always be contiguous.

Hi,

It's good that you asked this question.

This comes from a change (not by me originally) that also excluded the shareable bits from the mask the CAT test uses. I assumed (w/o better knowledge) that those shareable bits could create a hole into the mask.

You are correct. Shareable bits can create a hole in the mask.

...

It could be wrong assumption and the shareable bits are always at one end of the CBM mask.

Do you happen to know how the shareable bits are positioned within the mask?

This depends on the hardware. Software learns about it via a bitmask (as opposed to "number of bits") so the hardware has flexibility to communicate any combination of shared ways.

These comments are not related to this patch though. I understand that this patch has been created in support of the changes that follow. My questions are related to this patch that communicates that it is "more appropriate" for what the code currently does without consideration of what is to come. I would like to understand how this is more appropriate. Also note that cbm_mask will always be contiguous - in this case the hardware communicates a number of bits, not a bitmask, so this will always be contiguous. This patch claims that the current solution is not appropriate to parse cbm_mask, could you please elaborate why?

Reinette

Shaopeng Tan (Fujitsu)

31 May 31 May

7:25 a.m.

New subject: [PATCH v2 17/24] selftests/resctrl: Replace count_bits with count_consecutive_bits()

Hi Ilpo,

...

CAT and CMT tests depends on masks being continuous.

Replace count_bits with more appropriate variant that counts consecutive bits.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cat_test.c | 6 ++--- tools/testing/selftests/resctrl/cmt_test.c | 3 +-- tools/testing/selftests/resctrl/resctrl.h | 1 + tools/testing/selftests/resctrl/resctrlfs.c | 30 +++++++++++++++++++++ 4 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index d3fbd4de9f8a..a1834dd5ad9a 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -78,7 +78,7 @@ static int check_results(struct resctrl_val_param *param) }

fclose(fp);

no_of_bits = count_bits(param->mask);

no_of_bits = count_consecutive_bits(param->mask, NULL);

return show_cache_info(sum_llc_perf_miss, no_of_bits,

param->span / 64, MAX_DIFF, MAX_DIFF_PERCENT, NUM_OF_RUNS, @@ -103,6 +103,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) ret = get_cbm_mask(cache_type, &long_mask); if (ret) return ret;

count_of_bits = count_consecutive_bits(long_mask, NULL);

/* Get L3/L2 cache size */ ret = get_cache_size(cpu_no, cache_type, &cache_size); @@ -110,9

+111,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) return ret; ksft_print_msg("Cache size :%lu\n", cache_size);

/* Get max number of bits from default-cabm mask */

count_of_bits = count_bits(long_mask);

if (!n) n = count_of_bits / 2;

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index efe77e0f1d4c..98e7d3accd73 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -84,14 +84,13 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) ret = get_cbm_mask("L3", &long_mask); if (ret) return ret;

count_of_bits = count_consecutive_bits(long_mask, NULL);

ret = get_cache_size(cpu_no, "L3", &cache_size); if (ret) return ret; ksft_print_msg("Cache size :%lu\n", cache_size);

count_of_bits = count_bits(long_mask);

if (n < 1 || n > count_of_bits) { ksft_print_msg("Invalid input value for numbr_of_bits n!\n"); ksft_print_msg("Please enter value in range 1 to %d\n",

count_of_bits); diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 65425d92684e..aa5dc8b95a06 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -106,6 +106,7 @@ void tests_cleanup(void); void mbm_test_cleanup(void); int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd); void mba_test_cleanup(void); +unsigned int count_consecutive_bits(unsigned long val, unsigned int +*start); int get_cbm_mask(char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size); int cache_alloc_size(int cpu_no, char *cache_type, unsigned long alloc_mask, diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index f01ecfa64063..4efaf69c8152 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -10,6 +10,8 @@ */ #include "resctrl.h"

+#include <strings.h>

static int find_resctrl_mount(char *buffer) { FILE *mounts; @@ -218,6 +220,34 @@ static int get_bit_mask(char *filename, unsigned long *mask) return 0; }

+/*

count_consecutive_bits - Returns the longest train of bits in a bit mask

@val A bit mask

@start The location of the least-significant bit of the longest train

Return: The length of the consecutive bits in the longest train of bits

*/

+unsigned int count_consecutive_bits(unsigned long val, unsigned int +*start) {
unsigned long last_val;

int count = 0;

while (val) {
last_val = val;
val &= (val >> 1);
count++;
}

There may not be a case that the most-significant bit is 1 at present, but if this case exists, it cannot count correctly.

Best regards, Shaopeng TAN

Ilpo Järvinen

9:35 a.m.

New subject: [PATCH v2 17/24] selftests/resctrl: Replace count_bits with count_consecutive_bits()

On Wed, 31 May 2023, Shaopeng Tan (Fujitsu) wrote:

...

Hi Ilpo,

...
CAT and CMT tests depends on masks being continuous.

Replace count_bits with more appropriate variant that counts consecutive bits.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

...

...
@@ -218,6 +220,34 @@ static int get_bit_mask(char *filename, unsigned long *mask) return 0; }

+/*

count_consecutive_bits - Returns the longest train of bits in a bit mask

@val A bit mask

@start The location of the least-significant bit of the longest train

Return: The length of the consecutive bits in the longest train of bits

*/

+unsigned int count_consecutive_bits(unsigned long val, unsigned int +*start) {
unsigned long last_val;

int count = 0;

while (val) {
last_val = val;
val &= (val >> 1);
count++;
}
There may not be a case that the most-significant bit is 1 at present, but if this case exists, it cannot count correctly.

Can you please rephrase what is your concern here please?

I test 0U, 1U, ~0U, and a few more complex combinations of bits, and all returned correct count so I might not have understood what case you meant with your description.

This function does not count nor calculate the most-significant bit in any phase but the longest train of bits using the count variable (and the least-significant bit of the longest train in the code that is not included into this partial snippet).

-- i.

Shaopeng Tan (Fujitsu)

1 Jun 1 Jun

6:20 a.m.

New subject: [PATCH v2 17/24] selftests/resctrl: Replace count_bits with count_consecutive_bits()

Hi Ilpo,

...

On Wed, 31 May 2023, Shaopeng Tan (Fujitsu) wrote:

...
Hi Ilpo,

...
CAT and CMT tests depends on masks being continuous.

Replace count_bits with more appropriate variant that counts consecutive

bits.

...
...
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

...
...
@@ -218,6 +220,34 @@ static int get_bit_mask(char *filename, unsigned long *mask) return 0; }

+/*

count_consecutive_bits - Returns the longest train of bits in a bit mask

@val A bit mask

@start The location of the least-significant bit of the longest train

Return: The length of the consecutive bits in the longest train of bits

*/

+unsigned int count_consecutive_bits(unsigned long val, unsigned int +*start) {
unsigned long last_val;

int count = 0;

while (val) {
last_val = val;
val &= (val >> 1);
count++;
}
There may not be a case that the most-significant bit is 1 at present, but if this case exists, it cannot count correctly.
Can you please rephrase what is your concern here please?

I test 0U, 1U, ~0U, and a few more complex combinations of bits, and all returned correct count so I might not have understood what case you meant with your description.

This function does not count nor calculate the most-significant bit in any phase but the longest train of bits using the count variable (and the least-significant bit of the longest train in the code that is not included into this partial snippet).

Thanks for your explanation. I'm sorry, it was my mistake. No problem here.

Best regards Shaopeng TAN

Ilpo Järvinen

18 Apr 18 Apr

11:45 a.m.

New subject: [PATCH v2 18/24] selftests/resctrl: Exclude shareable bits from schemata in CAT test

CAT test doesn't take shareable bits into account, i.e., the test might be sharing cache with some devices (e.g., graphics).

Introduce get_mask_no_shareable() and use it to provision an environment for CAT test where the allocated LLC is isolated better.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cat_test.c | 2 +- tools/testing/selftests/resctrl/resctrl.h | 3 ++ tools/testing/selftests/resctrl/resctrlfs.c | 56 +++++++++++++++++++++ 3 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index a1834dd5ad9a..e2d10124cdb1 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -100,7 +100,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) cache_size = 0;

/* Get default cbm mask for L3/L2 cache */ - ret = get_cbm_mask(cache_type, &long_mask); + ret = get_mask_no_shareable(cache_type, &long_mask); if (ret) return ret; count_of_bits = count_consecutive_bits(long_mask, NULL); diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index aa5dc8b95a06..be5a61e7fbcc 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -106,8 +106,11 @@ void tests_cleanup(void); void mbm_test_cleanup(void); int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd); void mba_test_cleanup(void); +unsigned long create_bit_mask(unsigned int start, unsigned int len); unsigned int count_consecutive_bits(unsigned long val, unsigned int *start); int get_cbm_mask(char *cache_type, unsigned long *mask); +int get_shareable_mask(char *cache_type, unsigned long *shareable_mask); +int get_mask_no_shareable(char *cache_type, unsigned long *mask); int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size); int cache_alloc_size(int cpu_no, char *cache_type, unsigned long alloc_mask, unsigned long *alloc_size); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 4efaf69c8152..94b99b06bc89 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -220,6 +220,16 @@ static int get_bit_mask(char *filename, unsigned long *mask) return 0; }

+/* + * create_bit_mask- Create bit mask from start,len pair + * @start: LSB of the mask + * @len Number of bits in the mask + */ +unsigned long create_bit_mask(unsigned int start, unsigned int len) +{ + return ((1UL << len) - 1UL) << start; +} + /* * count_consecutive_bits - Returns the longest train of bits in a bit mask * @val A bit mask @@ -273,6 +283,52 @@ int get_cbm_mask(char *cache_type, unsigned long *mask) return 0; }

+/* + * get_shareable_mask - Get shareable mask from shareable_bits for given cache + * @cache_type: Cache level L2/L3 + * @shareable_mask: shareable mask returned as unsigned long + * + * Return: = 0 on success, < 0 on failure. + */ +int get_shareable_mask(char *cache_type, unsigned long *shareable_mask) +{ + char mask_path[1024]; + + if (!cache_type) + return -1; + + snprintf(mask_path, sizeof(mask_path), "%s/%s/shareable_bits", + INFO_PATH, cache_type); + + return get_bit_mask(mask_path, shareable_mask); +} + +/* + * get_mask_no_shareable - Get CBM mask without shareable_bits for given cache + * @cache_type: Cache level L2/L3 + * @mask: mask returned as unsigned long + * + * Return: = 0 on success, < 0 on failure. + */ +int get_mask_no_shareable(char *cache_type, unsigned long *mask) +{ + unsigned long full_mask, shareable_mask; + unsigned int start, len; + + if (get_cbm_mask(cache_type, &full_mask) < 0) + return -1; + if (get_shareable_mask(cache_type, &shareable_mask) < 0) + return -1; + + len = count_consecutive_bits(full_mask & ~shareable_mask, &start); + if (!len) + return -1; + + *mask = create_bit_mask(start, len); + + return 0; +} + /* * get_core_sibling - Get sibling core id from the same socket for given CPU * @cpu_no: CPU number

-- 2.30.2

Ilpo Järvinen

11:45 a.m.

New subject: [PATCH v2 19/24] selftests/resctrl: Pass the real number of tests to show_cache_info()

Some results include warm-up tests which are discarded before passing the sum to show_cache_info(). Currently, show_cache_info() handles this by subtracting one from the number of tests in divisor. It is a trappy construct to have sum and number of tests parameters to disagree like this.

A more logical place for subtracting the skipped tests is where the sum is calculated so move it there. Pass the correct number of tests to show_cache_info() soit can use directly as the divisor for calculating the average.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cache.c | 2 +- tools/testing/selftests/resctrl/cat_test.c | 2 +- tools/testing/selftests/resctrl/cmt_test.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index b983af394e33..c93f5d2bc66e 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -312,7 +312,7 @@ int show_cache_info(unsigned long sum_llc_val, int no_of_bits, long avg_diff = 0; int ret;

- avg_llc_val = sum_llc_val / (num_of_runs - 1); + avg_llc_val = sum_llc_val / num_of_runs; avg_diff = (long)abs(cache_span - avg_llc_val); diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index e2d10124cdb1..ae21e656cf6e 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -81,7 +81,7 @@ static int check_results(struct resctrl_val_param *param) no_of_bits = count_consecutive_bits(param->mask, NULL);

return show_cache_info(sum_llc_perf_miss, no_of_bits, param->span / 64, - MAX_DIFF, MAX_DIFF_PERCENT, NUM_OF_RUNS, + MAX_DIFF, MAX_DIFF_PERCENT, runs - 1, get_vendor() == ARCH_INTEL, false); }

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 98e7d3accd73..087378a775ee 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -63,7 +63,7 @@ static int check_results(struct resctrl_val_param *param, int no_of_bits) fclose(fp);

return show_cache_info(sum_llc_occu_resc, no_of_bits, param->span, - MAX_DIFF, MAX_DIFF_PERCENT, NUM_OF_RUNS, + MAX_DIFF, MAX_DIFF_PERCENT, runs - 1, true, true); }

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:20 a.m.

New subject: [PATCH v2 19/24] selftests/resctrl: Pass the real number of tests to show_cache_info()

Hi Ilpo,

On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:

...

Some results include warm-up tests which are discarded before passing the sum to show_cache_info(). Currently, show_cache_info() handles this

Please drop "Currently".

...

by subtracting one from the number of tests in divisor. It is a trappy construct to have sum and number of tests parameters to disagree like this.

A more logical place for subtracting the skipped tests is where the sum is calculated so move it there. Pass the correct number of tests to show_cache_info() soit can use directly as the divisor for calculating the average.

This is not clear to me. How about "soit can use directly" -> "so it can be used directly"?

Reinette

Ilpo Järvinen

18 Apr 18 Apr

11:45 a.m.

New subject: [PATCH v2 20/24] selftests/resctrl: Move CAT/CMT test global vars to func they are used

CAT and CMT tests have count_of_bits, long_mask, and cache_size global variables that can be moved into the sole using function.

Make the global variables local variables of the relevant function to scope them better.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cat_test.c | 7 +++---- tools/testing/selftests/resctrl/cmt_test.c | 7 +++---- 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index ae21e656cf6e..ef3ba22bdde5 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -17,10 +17,6 @@ #define MAX_DIFF_PERCENT 4 #define MAX_DIFF 1000000

-static int count_of_bits; -static unsigned long long_mask; -static unsigned long cache_size; - /* * Change schemata. Write schemata to specified * con_mon grp, mon_grp in resctrl FS. @@ -95,6 +91,9 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) { unsigned long l_mask, l_mask_1; int ret, pipefd[2], sibling_cpu_no; + unsigned long cache_size; + unsigned long long_mask; + int count_of_bits; char pipe_message;

cache_size = 0; diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 087378a775ee..6adee08661e7 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -16,10 +16,6 @@ #define MAX_DIFF 2000000 #define MAX_DIFF_PERCENT 15

-static int count_of_bits; -static unsigned long long_mask; -static unsigned long cache_size; - static int cmt_setup(struct resctrl_val_param *p) { /* Run NUM_OF_RUNS times */ @@ -74,6 +70,9 @@ void cmt_test_cleanup(void)

int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) { + unsigned long cache_size; + unsigned long long_mask; + int count_of_bits; int ret;

cache_size = 0;

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:20 a.m.

New subject: [PATCH v2 20/24] selftests/resctrl: Move CAT/CMT test global vars to func they are used

Hi Ilpo,

On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:

...

CAT and CMT tests have count_of_bits, long_mask, and cache_size global variables that can be moved into the sole using function.

Make the global variables local variables of the relevant function to scope them better.

Could you please move this patch earlier, before usage of long_mask in earlier patch.

...

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cat_test.c | 7 +++---- tools/testing/selftests/resctrl/cmt_test.c | 7 +++---- 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index ae21e656cf6e..ef3ba22bdde5 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -17,10 +17,6 @@ #define MAX_DIFF_PERCENT 4 #define MAX_DIFF 1000000 -static int count_of_bits; -static unsigned long long_mask; -static unsigned long cache_size;

/*

Change schemata. Write schemata to specified

con_mon grp, mon_grp in resctrl FS.

@@ -95,6 +91,9 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) { unsigned long l_mask, l_mask_1; int ret, pipefd[2], sibling_cpu_no;

unsigned long cache_size;

unsigned long long_mask;

int count_of_bits; char pipe_message;

cache_size = 0;

Seems like this initialization can be moved to the definition?

...

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 087378a775ee..6adee08661e7 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -16,10 +16,6 @@ #define MAX_DIFF 2000000 #define MAX_DIFF_PERCENT 15 -static int count_of_bits; -static unsigned long long_mask; -static unsigned long cache_size;

static int cmt_setup(struct resctrl_val_param *p) { /* Run NUM_OF_RUNS times */ @@ -74,6 +70,9 @@ void cmt_test_cleanup(void) int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd) {

unsigned long cache_size;

unsigned long long_mask;

int count_of_bits; int ret;

cache_size = 0;

Same here.

Reinette

Ilpo Järvinen

18 Apr 18 Apr

11:45 a.m.

New subject: [PATCH v2 21/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/fill_buf.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 7e0d3a1ea555..049a520498a9 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -88,14 +88,17 @@ static void *malloc_and_init_memory(size_t s)

static int fill_one_span_read(unsigned char *start_ptr, unsigned char *end_ptr) { - unsigned char sum, *p; - + unsigned int size = (end_ptr - start_ptr) / (CL_SIZE / 2); + unsigned int count = size; + unsigned char sum; + + /* + * Read the buffer in an order that is unexpected by HW prefetching + * optimizations to prevent them interfering with the caching pattern. + */ sum = 0; - p = start_ptr; - while (p < end_ptr) { - sum += *p; - p += (CL_SIZE / 2); - } + while (count--) + sum += start_ptr[((count * 59) % size) * CL_SIZE / 2];

return sum; }

-- 2.30.2

Shaopeng Tan (Fujitsu)

31 May 31 May

5:33 a.m.

New subject: [PATCH v2 21/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

Hi Ilpo,

...

When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/fill_buf.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 7e0d3a1ea555..049a520498a9 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -88,14 +88,17 @@ static void *malloc_and_init_memory(size_t s)

static int fill_one_span_read(unsigned char *start_ptr, unsigned char *end_ptr) {

unsigned char sum, *p;
unsigned int size = (end_ptr - start_ptr) / (CL_SIZE / 2);

unsigned int count = size;

unsigned char sum;

/*
* Read the buffer in an order that is unexpected by HW prefetching
* optimizations to prevent them interfering with the caching pattern.
*/
sum = 0;
p = start_ptr;

while (p < end_ptr) {
sum += *p;
p += (CL_SIZE / 2);
}
while (count--)
sum += start_ptr[((count * 59) % size) * CL_SIZE / 2];

Could you please elaborate why 59 is used?

Best regards, Shaopeng TAN

Ilpo Järvinen

9:17 a.m.

New subject: [PATCH v2 21/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

On Wed, 31 May 2023, Shaopeng Tan (Fujitsu) wrote:

...

Hi Ilpo,

...
When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/fill_buf.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 7e0d3a1ea555..049a520498a9 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -88,14 +88,17 @@ static void *malloc_and_init_memory(size_t s)

static int fill_one_span_read(unsigned char *start_ptr, unsigned char *end_ptr) {

unsigned char sum, *p;
unsigned int size = (end_ptr - start_ptr) / (CL_SIZE / 2);

unsigned int count = size;

unsigned char sum;

/*
* Read the buffer in an order that is unexpected by HW prefetching
* optimizations to prevent them interfering with the caching pattern.
*/
sum = 0;
p = start_ptr;

while (p < end_ptr) {
sum += *p;
p += (CL_SIZE / 2);
}
while (count--)
sum += start_ptr[((count * 59) % size) * CL_SIZE / 2];
Could you please elaborate why 59 is used?

The main reason is that it's a prime number ensuring the whole buffer gets read. I picked something that doesn't make it to wrap on almost every iteration.

-- i.

Shaopeng Tan (Fujitsu)

1 Jun 1 Jun

6:15 a.m.

New subject: [PATCH v2 21/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

Hi Ilpo,

...

...
...
When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/fill_buf.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 7e0d3a1ea555..049a520498a9 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -88,14 +88,17 @@ static void *malloc_and_init_memory(size_t s)

static int fill_one_span_read(unsigned char *start_ptr, unsigned char *end_ptr) {

unsigned char sum, *p;
unsigned int size = (end_ptr - start_ptr) / (CL_SIZE / 2);

unsigned int count = size;

unsigned char sum;

/*
* Read the buffer in an order that is unexpected by HW prefetching
* optimizations to prevent them interfering with the caching pattern.
*/
sum = 0;
p = start_ptr;

while (p < end_ptr) {
sum += *p;
p += (CL_SIZE / 2);
}
while (count--)
sum += start_ptr[((count * 59) % size) * CL_SIZE / 2];
Could you please elaborate why 59 is used?
The main reason is that it's a prime number ensuring the whole buffer gets read. I picked something that doesn't make it to wrap on almost every iteration.

Thanks for your explanation. It seems there is no problem.

Perhaps you have already tested this patch in your environment and got a test result of "ok". Because HW prefetching does not work well, the IMC counter fluctuates a lot in my environment, and the test result is "not ok".

In order to ensure this test set runs in any environments and gets "ok", would you consider changing the value of MAX_DIFF_PERCENT of each test? or changing something else?

``` Environment: Kernel: 6.4.0-rc2 CPU: Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz

Test result(MBM as an example): # # Starting MBM BW change ... # # Mounting resctrl to "/sys/fs/resctrl" # # Benchmark PID: 8671 # # Writing benchmark parameters to resctrl FS # # Write schema "MB:0=100" to resctrl FS # # Checking for pass/fail # # Fail: Check MBM diff within 5% # # avg_diff_per: 9% # # Span in bytes: 262144000 # # avg_bw_imc: 6202 # # avg_bw_resc: 5585 # not ok 1 MBM: bw change ```

Best regards, Shaopeng TAN

Ilpo Järvinen

2 Jun 2 Jun

1:51 p.m.

New subject: [PATCH v2 21/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

On Thu, 1 Jun 2023, Shaopeng Tan (Fujitsu) wrote:

...

...
...
...
When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/fill_buf.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 7e0d3a1ea555..049a520498a9 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -88,14 +88,17 @@ static void *malloc_and_init_memory(size_t s)

static int fill_one_span_read(unsigned char *start_ptr, unsigned char *end_ptr) {

unsigned char sum, *p;
unsigned int size = (end_ptr - start_ptr) / (CL_SIZE / 2);

unsigned int count = size;

unsigned char sum;

/*
* Read the buffer in an order that is unexpected by HW prefetching
* optimizations to prevent them interfering with the caching pattern.
*/
sum = 0;
p = start_ptr;

while (p < end_ptr) {
sum += *p;
p += (CL_SIZE / 2);
}
while (count--)
sum += start_ptr[((count * 59) % size) * CL_SIZE / 2];
Could you please elaborate why 59 is used?
The main reason is that it's a prime number ensuring the whole buffer gets read. I picked something that doesn't make it to wrap on almost every iteration.
Thanks for your explanation. It seems there is no problem.

Perhaps you have already tested this patch in your environment and got a test result of "ok".

Yes, it was tested :-) and all looked fine here. But my testing was more focused on the systems which come with CAT and on all those, this change clearly improved MBA/MBM results (they became almost always diff=0 except for the smallest ones in the MBA test).

...

Because HW prefetching does not work well, the IMC counter fluctuates a lot in my environment, and the test result is "not ok".

In order to ensure this test set runs in any environments and gets "ok", would you consider changing the value of MAX_DIFF_PERCENT of each test? or changing something else?
Environment:
 Kernel: 6.4.0-rc2
 CPU: Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz

Test result(MBM as an example):
# # Starting MBM BW change ...
# # Mounting resctrl to "/sys/fs/resctrl"
# # Benchmark PID: 8671
# # Writing benchmark parameters to resctrl FS
# # Write schema "MB:0=100" to resctrl FS
# # Checking for pass/fail
# # Fail: Check MBM diff within 5%
# # avg_diff_per: 9%
# # Span in bytes: 262144000
# # avg_bw_imc: 6202
# # avg_bw_resc: 5585
# not ok 1 MBM: bw change

Oh, I see. It seems that these CPUs break the trend and get much worse and more unstable for some reason. It might be that some i9 I recently got a lkp report from could have the same problem. I'll look more into this, thanks a lot for testing and bringing it up.

So to answer your question above, I've no intention to tweak MAX_DIFF_PERCENT because of this issue but I'll instead try to improve the approach to defeat the HW prefetcher.

If HW prefetcher is not defeated, the CAT test LLC misses have a slowly converging ramp which is not very useful unless number of runs is increased by much (and perhaps the first samples dropped entirely). So it is kinda needed and it would be nice if an approach that is non-HW specific could be used for this.

It will probably take some time... Should I send a v3 with only the fixes and useful refactors at the head of this series while I try to sort these problems with the test changes out?

-- i.

Reinette Chatre

2:39 p.m.

New subject: [PATCH v2 21/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

Hi Ilpo,

On 6/2/2023 6:51 AM, Ilpo Järvinen wrote:

...

It will probably take some time... Should I send a v3 with only the fixes and useful refactors at the head of this series while I try to sort these problems with the test changes out?

This sounds good to me. This series is already at 24 patches so I think that splitting the redesign of the CAT test from the other fixes would indeed make this work easier to parse.

Thank you

Reinette

Ilpo Järvinen

14 Jun 14 Jun

1:02 p.m.

New subject: [PATCH v2 21/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

On Thu, 1 Jun 2023, Shaopeng Tan (Fujitsu) wrote:

...

...
...
...
When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/fill_buf.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 7e0d3a1ea555..049a520498a9 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -88,14 +88,17 @@ static void *malloc_and_init_memory(size_t s)

static int fill_one_span_read(unsigned char *start_ptr, unsigned char *end_ptr) {

unsigned char sum, *p;
unsigned int size = (end_ptr - start_ptr) / (CL_SIZE / 2);

unsigned int count = size;

unsigned char sum;

/*
* Read the buffer in an order that is unexpected by HW prefetching
* optimizations to prevent them interfering with the caching pattern.
*/
sum = 0;
p = start_ptr;

while (p < end_ptr) {
sum += *p;
p += (CL_SIZE / 2);
}
while (count--)
sum += start_ptr[((count * 59) % size) * CL_SIZE / 2];
Could you please elaborate why 59 is used?
The main reason is that it's a prime number ensuring the whole buffer gets read. I picked something that doesn't make it to wrap on almost every iteration.
Thanks for your explanation. It seems there is no problem.

Perhaps you have already tested this patch in your environment and got a test result of "ok". Because HW prefetching does not work well, the IMC counter fluctuates a lot in my environment, and the test result is "not ok".

In order to ensure this test set runs in any environments and gets "ok", would you consider changing the value of MAX_DIFF_PERCENT of each test? or changing something else?
Environment:
 Kernel: 6.4.0-rc2
 CPU: Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz

Test result(MBM as an example):
# # Starting MBM BW change ...
# # Mounting resctrl to "/sys/fs/resctrl"
# # Benchmark PID: 8671
# # Writing benchmark parameters to resctrl FS
# # Write schema "MB:0=100" to resctrl FS
# # Checking for pass/fail
# # Fail: Check MBM diff within 5%
# # avg_diff_per: 9%
# # Span in bytes: 262144000
# # avg_bw_imc: 6202
# # avg_bw_resc: 5585
# not ok 1 MBM: bw change

Could you try if the approach below works better (I think it should apply cleanly on top of the fixes+cleanups v3 series which you recently tested, no need to have the other CAT test changes).

The biggest difference in terms of result stability my tests come from these factors: - Removed reversed index order. - Open-coded the modulo in the loop to subtraction.

In addition, I changed the prime to one which works slightly better than 59. The MBM/MBA results were already <5% with 59 too due to the other two changes, but using 23 lowered them further in my tests (with Platinum 8260L).

--- From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com [PATCH] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo.

Using a prime multiplier with modulo ensures the entire buffer is eventually read. 23 is small enough that the reads are spread out but wrapping does not occur very frequently (wrapping too often can trigger L2 hits more frequently which causes noise to the test because getting the data from LLC is not required).

It was discovered that not all primes work equally well and some can cause wildly unstable results (e.g., in an earlier version of this patch, the reads were done in reversed order and 59 was used as the prime resulting in unacceptably high and unstable results in MBA and MBM test on some architectures).

Link: https://lore.kernel.org/linux-kselftest/TYAPR01MB6330025B5E6537F94DA49ACB8B4... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

--- tools/testing/selftests/resctrl/fill_buf.c | 38 +++++++++++++++++++++++------- 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index f9893edda869..afde37d3fca0 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -74,16 +74,38 @@ static void *malloc_and_init_memory(size_t buf_size) return p; }

+/* + * Buffer index step advance to workaround HW prefetching interfering with + * the measurements. + * + * Must be a prime to step through all indexes of the buffer. + * + * Some primes work better than others on some architectures (from MBA/MBM + * result stability point of view). + */ +#define FILL_IDX_MULT 23 + static int fill_one_span_read(unsigned char *buf, size_t buf_size) { - unsigned char *end_ptr = buf + buf_size; - unsigned char sum, *p; - - sum = 0; - p = buf; - while (p < end_ptr) { - sum += *p; - p += (CL_SIZE / 2); + unsigned int size = buf_size / (CL_SIZE / 2); + unsigned int i, idx = 0; + unsigned char sum = 0; + + /* + * Read the buffer in an order that is unexpected by HW prefetching + * optimizations to prevent them interfering with the caching pattern. + * + * The read order is (in terms of halves of cachelines): + * i * FILL_IDX_MULT % size + * The formula is open-coded below to avoiding modulo inside the loop + * as it improves MBA/MBM result stability on some architectures. + */ + for (i = 0; i < size; i++) { + sum += buf[idx * (CL_SIZE / 2)]; + + idx += FILL_IDX_MULT; + while (idx >= size) + idx -= size; }

return sum;

-- tg: (68d2d0512b91..) refactor/read-fuzzing (depends on: refactor/remove-test-globals)

Shaopeng Tan (Fujitsu)

16 Jun 16 Jun

5:30 a.m.

New subject: [PATCH v2 21/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

Hi Ilpo,

...

On Thu, 1 Jun 2023, Shaopeng Tan (Fujitsu) wrote:

...
...
...
...
When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/fill_buf.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 7e0d3a1ea555..049a520498a9 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -88,14 +88,17 @@ static void *malloc_and_init_memory(size_t s)

static int fill_one_span_read(unsigned char *start_ptr, unsigned char *end_ptr) {

unsigned char sum, *p;
unsigned int size = (end_ptr - start_ptr) / (CL_SIZE / 2);

unsigned int count = size;

unsigned char sum;

/*
* Read the buffer in an order that is unexpected by HW
prefetching

...
...
...
...
* optimizations to prevent them interfering with the caching
pattern.

...
...
...
...
*/
sum = 0;
p = start_ptr;

while (p < end_ptr) {
sum += *p;
p += (CL_SIZE / 2);
}
while (count--)
sum += start_ptr[((count * 59) % size) * CL_SIZE / 2];
Could you please elaborate why 59 is used?
The main reason is that it's a prime number ensuring the whole buffer gets
read.

...
...
I picked something that doesn't make it to wrap on almost every iteration.

Thanks for your explanation. It seems there is no problem.

Perhaps you have already tested this patch in your environment and got a test

result of "ok".

...
Because HW prefetching does not work well, the IMC counter fluctuates a lot in my environment, and the test result is "not ok".

In order to ensure this test set runs in any environments and gets "ok", would you consider changing the value of MAX_DIFF_PERCENT of

each test?

...
or changing something else?
Environment:
 Kernel: 6.4.0-rc2
 CPU: Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz

Test result(MBM as an example):
# # Starting MBM BW change ...
# # Mounting resctrl to "/sys/fs/resctrl"
# # Benchmark PID: 8671
# # Writing benchmark parameters to resctrl FS # # Write schema
"MB:0=100" to resctrl FS # # Checking for pass/fail # # Fail: Check
MBM diff within 5% # # avg_diff_per: 9% # # Span in bytes: 262144000 #
# avg_bw_imc: 6202 # # avg_bw_resc: 5585 # not ok 1 MBM: bw change ```
Could you try if the approach below works better (I think it should apply cleanly on top of the fixes+cleanups v3 series which you recently tested, no need to have the other CAT test changes).

I ran the test set several times. MBA and MBM seem fine, but CAT is always "not ok". The result is below.

--- $ sudo make -C tools/testing/selftests/resctrl run_tests make: Entering directory '/**/tools/testing/selftests/resctrl' TAP version 13 1..1 # selftests: resctrl: resctrl_tests # TAP version 13 # # Pass: Check kernel supports resctrl filesystem # # Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists # # resctrl filesystem not mounted # # dmesg: [ 3.658029] resctrl: L3 allocation detected # # dmesg: [ 3.658420] resctrl: MB allocation detected # # dmesg: [ 3.658604] resctrl: L3 monitoring detected # 1..4 # # Starting MBM BW change ... # # Mounting resctrl to "/sys/fs/resctrl" # # Benchmark PID: 9555 # # Writing benchmark parameters to resctrl FS # # Write schema "MB:0=100" to resctrl FS # # Checking for pass/fail # # Pass: Check MBM diff within 5% # # avg_diff_per: 0% # # Span (MB): 250 # # avg_bw_imc: 6880 # # avg_bw_resc: 6895 # ok 1 MBM: bw change # # Starting MBA Schemata change ... # # Mounting resctrl to "/sys/fs/resctrl" # # Benchmark PID: 9561 # # Writing benchmark parameters to resctrl FS # # Write schema "MB:0=100" to resctrl FS # # Write schema "MB:0=90" to resctrl FS # # Write schema "MB:0=80" to resctrl FS # # Write schema "MB:0=70" to resctrl FS # # Write schema "MB:0=60" to resctrl FS # # Write schema "MB:0=50" to resctrl FS # # Write schema "MB:0=40" to resctrl FS # # Write schema "MB:0=30" to resctrl FS # # Write schema "MB:0=20" to resctrl FS # # Write schema "MB:0=10" to resctrl FS # # Results are displayed in (MB) # # Pass: Check MBA diff within 5% for schemata 100 # # avg_diff_per: 0% # # avg_bw_imc: 6874 # # avg_bw_resc: 6904 # # Pass: Check MBA diff within 5% for schemata 90 # # avg_diff_per: 1% # # avg_bw_imc: 6738 # # avg_bw_resc: 6807 # # Pass: Check MBA diff within 5% for schemata 80 # # avg_diff_per: 1% # # avg_bw_imc: 6735 # # avg_bw_resc: 6803 # # Pass: Check MBA diff within 5% for schemata 70 # # avg_diff_per: 1% # # avg_bw_imc: 6702 # # avg_bw_resc: 6770 # # Pass: Check MBA diff within 5% for schemata 60 # # avg_diff_per: 1% # # avg_bw_imc: 6632 # # avg_bw_resc: 6725 # # Pass: Check MBA diff within 5% for schemata 50 # # avg_diff_per: 1% # # avg_bw_imc: 6510 # # avg_bw_resc: 6635 # # Pass: Check MBA diff within 5% for schemata 40 # # avg_diff_per: 2% # # avg_bw_imc: 6206 # # avg_bw_resc: 6342 # # Pass: Check MBA diff within 5% for schemata 30 # # avg_diff_per: 1% # # avg_bw_imc: 3826 # # avg_bw_resc: 3896 # # Pass: Check MBA diff within 5% for schemata 20 # # avg_diff_per: 1% # # avg_bw_imc: 2820 # # avg_bw_resc: 2862 # # Pass: Check MBA diff within 5% for schemata 10 # # avg_diff_per: 1% # # avg_bw_imc: 1876 # # avg_bw_resc: 1898 # # Pass: Check schemata change using MBA # ok 2 MBA: schemata change # # Starting CMT test ... # # Mounting resctrl to "/sys/fs/resctrl" # # Cache size :25952256 # # Benchmark PID: 9573 # # Writing benchmark parameters to resctrl FS # # Checking for pass/fail # # Pass: Check cache miss rate within 15% # # Percent diff=10 # # Number of bits: 5 # # Average LLC val: 12994560 # # Cache span (bytes): 11796480 # ok 3 CMT: test # # Starting CAT test ... # # Mounting resctrl to "/sys/fs/resctrl" # # Cache size :25952256 # # Writing benchmark parameters to resctrl FS # # Write schema "L3:0=3f" to resctrl FS # # Checking for pass/fail # # Fail: Check cache miss rate within 4% # # Percent diff=24 # # Number of bits: 6 # # Average LLC val: 275418 # # Cache span (lines): 221184 # not ok 4 CAT: test # # Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0 not ok 1 selftests: resctrl: resctrl_tests # exit=1 make: Leaving directory '/**/tools/testing/selftests/resctrl' ---

Best regards, Shaopeng TAN

Ilpo Järvinen

6:44 a.m.

New subject: [PATCH v2 21/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

On Fri, 16 Jun 2023, Shaopeng Tan (Fujitsu) wrote:

...

Hi Ilpo,

...
On Thu, 1 Jun 2023, Shaopeng Tan (Fujitsu) wrote:

...
...
...
...
When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo.

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/fill_buf.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 7e0d3a1ea555..049a520498a9 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -88,14 +88,17 @@ static void *malloc_and_init_memory(size_t s)

static int fill_one_span_read(unsigned char *start_ptr, unsigned char *end_ptr) {

unsigned char sum, *p;
unsigned int size = (end_ptr - start_ptr) / (CL_SIZE / 2);

unsigned int count = size;

unsigned char sum;

/*
* Read the buffer in an order that is unexpected by HW
prefetching

...
...
...
...
* optimizations to prevent them interfering with the caching
pattern.

...
...
...
...
*/
sum = 0;
p = start_ptr;

while (p < end_ptr) {
sum += *p;
p += (CL_SIZE / 2);
}
while (count--)
sum += start_ptr[((count * 59) % size) * CL_SIZE / 2];
Could you please elaborate why 59 is used?
The main reason is that it's a prime number ensuring the whole buffer gets
read.

...
...
I picked something that doesn't make it to wrap on almost every iteration.

Thanks for your explanation. It seems there is no problem.

Perhaps you have already tested this patch in your environment and got a test

result of "ok".

...
Because HW prefetching does not work well, the IMC counter fluctuates a lot in my environment, and the test result is "not ok".

In order to ensure this test set runs in any environments and gets "ok", would you consider changing the value of MAX_DIFF_PERCENT of

each test?

...
or changing something else?
Environment:
 Kernel: 6.4.0-rc2
 CPU: Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz

Test result(MBM as an example):
# # Starting MBM BW change ...
# # Mounting resctrl to "/sys/fs/resctrl"
# # Benchmark PID: 8671
# # Writing benchmark parameters to resctrl FS # # Write schema
"MB:0=100" to resctrl FS # # Checking for pass/fail # # Fail: Check
MBM diff within 5% # # avg_diff_per: 9% # # Span in bytes: 262144000 #
# avg_bw_imc: 6202 # # avg_bw_resc: 5585 # not ok 1 MBM: bw change ```
Could you try if the approach below works better (I think it should apply cleanly on top of the fixes+cleanups v3 series which you recently tested, no need to have the other CAT test changes).
I ran the test set several times. MBA and MBM seem fine, but CAT is always "not ok". The result is below.

Ok, thanks a lot for confirming. I was just interested to see MBA/MBM test results.

I'm not surprised the old CAT test is failing with "not ok". I see it occurring quite often with the old CAT test. It is one of the reasons why it is being rewritten, although the main motivator is that the old one doesn't really even test CAT because it flushes LLC and reads the buffer only once after the flush.

The rewritten CAT test should work better in this regard but it was not among fixes+cleanups series (v3) + this patch.

-- i. > --- > $ sudo make -C tools/testing/selftests/resctrl run_tests > make: Entering directory '/**/tools/testing/selftests/resctrl' > TAP version 13 > 1..1 > # selftests: resctrl: resctrl_tests > # TAP version 13 > # # Pass: Check kernel supports resctrl filesystem > # # Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists > # # resctrl filesystem not mounted > # # dmesg: [ 3.658029] resctrl: L3 allocation detected > # # dmesg: [ 3.658420] resctrl: MB allocation detected > # # dmesg: [ 3.658604] resctrl: L3 monitoring detected > # 1..4 > # # Starting MBM BW change ... > # # Mounting resctrl to "/sys/fs/resctrl" > # # Benchmark PID: 9555 > # # Writing benchmark parameters to resctrl FS > # # Write schema "MB:0=100" to resctrl FS > # # Checking for pass/fail > # # Pass: Check MBM diff within 5% > # # avg_diff_per: 0% > # # Span (MB): 250 > # # avg_bw_imc: 6880 > # # avg_bw_resc: 6895 > # ok 1 MBM: bw change > # # Starting MBA Schemata change ... > # # Mounting resctrl to "/sys/fs/resctrl" > # # Benchmark PID: 9561 > # # Writing benchmark parameters to resctrl FS > # # Write schema "MB:0=100" to resctrl FS > # # Write schema "MB:0=90" to resctrl FS > # # Write schema "MB:0=80" to resctrl FS > # # Write schema "MB:0=70" to resctrl FS > # # Write schema "MB:0=60" to resctrl FS > # # Write schema "MB:0=50" to resctrl FS > # # Write schema "MB:0=40" to resctrl FS > # # Write schema "MB:0=30" to resctrl FS > # # Write schema "MB:0=20" to resctrl FS > # # Write schema "MB:0=10" to resctrl FS > # # Results are displayed in (MB) > # # Pass: Check MBA diff within 5% for schemata 100 > # # avg_diff_per: 0% > # # avg_bw_imc: 6874 > # # avg_bw_resc: 6904 > # # Pass: Check MBA diff within 5% for schemata 90 > # # avg_diff_per: 1% > # # avg_bw_imc: 6738 > # # avg_bw_resc: 6807 > # # Pass: Check MBA diff within 5% for schemata 80 > # # avg_diff_per: 1% > # # avg_bw_imc: 6735 > # # avg_bw_resc: 6803 > # # Pass: Check MBA diff within 5% for schemata 70 > # # avg_diff_per: 1% > # # avg_bw_imc: 6702 > # # avg_bw_resc: 6770 > # # Pass: Check MBA diff within 5% for schemata 60 > # # avg_diff_per: 1% > # # avg_bw_imc: 6632 > # # avg_bw_resc: 6725 > # # Pass: Check MBA diff within 5% for schemata 50 > # # avg_diff_per: 1% > # # avg_bw_imc: 6510 > # # avg_bw_resc: 6635 > # # Pass: Check MBA diff within 5% for schemata 40 > # # avg_diff_per: 2% > # # avg_bw_imc: 6206 > # # avg_bw_resc: 6342 > # # Pass: Check MBA diff within 5% for schemata 30 > # # avg_diff_per: 1% > # # avg_bw_imc: 3826 > # # avg_bw_resc: 3896 > # # Pass: Check MBA diff within 5% for schemata 20 > # # avg_diff_per: 1% > # # avg_bw_imc: 2820 > # # avg_bw_resc: 2862 > # # Pass: Check MBA diff within 5% for schemata 10 > # # avg_diff_per: 1% > # # avg_bw_imc: 1876 > # # avg_bw_resc: 1898 > # # Pass: Check schemata change using MBA > # ok 2 MBA: schemata change > # # Starting CMT test ... > # # Mounting resctrl to "/sys/fs/resctrl" > # # Cache size :25952256 > # # Benchmark PID: 9573 > # # Writing benchmark parameters to resctrl FS > # # Checking for pass/fail > # # Pass: Check cache miss rate within 15% > # # Percent diff=10 > # # Number of bits: 5 > # # Average LLC val: 12994560 > # # Cache span (bytes): 11796480 > # ok 3 CMT: test > # # Starting CAT test ... > # # Mounting resctrl to "/sys/fs/resctrl" > # # Cache size :25952256 > # # Writing benchmark parameters to resctrl FS > # # Write schema "L3:0=3f" to resctrl FS > # # Checking for pass/fail > # # Fail: Check cache miss rate within 4% > # # Percent diff=24 > # # Number of bits: 6 > # # Average LLC val: 275418 > # # Cache span (lines): 221184 > # not ok 4 CAT: test > # # Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0 > not ok 1 selftests: resctrl: resctrl_tests # exit=1 > make: Leaving directory '/**/tools/testing/selftests/resctrl'

Ilpo Järvinen

18 Apr 18 Apr

11:45 a.m.

New subject: [PATCH v2 22/24] selftests/resctrl: Split measure_cache_vals() function

The measure_cache_vals() function does a different thing depending on the test case that called it: - For CAT, it measures LLC perf misses 2. - For CMT, it measures LLC occupancy through resctrl.

Split these two functionalities such that CMT test calls a new function called measure_llc_resctrl() to get LLC occupancy through resctrl and CAT test directly calls get_llc_perf().

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index c93f5d2bc66e..a015ce2d0a3c 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -199,35 +199,20 @@ static int print_results_cache(char *filename, int bm_pid, return 0; }

-int measure_cache_vals(struct resctrl_val_param *param, int bm_pid) +int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid) { - unsigned long llc_perf_miss = 0, llc_occu_resc = 0, llc_value = 0; + unsigned long llc_occu_resc = 0; int ret;

- /* - * Measure cache miss from perf. - */ - if (!strncmp(param->resctrl_val, CAT_STR, sizeof(CAT_STR))) { - ret = get_llc_perf(&llc_perf_miss); - if (ret < 0) - return ret; - llc_value = llc_perf_miss; - } - /* * Measure llc occupancy from resctrl. */ - if (!strncmp(param->resctrl_val, CMT_STR, sizeof(CMT_STR))) { - ret = get_llc_occu_resctrl(&llc_occu_resc); - if (ret < 0) - return ret; - llc_value = llc_occu_resc; - } - ret = print_results_cache(param->filename, bm_pid, llc_value); - if (ret) + ret = get_llc_occu_resctrl(&llc_occu_resc); + if (ret < 0) return ret;

- return 0; + ret = print_results_cache(param->filename, bm_pid, llc_occu_resc); + return ret; }

/* @@ -241,6 +226,7 @@ int cat_val(struct resctrl_val_param *param) { int memflush = 1, operation = 0, ret = 0; char *resctrl_val = param->resctrl_val; + unsigned long llc_perf_miss = 0; pid_t bm_pid;

if (strcmp(param->filename, "") == 0) @@ -281,7 +267,14 @@ int cat_val(struct resctrl_val_param *param) }

sleep(1); - ret = measure_cache_vals(param, bm_pid); + + /* Measure cache miss from perf */ + ret = get_llc_perf(&llc_perf_miss); + if (ret) + break; + + ret = print_results_cache(param->filename, bm_pid, + llc_perf_miss); if (ret) break; } diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index be5a61e7fbcc..12754733126f 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -124,7 +124,7 @@ int cmt_resctrl_val(int cpu_no, int n, char **benchmark_cmd); unsigned int count_bits(unsigned long n); void cmt_test_cleanup(void); int get_core_sibling(int cpu_no); -int measure_cache_vals(struct resctrl_val_param *param, int bm_pid); +int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid); int show_cache_info(unsigned long sum_llc_val, int no_of_bits, unsigned long cache_span, unsigned long max_diff, unsigned long max_diff_percent, unsigned long num_of_runs, diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c index f0f6c5f6e98b..0ffe4694bf47 100644 --- a/tools/testing/selftests/resctrl/resctrl_val.c +++ b/tools/testing/selftests/resctrl/resctrl_val.c @@ -774,7 +774,7 @@ int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param) break; } else if (!strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) { sleep(1); - ret = measure_cache_vals(param, bm_pid); + ret = measure_llc_resctrl(param, bm_pid); if (ret) break; }

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:21 a.m.

New subject: [PATCH v2 22/24] selftests/resctrl: Split measure_cache_vals() function

Hi Ilpo,

On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:

...

The measure_cache_vals() function does a different thing depending on the test case that called it:

For CAT, it measures LLC perf misses 2.

What does the "2" represent?

...

For CMT, it measures LLC occupancy through resctrl.

Split these two functionalities such that CMT test calls a new function called measure_llc_resctrl() to get LLC occupancy through resctrl and CAT test directly calls get_llc_perf().

The changelog mentions 'split' but the split does not end up the same for the two usages. Why does this split end up with one usage moving to a direct call (get_llc_perf()) while the other usage remains with a wrapper? Why not open code both?

Reinette

Ilpo Järvinen

18 Apr 18 Apr

11:45 a.m.

New subject: [PATCH v2 23/24] selftests/resctrl: Split show_cache_info() to test specific and generic parts

show_cache_info() attempts to do calculate the results and provide generic cache information. It makes hard to alter the pass/fail conditions.

Separate the the test specific checks into CAT and CMT test files and leave only the generic information part into show_cache_info().

Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cache.c | 40 ++++------------------ tools/testing/selftests/resctrl/cat_test.c | 30 ++++++++++++++-- tools/testing/selftests/resctrl/cmt_test.c | 32 +++++++++++++++-- tools/testing/selftests/resctrl/resctrl.h | 6 ++-- 4 files changed, 65 insertions(+), 43 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index a015ce2d0a3c..7970239413da 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -283,43 +283,17 @@ int cat_val(struct resctrl_val_param *param) }

/* - * show_cache_info: show cache test result information - * @sum_llc_val: sum of LLC cache result data + * show_cache_info: show generic cache test information * @no_of_bits: number of bits - * @cache_span: cache span in bytes for CMT or in lines for CAT - * @max_diff: max difference - * @max_diff_percent: max difference percentage - * @num_of_runs: number of runs - * @platform: show test information on this platform - * @cmt: CMT test or CAT test - * - * Return: 0 on success. non-zero on failure. + * @avg_llc_val: avg of LLC cache result data + * @cache_span: cache span + * @lines: cache span in lines or bytes */ -int show_cache_info(unsigned long sum_llc_val, int no_of_bits, - unsigned long cache_span, unsigned long max_diff, - unsigned long max_diff_percent, unsigned long num_of_runs, - bool platform, bool cmt) +void show_cache_info(int no_of_bits, unsigned long avg_llc_val, + unsigned long cache_span, bool lines) { - unsigned long avg_llc_val = 0; - float diff_percent; - long avg_diff = 0; - int ret; - - avg_llc_val = sum_llc_val / num_of_runs; - avg_diff = (long)abs(cache_span - avg_llc_val); - diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100; - - ret = platform && abs((int)diff_percent) > max_diff_percent && - (cmt ? (abs(avg_diff) > max_diff) : true); - - ksft_print_msg("%s Check cache miss rate within %d%%\n", - ret ? "Fail:" : "Pass:", max_diff_percent); - - ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent)); ksft_print_msg("Number of bits: %d\n", no_of_bits); ksft_print_msg("Average LLC val: %lu\n", avg_llc_val); - ksft_print_msg("Cache span (%s): %lu\n", cmt ? "bytes" : "lines", + ksft_print_msg("Cache span (%s): %lu\n", !lines ? "bytes" : "lines", cache_span); - - return ret; } diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index ef3ba22bdde5..4b505fdb35d7 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -41,6 +41,30 @@ static int cat_setup(struct resctrl_val_param *p) return ret; }

+static int show_results_info(unsigned long sum_llc_val, int no_of_bits, + unsigned long cache_span, unsigned long max_diff, + unsigned long max_diff_percent, unsigned long num_of_runs, + bool platform) +{ + unsigned long avg_llc_val = 0; + float diff_percent; + int ret; + + avg_llc_val = sum_llc_val / num_of_runs; + diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100; + + ret = platform && abs((int)diff_percent) > max_diff_percent; + + ksft_print_msg("%s Check cache miss rate within %d%%\n", + ret ? "Fail:" : "Pass:", max_diff_percent); + + ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent)); + + show_cache_info(no_of_bits, avg_llc_val, cache_span, true); + + return ret; +} + static int check_results(struct resctrl_val_param *param) { char *token_array[8], temp[512]; @@ -76,9 +100,9 @@ static int check_results(struct resctrl_val_param *param) fclose(fp); no_of_bits = count_consecutive_bits(param->mask, NULL);

- return show_cache_info(sum_llc_perf_miss, no_of_bits, param->span / 64, - MAX_DIFF, MAX_DIFF_PERCENT, runs - 1, - get_vendor() == ARCH_INTEL, false); + return show_results_info(sum_llc_perf_miss, no_of_bits, param->span / 64, + MAX_DIFF, MAX_DIFF_PERCENT, runs - 1, + get_vendor() == ARCH_INTEL); }

void cat_test_cleanup(void) diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 6adee08661e7..b3fedade583b 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -27,6 +27,33 @@ static int cmt_setup(struct resctrl_val_param *p) return 0; }

+static int show_results_info(unsigned long sum_llc_val, int no_of_bits, + unsigned long cache_span, unsigned long max_diff, + unsigned long max_diff_percent, unsigned long num_of_runs, + bool platform) +{ + unsigned long avg_llc_val = 0; + float diff_percent; + long avg_diff = 0; + int ret; + + avg_llc_val = sum_llc_val / num_of_runs; + avg_diff = (long)abs(cache_span - avg_llc_val); + diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100; + + ret = platform && abs((int)diff_percent) > max_diff_percent && + abs(avg_diff) > max_diff; + + ksft_print_msg("%s Check cache miss rate within %d%%\n", + ret ? "Fail:" : "Pass:", max_diff_percent); + + ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent)); + + show_cache_info(no_of_bits, avg_llc_val, cache_span, false); + + return ret; +} + static int check_results(struct resctrl_val_param *param, int no_of_bits) { char *token_array[8], temp[512]; @@ -58,9 +85,8 @@ static int check_results(struct resctrl_val_param *param, int no_of_bits) } fclose(fp);

- return show_cache_info(sum_llc_occu_resc, no_of_bits, param->span, - MAX_DIFF, MAX_DIFF_PERCENT, runs - 1, - true, true); + return show_results_info(sum_llc_occu_resc, no_of_bits, param->span, + MAX_DIFF, MAX_DIFF_PERCENT, runs - 1, true); }

void cmt_test_cleanup(void) diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 12754733126f..fc914391024d 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -125,9 +125,7 @@ unsigned int count_bits(unsigned long n); void cmt_test_cleanup(void); int get_core_sibling(int cpu_no); int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid); -int show_cache_info(unsigned long sum_llc_val, int no_of_bits, - unsigned long cache_span, unsigned long max_diff, - unsigned long max_diff_percent, unsigned long num_of_runs, - bool platform, bool cmt); +void show_cache_info(int no_of_bits, unsigned long avg_llc_val, + unsigned long cache_span, bool lines);

#endif /* RESCTRL_H */

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:22 a.m.

New subject: [PATCH v2 23/24] selftests/resctrl: Split show_cache_info() to test specific and generic parts

Hi Ilpo,

On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:

...

show_cache_info() attempts to do calculate the results and provide generic cache information. It makes hard to alter the pass/fail conditions.

I find the above hard to read. How about:

show_cache_info() calculates results and provide generic cache information. This makes it hard to alter pass/fail conditions.

...

Separate the the test specific checks into CAT and CMT test files and

"the the" -> "the"

...

leave only the generic information part into show_cache_info().

Reinette

Ilpo Järvinen

18 Apr 18 Apr

11:45 a.m.

New subject: [PATCH v2 24/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

CAT test spawns two processes into two different control groups with exclusive schemata. Both the processes alloc a buffer from memory matching their allocated LLC block size and flush the entire buffer out of caches. Since the processes are reading through the buffer only once during the measurement and initially all the buffer was flushed, the test isn't testing CAT.

Rewrite the CAT test to allocated a buffer sized to half of LLC. Then perform a sequence of tests with different LLC alloc sizes starting from half of the CBM bits down to 1-bit CBM. Flush the buffer before each test and read the buffer twice. Observe the LLC misses on the second read through the buffer. As the allocated LLC block gets smaller and smaller, the LLC misses will become larger and larger giving a strong signal on CAT working properly.

Suggested-by: Reinette Chatre reinette.chatre@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com --- tools/testing/selftests/resctrl/cache.c | 20 +- tools/testing/selftests/resctrl/cat_test.c | 204 +++++++++------------ 2 files changed, 97 insertions(+), 127 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index 7970239413da..64f08ba5edc2 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -224,10 +224,10 @@ int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid) */ int cat_val(struct resctrl_val_param *param) { - int memflush = 1, operation = 0, ret = 0; char *resctrl_val = param->resctrl_val; unsigned long llc_perf_miss = 0; pid_t bm_pid; + int ret;

if (strcmp(param->filename, "") == 0) sprintf(param->filename, "stdio"); @@ -245,6 +245,10 @@ int cat_val(struct resctrl_val_param *param) if (ret) return ret;

+ ret = alloc_buffer(param->span, 1); + if (ret) + return ret; + initialize_llc_perf();

/* Test runs until the callback setup() tells the test to stop. */ @@ -256,17 +260,15 @@ int cat_val(struct resctrl_val_param *param) } if (ret < 0) break; + + flush_buffer(param->span); + use_buffer(param->span, 0, true); + ret = reset_enable_llc_perf(bm_pid, param->cpu_no); if (ret) break;

- if (run_fill_buf(param->span, memflush, operation, true)) { - fprintf(stderr, "Error-running fill buffer\n"); - ret = -1; - break; - } - - sleep(1); + use_buffer(param->span, 0, true);

/* Measure cache miss from perf */ ret = get_llc_perf(&llc_perf_miss); @@ -279,6 +281,8 @@ int cat_val(struct resctrl_val_param *param) break; }

+ free_buffer(); + return ret; }

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 4b505fdb35d7..85053829b9c5 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -11,11 +11,12 @@ #include "resctrl.h" #include <unistd.h>

-#define RESULT_FILE_NAME1 "result_cat1" -#define RESULT_FILE_NAME2 "result_cat2" -#define NUM_OF_RUNS 5 -#define MAX_DIFF_PERCENT 4 -#define MAX_DIFF 1000000 +#define RESULT_FILE_NAME "result_cat" +#define NUM_OF_RUNS 5 +#define MIN_DIFF_PERCENT_PER_BIT 2 + +static unsigned long current_mask; +static long prev_avg_llc_val;

/* * Change schemata. Write schemata to specified @@ -28,13 +29,24 @@ static int cat_setup(struct resctrl_val_param *p) int ret = 0;

/* Run NUM_OF_RUNS times */ - if (p->num_of_runs >= NUM_OF_RUNS) - return END_OF_TESTS; + if (p->num_of_runs >= NUM_OF_RUNS) { + /* Remove one bit from the consecutive block */ + current_mask &= current_mask >> 1; + if (!current_mask) + return END_OF_TESTS; + + p->num_of_runs = 0; + }

if (p->num_of_runs == 0) { - sprintf(schemata, "%lx", p->mask); - ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no, - p->resctrl_val); + snprintf(schemata, sizeof(schemata), "%lx", p->mask & ~current_mask); + ret = write_schemata("", schemata, p->cpu_no, p->resctrl_val); + if (ret) + return ret; + snprintf(schemata, sizeof(schemata), "%lx", current_mask); + ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no, p->resctrl_val); + if (ret) + return ret; } p->num_of_runs++;

@@ -42,34 +54,41 @@ static int cat_setup(struct resctrl_val_param *p) }

static int show_results_info(unsigned long sum_llc_val, int no_of_bits, - unsigned long cache_span, unsigned long max_diff, - unsigned long max_diff_percent, unsigned long num_of_runs, - bool platform) + unsigned long cache_span, long min_diff_percent, + unsigned long num_of_runs, bool platform) { - unsigned long avg_llc_val = 0; - float diff_percent; - int ret; + long avg_llc_val = 0; + int avg_diff_per; + float avg_diff; + int ret = 0;

avg_llc_val = sum_llc_val / num_of_runs; - diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100; + avg_diff = (float)(avg_llc_val - prev_avg_llc_val) / prev_avg_llc_val; + avg_diff_per = (int)(avg_diff * 100);

- ret = platform && abs((int)diff_percent) > max_diff_percent; + if (prev_avg_llc_val) { + ret = platform && avg_diff_per < min_diff_percent;

- ksft_print_msg("%s Check cache miss rate within %d%%\n", - ret ? "Fail:" : "Pass:", max_diff_percent); + ksft_print_msg("%s Check cache miss rate changed more than %d%%\n", + ret ? "Fail:" : "Pass:", min_diff_percent);

- ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent)); + ksft_print_msg("Percent diff=%d\n", avg_diff_per); + } + prev_avg_llc_val = avg_llc_val;

show_cache_info(no_of_bits, avg_llc_val, cache_span, true);

return ret; }

-static int check_results(struct resctrl_val_param *param) +static int check_results(struct resctrl_val_param *param, char *cache_type) { char *token_array[8], temp[512]; unsigned long sum_llc_perf_miss = 0; - int runs = 0, no_of_bits = 0; + unsigned long alloc_size; + int runs = 0; + int fail = 0; + int ret; FILE *fp;

ksft_print_msg("Checking for pass/fail\n"); @@ -83,42 +102,59 @@ static int check_results(struct resctrl_val_param *param) while (fgets(temp, sizeof(temp), fp)) { char *token = strtok(temp, ":\t"); int fields = 0; + int bits;

while (token) { token_array[fields++] = token; token = strtok(NULL, ":\t"); } - /* - * Discard the first value which is inaccurate due to monitoring - * setup transition phase. - */ - if (runs > 0) - sum_llc_perf_miss += strtoul(token_array[3], NULL, 0); + + sum_llc_perf_miss += strtoul(token_array[3], NULL, 0); runs++; + + if (runs < NUM_OF_RUNS) + continue; + + if (!current_mask) { + ksft_print_msg("Unexpected empty cache mask\n"); + break; + } + + ret = cache_alloc_size(param->cpu_no, cache_type, current_mask, &alloc_size); + if (ret) + return ret; + + bits = count_bits(current_mask); + + ret = show_results_info(sum_llc_perf_miss, bits, + alloc_size / 64, + MIN_DIFF_PERCENT_PER_BIT * bits, runs, + get_vendor() == ARCH_INTEL); + if (ret) + fail = 1; + + runs = 0; + sum_llc_perf_miss = 0; + current_mask &= current_mask >> 1; }

fclose(fp); - no_of_bits = count_consecutive_bits(param->mask, NULL);

- return show_results_info(sum_llc_perf_miss, no_of_bits, param->span / 64, - MAX_DIFF, MAX_DIFF_PERCENT, runs - 1, - get_vendor() == ARCH_INTEL); + return fail; }

void cat_test_cleanup(void) { - remove(RESULT_FILE_NAME1); - remove(RESULT_FILE_NAME2); + remove(RESULT_FILE_NAME); }

int cat_perf_miss_val(int cpu_no, int n, char *cache_type) { - unsigned long l_mask, l_mask_1; - int ret, pipefd[2], sibling_cpu_no; unsigned long cache_size; unsigned long long_mask; + unsigned int start; int count_of_bits; - char pipe_message; + int ret;

cache_size = 0;

@@ -126,7 +162,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) ret = get_mask_no_shareable(cache_type, &long_mask); if (ret) return ret; - count_of_bits = count_consecutive_bits(long_mask, NULL); + count_of_bits = count_consecutive_bits(long_mask, &start);

/* Get L3/L2 cache size */ ret = get_cache_size(cpu_no, cache_type, &cache_size); @@ -143,99 +179,29 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) count_of_bits - 1); return -1; } - - /* Get core id from same socket for running another thread */ - sibling_cpu_no = get_core_sibling(cpu_no); - if (sibling_cpu_no < 0) - return -1; + current_mask = create_bit_mask(start, n);

struct resctrl_val_param param = { .resctrl_val = CAT_STR, .cpu_no = cpu_no, + .ctrlgrp = "c1", .setup = cat_setup, + .filename = RESULT_FILE_NAME, + .num_of_runs = 0, }; - - l_mask = long_mask >> n; - l_mask_1 = ~l_mask & long_mask; - - /* Set param values for parent thread which will be allocated bitmask - * with (max_bits - n) bits - */ - ret = cache_alloc_size(cpu_no, cache_type, l_mask, &param.span); + param.mask = long_mask; + ret = cache_alloc_size(cpu_no, cache_type, current_mask, &param.span); if (ret) return ret; - strcpy(param.ctrlgrp, "c2"); - strcpy(param.mongrp, "m2"); - strcpy(param.filename, RESULT_FILE_NAME2); - param.mask = l_mask; - param.num_of_runs = 0; - - if (pipe(pipefd)) { - perror("# Unable to create pipe"); - return errno; - } - - fflush(stdout); - bm_pid = fork(); - - /* Set param values for child thread which will be allocated bitmask - * with n bits - */ - if (bm_pid == 0) { - param.mask = l_mask_1; - strcpy(param.ctrlgrp, "c1"); - strcpy(param.mongrp, "m1"); - ret = cache_alloc_size(cpu_no, cache_type, l_mask_1, &param.span); - if (ret) - exit(-1); - strcpy(param.filename, RESULT_FILE_NAME1); - param.num_of_runs = 0; - param.cpu_no = sibling_cpu_no; - } else { - ret = signal_handler_register(); - if (ret) { - kill(bm_pid, SIGKILL); - goto out; - } - }

remove(param.filename);

ret = cat_val(&param); - if (ret == 0) - ret = check_results(&param); - - if (bm_pid == 0) { - /* Tell parent that child is ready */ - close(pipefd[0]); - pipe_message = 1; - if (write(pipefd[1], &pipe_message, sizeof(pipe_message)) < - sizeof(pipe_message)) - /* - * Just print the error message. - * Let while(1) run and wait for itself to be killed. - */ - perror("# failed signaling parent process"); - - close(pipefd[1]); - while (1) - ; - } else { - /* Parent waits for child to be ready. */ - close(pipefd[1]); - pipe_message = 0; - while (pipe_message != 1) { - if (read(pipefd[0], &pipe_message, - sizeof(pipe_message)) < sizeof(pipe_message)) { - perror("# failed reading from child process"); - break; - } - } - close(pipefd[0]); - kill(bm_pid, SIGKILL); - signal_handler_unregister(); - } + if (ret) + goto out;

+ current_mask = create_bit_mask(start, n); + ret = check_results(&param, cache_type); out: cat_test_cleanup();

-- 2.30.2

Reinette Chatre

22 Apr 22 Apr

12:32 a.m.

New subject: [PATCH v2 24/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

Hi Ilpo,

On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:

...

CAT test spawns two processes into two different control groups with exclusive schemata. Both the processes alloc a buffer from memory matching their allocated LLC block size and flush the entire buffer out of caches. Since the processes are reading through the buffer only once during the measurement and initially all the buffer was flushed, the test isn't testing CAT.

Rewrite the CAT test to allocated a buffer sized to half of LLC. Then

"allocated a buffer" -> "allocate a buffer" ?

...

perform a sequence of tests with different LLC alloc sizes starting from half of the CBM bits down to 1-bit CBM. Flush the buffer before each test and read the buffer twice. Observe the LLC misses on the second read through the buffer. As the allocated LLC block gets smaller and smaller, the LLC misses will become larger and larger giving a strong signal on CAT working properly.

Since the changelog starts by describing the CAT test needing two processes I think it would help to highlight that this test uses a single process. I think it would also help to describing how the cache is used by the rest while this test is running.

...

Suggested-by: Reinette Chatre reinette.chatre@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cache.c | 20 +- tools/testing/selftests/resctrl/cat_test.c | 204 +++++++++------------ 2 files changed, 97 insertions(+), 127 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index 7970239413da..64f08ba5edc2 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -224,10 +224,10 @@ int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid) */ int cat_val(struct resctrl_val_param *param) {

int memflush = 1, operation = 0, ret = 0; char *resctrl_val = param->resctrl_val; unsigned long llc_perf_miss = 0; pid_t bm_pid;

int ret;

if (strcmp(param->filename, "") == 0) sprintf(param->filename, "stdio"); @@ -245,6 +245,10 @@ int cat_val(struct resctrl_val_param *param) if (ret) return ret;
ret = alloc_buffer(param->span, 1);

if (ret)
return ret;
initialize_llc_perf();
/* Test runs until the callback setup() tells the test to stop. */ @@ -256,17 +260,15 @@ int cat_val(struct resctrl_val_param *param) } if (ret < 0) break;
flush_buffer(param->span);
use_buffer(param->span, 0, true);
ret = reset_enable_llc_perf(bm_pid, param->cpu_no); if (ret) break;
if (run_fill_buf(param->span, memflush, operation, true)) {
	fprintf(stderr, "Error-running fill buffer\n");
	ret = -1;
	break;
}
sleep(1);
use_buffer(param->span, 0, true);
/* Measure cache miss from perf */ ret = get_llc_perf(&llc_perf_miss); @@ -279,6 +281,8 @@ int cat_val(struct resctrl_val_param *param) break; }

free_buffer();

return ret;

} diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 4b505fdb35d7..85053829b9c5 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -11,11 +11,12 @@ #include "resctrl.h" #include <unistd.h> -#define RESULT_FILE_NAME1 "result_cat1" -#define RESULT_FILE_NAME2 "result_cat2" -#define NUM_OF_RUNS 5 -#define MAX_DIFF_PERCENT 4 -#define MAX_DIFF 1000000 +#define RESULT_FILE_NAME "result_cat" +#define NUM_OF_RUNS 5 +#define MIN_DIFF_PERCENT_PER_BIT 2

Could you please start a new trend that adds documentation that explains what this constant means and how it was chosen?

...

+static unsigned long current_mask; +static long prev_avg_llc_val; /*

Change schemata. Write schemata to specified

@@ -28,13 +29,24 @@ static int cat_setup(struct resctrl_val_param *p) int ret = 0; /* Run NUM_OF_RUNS times */
if (p->num_of_runs >= NUM_OF_RUNS)
return END_OF_TESTS;
if (p->num_of_runs >= NUM_OF_RUNS) {
/* Remove one bit from the consecutive block */
current_mask &= current_mask >> 1;
if (!current_mask)
	return END_OF_TESTS;
p->num_of_runs = 0;

This seems like a workaround to get the schemata to be written. It is problematic since now p->num_of_runs no longer accurately reflects the number of test runs. I was expecting this mask manipulation to be in cat_val() so that it is clear how test works instead of part of the logic handled here.

...

}

if (p->num_of_runs == 0) {

```
sprintf(schemata, "%lx", p->mask);
```

ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no,

```
		     p->resctrl_val);
```

snprintf(schemata, sizeof(schemata), "%lx", p->mask & ~current_mask);

ret = write_schemata("", schemata, p->cpu_no, p->resctrl_val);

```
if (ret)
```
```
	return ret;
```

snprintf(schemata, sizeof(schemata), "%lx", current_mask);

ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no, p->resctrl_val);

```
if (ret)
```
```
	return ret;
```
} p->num_of_runs++;

...

@@ -126,7 +162,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) ret = get_mask_no_shareable(cache_type, &long_mask); if (ret) return ret;

count_of_bits = count_consecutive_bits(long_mask, NULL);

count_of_bits = count_consecutive_bits(long_mask, &start);

/* Get L3/L2 cache size */ ret = get_cache_size(cpu_no, cache_type, &cache_size); @@ -143,99 +179,29 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) count_of_bits - 1); return -1; }

/* Get core id from same socket for running another thread */

sibling_cpu_no = get_core_sibling(cpu_no);

Do any users of get_core_sibling() remain after this?

Reinette

Ilpo Järvinen

26 Apr 26 Apr

1:58 p.m.

New subject: [PATCH v2 24/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...

Hi Ilpo,

On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:

...
CAT test spawns two processes into two different control groups with exclusive schemata. Both the processes alloc a buffer from memory matching their allocated LLC block size and flush the entire buffer out of caches. Since the processes are reading through the buffer only once during the measurement and initially all the buffer was flushed, the test isn't testing CAT.

Rewrite the CAT test to allocated a buffer sized to half of LLC. Then

"allocated a buffer" -> "allocate a buffer" ?

...
perform a sequence of tests with different LLC alloc sizes starting from half of the CBM bits down to 1-bit CBM. Flush the buffer before each test and read the buffer twice. Observe the LLC misses on the second read through the buffer. As the allocated LLC block gets smaller and smaller, the LLC misses will become larger and larger giving a strong signal on CAT working properly.

Since the changelog starts by describing the CAT test needing two processes I think it would help to highlight that this test uses a single process. I think it would also help to describing how the cache is used by the rest while this test is running.

Sure, good points, I'll add the info.

...

...
Suggested-by: Reinette Chatre reinette.chatre@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

tools/testing/selftests/resctrl/cache.c | 20 +- tools/testing/selftests/resctrl/cat_test.c | 204 +++++++++------------ 2 files changed, 97 insertions(+), 127 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c index 7970239413da..64f08ba5edc2 100644 --- a/tools/testing/selftests/resctrl/cache.c +++ b/tools/testing/selftests/resctrl/cache.c @@ -224,10 +224,10 @@ int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid) */ int cat_val(struct resctrl_val_param *param) {

int memflush = 1, operation = 0, ret = 0; char *resctrl_val = param->resctrl_val; unsigned long llc_perf_miss = 0; pid_t bm_pid;

int ret;

if (strcmp(param->filename, "") == 0) sprintf(param->filename, "stdio"); @@ -245,6 +245,10 @@ int cat_val(struct resctrl_val_param *param) if (ret) return ret;
ret = alloc_buffer(param->span, 1);

if (ret)
return ret;
initialize_llc_perf();
/* Test runs until the callback setup() tells the test to stop. */ @@ -256,17 +260,15 @@ int cat_val(struct resctrl_val_param *param) } if (ret < 0) break;
flush_buffer(param->span);
use_buffer(param->span, 0, true);
ret = reset_enable_llc_perf(bm_pid, param->cpu_no); if (ret) break;
if (run_fill_buf(param->span, memflush, operation, true)) {
	fprintf(stderr, "Error-running fill buffer\n");
	ret = -1;
	break;
}
sleep(1);
use_buffer(param->span, 0, true);
/* Measure cache miss from perf */ ret = get_llc_perf(&llc_perf_miss); @@ -279,6 +281,8 @@ int cat_val(struct resctrl_val_param *param) break; }

free_buffer();

return ret;

} diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 4b505fdb35d7..85053829b9c5 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -11,11 +11,12 @@ #include "resctrl.h" #include <unistd.h> -#define RESULT_FILE_NAME1 "result_cat1" -#define RESULT_FILE_NAME2 "result_cat2" -#define NUM_OF_RUNS 5 -#define MAX_DIFF_PERCENT 4 -#define MAX_DIFF 1000000 +#define RESULT_FILE_NAME "result_cat" +#define NUM_OF_RUNS 5 +#define MIN_DIFF_PERCENT_PER_BIT 2
Could you please start a new trend that adds documentation that explains what this constant means and how it was chosen?

I can try although that particular 2 was a bit handwavy that just seems to work with the tests I performed.

...

...
+static unsigned long current_mask; +static long prev_avg_llc_val; /*

Change schemata. Write schemata to specified

@@ -28,13 +29,24 @@ static int cat_setup(struct resctrl_val_param *p) int ret = 0; /* Run NUM_OF_RUNS times */
if (p->num_of_runs >= NUM_OF_RUNS)
return END_OF_TESTS;
if (p->num_of_runs >= NUM_OF_RUNS) {
/* Remove one bit from the consecutive block */
current_mask &= current_mask >> 1;
if (!current_mask)
	return END_OF_TESTS;
p->num_of_runs = 0;
This seems like a workaround to get the schemata to be written. It is problematic since now p->num_of_runs no longer accurately reflects the number of test runs.

This is already the case. MBA test works around this very same problem by using a custom static variable (runs_per_allocation) which is reset to 0 every NUM_OF_RUNS tests and not keeping ->num_of_runs at all. If MBA test would replace runs_per_allocation with use of ->num_of_runs, it would match what the new CAT test does.

Nothing currently relies on ->num_of_runs counting across the different "tests" that are run inside CAT and MBA tests. And I don't have anything immediately around the corner that would require ->num_of_runs to count total number of repetitions that were ran.

I guess it would be possible to attempt to consolidate that second layer MBA and the rewritten CAT tests need somehow into resctrl_val_param. But IMHO that too is low-prio refactor as nothing is broken as is.

...

I was expecting this mask manipulation to be in cat_val() so that it is clear how test works instead of part of the logic handled here.

That seems to be moving into opposite direction from how things are currently handled. Doing it in cat_val() would be relying less on ->setup(). If that's the preferred direction, then the question becomes, should CAT test do anything in ->setup() because also the schemata writing could be done in directly cat_val().

What I would prefer not to do is to have a rule which says: if there's a test-specific function, don't use ->setup() but do any setup directly in the test-specific function but, otherwise use ->setup(). Such an inconsistency would make things hard to track.

...

...
}

if (p->num_of_runs == 0) {
sprintf(schemata, "%lx", p->mask);
ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no,
		     p->resctrl_val);
snprintf(schemata, sizeof(schemata), "%lx", p->mask & ~current_mask);
ret = write_schemata("", schemata, p->cpu_no, p->resctrl_val);
if (ret)
	return ret;
snprintf(schemata, sizeof(schemata), "%lx", current_mask);
ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no, p->resctrl_val);
if (ret)
	return ret;
} p->num_of_runs++;
...

...
@@ -126,7 +162,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) ret = get_mask_no_shareable(cache_type, &long_mask); if (ret) return ret;

count_of_bits = count_consecutive_bits(long_mask, NULL);

count_of_bits = count_consecutive_bits(long_mask, &start);

/* Get L3/L2 cache size */ ret = get_cache_size(cpu_no, cache_type, &cache_size); @@ -143,99 +179,29 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) count_of_bits - 1); return -1; }

/* Get core id from same socket for running another thread */

sibling_cpu_no = get_core_sibling(cpu_no);

Do any users of get_core_sibling() remain after this?

Correct observation, there seems to be no other users after this is removed.

-- i.

Reinette Chatre

11:35 p.m.

New subject: [PATCH v2 24/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

Hi Ilpo,

On 4/26/2023 6:58 AM, Ilpo Järvinen wrote:

...

On Fri, 21 Apr 2023, Reinette Chatre wrote:

...
On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:

...

...
...
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 4b505fdb35d7..85053829b9c5 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -11,11 +11,12 @@ #include "resctrl.h" #include <unistd.h> -#define RESULT_FILE_NAME1 "result_cat1" -#define RESULT_FILE_NAME2 "result_cat2" -#define NUM_OF_RUNS 5 -#define MAX_DIFF_PERCENT 4 -#define MAX_DIFF 1000000 +#define RESULT_FILE_NAME "result_cat" +#define NUM_OF_RUNS 5 +#define MIN_DIFF_PERCENT_PER_BIT 2

Could you please start a new trend that adds documentation that explains what this constant means and how it was chosen?

I can try although that particular 2 was a bit handwavy that just seems to work with the tests I performed.

The changelog claims that the existing CAT test does not work with this new test offered as replacement. Considering that I do think it is important to have confidence that this test is able to test CAT. The words "handwave" and "seems to work" are red flags to me. When merged, these tests will be run on a variety of platforms with various configurations. Using test criteria based on measurements from one particular system may work but there needs to be confidence that the criteria maps to all systems these tests will be run on.

...

...
...
+static unsigned long current_mask; +static long prev_avg_llc_val; /*

Change schemata. Write schemata to specified

@@ -28,13 +29,24 @@ static int cat_setup(struct resctrl_val_param *p) int ret = 0; /* Run NUM_OF_RUNS times */
if (p->num_of_runs >= NUM_OF_RUNS)
return END_OF_TESTS;
if (p->num_of_runs >= NUM_OF_RUNS) {
/* Remove one bit from the consecutive block */
current_mask &= current_mask >> 1;
if (!current_mask)
	return END_OF_TESTS;
p->num_of_runs = 0;
This seems like a workaround to get the schemata to be written. It is problematic since now p->num_of_runs no longer accurately reflects the number of test runs.
This is already the case. MBA test works around this very same problem by using a custom static variable (runs_per_allocation) which is reset to 0 every NUM_OF_RUNS tests and not keeping ->num_of_runs at all. If MBA test would replace runs_per_allocation with use of ->num_of_runs, it would match what the new CAT test does.

Nothing currently relies on ->num_of_runs counting across the different "tests" that are run inside CAT and MBA tests. And I don't have anything immediately around the corner that would require ->num_of_runs to count total number of repetitions that were ran.

I guess it would be possible to attempt to consolidate that second layer MBA and the rewritten CAT tests need somehow into resctrl_val_param. But IMHO that too is low-prio refactor as nothing is broken as is.

I do not think that I would use any of the other tests as reference since all the other tests rely on the same wrapper (resctrl_val()) by providing it their own customization (via aptly named ... struct resctrl_val_param). The CAT test is already unique by _not_ using resctrl_val() but its own test. I do not see why those resctrl_val() customization need to propagate to the CAT test if it is not using the wrapper to begin with.

...

...
I was expecting this mask manipulation to be in cat_val() so that it is clear how test works instead of part of the logic handled here.

That seems to be moving into opposite direction from how things are currently handled. Doing it in cat_val() would be relying less on ->setup(). If that's the preferred direction, then the question becomes, should CAT test do anything in ->setup() because also the schemata writing could be done in directly cat_val().

What I would prefer not to do is to have a rule which says: if there's a test-specific function, don't use ->setup() but do any setup directly in the test-specific function but, otherwise use ->setup(). Such an inconsistency would make things hard to track.

The test specific function can still call a setup function but it can be done directly instead of via "struct resctrl_val_param". The test specific function already transitioned away from using resctrl_val(), it is not clear to me why there should be rules about how function pointers within "struct resctrl_val_param" should be used or indeed why "struct resctrl_val_param" should be used at all.

Reinette

Ilpo Järvinen

27 Apr 27 Apr

8:04 a.m.

New subject: [PATCH v2 24/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

On Wed, 26 Apr 2023, Reinette Chatre wrote:

...

On 4/26/2023 6:58 AM, Ilpo Järvinen wrote:

...
On Fri, 21 Apr 2023, Reinette Chatre wrote:

...
On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:

...

...
...
...
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 4b505fdb35d7..85053829b9c5 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -11,11 +11,12 @@ #include "resctrl.h" #include <unistd.h> -#define RESULT_FILE_NAME1 "result_cat1" -#define RESULT_FILE_NAME2 "result_cat2" -#define NUM_OF_RUNS 5 -#define MAX_DIFF_PERCENT 4 -#define MAX_DIFF 1000000 +#define RESULT_FILE_NAME "result_cat" +#define NUM_OF_RUNS 5 +#define MIN_DIFF_PERCENT_PER_BIT 2

Could you please start a new trend that adds documentation that explains what this constant means and how it was chosen?

I can try although that particular 2 was a bit handwavy that just seems to work with the tests I performed.

The changelog claims that the existing CAT test does not work with this new test offered as replacement. Considering that I do think it is important to have confidence that this test is able to test CAT. The words "handwave" and "seems to work" are red flags to me. When merged, these tests will be run on a variety of platforms with various configurations. Using test criteria based on measurements from one particular system may work but there needs to be confidence that the criteria maps to all systems these tests will be run on.

My "tests" (in plural) were not limited to one particular system but included systems from different generations.

...

...
...
...
+static unsigned long current_mask; +static long prev_avg_llc_val; /*

Change schemata. Write schemata to specified

@@ -28,13 +29,24 @@ static int cat_setup(struct resctrl_val_param *p) int ret = 0; /* Run NUM_OF_RUNS times */
if (p->num_of_runs >= NUM_OF_RUNS)
return END_OF_TESTS;
if (p->num_of_runs >= NUM_OF_RUNS) {
/* Remove one bit from the consecutive block */
current_mask &= current_mask >> 1;
if (!current_mask)
	return END_OF_TESTS;
p->num_of_runs = 0;
This seems like a workaround to get the schemata to be written. It is problematic since now p->num_of_runs no longer accurately reflects the number of test runs.
This is already the case. MBA test works around this very same problem by using a custom static variable (runs_per_allocation) which is reset to 0 every NUM_OF_RUNS tests and not keeping ->num_of_runs at all. If MBA test would replace runs_per_allocation with use of ->num_of_runs, it would match what the new CAT test does.

Nothing currently relies on ->num_of_runs counting across the different "tests" that are run inside CAT and MBA tests. And I don't have anything immediately around the corner that would require ->num_of_runs to count total number of repetitions that were ran.

I guess it would be possible to attempt to consolidate that second layer MBA and the rewritten CAT tests need somehow into resctrl_val_param. But IMHO that too is low-prio refactor as nothing is broken as is.
I do not think that I would use any of the other tests as reference since all the other tests rely on the same wrapper (resctrl_val()) by providing it their own customization (via aptly named ... struct resctrl_val_param).

Oh, I see. I never made the connection to the function name before this. (To be honest, it's pretty stupid name for that particular function, given what the function does, but that's an entirely separate issue.)

-- i. > The CAT test is already unique by _not_ using resctrl_val() but its > own test. I do not see why those resctrl_val() customization need to > propagate to the CAT test if it is not using the wrapper to begin with. > > > > >> I was expecting this mask manipulation to be > >> in cat_val() so that it is clear how test works instead of part > >> of the logic handled here. > > > > That seems to be moving into opposite direction from how things are > > currently handled. Doing it in cat_val() would be relying less on > > ->setup(). If that's the preferred direction, then the question becomes, > > should CAT test do anything in ->setup() because also the schemata > > writing could be done in directly cat_val(). > > > > What I would prefer not to do is to have a rule which says: if there's a > > test-specific function, don't use ->setup() but do any setup directly > > in the test-specific function but, otherwise use ->setup(). Such an > > inconsistency would make things hard to track. > > The test specific function can still call a setup function but it > can be done directly instead of via "struct resctrl_val_param". The > test specific function already transitioned away from using resctrl_val(), > it is not clear to me why there should be rules about how > function pointers within "struct resctrl_val_param" should be used or > indeed why "struct resctrl_val_param" should be used at all. > > Reinette >

Reinette Chatre

3:15 p.m.

New subject: [PATCH v2 24/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

Hi Ilpo,

On 4/27/2023 1:04 AM, Ilpo Järvinen wrote:

...

On Wed, 26 Apr 2023, Reinette Chatre wrote:

...
On 4/26/2023 6:58 AM, Ilpo Järvinen wrote:

...
On Fri, 21 Apr 2023, Reinette Chatre wrote:

...
On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:

...

...
...
...
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 4b505fdb35d7..85053829b9c5 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -11,11 +11,12 @@ #include "resctrl.h" #include <unistd.h> -#define RESULT_FILE_NAME1 "result_cat1" -#define RESULT_FILE_NAME2 "result_cat2" -#define NUM_OF_RUNS 5 -#define MAX_DIFF_PERCENT 4 -#define MAX_DIFF 1000000 +#define RESULT_FILE_NAME "result_cat" +#define NUM_OF_RUNS 5 +#define MIN_DIFF_PERCENT_PER_BIT 2

Could you please start a new trend that adds documentation that explains what this constant means and how it was chosen?

I can try although that particular 2 was a bit handwavy that just seems to work with the tests I performed.

The changelog claims that the existing CAT test does not work with this new test offered as replacement. Considering that I do think it is important to have confidence that this test is able to test CAT. The words "handwave" and "seems to work" are red flags to me. When merged, these tests will be run on a variety of platforms with various configurations. Using test criteria based on measurements from one particular system may work but there needs to be confidence that the criteria maps to all systems these tests will be run on.

My "tests" (in plural) were not limited to one particular system but included systems from different generations.

Thank you very much for your thorough testing. Having this information accompany this change will surely help to increase confidence in the value chosen.

Thank you very much

Reinette

927

days inactive

986

days old

linux-kselftest-mirror@lists.linaro.org

71 comments

participants

tags (0)

participants (3)

Ilpo Järvinen
Reinette Chatre
Shaopeng Tan (Fujitsu)