Changes v4: - Printing SNC warnings at the start of every test. - Printing SNC warnings at the end of every relevant test. - Remove global snc_mode variable, consolidate snc detection functions into one. - Correct minor mistakes.
Changes v3: - Reworked patch 2. - Changed minor things in patch 1 like function name and made corrections to the patch message.
Changes v2: - Removed patches 2 and 3 since now this part will be supported by the kernel.
Sub-Numa Clustering (SNC) allows splitting CPU cores, caches and memory into multiple NUMA nodes. When enabled, NUMA-aware applications can achieve better performance on bigger server platforms.
SNC support in the kernel was merged into x86/cache [1]. With SNC enabled and kernel support in place all the tests will function normally (aside from effective cache size). There might be a problem when SNC is enabled but the system is still using an older kernel version without SNC support. Currently the only message displayed in that situation is a guess that SNC might be enabled and is causing issues. That message also is displayed whenever the test fails on an Intel platform.
Add a mechanism to discover kernel support for SNC which will add more meaning and certainty to the error message.
Add runtime SNC mode detection and verify how reliable that information is.
Series was tested on Ice Lake server platforms with SNC disabled, SNC-2 and SNC-4. The tests were also ran with and without kernel support for SNC.
Series applies cleanly on kselftest/next.
[1] https://lore.kernel.org/all/20240628215619.76401-1-tony.luck@intel.com/
Previous versions of this series: [v1] https://lore.kernel.org/all/cover.1709721159.git.maciej.wieczor-retman@intel... [v2] https://lore.kernel.org/all/cover.1715769576.git.maciej.wieczor-retman@intel... [v3] https://lore.kernel.org/all/cover.1719842207.git.maciej.wieczor-retman@intel...
Maciej Wieczor-Retman (2): selftests/resctrl: Adjust effective L3 cache size with SNC enabled selftests/resctrl: Adjust SNC support messages
tools/testing/selftests/resctrl/cat_test.c | 8 ++ tools/testing/selftests/resctrl/cmt_test.c | 10 +- tools/testing/selftests/resctrl/mba_test.c | 7 + tools/testing/selftests/resctrl/mbm_test.c | 9 +- tools/testing/selftests/resctrl/resctrl.h | 7 + .../testing/selftests/resctrl/resctrl_tests.c | 8 +- tools/testing/selftests/resctrl/resctrlfs.c | 130 ++++++++++++++++++ 7 files changed, 174 insertions(+), 5 deletions(-)
Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA nodes. Systems may support splitting into either two, three or four nodes.
When SNC mode is enabled the effective amount of L3 cache available for allocation is divided by the number of nodes per L3.
Detect which SNC mode is active by comparing the number of CPUs that share a cache with CPU0, with the number of CPUs on node0.
Signed-off-by: Tony Luck tony.luck@intel.com Co-developed-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com --- Changelog v4: - Make returned value a static local variable so the function only runs the logic once. (Reinette)
Changelog v3: - Add comparison between present and online cpus to test if the calculated SNC mode is credible. (Reinette) - Added comment to cache size modification to better explain why it is needed there. (Reinette) - Fix facts in patch message. (Reinette) - Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette)
tools/testing/selftests/resctrl/resctrl.h | 4 ++ tools/testing/selftests/resctrl/resctrlfs.c | 73 +++++++++++++++++++++ 2 files changed, 77 insertions(+)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 2dda56084588..851b37c9c38a 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -11,6 +11,7 @@ #include <signal.h> #include <dirent.h> #include <stdbool.h> +#include <ctype.h> #include <sys/stat.h> #include <sys/ioctl.h> #include <sys/mount.h> @@ -43,6 +44,8 @@
#define DEFAULT_SPAN (250 * MB)
+#define MAX_SNC 4 + /* * user_params: User supplied parameters * @cpu: CPU number to which the benchmark will be bound to @@ -120,6 +123,7 @@ extern volatile int *value_sink;
extern char llc_occup_path[1024];
+int snc_nodes_per_l3_cache(void); int get_vendor(void); bool check_resctrlfs_support(void); int filter_dmesg(void); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 250c320349a7..803dd415984c 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -156,6 +156,68 @@ int get_domain_id(const char *resource, int cpu_no, int *domain_id) return 0; }
+/* + * Count number of CPUs in a /sys bit map + */ +static unsigned int count_sys_bitmap_bits(char *name) +{ + FILE *fp = fopen(name, "r"); + int count = 0, c; + + if (!fp) + return 0; + + while ((c = fgetc(fp)) != EOF) { + if (!isxdigit(c)) + continue; + switch (c) { + case 'f': + count++; + case '7': case 'b': case 'd': case 'e': + count++; + case '3': case '5': case '6': case '9': case 'a': case 'c': + count++; + case '1': case '2': case '4': case '8': + count++; + } + } + fclose(fp); + + return count; +} + +/* + * Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0. + * If some CPUs are offline the numbers may not be exact multiples of each + * other. Any offline CPUs on node0 will be also gone from shared_cpu_map of + * CPU0 but offline CPUs from other nodes will only make the cache_cpus value + * lower. Still try to get the ratio right by preventing the second possibility. + */ +int snc_nodes_per_l3_cache(void) +{ + int node_cpus, cache_cpus, i; + static int snc_mode; + + if (!snc_mode) { + node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap"); + cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map"); + + if (!node_cpus || !cache_cpus) { + ksft_print_msg("Could not determine Sub-NUMA Cluster mode.\n"); + return 1; + } + + for (i = 1; i <= MAX_SNC ; i++) { + if (i * node_cpus >= cache_cpus) { + snc_mode = i; + break; + } + } + } + + return snc_mode; +} + /* * get_cache_size - Get cache size for a specified CPU * @cpu_no: CPU number @@ -211,6 +273,17 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size break; }
+ /* + * The amount of cache represented by each bit in the masks + * in the schemata file is reduced by a factor equal to SNC + * nodes per L3 cache. + * E.g. on a SNC-2 system with a 100MB L3 cache a test that + * allocates memory from its local SNC node (default behavior + * without using libnuma) will only see 50 MB llc_occupancy + * with a fully populated L3 mask in the schemata file. + */ + if (cache_num == 3) + *cache_size /= snc_nodes_per_l3_cache(); return 0; }
Resctrl selftest prints a message on test failure that Sub-Numa Clustering (SNC) could be enabled and points the user to check their BIOS settings. No actual check is performed before printing that message so it is not very accurate in pinpointing a problem.
Figuring out if SNC is enabled is only one part of the problem, the others being whether the detected SNC mode is reliable and whether the kernel supports SNC in resctrl.
When there is SNC support for kernel's resctrl subsystem and SNC is enabled then sub node files are created for each node in the resctrlfs. The sub node files exist in each regular node's L3 monitoring directory. The reliable path to check for existence of sub node files is /sys/fs/resctrl/mon_data/mon_L3_00/mon_sub_L3_00.
To check if SNC detection is reliable one can check the /sys/devices/system/cpu/offline file. If it's empty, it means all cores are operational and the ratio should be calculated correctly. If it has any contents, it means the detected SNC mode can't be trusted and should be disabled.
Add helpers for all operations mentioned above.
Detect SNC mode once and let other tests inherit that information.
Add messages to alert the user when SNC detection could return incorrect results. Correct old messages to account for kernel support of SNC in resctrl.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com --- Changelog v4: - Change messages at the end of tests and at the start of run_single_test. (Reinette) - Add messages at the end of CAT since it can also fail due to enabled SNC + lack of kernel support. - Remove snc_mode global variable. (Reinette) - Fix wrong description of snc_kernel_support(). (Reinette) - Move call to cpus_offline_empty() into snc_nodes_per_l3_cache() so the whole detection flow is in one place as discussed. (Reinette)
Changelog v3: - Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette) - Add printing the discovered SNC mode. (Reinette) - Change method of kernel support discovery from cache sizes to existance of sub node files. - Check if SNC detection is unreliable. - Move SNC detection to only the first run_single_test() instead on error at the end of test runs. - Add global value to remind user at the end of relevant tests if SNC detection was found to be unreliable. - Redo the patch message after the changes.
Changelog v2: - Move snc_ways() checks from individual tests into snc_kernel_support(). - Write better comment for snc_kernel_support().
tools/testing/selftests/resctrl/cat_test.c | 8 +++ tools/testing/selftests/resctrl/cmt_test.c | 10 +++- tools/testing/selftests/resctrl/mba_test.c | 7 +++ tools/testing/selftests/resctrl/mbm_test.c | 9 ++- tools/testing/selftests/resctrl/resctrl.h | 3 + .../testing/selftests/resctrl/resctrl_tests.c | 8 ++- tools/testing/selftests/resctrl/resctrlfs.c | 57 +++++++++++++++++++ 7 files changed, 97 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index d4dffc934bc3..a8bb49f56755 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -285,6 +285,14 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
ret = check_results(¶m, test->resource, cache_total_size, full_cache_mask, start_mask); + if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support()) + ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n"); + + if ((get_vendor() == ARCH_INTEL) && snc_unreliable) { + ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n"); + ksft_print_msg("Intel CAT may be inaccurate.\n"); + } + return ret; }
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 0c045080d808..471e134face0 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -175,8 +175,14 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param goto out;
ret = check_results(¶m, span, n); - if (ret && (get_vendor() == ARCH_INTEL)) - ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n"); + if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support()) + ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n"); + + if ((get_vendor() == ARCH_INTEL) && snc_unreliable) { + ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n"); + ksft_print_msg("Intel CMT may be inaccurate.\n"); + } +
out: free(span_str); diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c index ab8496a4925b..a805c14fe04b 100644 --- a/tools/testing/selftests/resctrl/mba_test.c +++ b/tools/testing/selftests/resctrl/mba_test.c @@ -179,6 +179,13 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param return ret;
ret = check_results(); + if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support()) + ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n"); + + if ((get_vendor() == ARCH_INTEL) && snc_unreliable) { + ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n"); + ksft_print_msg("Intel MBA may be inaccurate.\n"); + }
return ret; } diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c index 6b5a3b52d861..ce3c86989f8b 100644 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -147,8 +147,13 @@ static int mbm_run_test(const struct resctrl_test *test, const struct user_param return ret;
ret = check_results(DEFAULT_SPAN); - if (ret && (get_vendor() == ARCH_INTEL)) - ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n"); + if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support()) + ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n"); + + if ((get_vendor() == ARCH_INTEL) && snc_unreliable) { + ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n"); + ksft_print_msg("Intel MBM may be inaccurate.\n"); + }
return ret; } diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 851b37c9c38a..488bdca01e4f 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -121,6 +121,8 @@ struct perf_event_read { */ extern volatile int *value_sink;
+extern int snc_unreliable; + extern char llc_occup_path[1024];
int snc_nodes_per_l3_cache(void); @@ -167,6 +169,7 @@ void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(const struct resctrl_test *test); void signal_handler_unregister(void); unsigned int count_bits(unsigned long n); +int snc_kernel_support(void);
void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config); void perf_event_initialize_read_format(struct perf_event_read *pe_read); diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index ecbb7605a981..4b84d6199a36 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -118,11 +118,17 @@ static bool test_vendor_specific_check(const struct resctrl_test *test)
static void run_single_test(const struct resctrl_test *test, const struct user_params *uparams) { - int ret; + int ret, snc_mode;
if (test->disabled) return;
+ snc_mode = snc_nodes_per_l3_cache(); + if (snc_mode > 1) + ksft_print_msg("SNC-%d mode discovered.\n", snc_mode); + else if (snc_unreliable) + ksft_print_msg("SNC detection unreliable due to offline CPUs. Test results may not be accurate if SNC enabled.\n"); + if (!test_vendor_specific_check(test)) { ksft_test_result_skip("Hardware does not support %s\n", test->name); return; diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 803dd415984c..4d0dbb332b8f 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -13,6 +13,8 @@
#include "resctrl.h"
+int snc_unreliable; + static int find_resctrl_mount(char *buffer) { FILE *mounts; @@ -186,6 +188,25 @@ static unsigned int count_sys_bitmap_bits(char *name) return count; }
+static bool cpus_offline_empty(void) +{ + char offline_cpus_str[64]; + FILE *fp; + + fp = fopen("/sys/devices/system/cpu/offline", "r"); + if (fscanf(fp, "%s", offline_cpus_str) < 0) { + if (!errno) { + fclose(fp); + return 1; + } + ksft_perror("Could not read offline CPUs file!"); + } + + fclose(fp); + + return 0; +} + /* * Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0. * If some CPUs are offline the numbers may not be exact multiples of each @@ -199,6 +220,13 @@ int snc_nodes_per_l3_cache(void) static int snc_mode;
if (!snc_mode) { + if (!cpus_offline_empty()) { + ksft_print_msg("Runtime SNC detection unreliable due to offline CPUs.\n"); + ksft_print_msg("Setting SNC mode to disabled.\n"); + snc_mode = 1; + snc_unreliable = 1; + return snc_mode; + } node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap"); cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
@@ -942,3 +970,32 @@ unsigned int count_bits(unsigned long n)
return count; } + +/** + * snc_kernel_support - Check for existence of mon_sub_L3_00 file that indicates + * SNC resctrl support on the kernel side. + * + * Return: 0 if not supported, 1 if SNC is disabled or SNC is both enabled and + * supported. + */ +int snc_kernel_support(void) +{ + char node_path[PATH_MAX]; + struct stat statbuf; + int ret; + + ret = snc_nodes_per_l3_cache(); + /* + * If SNC is disabled then its kernel support isn't important. + */ + if (ret == 1) + return ret; + + snprintf(node_path, sizeof(node_path), "%s/%s/%s", RESCTRL_PATH, "mon_data", + "mon_L3_00/mon_sub_L3_00"); + + if (!stat(node_path, &statbuf)) + return 1; + + return 0; +}
Hi Maciej,
On 7/12/24 2:04 AM, Maciej Wieczor-Retman wrote:
Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA nodes. Systems may support splitting into either two, three or four nodes.
When SNC mode is enabled the effective amount of L3 cache available for allocation is divided by the number of nodes per L3.
Detect which SNC mode is active by comparing the number of CPUs that share a cache with CPU0, with the number of CPUs on node0.
Signed-off-by: Tony Luck tony.luck@intel.com Co-developed-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Since you are "From:" author there is no need for a "Co-developed-by" for you. Tony does need one. Please check: "When to use Acked-by:, Cc:, and Co-developed-by:" in Documentation/process/submitting-patches.rst (checkpatch.pl also warns about this).
Changelog v4:
- Make returned value a static local variable so the function only runs the logic once. (Reinette)
Changelog v3:
Add comparison between present and online cpus to test if the calculated SNC mode is credible. (Reinette)
Added comment to cache size modification to better explain why it is needed there. (Reinette)
Fix facts in patch message. (Reinette)
Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette)
tools/testing/selftests/resctrl/resctrl.h | 4 ++ tools/testing/selftests/resctrl/resctrlfs.c | 73 +++++++++++++++++++++ 2 files changed, 77 insertions(+)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 2dda56084588..851b37c9c38a 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -11,6 +11,7 @@ #include <signal.h> #include <dirent.h> #include <stdbool.h> +#include <ctype.h> #include <sys/stat.h> #include <sys/ioctl.h> #include <sys/mount.h> @@ -43,6 +44,8 @@ #define DEFAULT_SPAN (250 * MB) +#define MAX_SNC 4
- /*
- user_params: User supplied parameters
- @cpu: CPU number to which the benchmark will be bound to
@@ -120,6 +123,7 @@ extern volatile int *value_sink; extern char llc_occup_path[1024]; +int snc_nodes_per_l3_cache(void); int get_vendor(void); bool check_resctrlfs_support(void); int filter_dmesg(void); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 250c320349a7..803dd415984c 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -156,6 +156,68 @@ int get_domain_id(const char *resource, int cpu_no, int *domain_id) return 0; } +/*
- Count number of CPUs in a /sys bit map
"bit map" -> "bitmap"
- */
+static unsigned int count_sys_bitmap_bits(char *name) +{
- FILE *fp = fopen(name, "r");
- int count = 0, c;
- if (!fp)
return 0;
- while ((c = fgetc(fp)) != EOF) {
if (!isxdigit(c))
continue;
switch (c) {
case 'f':
count++;
case '7': case 'b': case 'd': case 'e':
count++;
case '3': case '5': case '6': case '9': case 'a': case 'c':
count++;
case '1': case '2': case '4': case '8':
count++;
}
- }
- fclose(fp);
- return count;
+}
+/*
- Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0.
- If some CPUs are offline the numbers may not be exact multiples of each
- other. Any offline CPUs on node0 will be also gone from shared_cpu_map of
- CPU0 but offline CPUs from other nodes will only make the cache_cpus value
- lower. Still try to get the ratio right by preventing the second possibility.
This all seems unnecessary since the next patch sets snc_mode to 1 if there are any offline CPUs. Seems more appropriate to move the offline CPU handling to this patch.
- */
+int snc_nodes_per_l3_cache(void) +{
- int node_cpus, cache_cpus, i;
- static int snc_mode;
- if (!snc_mode) {
node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap");
cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
if (!node_cpus || !cache_cpus) {
ksft_print_msg("Could not determine Sub-NUMA Cluster mode.\n");
return 1;
}
for (i = 1; i <= MAX_SNC ; i++) {
(nit: unnecessary space)
if (i * node_cpus >= cache_cpus) {
snc_mode = i;
break;
}
This is irrelevant after the subsequent patch but note that there are scenarios where above loop cannot set snc_mode and the function will thus return 0 since snc_mode is not initialized. This is a problem in division done by following hunk.
}
- }
- return snc_mode;
+}
- /*
- get_cache_size - Get cache size for a specified CPU
- @cpu_no: CPU number
@@ -211,6 +273,17 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size break; }
- /*
* The amount of cache represented by each bit in the masks
* in the schemata file is reduced by a factor equal to SNC
* nodes per L3 cache.
* E.g. on a SNC-2 system with a 100MB L3 cache a test that
* allocates memory from its local SNC node (default behavior
* without using libnuma) will only see 50 MB llc_occupancy
* with a fully populated L3 mask in the schemata file.
*/
- if (cache_num == 3)
return 0; }*cache_size /= snc_nodes_per_l3_cache();
Reinette
Hi Maciej,
On 7/12/24 2:04 AM, Maciej Wieczor-Retman wrote:
Resctrl selftest prints a message on test failure that Sub-Numa Clustering (SNC) could be enabled and points the user to check their BIOS settings. No actual check is performed before printing that message so it is not very accurate in pinpointing a problem.
Figuring out if SNC is enabled is only one part of the problem, the others being whether the detected SNC mode is reliable and whether the kernel supports SNC in resctrl.
When there is SNC support for kernel's resctrl subsystem and SNC is enabled then sub node files are created for each node in the resctrlfs. The sub node files exist in each regular node's L3 monitoring directory. The reliable path to check for existence of sub node files is /sys/fs/resctrl/mon_data/mon_L3_00/mon_sub_L3_00.
To check if SNC detection is reliable one can check the /sys/devices/system/cpu/offline file. If it's empty, it means all cores are operational and the ratio should be calculated correctly. If it has any contents, it means the detected SNC mode can't be trusted and should be disabled.
This belongs in previous patch.
Add helpers for all operations mentioned above.
Detect SNC mode once and let other tests inherit that information.
This belongs to previous patch.
Add messages to alert the user when SNC detection could return incorrect results. Correct old messages to account for kernel support of SNC in resctrl.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v4:
- Change messages at the end of tests and at the start of run_single_test. (Reinette)
- Add messages at the end of CAT since it can also fail due to enabled SNC + lack of kernel support.
- Remove snc_mode global variable. (Reinette)
- Fix wrong description of snc_kernel_support(). (Reinette)
- Move call to cpus_offline_empty() into snc_nodes_per_l3_cache() so the whole detection flow is in one place as discussed. (Reinette)
Changelog v3:
- Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette)
- Add printing the discovered SNC mode. (Reinette)
- Change method of kernel support discovery from cache sizes to existance of sub node files.
- Check if SNC detection is unreliable.
- Move SNC detection to only the first run_single_test() instead on error at the end of test runs.
- Add global value to remind user at the end of relevant tests if SNC detection was found to be unreliable.
- Redo the patch message after the changes.
Changelog v2:
Move snc_ways() checks from individual tests into snc_kernel_support().
Write better comment for snc_kernel_support().
tools/testing/selftests/resctrl/cat_test.c | 8 +++ tools/testing/selftests/resctrl/cmt_test.c | 10 +++- tools/testing/selftests/resctrl/mba_test.c | 7 +++ tools/testing/selftests/resctrl/mbm_test.c | 9 ++- tools/testing/selftests/resctrl/resctrl.h | 3 + .../testing/selftests/resctrl/resctrl_tests.c | 8 ++- tools/testing/selftests/resctrl/resctrlfs.c | 57 +++++++++++++++++++ 7 files changed, 97 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index d4dffc934bc3..a8bb49f56755 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -285,6 +285,14 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param ret = check_results(¶m, test->resource, cache_total_size, full_cache_mask, start_mask);
- if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
The kernel support only applies to monitoring, there is no kernel support/changes related to CAT when SNC is enabled.
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel CAT may be inaccurate.\n");
- }
This is still relevant but unclear why previous message checked "ret" but above does not.
- return ret; }
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 0c045080d808..471e134face0 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -175,8 +175,14 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param goto out; ret = check_results(¶m, span, n);
- if (ret && (get_vendor() == ARCH_INTEL))
ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
- if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel CMT may be inaccurate.\n");
- }
CMT may be inaccurate in both scenarios (no kernel support or unreliable detection). Why only check "ret" in case there is no kernel support?
out: free(span_str); diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c index ab8496a4925b..a805c14fe04b 100644 --- a/tools/testing/selftests/resctrl/mba_test.c +++ b/tools/testing/selftests/resctrl/mba_test.c @@ -179,6 +179,13 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param return ret; ret = check_results();
- if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel MBA may be inaccurate.\n");
- }
As I understand there is no change to MBA when SNC is enabled. These additions thus seem unnecessary.
return ret; } diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c index 6b5a3b52d861..ce3c86989f8b 100644 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -147,8 +147,13 @@ static int mbm_run_test(const struct resctrl_test *test, const struct user_param return ret; ret = check_results(DEFAULT_SPAN);
- if (ret && (get_vendor() == ARCH_INTEL))
ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
- if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel MBM may be inaccurate.\n");
- }
return ret; } diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 851b37c9c38a..488bdca01e4f 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -121,6 +121,8 @@ struct perf_event_read { */ extern volatile int *value_sink; +extern int snc_unreliable;
- extern char llc_occup_path[1024];
int snc_nodes_per_l3_cache(void); @@ -167,6 +169,7 @@ void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(const struct resctrl_test *test); void signal_handler_unregister(void); unsigned int count_bits(unsigned long n); +int snc_kernel_support(void); void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config); void perf_event_initialize_read_format(struct perf_event_read *pe_read); diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index ecbb7605a981..4b84d6199a36 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -118,11 +118,17 @@ static bool test_vendor_specific_check(const struct resctrl_test *test) static void run_single_test(const struct resctrl_test *test, const struct user_params *uparams) {
- int ret;
- int ret, snc_mode;
if (test->disabled) return;
- snc_mode = snc_nodes_per_l3_cache();
- if (snc_mode > 1)
ksft_print_msg("SNC-%d mode discovered.\n", snc_mode);
- else if (snc_unreliable)
ksft_print_msg("SNC detection unreliable due to offline CPUs. Test results may not be accurate if SNC enabled.\n");
- if (!test_vendor_specific_check(test)) { ksft_test_result_skip("Hardware does not support %s\n", test->name); return;
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 803dd415984c..4d0dbb332b8f 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -13,6 +13,8 @@ #include "resctrl.h" +int snc_unreliable;
- static int find_resctrl_mount(char *buffer) { FILE *mounts;
@@ -186,6 +188,25 @@ static unsigned int count_sys_bitmap_bits(char *name) return count; } +static bool cpus_offline_empty(void) +{
- char offline_cpus_str[64];
- FILE *fp;
- fp = fopen("/sys/devices/system/cpu/offline", "r");
- if (fscanf(fp, "%s", offline_cpus_str) < 0) {
if (!errno) {
fclose(fp);
return 1;
}
ksft_perror("Could not read offline CPUs file!");
No need to scream.
I think it will be more useful to replace "offline CPUs file" with specific "/sys/devices/system/cpu/offline".
- }
- fclose(fp);
- return 0;
+}
- /*
- Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0.
- If some CPUs are offline the numbers may not be exact multiples of each
@@ -199,6 +220,13 @@ int snc_nodes_per_l3_cache(void) static int snc_mode; if (!snc_mode) {
if (!cpus_offline_empty()) {
ksft_print_msg("Runtime SNC detection unreliable due to offline CPUs.\n");
ksft_print_msg("Setting SNC mode to disabled.\n");
snc_mode = 1;
snc_unreliable = 1;
return snc_mode;
}
This can be moved to previous patch and that for loop simplified/removed.
node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap"); cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
@@ -942,3 +970,32 @@ unsigned int count_bits(unsigned long n) return count; }
+/**
- snc_kernel_support - Check for existence of mon_sub_L3_00 file that indicates
- SNC resctrl support on the kernel side.
- Return: 0 if not supported, 1 if SNC is disabled or SNC is both enabled and
- supported.
- */
+int snc_kernel_support(void) +{
- char node_path[PATH_MAX];
- struct stat statbuf;
- int ret;
- ret = snc_nodes_per_l3_cache();
- /*
* If SNC is disabled then its kernel support isn't important.
Please expand comment that this does not take unreliable SNC detection into account and needs to be done separately.
*/
- if (ret == 1)
return ret;
- snprintf(node_path, sizeof(node_path), "%s/%s/%s", RESCTRL_PATH, "mon_data",
"mon_L3_00/mon_sub_L3_00");
- if (!stat(node_path, &statbuf))
return 1;
- return 0;
+}
Reinette
Hi, thanks for the review,
On 2024-08-12 at 16:40:00 -0700, Reinette Chatre wrote:
Hi Maciej,
On 7/12/24 2:04 AM, Maciej Wieczor-Retman wrote:
Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA nodes. Systems may support splitting into either two, three or four nodes.
When SNC mode is enabled the effective amount of L3 cache available for allocation is divided by the number of nodes per L3.
Detect which SNC mode is active by comparing the number of CPUs that share a cache with CPU0, with the number of CPUs on node0.
Signed-off-by: Tony Luck tony.luck@intel.com Co-developed-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Since you are "From:" author there is no need for a "Co-developed-by" for you. Tony does need one. Please check: "When to use Acked-by:, Cc:, and Co-developed-by:" in Documentation/process/submitting-patches.rst (checkpatch.pl also warns about this).
Thanks, I changed patch's author at some point and I think I forgot to change the tags.
Changelog v4:
- Make returned value a static local variable so the function only runs the logic once. (Reinette)
Changelog v3:
Add comparison between present and online cpus to test if the calculated SNC mode is credible. (Reinette)
Added comment to cache size modification to better explain why it is needed there. (Reinette)
Fix facts in patch message. (Reinette)
Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette)
tools/testing/selftests/resctrl/resctrl.h | 4 ++ tools/testing/selftests/resctrl/resctrlfs.c | 73 +++++++++++++++++++++ 2 files changed, 77 insertions(+)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 2dda56084588..851b37c9c38a 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -11,6 +11,7 @@ #include <signal.h> #include <dirent.h> #include <stdbool.h> +#include <ctype.h> #include <sys/stat.h> #include <sys/ioctl.h> #include <sys/mount.h> @@ -43,6 +44,8 @@ #define DEFAULT_SPAN (250 * MB) +#define MAX_SNC 4
- /*
- user_params: User supplied parameters
- @cpu: CPU number to which the benchmark will be bound to
@@ -120,6 +123,7 @@ extern volatile int *value_sink; extern char llc_occup_path[1024]; +int snc_nodes_per_l3_cache(void); int get_vendor(void); bool check_resctrlfs_support(void); int filter_dmesg(void); diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 250c320349a7..803dd415984c 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -156,6 +156,68 @@ int get_domain_id(const char *resource, int cpu_no, int *domain_id) return 0; } +/*
- Count number of CPUs in a /sys bit map
"bit map" -> "bitmap"
Will do.
- */
+static unsigned int count_sys_bitmap_bits(char *name) +{
- FILE *fp = fopen(name, "r");
- int count = 0, c;
- if (!fp)
return 0;
- while ((c = fgetc(fp)) != EOF) {
if (!isxdigit(c))
continue;
switch (c) {
case 'f':
count++;
case '7': case 'b': case 'd': case 'e':
count++;
case '3': case '5': case '6': case '9': case 'a': case 'c':
count++;
case '1': case '2': case '4': case '8':
count++;
}
- }
- fclose(fp);
- return count;
+}
+/*
- Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0.
- If some CPUs are offline the numbers may not be exact multiples of each
- other. Any offline CPUs on node0 will be also gone from shared_cpu_map of
- CPU0 but offline CPUs from other nodes will only make the cache_cpus value
- lower. Still try to get the ratio right by preventing the second possibility.
This all seems unnecessary since the next patch sets snc_mode to 1 if there are any offline CPUs. Seems more appropriate to move the offline CPU handling to this patch.
Okay, I'll move it here.
- */
+int snc_nodes_per_l3_cache(void) +{
- int node_cpus, cache_cpus, i;
- static int snc_mode;
- if (!snc_mode) {
node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap");
cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
if (!node_cpus || !cache_cpus) {
ksft_print_msg("Could not determine Sub-NUMA Cluster mode.\n");
return 1;
}
for (i = 1; i <= MAX_SNC ; i++) {
(nit: unnecessary space)
Will fix.
if (i * node_cpus >= cache_cpus) {
snc_mode = i;
break;
}
This is irrelevant after the subsequent patch but note that there are scenarios where above loop cannot set snc_mode and the function will thus return 0 since snc_mode is not initialized. This is a problem in division done by following hunk.
Oh right, I'll set initial value to 1.
}
- }
- return snc_mode;
+}
- /*
- get_cache_size - Get cache size for a specified CPU
- @cpu_no: CPU number
@@ -211,6 +273,17 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size break; }
- /*
* The amount of cache represented by each bit in the masks
* in the schemata file is reduced by a factor equal to SNC
* nodes per L3 cache.
* E.g. on a SNC-2 system with a 100MB L3 cache a test that
* allocates memory from its local SNC node (default behavior
* without using libnuma) will only see 50 MB llc_occupancy
* with a fully populated L3 mask in the schemata file.
*/
- if (cache_num == 3)
return 0; }*cache_size /= snc_nodes_per_l3_cache();
Reinette
On 2024-08-12 at 16:40:10 -0700, Reinette Chatre wrote:
Hi Maciej,
On 7/12/24 2:04 AM, Maciej Wieczor-Retman wrote:
Resctrl selftest prints a message on test failure that Sub-Numa Clustering (SNC) could be enabled and points the user to check their BIOS settings. No actual check is performed before printing that message so it is not very accurate in pinpointing a problem.
Figuring out if SNC is enabled is only one part of the problem, the others being whether the detected SNC mode is reliable and whether the kernel supports SNC in resctrl.
When there is SNC support for kernel's resctrl subsystem and SNC is enabled then sub node files are created for each node in the resctrlfs. The sub node files exist in each regular node's L3 monitoring directory. The reliable path to check for existence of sub node files is /sys/fs/resctrl/mon_data/mon_L3_00/mon_sub_L3_00.
To check if SNC detection is reliable one can check the /sys/devices/system/cpu/offline file. If it's empty, it means all cores are operational and the ratio should be calculated correctly. If it has any contents, it means the detected SNC mode can't be trusted and should be disabled.
This belongs in previous patch.
Sure, will change it.
Add helpers for all operations mentioned above.
Detect SNC mode once and let other tests inherit that information.
This belongs to previous patch.
Okay, I can add that too.
Add messages to alert the user when SNC detection could return incorrect results. Correct old messages to account for kernel support of SNC in resctrl.
Signed-off-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com
Changelog v4:
- Change messages at the end of tests and at the start of run_single_test. (Reinette)
- Add messages at the end of CAT since it can also fail due to enabled SNC + lack of kernel support.
- Remove snc_mode global variable. (Reinette)
- Fix wrong description of snc_kernel_support(). (Reinette)
- Move call to cpus_offline_empty() into snc_nodes_per_l3_cache() so the whole detection flow is in one place as discussed. (Reinette)
Changelog v3:
- Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette)
- Add printing the discovered SNC mode. (Reinette)
- Change method of kernel support discovery from cache sizes to existance of sub node files.
- Check if SNC detection is unreliable.
- Move SNC detection to only the first run_single_test() instead on error at the end of test runs.
- Add global value to remind user at the end of relevant tests if SNC detection was found to be unreliable.
- Redo the patch message after the changes.
Changelog v2:
Move snc_ways() checks from individual tests into snc_kernel_support().
Write better comment for snc_kernel_support().
tools/testing/selftests/resctrl/cat_test.c | 8 +++ tools/testing/selftests/resctrl/cmt_test.c | 10 +++- tools/testing/selftests/resctrl/mba_test.c | 7 +++ tools/testing/selftests/resctrl/mbm_test.c | 9 ++- tools/testing/selftests/resctrl/resctrl.h | 3 + .../testing/selftests/resctrl/resctrl_tests.c | 8 ++- tools/testing/selftests/resctrl/resctrlfs.c | 57 +++++++++++++++++++ 7 files changed, 97 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index d4dffc934bc3..a8bb49f56755 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -285,6 +285,14 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param ret = check_results(¶m, test->resource, cache_total_size, full_cache_mask, start_mask);
- if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
The kernel support only applies to monitoring, there is no kernel support/changes related to CAT when SNC is enabled.
I rechecked the CAT test and it seems to indeed be fine. I recall it failed a while ago when there was no kernel support but I guess now it is fixed.
I'll drop the above check.
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel CAT may be inaccurate.\n");
- }
This is still relevant but unclear why previous message checked "ret" but above does not.
The above check tries to explain why a failure happened.
This check is a reminder about a false positive - the test passes but "snc_unreliable" was set. I guess we could make this check to test "!ret"?
- return ret; }
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 0c045080d808..471e134face0 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -175,8 +175,14 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param goto out; ret = check_results(¶m, span, n);
- if (ret && (get_vendor() == ARCH_INTEL))
ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
- if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel CMT may be inaccurate.\n");
- }
CMT may be inaccurate in both scenarios (no kernel support or unreliable detection). Why only check "ret" in case there is no kernel support?
I guess the same thing from above can apply here? Test "!ret"? Perhaps then make this check into "else if ()" instead of just "if" since they will be exclusive?
out: free(span_str); diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c index ab8496a4925b..a805c14fe04b 100644 --- a/tools/testing/selftests/resctrl/mba_test.c +++ b/tools/testing/selftests/resctrl/mba_test.c @@ -179,6 +179,13 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param return ret; ret = check_results();
- if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel MBA may be inaccurate.\n");
- }
As I understand there is no change to MBA when SNC is enabled. These additions thus seem unnecessary.
I just rechecked by installing 6.9 kernel (no SNC kernel support) and using this series selftest. MBA seems to fail in these conditions. I think it is because MBA pulls values from resctrl and from iMC and then compares them. My guess is that iMC works on the halved cache while resctrl (without new SNC support) uses the whole cache. Here is the log from the MBA test I did:
TAP version 13 # Pass: Check kernel supports resctrl filesystem # Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists # resctrl filesystem not mounted # dmesg: [ 11.574208] resctrl: L3 allocation detected # dmesg: [ 11.578421] resctrl: MB allocation detected # dmesg: [ 11.582615] resctrl: L3 monitoring detected 1..1 # SNC-2 mode discovered. # Starting MBA test ... # Mounting resctrl to "/sys/fs/resctrl" # Benchmark PID: 2764 # Writing benchmark parameters to resctrl FS # Write schema "MB:0=100" to resctrl FS ... # Write schema "MB:0=10" to resctrl FS # Results are displayed in (MB) # Fail: Check MBA diff within 8% for schemata 100 # avg_diff_per: 102% # avg_bw_imc: 14221 # avg_bw_resc: 28804 # Fail: Check MBA diff within 8% for schemata 90 # avg_diff_per: 102% # avg_bw_imc: 14146 # avg_bw_resc: 28648 ... # Fail: Check MBA diff within 8% for schemata 20 # avg_diff_per: 93% # avg_bw_imc: 3712 # avg_bw_resc: 7178 # Fail: Check MBA diff within 8% for schemata 10 # avg_diff_per: 92% # avg_bw_imc: 728 # avg_bw_resc: 1405 # Fail: Check schemata change using MBA # At least one test failed # Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system. not ok 1 MBA: test # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
return ret; } diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c index 6b5a3b52d861..ce3c86989f8b 100644 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -147,8 +147,13 @@ static int mbm_run_test(const struct resctrl_test *test, const struct user_param return ret; ret = check_results(DEFAULT_SPAN);
- if (ret && (get_vendor() == ARCH_INTEL))
ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
- if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel MBM may be inaccurate.\n");
- } return ret; }
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h index 851b37c9c38a..488bdca01e4f 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -121,6 +121,8 @@ struct perf_event_read { */ extern volatile int *value_sink; +extern int snc_unreliable;
- extern char llc_occup_path[1024]; int snc_nodes_per_l3_cache(void);
@@ -167,6 +169,7 @@ void ctrlc_handler(int signum, siginfo_t *info, void *ptr); int signal_handler_register(const struct resctrl_test *test); void signal_handler_unregister(void); unsigned int count_bits(unsigned long n); +int snc_kernel_support(void); void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config); void perf_event_initialize_read_format(struct perf_event_read *pe_read); diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c index ecbb7605a981..4b84d6199a36 100644 --- a/tools/testing/selftests/resctrl/resctrl_tests.c +++ b/tools/testing/selftests/resctrl/resctrl_tests.c @@ -118,11 +118,17 @@ static bool test_vendor_specific_check(const struct resctrl_test *test) static void run_single_test(const struct resctrl_test *test, const struct user_params *uparams) {
- int ret;
- int ret, snc_mode; if (test->disabled) return;
- snc_mode = snc_nodes_per_l3_cache();
- if (snc_mode > 1)
ksft_print_msg("SNC-%d mode discovered.\n", snc_mode);
- else if (snc_unreliable)
ksft_print_msg("SNC detection unreliable due to offline CPUs. Test results may not be accurate if SNC enabled.\n");
- if (!test_vendor_specific_check(test)) { ksft_test_result_skip("Hardware does not support %s\n", test->name); return;
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c index 803dd415984c..4d0dbb332b8f 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -13,6 +13,8 @@ #include "resctrl.h" +int snc_unreliable;
- static int find_resctrl_mount(char *buffer) { FILE *mounts;
@@ -186,6 +188,25 @@ static unsigned int count_sys_bitmap_bits(char *name) return count; } +static bool cpus_offline_empty(void) +{
- char offline_cpus_str[64];
- FILE *fp;
- fp = fopen("/sys/devices/system/cpu/offline", "r");
- if (fscanf(fp, "%s", offline_cpus_str) < 0) {
if (!errno) {
fclose(fp);
return 1;
}
ksft_perror("Could not read offline CPUs file!");
No need to scream.
Sure, I guess I missed this when removing exclamation marks last time.
I think it will be more useful to replace "offline CPUs file" with specific "/sys/devices/system/cpu/offline".
Sure, will change.
- }
- fclose(fp);
- return 0;
+}
- /*
- Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0.
- If some CPUs are offline the numbers may not be exact multiples of each
@@ -199,6 +220,13 @@ int snc_nodes_per_l3_cache(void) static int snc_mode; if (!snc_mode) {
if (!cpus_offline_empty()) {
ksft_print_msg("Runtime SNC detection unreliable due to offline CPUs.\n");
ksft_print_msg("Setting SNC mode to disabled.\n");
snc_mode = 1;
snc_unreliable = 1;
return snc_mode;
}
This can be moved to previous patch and that for loop simplified/removed.
Okay.
node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap"); cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
@@ -942,3 +970,32 @@ unsigned int count_bits(unsigned long n) return count; }
+/**
- snc_kernel_support - Check for existence of mon_sub_L3_00 file that indicates
- SNC resctrl support on the kernel side.
- Return: 0 if not supported, 1 if SNC is disabled or SNC is both enabled and
- supported.
- */
+int snc_kernel_support(void) +{
- char node_path[PATH_MAX];
- struct stat statbuf;
- int ret;
- ret = snc_nodes_per_l3_cache();
- /*
* If SNC is disabled then its kernel support isn't important.
Please expand comment that this does not take unreliable SNC detection into account and needs to be done separately.
Okay, I also just noticed that the return value in the function comment also needs an update.
*/
- if (ret == 1)
return ret;
- snprintf(node_path, sizeof(node_path), "%s/%s/%s", RESCTRL_PATH, "mon_data",
"mon_L3_00/mon_sub_L3_00");
- if (!stat(node_path, &statbuf))
return 1;
- return 0;
+}
Reinette
Hi Maciej,
On 8/27/24 1:15 AM, Maciej Wieczor-Retman wrote:
On 2024-08-12 at 16:40:10 -0700, Reinette Chatre wrote:
On 7/12/24 2:04 AM, Maciej Wieczor-Retman wrote:
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel CAT may be inaccurate.\n");
- }
This is still relevant but unclear why previous message checked "ret" but above does not.
The above check tries to explain why a failure happened.
This check is a reminder about a false positive - the test passes but "snc_unreliable" was set. I guess we could make this check to test "!ret"?
Thinking about this more ... if the test results cannot be trusted at all (whether tests pass or fail) when snc_reliable is true then it seems more appropriate to just skip these tests when SNC detection is unreliable.
- return ret; }
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c index 0c045080d808..471e134face0 100644 --- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -175,8 +175,14 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param goto out; ret = check_results(¶m, span, n);
- if (ret && (get_vendor() == ARCH_INTEL))
ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
- if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel CMT may be inaccurate.\n");
- }
CMT may be inaccurate in both scenarios (no kernel support or unreliable detection). Why only check "ret" in case there is no kernel support?
I guess the same thing from above can apply here? Test "!ret"? Perhaps then make this check into "else if ()" instead of just "if" since they will be exclusive?
out: free(span_str); diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c index ab8496a4925b..a805c14fe04b 100644 --- a/tools/testing/selftests/resctrl/mba_test.c +++ b/tools/testing/selftests/resctrl/mba_test.c @@ -179,6 +179,13 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param return ret; ret = check_results();
- if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel MBA may be inaccurate.\n");
- }
As I understand there is no change to MBA when SNC is enabled. These additions thus seem unnecessary.
I just rechecked by installing 6.9 kernel (no SNC kernel support) and using this series selftest. MBA seems to fail in these conditions. I think it is because MBA pulls values from resctrl and from iMC and then compares them. My guess is that iMC works on the halved cache while resctrl (without new SNC support) uses the whole cache. Here is the log from the MBA test I did:
Apologies, yes, while MBA is not impacted by SNC the MBA selftest relies on MBM that is impacted by SNC.
Reinette
On 2024-09-17 at 08:32:11 -0700, Reinette Chatre wrote:
Hi Maciej,
On 8/27/24 1:15 AM, Maciej Wieczor-Retman wrote:
On 2024-08-12 at 16:40:10 -0700, Reinette Chatre wrote:
On 7/12/24 2:04 AM, Maciej Wieczor-Retman wrote:
- if ((get_vendor() == ARCH_INTEL) && snc_unreliable) {
ksft_print_msg("Sub-NUMA Clustering could not be detected properly (see earlier messages for details).\n");
ksft_print_msg("Intel CAT may be inaccurate.\n");
- }
This is still relevant but unclear why previous message checked "ret" but above does not.
The above check tries to explain why a failure happened.
This check is a reminder about a false positive - the test passes but "snc_unreliable" was set. I guess we could make this check to test "!ret"?
Thinking about this more ... if the test results cannot be trusted at all (whether tests pass or fail) when snc_reliable is true then it seems more appropriate to just skip these tests when SNC detection is unreliable.
Okay, I'll just skip the test if the snc_unreliable is true.
linux-kselftest-mirror@lists.linaro.org