cpu.max selftests (both the normal one and the nested one) verify throttling by setting up cpu.max, running a cpu hog process for a specified duration, and comparing usage_usec as reported by cpu.stat with the duration of the cpu hog: the two should be far apart.
Currently, this is done by using values_close, which has two problems:
1. Semantics: values_close is used with an error percentage of 95%, which is not what one expects on seeing "values close". The intent it's actually going for is "values far".
2. Accuracy: the tests can pass even if usage_usec is up to around double the expected amount. That is too wide a margin for usage_usec.
Overall, this patchset improves the readability and accuracy of the cpu.max tests.
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
Shashank Balaji (2):
      selftests/cgroup: rename `expected` to `duration` in cpu.max tests
      selftests/cgroup: better bound in cpu.max tests
 tools/testing/selftests/cgroup/test_cpu.c | 42 ++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 18 deletions(-)
---
base-commit: 66701750d5565c574af42bef0b789ce0203e3071
change-id: 20250227-kselftest-cgroup-fix-cpu-max-56619928e99b
Best regards,
usage_seconds is renamed to duration_seconds, and expected_usage_usec is renamed to duration_usec, to better reflect the meaning of these variables: they hold the duration for which the cpu hog runs, as per wall-clock time.
Using `usage` for this purpose conflicts with the meaning of `usage` in cpu.stat, where it is the time for which the cgroup's processes actually got to run. Since in the cpu.max tests (both the normal one and the nested one) the cpu hog runs for a wall-clock duration while being throttled, the usage reported by cpu.stat will be lower than the duration for which the cpu hog was set to run.
It should now ring alarm bells to see `values_close` being called on usage_usec and duration_usec, because they are not supposed to be close! This is fixed in the next patch.
No functional changes.
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
 tools/testing/selftests/cgroup/test_cpu.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_cpu.c b/tools/testing/selftests/cgroup/test_cpu.c
index a2b50af8e9eeede0cf61d8394300cac02ccaf005..26b0df338505526cc0c5de8f4179b8ec9bad43d7 100644
--- a/tools/testing/selftests/cgroup/test_cpu.c
+++ b/tools/testing/selftests/cgroup/test_cpu.c
@@ -646,8 +646,8 @@ static int test_cpucg_max(const char *root)
 {
 	int ret = KSFT_FAIL;
 	long usage_usec, user_usec;
-	long usage_seconds = 1;
-	long expected_usage_usec = usage_seconds * USEC_PER_SEC;
+	long duration_seconds = 1;
+	long duration_usec = duration_seconds * USEC_PER_SEC;
 	char *cpucg;

 	cpucg = cg_name(root, "cpucg_test");
@@ -663,7 +663,7 @@ static int test_cpucg_max(const char *root)
 	struct cpu_hog_func_param param = {
 		.nprocs = 1,
 		.ts = {
-			.tv_sec = usage_seconds,
+			.tv_sec = duration_seconds,
 			.tv_nsec = 0,
 		},
 		.clock_type = CPU_HOG_CLOCK_WALL,
@@ -676,10 +676,10 @@ static int test_cpucg_max(const char *root)
 	if (user_usec <= 0)
 		goto cleanup;

-	if (user_usec >= expected_usage_usec)
+	if (user_usec >= duration_usec)
 		goto cleanup;

-	if (values_close(usage_usec, expected_usage_usec, 95))
+	if (values_close(usage_usec, duration_usec, 95))
 		goto cleanup;

 	ret = KSFT_PASS;
@@ -699,8 +699,8 @@ static int test_cpucg_max_nested(const char *root)
 {
 	int ret = KSFT_FAIL;
 	long usage_usec, user_usec;
-	long usage_seconds = 1;
-	long expected_usage_usec = usage_seconds * USEC_PER_SEC;
+	long duration_seconds = 1;
+	long duration_usec = duration_seconds * USEC_PER_SEC;
 	char *parent, *child;

 	parent = cg_name(root, "cpucg_parent");
@@ -723,7 +723,7 @@ static int test_cpucg_max_nested(const char *root)
 	struct cpu_hog_func_param param = {
 		.nprocs = 1,
 		.ts = {
-			.tv_sec = usage_seconds,
+			.tv_sec = duration_seconds,
 			.tv_nsec = 0,
 		},
 		.clock_type = CPU_HOG_CLOCK_WALL,
@@ -736,10 +736,10 @@ static int test_cpucg_max_nested(const char *root)
 	if (user_usec <= 0)
 		goto cleanup;

-	if (user_usec >= expected_usage_usec)
+	if (user_usec >= duration_usec)
 		goto cleanup;

-	if (values_close(usage_usec, expected_usage_usec, 95))
+	if (values_close(usage_usec, duration_usec, 95))
 		goto cleanup;

 	ret = KSFT_PASS;
The cpu.max tests (both the normal one and the nested one) set up cpu.max with a 1000 us quota and the default period (100,000 us). A cpu hog is then run for a duration of 1 s as per wall-clock time. This corresponds to 10 periods, hence an expected usage of 10,000 us. We want the measured usage (as per cpu.stat) to be close to 10,000 us. Enforce this bound correctly.
Previously, this approximate equality test was done by `!values_close(usage_usec, duration_usec, 95)`: if the absolute difference between usage_usec and duration_usec is greater than 95% of their sum, then we pass. This is problematic for two reasons:
1. Semantics: When one sees `values_close` they expect the error percentage to be some small number, not 95. The intent behind using `values_close` is lost by using a high error percent such as 95. The intent it's actually going for is "values far".
2. Bound too wide: The condition translates to the following expression:
   |usage_usec - duration_usec| > (usage_usec + duration_usec) * 0.95
=> 0.05 * duration_usec > 1.95 * usage_usec      (since usage_usec < duration_usec)
=> usage_usec < (0.05/1.95) * duration_usec ~ 0.0256 * duration_usec = 25,641 us
So, this condition passes as long as usage_usec is lower than 25,641 us, while all we want is for it to be close to 10,000 us.
To address these issues, the condition is changed to pass when `labs(usage_usec - expected_usage_usec) < 2000`. Now the meaning is much clearer. `labs` is used instead of `values_close` because we don't expect the error in usage_usec relative to expected_usage_usec to scale with either term: the error comes from the cpu hog running slightly longer than the intended duration. So a proportional error estimate such as `values_close` does not make sense here. The maximum tolerable error is set to 2000 us because, on running this test 10 times, the maximum usage_usec observed was 11,513 us, which corresponds to an error of 1,513 us.
user_usec is removed because it will always be less than usage_usec. usage_usec is what really represents the throttling.
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
 tools/testing/selftests/cgroup/test_cpu.c | 34 ++++++++++++++++++------------
 1 file changed, 20 insertions(+), 14 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_cpu.c b/tools/testing/selftests/cgroup/test_cpu.c
index 26b0df338505526cc0c5de8f4179b8ec9bad43d7..fcef90d2948e1344b7741214a0cdd10609069624 100644
--- a/tools/testing/selftests/cgroup/test_cpu.c
+++ b/tools/testing/selftests/cgroup/test_cpu.c
@@ -645,9 +645,8 @@ test_cpucg_nested_weight_underprovisioned(const char *root)
 static int test_cpucg_max(const char *root)
 {
 	int ret = KSFT_FAIL;
-	long usage_usec, user_usec;
+	long usage_usec, expected_usage_usec;
 	long duration_seconds = 1;
-	long duration_usec = duration_seconds * USEC_PER_SEC;
 	char *cpucg;

 	cpucg = cg_name(root, "cpucg_test");
@@ -672,14 +671,18 @@ static int test_cpucg_max(const char *root)
 		goto cleanup;

 	usage_usec = cg_read_key_long(cpucg, "cpu.stat", "usage_usec");
-	user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
-	if (user_usec <= 0)
+	if (usage_usec <= 0)
 		goto cleanup;

-	if (user_usec >= duration_usec)
-		goto cleanup;
+	/*
+	 * Since the cpu hog is set to run as per wall clock time, it's expected to
+	 * run for 10 periods (duration_usec/default_period_usec), and in each
+	 * period, it's throttled to run for 1000 usec. So its expected usage is
+	 * 1000 * 10 = 10000 usec.
+	 */
+	expected_usage_usec = 10000;

-	if (values_close(usage_usec, duration_usec, 95))
+	if (labs(usage_usec - expected_usage_usec) > 2000)
 		goto cleanup;

 	ret = KSFT_PASS;
@@ -698,9 +701,8 @@ static int test_cpucg_max(const char *root)
 static int test_cpucg_max_nested(const char *root)
 {
 	int ret = KSFT_FAIL;
-	long usage_usec, user_usec;
+	long usage_usec, expected_usage_usec;
 	long duration_seconds = 1;
-	long duration_usec = duration_seconds * USEC_PER_SEC;
 	char *parent, *child;

 	parent = cg_name(root, "cpucg_parent");
@@ -732,14 +734,18 @@ static int test_cpucg_max_nested(const char *root)
 		goto cleanup;

 	usage_usec = cg_read_key_long(child, "cpu.stat", "usage_usec");
-	user_usec = cg_read_key_long(child, "cpu.stat", "user_usec");
-	if (user_usec <= 0)
+	if (usage_usec <= 0)
 		goto cleanup;

-	if (user_usec >= duration_usec)
-		goto cleanup;
+	/*
+	 * Since the cpu hog is set to run as per wall clock time, it's expected to
+	 * run for 10 periods (duration_usec/default_period_usec), and in each
+	 * period, it's throttled to run for 1000 usec. So its expected usage is
+	 * 1000 * 10 = 10000 usec.
+	 */
+	expected_usage_usec = 10000;

-	if (values_close(usage_usec, duration_usec, 95))
+	if (labs(usage_usec - expected_usage_usec) > 2000)
 		goto cleanup;

 	ret = KSFT_PASS;
Hello Shashank.
On Tue, Jul 01, 2025 at 11:13:54PM +0900, Shashank Balaji shashank.mahadasyam@sony.com wrote:
cpu.max selftests (both the normal one and the nested one) test the working of throttling by setting up cpu.max, running a cpu hog process for a specified duration, and comparing usage_usec as reported by cpu.stat with the duration of the cpu hog: they should be far enough.
Currently, this is done by using values_close, which has two problems:
- Semantic: values_close is used with an error percentage of 95%, which one will not expect on seeing "values close". The intent it's
actually going for is "values far".
- Accuracy: the tests can pass even if usage_usec is upto around double the expected amount. That's too high of a margin for usage_usec.
Overall, this patchset improves the readability and accuracy of the cpu.max tests.
Signed-off-by: Shashank Balaji shashank.mahadasyam@sony.com
I think you're getting at an actual bug in the test definition.
I think that the test_cpucg_max should either run hog_cpus_timed with CPU_HOG_CLOCK_PROCESS instead of CPU_HOG_CLOCK_WALL to make sense or the expected_usage_usec should be defined with the configured quota in mind (i.e. 1/100). (The latter seems to make the test more natural.)
With such defined metrics, the asserted expression could be values_close(usage_usec, expected_usage_usec, 10) based on your numbers, error is around 20% so our helper's argument is roughly half of that. (I'd be fine even with err=20 to prevent some false positives.)
I think those changes could even be in one patch but I leave that up to you. My comment to your 2nd patch is that I'd like to stick to relative errors and keep positive values_close() predicate that's used in other selftests too. (But those 95% in the current code are clumsy given two different qualities are compared.)
Thanks, Michal
Hi Michal,
Thanks for the reply!
On Wed, Jul 02, 2025 at 02:34:29PM +0200, Michal Koutný wrote:
Hello Shashank.
On Tue, Jul 01, 2025 at 11:13:54PM +0900, Shashank Balaji shashank.mahadasyam@sony.com wrote:
cpu.max selftests (both the normal one and the nested one) test the working of throttling by setting up cpu.max, running a cpu hog process for a specified duration, and comparing usage_usec as reported by cpu.stat with the duration of the cpu hog: they should be far enough.
Currently, this is done by using values_close, which has two problems:
- Semantic: values_close is used with an error percentage of 95%, which one will not expect on seeing "values close". The intent it's
actually going for is "values far".
- Accuracy: the tests can pass even if usage_usec is upto around double the expected amount. That's too high of a margin for usage_usec.
Overall, this patchset improves the readability and accuracy of the cpu.max tests.
Signed-off-by: Shashank Balaji shashank.mahadasyam@sony.com
I think you're getting at an actual bug in the test definition.
I think that the test_cpucg_max should either run hog_cpus_timed with CPU_HOG_CLOCK_PROCESS instead of CPU_HOG_CLOCK_WALL to make sense or the expected_usage_usec should be defined with the configured quota in mind (i.e. 1/100). (The latter seems to make the test more natural.)
Going with the more natural way of sticking to CPU_HOG_CLOCK_WALL, the second patch does calculate expected_usage_usec based on the configured quota, as the code comment explains. So I'm guessing we're on the same page about this?
With such defined metrics, the asserted expression could be values_close(usage_usec, expected_usage_usec, 10) based on your numbers, error is around 20% so our helper's argument is roughly half of that. (I'd be fine even with err=20 to prevent some false positives.)
I think those changes could even be in one patch but I leave that up to you. My comment to your 2nd patch is that I'd like to stick to relative errors and keep positive values_close() predicate that's used in other selftests too. (But those 95% in the current code are clumsy given two different qualities are compared.)
Do you mean something like,
	if (values_close(usage_usec, expected_usage_usec, 10))
		goto cleanup;
using the positive values_close() predicate. If so, I'm not sure I understand because if usage_usec and expected_usage_usec _are_ close, then we want the test to pass! We should be using the negative predicate.
And sure, I'll send v2 as a single patch.
Thanks
Shashank
On Thu, Jul 03, 2025 at 10:41:17AM +0900, Shashank Balaji shashank.mahadasyam@sony.com wrote:
Going with the more natural way of sticking to CPU_HOG_CLOCK_WALL, the second patch does calculate expected_usage_usec based on the configured quota, as the code comment explains. So I'm guessesing we're on the same page about this?
Yes, the expected_usage_usec in the 2nd patch is correct. (It'd be nicer if the value were calculated from the configured cpu.max and not typed out as a literal that may diverge should the cpu.max be changed in the test.)
Do you mean something like,
if (values_close(usage_usec, expected_usage_usec, 10)) goto cleanup;
using the positive values_close() predicate. If so, I'm not sure I understand because if usage_usec and expected_usage_usec _are_ close, then we want the test to pass! We should be using the negative predicate.
I meant to use it the same way like test_memcontrol.c does, i.e. values_close() <=> pass.
So codewise it boils down to (a negation, I see why the confusion):

	if (!values_close(usage_usec, expected_usage_usec, 10))
		goto cleanup;
	ret = KSFT_PASS;
Michal
Current cpu.max tests (both the normal one and the nested one) are inaccurate.
They set up cpu.max with a 1000 us quota and the default period (100,000 us). A cpu hog is run for a duration of 1 s as per wall-clock time. This corresponds to 10 periods, hence an expected usage of 10,000 us. We want the measured usage (as per cpu.stat) to be close to 10,000 us.
Previously, this approximate equality test was done by `!values_close(usage_usec, duration_usec, 95)`: if the absolute difference between usage_usec and duration_usec is greater than 95% of their sum, then we pass. This is problematic for two reasons:
1. Semantics: When one sees `values_close` they expect the error percentage to be some small number, not 95. The intent behind using `values_close` is lost by using a high error percent such as 95. The intent it's actually going for is "values far".
2. Bound too wide: The condition translates to the following expression:
   |usage_usec - duration_usec| > (usage_usec + duration_usec) * 0.95
=> 0.05 * duration_usec > 1.95 * usage_usec      (since usage_usec < duration_usec)
=> usage_usec < (0.05/1.95) * duration_usec ~ 0.0256 * duration_usec = 25,641 us
So, this condition passes as long as usage_usec is lower than 25,641 us, while all we want is for it to be close to 10,000 us.
Fix this by explicitly calculating the expected usage based on the configured quota, the default period, and the duration, and compare usage_usec and expected_usage_usec using values_close() with a 10% error margin.
Also, use snprintf to get the quota string to write to cpu.max instead of hardcoding the quota, ensuring a single source of truth.
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
Changes in v2:
- Incorporate Michal's suggestions:
  - Merge two patches into one
  - Generate the quota string from the variable instead of hardcoding it
  - Use values_close() instead of labs()
  - Explicitly calculate expected_usage_usec
- v1: https://lore.kernel.org/all/20250701-kselftest-cgroup-fix-cpu-max-v1-0-04950...
---
 tools/testing/selftests/cgroup/test_cpu.c | 63 ++++++++++++++++-------
 1 file changed, 43 insertions(+), 20 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_cpu.c b/tools/testing/selftests/cgroup/test_cpu.c
index a2b50af8e9ee..2a60e6c41940 100644
--- a/tools/testing/selftests/cgroup/test_cpu.c
+++ b/tools/testing/selftests/cgroup/test_cpu.c
@@ -2,6 +2,7 @@

 #define _GNU_SOURCE
 #include <linux/limits.h>
+#include <sys/param.h>
 #include <sys/sysinfo.h>
 #include <sys/wait.h>
 #include <errno.h>
@@ -645,10 +646,16 @@ test_cpucg_nested_weight_underprovisioned(const char *root)
 static int test_cpucg_max(const char *root)
 {
 	int ret = KSFT_FAIL;
-	long usage_usec, user_usec;
-	long usage_seconds = 1;
-	long expected_usage_usec = usage_seconds * USEC_PER_SEC;
+	long quota_usec = 1000;
+	long default_period_usec = 100000; /* cpu.max's default period */
+	long duration_seconds = 1;
+
+	long duration_usec = duration_seconds * USEC_PER_SEC;
+	long usage_usec, n_periods, remainder_usec, expected_usage_usec;
 	char *cpucg;
+	char quota_buf[32];
+
+	snprintf(quota_buf, sizeof(quota_buf), "%ld", quota_usec);

 	cpucg = cg_name(root, "cpucg_test");
 	if (!cpucg)
@@ -657,13 +664,13 @@ static int test_cpucg_max(const char *root)
 	if (cg_create(cpucg))
 		goto cleanup;

-	if (cg_write(cpucg, "cpu.max", "1000"))
+	if (cg_write(cpucg, "cpu.max", quota_buf))
 		goto cleanup;

 	struct cpu_hog_func_param param = {
 		.nprocs = 1,
 		.ts = {
-			.tv_sec = usage_seconds,
+			.tv_sec = duration_seconds,
 			.tv_nsec = 0,
 		},
 		.clock_type = CPU_HOG_CLOCK_WALL,
@@ -672,14 +679,19 @@ static int test_cpucg_max(const char *root)
 		goto cleanup;

 	usage_usec = cg_read_key_long(cpucg, "cpu.stat", "usage_usec");
-	user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
-	if (user_usec <= 0)
+	if (usage_usec <= 0)
 		goto cleanup;

-	if (user_usec >= expected_usage_usec)
-		goto cleanup;
+	/*
+	 * The following calculation applies only since
+	 * the cpu hog is set to run as per wall-clock time
+	 */
+	n_periods = duration_usec / default_period_usec;
+	remainder_usec = duration_usec - n_periods * default_period_usec;
+	expected_usage_usec
+		= n_periods * quota_usec + MIN(remainder_usec, quota_usec);

-	if (values_close(usage_usec, expected_usage_usec, 95))
+	if (!values_close(usage_usec, expected_usage_usec, 10))
 		goto cleanup;

 	ret = KSFT_PASS;
@@ -698,10 +710,16 @@ static int test_cpucg_max(const char *root)
 static int test_cpucg_max_nested(const char *root)
 {
 	int ret = KSFT_FAIL;
-	long usage_usec, user_usec;
-	long usage_seconds = 1;
-	long expected_usage_usec = usage_seconds * USEC_PER_SEC;
+	long quota_usec = 1000;
+	long default_period_usec = 100000; /* cpu.max's default period */
+	long duration_seconds = 1;
+
+	long duration_usec = duration_seconds * USEC_PER_SEC;
+	long usage_usec, n_periods, remainder_usec, expected_usage_usec;
 	char *parent, *child;
+	char quota_buf[32];
+
+	snprintf(quota_buf, sizeof(quota_buf), "%ld", quota_usec);

 	parent = cg_name(root, "cpucg_parent");
 	child = cg_name(parent, "cpucg_child");
@@ -717,13 +735,13 @@ static int test_cpucg_max_nested(const char *root)
 	if (cg_create(child))
 		goto cleanup;

-	if (cg_write(parent, "cpu.max", "1000"))
+	if (cg_write(parent, "cpu.max", quota_buf))
 		goto cleanup;

 	struct cpu_hog_func_param param = {
 		.nprocs = 1,
 		.ts = {
-			.tv_sec = usage_seconds,
+			.tv_sec = duration_seconds,
 			.tv_nsec = 0,
 		},
 		.clock_type = CPU_HOG_CLOCK_WALL,
@@ -732,14 +750,19 @@ static int test_cpucg_max_nested(const char *root)
 		goto cleanup;

 	usage_usec = cg_read_key_long(child, "cpu.stat", "usage_usec");
-	user_usec = cg_read_key_long(child, "cpu.stat", "user_usec");
-	if (user_usec <= 0)
+	if (usage_usec <= 0)
 		goto cleanup;

-	if (user_usec >= expected_usage_usec)
-		goto cleanup;
+	/*
+	 * The following calculation applies only since
+	 * the cpu hog is set to run as per wall-clock time
+	 */
+	n_periods = duration_usec / default_period_usec;
+	remainder_usec = duration_usec - n_periods * default_period_usec;
+	expected_usage_usec
+		= n_periods * quota_usec + MIN(remainder_usec, quota_usec);

-	if (values_close(usage_usec, expected_usage_usec, 95))
+	if (!values_close(usage_usec, expected_usage_usec, 10))
 		goto cleanup;

 	ret = KSFT_PASS;
On Thu, Jul 03, 2025 at 09:03:20PM +0900, Shashank Balaji shashank.mahadasyam@sony.com wrote:
Current cpu.max tests (both the normal one and the nested one) are inaccurate.
They setup cpu.max with 1000 us quota and the default period (100,000 us). A cpu hog is run for a duration of 1s as per wall clock time. This corresponds to 10 periods, hence an expected usage of 10,000 us. We want the measured usage (as per cpu.stat) to be close to 10,000 us.
Previously, this approximate equality test was done by `!values_close(usage_usec, duration_usec, 95)`: if the absolute difference between usage_usec and duration_usec is greater than 95% of their sum, then we pass. This is problematic for two reasons:
- Semantics: When one sees `values_close` they expect the error percentage to be some small number, not 95. The intent behind using
`values_close` is lost by using a high error percent such as 95. The intent it's actually going for is "values far".
Bound too wide: The condition translates to the following expression:
|usage_usec - duration_usec| > (usage_usec + duration_usec)*0.95
0.05*duration_usec > 1.95*usage_usec (usage < duration)
usage_usec < 0.0257*duration_usec = 25,641 us
So, this condition passes as long as usage_usec is lower than 25,641
us, while all we want is for it to be close to 10,000 us.
Fix this by explicitly calcuating the expected usage duration based on the configured quota, default period, and the duration, and compare usage_usec and expected_usage_usec using values_close() with a 10% error margin.
Also, use snprintf to get the quota string to write to cpu.max instead of hardcoding the quota, ensuring a single source of truth.
Signed-off-by: Shashank Balaji shashank.mahadasyam@sony.com
Changes in v2:
- Incorporate Michal's suggestions:
- Merge two patches into one
- Generate the quota string from the variable instead of hardcoding it
- Use values_close() instead of labs()
- Explicitly calculate expected_usage_usec
- v1: https://lore.kernel.org/all/20250701-kselftest-cgroup-fix-cpu-max-v1-0-04950...
tools/testing/selftests/cgroup/test_cpu.c | 63 ++++++++++++++++------- 1 file changed, 43 insertions(+), 20 deletions(-)
-	user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
-	if (user_usec <= 0)
+	if (usage_usec <= 0)
 		goto cleanup;

-	if (user_usec >= expected_usage_usec)
-		goto cleanup;
I think this was a meaningful check. Not sure if dropped accidentally or on purpose w/out explanation.
After that's addressed, feel free to add
Acked-by: Michal Koutný <mkoutny@suse.com>
Hi Michal,
On Thu, Jul 03, 2025 at 05:58:48PM +0200, Michal Koutný wrote:
On Thu, Jul 03, 2025 at 09:03:20PM +0900, Shashank Balaji shashank.mahadasyam@sony.com wrote:
Current cpu.max tests (both the normal one and the nested one) are inaccurate.
They setup cpu.max with 1000 us quota and the default period (100,000 us). A cpu hog is run for a duration of 1s as per wall clock time. This corresponds to 10 periods, hence an expected usage of 10,000 us. We want the measured usage (as per cpu.stat) to be close to 10,000 us.
Previously, this approximate equality test was done by `!values_close(usage_usec, duration_usec, 95)`: if the absolute difference between usage_usec and duration_usec is greater than 95% of their sum, then we pass. This is problematic for two reasons:
- Semantics: When one sees `values_close` they expect the error percentage to be some small number, not 95. The intent behind using
`values_close` is lost by using a high error percent such as 95. The intent it's actually going for is "values far".
Bound too wide: The condition translates to the following expression:
|usage_usec - duration_usec| > (usage_usec + duration_usec)*0.95
0.05*duration_usec > 1.95*usage_usec (usage < duration)
usage_usec < 0.0257*duration_usec = 25,641 us
So, this condition passes as long as usage_usec is lower than 25,641
us, while all we want is for it to be close to 10,000 us.
Fix this by explicitly calcuating the expected usage duration based on the configured quota, default period, and the duration, and compare usage_usec and expected_usage_usec using values_close() with a 10% error margin.
Also, use snprintf to get the quota string to write to cpu.max instead of hardcoding the quota, ensuring a single source of truth.
Signed-off-by: Shashank Balaji shashank.mahadasyam@sony.com
Changes in v2:
- Incorporate Michal's suggestions:
- Merge two patches into one
- Generate the quota string from the variable instead of hardcoding it
- Use values_close() instead of labs()
- Explicitly calculate expected_usage_usec
- v1: https://lore.kernel.org/all/20250701-kselftest-cgroup-fix-cpu-max-v1-0-04950...
tools/testing/selftests/cgroup/test_cpu.c | 63 ++++++++++++++++------- 1 file changed, 43 insertions(+), 20 deletions(-)
-	user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
-	if (user_usec <= 0)
+	if (usage_usec <= 0)
 		goto cleanup;

-	if (user_usec >= expected_usage_usec)
-		goto cleanup;
I think this was a meaningful check. Not sure if dropped accidentally or on purpose w/out explanation.
After that's addressed, feel free to add Acked-by: Michal Koutný mkoutny@suse.com
Sorry about that. I dropped it accidentally. This check should be okay, right?
	if (usage_usec > expected_usage_usec)
		goto cleanup;
1. We don't need to separately check user_usec because it'll always be
   less than usage_usec, and usage_usec is what's directly affected by
   throttling.
2. I changed the >= to > because, not that it'll ever happen, but we can
   let usage_usec = expected_usage_usec pass. After all, it's called
   "expected" for a reason.
Thanks
Shashank
On Fri, Jul 04, 2025 at 09:26:56AM +0900, Shashank Balaji wrote:
tools/testing/selftests/cgroup/test_cpu.c | 63 ++++++++++++++++------- 1 file changed, 43 insertions(+), 20 deletions(-)
-	user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
-	if (user_usec <= 0)
+	if (usage_usec <= 0)
 		goto cleanup;

-	if (user_usec >= expected_usage_usec)
-		goto cleanup;
I think this was a meaningful check. Not sure if dropped accidentally or on purpose w/out explanation.
After that's addressed, feel free to add Acked-by: Michal Koutný mkoutny@suse.com
Sorry about that. I dropped it accidentally. This check should be okay, right?
if (usage_usec > expected_usage_usec) goto cleanup;
1. We don't need to separately check user_usec because it'll always be
   less than user_usec, and usage_usec is what's directly affected by
   throttling.
2. I changed the >= to > because, not that it'll ever happen, but we can
   let usage_usec = expected_usage_usec pass. Afterall, it's called
   "expected" for a reason.
Hmm, here is something interesting. The following patch adds printfs to the existing code to see what user_usec, usage_usec, the expected_usage_usec used in the code, and the theoretical expected_usage_usec are. On running the modified test a couple of times, here is the output:
$ sudo ./test_cpu
user: 10485, usage: 10485, used expected: 1000000, theoretical expected: 10000
ok 1 test_cpucg_max
user: 11127, usage: 11127, used expected: 1000000, theoretical expected: 10000
ok 2 test_cpucg_max_nested
$ sudo ./test_cpu
user: 10286, usage: 10286, used expected: 1000000, theoretical expected: 10000
ok 1 test_cpucg_max
user: 10404, usage: 11271, used expected: 1000000, theoretical expected: 10000
ok 2 test_cpucg_max_nested
$ sudo ./test_cpu
user: 10490, usage: 10490, used expected: 1000000, theoretical expected: 10000
ok 1 test_cpucg_max
user: 9326, usage: 9326, used expected: 1000000, theoretical expected: 10000
ok 2 test_cpucg_max_nested
$ sudo ./test_cpu
user: 10368, usage: 10368, used expected: 1000000, theoretical expected: 10000
ok 1 test_cpucg_max
user: 10026, usage: 10026, used expected: 1000000, theoretical expected: 10000
ok 2 test_cpucg_max_nested
$ sudo ./test_cpu
user: 10541, usage: 10541, used expected: 1000000, theoretical expected: 10000
ok 1 test_cpucg_max
user: 11040, usage: 11040, used expected: 1000000, theoretical expected: 10000
ok 2 test_cpucg_max_nested
So, both user_usec and usage_usec exceeding the theoretical expected_usage_usec is not uncommon. The "fail if usage_usec >= expected_usage_usec" check in the existing code only really works because of the (wrong) large expected_usage_usec used.
So I think the best we can do is check if usage_usec is close to expected_usage_usec or not, and not require usage_usec to be less than expected_usage_usec.
diff --git i/tools/testing/selftests/cgroup/test_cpu.c w/tools/testing/selftests/cgroup/test_cpu.c
index a2b50af8e9ee..14c8c7b49214 100644
--- i/tools/testing/selftests/cgroup/test_cpu.c
+++ w/tools/testing/selftests/cgroup/test_cpu.c
@@ -679,6 +679,9 @@ static int test_cpucg_max(const char *root)
 	if (user_usec >= expected_usage_usec)
 		goto cleanup;

+	printf("user: %ld, usage: %ld, used expected: %ld, theoretical expected: 10000\n",
+	       user_usec, usage_usec, expected_usage_usec);
+
 	if (values_close(usage_usec, expected_usage_usec, 95))
 		goto cleanup;

@@ -739,6 +742,9 @@ static int test_cpucg_max_nested(const char *root)
 	if (user_usec >= expected_usage_usec)
 		goto cleanup;

+	printf("user: %ld, usage: %ld, used expected: %ld, theoretical expected: 10000\n",
+	       user_usec, usage_usec, expected_usage_usec);
+
 	if (values_close(usage_usec, expected_usage_usec, 95))
 		goto cleanup;

@@ -758,13 +764,13 @@ struct cpucg_test {
 	int (*fn)(const char *root);
 	const char *name;
 } tests[] = {
-	T(test_cpucg_subtree_control),
-	T(test_cpucg_stats),
-	T(test_cpucg_nice),
-	T(test_cpucg_weight_overprovisioned),
-	T(test_cpucg_weight_underprovisioned),
-	T(test_cpucg_nested_weight_overprovisioned),
-	T(test_cpucg_nested_weight_underprovisioned),
+	// T(test_cpucg_subtree_control),
+	// T(test_cpucg_stats),
+	// T(test_cpucg_nice),
+	// T(test_cpucg_weight_overprovisioned),
+	// T(test_cpucg_weight_underprovisioned),
+	// T(test_cpucg_nested_weight_overprovisioned),
+	// T(test_cpucg_nested_weight_underprovisioned),
 	T(test_cpucg_max),
 	T(test_cpucg_max_nested),
 };
On Fri, Jul 04, 2025 at 03:49:58PM +0900, Shashank Balaji shashank.mahadasyam@sony.com wrote:
- We don't need to separately check user_usec because it'll always be
less than user_usec^W usage_usec, and usage_usec is what's directly affected by throttling.
When the kernel is not preemptible, I'd expect the system time may more easily exceed the quota, so I considered the user_usec check less prone to false results. But...
- I changed the >= to > because, not that it'll ever happen, but we can
let usage_usec = expected_usage_usec pass. Afterall, it's called "expected" for a reason.
Hmm, here is something interesting. The following patch adds printfs to the existing code to see what user_usec, usage_usec, the expected_usage_usec used in the code, and the theoretical expected_usage_usec are. On running the modified test a couple of times, here is the output:
...thanks for checking. I was misled by the previous test implementation (the expected_usage_usec had no relation to actual throttled usage in there). What you observe is thus likely explained by the default sched_cfs_bandwidth_slice (5 times the tested quota) and CONFIG_HZ.
So I'd say keep only the two-sided tolerant check. (I want to avoid the test to randomly fail when there's no gaping issue.)
Thanks, Michal
Hi Michal,

On Fri, Jul 04, 2025 at 10:59:15AM +0200, Michal Koutný wrote:
> So I'd say keep only the two-sided tolerant check. (I want to avoid the
> test randomly failing when there's no gaping issue.)

Yep, patch v2 is doing just that. So, I assume I have your Acked-by?

Thanks,
Shashank
On Fri, Jul 04, 2025 at 06:07:12PM +0900, Shashank Balaji wrote:
> Yep, patch v2 is doing just that. So, I assume I have your Acked-by?

I forgot to add the fixes tags:

Fixes: a79906570f9646ae17 ("cgroup: Add test_cpucg_max_nested() testcase")
Fixes: 889ab8113ef1386c57 ("cgroup: Add test_cpucg_max() testcase")

Should I send a v3 with your ack and the tags?

Thanks,
Shashank
On Fri, Jul 04, 2025 at 06:15:54PM +0900, Shashank Balaji <shashank.mahadasyam@sony.com> wrote:
> I forgot to add the fixes tags:
>
> Fixes: a79906570f9646ae17 ("cgroup: Add test_cpucg_max_nested() testcase")
> Fixes: 889ab8113ef1386c57 ("cgroup: Add test_cpucg_max() testcase")
>
> Should I send a v3 with your ack and the tags?

Yeah, I think it'd be simpler to apply.

Thanks,
Michal
Current cpu.max tests (both the normal one and the nested one) are broken.
They set up cpu.max with a 1000 us quota and the default period (100,000 us). A cpu hog is then run for 1 s of wall-clock time. This corresponds to 10 periods, hence an expected usage of 10,000 us. We want the measured usage (as per cpu.stat) to be close to 10,000 us.
Previously, this approximate equality test was done by `!values_close(usage_usec, expected_usage_usec, 95)`: if the absolute difference between usage_usec and expected_usage_usec is greater than 95% of their sum, then we pass. And expected_usage_usec was set to 1,000,000 us. Mathematically, this translates to the following being true for pass:
|usage - expected_usage| > (usage + expected_usage)*0.95
If usage > expected_usage:

    usage - expected_usage > (usage + expected_usage) * 0.95
    0.05 * usage > 1.95 * expected_usage
    usage > 39 * expected_usage = 39 s

If usage < expected_usage:

    expected_usage - usage > (usage + expected_usage) * 0.95
    0.05 * expected_usage > 1.95 * usage
    usage < 0.0256 * expected_usage = 25,600 us
Combined,
Pass if usage < 25,600 us or > 39 s,
which makes no sense given that all we need is for usage_usec to be close to 10,000 us.
Fix this by explicitly calculating the expected usage from the configured quota, the default period, and the hog's run duration, and compare usage_usec and expected_usage_usec using values_close() with a 10% error margin.
Also, use snprintf to get the quota string to write to cpu.max instead of hardcoding the quota, ensuring a single source of truth.
Remove the check comparing user_usec against expected_usage_usec: running this test with added printfs shows that user_usec and usage_usec can regularly exceed the theoretical expected_usage_usec:
$ sudo ./test_cpu
user: 10485, usage: 10485, expected: 10000
ok 1 test_cpucg_max
user: 11127, usage: 11127, expected: 10000
ok 2 test_cpucg_max_nested
$ sudo ./test_cpu
user: 10286, usage: 10286, expected: 10000
ok 1 test_cpucg_max
user: 10404, usage: 11271, expected: 10000
ok 2 test_cpucg_max_nested
Hence, a values_close() check of usage_usec and expected_usage_usec is sufficient.
Fixes: a79906570f9646ae17 ("cgroup: Add test_cpucg_max_nested() testcase")
Fixes: 889ab8113ef1386c57 ("cgroup: Add test_cpucg_max() testcase")
Acked-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
Changes in v3:
- Simplified commit message
- Explained why the "user_usec >= expected_usage_usec" check is removed
- Added fixes tags and Michal's Acked-by
- No code changes
- v2: https://lore.kernel.org/all/20250703120325.2905314-1-shashank.mahadasyam@son...
Changes in v2:
- Incorporate Michal's suggestions:
  - Merge two patches into one
  - Generate the quota string from the variable instead of hardcoding it
  - Use values_close() instead of labs()
  - Explicitly calculate expected_usage_usec
- v1: https://lore.kernel.org/all/20250701-kselftest-cgroup-fix-cpu-max-v1-0-04950...
---
 tools/testing/selftests/cgroup/test_cpu.c | 63 ++++++++++++++++-------
 1 file changed, 43 insertions(+), 20 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_cpu.c b/tools/testing/selftests/cgroup/test_cpu.c
index a2b50af8e9ee..2a60e6c41940 100644
--- a/tools/testing/selftests/cgroup/test_cpu.c
+++ b/tools/testing/selftests/cgroup/test_cpu.c
@@ -2,6 +2,7 @@

 #define _GNU_SOURCE
 #include <linux/limits.h>
+#include <sys/param.h>
 #include <sys/sysinfo.h>
 #include <sys/wait.h>
 #include <errno.h>
@@ -645,10 +646,16 @@ test_cpucg_nested_weight_underprovisioned(const char *root)
 static int test_cpucg_max(const char *root)
 {
 	int ret = KSFT_FAIL;
-	long usage_usec, user_usec;
-	long usage_seconds = 1;
-	long expected_usage_usec = usage_seconds * USEC_PER_SEC;
+	long quota_usec = 1000;
+	long default_period_usec = 100000; /* cpu.max's default period */
+	long duration_seconds = 1;
+
+	long duration_usec = duration_seconds * USEC_PER_SEC;
+	long usage_usec, n_periods, remainder_usec, expected_usage_usec;
 	char *cpucg;
+	char quota_buf[32];
+
+	snprintf(quota_buf, sizeof(quota_buf), "%ld", quota_usec);

 	cpucg = cg_name(root, "cpucg_test");
 	if (!cpucg)
@@ -657,13 +664,13 @@ static int test_cpucg_max(const char *root)
 	if (cg_create(cpucg))
 		goto cleanup;

-	if (cg_write(cpucg, "cpu.max", "1000"))
+	if (cg_write(cpucg, "cpu.max", quota_buf))
 		goto cleanup;

 	struct cpu_hog_func_param param = {
 		.nprocs = 1,
 		.ts = {
-			.tv_sec = usage_seconds,
+			.tv_sec = duration_seconds,
 			.tv_nsec = 0,
 		},
 		.clock_type = CPU_HOG_CLOCK_WALL,
@@ -672,14 +679,19 @@ static int test_cpucg_max(const char *root)
 		goto cleanup;

 	usage_usec = cg_read_key_long(cpucg, "cpu.stat", "usage_usec");
-	user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
-	if (user_usec <= 0)
+	if (usage_usec <= 0)
 		goto cleanup;

-	if (user_usec >= expected_usage_usec)
-		goto cleanup;
+	/*
+	 * The following calculation applies only since
+	 * the cpu hog is set to run as per wall-clock time
+	 */
+	n_periods = duration_usec / default_period_usec;
+	remainder_usec = duration_usec - n_periods * default_period_usec;
+	expected_usage_usec
+		= n_periods * quota_usec + MIN(remainder_usec, quota_usec);

-	if (values_close(usage_usec, expected_usage_usec, 95))
+	if (!values_close(usage_usec, expected_usage_usec, 10))
 		goto cleanup;

 	ret = KSFT_PASS;
@@ -698,10 +710,16 @@ static int test_cpucg_max(const char *root)
 static int test_cpucg_max_nested(const char *root)
 {
 	int ret = KSFT_FAIL;
-	long usage_usec, user_usec;
-	long usage_seconds = 1;
-	long expected_usage_usec = usage_seconds * USEC_PER_SEC;
+	long quota_usec = 1000;
+	long default_period_usec = 100000; /* cpu.max's default period */
+	long duration_seconds = 1;
+
+	long duration_usec = duration_seconds * USEC_PER_SEC;
+	long usage_usec, n_periods, remainder_usec, expected_usage_usec;
 	char *parent, *child;
+	char quota_buf[32];
+
+	snprintf(quota_buf, sizeof(quota_buf), "%ld", quota_usec);

 	parent = cg_name(root, "cpucg_parent");
 	child = cg_name(parent, "cpucg_child");
@@ -717,13 +735,13 @@ static int test_cpucg_max_nested(const char *root)
 	if (cg_create(child))
 		goto cleanup;

-	if (cg_write(parent, "cpu.max", "1000"))
+	if (cg_write(parent, "cpu.max", quota_buf))
 		goto cleanup;

 	struct cpu_hog_func_param param = {
 		.nprocs = 1,
 		.ts = {
-			.tv_sec = usage_seconds,
+			.tv_sec = duration_seconds,
 			.tv_nsec = 0,
 		},
 		.clock_type = CPU_HOG_CLOCK_WALL,
@@ -732,14 +750,19 @@ static int test_cpucg_max_nested(const char *root)
 		goto cleanup;

 	usage_usec = cg_read_key_long(child, "cpu.stat", "usage_usec");
-	user_usec = cg_read_key_long(child, "cpu.stat", "user_usec");
-	if (user_usec <= 0)
+	if (usage_usec <= 0)
 		goto cleanup;

-	if (user_usec >= expected_usage_usec)
-		goto cleanup;
+	/*
+	 * The following calculation applies only since
+	 * the cpu hog is set to run as per wall-clock time
+	 */
+	n_periods = duration_usec / default_period_usec;
+	remainder_usec = duration_usec - n_periods * default_period_usec;
+	expected_usage_usec
+		= n_periods * quota_usec + MIN(remainder_usec, quota_usec);

-	if (values_close(usage_usec, expected_usage_usec, 95))
+	if (!values_close(usage_usec, expected_usage_usec, 10))
 		goto cleanup;

 	ret = KSFT_PASS;