Hi all,
This patch series introduces improvements to the cgroup selftests by adding a helper function to better handle asynchronous updates in cgroup statistics. These changes are especially useful for cgroup stats such as memory.stat and cgroup.stat, whose values can lag behind the underlying state (e.g., due to RCU callbacks and asynchronous rstat flushing).
v4:
- Patch 1/3: Adds the `cg_read_key_long_poll()` helper to poll cgroup keys with retries and configurable intervals.
- Patch 2/3: Updates `test_memcg_sock()` to use `cg_read_key_long_poll()` for handling delayed "sock" counter updates in memory.stat.
- Patch 3/3: Replaces `sleep` and retry logic in `test_kmem_dead_cgroups()` with `cg_read_key_long_poll()` for waiting on `nr_dying_descendants`.
v3: - Move `MEMCG_SOCKSTAT_WAIT_*` defines after the `#include` block as suggested.
v2:
- Clarify the rationale for the 3s timeout and mention the periodic rstat flush interval (FLUSH_TIME = 2*HZ) in the comment.
- Replace the hardcoded retry count and wait interval with macros to avoid magic numbers and make the timeout calculation explicit.
Thanks to Michal Koutný for the suggestion to introduce the polling helper, and to Lance Yang for the review.
Guopeng Zhang (3):
  selftests: cgroup: Add cg_read_key_long_poll() to poll a cgroup key with retries
  selftests: cgroup: make test_memcg_sock robust against delayed sock stats
  selftests: cgroup: Replace sleep with cg_read_key_long_poll() for waiting on nr_dying_descendants
 .../selftests/cgroup/lib/cgroup_util.c     | 21 +++++++++++++
 .../cgroup/lib/include/cgroup_util.h       |  5 +++
 tools/testing/selftests/cgroup/test_kmem.c | 31 ++++++++----------
 .../selftests/cgroup/test_memcontrol.c     | 20 +++++++++++-
 4 files changed, 58 insertions(+), 19 deletions(-)
Introduce a new helper function `cg_read_key_long_poll()` in cgroup_util.h. This function polls the specified key in a cgroup file until it matches the expected value or the retry limit is reached, with configurable wait intervals between retries.
This helper is particularly useful for handling asynchronously updated cgroup statistics (e.g., memory.stat), where immediate reads may observe stale values, especially on busy systems. It allows tests and other utilities to handle such cases more flexibly.
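For illustration, a minimal usage sketch (not part of this patch; the control file, key, and retry count below are arbitrary examples):

	long sock;

	/* Poll the "sock " key of memory.stat, up to 20 * 100ms = 2s in total. */
	sock = cg_read_key_long_poll(cgroup, "memory.stat", "sock ", 0,
				     20, DEFAULT_WAIT_INTERVAL_US);
	if (sock)	/* still non-zero, or a negative read error */
		goto cleanup;

The helper returns the last value read, so callers can distinguish a read error (negative) from a counter that never reached the expected value.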
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Suggested-by: Michal Koutný <mkoutny@suse.com>
---
 .../selftests/cgroup/lib/cgroup_util.c | 21 +++++++++++++++++++
 .../cgroup/lib/include/cgroup_util.h   |  5 +++++
 2 files changed, 26 insertions(+)
diff --git a/tools/testing/selftests/cgroup/lib/cgroup_util.c b/tools/testing/selftests/cgroup/lib/cgroup_util.c
index 44c52f620fda..953835f5da66 100644
--- a/tools/testing/selftests/cgroup/lib/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/lib/cgroup_util.c
@@ -168,6 +168,27 @@ long cg_read_key_long(const char *cgroup, const char *control, const char *key)
 	return atol(ptr + strlen(key));
 }
 
+long cg_read_key_long_poll(const char *cgroup, const char *control,
+			   const char *key, long expected, int retries,
+			   useconds_t wait_interval_us)
+{
+	long val = -1;
+	int i;
+
+	for (i = 0; i < retries; i++) {
+		val = cg_read_key_long(cgroup, control, key);
+		if (val < 0)
+			return val;
+
+		if (val == expected)
+			break;
+
+		usleep(wait_interval_us);
+	}
+
+	return val;
+}
+
 long cg_read_lc(const char *cgroup, const char *control)
 {
 	char buf[PAGE_SIZE];
diff --git a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
index 7ab2824ed7b5..772f5383ddc7 100644
--- a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
+++ b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
@@ -17,6 +17,8 @@
 #define CG_NAMED_NAME "selftest"
 #define CG_PATH_FORMAT (!cg_test_v1_named ? "0::%s" : (":name=" CG_NAMED_NAME ":%s"))
 
+#define DEFAULT_WAIT_INTERVAL_US (100 * 1000) /* 100 ms */
+
 /*
  * Checks if two given values differ by less than err% of their sum.
  */
@@ -64,6 +66,9 @@ extern int cg_read_strstr(const char *cgroup, const char *control,
 extern long cg_read_long(const char *cgroup, const char *control);
 extern long cg_read_long_fd(int fd);
 long cg_read_key_long(const char *cgroup, const char *control, const char *key);
+long cg_read_key_long_poll(const char *cgroup, const char *control,
+			   const char *key, long expected, int retries,
+			   useconds_t wait_interval_us);
 extern long cg_read_lc(const char *cgroup, const char *control);
 extern int cg_write(const char *cgroup, const char *control, char *buf);
 extern int cg_open(const char *cgroup, const char *control, int flags);
test_memcg_sock() currently requires that memory.stat's "sock " counter is exactly zero immediately after the TCP server exits. On a busy system this assumption is too strict:
- Socket memory may be freed with a small delay (e.g. RCU callbacks).
- memcg statistics are updated asynchronously via the rstat flushing worker, so the "sock " value in memory.stat can stay non-zero for a short period of time even after all socket memory has been uncharged.
As a result, test_memcg_sock() can intermittently fail even though socket memory accounting is working correctly.
Make the test more robust by polling memory.stat for the "sock " counter and allowing it some time to drop to zero instead of checking it only once. The timeout is set to 3 seconds to cover the periodic rstat flush interval (FLUSH_TIME = 2*HZ by default) plus some scheduling slack. If the counter does not become zero within the timeout, the test still fails as before.
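With the values used in this patch, that works out to MEMCG_SOCKSTAT_WAIT_RETRIES * DEFAULT_WAIT_INTERVAL_US = 30 * 100ms = 3s of polling in the worst case.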
On my test system, running test_memcontrol 50 times produced:
- Before this patch: 6/50 runs passed.
- After this patch: 50/50 runs passed.
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Suggested-by: Lance Yang <lance.yang@linux.dev>
---
 .../selftests/cgroup/test_memcontrol.c | 20 ++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 4e1647568c5b..dda12e5c6457 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -21,6 +21,8 @@
 #include "kselftest.h"
 #include "cgroup_util.h"
 
+#define MEMCG_SOCKSTAT_WAIT_RETRIES 30 /* 3s total */
+
 static bool has_localevents;
 static bool has_recursiveprot;
 
@@ -1384,6 +1386,7 @@ static int test_memcg_sock(const char *root)
 	int bind_retries = 5, ret = KSFT_FAIL, pid, err;
 	unsigned short port;
 	char *memcg;
+	long sock_post = -1;
 
 	memcg = cg_name(root, "memcg_test");
 	if (!memcg)
@@ -1432,7 +1435,22 @@ static int test_memcg_sock(const char *root)
 	if (cg_read_long(memcg, "memory.current") < 0)
 		goto cleanup;
 
-	if (cg_read_key_long(memcg, "memory.stat", "sock "))
+	/*
+	 * memory.stat is updated asynchronously via the memcg rstat
+	 * flushing worker, which runs periodically (every 2 seconds,
+	 * see FLUSH_TIME). On a busy system, the "sock " counter may
+	 * stay non-zero for a short period of time after the TCP
+	 * connection is closed and all socket memory has been
+	 * uncharged.
+	 *
+	 * Poll memory.stat for up to 3 seconds (~FLUSH_TIME plus some
+	 * scheduling slack) and require that the "sock " counter
+	 * eventually drops to zero.
+	 */
+	sock_post = cg_read_key_long_poll(memcg, "memory.stat", "sock ", 0,
+					  MEMCG_SOCKSTAT_WAIT_RETRIES,
+					  DEFAULT_WAIT_INTERVAL_US);
+	if (sock_post)
 		goto cleanup;
 
 	ret = KSFT_PASS;
Replace the manual sleep-and-retry logic in test_kmem_dead_cgroups() with the new helper `cg_read_key_long_poll()`. This improves the robustness of the test by polling the "nr_dying_descendants" counter in `cgroup.stat` until it reaches 0 or the timeout is exceeded.
Additionally, increase the retry timeout from 5 seconds to 8 seconds based on testing results:
- With the 5-second timeout: 4/20 runs passed.
- With the 8-second timeout: 20/20 runs passed.
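With DEFAULT_WAIT_INTERVAL_US at 100ms, KMEM_DEAD_WAIT_RETRIES = 80 yields 80 * 100ms = 8s of total polling.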
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
---
 tools/testing/selftests/cgroup/test_kmem.c | 31 +++++++++------------
 1 file changed, 13 insertions(+), 18 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
index ca38525484e3..299d8e332e42 100644
--- a/tools/testing/selftests/cgroup/test_kmem.c
+++ b/tools/testing/selftests/cgroup/test_kmem.c
@@ -26,6 +26,7 @@
  */
 #define MAX_VMSTAT_ERROR (4096 * 64 * get_nprocs())
 
+#define KMEM_DEAD_WAIT_RETRIES 80 /* 8s total */
 
 static int alloc_dcache(const char *cgroup, void *arg)
 {
@@ -306,9 +307,7 @@ static int test_kmem_dead_cgroups(const char *root)
 {
 	int ret = KSFT_FAIL;
 	char *parent;
-	long dead;
-	int i;
-	int max_time = 20;
+	long dead = -1;
 
 	parent = cg_name(root, "kmem_dead_cgroups_test");
 	if (!parent)
@@ -323,21 +322,17 @@ static int test_kmem_dead_cgroups(const char *root)
 	if (cg_run_in_subcgroups(parent, alloc_dcache, (void *)100, 30))
 		goto cleanup;
 
-	for (i = 0; i < max_time; i++) {
-		dead = cg_read_key_long(parent, "cgroup.stat",
-					"nr_dying_descendants ");
-		if (dead == 0) {
-			ret = KSFT_PASS;
-			break;
-		}
-		/*
-		 * Reclaiming cgroups might take some time,
-		 * let's wait a bit and repeat.
-		 */
-		sleep(1);
-		if (i > 5)
-			printf("Waiting time longer than 5s; wait: %ds (dead: %ld)\n", i, dead);
-	}
+	/*
+	 * Reclaiming cgroups may take some time,
+	 * so let's wait a bit and retry.
+	 */
+	dead = cg_read_key_long_poll(parent, "cgroup.stat",
+			"nr_dying_descendants ", 0, KMEM_DEAD_WAIT_RETRIES,
+			DEFAULT_WAIT_INTERVAL_US);
+	if (dead)
+		goto cleanup;
+
+	ret = KSFT_PASS;
 
 cleanup:
	cg_destroy(parent);
On 2025/11/24 20:38, Guopeng Zhang wrote:
test_memcg_sock() currently requires that memory.stat's "sock " counter is exactly zero immediately after the TCP server exits. On a busy system this assumption is too strict:
- Socket memory may be freed with a small delay (e.g. RCU callbacks).
- memcg statistics are updated asynchronously via the rstat flushing worker, so the "sock " value in memory.stat can stay non-zero for a short period of time even after all socket memory has been uncharged.
As a result, test_memcg_sock() can intermittently fail even though socket memory accounting is working correctly.
Make the test more robust by polling memory.stat for the "sock " counter and allowing it some time to drop to zero instead of checking it only once. The timeout is set to 3 seconds to cover the periodic rstat flush interval (FLUSH_TIME = 2*HZ by default) plus some scheduling slack. If the counter does not become zero within the timeout, the test still fails as before.
On my test system, running test_memcontrol 50 times produced:
- Before this patch: 6/50 runs passed.
- After this patch: 50/50 runs passed.
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Suggested-by: Lance Yang <lance.yang@linux.dev>
 .../selftests/cgroup/test_memcontrol.c | 20 ++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 4e1647568c5b..dda12e5c6457 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -21,6 +21,8 @@
 #include "kselftest.h"
This patch fails to apply to mm-new ...
Hmm, it expects #include "kselftest.h" here, but the tree uses #include "../kselftest.h".
Which is odd, as that line hasn't been touched in years ...
On 11/27/25 18:55, Lance Yang wrote:
On 2025/11/24 20:38, Guopeng Zhang wrote:
test_memcg_sock() currently requires that memory.stat's "sock " counter is exactly zero immediately after the TCP server exits. On a busy system this assumption is too strict:
- Socket memory may be freed with a small delay (e.g. RCU callbacks).
- memcg statistics are updated asynchronously via the rstat flushing worker, so the "sock " value in memory.stat can stay non-zero for a short period of time even after all socket memory has been uncharged.
As a result, test_memcg_sock() can intermittently fail even though socket memory accounting is working correctly.
Make the test more robust by polling memory.stat for the "sock " counter and allowing it some time to drop to zero instead of checking it only once. The timeout is set to 3 seconds to cover the periodic rstat flush interval (FLUSH_TIME = 2*HZ by default) plus some scheduling slack. If the counter does not become zero within the timeout, the test still fails as before.
On my test system, running test_memcontrol 50 times produced:
- Before this patch: 6/50 runs passed.
- After this patch: 50/50 runs passed.
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Suggested-by: Lance Yang <lance.yang@linux.dev>
 .../selftests/cgroup/test_memcontrol.c | 20 ++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 4e1647568c5b..dda12e5c6457 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -21,6 +21,8 @@
 #include "kselftest.h"
This patch fails to apply to mm-new ...
Hmm, it expects #include "kselftest.h" here, but the tree uses #include "../kselftest.h".
Which is odd, as that line hasn't been touched in years ...
Hi, Lance,
Thanks for your review.
When I prepared this patch I was working on linux-next, where tools/testing/selftests/cgroup/test_memcontrol.c already uses:
#include "kselftest.h"
I just checked, and this change comes from the following commit:
1aaedc385b9b278dcf91f4e9d0c3e1a078804ff1 https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?...
So the patch applies cleanly on top of the latest linux-next, but not on mm-new, which still has `#include "../kselftest.h"`.
Thanks, Guopeng
On 2025/11/27 19:18, Guopeng Zhang wrote:
On 11/27/25 18:55, Lance Yang wrote:
On 2025/11/24 20:38, Guopeng Zhang wrote:
test_memcg_sock() currently requires that memory.stat's "sock " counter is exactly zero immediately after the TCP server exits. On a busy system this assumption is too strict:
- Socket memory may be freed with a small delay (e.g. RCU callbacks).
- memcg statistics are updated asynchronously via the rstat flushing worker, so the "sock " value in memory.stat can stay non-zero for a short period of time even after all socket memory has been uncharged.
As a result, test_memcg_sock() can intermittently fail even though socket memory accounting is working correctly.
Make the test more robust by polling memory.stat for the "sock " counter and allowing it some time to drop to zero instead of checking it only once. The timeout is set to 3 seconds to cover the periodic rstat flush interval (FLUSH_TIME = 2*HZ by default) plus some scheduling slack. If the counter does not become zero within the timeout, the test still fails as before.
On my test system, running test_memcontrol 50 times produced:
- Before this patch: 6/50 runs passed.
- After this patch: 50/50 runs passed.
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Suggested-by: Lance Yang <lance.yang@linux.dev>
 .../selftests/cgroup/test_memcontrol.c | 20 ++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 4e1647568c5b..dda12e5c6457 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -21,6 +21,8 @@
 #include "kselftest.h"
This patch fails to apply to mm-new ...
Hmm, it expects #include "kselftest.h" here, but the tree uses #include "../kselftest.h".
Which is odd, as that line hasn't been touched in years ...
Hi, Lance,
Thanks for your review.
When I prepared this patch I was working on linux-next, where tools/testing/selftests/cgroup/test_memcontrol.c already uses:
#include "kselftest.h"I just checked, and this change comes from the following commit:
1aaedc385b9b278dcf91f4e9d0c3e1a078804ff1
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20251127&id=1aaedc385b9b278dcf91f4e9d0c3e1a078804ff1

So the patch applies cleanly on top of the latest linux-next, but not on mm-new, which still has `#include "../kselftest.h"`.
Ahh, I see, thanks!
On Mon, Nov 24, 2025 at 08:38:14PM +0800, Guopeng Zhang wrote:
Introduce a new helper function `cg_read_key_long_poll()` in cgroup_util.h. This function polls the specified key in a cgroup file until it matches the expected value or the retry limit is reached, with configurable wait intervals between retries.
This helper is particularly useful for handling asynchronously updated cgroup statistics (e.g., memory.stat), where immediate reads may observe stale values, especially on busy systems. It allows tests and other utilities to handle such cases more flexibly.
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Suggested-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
On Mon, Nov 24, 2025 at 08:38:16PM +0800, Guopeng Zhang wrote:
Replace the manual sleep-and-retry logic in test_kmem_dead_cgroups() with the new helper `cg_read_key_long_poll()`. This improves the robustness of the test by polling the "nr_dying_descendants" counter in `cgroup.stat` until it reaches 0 or the timeout is exceeded.
Additionally, increase the retry timeout from 5 seconds to 8 seconds based on testing results:
Why 8 seconds? What does it depend on? For memcg stats I see the 3 seconds driven from the 2 sec periodic rstat flush. Mainly how can we make this more future proof?
- With 5-second timeout: 4/20 runs passed.
- With 8-second timeout: 20/20 runs passed.
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Anyways, just add a sentence in the commit message on the reasoning behind 8 seconds and a comment in code as well. With that, you can add:
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
On Mon, Nov 24, 2025 at 08:38:15PM +0800, Guopeng Zhang wrote:
test_memcg_sock() currently requires that memory.stat's "sock " counter is exactly zero immediately after the TCP server exits. On a busy system this assumption is too strict:
- Socket memory may be freed with a small delay (e.g. RCU callbacks).
- memcg statistics are updated asynchronously via the rstat flushing worker, so the "sock " value in memory.stat can stay non-zero for a short period of time even after all socket memory has been uncharged.
As a result, test_memcg_sock() can intermittently fail even though socket memory accounting is working correctly.
Make the test more robust by polling memory.stat for the "sock " counter and allowing it some time to drop to zero instead of checking it only once. The timeout is set to 3 seconds to cover the periodic rstat flush interval (FLUSH_TIME = 2*HZ by default) plus some scheduling slack. If the counter does not become zero within the timeout, the test still fails as before.
On my test system, running test_memcontrol 50 times produced:
- Before this patch: 6/50 runs passed.
- After this patch: 50/50 runs passed.
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Suggested-by: Lance Yang <lance.yang@linux.dev>
 .../selftests/cgroup/test_memcontrol.c | 20 ++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 4e1647568c5b..dda12e5c6457 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -21,6 +21,8 @@
 #include "kselftest.h"
 #include "cgroup_util.h"
 
+#define MEMCG_SOCKSTAT_WAIT_RETRIES 30 /* 3s total */
No need for the comment at the end, as it will become stale when someone changes DEFAULT_WAIT_INTERVAL_US in the future.
Anyways it's a nit.
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
On 12/3/25 07:12, Shakeel Butt wrote:
On Mon, Nov 24, 2025 at 08:38:15PM +0800, Guopeng Zhang wrote:
test_memcg_sock() currently requires that memory.stat's "sock " counter is exactly zero immediately after the TCP server exits. On a busy system this assumption is too strict:
- Socket memory may be freed with a small delay (e.g. RCU callbacks).
- memcg statistics are updated asynchronously via the rstat flushing worker, so the "sock " value in memory.stat can stay non-zero for a short period of time even after all socket memory has been uncharged.
As a result, test_memcg_sock() can intermittently fail even though socket memory accounting is working correctly.
Make the test more robust by polling memory.stat for the "sock " counter and allowing it some time to drop to zero instead of checking it only once. The timeout is set to 3 seconds to cover the periodic rstat flush interval (FLUSH_TIME = 2*HZ by default) plus some scheduling slack. If the counter does not become zero within the timeout, the test still fails as before.
On my test system, running test_memcontrol 50 times produced:
- Before this patch: 6/50 runs passed.
- After this patch: 50/50 runs passed.
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Suggested-by: Lance Yang <lance.yang@linux.dev>
 .../selftests/cgroup/test_memcontrol.c | 20 ++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 4e1647568c5b..dda12e5c6457 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -21,6 +21,8 @@
 #include "kselftest.h"
 #include "cgroup_util.h"
 
+#define MEMCG_SOCKSTAT_WAIT_RETRIES 30 /* 3s total */
No need for the comment at the end, as it will become stale when someone changes DEFAULT_WAIT_INTERVAL_US in the future.
Anyways it's a nit.
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Hi Shakeel,
Thanks for the review and for the Reviewed-by.
Good point about the comment. I’ll drop the "/* 3s total */" part in the next version so it does not become stale if DEFAULT_WAIT_INTERVAL_US changes.
Thanks, Guopeng
On 12/3/25 07:18, Shakeel Butt wrote:
On Mon, Nov 24, 2025 at 08:38:16PM +0800, Guopeng Zhang wrote:
Replace the manual sleep-and-retry logic in test_kmem_dead_cgroups() with the new helper `cg_read_key_long_poll()`. This improves the robustness of the test by polling the "nr_dying_descendants" counter in `cgroup.stat` until it reaches 0 or the timeout is exceeded.
Additionally, increase the retry timeout from 5 seconds to 8 seconds based on testing results:
Why 8 seconds? What does it depend on? For memcg stats I see the 3 seconds driven from the 2 sec periodic rstat flush. Mainly how can we make this more future proof?
Hi Shakeel,
Thanks a lot for the review and for the guidance.
The 8s timeout was chosen based on stress testing of test_kmem_dead_cgroups() on my setup: 5s was not always sufficient under load, while 8s consistently covered the reclaim of dying descendants. It is intended as a generous upper bound for the asynchronous reclaim and is not tied to any specific kernel constant. If the reclaim behavior changes significantly in the future, this timeout can be adjusted along with the test.
- With 5-second timeout: 4/20 runs passed.
- With 8-second timeout: 20/20 runs passed.
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Anyways, just add a sentence in the commit message on the reasoning behind 8 seconds and a comment in code as well. With that, you can add:
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
I’ll add a short sentence to the commit message and a comment next to KMEM_DEAD_WAIT_RETRIES explaining this rationale, and will include your:
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
in the next version.
Thanks, Guopeng
On Mon, Nov 24, 2025 at 08:38:13PM +0800, Guopeng Zhang wrote:
Hi all,
This patch series introduces improvements to the cgroup selftests by adding a helper function to better handle asynchronous updates in cgroup statistics. These changes are especially useful for cgroup stats such as memory.stat and cgroup.stat, whose values can lag behind the underlying state (e.g., due to RCU callbacks and asynchronous rstat flushing).
v4:
- Patch 1/3: Adds the `cg_read_key_long_poll()` helper to poll cgroup keys with retries and configurable intervals.
- Patch 2/3: Updates `test_memcg_sock()` to use `cg_read_key_long_poll()` for handling delayed "sock" counter updates in memory.stat.
- Patch 3/3: Replaces `sleep` and retry logic in `test_kmem_dead_cgroups()` with `cg_read_key_long_poll()` for waiting on `nr_dying_descendants`.
Michal, if this looks good to you, I'll apply after the merge window.
Thanks.