On 12/3/25 07:18, Shakeel Butt wrote:
On Mon, Nov 24, 2025 at 08:38:16PM +0800, Guopeng Zhang wrote:
Replaced the manual sleep and retry logic in test_kmem_dead_cgroups() with the new helper `cg_read_key_long_poll()`. This change improves the robustness of the test by polling the "nr_dying_descendants" counter in `cgroup.stat` until it reaches 0 or the timeout is exceeded.
Additionally, increased the retry timeout to 8 seconds (from 5 seconds) based on testing results:
Why 8 seconds? What does it depend on? For memcg stats, I see the 3 seconds is derived from the 2-second periodic rstat flush. Mainly, how can we make this more future-proof?
Hi Shakeel,
Thanks a lot for the review and for the guidance.
The 8s timeout was chosen based on stress testing of test_kmem_dead_cgroups() on my setup: 5s was not always sufficient under load, while 8s consistently covered the reclaim of dying descendants. It is intended as a generous upper bound for the asynchronous reclaim and is not tied to any specific kernel constant. If the reclaim behavior changes significantly in the future, this timeout can be adjusted along with the test.
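To make the rationale concrete, here is a minimal, self-contained sketch of the poll-until-expected pattern and of how the timeout falls out of retries times interval. It is not the selftest helper itself: `read_key_long()`, `poll_key_long()`, and both `DEAD_WAIT_*` constants are hypothetical names and values chosen for illustration, and the parser assumes a flat "key value" file such as `cgroup.stat`.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Illustrative values only (assumptions, not taken from the patch):
 * 80 retries * 100 ms = 8 s. This is a generous upper bound for the
 * asynchronous reclaim of dying descendants, not a kernel constant.
 */
#define DEAD_WAIT_RETRIES	80
#define DEAD_WAIT_INTERVAL_US	100000

/*
 * Hypothetical helper: return the value stored under "key" in a flat
 * "key value" file, or -1 if the file or the key cannot be read.
 */
static long read_key_long(const char *path, const char *key)
{
	char name[64];
	long val, ret = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;

	while (fscanf(f, "%63s %ld", name, &val) == 2) {
		if (!strcmp(name, key)) {
			ret = val;
			break;
		}
	}
	fclose(f);
	return ret;
}

/*
 * Hypothetical helper: re-read "key" until it equals "expected" or the
 * retries run out. Returns the last value read, so the caller compares
 * it against "expected" to tell success from timeout.
 */
static long poll_key_long(const char *path, const char *key, long expected,
			  int retries, unsigned int wait_us)
{
	long val = -1;
	int i;

	for (i = 0; i < retries; i++) {
		val = read_key_long(path, key);
		if (val == expected)
			break;
		if (i + 1 < retries)
			usleep(wait_us);
	}
	return val;
}
```

A caller would treat `poll_key_long(path, "nr_dying_descendants", 0, DEAD_WAIT_RETRIES, DEAD_WAIT_INTERVAL_US) != 0` as a timeout. Since the loop exits as soon as the counter reaches 0, a generous bound only costs time in the failing case.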
- With 5-second timeout: 4/20 runs passed.
- With 8-second timeout: 20/20 runs passed.
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Anyways, just add a sentence in the commit message on the reasoning behind 8 seconds and a comment in code as well. With that, you can add:
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
I’ll add a short sentence to the commit message and a comment next to KMEM_DEAD_WAIT_RETRIES explaining this rationale, and will include your:
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
in the next version.
Thanks,
Guopeng