Re: [PATCH v5 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()

11 Apr 2025

On Mon, Apr 07, 2025 at 12:23:16PM -0400, Waiman Long <longman@redhat.com> wrote:
...
Child   Actual usage    Expected usage    %err

1       16990208         22020096      -12.9%
1       17252352         22020096      -12.1%
0       37699584         30408704      +10.7%
1       14368768         22020096      -21.0%
1       16871424         22020096      -13.2%

The current 10% error tolerance might have been right at the time
test_memcontrol.c was first introduced in the v4.18 kernel, but memory
reclaim has certainly evolved quite a bit since then, which may result
in a bit more run-to-run variation than previously expected.

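For reference, the %err column in the quoted table appears to be computed relative to the sum of actual and expected usage, not relative to the expected value alone: (16990208 - 22020096) / (16990208 + 22020096) gives -12.9%, matching the first row. That is the same shape as the values_close() helper in the cgroup selftests, as far as I recall it. A minimal sketch under that assumption (pct_err() is a hypothetical name used only for illustration):

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Assumed form of the selftest helper: the tolerance is a percentage
 * of (a + b), not of the expected value alone.
 */
static inline int values_close(long a, long b, int err)
{
	return labs(a - b) <= (a + b) / 100 * err;
}

/*
 * Hypothetical helper: signed error in percent, relative to the sum,
 * which is how the %err column in the table seems to be derived.
 */
static inline double pct_err(long actual, long expected)
{
	assert(actual + expected > 0);
	return 100.0 * (double)(actual - expected) /
	       (double)(actual + expected);
}
```

With these definitions, values_close(16990208, 22020096, 10) is false, which is consistent with the first row failing a 10% check.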
I like Roman's suggestion of nr_cpus dependence but I assume your
variations were still on the same system, weren't they?

Is it fair to say that reclaim is chaotic [1]? I wonder what may cause
variations between separate runs of the test.

Would it help to `echo 3 >drop_caches` before each run to have more
stable initial conditions? (Not sure if it's OK in selftests.)
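
If dropping caches from within the test were acceptable, it could look roughly like the sketch below. The helper names are hypothetical, the write needs root, and I'm assuming the usual /proc/sys/vm/drop_caches location; whether this belongs in a selftest at all is still the open question.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a short string to an existing file. */
static inline int write_string(const char *path, const char *val)
{
	ssize_t len = (ssize_t)strlen(val);
	ssize_t ret;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	ret = write(fd, val, (size_t)len);
	close(fd);

	return ret == len ? 0 : -1;
}

/*
 * Hypothetical helper: flush dirty data first, then ask the kernel to
 * drop clean page cache plus dentries and inodes ("3").
 */
static inline int drop_caches(void)
{
	sync();
	return write_string("/proc/sys/vm/drop_caches", "3");
}
```

The test would call drop_caches() once before setting up the cgroup hierarchy and treat a failure (e.g. not running as root) as a reason to continue without it.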

Or sleep 0.5s to settle rstat flushing? No, page_counters don't suffer
from that, but they do stock MEMCG_CHARGE_BATCH in percpu stocks.
So maybe drain the stock so that counters are precise after the test?
(Either by executing a dummy memcg on each CPU or via some debugging
API.)
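
To get a feel for how much the percpu stocks could skew the counters, here is a rough upper bound, assuming each CPU holds at most one batch of pre-charged pages. The batch size of 64 pages is my assumption about the current MEMCG_CHARGE_BATCH value, and the function names are made up for illustration.

```c
#include <unistd.h>

/* Assumption: MEMCG_CHARGE_BATCH is 64 pages; check memcontrol.h. */
#define ASSUMED_CHARGE_BATCH 64L

/*
 * Hypothetical helper: worst-case bytes that may sit pre-charged in
 * percpu stocks if every CPU holds one full batch.
 */
static inline long stock_slack_bytes(long nr_cpus, long page_size, long batch)
{
	return nr_cpus * page_size * batch;
}

/* Same bound, evaluated for the machine the test is running on. */
static inline long stock_slack_bytes_here(void)
{
	return stock_slack_bytes(sysconf(_SC_NPROCESSORS_ONLN),
				 sysconf(_SC_PAGESIZE),
				 ASSUMED_CHARGE_BATCH);
}
```

For example, 8 CPUs with 4 KiB pages would give 8 * 4096 * 64 = 2 MiB of possible slack, which is also where an nr_cpus-dependent tolerance would come from.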

Michal

[1] https://en.wikipedia.org/wiki/Chaos_theory#Chaotic_dynamics
