On Tue 28-11-23 08:53:56, Nhat Pham wrote:
On Tue, Nov 28, 2023 at 1:38 AM Michal Hocko mhocko@suse.com wrote:
On Mon 27-11-23 11:36:59, Nhat Pham wrote:
The new zswap writeback scheme requires an online-only memcg hierarchy traversal. Add a new parameter to mem_cgroup_iter() to check for onlineness before returning.
Why is this needed?
For context, in patch 3 of this series, Domenico and I are adding cgroup-aware LRU to zswap, so that we can perform workload-specific zswap writeback. When the reclaim happens due to the global zswap limit being hit, a cgroup is selected by the mem_cgroup_iter(), and the last one selected is saved in the zswap pool (so that the iteration can follow from there next time the limit is hit).
However, one problem with this scheme is we will be pinning the reference to that saved memcg until the next global reclaim attempt, which could prevent it from being killed for quite some time after it has been offlined. Johannes, Yosry, and I discussed a couple of approaches for a while, and decided to add a callback that would release the reference held by the zswap pool when the memcg is offlined, and the zswap pool will obtain the reference to the next online memcg in the traversal (or at least one that has not had the zswap-memcg-release-callback run on it yet).
This should be a part of the changelog along with an explanation why this cannot be handled on the caller level? You have a pin on the memcg, you can check it is online and scratch it if not, right? Why do we need to make a rather convoluted iterator interface more complex when most users simply do not require that?