On Thu, Jan 10, 2013 at 01:20:30AM +0400, Glauber Costa wrote: [...]
Given the above, I believe that ideally we should use this pressure mechanism in memcg, replacing the current memcg notification mechanism.
Just a quick question: why would we need to place it into memcg when we don't need any of the memcg stuff for it? I see no benefits: not design-wise, not implementation-wise, not anything-wise. :)
We can use mempressure w/o memcg, and even then it can (or should :) be useful (for cpuset, for example).
More or less like how timer expiration works: you could still write numbers for compatibility, but those numbers would be internally mapped into the levels Anton is proposing, which make *way* more sense.
If that is not possible, they should coexist as a "notification" and a "pressure" mechanism inside memcg.
The main argument against it centered around cpusets also being able to participate in the play. I haven't yet understood how that would take place. In particular, I saw no mention of cpusets in the patches.
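To make the "numbers mapped into levels" idea a bit more concrete, here is a rough userspace sketch. Nothing in it comes from the patches: the level names, the 60% and 95% cut-offs, and the function toy_threshold_to_level() are all made up for illustration, and it assumes the legacy number is a usage threshold in bytes that can be compared against the group's limit.

```c
/* Hypothetical level names; not taken from the actual patches. */
enum toy_level { TOY_LEVEL_LOW, TOY_LEVEL_MEDIUM, TOY_LEVEL_OOM };

/*
 * Map a legacy numeric threshold (bytes) onto a coarse pressure level
 * by looking at how close it sits to the limit. The 0.60 and 0.95
 * cut-offs are illustrative assumptions only.
 */
static enum toy_level toy_threshold_to_level(unsigned long long threshold,
					     unsigned long long limit)
{
	double ratio;

	/* No usable limit, or a threshold at/above it: treat as the top level. */
	if (limit == 0 || threshold >= limit)
		return TOY_LEVEL_OOM;

	ratio = (double)threshold / (double)limit;
	if (ratio >= 0.95)
		return TOY_LEVEL_OOM;
	if (ratio >= 0.60)
		return TOY_LEVEL_MEDIUM;
	return TOY_LEVEL_LOW;
}
```

With a 1000-byte limit, a threshold of 500 would land in the low level and 700 in the medium one under these assumed cut-offs.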
I didn't test it, but as I see it, once a process is in a specific cpuset, the task can only use the specific allowed zones for reclaim/alloc; see the various checks like this in vmscan:
	if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
		continue;
So vmscan simply won't call vmpressure() if the zone is not allowed (and so we won't account for pressure from that zone).
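The argument above can be sketched in plain userspace C. This is only a toy model, not kernel code: struct toy_zone, struct toy_pressure, toy_zone_allowed(), toy_vmpressure() and toy_shrink_zones() are hypothetical stand-ins, and the cpuset is reduced to a bitmask of allowed nodes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
struct toy_zone {
	int node;			/* node the zone lives on (must be < bits in a long) */
	unsigned long scanned;		/* pages scanned in this pass */
	unsigned long reclaimed;	/* pages actually reclaimed */
};

struct toy_pressure {
	unsigned long scanned;
	unsigned long reclaimed;
};

/* Stand-in for the cpuset check: bit N set means node N is allowed. */
static bool toy_zone_allowed(const struct toy_zone *zone,
			     unsigned long mems_allowed)
{
	return (mems_allowed >> zone->node) & 1UL;
}

/* Stand-in for vmpressure(): just accumulate the counters. */
static void toy_vmpressure(struct toy_pressure *p, unsigned long scanned,
			   unsigned long reclaimed)
{
	p->scanned += scanned;
	p->reclaimed += reclaimed;
}

/* Walk the zones; disallowed zones are skipped before any accounting. */
static void toy_shrink_zones(const struct toy_zone *zones, size_t nr,
			     unsigned long mems_allowed,
			     struct toy_pressure *p)
{
	for (size_t i = 0; i < nr; i++) {
		if (!toy_zone_allowed(&zones[i], mems_allowed))
			continue;
		toy_vmpressure(p, zones[i].scanned, zones[i].reclaimed);
	}
}
```

Since the skip happens before the accounting call, a task restricted to nodes 0 and 1 never contributes the counters of a zone on node 2, which is the whole point.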
Thanks, Anton