On Sat, 17 Nov 2012, Glauber Costa wrote:
I'm wondering if we should have more than three different levels.
In the case I outlined below, for backwards compatibility. What I actually mean is that memcg *currently* allows arbitrary notifications. One way to merge those, while moving to a saner 3-point notification, is to still allow the old writes and fit them in the closest bucket.
Yeah, but I'm wondering why three is the right answer.
Umm, why do users of cpusets not want to be able to trigger memory pressure notifications?
Because cpusets only deal with memory placement, not memory usage.
The set of nodes that a thread is allowed to allocate from may face memory pressure up to and including oom while the rest of the system may have a ton of free memory. Your solution is to compile and mount memcg if you want notifications of memory pressure on those nodes. Others in this thread have already said they don't want to rely on memcg for any of this and, as Anton showed, this can be tied directly into the VM without any help from memcg as it sits today. So why not implement a simple and clean mempressure cgroup that can be used alone or co-existing with either memcg or cpusets?
And it is not as if moving a task to a cpuset prevents you from doing any of this: you could, as long as the same set of tasks is attached to a corresponding memcg.
Same thing with a separate mempressure cgroup. The point is that there will be users of this cgroup that do not want the overhead imposed by memcg (which is why it's disabled in defconfig) and there's no direct dependency that requires it to be part of memcg.