From: Johannes Weiner <hannes@cmpxchg.org>
Subject: mm: memcontrol: prevent starvation when writing memory.high
When a value is written to a cgroup's memory.high control file, the write() context first tries to reclaim the cgroup to size before putting the limit in place for the workload. Concurrent charges from the workload can keep such a write() looping in reclaim indefinitely.
In the past, a write to memory.high would first put the limit in place for the workload, then do targeted reclaim until the new limit has been met - similar to how we do it for memory.max. This wasn't prone to the described starvation issue. However, this sequence could cause excessive latencies in the workload, when allocating threads could be put into long penalty sleeps on the sudden memory.high overage created by the write(), before that had a chance to work it off.
Now that memory_high_write() performs reclaim before enforcing the new limit, reflect that the cgroup may well fail to converge due to concurrent workload activity. Bail out of the loop after a few tries.
Link: https://lkml.kernel.org/r/20210112163011.127833-1-hannes@cmpxchg.org
Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Tejun Heo <tj@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 mm/memcontrol.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/mm/memcontrol.c~mm-memcontrol-prevent-starvation-when-writing-memoryhigh
+++ a/mm/memcontrol.c
@@ -6273,7 +6273,6 @@ static ssize_t memory_high_write(struct
 
 	for (;;) {
 		unsigned long nr_pages = page_counter_read(&memcg->memory);
-		unsigned long reclaimed;
 
 		if (nr_pages <= high)
 			break;
@@ -6287,10 +6286,10 @@ static ssize_t memory_high_write(struct
 			continue;
 		}
 
-		reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
-							 GFP_KERNEL, true);
+		try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
+					     GFP_KERNEL, true);
 
-		if (!reclaimed && !nr_retries--)
+		if (!nr_retries--)
 			break;
 	}
 
_
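The behavioural difference between the old and new exit conditions can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct model_cgroup`, `model_reclaim()`, `model_high_write()`, `MODEL_RETRIES`, the `charge_rate` field, and the `cap` safety limit are all hypothetical names invented for the illustration, and the drain and signal checks of the real loop are left out. It assumes the workload charges a fixed number of pages during every reclaim pass.

```c
#include <stdbool.h>

/* Hypothetical retry budget for this model (not taken from the kernel). */
#define MODEL_RETRIES 4u

/* Hypothetical stand-in for a memory cgroup: only what the model needs. */
struct model_cgroup {
	unsigned long usage;		/* pages currently charged */
	unsigned long charge_rate;	/* pages the workload charges per pass */
};

/* Stand-in for try_to_free_mem_cgroup_pages(): frees up to 'want' pages. */
static unsigned long model_reclaim(struct model_cgroup *cg, unsigned long want)
{
	unsigned long freed = want < cg->usage ? want : cg->usage;

	cg->usage -= freed;
	return freed;
}

/*
 * Model of the reclaim loop in memory_high_write().
 *
 * bounded == false: old exit condition, (!reclaimed && !nr_retries--)
 * bounded == true:  patched exit condition, (!nr_retries--)
 *
 * 'cap' is a safety limit that exists only in this model so that the
 * old condition cannot spin forever. Returns the number of reclaim passes.
 */
static unsigned long model_high_write(struct model_cgroup *cg,
				      unsigned long high, bool bounded,
				      unsigned long cap)
{
	unsigned int nr_retries = MODEL_RETRIES;
	unsigned long passes = 0;

	while (passes < cap) {
		unsigned long nr_pages = cg->usage;
		unsigned long reclaimed;

		if (nr_pages <= high)
			break;

		passes++;
		reclaimed = model_reclaim(cg, nr_pages - high);

		/* Concurrent workload charges memory while we reclaim. */
		cg->usage += cg->charge_rate;

		if (bounded) {
			if (!nr_retries--)
				break;
		} else {
			/* Retries are only consumed when nothing was freed. */
			if (!reclaimed && !nr_retries--)
				break;
		}
	}

	return passes;
}
```

In this model, a cgroup that keeps charging pages makes every reclaim pass succeed, so the old condition never consumes a retry and only the artificial `cap` ends the loop. With the patched condition, the same workload ends the loop after `MODEL_RETRIES + 1` passes.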
On Sat, Jan 23, 2021 at 9:01 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> From: Johannes Weiner <hannes@cmpxchg.org>
> Subject: mm: memcontrol: prevent starvation when writing memory.high
>
> When a value is written to a cgroup's memory.high control file, the write() context first tries to reclaim the cgroup to size before putting the limit in place for the workload. Concurrent charges from the workload can keep such a write() looping in reclaim indefinitely.
>
> In the past, a write to memory.high would first put the limit in place for the workload, then do targeted reclaim until the new limit has been met - similar to how we do it for memory.max. This wasn't prone to the described starvation issue. However, this sequence could cause excessive latencies in the workload, when allocating threads could be put into long penalty sleeps on the sudden memory.high overage created by the write(), before that had a chance to work it off.
>
> Now that memory_high_write() performs reclaim before enforcing the new limit, reflect that the cgroup may well fail to converge due to concurrent workload activity. Bail out of the loop after a few tries.
>
> Link: https://lkml.kernel.org/r/20210112163011.127833-1-hannes@cmpxchg.org
> Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Reviewed-by: Shakeel Butt <shakeelb@google.com>
> Reported-by: Tejun Heo <tj@kernel.org>
> Acked-by: Roman Gushchin <guro@fb.com>
> Reviewed-by: Michal Koutný <mkoutny@suse.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: <stable@vger.kernel.org> [5.8+]
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes requested to replace this patch with https://lore.kernel.org/linux-mm/20210122184341.292461-1-hannes@cmpxchg.org/
On Sun, Jan 24, 2021 at 10:02 AM Shakeel Butt <shakeelb@google.com> wrote:
> Johannes requested to replace this patch with https://lore.kernel.org/linux-mm/20210122184341.292461-1-hannes@cmpxchg.org/
I've dropped it (not replaced it - will wait for Andrew to comment/send) from my queue.
Linus