New subject: [PATCH mm-unstable v1 2/2] mm/mglru: fix overshooting shrinker memory

11 Jul 2024

evict_folios() uses a second pass to reclaim folios that have gone
through page writeback and become clean before it finishes the first
pass, since folio_rotate_reclaimable() cannot handle those folios due
to the isolation.

The second pass tries to avoid potential double counting by deducting
scan_control->nr_scanned. However, this can result in underflow of
nr_scanned, under a condition where shrink_folio_list() does not
increment nr_scanned, i.e., when folio_trylock() fails.

The underflow can cause the divisor, i.e., scale=scanned+reclaimed in
vmpressure_calc_level(), to become zero, resulting in the following
crash:

  [exception RIP: vmpressure_work_fn+101]
  process_one_work at ffffffffa3313f2b

Since scan_control->nr_scanned has no established semantics, the
potential double counting has minimal risks. Therefore, fix the
problem by not deducting scan_control->nr_scanned in evict_folios().

Reported-by: Wei Xu <weixugc@google.com>
Fixes: 359a5e1416ca ("mm: multi-gen LRU: retry folios written back while isolated")
Cc: stable@vger.kernel.org
Signed-off-by: Yu Zhao <yuzhao@google.com>
---
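For reviewers who want to see the arithmetic, here is a minimal
user-space sketch of the failure mode described above. It is not kernel
code: model_nr_scanned() and model_scale() are hypothetical helpers,
and the only things taken from the changelog are that nr_scanned is
deducted unconditionally, that it is not incremented when
folio_trylock() fails, and that the divisor is scanned + reclaimed. The
sketch assumes the counters are unsigned long.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical model of the pre-fix accounting for one folio of
 * nr_pages: the first pass counts the folio only if the folio lock
 * was taken, while the second pass always deducts it.
 */
static unsigned long model_nr_scanned(unsigned long nr_scanned,
				      unsigned long nr_pages,
				      bool trylock_ok)
{
	if (trylock_ok)
		nr_scanned += nr_pages;

	/* the deduction that this patch removes */
	nr_scanned -= nr_pages;

	return nr_scanned;
}

/* Divisor as given in the changelog: scale = scanned + reclaimed */
static unsigned long model_scale(unsigned long scanned,
				 unsigned long reclaimed)
{
	return scanned + reclaimed;
}
```

With nr_scanned starting at 0 and a failed trylock on a single-page
folio, model_nr_scanned() wraps to ULONG_MAX. If the reclaimed count
then happens to be 1, model_scale() wraps again to 0, which is the
zero divisor behind the crash.
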
 mm/vmscan.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0761f91b407f..6403038c776e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4597,7 +4597,6 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
 
 		/* retry folios that may have missed folio_rotate_reclaimable() */
 		list_move(&folio->lru, &clean);
-		sc->nr_scanned -= folio_nr_pages(folio);
 	}
 
 	spin_lock_irq(&lruvec->lru_lock);
-- 
2.45.2.993.g49e7a77208-goog

[PATCH mm-unstable v1 1/2] mm/mglru: fix div-by-zero in vmpressure_calc_level()