From: Hugh Dickins hughd@google.com
[ Upstream commit 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 ]
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a large folio is added: so collect_longterm_unpinnable_folios() just wastes effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we might well benefit from batching a small number of low-order mTHPs (though unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to insulate those particular checks from future change. Name preferred to "folio_is_batchable" because large folios can well be put on a batch: it's just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from "mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region") Signed-off-by: Hugh Dickins hughd@google.com Suggested-by: David Hildenbrand david@redhat.com Acked-by: David Hildenbrand david@redhat.com Cc: "Aneesh Kumar K.V" aneesh.kumar@kernel.org Cc: Axel Rasmussen axelrasmussen@google.com Cc: Chris Li chrisl@kernel.org Cc: Christoph Hellwig hch@infradead.org Cc: Jason Gunthorpe jgg@ziepe.ca Cc: Johannes Weiner hannes@cmpxchg.org Cc: John Hubbard jhubbard@nvidia.com Cc: Keir Fraser keirf@google.com Cc: Konstantin Khlebnikov koct9i@gmail.com Cc: Li Zhe lizhe.67@bytedance.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Peter Xu peterx@redhat.com Cc: Rik van Riel riel@surriel.com Cc: Shivank Garg shivankg@amd.com Cc: Vlastimil Babka vbabka@suse.cz Cc: Wei Xu weixugc@google.com Cc: Will Deacon will@kernel.org Cc: yangge yangge1116@126.com Cc: Yuanchu Xie yuanchu@google.com Cc: Yu Zhao yuzhao@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org [ adapted to drain_allow instead of drained ] Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/swap.h | 10 ++++++++++ mm/gup.c | 3 ++- mm/mlock.c | 6 +++--- mm/swap.c | 4 ++-- 4 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h index f3e0ac20c2e8c..63f85b3fee238 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -382,6 +382,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *); void mark_page_accessed(struct page *); void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio) +{ + /* + * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting. + * Holding small numbers of low-order mTHP folios in per-CPU LRU cache + * will be sensible, but nobody has implemented and tested that yet. + */ + return !folio_test_large(folio); +} + extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void) diff --git a/mm/gup.c b/mm/gup.c index e323843cc5dd8..a919ee0f5b778 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2354,7 +2354,8 @@ static unsigned long collect_longterm_unpinnable_folios( continue; }
- if (!folio_test_lru(folio) && drain_allow) { + if (!folio_test_lru(folio) && folio_may_be_lru_cached(folio) && + drain_allow) { lru_add_drain_all(); drain_allow = false; } diff --git a/mm/mlock.c b/mm/mlock.c index cde076fa7d5e5..8c8d522efdd59 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio); if (!folio_batch_add(fbatch, mlock_lru(folio)) || - folio_test_large(folio) || lru_cache_disabled()) + !folio_may_be_lru_cached(folio) || lru_cache_disabled()) mlock_folio_batch(fbatch); local_unlock(&mlock_fbatch.lock); } @@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio); if (!folio_batch_add(fbatch, mlock_new(folio)) || - folio_test_large(folio) || lru_cache_disabled()) + !folio_may_be_lru_cached(folio) || lru_cache_disabled()) mlock_folio_batch(fbatch); local_unlock(&mlock_fbatch.lock); } @@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio) */ folio_get(folio); if (!folio_batch_add(fbatch, folio) || - folio_test_large(folio) || lru_cache_disabled()) + !folio_may_be_lru_cached(folio) || lru_cache_disabled()) mlock_folio_batch(fbatch); local_unlock(&mlock_fbatch.lock); } diff --git a/mm/swap.c b/mm/swap.c index 59f30a981c6f9..e8536e3b48149 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -222,8 +222,8 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch, else local_lock(&cpu_fbatches.lock);
- if (!folio_batch_add(this_cpu_ptr(fbatch), folio) || folio_test_large(folio) || - lru_cache_disabled()) + if (!folio_batch_add(this_cpu_ptr(fbatch), folio) || + !folio_may_be_lru_cached(folio) || lru_cache_disabled()) folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)