From: Barry Song <v-songbaohua@oppo.com>
Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache") introduced an unconditional one-tick sleep when `swapcache_prepare()` fails, which has led to reports of UI stuttering on latency-sensitive Android devices. To address this, we can use a waitqueue to wake up tasks that fail `swapcache_prepare()` sooner, instead of always sleeping for a full tick. While tasks may occasionally be woken by an unrelated `do_swap_page()`, such spurious wakeups are preferable to the two failure modes this avoids: rapid re-entry into page faults, which can cause livelocks, and multi-millisecond sleeps, which visibly degrade the user experience.
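For reference, the change distills to the waiter/waker pattern below (a minimal sketch: `swapcache_wq` and the wait/wake calls mirror the patch, while the helper names `swapcache_relax()` and `swapcache_done()` are illustrative only, not functions the patch adds):

```c
#include <linux/sched.h>
#include <linux/wait.h>

static DECLARE_WAIT_QUEUE_HEAD(swapcache_wq);

/* Loser of the swapcache_prepare() race: sleep, but stay wakeable. */
static void swapcache_relax(void)
{
	DECLARE_WAITQUEUE(wait, current);

	add_wait_queue(&swapcache_wq, &wait);
	/* Still bounded by one tick, but a wake_up() ends the wait early. */
	schedule_timeout_uninterruptible(1);
	remove_wait_queue(&swapcache_wq, &wait);
}

/* Winner: after dropping the swap cache pin, wake any waiting faulters. */
static void swapcache_done(void)
{
	wake_up(&swapcache_wq);
}
```

Note that `wake_up()` also wakes tasks in `TASK_UNINTERRUPTIBLE` sleep, so the `schedule_timeout_uninterruptible(1)` above returns as soon as the winner finishes rather than after a full tick.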
Oven's testing shows that a single waitqueue is sufficient to resolve the UI stuttering. Should a 'thundering herd' problem become apparent later, a hashed array of waitqueues, similar to the `folio_wait_table[PAGE_WAIT_TABLE_SIZE]` used for page bit locks, can be introduced.
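Should that become necessary, the hashed variant could look roughly like this (a hypothetical sketch, not part of this patch: `SWAPCACHE_WAIT_TABLE_BITS`, the table size, and the `hash_long()`-based lookup are all assumptions modeled on `folio_wait_table`):

```c
#include <linux/hash.h>
#include <linux/init.h>
#include <linux/swap.h>
#include <linux/wait.h>

#define SWAPCACHE_WAIT_TABLE_BITS	8	/* assumption: 256 queues */
#define SWAPCACHE_WAIT_TABLE_SIZE	(1 << SWAPCACHE_WAIT_TABLE_BITS)

static wait_queue_head_t swapcache_wqs[SWAPCACHE_WAIT_TABLE_SIZE];

/* Spread contending swap entries across independent wait queues. */
static wait_queue_head_t *swapcache_waitqueue(swp_entry_t entry)
{
	return &swapcache_wqs[hash_long(entry.val, SWAPCACHE_WAIT_TABLE_BITS)];
}

static int __init swapcache_wqs_init(void)
{
	int i;

	for (i = 0; i < SWAPCACHE_WAIT_TABLE_SIZE; i++)
		init_waitqueue_head(&swapcache_wqs[i]);
	return 0;
}
```

Waiters and wakers would then call `swapcache_waitqueue(entry)` in place of the single `swapcache_wq`, so unrelated `do_swap_page()` instances would rarely share a queue.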
Fixes: 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
Cc: Kairui Song <kasong@tencent.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Reported-by: Oven Liyang <liyangouwen1@oppo.com>
Tested-by: Oven Liyang <liyangouwen1@oppo.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 mm/memory.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
```diff
diff --git a/mm/memory.c b/mm/memory.c
index 2366578015ad..6913174f7f41 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4192,6 +4192,8 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+static DECLARE_WAIT_QUEUE_HEAD(swapcache_wq);
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -4204,6 +4206,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	struct folio *swapcache, *folio = NULL;
+	DECLARE_WAITQUEUE(wait, current);
 	struct page *page;
 	struct swap_info_struct *si = NULL;
 	rmap_t rmap_flags = RMAP_NONE;
@@ -4302,7 +4305,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 					 * Relax a bit to prevent rapid
 					 * repeated page faults.
 					 */
+					add_wait_queue(&swapcache_wq, &wait);
 					schedule_timeout_uninterruptible(1);
+					remove_wait_queue(&swapcache_wq, &wait);
 					goto out_page;
 				}
 				need_clear_cache = true;
@@ -4609,8 +4614,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 out:
 	/* Clear the swap cache pin for direct swapin after PTL unlock */
-	if (need_clear_cache)
+	if (need_clear_cache) {
 		swapcache_clear(si, entry, nr_pages);
+		wake_up(&swapcache_wq);
+	}
 	if (si)
 		put_swap_device(si);
 	return ret;
@@ -4625,8 +4632,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		folio_unlock(swapcache);
 		folio_put(swapcache);
 	}
-	if (need_clear_cache)
+	if (need_clear_cache) {
 		swapcache_clear(si, entry, nr_pages);
+		wake_up(&swapcache_wq);
+	}
 	if (si)
 		put_swap_device(si);
 	return ret;
```