Since commit 70cbc3cc78a99 ("mm: gup: fix the fast GUP race against THP collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to ensure that the page table was not removed by khugepaged in between.
However, lockless_pages_from_mm() still requires that the page table is not concurrently freed. Fix it by sending IPIs (if the architecture uses semi-RCU-style page table freeing) before freeing/reusing page tables.
Link: https://lkml.kernel.org/r/20221129154730.2274278-2-jannh@google.com Link: https://lkml.kernel.org/r/20221128180252.1684965-2-jannh@google.com Link: https://lkml.kernel.org/r/20221125213714.4115729-2-jannh@google.com Fixes: ba76149f47d8 ("thp: khugepaged") Signed-off-by: Jann Horn jannh@google.com Reviewed-by: Yang Shi shy828301@gmail.com Acked-by: David Hildenbrand david@redhat.com Cc: John Hubbard jhubbard@nvidia.com Cc: Peter Xu peterx@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org [manual backport: two of the three places in khugepaged that can free ptes were refactored into a common helper between 5.15 and 6.0; TLB flushing was refactored between 5.4 and 5.10; TLB flushing was refactored between 4.19 and 5.4; pmd collapse for PTE-mapped THP was only added in 5.4; ugly hack needed in <=4.19 for s390 and arm] Signed-off-by: Jann Horn jannh@google.com --- include/asm-generic/tlb.h | 6 ++++++ mm/khugepaged.c | 15 +++++++++++++++ mm/memory.c | 5 +++++ 3 files changed, 26 insertions(+)
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index db72ad39853b..737f5cb0dc84 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -61,6 +61,12 @@ struct mmu_table_batch { extern void tlb_table_flush(struct mmu_gather *tlb); extern void tlb_remove_table(struct mmu_gather *tlb, void *table);
+void tlb_remove_table_sync_one(void); + +#else + +static inline void tlb_remove_table_sync_one(void) { } + #endif
/* diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 5dd14ef2e1de..0a4cace1cfc4 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -23,6 +23,19 @@ #include <asm/pgalloc.h> #include "internal.h"
+/* gross hack for <=4.19 stable */ +#if defined(CONFIG_S390) || defined(CONFIG_ARM) +static void tlb_remove_table_smp_sync(void *arg) +{ + /* Simply deliver the interrupt */ +} + +static void tlb_remove_table_sync_one(void) +{ + smp_call_function(tlb_remove_table_smp_sync, NULL, 1); +} +#endif + enum scan_result { SCAN_FAIL, SCAN_SUCCEED, @@ -1045,6 +1058,7 @@ static void collapse_huge_page(struct mm_struct *mm, _pmd = pmdp_collapse_flush(vma, address, pmd); spin_unlock(pmd_ptl); mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end); + tlb_remove_table_sync_one();
spin_lock(pte_ptl); isolated = __collapse_huge_page_isolate(vma, address, pte); @@ -1294,6 +1308,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff) _pmd = pmdp_collapse_flush(vma, addr, pmd); spin_unlock(ptl); mm_dec_nr_ptes(mm); + tlb_remove_table_sync_one(); pte_free(mm, pmd_pgtable(_pmd)); } up_write(&mm->mmap_sem); diff --git a/mm/memory.c b/mm/memory.c index 800834cff4e6..b80ce6b3c8f4 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -362,6 +362,11 @@ static void tlb_remove_table_smp_sync(void *arg) /* Simply deliver the interrupt */ }
+void tlb_remove_table_sync_one(void) +{ + smp_call_function(tlb_remove_table_smp_sync, NULL, 1); +} + static void tlb_remove_table_one(void *table) { /*
base-commit: c1ccef20f08e192228a2056808113b453d18c094
Any codepath that zaps page table entries must invoke MMU notifiers to ensure that secondary MMUs (like KVM) don't keep accessing pages which aren't mapped anymore. Secondary MMUs don't hold their own references to pages that are mirrored over, so failing to notify them can lead to page use-after-free.
I'm marking this as addressing an issue introduced in commit f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages"), but most of the security impact of this only came in commit 27e1f8273113 ("khugepaged: enable collapse pmd for pte-mapped THP"), which actually omitted flushes for the removal of present PTEs, not just for the removal of empty page tables.
Link: https://lkml.kernel.org/r/20221129154730.2274278-3-jannh@google.com Link: https://lkml.kernel.org/r/20221128180252.1684965-3-jannh@google.com Link: https://lkml.kernel.org/r/20221125213714.4115729-3-jannh@google.com Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages") Signed-off-by: Jann Horn jannh@google.com Acked-by: David Hildenbrand david@redhat.com Reviewed-by: Yang Shi shy828301@gmail.com Cc: John Hubbard jhubbard@nvidia.com Cc: Peter Xu peterx@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org [manual backport: this code was refactored from two copies into a common helper between 5.15 and 6.0; pmd collapse for PTE-mapped THP was only added in 5.4; MMU notifier API changed between 4.19 and 5.4] Signed-off-by: Jann Horn jannh@google.com --- mm/khugepaged.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 0a4cace1cfc4..60f7df987567 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1303,13 +1303,20 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff) */ if (down_write_trylock(&mm->mmap_sem)) { if (!khugepaged_test_exit(mm)) { - spinlock_t *ptl = pmd_lock(mm, pmd); + spinlock_t *ptl; + unsigned long end = addr + HPAGE_PMD_SIZE; + + mmu_notifier_invalidate_range_start(mm, addr, + end); + ptl = pmd_lock(mm, pmd); /* assume page table is clear */ _pmd = pmdp_collapse_flush(vma, addr, pmd); spin_unlock(ptl); mm_dec_nr_ptes(mm); tlb_remove_table_sync_one(); pte_free(mm, pmd_pgtable(_pmd)); + mmu_notifier_invalidate_range_end(mm, addr, + end); } up_write(&mm->mmap_sem); }
linux-stable-mirror@lists.linaro.org