The quilt patch titled
     Subject: mm: hugetlb: add huge page size param to set_huge_pte_at()
has been removed from the -mm tree.  Its filename was
     mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: mm: hugetlb: add huge page size param to set_huge_pte_at()
Date: Fri, 22 Sep 2023 12:58:03 +0100
Patch series "Fix set_huge_pte_at() panic on arm64", v2.
This series fixes a bug in arm64's implementation of set_huge_pte_at(), which can result in an unprivileged user causing a kernel panic. The problem was triggered when running the new uffd poison mm selftest for HUGETLB memory. This test (and the uffd poison feature) was merged for v6.5-rc7.
Ideally, I'd like to get this fix in for v6.6 and I've cc'ed stable (correctly this time) to get it backported to v6.5, where the issue first showed up.
Description of Bug
==================
arm64's huge pte implementation supports multiple huge page sizes, some of which are implemented in the page table with multiple contiguous entries. So set_huge_pte_at() needs to know how big the logical pte is, so that it can work out how many physical ptes (or pmds) need to be written. It previously did this by grabbing the folio out of the pte and querying its size.
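On arm64 the size-to-entry-count mapping is done by a small helper; below is a simplified sketch of its shape (illustrative only, not the verbatim arm64 num_contig_ptes(); the real helper also handles folded page table levels):

static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
{
	*pgsize = size;

	switch (size) {
	case PUD_SIZE:
	case PMD_SIZE:
		return 1;		/* single block entry */
	case CONT_PMD_SIZE:
		*pgsize = PMD_SIZE;
		return CONT_PMDS;	/* run of contiguous pmds */
	case CONT_PTE_SIZE:
		*pgsize = PAGE_SIZE;
		return CONT_PTES;	/* run of contiguous ptes */
	}

	return 0;
}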
However, there are cases when the pte being set is actually a swap entry. This also used to work fine, because for huge ptes we only ever saw migration entries and hwpoison entries, and both of these swap entry types have a PFN embedded, so the code would grab that and everything still worked out.
But over time, more calls to set_huge_pte_at() have been added that set swap entry types that do not embed a PFN. And this causes the code to go bang. The triggering case is the uffd poison test, commit 99aa77215ad0 ("selftests/mm: add uffd unit test for UFFDIO_POISON"), which causes a PTE_MARKER_POISONED swap entry to be set, courtesy of commit 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs") - added in v6.5-rc7. Although review shows that there are other call sites that set PTE_MARKER_UFFD_WP (which also has no PFN), these don't trigger on arm64 because arm64 doesn't support UFFD WP.
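For reference, the difference between the two classes of swap entry is visible at their construction sites (illustrative usage of the real helpers from include/linux/swapops.h):

	swp_entry_t mig    = make_readable_migration_entry(page_to_pfn(page));	/* embeds a PFN */
	swp_entry_t poison = make_pte_marker_entry(PTE_MARKER_POISONED);	/* no PFN payload */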
If CONFIG_DEBUG_VM is enabled, we at least get a BUG(); otherwise the code dereferences a bad pointer in page_folio():
static inline struct folio *hugetlb_swap_entry_to_folio(swp_entry_t entry)
{
	VM_BUG_ON(!is_migration_entry(entry) && !is_hwpoison_entry(entry));

	return page_folio(pfn_to_page(swp_offset_pfn(entry)));
}
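For context, the swap-entry path of arm64's set_huge_pte_at() used that helper roughly as follows (a simplified sketch of the pre-fix flow with locals elided, not the verbatim source):

	if (!pte_present(pte)) {
		struct folio *folio;

		/* Assumes the swap entry embeds a PFN; false for pte markers */
		folio = hugetlb_swap_entry_to_folio(pte_to_swp_entry(pte));
		ncontig = num_contig_ptes(folio_size(folio), &pgsize);

		for (i = 0; i < ncontig; i++, ptep++)
			set_pte_at(mm, addr, ptep, pte);
		return;
	}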
Fix
===
The simplest fix would have been to revert the dodgy cleanup commit 18f3962953e4 ("mm: hugetlb: kill set_huge_swap_pte_at()"), but since things have moved on, this would have required an audit of all the new set_huge_pte_at() call sites to see if they should be converted to set_huge_swap_pte_at(). And, as the original cleanup set out to prevent, having two helpers would leave us open to future bugs when people invariably get it wrong and call the wrong one.
So instead, I've added a huge page size parameter to set_huge_pte_at(). This means that the arm64 code has the size in all cases. It's a bigger change, since every arch that implements the function needs touching, but it is entirely mechanical and therefore, in my view, low risk.
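With the size passed in, the follow-up arm64 fix (patch 2 of this series, not shown here) can size the operation without ever touching the folio, along these lines (a sketch, not the final patch):

void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
		     pte_t *ptep, pte_t pte, unsigned long sz)
{
	size_t pgsize;
	int i, ncontig = num_contig_ptes(sz, &pgsize);

	if (!pte_present(pte)) {
		/* No folio lookup needed: sz alone determines the fan-out */
		for (i = 0; i < ncontig; i++, ptep++)
			set_pte_at(mm, addr, ptep, pte);
		return;
	}
	/* ... present-pte handling unchanged ... */
}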
I've compile-tested all touched arches (arm64, parisc, powerpc, riscv, s390, sparc), plus x86_64. I've also booted and run the mm selftests on arm64, where I observe that the uffd poison test is fixed and there are no other regressions.
This patch (of 2):
In order to fix a bug, arm64 needs to be told the size of the huge page for which the pte is being set in set_huge_pte_at(). Provide for this by adding an `unsigned long sz` parameter to the function. This follows the same pattern as huge_pte_clear().
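Callers that do not already have the size to hand derive it from the VMA; the pattern used throughout this patch is:

	unsigned long psize = huge_page_size(hstate_vma(vma));

	set_huge_pte_at(vma->vm_mm, addr, ptep, pte, psize);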
This commit makes the required interface modifications to the core mm as well as all arches that implement this function (arm64, parisc, powerpc, riscv, s390, sparc). The actual arm64 bug will be fixed in a separate commit.
No behavioral changes intended.
Link: https://lkml.kernel.org/r/20230922115804.2043771-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20230922115804.2043771-2-ryan.roberts@arm.com
Fixes: 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>	[powerpc 8xx]
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>	[vmalloc change]
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>	[6.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 arch/arm64/include/asm/hugetlb.h                 |    2 
 arch/arm64/mm/hugetlbpage.c                      |    6 +
 arch/parisc/include/asm/hugetlb.h                |    2 
 arch/parisc/mm/hugetlbpage.c                     |    2 
 arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h |    3 
 arch/powerpc/mm/book3s64/hugetlbpage.c           |    5 +
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c     |    3 
 arch/powerpc/mm/nohash/8xx.c                     |    3 
 arch/powerpc/mm/pgtable.c                        |    3 
 arch/riscv/include/asm/hugetlb.h                 |    3 
 arch/riscv/mm/hugetlbpage.c                      |    3 
 arch/s390/include/asm/hugetlb.h                  |    6 +
 arch/s390/mm/hugetlbpage.c                       |    8 ++
 arch/sparc/include/asm/hugetlb.h                 |    6 +
 arch/sparc/mm/hugetlbpage.c                      |    8 ++
 include/asm-generic/hugetlb.h                    |    2 
 include/linux/hugetlb.h                          |    6 +
 mm/damon/vaddr.c                                 |    3 
 mm/hugetlb.c                                     |   43 +++++++------
 mm/migrate.c                                     |    7 +-
 mm/rmap.c                                        |   23 +++++-
 mm/vmalloc.c                                     |    2 
 22 files changed, 100 insertions(+), 49 deletions(-)
--- a/arch/arm64/include/asm/hugetlb.h~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/arm64/include/asm/hugetlb.h
@@ -28,7 +28,7 @@ pte_t arch_make_huge_pte(pte_t entry, un
 #define arch_make_huge_pte arch_make_huge_pte
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-			    pte_t *ptep, pte_t pte);
+			    pte_t *ptep, pte_t pte, unsigned long sz);
 #define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
 extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 				      unsigned long addr, pte_t *ptep,
--- a/arch/arm64/mm/hugetlbpage.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/arm64/mm/hugetlbpage.c
@@ -249,7 +249,7 @@ static inline struct folio *hugetlb_swap
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-		     pte_t *ptep, pte_t pte)
+		     pte_t *ptep, pte_t pte, unsigned long sz)
 {
 	size_t pgsize;
 	int i;
@@ -571,5 +571,7 @@ pte_t huge_ptep_modify_prot_start(struct
 void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
 				  pte_t *ptep, pte_t old_pte, pte_t pte)
 {
-	set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+	unsigned long psize = huge_page_size(hstate_vma(vma));
+
+	set_huge_pte_at(vma->vm_mm, addr, ptep, pte, psize);
 }
--- a/arch/parisc/include/asm/hugetlb.h~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/parisc/include/asm/hugetlb.h
@@ -6,7 +6,7 @@
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-		     pte_t *ptep, pte_t pte);
+		     pte_t *ptep, pte_t pte, unsigned long sz);
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
--- a/arch/parisc/mm/hugetlbpage.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/parisc/mm/hugetlbpage.c
@@ -140,7 +140,7 @@ static void __set_huge_pte_at(struct mm_
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-		     pte_t *ptep, pte_t entry)
+		     pte_t *ptep, pte_t entry, unsigned long sz)
 {
 	__set_huge_pte_at(mm, addr, ptep, entry);
 }
--- a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
@@ -46,7 +46,8 @@ static inline int check_and_get_huge_psi
 }
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
-void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte);
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+		     pte_t pte, unsigned long sz);
 
 #define __HAVE_ARCH_HUGE_PTE_CLEAR
 static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
--- a/arch/powerpc/mm/book3s64/hugetlbpage.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/powerpc/mm/book3s64/hugetlbpage.c
@@ -143,11 +143,14 @@ pte_t huge_ptep_modify_prot_start(struct
 void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
 				  pte_t *ptep, pte_t old_pte, pte_t pte)
 {
+	unsigned long psize;
 
 	if (radix_enabled())
 		return radix__huge_ptep_modify_prot_commit(vma, addr, ptep,
 							   old_pte, pte);
-	set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+
+	psize = huge_page_size(hstate_vma(vma));
+	set_huge_pte_at(vma->vm_mm, addr, ptep, pte, psize);
 }
 
 void __init hugetlbpage_init_defaultsize(void)
--- a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
@@ -47,6 +47,7 @@ void radix__huge_ptep_modify_prot_commit
 					 pte_t old_pte, pte_t pte)
 {
 	struct mm_struct *mm = vma->vm_mm;
+	unsigned long psize = huge_page_size(hstate_vma(vma));
 
 	/*
 	 * POWER9 NMMU must flush the TLB after clearing the PTE before
@@ -58,5 +59,5 @@ void radix__huge_ptep_modify_prot_commit
 	    atomic_read(&mm->context.copros) > 0)
 		radix__flush_hugetlb_page(vma, addr);
 
-	set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+	set_huge_pte_at(vma->vm_mm, addr, ptep, pte, psize);
 }
--- a/arch/powerpc/mm/nohash/8xx.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/powerpc/mm/nohash/8xx.c
@@ -91,7 +91,8 @@ static int __ref __early_map_kernel_huge
 	if (new && WARN_ON(pte_present(*ptep) && pgprot_val(prot)))
 		return -EINVAL;
 
-	set_huge_pte_at(&init_mm, va, ptep, pte_mkhuge(pfn_pte(pa >> PAGE_SHIFT, prot)));
+	set_huge_pte_at(&init_mm, va, ptep,
+			pte_mkhuge(pfn_pte(pa >> PAGE_SHIFT, prot)), psize);
 
 	return 0;
 }
--- a/arch/powerpc/mm/pgtable.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/powerpc/mm/pgtable.c
@@ -288,7 +288,8 @@ int huge_ptep_set_access_flags(struct vm
 }
 
 #if defined(CONFIG_PPC_8xx)
-void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte)
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+		     pte_t pte, unsigned long sz)
 {
 	pmd_t *pmd = pmd_off(mm, addr);
 	pte_basic_t val;
--- a/arch/riscv/include/asm/hugetlb.h~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/riscv/include/asm/hugetlb.h
@@ -18,7 +18,8 @@ void huge_pte_clear(struct mm_struct *mm
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 void set_huge_pte_at(struct mm_struct *mm,
-		     unsigned long addr, pte_t *ptep, pte_t pte);
+		     unsigned long addr, pte_t *ptep, pte_t pte,
+		     unsigned long sz);
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
--- a/arch/riscv/mm/hugetlbpage.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/riscv/mm/hugetlbpage.c
@@ -180,7 +180,8 @@ pte_t arch_make_huge_pte(pte_t entry, un
 void set_huge_pte_at(struct mm_struct *mm,
 		     unsigned long addr,
 		     pte_t *ptep,
-		     pte_t pte)
+		     pte_t pte,
+		     unsigned long sz)
 {
 	int i, pte_num;
 
--- a/arch/s390/include/asm/hugetlb.h~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/s390/include/asm/hugetlb.h
@@ -16,6 +16,8 @@
 #define hugepages_supported()			(MACHINE_HAS_EDAT1)
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+		     pte_t *ptep, pte_t pte, unsigned long sz);
+void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t pte);
 pte_t huge_ptep_get(pte_t *ptep);
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
@@ -65,7 +67,7 @@ static inline int huge_ptep_set_access_f
 	int changed = !pte_same(huge_ptep_get(ptep), pte);
 	if (changed) {
 		huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
-		set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+		__set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
 	}
 	return changed;
 }
@@ -74,7 +76,7 @@ static inline void huge_ptep_set_wrprote
 					   unsigned long addr, pte_t *ptep)
 {
 	pte_t pte = huge_ptep_get_and_clear(mm, addr, ptep);
-	set_huge_pte_at(mm, addr, ptep, pte_wrprotect(pte));
+	__set_huge_pte_at(mm, addr, ptep, pte_wrprotect(pte));
 }
 
 static inline pte_t mk_huge_pte(struct page *page, pgprot_t pgprot)
--- a/arch/s390/mm/hugetlbpage.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/s390/mm/hugetlbpage.c
@@ -142,7 +142,7 @@ static void clear_huge_pte_skeys(struct
 		__storage_key_init_range(paddr, paddr + size - 1);
 }
 
-void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t pte)
 {
 	unsigned long rste;
@@ -163,6 +163,12 @@ void set_huge_pte_at(struct mm_struct *m
 	set_pte(ptep, __pte(rste));
 }
 
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+		     pte_t *ptep, pte_t pte, unsigned long sz)
+{
+	__set_huge_pte_at(mm, addr, ptep, pte);
+}
+
 pte_t huge_ptep_get(pte_t *ptep)
 {
 	return __rste_to_pte(pte_val(*ptep));
--- a/arch/sparc/include/asm/hugetlb.h~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/sparc/include/asm/hugetlb.h
@@ -14,6 +14,8 @@ extern struct pud_huge_patch_entry __pud
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+		     pte_t *ptep, pte_t pte, unsigned long sz);
+void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t pte);
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
@@ -32,7 +34,7 @@ static inline void huge_ptep_set_wrprote
 					   unsigned long addr, pte_t *ptep)
 {
 	pte_t old_pte = *ptep;
-	set_huge_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
+	__set_huge_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
 }
 
 #define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
@@ -42,7 +44,7 @@ static inline int huge_ptep_set_access_f
 {
 	int changed = !pte_same(*ptep, pte);
 	if (changed) {
-		set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+		__set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
 		flush_tlb_page(vma, addr);
 	}
 	return changed;
--- a/arch/sparc/mm/hugetlbpage.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/arch/sparc/mm/hugetlbpage.c
@@ -328,7 +328,7 @@ pte_t *huge_pte_offset(struct mm_struct
 	return pte_offset_huge(pmd, addr);
 }
 
-void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t entry)
 {
 	unsigned int nptes, orig_shift, shift;
@@ -364,6 +364,12 @@ void set_huge_pte_at(struct mm_struct *m
 			    orig_shift);
 }
 
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+		     pte_t *ptep, pte_t entry, unsigned long sz)
+{
+	__set_huge_pte_at(mm, addr, ptep, entry);
+}
+
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 			      pte_t *ptep)
 {
--- a/include/asm-generic/hugetlb.h~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/include/asm-generic/hugetlb.h
@@ -76,7 +76,7 @@ static inline void hugetlb_free_pgd_rang
 
 #ifndef __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-		pte_t *ptep, pte_t pte)
+		pte_t *ptep, pte_t pte, unsigned long sz)
 {
 	set_pte_at(mm, addr, ptep, pte);
 }
--- a/include/linux/hugetlb.h~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/include/linux/hugetlb.h
@@ -984,7 +984,9 @@ static inline void huge_ptep_modify_prot
 						unsigned long addr, pte_t *ptep,
 						pte_t old_pte, pte_t pte)
 {
-	set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+	unsigned long psize = huge_page_size(hstate_vma(vma));
+
+	set_huge_pte_at(vma->vm_mm, addr, ptep, pte, psize);
 }
 #endif
 
@@ -1173,7 +1175,7 @@ static inline pte_t huge_ptep_clear_flus
 }
 
 static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-				   pte_t *ptep, pte_t pte)
+				   pte_t *ptep, pte_t pte, unsigned long sz)
 {
 }
 
--- a/mm/damon/vaddr.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/mm/damon/vaddr.c
@@ -341,13 +341,14 @@ static void damon_hugetlb_mkold(pte_t *p
 	bool referenced = false;
 	pte_t entry = huge_ptep_get(pte);
 	struct folio *folio = pfn_folio(pte_pfn(entry));
+	unsigned long psize = huge_page_size(hstate_vma(vma));
 
 	folio_get(folio);
 
 	if (pte_young(entry)) {
 		referenced = true;
 		entry = pte_mkold(entry);
-		set_huge_pte_at(mm, addr, pte, entry);
+		set_huge_pte_at(mm, addr, pte, entry, psize);
 	}
 
 #ifdef CONFIG_MMU_NOTIFIER
--- a/mm/hugetlb.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/mm/hugetlb.c
@@ -4980,7 +4980,7 @@ static bool is_hugetlb_entry_hwpoisoned(
 
 static void
 hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
-		      struct folio *new_folio, pte_t old)
+		      struct folio *new_folio, pte_t old, unsigned long sz)
 {
 	pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
 
@@ -4988,7 +4988,7 @@ hugetlb_install_folio(struct vm_area_str
 	hugepage_add_new_anon_rmap(new_folio, vma, addr);
 	if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
 		newpte = huge_pte_mkuffd_wp(newpte);
-	set_huge_pte_at(vma->vm_mm, addr, ptep, newpte);
+	set_huge_pte_at(vma->vm_mm, addr, ptep, newpte, sz);
 	hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
 	folio_set_hugetlb_migratable(new_folio);
 }
@@ -5065,7 +5065,7 @@ again:
 		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
 			if (!userfaultfd_wp(dst_vma))
 				entry = huge_pte_clear_uffd_wp(entry);
-			set_huge_pte_at(dst, addr, dst_pte, entry);
+			set_huge_pte_at(dst, addr, dst_pte, entry, sz);
 		} else if (unlikely(is_hugetlb_entry_migration(entry))) {
 			swp_entry_t swp_entry = pte_to_swp_entry(entry);
 			bool uffd_wp = pte_swp_uffd_wp(entry);
@@ -5080,18 +5080,18 @@ again:
 				entry = swp_entry_to_pte(swp_entry);
 				if (userfaultfd_wp(src_vma) && uffd_wp)
 					entry = pte_swp_mkuffd_wp(entry);
-				set_huge_pte_at(src, addr, src_pte, entry);
+				set_huge_pte_at(src, addr, src_pte, entry, sz);
 			}
 			if (!userfaultfd_wp(dst_vma))
 				entry = huge_pte_clear_uffd_wp(entry);
-			set_huge_pte_at(dst, addr, dst_pte, entry);
+			set_huge_pte_at(dst, addr, dst_pte, entry, sz);
 		} else if (unlikely(is_pte_marker(entry))) {
 			pte_marker marker = copy_pte_marker(
 				pte_to_swp_entry(entry), dst_vma);
 
 			if (marker)
 				set_huge_pte_at(dst, addr, dst_pte,
-						make_pte_marker(marker));
+						make_pte_marker(marker), sz);
 		} else {
 			entry = huge_ptep_get(src_pte);
 			pte_folio = page_folio(pte_page(entry));
@@ -5145,7 +5145,7 @@ again:
 					goto again;
 				}
 				hugetlb_install_folio(dst_vma, dst_pte, addr,
-						      new_folio, src_pte_old);
+						      new_folio, src_pte_old, sz);
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				continue;
@@ -5166,7 +5166,7 @@ again:
 			if (!userfaultfd_wp(dst_vma))
 				entry = huge_pte_clear_uffd_wp(entry);
 
-			set_huge_pte_at(dst, addr, dst_pte, entry);
+			set_huge_pte_at(dst, addr, dst_pte, entry, sz);
 			hugetlb_count_add(npages, dst);
 		}
 		spin_unlock(src_ptl);
@@ -5184,7 +5184,8 @@ again:
 }
 
 static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
-			  unsigned long new_addr, pte_t *src_pte, pte_t *dst_pte)
+			  unsigned long new_addr, pte_t *src_pte, pte_t *dst_pte,
+			  unsigned long sz)
 {
 	struct hstate *h = hstate_vma(vma);
 	struct mm_struct *mm = vma->vm_mm;
@@ -5202,7 +5203,7 @@ static void move_huge_pte(struct vm_area
 		spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 
 	pte = huge_ptep_get_and_clear(mm, old_addr, src_pte);
-	set_huge_pte_at(mm, new_addr, dst_pte, pte);
+	set_huge_pte_at(mm, new_addr, dst_pte, pte, sz);
 
 	if (src_ptl != dst_ptl)
 		spin_unlock(src_ptl);
@@ -5259,7 +5260,7 @@ int move_hugetlb_page_tables(struct vm_a
 		if (!dst_pte)
 			break;
 
-		move_huge_pte(vma, old_addr, new_addr, src_pte, dst_pte);
+		move_huge_pte(vma, old_addr, new_addr, src_pte, dst_pte, sz);
 	}
 
 	if (shared_pmd)
@@ -5337,7 +5338,8 @@ static void __unmap_hugepage_range(struc
 			if (pte_swp_uffd_wp_any(pte) &&
 			    !(zap_flags & ZAP_FLAG_DROP_MARKER))
 				set_huge_pte_at(mm, address, ptep,
-						make_pte_marker(PTE_MARKER_UFFD_WP));
+						make_pte_marker(PTE_MARKER_UFFD_WP),
+						sz);
 			else
 				huge_pte_clear(mm, address, ptep, sz);
 			spin_unlock(ptl);
@@ -5371,7 +5373,8 @@ static void __unmap_hugepage_range(struc
 		if (huge_pte_uffd_wp(pte) &&
 		    !(zap_flags & ZAP_FLAG_DROP_MARKER))
 			set_huge_pte_at(mm, address, ptep,
-					make_pte_marker(PTE_MARKER_UFFD_WP));
+					make_pte_marker(PTE_MARKER_UFFD_WP),
+					sz);
 		hugetlb_count_sub(pages_per_huge_page(h), mm);
 		page_remove_rmap(page, vma, true);
 
@@ -5676,7 +5679,7 @@ retry_avoidcopy:
 		hugepage_add_new_anon_rmap(new_folio, vma, haddr);
 		if (huge_pte_uffd_wp(pte))
 			newpte = huge_pte_mkuffd_wp(newpte);
-		set_huge_pte_at(mm, haddr, ptep, newpte);
+		set_huge_pte_at(mm, haddr, ptep, newpte, huge_page_size(h));
 		folio_set_hugetlb_migratable(new_folio);
 		/* Make the old page be freed below */
 		new_folio = old_folio;
@@ -5972,7 +5975,7 @@ static vm_fault_t hugetlb_no_page(struct
 	 */
 	if (unlikely(pte_marker_uffd_wp(old_pte)))
 		new_pte = huge_pte_mkuffd_wp(new_pte);
-	set_huge_pte_at(mm, haddr, ptep, new_pte);
+	set_huge_pte_at(mm, haddr, ptep, new_pte, huge_page_size(h));
 
 	hugetlb_count_add(pages_per_huge_page(h), mm);
 	if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
@@ -6261,7 +6264,8 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_
 		}
 
 		_dst_pte = make_pte_marker(PTE_MARKER_POISONED);
-		set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
+		set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte,
+				huge_page_size(h));
 
 		/* No need to invalidate - it was non-present before */
 		update_mmu_cache(dst_vma, dst_addr, dst_pte);
@@ -6412,7 +6416,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_
 	if (wp_enabled)
 		_dst_pte = huge_pte_mkuffd_wp(_dst_pte);
 
-	set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
+	set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte, huge_page_size(h));
 
 	hugetlb_count_add(pages_per_huge_page(h), dst_mm);
 
@@ -6598,7 +6602,7 @@ long hugetlb_change_protection(struct vm
 			else if (uffd_wp_resolve)
 				newpte = pte_swp_clear_uffd_wp(newpte);
 			if (!pte_same(pte, newpte))
-				set_huge_pte_at(mm, address, ptep, newpte);
+				set_huge_pte_at(mm, address, ptep, newpte, psize);
 		} else if (unlikely(is_pte_marker(pte))) {
 			/* No other markers apply for now. */
 			WARN_ON_ONCE(!pte_marker_uffd_wp(pte));
@@ -6623,7 +6627,8 @@ long hugetlb_change_protection(struct vm
 			if (unlikely(uffd_wp))
 				/* Safe to modify directly (none->non-present). */
 				set_huge_pte_at(mm, address, ptep,
-						make_pte_marker(PTE_MARKER_UFFD_WP));
+						make_pte_marker(PTE_MARKER_UFFD_WP),
+						psize);
 		}
 		spin_unlock(ptl);
 	}
--- a/mm/migrate.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/mm/migrate.c
@@ -243,7 +243,9 @@ static bool remove_migration_pte(struct
 
 #ifdef CONFIG_HUGETLB_PAGE
 		if (folio_test_hugetlb(folio)) {
-			unsigned int shift = huge_page_shift(hstate_vma(vma));
+			struct hstate *h = hstate_vma(vma);
+			unsigned int shift = huge_page_shift(h);
+			unsigned long psize = huge_page_size(h);
 
 			pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
 			if (folio_test_anon(folio))
@@ -251,7 +253,8 @@ static bool remove_migration_pte(struct
 					       rmap_flags);
 			else
 				page_dup_file_rmap(new, true);
-			set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
+			set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte,
+					psize);
 		} else
 #endif
 		{
--- a/mm/rmap.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/mm/rmap.c
@@ -1480,6 +1480,7 @@ static bool try_to_unmap_one(struct foli
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
 	unsigned long pfn;
+	unsigned long hsz = 0;
 
 	/*
 	 * When racing against e.g. zap_pte_range() on another cpu,
@@ -1511,6 +1512,9 @@ static bool try_to_unmap_one(struct foli
 		 */
 		adjust_range_if_pmd_sharing_possible(vma, &range.start,
 						     &range.end);
+
+		/* We need the huge page size for set_huge_pte_at() */
+		hsz = huge_page_size(hstate_vma(vma));
 	}
 	mmu_notifier_invalidate_range_start(&range);
 
@@ -1628,7 +1632,8 @@ static bool try_to_unmap_one(struct foli
 			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
 			if (folio_test_hugetlb(folio)) {
 				hugetlb_count_sub(folio_nr_pages(folio), mm);
-				set_huge_pte_at(mm, address, pvmw.pte, pteval);
+				set_huge_pte_at(mm, address, pvmw.pte, pteval,
+						hsz);
 			} else {
 				dec_mm_counter(mm, mm_counter(&folio->page));
 				set_pte_at(mm, address, pvmw.pte, pteval);
@@ -1820,6 +1825,7 @@ static bool try_to_migrate_one(struct fo
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
 	unsigned long pfn;
+	unsigned long hsz = 0;
 
 	/*
 	 * When racing against e.g. zap_pte_range() on another cpu,
@@ -1855,6 +1861,9 @@ static bool try_to_migrate_one(struct fo
 		 */
 		adjust_range_if_pmd_sharing_possible(vma, &range.start,
 						     &range.end);
+
+		/* We need the huge page size for set_huge_pte_at() */
+		hsz = huge_page_size(hstate_vma(vma));
 	}
 	mmu_notifier_invalidate_range_start(&range);
 
@@ -2020,7 +2029,8 @@ static bool try_to_migrate_one(struct fo
 			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
 			if (folio_test_hugetlb(folio)) {
 				hugetlb_count_sub(folio_nr_pages(folio), mm);
-				set_huge_pte_at(mm, address, pvmw.pte, pteval);
+				set_huge_pte_at(mm, address, pvmw.pte, pteval,
+						hsz);
 			} else {
 				dec_mm_counter(mm, mm_counter(&folio->page));
 				set_pte_at(mm, address, pvmw.pte, pteval);
@@ -2044,7 +2054,8 @@ static bool try_to_migrate_one(struct fo
 
 			if (arch_unmap_one(mm, vma, address, pteval) < 0) {
 				if (folio_test_hugetlb(folio))
-					set_huge_pte_at(mm, address, pvmw.pte, pteval);
+					set_huge_pte_at(mm, address, pvmw.pte,
+							pteval, hsz);
 				else
 					set_pte_at(mm, address, pvmw.pte, pteval);
 				ret = false;
@@ -2058,7 +2069,8 @@ static bool try_to_migrate_one(struct fo
 			if (anon_exclusive &&
 			    page_try_share_anon_rmap(subpage)) {
 				if (folio_test_hugetlb(folio))
-					set_huge_pte_at(mm, address, pvmw.pte, pteval);
+					set_huge_pte_at(mm, address, pvmw.pte,
+							pteval, hsz);
 				else
 					set_pte_at(mm, address, pvmw.pte, pteval);
 				ret = false;
@@ -2090,7 +2102,8 @@ static bool try_to_migrate_one(struct fo
 			if (pte_uffd_wp(pteval))
 				swp_pte = pte_swp_mkuffd_wp(swp_pte);
 			if (folio_test_hugetlb(folio))
-				set_huge_pte_at(mm, address, pvmw.pte, swp_pte);
+				set_huge_pte_at(mm, address, pvmw.pte, swp_pte,
+						hsz);
 			else
 				set_pte_at(mm, address, pvmw.pte, swp_pte);
 			trace_set_migration_pte(address, pte_val(swp_pte),
--- a/mm/vmalloc.c~mm-hugetlb-add-huge-page-size-param-to-set_huge_pte_at
+++ a/mm/vmalloc.c
@@ -111,7 +111,7 @@ static int vmap_pte_range(pmd_t *pmd, un
 			pte_t entry = pfn_pte(pfn, prot);
 
 			entry = arch_make_huge_pte(entry, ilog2(size), 0);
-			set_huge_pte_at(&init_mm, addr, pte, entry);
+			set_huge_pte_at(&init_mm, addr, pte, entry, size);
 			pfn += PFN_DOWN(size);
 			continue;
 		}
_
Patches currently in -mm which might be from ryan.roberts@arm.com are
arm64-hugetlb-fix-set_huge_pte_at-to-work-with-all-swap-entries.patch
mm-allow-deferred-splitting-of-arbitrary-anon-large-folios.patch
mm-non-pmd-mappable-large-folios-for-folio_add_new_anon_rmap.patch
mm-thp-account-pte-mapped-anonymous-thp-usage.patch
mm-thp-introduce-anon_orders-and-anon_always_mask-sysfs-files.patch
mm-thp-extend-thp-to-allocate-anonymous-large-folios.patch
mm-thp-add-recommend-option-for-anon_orders.patch
arm64-mm-override-arch_wants_pte_order.patch
selftests-mm-cow-generalize-do_run_with_thp-helper.patch
selftests-mm-cow-add-tests-for-small-order-anon-thp.patch