Looks like what we fixed for hugetlb in commit 44f86392bdd1 ("mm/hugetlb: fix uffd-wp handling for migration entries in hugetlb_change_protection()") similarly applies to THP.
Setting/clearing uffd-wp on THP migration entries is not implemented properly. Further, while removing migration PMDs considers the uffd-wp bit, inserting migration PMDs does not consider the uffd-wp bit.
We have to set/clear independently of the migration entry type in change_huge_pmd() and properly copy the uffd-wp bit in set_pmd_migration_entry().
Verified using a simple reproducer that triggers migration of a THP, that the set_pmd_migration_entry() no longer loses the uffd-wp bit.
Fixes: f45ec5ff16a7 ("userfaultfd: wp: support swap and page migration")
Cc: stable@vger.kernel.org
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/huge_memory.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 032fb0ef9cd1..bdda4f426d58 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1838,10 +1838,10 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	if (is_swap_pmd(*pmd)) {
 		swp_entry_t entry = pmd_to_swp_entry(*pmd);
 		struct page *page = pfn_swap_entry_to_page(entry);
+		pmd_t newpmd;
 
 		VM_BUG_ON(!is_pmd_migration_entry(*pmd));
 		if (is_writable_migration_entry(entry)) {
-			pmd_t newpmd;
 			/*
 			 * A protection check is difficult so
 			 * just be safe and disable write
@@ -1855,8 +1855,16 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 				newpmd = pmd_swp_mksoft_dirty(newpmd);
 			if (pmd_swp_uffd_wp(*pmd))
 				newpmd = pmd_swp_mkuffd_wp(newpmd);
-			set_pmd_at(mm, addr, pmd, newpmd);
+		} else {
+			newpmd = *pmd;
 		}
+
+		if (uffd_wp)
+			newpmd = pmd_swp_mkuffd_wp(newpmd);
+		else if (uffd_wp_resolve)
+			newpmd = pmd_swp_clear_uffd_wp(newpmd);
+		if (!pmd_same(*pmd, newpmd))
+			set_pmd_at(mm, addr, pmd, newpmd);
 		goto unlock;
 	}
 #endif
@@ -3251,6 +3259,8 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
 	pmdswp = swp_entry_to_pmd(entry);
 	if (pmd_soft_dirty(pmdval))
 		pmdswp = pmd_swp_mksoft_dirty(pmdswp);
+	if (pmd_swp_uffd_wp(*pvmw->pmd))
+		pmdswp = pmd_swp_mkuffd_wp(pmdswp);
 	set_pmd_at(mm, address, pvmw->pmd, pmdswp);
 	page_remove_rmap(page, vma, true);
 	put_page(page);
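For readers who want to see the change_huge_pmd() control-flow change in isolation, here is a minimal userspace sketch. It is a toy model, not kernel code: the toy_pmd_t type, the TOY_* flag bits, and both helper functions are hypothetical names made up for illustration. The "old" variant only reflects what the diff context above suggests (the uffd-wp request was never applied to migration entries), and the "new" variant mirrors the patched flow: optionally downgrade a writable entry, then set or clear uffd-wp regardless of the entry type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy encoding of a PMD migration entry; NOT the kernel layout. */
typedef uint32_t toy_pmd_t;

#define TOY_WRITABLE (1u << 0) /* stands in for a writable migration entry */
#define TOY_UFFD_WP  (1u << 1) /* stands in for the swap-format uffd-wp bit */

/*
 * Pre-patch flow as suggested by the diff context: a writable entry is
 * downgraded (keeping whatever uffd-wp bit it already had), but the
 * caller's set/clear request is never applied.
 */
static toy_pmd_t toy_change_old(toy_pmd_t pmd, bool uffd_wp, bool uffd_wp_resolve)
{
	(void)uffd_wp;
	(void)uffd_wp_resolve;

	if (pmd & TOY_WRITABLE)
		pmd &= ~TOY_WRITABLE;
	return pmd;
}

/*
 * Patched flow: start from the existing entry, downgrade it if it is
 * writable, then apply the uffd-wp request independently of the type.
 */
static toy_pmd_t toy_change_new(toy_pmd_t pmd, bool uffd_wp, bool uffd_wp_resolve)
{
	toy_pmd_t newpmd = pmd;

	if (newpmd & TOY_WRITABLE)
		newpmd &= ~TOY_WRITABLE;

	if (uffd_wp)
		newpmd |= TOY_UFFD_WP;
	else if (uffd_wp_resolve)
		newpmd &= ~TOY_UFFD_WP;
	return newpmd;
}
```

With this model, a request to write-protect a read-only migration entry is a no-op in toy_change_old() but sets the bit in toy_change_new(); likewise a resolve request only clears the bit in the new variant.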
On Wed, Apr 05, 2023 at 04:25:34PM +0200, David Hildenbrand wrote:
> Looks like what we fixed for hugetlb in commit 44f86392bdd1 ("mm/hugetlb: fix uffd-wp handling for migration entries in hugetlb_change_protection()") similarly applies to THP.
> Setting/clearing uffd-wp on THP migration entries is not implemented properly. Further, while removing migration PMDs considers the uffd-wp bit, inserting migration PMDs does not consider the uffd-wp bit.
> We have to set/clear independently of the migration entry type in change_huge_pmd() and properly copy the uffd-wp bit in set_pmd_migration_entry().
> Verified using a simple reproducer that triggers migration of a THP, that the set_pmd_migration_entry() no longer loses the uffd-wp bit.
> Fixes: f45ec5ff16a7 ("userfaultfd: wp: support swap and page migration")
> Cc: stable@vger.kernel.org
> Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu peterx@redhat.com
Thanks, one trivial nitpick:
> ---
>  mm/huge_memory.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 032fb0ef9cd1..bdda4f426d58 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1838,10 +1838,10 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  	if (is_swap_pmd(*pmd)) {
>  		swp_entry_t entry = pmd_to_swp_entry(*pmd);
>  		struct page *page = pfn_swap_entry_to_page(entry);
> +		pmd_t newpmd;
>  
>  		VM_BUG_ON(!is_pmd_migration_entry(*pmd));
>  		if (is_writable_migration_entry(entry)) {
> -			pmd_t newpmd;
>  			/*
>  			 * A protection check is difficult so
>  			 * just be safe and disable write
> @@ -1855,8 +1855,16 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  				newpmd = pmd_swp_mksoft_dirty(newpmd);
>  			if (pmd_swp_uffd_wp(*pmd))
>  				newpmd = pmd_swp_mkuffd_wp(newpmd);
> -			set_pmd_at(mm, addr, pmd, newpmd);
> +		} else {
> +			newpmd = *pmd;
>  		}
> +
> +		if (uffd_wp)
> +			newpmd = pmd_swp_mkuffd_wp(newpmd);
> +		else if (uffd_wp_resolve)
> +			newpmd = pmd_swp_clear_uffd_wp(newpmd);
> +		if (!pmd_same(*pmd, newpmd))
> +			set_pmd_at(mm, addr, pmd, newpmd);
>  		goto unlock;
>  	}
>  #endif
> @@ -3251,6 +3259,8 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
>  	pmdswp = swp_entry_to_pmd(entry);
>  	if (pmd_soft_dirty(pmdval))
>  		pmdswp = pmd_swp_mksoft_dirty(pmdswp);
> +	if (pmd_swp_uffd_wp(*pvmw->pmd))
> +		pmdswp = pmd_swp_mkuffd_wp(pmdswp);
I think it's fine to use *pmd, but maybe still better to use pmdval? I worry pmdp_invalidate() can be something else in the future that may affect the bit.
>  	set_pmd_at(mm, address, pvmw->pmd, pmdswp);
>  	page_remove_rmap(page, vma, true);
>  	put_page(page);
> -- 
> 2.39.2
On 05.04.23 17:12, Peter Xu wrote:
> > +	if (pmd_swp_uffd_wp(*pvmw->pmd))
> > +		pmdswp = pmd_swp_mkuffd_wp(pmdswp);
>
> I think it's fine to use *pmd, but maybe still better to use pmdval? I worry pmdp_invalidate() can be something else in the future that may affect the bit.
Wondering how I ended up with that, I realized that it's actually wrong and might have worked by chance for my reproducer on x86.
That should make it work:
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f977c965fdad..fffc953fa6ea 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3257,7 +3257,7 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
 	pmdswp = swp_entry_to_pmd(entry);
 	if (pmd_soft_dirty(pmdval))
 		pmdswp = pmd_swp_mksoft_dirty(pmdswp);
-	if (pmd_swp_uffd_wp(*pvmw->pmd))
+	if (pmd_uffd_wp(pmdval))
 		pmdswp = pmd_swp_mkuffd_wp(pmdswp);
 	set_pmd_at(mm, address, pvmw->pmd, pmdswp);
 	page_remove_rmap(page, vma, true);
On Wed, Apr 05, 2023 at 05:17:31PM +0200, David Hildenbrand wrote:
> Wondering how I ended up with that, I realized that it's actually wrong and might have worked by chance for my reproducer on x86.
>
> That should make it work:
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index f977c965fdad..fffc953fa6ea 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3257,7 +3257,7 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
>  	pmdswp = swp_entry_to_pmd(entry);
>  	if (pmd_soft_dirty(pmdval))
>  		pmdswp = pmd_swp_mksoft_dirty(pmdswp);
> -	if (pmd_swp_uffd_wp(*pvmw->pmd))
> +	if (pmd_uffd_wp(pmdval))
>  		pmdswp = pmd_swp_mkuffd_wp(pmdswp);
>  	set_pmd_at(mm, address, pvmw->pmd, pmdswp);
>  	page_remove_rmap(page, vma, true);
I guess pmd_swp_uffd_wp() just reads the _USER bit 2 which is also set for a present pte, but then it sets swp uffd-wp always even if it was not set.
Yes the change must be squashed in to be correct, with that, my R-b keeps.
Thanks,
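The aliasing described in this reply (the swap-format uffd-wp accessor looking at a bit that means "user" in a present entry) can be sketched in userspace as follows. This is only a toy model under stated assumptions: toy_pmd_t, every TOY_* constant, and all helper functions are hypothetical. Bit 2 is used for the user/swap-uffd-wp overlap because that is what the message above mentions; the position chosen for the present-format uffd-wp bit is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy PMD; the bit positions are illustrative only. */
typedef uint64_t toy_pmd_t;

#define TOY_PRESENT     (1ull << 0)
#define TOY_USER        (1ull << 2)  /* meaning of bit 2 in a present entry */
#define TOY_SWP_UFFD_WP (1ull << 2)  /* meaning of bit 2 in a swap entry */
#define TOY_UFFD_WP     (1ull << 10) /* assumed present-format uffd-wp bit */

/* Accessor for present entries. */
static bool toy_pmd_uffd_wp(toy_pmd_t pmd)
{
	return (pmd & TOY_UFFD_WP) != 0;
}

/* Accessor for swap-format entries; wrong to use on a present entry. */
static bool toy_pmd_swp_uffd_wp(toy_pmd_t pmd)
{
	return (pmd & TOY_SWP_UFFD_WP) != 0;
}

/* Mirrors the first version of the patch: swap accessor on a present value. */
static toy_pmd_t toy_make_migration_buggy(toy_pmd_t pmdval)
{
	toy_pmd_t pmdswp = 0;

	if (toy_pmd_swp_uffd_wp(pmdval))
		pmdswp |= TOY_SWP_UFFD_WP;
	return pmdswp;
}

/* Mirrors the fixup: present accessor on the saved present value. */
static toy_pmd_t toy_make_migration_fixed(toy_pmd_t pmdval)
{
	toy_pmd_t pmdswp = 0;

	if (toy_pmd_uffd_wp(pmdval))
		pmdswp |= TOY_SWP_UFFD_WP;
	return pmdswp;
}
```

In this model, a present user mapping without uffd-wp (TOY_PRESENT | TOY_USER) gains the swap uffd-wp bit through toy_make_migration_buggy() but not through toy_make_migration_fixed(), while a mapping that really has TOY_UFFD_WP set keeps it in both variants.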
On 05.04.23 17:43, Peter Xu wrote:
> I guess pmd_swp_uffd_wp() just reads the _USER bit 2 which is also set for a present pte, but then it sets swp uffd-wp always even if it was not set.
Yes. I modified the reproducer to migrate without uffd-wp first and we suddenly gain a uffd-wp bit.
> Yes the change must be squashed in to be correct, with that, my R-b keeps.
Thanks, I will resend later.
linux-stable-mirror@lists.linaro.org