On 14 Apr 2025, at 3:27, Gavin Guo wrote:
When migrating a THP, concurrent access to the PMD migration entry during a deferred split scan can lead to a page fault, as illustrated
It is an access violation, right? Because pmd_folio(*pmd_migration_entry) does not return a folio address. Page fault made this sounded like not a big issue.
below. To prevent this page fault, it is necessary to check the PMD migration entry and return early. In this context, there is no need to use pmd_to_swp_entry and pfn_swap_entry_to_page to verify the equality of the target folio. Since the PMD migration entry is locked, it cannot be served as the target.
You mean split_huge_pmd_address() locks the PMD page table, so that page migration cannot proceed, or the THP is locked by migration, so that it cannot be split? The sentence is a little confusing to me.
BUG: unable to handle page fault for address: ffffea60001db008 CPU: 0 UID: 0 PID: 2199114 Comm: tee Not tainted 6.14.0+ #4 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:split_huge_pmd_locked+0x3b5/0x2b60 Call Trace:
<TASK> try_to_migrate_one+0x28c/0x3730 rmap_walk_anon+0x4f6/0x770 unmap_folio+0x196/0x1f0 split_huge_page_to_list_to_order+0x9f6/0x1560 deferred_split_scan+0xac5/0x12a0 shrinker_debugfs_scan_write+0x376/0x470 full_proxy_write+0x15c/0x220 vfs_write+0x2fc/0xcb0 ksys_write+0x146/0x250 do_syscall_64+0x6a/0x120 entry_SYSCALL_64_after_hwframe+0x76/0x7e
The bug is found by syzkaller on an internal kernel, then confirmed on upstream.
Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path") Cc: stable@vger.kernel.org Signed-off-by: Gavin Guo gavinguo@igalia.com
mm/huge_memory.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 2a47682d1ab7..0cb9547dcff2 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3075,6 +3075,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, void split_huge_pmd_locked(struct vm_area_struct *vma, unsigned long address, pmd_t *pmd, bool freeze, struct folio *folio) {
- bool pmd_migration = is_pmd_migration_entry(*pmd);
- VM_WARN_ON_ONCE(folio && !folio_test_pmd_mappable(folio)); VM_WARN_ON_ONCE(!IS_ALIGNED(address, HPAGE_PMD_SIZE)); VM_WARN_ON_ONCE(folio && !folio_test_locked(folio));
@@ -3085,10 +3087,18 @@ void split_huge_pmd_locked(struct vm_area_struct *vma, unsigned long address, * require a folio to check the PMD against. Otherwise, there * is a risk of replacing the wrong folio. */
- if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) ||
is_pmd_migration_entry(*pmd)) {
if (folio && folio != pmd_folio(*pmd))
return;
- if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) || pmd_migration) {
if (folio) {
/*
* Do not apply pmd_folio() to a migration entry; and
* folio lock guarantees that it must be of the wrong
* folio anyway.
Why does the folio lock imply it is a wrong folio?
*/
if (pmd_migration)
return;
if (folio != pmd_folio(*pmd))
return;
}
Why not just
if (folio && pmd_migration) return;
if (pmd_trans_huge() …) { … } ?
Thanks.
Best Regards, Yan, Zi