tl;dr: 32-bit x86 without PAE opts into hugetlb page table sharing despite only having 2-level paging, which means the "sharable" page tables are PGDs, and then stuff breaks
On Sun, Jun 29, 2025 at 3:00 PM Vitaly Chikunov <vt@altlinux.org> wrote:
LTP tests failure with the following commit described below:
Uuugh... thanks for letting me know.
On Fri, Jun 20, 2025 at 11:33:32PM +0200, Jann Horn wrote:
From: Liu Shixin <liushixin2@huawei.com>
[ Upstream commit 59d9094df3d79443937add8700b2ef1a866b1081 ]
The folio refcount may be increased unexpectly through try_get_folio() by caller such as split_huge_pages. In huge_pmd_unshare(), we use refcount to check whether a pmd page table is shared. The check is incorrect if the refcount is increased by the above caller, and this can cause the page table leaked:
[...]
The commit causes LTP test memfd_create03 to fail on i586 architecture on v6.1.142 stable release, the test was passing on v6.1.141. Found the commit with git bisect.
Ah, yes, I can reproduce this; specifically, it reproduces on 32-bit X86 builds without X86_PAE. If I enable X86_PAE, the tests pass.
Okay, I don't know precisely why this is breaking, but at a high level: x86 unconditionally selects ARCH_WANT_HUGE_PMD_SHARE (and still does in mainline). That flag means "when we have PMD entries pointing to hugetlb pages, we want to share the PMD table across processes".
32-bit X86 with PAE has 3 page table levels (pgd, pmd, pte); so with this sharing mechanism, we'd have multiple PGD entries pointing to the same PMD. I guess that seems fine.
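To make the level counts concrete, here is a rough sketch of how a 32-bit virtual address is split into table indices in the two configurations. The helper names are made up for illustration and are not kernel functions; the bit widths are the standard x86 layouts (2+9+9+12 bits with PAE, 10+10+12 bits without).

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only, not kernel APIs. */

/* PAE: 4 PGD entries, 512 PMD entries, 512 PTE entries, 4 KiB pages. */
unsigned pae_pgd_index(uint32_t va) { return (unsigned)(va >> 30); }
unsigned pae_pmd_index(uint32_t va) { return (unsigned)((va >> 21) & 0x1ff); }
unsigned pae_pte_index(uint32_t va) { return (unsigned)((va >> 12) & 0x1ff); }

/* non-PAE: 1024 PGD entries, 1024 PTE entries, 4 KiB pages. */
unsigned nopae_pgd_index(uint32_t va) { return (unsigned)(va >> 22); }
unsigned nopae_pte_index(uint32_t va) { return (unsigned)((va >> 12) & 0x3ff); }

/* A huge page is mapped by the entry one level above the PTE table:
 * a PMD entry with PAE (2 MiB), a PGD entry without PAE (4 MiB). */
uint32_t pae_huge_page_size(void)   { return UINT32_C(1) << 21; }
uint32_t nopae_huge_page_size(void) { return UINT32_C(1) << 22; }
```

With PAE, the table holding hugepage entries (the PMD) sits below the PGD, so several PGD entries can in principle point at it. Without PAE, the table holding hugepage entries is the top-level table itself.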
But 32-bit X86 without PAE only has 2 page table levels (pgd, pte). So a hugepage is referenced by a PGD entry, and it makes no sense to try to share PGDs. PGDs not being shared page tables is also baked into (looking at the mainline version) "struct ptdesc", which puts "struct mm_struct *pt_mm;" (for x86 PGDs) and "atomic_t pt_share_count;" (for hugetlb page table sharing) into the same union.
I guess I'll send a patch later to disable page table sharing in non-PAE 32-bit x86... or maybe we should disable it entirely for 32-bit x86...