From: Mike Kravetz <mike.kravetz@oracle.com>
[ Upstream commit 48b8d744ea841b8adf8d07bfe7a2d55f22e4d179 ]
Patch series "Fix prep_compound_gigantic_page ref count adjustment".
These patches address the possible race between prep_compound_gigantic_page and __page_cache_add_speculative as described by Jann Horn in [1].
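For reference, the interleaving looks roughly like this (an illustrative sketch of the race in [1], not the exact upstream code; see patch 2's commit message for the full analysis):

	CPU0: prep_compound_gigantic_page()	CPU1: __page_cache_add_speculative()

	  for each tail page p:
						  page_ref_add_unless(p, 1, 0);
						  /* speculative ref taken */
	    set_page_count(p, 0);
	    /* CPU1's reference is silently overwritten */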
The first patch simply removes the unnecessary/obsolete helper routine prep_compound_huge_page to make the actual fix a little simpler.
The second patch is the actual fix and has a detailed explanation in the commit message.
This potential issue has existed for almost 10 years and I am unaware of anyone actually hitting the race. I did not cc stable, but would be happy to squash the patches and send to stable if anyone thinks that is a good idea.
[1] https://lore.kernel.org/linux-mm/CAG48ez23q0Jy9cuVnwAe7t_fdhMk2S7N5Hdi-GLcCe...
I could not think of a reliable way to recreate the issue for testing. Rather, I 'simulated errors' to exercise all the error paths.

This patch (of 2):
The routine prep_compound_huge_page is a simple wrapper to call either prep_compound_gigantic_page or prep_compound_page. However, it is only called from gather_bootmem_prealloc which only processes gigantic pages. Eliminate the routine and call prep_compound_gigantic_page directly.
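Schematically, the removed wrapper looked like this (condensed from the diff below; comments added for illustration):

	static void __init prep_compound_huge_page(struct page *page,
			unsigned int order)
	{
		if (unlikely(order > (MAX_ORDER - 1)))	/* always true: the sole
							   caller only passes
							   gigantic pages */
			prep_compound_gigantic_page(page, order);
		else					/* dead at the only call site */
			prep_compound_page(page, order);
	}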
Link: https://lkml.kernel.org/r/20210622021423.154662-1-mike.kravetz@oracle.com
Link: https://lkml.kernel.org/r/20210622021423.154662-2-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Youquan Song <youquan.song@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/hugetlb.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 7ba7d9b20494..dbf44b92651b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1313,8 +1313,6 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 	return alloc_contig_pages(nr_pages, gfp_mask, nid, nodemask);
 }
 
-static void prep_new_huge_page(struct hstate *h, struct page *page, int nid);
-static void prep_compound_gigantic_page(struct page *page, unsigned int order);
 #else /* !CONFIG_CONTIG_ALLOC */
 static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 					int nid, nodemask_t *nodemask)
@@ -2504,16 +2502,10 @@ found:
 	return 1;
 }
 
-static void __init prep_compound_huge_page(struct page *page,
-		unsigned int order)
-{
-	if (unlikely(order > (MAX_ORDER - 1)))
-		prep_compound_gigantic_page(page, order);
-	else
-		prep_compound_page(page, order);
-}
-
-/* Put bootmem huge pages into the standard lists after mem_map is up */
+/*
+ * Put bootmem huge pages into the standard lists after mem_map is up.
+ * Note: This only applies to gigantic (order > MAX_ORDER) pages.
+ */
 static void __init gather_bootmem_prealloc(void)
 {
 	struct huge_bootmem_page *m;
@@ -2522,20 +2514,19 @@ static void __init gather_bootmem_prealloc(void)
 		struct page *page = virt_to_page(m);
 		struct hstate *h = m->hstate;
 
+		VM_BUG_ON(!hstate_is_gigantic(h));
 		WARN_ON(page_count(page) != 1);
-		prep_compound_huge_page(page, huge_page_order(h));
+		prep_compound_gigantic_page(page, huge_page_order(h));
 		WARN_ON(PageReserved(page));
 		prep_new_huge_page(h, page, page_to_nid(page));
 		put_page(page); /* free it into the hugepage allocator */
 
 		/*
-		 * If we had gigantic hugepages allocated at boot time, we need
-		 * to restore the 'stolen' pages to totalram_pages in order to
-		 * fix confusing memory reports from free(1) and another
-		 * side-effects, like CommitLimit going negative.
+		 * We need to restore the 'stolen' pages to totalram_pages
+		 * in order to fix confusing memory reports from free(1) and
+		 * other side-effects, like CommitLimit going negative.
 		 */
-		if (hstate_is_gigantic(h))
-			adjust_managed_page_count(page, pages_per_huge_page(h));
+		adjust_managed_page_count(page, pages_per_huge_page(h));
 		cond_resched();
 	}
 }