[patch 076/119] mm: hwpoison: disable memory error handling on 1GB hugepage

5 Apr 2018

From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: mm: hwpoison: disable memory error handling on 1GB hugepage

Recently the following BUG was reported:

    Injecting memory failure for pfn 0x3c0000 at process virtual address 0x7fe300000000
    Memory failure: 0x3c0000: recovery action for huge page: Recovered
    BUG: unable to handle kernel paging request at ffff8dfcc0003000
    IP: gup_pgd_range+0x1f0/0xc20
    PGD 17ae72067 P4D 17ae72067 PUD 0
    Oops: 0000 [#1] SMP PTI
    ...
    CPU: 3 PID: 5467 Comm: hugetlb_1gb Not tainted 4.15.0-rc8-mm1-abc+ #3
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014

You can easily reproduce this by calling madvise(MADV_HWPOISON) twice on
a 1GB hugepage.  This happens because get_user_pages_fast() is not aware
of the migration entry that the first madvise() call created at the pud
level.

I think that the conversion to a pud-aligned migration entry itself is
working, but other MM code that walks the page tables isn't prepared for
it.  We need some time and effort to make all of this work properly, so
this patch avoids the reported bug by simply disabling error handling for
1GB hugepages.
[n-horiguchi@ah.jp.nec.com: v2]
  Link: http://lkml.kernel.org/r/1517284444-18149-1-git-send-email-n-horiguchi@ah.jp...
Link: http://lkml.kernel.org/r/1517207283-15769-1-git-send-email-n-horiguchi@ah.jp...
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h  |    1 +
 mm/memory-failure.c |   16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff -puN include/linux/mm.h~mm-hwpoison-disable-memory-error-handling-on-1gb-hugepage include/linux/mm.h
--- a/include/linux/mm.h~mm-hwpoison-disable-memory-error-handling-on-1gb-hugepage
+++ a/include/linux/mm.h
@@ -2613,6 +2613,7 @@ enum mf_action_page_type {
 	MF_MSG_POISONED_HUGE,
 	MF_MSG_HUGE,
 	MF_MSG_FREE_HUGE,
+	MF_MSG_NON_PMD_HUGE,
 	MF_MSG_UNMAP_FAILED,
 	MF_MSG_DIRTY_SWAPCACHE,
 	MF_MSG_CLEAN_SWAPCACHE,
diff -puN mm/memory-failure.c~mm-hwpoison-disable-memory-error-handling-on-1gb-hugepage mm/memory-failure.c
--- a/mm/memory-failure.c~mm-hwpoison-disable-memory-error-handling-on-1gb-hugepage
+++ a/mm/memory-failure.c
@@ -502,6 +502,7 @@ static const char * const action_page_ty
 	[MF_MSG_POISONED_HUGE]		= "huge page already hardware poisoned",
 	[MF_MSG_HUGE]			= "huge page",
 	[MF_MSG_FREE_HUGE]		= "free huge page",
+	[MF_MSG_NON_PMD_HUGE]		= "non-pmd-sized huge page",
 	[MF_MSG_UNMAP_FAILED]		= "unmapping failed page",
 	[MF_MSG_DIRTY_SWAPCACHE]	= "dirty swapcache page",
 	[MF_MSG_CLEAN_SWAPCACHE]	= "clean swapcache page",
@@ -1084,6 +1085,21 @@ static int memory_failure_hugetlb(unsign
 		return 0;
 	}
+	/*
+	 * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so
+	 * simply disable it. In order to make it work properly, we need
+	 * make sure that:
+	 *  - conversion of a pud that maps an error hugetlb into hwpoison
+	 *    entry properly works, and
+	 *  - other mm code walking over page table is aware of pud-aligned
+	 *    hwpoison entries.
+	 */
+	if (huge_page_size(page_hstate(head)) > PMD_SIZE) {
+		action_result(pfn, MF_MSG_NON_PMD_HUGE, MF_IGNORED);
+		res = -EBUSY;
+		goto out;
+	}
+
 	if (!hwpoison_user_mappings(p, pfn, flags, &head)) {
 		action_result(pfn, MF_MSG_UNMAP_FAILED, MF_IGNORED);
 		res = -EBUSY;
_