As it stands, memory_failure() gets thoroughly confused by dev_pagemap-backed mappings. The recovery code has specific enabling for several possible page states, and it needs new enabling to handle poison in dax mappings.
In order to support reliable reverse mapping of user space addresses, add new locking in the fsdax implementation to prevent races between page-address_space disassociation events and the rmap performed in the memory_failure() path. Additionally, since dev_pagemap pages are hidden from the page allocator, add a mechanism to determine the size of the mapping that encompasses a given poisoned pfn. Lastly, since pmem errors can be repaired, change the speculative-access poison protection, mce_unmap_kpfn(), to be reversible and to otherwise allow ongoing access from the kernel.
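To give a feel for the mapping-size mechanism: because dev_pagemap pages can be mapped by pte, pmd, or pud entries, one plausible shape is a page-table walk that reports the mapping size as a shift corresponding to the page-table level. A minimal sketch follows; the function name and the zero-on-miss fallback are illustrative, not necessarily what the series implements:

/*
 * Illustrative only: walk the page tables to find how much virtual
 * address space is covered by the entry mapping a poisoned
 * dev_pagemap pfn.  The resulting shift can then be reported to the
 * victim process (e.g. via SIGBUS si_addr_lsb) so it knows how much
 * of its mapping is affected.
 */
static unsigned long dev_pagemap_mapping_shift(struct page *page,
		struct vm_area_struct *vma)
{
	unsigned long address = vma_address(page, vma);
	unsigned long ret = 0;
	pgd_t *pgd;
	p4d_t *p4d;
	pud_t *pud;
	pmd_t *pmd;
	pte_t *pte;

	pgd = pgd_offset(vma->vm_mm, address);
	if (!pgd_present(*pgd))
		return 0;
	p4d = p4d_offset(pgd, address);
	if (!p4d_present(*p4d))
		return 0;
	pud = pud_offset(p4d, address);
	if (!pud_present(*pud))
		return 0;
	if (pud_devmap(*pud))
		return PUD_SHIFT;
	pmd = pmd_offset(pud, address);
	if (!pmd_present(*pmd))
		return 0;
	if (pmd_devmap(*pmd))
		return PMD_SHIFT;
	pte = pte_offset_map(pmd, address);
	if (pte_devmap(*pte))
		ret = PAGE_SHIFT;
	pte_unmap(pte);
	return ret;
}

Similarly, the rough idea behind {set,clear}_mce_nospec() is to stop unmapping the poisoned pfn from the kernel linear map and instead remap it uncacheable: uncached accesses no longer trigger speculative machine check logging, yet deliberate kernel access still works. A simplified sketch built on the existing set_memory_uc()/set_memory_wb() helpers (the decoy-address handling that mce_unmap_kpfn() uses to avoid keeping the poisoned linear address live in registers is elided here):

/*
 * Simplified sketch: quarantine a pfn's kernel linear-map alias by
 * making it uncacheable rather than not-present, so the kernel can
 * still read/write the page deliberately, e.g. to repair a pmem error.
 */
static inline int set_mce_nospec(unsigned long pfn)
{
	return set_memory_uc((unsigned long)pfn_to_kaddr(pfn), 1);
}

/* Restore write-back caching once the error has been cleared. */
static inline int clear_mce_nospec(unsigned long pfn)
{
	return set_memory_wb((unsigned long)pfn_to_kaddr(pfn), 1);
}

The intent is that pmem's error-clearing path can call clear_mce_nospec() after a successful repair, which is what the final "restore page attributes when clearing errors" patch provides for.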
---
Dan Williams (11):
      device-dax: convert to vmf_insert_mixed and vm_fault_t
      device-dax: cleanup vm_fault de-reference chains
      device-dax: enable page_mapping()
      device-dax: set page->index
      filesystem-dax: set page->index
      filesystem-dax: perform __dax_invalidate_mapping_entry() under the page lock
      mm, madvise_inject_error: fix page count leak
      x86, memory_failure: introduce {set,clear}_mce_nospec()
      mm, memory_failure: pass page size to kill_proc()
      mm, memory_failure: teach memory_failure() about dev_pagemap pages
      libnvdimm, pmem: restore page attributes when clearing errors
 arch/x86/include/asm/set_memory.h         |   29 ++++++
 arch/x86/kernel/cpu/mcheck/mce-internal.h |   15 ---
 arch/x86/kernel/cpu/mcheck/mce.c          |   38 +-------
 drivers/dax/device.c                      |   91 ++++++++++++--------
 drivers/nvdimm/pmem.c                     |   26 ++++++
 drivers/nvdimm/pmem.h                     |   13 +++
 fs/dax.c                                  |  102 ++++++++++++++++++++--
 include/linux/huge_mm.h                   |    5 +
 include/linux/set_memory.h                |   14 +++
 mm/huge_memory.c                          |    4 -
 mm/madvise.c                              |   11 ++
 mm/memory-failure.c                       |  133 +++++++++++++++++++++++++++--
 12 files changed, 370 insertions(+), 111 deletions(-)