The patch titled
Subject: mm: remove VM_BUG_ON(PageSlab()) from page_mapcount()
has been added to the -mm tree. Its filename is
mm-remove-vm_bug_onpageslab-from-page_mapcount.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-remove-vm_bug_onpageslab-from-p…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-remove-vm_bug_onpageslab-from-p…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Subject: mm: remove VM_BUG_ON(PageSlab()) from page_mapcount()
Replace superfluous VM_BUG_ON() with comment about correct usage.
Technically reverts commit 1d148e218a0d0566b1c06f2f45f1436d53b049b2
("mm: add VM_BUG_ON_PAGE() to page_mapcount()"), but context have changed.
Function isolate_migratepages_block() runs some checks out of lru_lock
when choose pages for migration. After checking PageLRU() it checks extra
page references by comparing page_count() and page_mapcount(). Between
these two checks page could be removed from lru, freed and taken by slab.
As a result this race triggers VM_BUG_ON(PageSlab()) in page_mapcount().
Race window is tiny. For certain workload this happens around once a year.
page:ffffea0105ca9380 count:1 mapcount:0 mapping:ffff88ff7712c180 index:0x0 compound_mapcount: 0
flags: 0x500000000008100(slab|head)
raw: 0500000000008100 dead000000000100 dead000000000200 ffff88ff7712c180
raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
page dumped because: VM_BUG_ON_PAGE(PageSlab(page))
------------[ cut here ]------------
kernel BUG at ./include/linux/mm.h:628!
invalid opcode: 0000 [#1] SMP NOPTI
CPU: 77 PID: 504 Comm: kcompactd1 Tainted: G W 4.19.109-27 #1
Hardware name: Yandex T175-N41-Y3N/MY81-EX0-Y3N, BIOS R05 06/20/2019
RIP: 0010:isolate_migratepages_block+0x986/0x9b0
Code in isolate_migratepages_block() was added in commit 119d6d59dcc0
("mm, compaction: avoid isolating pinned pages") before adding VM_BUG_ON
into page_mapcount().
This race has been predicted in 2015 by Vlastimil Babka (see link below).
Link: http://lkml.kernel.org/r/159032779896.957378.7852761411265662220.stgit@buzz
Link: https://lore.kernel.org/lkml/557710E1.6060103@suse.cz/
Link: https://lore.kernel.org/linux-mm/158937872515.474360.5066096871639561424.st… (v1)
Fixes: 1d148e218a0d ("mm: add VM_BUG_ON_PAGE() to page_mapcount()")
Signed-off-by: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Acked-by: Hugh Dickins <hughd(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: David Rientjes <rientjes(a)google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
--- a/include/linux/mm.h~mm-remove-vm_bug_onpageslab-from-page_mapcount
+++ a/include/linux/mm.h
@@ -782,6 +782,11 @@ static inline void *kvcalloc(size_t n, s
extern void kvfree(const void *addr);
+/*
+ * Mapcount of compound page as a whole, not includes mapped sub-pages.
+ *
+ * Must be called only for compound pages or any their tail sub-pages.
+ */
static inline int compound_mapcount(struct page *page)
{
VM_BUG_ON_PAGE(!PageCompound(page), page);
@@ -801,10 +806,15 @@ static inline void page_mapcount_reset(s
int __page_mapcount(struct page *page);
+/*
+ * Mapcount of 0-order page, for sub-page includes compound_mapcount().
+ *
+ * Result is undefined for pages which cannot be mapped into userspace.
+ * For example SLAB or special types of pages. See function page_has_type().
+ * They use this place in struct page differently.
+ */
static inline int page_mapcount(struct page *page)
{
- VM_BUG_ON_PAGE(PageSlab(page), page);
-
if (unlikely(PageCompound(page)))
return __page_mapcount(page);
return atomic_read(&page->_mapcount) + 1;
_
Patches currently in -mm which might be from khlebnikov(a)yandex-team.ru are
mm-compaction-avoid-vm_bug_onpageslab-in-page_mapcount.patch
mm-remove-vm_bug_onpageslab-from-page_mapcount.patch
kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
doc-cgroup-update-note-about-conditions-when-oom-killer-is-invoked.patch
The patch titled
Subject: mm,thp: stop leaking unreleased file pages
has been added to the -mm tree. Its filename is
mmthp-stop-leaking-unreleased-file-pages.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mmthp-stop-leaking-unreleased-file…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mmthp-stop-leaking-unreleased-file…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm,thp: stop leaking unreleased file pages
When collapse_file() calls try_to_release_page(), it has already isolated
the page: so if releasing buffers happens to fail (as it sometimes does),
remember to putback_lru_page(): otherwise that page is left unreclaimable
and unfreeable, and the file extent uncollapsible.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2005231837500.1766@eggly.anvils
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: <stable(a)vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/khugepaged.c~mmthp-stop-leaking-unreleased-file-pages
+++ a/mm/khugepaged.c
@@ -1692,6 +1692,7 @@ static void collapse_file(struct mm_stru
if (page_has_private(page) &&
!try_to_release_page(page, GFP_KERNEL)) {
result = SCAN_PAGE_HAS_PRIVATE;
+ putback_lru_page(page);
goto out_unlock;
}
_
Patches currently in -mm which might be from hughd(a)google.com are
mmthp-stop-leaking-unreleased-file-pages.patch
mm-memcontrol-charge-swapin-pages-on-instantiation-fix.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch
The patch titled
Subject: z3fold: fix use-after-free when freeing handles
has been removed from the -mm tree. Its filename was
z3fold-fix-use-after-free-when-freeing-handles.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Uladzislau Rezki <uladzislau.rezki(a)sony.com>
Subject: z3fold: fix use-after-free when freeing handles
free_handle() for a foreign handle may race with inter-page compaction,
what can lead to memory corruption. To avoid that, take write lock not
read lock in free_handle to be synchronized with __release_z3fold_page().
For example KASAN can detect it:
[ 33.723357] ==================================================================
[ 33.723401] BUG: KASAN: use-after-free in LZ4_decompress_safe+0x2c4/0x3b8
[ 33.723418] Read of size 1 at addr ffffffc976695ca3 by task GoogleApiHandle/4121
[ 33.723428]
[ 33.723449] CPU: 0 PID: 4121 Comm: GoogleApiHandle Tainted: P S OE 4.19.81-perf+ #162
[ 33.723461] Hardware name: Sony Mobile Communications. PDX-203(KONA) (DT)
[ 33.723473] Call trace:
[ 33.723495] dump_backtrace+0x0/0x288
[ 33.723512] show_stack+0x14/0x20
[ 33.723533] dump_stack+0xe4/0x124
[ 33.723551] print_address_description+0x80/0x2e0
[ 33.723566] kasan_report+0x268/0x2d0
[ 33.723584] __asan_load1+0x4c/0x58
[ 33.723601] LZ4_decompress_safe+0x2c4/0x3b8
[ 33.723619] lz4_decompress_crypto+0x3c/0x70
[ 33.723636] crypto_decompress+0x58/0x70
[ 33.723656] zcomp_decompress+0xd4/0x120
...
Apart from that, initialize zhdr->mapped_count in init_z3fold_page() and
remove "newpage" variable because it is not used anywhere.
Link: http://lkml.kernel.org/r/20200520082100.28876-1-vitaly.wool@konsulko.com
Signed-off-by: Uladzislau Rezki <uladzislau.rezki(a)sony.com>
Signed-off-by: Vitaly Wool <vitaly.wool(a)konsulko.com>
Cc: Qian Cai <cai(a)lca.pw>
Cc: Raymond Jennings <shentino(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/z3fold.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
--- a/mm/z3fold.c~z3fold-fix-use-after-free-when-freeing-handles
+++ a/mm/z3fold.c
@@ -318,16 +318,16 @@ static inline void free_handle(unsigned
slots = handle_to_slots(handle);
write_lock(&slots->lock);
*(unsigned long *)handle = 0;
- write_unlock(&slots->lock);
- if (zhdr->slots == slots)
+ if (zhdr->slots == slots) {
+ write_unlock(&slots->lock);
return; /* simple case, nothing else to do */
+ }
/* we are freeing a foreign handle if we are here */
zhdr->foreign_handles--;
is_free = true;
- read_lock(&slots->lock);
if (!test_bit(HANDLES_ORPHANED, &slots->pool)) {
- read_unlock(&slots->lock);
+ write_unlock(&slots->lock);
return;
}
for (i = 0; i <= BUDDY_MASK; i++) {
@@ -336,7 +336,7 @@ static inline void free_handle(unsigned
break;
}
}
- read_unlock(&slots->lock);
+ write_unlock(&slots->lock);
if (is_free) {
struct z3fold_pool *pool = slots_to_pool(slots);
@@ -422,6 +422,7 @@ static struct z3fold_header *init_z3fold
zhdr->start_middle = 0;
zhdr->cpu = -1;
zhdr->foreign_handles = 0;
+ zhdr->mapped_count = 0;
zhdr->slots = slots;
zhdr->pool = pool;
INIT_LIST_HEAD(&zhdr->buddy);
_
Patches currently in -mm which might be from uladzislau.rezki(a)sony.com are
The patch titled
Subject: sparc32: use PUD rather than PGD to get PMD in srmmu_nocache_init()
has been removed from the -mm tree. Its filename was
sparc32-use-pud-rather-than-pgd-to-get-pmd-in-srmmu_nocache_init.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mike Rapoport <rppt(a)linux.ibm.com>
Subject: sparc32: use PUD rather than PGD to get PMD in srmmu_nocache_init()
The kbuild test robot reported the following warning:
arch/sparc/mm/srmmu.c: In function 'srmmu_nocache_init':
>> arch/sparc/mm/srmmu.c:300:9: error: variable 'pud' set but not used
>> [-Werror=unused-but-set-variable]
300 | pud_t *pud;
This warning is caused by misprint in the page table traversal in
srmmu_nocache_init() function which accessed a PMD entry using PGD rather
than PUD.
Since sparc32 has only 3 page table levels, the PGD and PUD are
essentially the same and usage of __nocache_fix() removed the type
checking.
Use PUD for the consistency and to silence the compiler warning.
Link: http://lkml.kernel.org/r/20200520132005.GM1059226@linux.ibm.com
Fixes: 7235db268a2777bc38 ("sparc32: use pgtable-nopud instead of 4level-fixup")
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reported-by: kbuild test robot <lkp(a)intel.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: Anatoly Pugachev <matorola(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/sparc/mm/srmmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/sparc/mm/srmmu.c~sparc32-use-pud-rather-than-pgd-to-get-pmd-in-srmmu_nocache_init
+++ a/arch/sparc/mm/srmmu.c
@@ -333,7 +333,7 @@ static void __init srmmu_nocache_init(vo
pgd = pgd_offset_k(vaddr);
p4d = p4d_offset(__nocache_fix(pgd), vaddr);
pud = pud_offset(__nocache_fix(p4d), vaddr);
- pmd = pmd_offset(__nocache_fix(pgd), vaddr);
+ pmd = pmd_offset(__nocache_fix(pud), vaddr);
pte = pte_offset_kernel(__nocache_fix(pmd), vaddr);
pteval = ((paddr >> 4) | SRMMU_ET_PTE | SRMMU_PRIV);
_
Patches currently in -mm which might be from rppt(a)linux.ibm.com are
mm-memblock-replace-dereferences-of-memblock_regionnid-with-api-calls.patch
mm-make-early_pfn_to_nid-and-related-defintions-close-to-each-other.patch
mm-remove-config_have_memblock_node_map-option.patch
mm-free_area_init-use-maximal-zone-pfns-rather-than-zone-sizes.patch
mm-use-free_area_init-instead-of-free_area_init_nodes.patch
alpha-simplify-detection-of-memory-zone-boundaries.patch
arm-simplify-detection-of-memory-zone-boundaries.patch
arm64-simplify-detection-of-memory-zone-boundaries-for-uma-configs.patch
csky-simplify-detection-of-memory-zone-boundaries.patch
m68k-mm-simplify-detection-of-memory-zone-boundaries.patch
parisc-simplify-detection-of-memory-zone-boundaries.patch
sparc32-simplify-detection-of-memory-zone-boundaries.patch
unicore32-simplify-detection-of-memory-zone-boundaries.patch
xtensa-simplify-detection-of-memory-zone-boundaries.patch
mm-remove-early_pfn_in_nid-and-config_nodes_span_other_nodes.patch
mm-free_area_init-allow-defining-max_zone_pfn-in-descending-order.patch
mm-free_area_init-allow-defining-max_zone_pfn-in-descending-order-fix-2.patch
mm-rename-free_area_init_node-to-free_area_init_memoryless_node.patch
mm-clean-up-free_area_init_node-and-its-helpers.patch
mm-simplify-find_min_pfn_with_active_regions.patch
docs-vm-update-memory-models-documentation.patch
h8300-remove-usage-of-__arch_use_5level_hack.patch
arm-add-support-for-folded-p4d-page-tables.patch
arm-add-support-for-folded-p4d-page-tables-fix.patch
arm64-add-support-for-folded-p4d-page-tables.patch
hexagon-remove-__arch_use_5level_hack.patch
ia64-add-support-for-folded-p4d-page-tables.patch
nios2-add-support-for-folded-p4d-page-tables.patch
openrisc-add-support-for-folded-p4d-page-tables.patch
powerpc-add-support-for-folded-p4d-page-tables.patch
powerpc-add-support-for-folded-p4d-page-tables-fix.patch
powerpc-add-support-for-folded-p4d-page-tables-fix-2.patch
sh-drop-__pxd_offset-macros-that-duplicate-pxd_index-ones.patch
sh-add-support-for-folded-p4d-page-tables.patch
unicore32-remove-__arch_use_5level_hack.patch
asm-generic-remove-pgtable-nop4d-hackh.patch
mm-remove-__arch_has_5level_hack-and-include-asm-generic-5level-fixuph.patch
mm-dont-include-asm-pgtableh-if-linux-mmh-is-already-included.patch
mm-introduce-include-linux-pgtableh.patch
mm-reorder-includes-after-introduction-of-linux-pgtableh.patch
csky-replace-definitions-of-__pxd_offset-with-pxd_index.patch
m68k-mm-motorola-move-comment-about-page-table-allocation-funcitons.patch
m68k-mm-move-cachenocahe_page-definitions-close-to-their-user.patch
x86-mm-simplify-init_trampoline-and-surrounding-logic.patch
mm-pgtable-add-shortcuts-for-accessing-kernel-pmd-and-pte.patch
mm-pgtable-add-shortcuts-for-accessing-kernel-pmd-and-pte-fix.patch
mm-pgtable-add-shortcuts-for-accessing-kernel-pmd-and-pte-fix-2.patch
mm-consolidate-pte_index-and-pte_offset_-definitions.patch
mm-consolidate-pmd_index-and-pmd_offset-definitions.patch
mm-consolidate-pud_index-and-pud_offset-definitions.patch
mm-consolidate-pgd_index-and-pgd_offset_k-definitions.patch
The patch titled
Subject: sh: include linux/time_types.h for sockios
has been removed from the -mm tree. Its filename was
sh-include-linux-time_typesh-for-sockios.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Arnd Bergmann <arnd(a)arndb.de>
Subject: sh: include linux/time_types.h for sockios
Using the socket ioctls on arch/sh (and only there) causes build time
problems when __kernel_old_timeval/__kernel_old_timespec are not already
visible to the compiler.
Add an explict include line for the header that defines these
structures.
Link: http://lkml.kernel.org/r/20200519131327.1836482-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Reported-by: John Paul Adrian Glaubitz <glaubitz(a)physik.fu-berlin.de>
Tested-by: John Paul Adrian Glaubitz <glaubitz(a)physik.fu-berlin.de>
Fixes: 8c709f9a0693 ("y2038: sh: remove timeval/timespec usage from headers")
Fixes: 0768e17073dc ("net: socket: implement 64-bit timestamps")
Cc: Yoshinori Sato <ysato(a)users.sourceforge.jp>
Cc: Rich Felker <dalias(a)libc.org>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: John Paul Adrian Glaubitz <glaubitz(a)physik.fu-berlin.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/sh/include/uapi/asm/sockios.h | 2 ++
1 file changed, 2 insertions(+)
--- a/arch/sh/include/uapi/asm/sockios.h~sh-include-linux-time_typesh-for-sockios
+++ a/arch/sh/include/uapi/asm/sockios.h
@@ -2,6 +2,8 @@
#ifndef __ASM_SH_SOCKIOS_H
#define __ASM_SH_SOCKIOS_H
+#include <linux/time_types.h>
+
/* Socket-level I/O control calls. */
#define FIOGETOWN _IOR('f', 123, int)
#define FIOSETOWN _IOW('f', 124, int)
_
Patches currently in -mm which might be from arnd(a)arndb.de are
drm-remove-drm-specific-kmap_atomic-code-fix.patch
bitops-avoid-clang-shift-count-overflow-warnings.patch
ubsan-fix-gcc-10-warnings.patch
arm64-add-support-for-folded-p4d-page-tables-fix.patch
The patch titled
Subject: rapidio: fix an error in get_user_pages_fast() error handling
has been removed from the -mm tree. Its filename was
rapidio-fix-an-error-in-get_user_pages_fast-error-handling.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: John Hubbard <jhubbard(a)nvidia.com>
Subject: rapidio: fix an error in get_user_pages_fast() error handling
In the case of get_user_pages_fast() returning fewer pages than requested,
rio_dma_transfer() does not quite do the right thing. It attempts to
release all the pages that were requested, rather than just the pages that
were pinned.
Fix the error handling so that only the pages that were successfully
pinned are released.
Link: http://lkml.kernel.org/r/20200517235620.205225-2-jhubbard@nvidia.com
Fixes: e8de370188d0 ("rapidio: add mport char device driver")
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Matt Porter <mporter(a)kernel.crashing.org>
Cc: Alexandre Bounine <alex.bou9(a)gmail.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Dan Carpenter <dan.carpenter(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/rapidio/devices/rio_mport_cdev.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/rapidio/devices/rio_mport_cdev.c~rapidio-fix-an-error-in-get_user_pages_fast-error-handling
+++ a/drivers/rapidio/devices/rio_mport_cdev.c
@@ -877,6 +877,11 @@ rio_dma_transfer(struct file *filp, u32
rmcd_error("pinned %ld out of %ld pages",
pinned, nr_pages);
ret = -EFAULT;
+ /*
+ * Set nr_pages up to mean "how many pages to unpin, in
+ * the error handler:
+ */
+ nr_pages = pinned;
goto err_pg;
}
_
Patches currently in -mm which might be from jhubbard(a)nvidia.com are
mm-gup-introduce-pin_user_pages_unlocked.patch
ivtv-convert-get_user_pages-pin_user_pages.patch
mm-gup-move-__get_user_pages_fast-down-a-few-lines-in-gupc.patch
mm-gup-refactor-and-de-duplicate-gup_fast-code.patch
mm-gup-refactor-and-de-duplicate-gup_fast-code-fix.patch
mm-gup-introduce-pin_user_pages_fast_only.patch
drm-i915-convert-get_user_pages-pin_user_pages.patch
mm-gup-might_lock_readmmap_sem-in-get_user_pages_fast.patch
khugepaged-add-self-test-fix-3.patch
rapidio-convert-get_user_pages-pin_user_pages.patch
The patch titled
Subject: device-dax: don't leak kernel memory to user space after unloading kmem
has been removed from the -mm tree. Its filename was
device-dax-dont-leak-kernel-memory-to-user-space-after-unloading-kmem.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: device-dax: don't leak kernel memory to user space after unloading kmem
Assume we have kmem configured and loaded:
[root@localhost ~]# cat /proc/iomem
...
140000000-33fffffff : Persistent Memory$
140000000-1481fffff : namespace0.0
150000000-33fffffff : dax0.0
150000000-33fffffff : System RAM
Assume we try to unload kmem. This force-unloading will work, even if
memory cannot get removed from the system.
[root@localhost ~]# rmmod kmem
[ 86.380228] removing memory fails, because memory [0x0000000150000000-0x0000000157ffffff] is onlined
...
[ 86.431225] kmem dax0.0: DAX region [mem 0x150000000-0x33fffffff] cannot be hotremoved until the next reboot
Now, we can reconfigure the namespace:
[root@localhost ~]# ndctl create-namespace --force --reconfig=namespace0.0 --mode=devdax
[ 131.409351] nd_pmem namespace0.0: could not reserve region [mem 0x140000000-0x33fffffff]dax
[ 131.410147] nd_pmem: probe of namespace0.0 failed with error -16namespace0.0 --mode=devdax
...
This fails as expected due to the busy memory resource, and the memory
cannot be used. However, the dax0.0 device is removed, and along its name.
The name of the memory resource now points at freed memory (name of the
device).
[root@localhost ~]# cat /proc/iomem
...
140000000-33fffffff : Persistent Memory
140000000-1481fffff : namespace0.0
150000000-33fffffff : �_�^7_��/_��wR��WQ���^��� ...
150000000-33fffffff : System RAM
We have to make sure to duplicate the string. While at it, remove the
superfluous setting of the name and fixup a stale comment.
Link: http://lkml.kernel.org/r/20200508084217.9160-2-david@redhat.com
Fixes: 9f960da72b25 ("device-dax: "Hotremove" persistent memory that is used like normal RAM")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Cc: Dave Jiang <dave.jiang(a)intel.com>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: <stable(a)vger.kernel.org> [5.3]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/dax/kmem.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
--- a/drivers/dax/kmem.c~device-dax-dont-leak-kernel-memory-to-user-space-after-unloading-kmem
+++ a/drivers/dax/kmem.c
@@ -22,6 +22,7 @@ int dev_dax_kmem_probe(struct device *de
resource_size_t kmem_size;
resource_size_t kmem_end;
struct resource *new_res;
+ const char *new_res_name;
int numa_node;
int rc;
@@ -48,11 +49,16 @@ int dev_dax_kmem_probe(struct device *de
kmem_size &= ~(memory_block_size_bytes() - 1);
kmem_end = kmem_start + kmem_size;
- /* Region is permanently reserved. Hot-remove not yet implemented. */
- new_res = request_mem_region(kmem_start, kmem_size, dev_name(dev));
+ new_res_name = kstrdup(dev_name(dev), GFP_KERNEL);
+ if (!new_res_name)
+ return -ENOMEM;
+
+ /* Region is permanently reserved if hotremove fails. */
+ new_res = request_mem_region(kmem_start, kmem_size, new_res_name);
if (!new_res) {
dev_warn(dev, "could not reserve region [%pa-%pa]\n",
&kmem_start, &kmem_end);
+ kfree(new_res_name);
return -EBUSY;
}
@@ -63,12 +69,12 @@ int dev_dax_kmem_probe(struct device *de
* unknown to us that will break add_memory() below.
*/
new_res->flags = IORESOURCE_SYSTEM_RAM;
- new_res->name = dev_name(dev);
rc = add_memory(numa_node, new_res->start, resource_size(new_res));
if (rc) {
release_resource(new_res);
kfree(new_res);
+ kfree(new_res_name);
return rc;
}
dev_dax->dax_kmem_res = new_res;
@@ -83,6 +89,7 @@ static int dev_dax_kmem_remove(struct de
struct resource *res = dev_dax->dax_kmem_res;
resource_size_t kmem_start = res->start;
resource_size_t kmem_size = resource_size(res);
+ const char *res_name = res->name;
int rc;
/*
@@ -102,6 +109,7 @@ static int dev_dax_kmem_remove(struct de
/* Release and free dax resources */
release_resource(res);
kfree(res);
+ kfree(res_name);
dev_dax->dax_kmem_res = NULL;
return 0;
_
Patches currently in -mm which might be from david(a)redhat.com are
drivers-base-memoryc-cache-memory-blocks-in-xarray-to-accelerate-lookup-fix.patch
powerpc-pseries-hotplug-memory-stop-checking-is_mem_section_removable.patch
mm-memory_hotplug-remove-is_mem_section_removable.patch
mm-memory_hotplug-set-node_start_pfn-of-hotadded-pgdat-to-0.patch
mm-memory_hotplug-handle-memblocks-only-with-config_arch_keep_memblock.patch
mm-memory_hotplug-introduce-add_memory_driver_managed.patch
kexec_file-dont-place-kexec-images-on-ioresource_mem_driver_managed.patch
device-dax-add-memory-via-add_memory_driver_managed.patch