The patch titled
Subject: mm/migrate.c: add missing flush_dcache_page for non-mapped page migrate
has been removed from the -mm tree. Its filename was
mm-migrate-add-missing-flush_dcache_page-for-non-mapped-page-migrate.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Lars Persson <lars.persson@axis.com>
Subject: mm/migrate.c: add missing flush_dcache_page for non-mapped page migrate
Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL and
SIGSEGV that could not be traced back to a userspace code bug. They showed
all the hallmarks of an I/D cache coherency issue.
We recently noticed that the /proc/sys/vm/compact_memory interface was
quite efficient at provoking this class of userspace crashes.
Studying the code in mm/migrate.c, we found that a distinction is made
between migrating a page that is mapped at the instant of migration and
one that is not. Our problem turned out to be the non-mapped pages.
For a non-mapped page, the code copies the page content and all relevant
metadata of the page without performing the required D-cache maintenance.
This leaves dirty data in the CPU's D-cache, and on 1004K cores that data
is not visible to the I-cache. A subsequent page fault that triggers a
mapping of the page will then happily serve the process potentially stale
code.
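A minimal sketch of the maintenance that was missing, assuming an
architecture whose I-cache does not snoop the D-cache (copy_highpage()
and flush_dcache_page() are the standard kernel helpers; the helper name
is illustrative):

	#include <linux/highmem.h>	/* copy_highpage(), flush_dcache_page() */

	/* Copy one page and make the copy visible to instruction fetch.
	 * Without the flush, dirty D-cache lines may never reach the point
	 * of unification, so the I-cache can later fetch stale code. */
	static void copy_page_and_flush(struct page *dst, struct page *src)
	{
		copy_highpage(dst, src);	/* data copy via kernel mapping */
		flush_dcache_page(dst);		/* write back dirty D-cache lines */
	}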
Why, then, hasn't ARM seen greater exposure to this bug? ARM became
immune to this flaw back in 2010; see commit c01778001a4f ("ARM:
6379/1: Assume new page cache pages have dirty D-cache").
My proposed fix moves the D-cache maintenance inside move_to_new_page()
to make it common to both cases.
Link: http://lkml.kernel.org/r/20190315083502.11849-1-larper@axis.com
Fixes: 97ee0524614 ("flush cache before installing new page at migraton")
Signed-off-by: Lars Persson <larper@axis.com>
Reviewed-by: Paul Burton <paul.burton@mips.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/migrate.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
--- a/mm/migrate.c~mm-migrate-add-missing-flush_dcache_page-for-non-mapped-page-migrate
+++ a/mm/migrate.c
@@ -248,10 +248,8 @@ static bool remove_migration_pte(struct
pte = swp_entry_to_pte(entry);
} else if (is_device_public_page(new)) {
pte = pte_mkdevmap(pte);
- flush_dcache_page(new);
}
- } else
- flush_dcache_page(new);
+ }
#ifdef CONFIG_HUGETLB_PAGE
if (PageHuge(new)) {
@@ -995,6 +993,13 @@ static int move_to_new_page(struct page
*/
if (!PageMappingFlags(page))
page->mapping = NULL;
+
+ if (unlikely(is_zone_device_page(newpage))) {
+ if (is_device_public_page(newpage))
+ flush_dcache_page(newpage);
+ } else
+ flush_dcache_page(newpage);
+
}
out:
return rc;
_
Patches currently in -mm which might be from lars.persson@axis.com are
The patch titled
Subject: mm/page_isolation.c: fix a wrong flag in set_migratetype_isolate()
has been removed from the -mm tree. Its filename was
mm-fix-a-wrong-flag-in-set_migratetype_isolate.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Qian Cai <cai@lca.pw>
Subject: mm/page_isolation.c: fix a wrong flag in set_migratetype_isolate()
In set_migratetype_isolate(), has_unmovable_pages() is passed the saved
irqsave "flags" value instead of the isolation flags. As a result,
HWPOISON handling and error reporting misbehave: dump_page() is not
called when an unmovable page is found.
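A condensed sketch of the two same-named-but-unrelated values inside
set_migratetype_isolate() (names as in 5.0's mm/page_isolation.c;
surrounding code elided):

	unsigned long flags;	/* IRQ state saved by spin_lock_irqsave() */
	int isol_flags;		/* SKIP_HWPOISON | REPORT_FAILURE bit mask */

	spin_lock_irqsave(&zone->lock, flags);
	/* The bug handed "flags" -- arbitrary saved IRQ state -- to
	 * has_unmovable_pages(), which expects "isol_flags"; REPORT_FAILURE
	 * was therefore unreliable and dump_page() was silently skipped. */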
Link: http://lkml.kernel.org/r/20190320204941.53731-1-cai@lca.pw
Fixes: d381c54760dc ("mm: only report isolation failures when offlining memory")
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Qian Cai <cai@lca.pw>
Cc: <stable@vger.kernel.org> [5.0.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/page_isolation.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/page_isolation.c~mm-fix-a-wrong-flag-in-set_migratetype_isolate
+++ a/mm/page_isolation.c
@@ -59,7 +59,8 @@ static int set_migratetype_isolate(struc
* FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
* We just check MOVABLE pages.
*/
- if (!has_unmovable_pages(zone, page, arg.pages_found, migratetype, flags))
+ if (!has_unmovable_pages(zone, page, arg.pages_found, migratetype,
+ isol_flags))
ret = 0;
/*
_
Patches currently in -mm which might be from cai@lca.pw are
mm-compaction-abort-search-if-isolation-fails-v2.patch
mm-compaction-fix-an-undefined-behaviour.patch
initramfs-cleanup-populate_rootfs-fix.patch
The patch titled
Subject: mm/debug.c: fix __dump_page when mapping->host is not set
has been removed from the -mm tree. Its filename was
mm-fix-__dump_page-when-mapping-host-is-not-set.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm/debug.c: fix __dump_page when mapping->host is not set
While debugging something, I added a dump_page() into do_swap_page() and
got the splat below. The issue happens when dereferencing mapping->host
in __dump_page():
...
else if (mapping) {
pr_warn("%ps ", mapping->a_ops);
if (mapping->host->i_dentry.first) {
struct dentry *dentry;
dentry = container_of(mapping->host->i_dentry.first, struct dentry, d_u.d_alias);
pr_warn("name:\"%pd\" ", dentry);
}
}
...
The swap address space does not contain inode information, so
mapping->host is NULL.
Although the dump_page() call was added artificially to do_swap_page(), I
am not sure whether this can be hit from any other path, so it seems
worth fixing. We can do that easily by checking mapping->host first.
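A one-line sketch of the NULL-safe pattern the fix applies (it guards
any address_space whose ->host may be NULL, such as the swap address
space):

	if (mapping->host && mapping->host->i_dentry.first) {
		/* only now is it safe to walk the inode's dentry aliases */
	}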
Link: http://lkml.kernel.org/r/20190318072931.29094-1-osalvador@suse.de
Fixes: 1c6fb1d89e73c ("mm: print more information about mapping in __dump_page")
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/debug.c~mm-fix-__dump_page-when-mapping-host-is-not-set
+++ a/mm/debug.c
@@ -79,7 +79,7 @@ void __dump_page(struct page *page, cons
pr_warn("ksm ");
else if (mapping) {
pr_warn("%ps ", mapping->a_ops);
- if (mapping->host->i_dentry.first) {
+ if (mapping->host && mapping->host->i_dentry.first) {
struct dentry *dentry;
dentry = container_of(mapping->host->i_dentry.first, struct dentry, d_u.d_alias);
pr_warn("name:\"%pd\" ", dentry);
_
Patches currently in -mm which might be from osalvador@suse.de are
mmmemory_hotplug-unlock-1gb-hugetlb-on-x86_64.patch
mmmemory_hotplug-drop-redundant-hugepage_migration_supported-check.patch
The patch titled
Subject: mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
has been removed from the -mm tree. Its filename was
mm-mempolicy-make-mbind-return-eio-when-mpol_mf_strict-is-specified.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Yang Shi <yang.shi@linux.alibaba.com>
Subject: mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
When MPOL_MF_STRICT was specified and an existing page was already on a
node that does not follow the policy, mbind() should return -EIO. But
commit 6f4576e3687b ("mempolicy: apply page table walker on
queue_pages_range()") broke that rule.
And commit c8633798497c ("mm: mempolicy: mbind and migrate_pages support
thp migration") didn't return the correct value for THP mbind() either.
If MPOL_MF_STRICT is set, ignore vma_migratable() so that the walk
reaches queue_pages_pte_range() or queue_pages_pmd() and can check
whether an existing page is already on a node that does not follow the
policy. And since a non-migratable vma may be encountered, return -EIO
as well if MPOL_MF_MOVE or MPOL_MF_MOVE_ALL was specified.
Tested with https://github.com/metan-ucw/ltp/blob/master/testcases/kernel/syscalls/mbin…
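A hedged userspace sketch of the restored semantics (node numbers and
sizes are illustrative; assumes a NUMA machine with at least two nodes,
libnuma headers, and linking with -lnuma). Pages are faulted in first
under the default policy, so a later strict bind to a different node
must fail with EIO:

	#include <errno.h>
	#include <numaif.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	int main(void)
	{
		size_t len = 16 * 4096;
		void *buf = aligned_alloc(4096, len);
		unsigned long nodemask = 1UL << 1;	/* node 1 only */

		if (!buf)
			return 1;
		memset(buf, 1, len);	/* fault pages in, typically on node 0 */
		/* MPOL_MF_STRICT without MPOL_MF_MOVE*: existing misplaced
		 * pages must make mbind() fail with EIO after this fix. */
		if (mbind(buf, len, MPOL_BIND, &nodemask,
			  8 * sizeof(nodemask), MPOL_MF_STRICT) == -1 &&
		    errno == EIO)
			puts("mbind: got the expected EIO");
		return 0;
	}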
[akpm@linux-foundation.org: tweak code comment]
Link: http://lkml.kernel.org/r/1553020556-38583-1-git-send-email-yang.shi@linux.a…
Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reported-by: Cyril Hrubis <chrubis@suse.cz>
Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: Rafael Aquini <aquini@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/mempolicy.c | 40 +++++++++++++++++++++++++++++++++-------
1 file changed, 33 insertions(+), 7 deletions(-)
--- a/mm/mempolicy.c~mm-mempolicy-make-mbind-return-eio-when-mpol_mf_strict-is-specified
+++ a/mm/mempolicy.c
@@ -428,6 +428,13 @@ static inline bool queue_pages_required(
return node_isset(nid, *qp->nmask) == !(flags & MPOL_MF_INVERT);
}
+/*
+ * queue_pages_pmd() has three possible return values:
+ * 1 - pages are placed on the right node or queued successfully.
+ * 0 - THP was split.
+ * -EIO - is migration entry or MPOL_MF_STRICT was specified and an existing
+ * page was already on a node that does not follow the policy.
+ */
static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
unsigned long end, struct mm_walk *walk)
{
@@ -437,7 +444,7 @@ static int queue_pages_pmd(pmd_t *pmd, s
unsigned long flags;
if (unlikely(is_pmd_migration_entry(*pmd))) {
- ret = 1;
+ ret = -EIO;
goto unlock;
}
page = pmd_page(*pmd);
@@ -454,8 +461,15 @@ static int queue_pages_pmd(pmd_t *pmd, s
ret = 1;
flags = qp->flags;
/* go to thp migration */
- if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
+ if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
+ if (!vma_migratable(walk->vma)) {
+ ret = -EIO;
+ goto unlock;
+ }
+
migrate_page_add(page, qp->pagelist, flags);
+ } else
+ ret = -EIO;
unlock:
spin_unlock(ptl);
out:
@@ -480,8 +494,10 @@ static int queue_pages_pte_range(pmd_t *
ptl = pmd_trans_huge_lock(pmd, vma);
if (ptl) {
ret = queue_pages_pmd(pmd, ptl, addr, end, walk);
- if (ret)
+ if (ret > 0)
return 0;
+ else if (ret < 0)
+ return ret;
}
if (pmd_trans_unstable(pmd))
@@ -502,11 +518,16 @@ static int queue_pages_pte_range(pmd_t *
continue;
if (!queue_pages_required(page, qp))
continue;
- migrate_page_add(page, qp->pagelist, flags);
+ if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
+ if (!vma_migratable(vma))
+ break;
+ migrate_page_add(page, qp->pagelist, flags);
+ } else
+ break;
}
pte_unmap_unlock(pte - 1, ptl);
cond_resched();
- return 0;
+ return addr != end ? -EIO : 0;
}
static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
@@ -576,7 +597,12 @@ static int queue_pages_test_walk(unsigne
unsigned long endvma = vma->vm_end;
unsigned long flags = qp->flags;
- if (!vma_migratable(vma))
+ /*
+ * Need check MPOL_MF_STRICT to return -EIO if possible
+ * regardless of vma_migratable
+ */
+ if (!vma_migratable(vma) &&
+ !(flags & MPOL_MF_STRICT))
return 1;
if (endvma > end)
@@ -603,7 +629,7 @@ static int queue_pages_test_walk(unsigne
}
/* queue pages from current vma */
- if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
+ if (flags & MPOL_MF_VALID)
return 0;
return 1;
}
_
Patches currently in -mm which might be from yang.shi@linux.alibaba.com are
The patch titled
Subject: iommu/io-pgtable-arm-v7s: request DMA32 memory, and improve debugging
has been removed from the -mm tree. Its filename was
iommu-io-pgtable-arm-v7s-request-dma32-memory-and-improve-debugging.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Nicolas Boichat <drinkcat@chromium.org>
Subject: iommu/io-pgtable-arm-v7s: request DMA32 memory, and improve debugging
IOMMUs using ARMv7 short-descriptor format require page tables (level 1
and 2) to be allocated within the first 4GB of RAM, even on 64-bit
systems.
For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32 is
defined (e.g. on arm64 platforms).
For level 2 pages, allocate from a slab cache created with
SLAB_CACHE_DMA32. Note that we do not explicitly pass GFP_DMA[32] to
kmem_cache_zalloc: it is not strictly necessary, and it would cause a
warning in mm/sl*b.c, as we did not update GFP_SLAB_BUG_MASK.
Also, print an error when the physical address does not fit in 32 bits,
to make debugging easier in the future.
Link: http://lkml.kernel.org/r/20181210011504.122604-3-drinkcat@chromium.org
Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Huaisheng Ye <yehs1@lenovo.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sasha Levin <Alexander.Levin@microsoft.com>
Cc: Tomasz Figa <tfiga@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yong Wu <yong.wu@mediatek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/iommu/io-pgtable-arm-v7s.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
--- a/drivers/iommu/io-pgtable-arm-v7s.c~iommu-io-pgtable-arm-v7s-request-dma32-memory-and-improve-debugging
+++ a/drivers/iommu/io-pgtable-arm-v7s.c
@@ -160,6 +160,14 @@
#define ARM_V7S_TCR_PD1 BIT(5)
+#ifdef CONFIG_ZONE_DMA32
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
+#define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA32
+#else
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
+#define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA
+#endif
+
typedef u32 arm_v7s_iopte;
static bool selftest_running;
@@ -197,13 +205,16 @@ static void *__arm_v7s_alloc_table(int l
void *table = NULL;
if (lvl == 1)
- table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
+ table = (void *)__get_free_pages(
+ __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
else if (lvl == 2)
- table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
+ table = kmem_cache_zalloc(data->l2_tables, gfp);
phys = virt_to_phys(table);
- if (phys != (arm_v7s_iopte)phys)
+ if (phys != (arm_v7s_iopte)phys) {
/* Doesn't fit in PTE */
+ dev_err(dev, "Page table does not fit in PTE: %pa", &phys);
goto out_free;
+ }
if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
@@ -733,7 +744,7 @@ static struct io_pgtable *arm_v7s_alloc_
data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
ARM_V7S_TABLE_SIZE(2),
ARM_V7S_TABLE_SIZE(2),
- SLAB_CACHE_DMA, NULL);
+ ARM_V7S_TABLE_SLAB_FLAGS, NULL);
if (!data->l2_tables)
goto out_free_data;
_
Patches currently in -mm which might be from drinkcat@chromium.org are
mm-add-sys-kernel-slab-cache-cache_dma32.patch
The patch titled
Subject: mm: add support for kmem caches in DMA32 zone
has been removed from the -mm tree. Its filename was
mm-add-support-for-kmem-caches-in-dma32-zone.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Nicolas Boichat <drinkcat@chromium.org>
Subject: mm: add support for kmem caches in DMA32 zone
Patch series "iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables",
v6.
This is a followup to the discussion in [1], [2].
IOMMUs using ARMv7 short-descriptor format require page tables (level 1
and 2) to be allocated within the first 4GB of RAM, even on 64-bit
systems.
For L1 tables that are bigger than a page, we can just use
__get_free_pages with GFP_DMA32 (on arm64 systems only; arm would still
use GFP_DMA).
For L2 tables that only take 1KB, it would be a waste to allocate a full
page, so we considered 3 approaches:
1. This series, adding support for GFP_DMA32 slab caches.
2. genalloc, which requires pre-allocating the maximum number of L2 page
tables (4096, so 4MB of memory).
3. page_frag, which is not very memory-efficient as it is unable to reuse
freed fragments until the whole page is freed. [3]
This series is the most memory-efficient approach.
stable@ note:
We confirmed that this is a regression, and IOMMU errors happen on 4.19
and linux-next/master on MT8173 (elm, Acer Chromebook R13). The issue
most likely starts from commit ad67f5a6545f ("arm64: replace ZONE_DMA
with ZONE_DMA32"), i.e. 4.15, and presumably breaks a number of Mediatek
platforms (and maybe others?).
[1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html
[2] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html
[3] https://patchwork.codeaurora.org/patch/671639/
This patch (of 3):
IOMMUs using ARMv7 short-descriptor format require page tables to be
allocated within the first 4GB of RAM, even on 64-bit systems. On arm64,
this is done by passing GFP_DMA32 flag to memory allocation functions.
For IOMMU L2 tables that only take 1KB, it would be a waste to allocate
a full page using get_free_pages, so we considered 3 approaches:
1. This patch, adding support for GFP_DMA32 slab caches.
2. genalloc, which requires pre-allocating the maximum number of L2
page tables (4096, so 4MB of memory).
3. page_frag, which is not very memory-efficient as it is unable
to reuse freed fragments until the whole page is freed.
This change makes it possible to create a custom cache in the DMA32 zone
using kmem_cache_create(), then allocate memory using kmem_cache_alloc().
We do not create a DMA32 kmalloc cache array, as there are currently no
users of kmalloc(..., GFP_DMA32). These calls will continue to trigger a
warning, as we keep GFP_DMA32 in GFP_SLAB_BUG_MASK.
This implies that calls to kmem_cache_*alloc on a SLAB_CACHE_DMA32
kmem_cache must _not_ use GFP_DMA32 (it is anyway redundant and
unnecessary).
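A minimal usage sketch under the new flag (the cache name, sizes, and
helper names are illustrative): SLAB_CACHE_DMA32 at creation time alone
steers the cache's backing pages to ZONE_DMA32, so allocations pass
plain GFP_KERNEL:

	#include <linux/errno.h>
	#include <linux/slab.h>

	static struct kmem_cache *l2_cache;

	static int example_setup(void)
	{
		l2_cache = kmem_cache_create("example_l2", 1024, 1024,
					     SLAB_CACHE_DMA32, NULL);
		return l2_cache ? 0 : -ENOMEM;
	}

	static void *example_alloc(void)
	{
		/* Passing GFP_DMA32 here would be redundant and would trip
		 * the GFP_SLAB_BUG_MASK warning mentioned above. */
		return kmem_cache_zalloc(l2_cache, GFP_KERNEL);
	}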
Link: http://lkml.kernel.org/r/20181210011504.122604-2-drinkcat@chromium.org
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Sasha Levin <Alexander.Levin@microsoft.com>
Cc: Huaisheng Ye <yehs1@lenovo.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Yong Wu <yong.wu@mediatek.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Tomasz Figa <tfiga@google.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/slab.h | 2 ++
mm/slab.c | 2 ++
mm/slab.h | 3 ++-
mm/slab_common.c | 2 +-
mm/slub.c | 5 +++++
5 files changed, 12 insertions(+), 2 deletions(-)
--- a/include/linux/slab.h~mm-add-support-for-kmem-caches-in-dma32-zone
+++ a/include/linux/slab.h
@@ -32,6 +32,8 @@
#define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
/* Use GFP_DMA memory */
#define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
+/* Use GFP_DMA32 memory */
+#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
/* DEBUG: Store the last owner for bug hunting */
#define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
/* Panic if kmem_cache_create() fails */
--- a/mm/slab.c~mm-add-support-for-kmem-caches-in-dma32-zone
+++ a/mm/slab.c
@@ -2115,6 +2115,8 @@ done:
cachep->allocflags = __GFP_COMP;
if (flags & SLAB_CACHE_DMA)
cachep->allocflags |= GFP_DMA;
+ if (flags & SLAB_CACHE_DMA32)
+ cachep->allocflags |= GFP_DMA32;
if (flags & SLAB_RECLAIM_ACCOUNT)
cachep->allocflags |= __GFP_RECLAIMABLE;
cachep->size = size;
--- a/mm/slab_common.c~mm-add-support-for-kmem-caches-in-dma32-zone
+++ a/mm/slab_common.c
@@ -53,7 +53,7 @@ static DECLARE_WORK(slab_caches_to_rcu_d
SLAB_FAILSLAB | SLAB_KASAN)
#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
- SLAB_ACCOUNT)
+ SLAB_CACHE_DMA32 | SLAB_ACCOUNT)
/*
* Merge control. If this is set then no merging of slab caches will occur.
--- a/mm/slab.h~mm-add-support-for-kmem-caches-in-dma32-zone
+++ a/mm/slab.h
@@ -127,7 +127,8 @@ static inline slab_flags_t kmem_cache_fl
/* Legal flag mask for kmem_cache_create(), for various configurations */
-#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
+#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
+ SLAB_CACHE_DMA32 | SLAB_PANIC | \
SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS )
#if defined(CONFIG_DEBUG_SLAB)
--- a/mm/slub.c~mm-add-support-for-kmem-caches-in-dma32-zone
+++ a/mm/slub.c
@@ -3589,6 +3589,9 @@ static int calculate_sizes(struct kmem_c
if (s->flags & SLAB_CACHE_DMA)
s->allocflags |= GFP_DMA;
+ if (s->flags & SLAB_CACHE_DMA32)
+ s->allocflags |= GFP_DMA32;
+
if (s->flags & SLAB_RECLAIM_ACCOUNT)
s->allocflags |= __GFP_RECLAIMABLE;
@@ -5679,6 +5682,8 @@ static char *create_unique_id(struct kme
*/
if (s->flags & SLAB_CACHE_DMA)
*p++ = 'd';
+ if (s->flags & SLAB_CACHE_DMA32)
+ *p++ = 'D';
if (s->flags & SLAB_RECLAIM_ACCOUNT)
*p++ = 'a';
if (s->flags & SLAB_CONSISTENCY_CHECKS)
_
Patches currently in -mm which might be from drinkcat@chromium.org are
mm-add-sys-kernel-slab-cache-cache_dma32.patch
The patch titled
Subject: ocfs2: fix inode bh swapping mixup in ocfs2_reflink_inodes_lock
has been removed from the -mm tree. Its filename was
ocfs2-fix-inode-bh-swapping-mixup-in-ocfs2_reflink_inodes_lock.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Darrick J. Wong <darrick.wong@oracle.com>
Subject: ocfs2: fix inode bh swapping mixup in ocfs2_reflink_inodes_lock
ocfs2_reflink_inodes_lock() can swap the inode1/inode2 variables so that
we always grab cluster locks in order of increasing inode number.
Unfortunately, we forget to swap the inode record buffer head pointers
when we've done this, which leads to incorrect bookkeeping when we're
trying to make the two inodes have the same refcount tree.
This has the effect of causing filesystem shutdowns if you're trying to
reflink data from inode 100 into inode 97, where inode 100 already has a
refcount tree attached and inode 97 doesn't. The reflink code decides to
copy the refcount tree pointer from 100 to 97, but uses inode 97's inode
record to open the tree root (which it doesn't have) and blows up. This
issue causes filesystem shutdowns and metadata corruption!
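A condensed sketch of the invariant the fix restores (variable names as
in the hunk below): lock in ascending inode-number order, but hand the
buffer heads back in the caller's source/target order:

	bool need_swap = (inode1->i_ino > inode2->i_ino);

	if (need_swap)
		swap(inode1, inode2);	/* lock the lower inode number first */
	/* ... take the rw and cluster locks, filling bh1 for inode1 and
	 * bh2 for inode2 ... */
	if (need_swap)
		swap(bh1, bh2);		/* undo, so *bh_s/*bh_t match s/t inodes */
	*bh_s = bh1;
	*bh_t = bh2;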
Link: http://lkml.kernel.org/r/20190312214910.GK20533@magnolia
Fixes: 29ac8e856cb369 ("ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/ocfs2/refcounttree.c | 42 +++++++++++++++++++++-----------------
1 file changed, 24 insertions(+), 18 deletions(-)
--- a/fs/ocfs2/refcounttree.c~ocfs2-fix-inode-bh-swapping-mixup-in-ocfs2_reflink_inodes_lock
+++ a/fs/ocfs2/refcounttree.c
@@ -4719,22 +4719,23 @@ out:
/* Lock an inode and grab a bh pointing to the inode. */
int ocfs2_reflink_inodes_lock(struct inode *s_inode,
- struct buffer_head **bh1,
+ struct buffer_head **bh_s,
struct inode *t_inode,
- struct buffer_head **bh2)
+ struct buffer_head **bh_t)
{
- struct inode *inode1;
- struct inode *inode2;
+ struct inode *inode1 = s_inode;
+ struct inode *inode2 = t_inode;
struct ocfs2_inode_info *oi1;
struct ocfs2_inode_info *oi2;
+ struct buffer_head *bh1 = NULL;
+ struct buffer_head *bh2 = NULL;
bool same_inode = (s_inode == t_inode);
+ bool need_swap = (inode1->i_ino > inode2->i_ino);
int status;
/* First grab the VFS and rw locks. */
lock_two_nondirectories(s_inode, t_inode);
- inode1 = s_inode;
- inode2 = t_inode;
- if (inode1->i_ino > inode2->i_ino)
+ if (need_swap)
swap(inode1, inode2);
status = ocfs2_rw_lock(inode1, 1);
@@ -4757,17 +4758,13 @@ int ocfs2_reflink_inodes_lock(struct ino
trace_ocfs2_double_lock((unsigned long long)oi1->ip_blkno,
(unsigned long long)oi2->ip_blkno);
- if (*bh1)
- *bh1 = NULL;
- if (*bh2)
- *bh2 = NULL;
-
/* We always want to lock the one with the lower lockid first. */
if (oi1->ip_blkno > oi2->ip_blkno)
mlog_errno(-ENOLCK);
/* lock id1 */
- status = ocfs2_inode_lock_nested(inode1, bh1, 1, OI_LS_REFLINK_TARGET);
+ status = ocfs2_inode_lock_nested(inode1, &bh1, 1,
+ OI_LS_REFLINK_TARGET);
if (status < 0) {
if (status != -ENOENT)
mlog_errno(status);
@@ -4776,15 +4773,25 @@ int ocfs2_reflink_inodes_lock(struct ino
/* lock id2 */
if (!same_inode) {
- status = ocfs2_inode_lock_nested(inode2, bh2, 1,
+ status = ocfs2_inode_lock_nested(inode2, &bh2, 1,
OI_LS_REFLINK_TARGET);
if (status < 0) {
if (status != -ENOENT)
mlog_errno(status);
goto out_cl1;
}
- } else
- *bh2 = *bh1;
+ } else {
+ bh2 = bh1;
+ }
+
+ /*
+ * If we swapped inode order above, we have to swap the buffer heads
+ * before passing them back to the caller.
+ */
+ if (need_swap)
+ swap(bh1, bh2);
+ *bh_s = bh1;
+ *bh_t = bh2;
trace_ocfs2_double_lock_end(
(unsigned long long)oi1->ip_blkno,
@@ -4794,8 +4801,7 @@ int ocfs2_reflink_inodes_lock(struct ino
out_cl1:
ocfs2_inode_unlock(inode1, 1);
- brelse(*bh1);
- *bh1 = NULL;
+ brelse(bh1);
out_rw2:
ocfs2_rw_unlock(inode2, 1);
out_i2:
_
Patches currently in -mm which might be from darrick.wong@oracle.com are
The patch titled
Subject: mm/hotplug: fix offline undo_isolate_page_range()
has been removed from the -mm tree. Its filename was
mm-hotplug-fix-offline-undo_isolate_page_range.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Qian Cai <cai@lca.pw>
Subject: mm/hotplug: fix offline undo_isolate_page_range()
f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to
zones until online") introduced move_pfn_range_to_zone(), which calls
memmap_init_zone() when onlining a memory block. memmap_init_zone()
resets the pagetype flags and makes the migrate type MOVABLE.
However, __offline_pages() also calls undo_isolate_page_range() after
offline_isolated_pages() to do the same thing. Since 2ce13640b3f4 ("mm:
__first_valid_page skip over offline pages") changed __first_valid_page()
to skip offline pages, undo_isolate_page_range() here just wastes CPU
cycles looping over the offlined PFN range while doing nothing, because
__first_valid_page() returns NULL: offline_isolated_pages() has already
marked all memory sections within the pfn range as offline via
offline_mem_sections().
Also, after this "useless" undo_isolate_page_range() call,
__offline_pages() reaches the point of no return by notifying
MEM_OFFLINE. Those pages will be marked MIGRATE_MOVABLE again once they
are onlined. The only thing left to do is to decrease the zone's counter
of isolated pageblocks, which, while nonzero, makes some page-allocation
paths slower (a cost introduced by the above commit).
Even if alloc_contig_range() can be used to isolate 16GB-hugetlb pages on
ppc64, an "int" should still be enough to represent the number of
pageblocks there. Fix an incorrect comment along the way.
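With this change, start_isolate_page_range() returns the number of
pageblocks it isolated rather than 0 on success, so callers must test
for negative errors. A condensed sketch of the updated contract in
__offline_pages() (mirroring the hunks below):

	int ret = start_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE,
					   SKIP_HWPOISON | REPORT_FAILURE);
	if (ret < 0)			/* -EBUSY: range cannot be isolated */
		return ret;
	nr_isolate_pageblock = ret;	/* remembered for the offline path */
	/* ... after offline_isolated_pages(), instead of the useless
	 * undo_isolate_page_range(), only drop the zone counter: */
	spin_lock_irqsave(&zone->lock, flags);
	zone->nr_isolate_pageblock -= nr_isolate_pageblock;
	spin_unlock_irqrestore(&zone->lock, flags);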
[cai@lca.pw: v4]
Link: http://lkml.kernel.org/r/20190314150641.59358-1-cai@lca.pw
Link: http://lkml.kernel.org/r/20190313143133.46200-1-cai@lca.pw
Fixes: 2ce13640b3f4 ("mm: __first_valid_page skip over offline pages")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/page-isolation.h | 10 ------
mm/memory_hotplug.c | 17 ++++++++--
mm/page_alloc.c | 2 -
mm/page_isolation.c | 48 +++++++++++++++++++------------
mm/sparse.c | 2 -
5 files changed, 45 insertions(+), 34 deletions(-)
--- a/include/linux/page-isolation.h~mm-hotplug-fix-offline-undo_isolate_page_range
+++ a/include/linux/page-isolation.h
@@ -41,16 +41,6 @@ int move_freepages_block(struct zone *zo
/*
* Changes migrate type in [start_pfn, end_pfn) to be MIGRATE_ISOLATE.
- * If specified range includes migrate types other than MOVABLE or CMA,
- * this will fail with -EBUSY.
- *
- * For isolating all pages in the range finally, the caller have to
- * free all pages in the range. test_page_isolated() can be used for
- * test it.
- *
- * The following flags are allowed (they can be combined in a bit mask)
- * SKIP_HWPOISON - ignore hwpoison pages
- * REPORT_FAILURE - report details about the failure to isolate the range
*/
int
start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
--- a/mm/memory_hotplug.c~mm-hotplug-fix-offline-undo_isolate_page_range
+++ a/mm/memory_hotplug.c
@@ -1576,7 +1576,7 @@ static int __ref __offline_pages(unsigne
{
unsigned long pfn, nr_pages;
long offlined_pages;
- int ret, node;
+ int ret, node, nr_isolate_pageblock;
unsigned long flags;
unsigned long valid_start, valid_end;
struct zone *zone;
@@ -1602,10 +1602,11 @@ static int __ref __offline_pages(unsigne
ret = start_isolate_page_range(start_pfn, end_pfn,
MIGRATE_MOVABLE,
SKIP_HWPOISON | REPORT_FAILURE);
- if (ret) {
+ if (ret < 0) {
reason = "failure to isolate range";
goto failed_removal;
}
+ nr_isolate_pageblock = ret;
arg.start_pfn = start_pfn;
arg.nr_pages = nr_pages;
@@ -1657,8 +1658,16 @@ static int __ref __offline_pages(unsigne
/* Ok, all of our target is isolated.
We cannot do rollback at this point. */
offline_isolated_pages(start_pfn, end_pfn);
- /* reset pagetype flags and makes migrate type to be MOVABLE */
- undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+
+ /*
+ * Onlining will reset pagetype flags and makes migrate type
+ * MOVABLE, so just need to decrease the number of isolated
+ * pageblocks zone counter here.
+ */
+ spin_lock_irqsave(&zone->lock, flags);
+ zone->nr_isolate_pageblock -= nr_isolate_pageblock;
+ spin_unlock_irqrestore(&zone->lock, flags);
+
/* removal success */
adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
zone->present_pages -= offlined_pages;
--- a/mm/page_alloc.c~mm-hotplug-fix-offline-undo_isolate_page_range
+++ a/mm/page_alloc.c
@@ -8233,7 +8233,7 @@ int alloc_contig_range(unsigned long sta
ret = start_isolate_page_range(pfn_max_align_down(start),
pfn_max_align_up(end), migratetype, 0);
- if (ret)
+ if (ret < 0)
return ret;
/*
--- a/mm/page_isolation.c~mm-hotplug-fix-offline-undo_isolate_page_range
+++ a/mm/page_isolation.c
@@ -160,27 +160,36 @@ __first_valid_page(unsigned long pfn, un
return NULL;
}
-/*
- * start_isolate_page_range() -- make page-allocation-type of range of pages
- * to be MIGRATE_ISOLATE.
- * @start_pfn: The lower PFN of the range to be isolated.
- * @end_pfn: The upper PFN of the range to be isolated.
- * @migratetype: migrate type to set in error recovery.
+/**
+ * start_isolate_page_range() - make page-allocation-type of range of pages to
+ * be MIGRATE_ISOLATE.
+ * @start_pfn: The lower PFN of the range to be isolated.
+ * @end_pfn: The upper PFN of the range to be isolated.
+ * start_pfn/end_pfn must be aligned to pageblock_order.
+ * @migratetype: Migrate type to set in error recovery.
+ * @flags: The following flags are allowed (they can be combined in
+ * a bit mask)
+ * SKIP_HWPOISON - ignore hwpoison pages
+ * REPORT_FAILURE - report details about the failure to
+ * isolate the range
*
* Making page-allocation-type to be MIGRATE_ISOLATE means free pages in
* the range will never be allocated. Any free pages and pages freed in the
- * future will not be allocated again.
- *
- * start_pfn/end_pfn must be aligned to pageblock_order.
- * Return 0 on success and -EBUSY if any part of range cannot be isolated.
+ * future will not be allocated again. If specified range includes migrate types
+ * other than MOVABLE or CMA, this will fail with -EBUSY. For isolating all
+ * pages in the range finally, the caller have to free all pages in the range.
+ * test_page_isolated() can be used for test it.
*
* There is no high level synchronization mechanism that prevents two threads
- * from trying to isolate overlapping ranges. If this happens, one thread
+ * from trying to isolate overlapping ranges. If this happens, one thread
* will notice pageblocks in the overlapping range already set to isolate.
* This happens in set_migratetype_isolate, and set_migratetype_isolate
- * returns an error. We then clean up by restoring the migration type on
- * pageblocks we may have modified and return -EBUSY to caller. This
+ * returns an error. We then clean up by restoring the migration type on
+ * pageblocks we may have modified and return -EBUSY to caller. This
* prevents two threads from simultaneously working on overlapping ranges.
+ *
+ * Return: the number of isolated pageblocks on success and -EBUSY if any part
+ * of range cannot be isolated.
*/
int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
unsigned migratetype, int flags)
@@ -188,6 +197,7 @@ int start_isolate_page_range(unsigned lo
unsigned long pfn;
unsigned long undo_pfn;
struct page *page;
+ int nr_isolate_pageblock = 0;
BUG_ON(!IS_ALIGNED(start_pfn, pageblock_nr_pages));
BUG_ON(!IS_ALIGNED(end_pfn, pageblock_nr_pages));
@@ -196,13 +206,15 @@ int start_isolate_page_range(unsigned lo
pfn < end_pfn;
pfn += pageblock_nr_pages) {
page = __first_valid_page(pfn, pageblock_nr_pages);
- if (page &&
- set_migratetype_isolate(page, migratetype, flags)) {
- undo_pfn = pfn;
- goto undo;
+ if (page) {
+ if (set_migratetype_isolate(page, migratetype, flags)) {
+ undo_pfn = pfn;
+ goto undo;
+ }
+ nr_isolate_pageblock++;
}
}
- return 0;
+ return nr_isolate_pageblock;
undo:
for (pfn = start_pfn;
pfn < undo_pfn;
--- a/mm/sparse.c~mm-hotplug-fix-offline-undo_isolate_page_range
+++ a/mm/sparse.c
@@ -567,7 +567,7 @@ void online_mem_sections(unsigned long s
}
#ifdef CONFIG_MEMORY_HOTREMOVE
-/* Mark all memory sections within the pfn range as online */
+/* Mark all memory sections within the pfn range as offline */
void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
{
unsigned long pfn;
_
Patches currently in -mm which might be from cai@lca.pw are
mm-compaction-abort-search-if-isolation-fails-v2.patch
mm-compaction-fix-an-undefined-behaviour.patch
initramfs-cleanup-populate_rootfs-fix.patch