The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From bd40b17ca49d7d110adf456e647701ce74de2241 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Date: Fri, 31 Jan 2020 05:07:55 -0500
Subject: [PATCH] XArray: Fix xa_find_next for large multi-index entries
Coverity pointed out that xas_sibling() was shifting xa_offset without
promoting it to an unsigned long first, so the shift could cause an
overflow and we'd get the wrong answer. The fix is obvious, and the
new test-case provokes UBSAN to report an error:
runtime error: shift exponent 60 is too large for 32-bit type 'int'
Fixes: 19c30f4dd092 ("XArray: Fix xa_find_after with multi-index entries")
Reported-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reported-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: stable(a)vger.kernel.org
diff --git a/lib/test_xarray.c b/lib/test_xarray.c
index 55c14e8c8859..8c7d7a8468b8 100644
--- a/lib/test_xarray.c
+++ b/lib/test_xarray.c
@@ -12,6 +12,9 @@
static unsigned int tests_run;
static unsigned int tests_passed;
+static const unsigned int order_limit =
+ IS_ENABLED(CONFIG_XARRAY_MULTI) ? BITS_PER_LONG : 1;
+
#ifndef XA_DEBUG
# ifdef __KERNEL__
void xa_dump(const struct xarray *xa) { }
@@ -959,6 +962,20 @@ static noinline void check_multi_find_2(struct xarray *xa)
}
}
+static noinline void check_multi_find_3(struct xarray *xa)
+{
+ unsigned int order;
+
+ for (order = 5; order < order_limit; order++) {
+ unsigned long index = 1UL << (order - 5);
+
+ XA_BUG_ON(xa, !xa_empty(xa));
+ xa_store_order(xa, 0, order - 4, xa_mk_index(0), GFP_KERNEL);
+ XA_BUG_ON(xa, xa_find_after(xa, &index, ULONG_MAX, XA_PRESENT));
+ xa_erase_index(xa, 0);
+ }
+}
+
static noinline void check_find_1(struct xarray *xa)
{
unsigned long i, j, k;
@@ -1081,6 +1098,7 @@ static noinline void check_find(struct xarray *xa)
for (i = 2; i < 10; i++)
check_multi_find_1(xa, i);
check_multi_find_2(xa);
+ check_multi_find_3(xa);
}
/* See find_swap_entry() in mm/shmem.c */
diff --git a/lib/xarray.c b/lib/xarray.c
index 1d9fab7db8da..acd1fad2e862 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1839,7 +1839,8 @@ static bool xas_sibling(struct xa_state *xas)
if (!node)
return false;
mask = (XA_CHUNK_SIZE << node->shift) - 1;
- return (xas->xa_index & mask) > (xas->xa_offset << node->shift);
+ return (xas->xa_index & mask) >
+ ((unsigned long)xas->xa_offset << node->shift);
}
/**
From: Eric Biggers <ebiggers(a)google.com>
Currently after any algorithm is registered and tested, there's an
unnecessary request_module("cryptomgr") even if it's already loaded.
Also, CRYPTO_MSG_ALG_LOADED is sent twice, and thus if the algorithm is
"crct10dif", lib/crc-t10dif.c replaces the tfm twice rather than once.
This occurs because CRYPTO_MSG_ALG_LOADED is sent using
crypto_probing_notify(), which tries to load "cryptomgr" if the
notification is not handled (NOTIFY_DONE). This doesn't make sense
because "cryptomgr" doesn't handle this notification.
Fix this by using crypto_notify() instead of crypto_probing_notify().
Fixes: dd8b083f9a5e ("crypto: api - Introduce notifier for new crypto algorithms")
Cc: <stable(a)vger.kernel.org> # v4.20+
Cc: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/algapi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/crypto/algapi.c b/crypto/algapi.c
index 69605e21af92..849254d7e627 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -403,7 +403,7 @@ static void crypto_wait_for_test(struct crypto_larval *larval)
err = wait_for_completion_killable(&larval->completion);
WARN_ON(err);
if (!err)
- crypto_probing_notify(CRYPTO_MSG_ALG_LOADED, larval);
+ crypto_notify(CRYPTO_MSG_ALG_LOADED, larval);
out:
crypto_larval_kill(&larval->alg);
--
2.26.0
Hi,
The folloewing build failures currently observed with stable release
candidates.
Thanks,
Guenter
---
v4.19.y:
drivers/gpu/drm/etnaviv/etnaviv_perfmon.c:392:14: error: 'chipFeatures_PIPE_3D' undeclared here
drivers/gpu/drm/etnaviv/etnaviv_perfmon.c:397:14: error: 'chipFeatures_PIPE_2D'
drivers/gpu/drm/etnaviv/etnaviv_perfmon.c:402:14: error: 'chipFeatures_PIPE_VG' undeclared here
Culprit is 'drm/etnaviv: rework perfmon query infrastructure'. Applying
commit 15ff4a7b5841 ("etnaviv: perfmon: fix total and idle HI cyleces
readout") as well fixes the problem.
v5.4.y, v5.5.y:
drivers/mmc/host/sdhci-tegra.c: In function 'tegra_sdhci_set_timeout':
drivers/mmc/host/sdhci-tegra.c:1256:2: error:
implicit declaration of function '__sdhci_set_timeout'
Culprit is "sdhci: tegra: Implement Tegra specific set_timeout callback".
__sdhci_set_timeout() was introduced with commit 7d76ed77cfbd3 ("mmc:
sdhci: Refactor sdhci_set_timeout()"). Unfortunately, applying that patch
results in a conflict.
Guenter
The patch below does not apply to the 5.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From af92bad615be75c6c0d1b1c5b48178360250a187 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)c-s.fr>
Date: Fri, 6 Mar 2020 15:09:40 +0000
Subject: [PATCH] powerpc/kasan: Fix kasan_remap_early_shadow_ro()
At the moment kasan_remap_early_shadow_ro() does nothing, because
k_end is 0 and k_cur < 0 is always true.
Change the test to k_cur != k_end, as done in
kasan_init_shadow_page_tables()
Signed-off-by: Christophe Leroy <christophe.leroy(a)c-s.fr>
Fixes: cbd18991e24f ("powerpc/mm: Fix an Oops in kasan_mmu_init()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/4e7b56865e01569058914c991143f5961b5d4719.15835073…
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index f19526e7d3dc..1a29cf469903 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -101,7 +101,7 @@ static void __init kasan_remap_early_shadow_ro(void)
kasan_populate_pte(kasan_early_shadow_pte, prot);
- for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
+ for (k_cur = k_start & PAGE_MASK; k_cur != k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_ptr_k(k_cur);
pte_t *ptep = pte_offset_kernel(pmd, k_cur);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From aa4113340ae6c2811e046f08c2bc21011d20a072 Mon Sep 17 00:00:00 2001
From: Laurentiu Tudor <laurentiu.tudor(a)nxp.com>
Date: Thu, 23 Jan 2020 11:19:25 +0000
Subject: [PATCH] powerpc/fsl_booke: Avoid creating duplicate tlb1 entry
In the current implementation, the call to loadcam_multi() is wrapped
between switch_to_as1() and restore_to_as0() calls so, when it tries
to create its own temporary AS=1 TLB1 entry, it ends up duplicating
the existing one created by switch_to_as1(). Add a check to skip
creating the temporary entry if already running in AS=1.
Fixes: d9e1831a4202 ("powerpc/85xx: Load all early TLB entries at once")
Cc: stable(a)vger.kernel.org # v4.4+
Signed-off-by: Laurentiu Tudor <laurentiu.tudor(a)nxp.com>
Acked-by: Scott Wood <oss(a)buserror.net>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20200123111914.2565-1-laurentiu.tudor@nxp.com
diff --git a/arch/powerpc/mm/nohash/tlb_low.S b/arch/powerpc/mm/nohash/tlb_low.S
index 2ca407cedbe7..eaeee402f96e 100644
--- a/arch/powerpc/mm/nohash/tlb_low.S
+++ b/arch/powerpc/mm/nohash/tlb_low.S
@@ -397,7 +397,7 @@ _GLOBAL(set_context)
* extern void loadcam_entry(unsigned int index)
*
* Load TLBCAM[index] entry in to the L2 CAM MMU
- * Must preserve r7, r8, r9, and r10
+ * Must preserve r7, r8, r9, r10 and r11
*/
_GLOBAL(loadcam_entry)
mflr r5
@@ -433,6 +433,10 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_BIG_PHYS)
*/
_GLOBAL(loadcam_multi)
mflr r8
+ /* Don't switch to AS=1 if already there */
+ mfmsr r11
+ andi. r11,r11,MSR_IS
+ bne 10f
/*
* Set up temporary TLB entry that is the same as what we're
@@ -458,6 +462,7 @@ _GLOBAL(loadcam_multi)
mtmsr r6
isync
+10:
mr r9,r3
add r10,r3,r4
2: bl loadcam_entry
@@ -466,6 +471,10 @@ _GLOBAL(loadcam_multi)
mr r3,r9
blt 2b
+ /* Don't return to AS=0 if we were in AS=1 at function start */
+ andi. r11,r11,MSR_IS
+ bne 3f
+
/* Return to AS=0 and clear the temporary entry */
mfmsr r6
rlwinm. r6,r6,0,~(MSR_IS|MSR_DS)
@@ -481,6 +490,7 @@ _GLOBAL(loadcam_multi)
tlbwe
isync
+3:
mtlr r8
blr
#endif
The patch titled
Subject: mm: call cond_resched() from deferred_init_memmap()
has been added to the -mm tree. Its filename is
mm-call-cond_resched-from-deferred_init_memmap.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-call-cond_resched-from-deferred…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-call-cond_resched-from-deferred…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Subject: mm: call cond_resched() from deferred_init_memmap()
Now that deferred pages are initialized with interrupts enabled we can
replace touch_nmi_watchdog() with cond_resched(), as it was before
3a2d7fa8a3d5.
For now, we cannot do the same in deferred_grow_zone() as it is still
initializes pages with interrupts disabled.
This change fixes RCU problem described in
https://lkml.kernel.org/r/20200401104156.11564-2-david@redhat.com
[ 60.474005] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 60.475000] rcu: 1-...0: (0 ticks this GP) idle=02a/1/0x4000000000000000 softirq=1/1 fqs=15000
[ 60.475000] rcu: (detected by 0, t=60002 jiffies, g=-1199, q=1)
[ 60.475000] Sending NMI from CPU 0 to CPUs 1:
[ 1.760091] NMI backtrace for cpu 1
[ 1.760091] CPU: 1 PID: 20 Comm: pgdatinit0 Not tainted 4.18.0-147.9.1.el8_1.x86_64 #1
[ 1.760091] Hardware name: Red Hat KVM, BIOS 1.13.0-1.module+el8.2.0+5520+4e5817f3 04/01/2014
[ 1.760091] RIP: 0010:__init_single_page.isra.65+0x10/0x4f
[ 1.760091] Code: 48 83 cf 63 48 89 f8 0f 1f 40 00 48 89 c6 48 89 d7 e8 6b 18 80 ff 66 90 5b c3 31 c0 b9 10 00 00 00 49 89 f8 48 c1 e6 33 f3 ab <b8> 07 00 00 00 48 c1 e2 36 41 c7 40 34 01 00 00 00 48 c1 e0 33 41
[ 1.760091] RSP: 0000:ffffba783123be40 EFLAGS: 00000006
[ 1.760091] RAX: 0000000000000000 RBX: fffffad34405e300 RCX: 0000000000000000
[ 1.760091] RDX: 0000000000000000 RSI: 0010000000000000 RDI: fffffad34405e340
[ 1.760091] RBP: 0000000033f3177e R08: fffffad34405e300 R09: 0000000000000002
[ 1.760091] R10: 000000000000002b R11: ffff98afb691a500 R12: 0000000000000002
[ 1.760091] R13: 0000000000000000 R14: 000000003f03ea00 R15: 000000003e10178c
[ 1.760091] FS: 0000000000000000(0000) GS:ffff9c9ebeb00000(0000) knlGS:0000000000000000
[ 1.760091] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.760091] CR2: 00000000ffffffff CR3: 000000a1cf20a001 CR4: 00000000003606e0
[ 1.760091] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1.760091] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1.760091] Call Trace:
[ 1.760091] deferred_init_pages+0x8f/0xbf
[ 1.760091] deferred_init_memmap+0x184/0x29d
[ 1.760091] ? deferred_free_pages.isra.97+0xba/0xba
[ 1.760091] kthread+0x112/0x130
[ 1.760091] ? kthread_flush_work_fn+0x10/0x10
[ 1.760091] ret_from_fork+0x35/0x40
[ 89.123011] node 0 initialised, 1055935372 pages in 88650ms
Link: http://lkml.kernel.org/r/20200403140952.17177-4-pasha.tatashin@soleen.com
Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages")
Reported-by: Yiqian Wei <yiwei(a)redhat.com>
Tested-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: James Morris <jmorris(a)namei.org>
Cc: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Sasha Levin <sashal(a)kernel.org>
Cc: Shile Zhang <shile.zhang(a)linux.alibaba.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org> [4.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-call-cond_resched-from-deferred_init_memmap
+++ a/mm/page_alloc.c
@@ -1869,7 +1869,7 @@ static int __init deferred_init_memmap(v
*/
while (spfn < epfn) {
nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
- touch_nmi_watchdog();
+ cond_resched();
}
zone_empty:
/* Sanity check that the next zone really is unpopulated */
_
Patches currently in -mm which might be from pasha.tatashin(a)soleen.com are
mm-initialize-deferred-pages-with-interrupts-enabled.patch
mm-call-cond_resched-from-deferred_init_memmap.patch
The patch titled
Subject: mm: initialize deferred pages with interrupts enabled
has been added to the -mm tree. Its filename is
mm-initialize-deferred-pages-with-interrupts-enabled.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-initialize-deferred-pages-with-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-initialize-deferred-pages-with-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Subject: mm: initialize deferred pages with interrupts enabled
Initializing struct pages is a long task and keeping interrupts disabled
for the duration of this operation introduces a number of problems.
1. jiffies are not updated for long period of time, and thus incorrect time
is reported. See proposed solution and discussion here:
lkml/20200311123848.118638-1-shile.zhang(a)linux.alibaba.com
2. It prevents farther improving deferred page initialization by allowing
intra-node multi-threading.
We are keeping interrupts disabled to solve a rather theoretical problem
that was never observed in real world (See 3a2d7fa8a3d5).
Let's keep interrupts enabled. In case we ever encounter a scenario where
an interrupt thread wants to allocate large amount of memory this early in
boot we can deal with that by growing zone (see deferred_grow_zone()) by
the needed amount before starting deferred_init_memmap() threads.
Before:
[ 1.232459] node 0 initialised, 12058412 pages in 1ms
After:
[ 1.632580] node 0 initialised, 12051227 pages in 436ms
Link: http://lkml.kernel.org/r/20200403140952.17177-3-pasha.tatashin@soleen.com
Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages")
Reported-by: Shile Zhang <shile.zhang(a)linux.alibaba.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: James Morris <jmorris(a)namei.org>
Cc: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Cc: Sasha Levin <sashal(a)kernel.org>
Cc: Yiqian Wei <yiwei(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [4.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mmzone.h | 2 ++
mm/page_alloc.c | 20 +++++++-------------
2 files changed, 9 insertions(+), 13 deletions(-)
--- a/include/linux/mmzone.h~mm-initialize-deferred-pages-with-interrupts-enabled
+++ a/include/linux/mmzone.h
@@ -678,6 +678,8 @@ typedef struct pglist_data {
/*
* Must be held any time you expect node_start_pfn,
* node_present_pages, node_spanned_pages or nr_zones to stay constant.
+ * Also synchronizes pgdat->first_deferred_pfn during deferred page
+ * init.
*
* pgdat_resize_lock() and pgdat_resize_unlock() are provided to
* manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG
--- a/mm/page_alloc.c~mm-initialize-deferred-pages-with-interrupts-enabled
+++ a/mm/page_alloc.c
@@ -1843,6 +1843,13 @@ static int __init deferred_init_memmap(v
BUG_ON(pgdat->first_deferred_pfn > pgdat_end_pfn(pgdat));
pgdat->first_deferred_pfn = ULONG_MAX;
+ /*
+ * Once we unlock here, the zone cannot be grown anymore, thus if an
+ * interrupt thread must allocate this early in boot, zone must be
+ * pre-grown prior to start of deferred page initialization.
+ */
+ pgdat_resize_unlock(pgdat, &flags);
+
/* Only the highest zone is deferred so find it */
for (zid = 0; zid < MAX_NR_ZONES; zid++) {
zone = pgdat->node_zones + zid;
@@ -1865,8 +1872,6 @@ static int __init deferred_init_memmap(v
touch_nmi_watchdog();
}
zone_empty:
- pgdat_resize_unlock(pgdat, &flags);
-
/* Sanity check that the next zone really is unpopulated */
WARN_ON(++zid < MAX_NR_ZONES && populated_zone(++zone));
@@ -1909,17 +1914,6 @@ deferred_grow_zone(struct zone *zone, un
pgdat_resize_lock(pgdat, &flags);
/*
- * If deferred pages have been initialized while we were waiting for
- * the lock, return true, as the zone was grown. The caller will retry
- * this zone. We won't return to this function since the caller also
- * has this static branch.
- */
- if (!static_branch_unlikely(&deferred_pages)) {
- pgdat_resize_unlock(pgdat, &flags);
- return true;
- }
-
- /*
* If someone grew this zone while we were waiting for spinlock, return
* true, as there might be enough pages already.
*/
_
Patches currently in -mm which might be from pasha.tatashin(a)soleen.com are
mm-initialize-deferred-pages-with-interrupts-enabled.patch
mm-call-cond_resched-from-deferred_init_memmap.patch
The patch titled
Subject: mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init
has been added to the -mm tree. Its filename is
mm-call-touch_nmi_watchdog-on-max-order-boundaries-in-deferred-init.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-call-touch_nmi_watchdog-on-max-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-call-touch_nmi_watchdog-on-max-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Subject: mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init
Patch series "initialize deferred pages with interrupts enabled", v4.
Keep interrupts enabled during deferred page initialization in order to
make code more modular and allow jiffies to update.
Original approach, and discussion can be found here:
http://lkml.kernel.org/r/20200311123848.118638-1-shile.zhang@linux.alibaba.…
This patch (of 3):
deferred_init_memmap() disables interrupts the entire time, so it calls
touch_nmi_watchdog() periodically to avoid soft lockup splats. Soon it
will run with interrupts enabled, at which point cond_resched() should be
used instead.
deferred_grow_zone() makes the same watchdog calls through code shared
with deferred init but will continue to run with interrupts disabled, so
it can't call cond_resched().
Pull the watchdog calls up to these two places to allow the first to be
changed later, independently of the second. The frequency reduces from
twice per pageblock (init and free) to once per max order block.
Link: http://lkml.kernel.org/r/20200403140952.17177-2-pasha.tatashin@soleen.com
Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages")
Signed-off-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Shile Zhang <shile.zhang(a)linux.alibaba.com>
Cc: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Cc: James Morris <jmorris(a)namei.org>
Cc: Sasha Levin <sashal(a)kernel.org>
Cc: Yiqian Wei <yiwei(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [4.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/mm/page_alloc.c~mm-call-touch_nmi_watchdog-on-max-order-boundaries-in-deferred-init
+++ a/mm/page_alloc.c
@@ -1692,7 +1692,6 @@ static void __init deferred_free_pages(u
} else if (!(pfn & nr_pgmask)) {
deferred_free_range(pfn - nr_free, nr_free);
nr_free = 1;
- touch_nmi_watchdog();
} else {
nr_free++;
}
@@ -1722,7 +1721,6 @@ static unsigned long __init deferred_in
continue;
} else if (!page || !(pfn & nr_pgmask)) {
page = pfn_to_page(pfn);
- touch_nmi_watchdog();
} else {
page++;
}
@@ -1862,8 +1860,10 @@ static int __init deferred_init_memmap(v
* that we can avoid introducing any issues with the buddy
* allocator.
*/
- while (spfn < epfn)
+ while (spfn < epfn) {
nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
+ touch_nmi_watchdog();
+ }
zone_empty:
pgdat_resize_unlock(pgdat, &flags);
@@ -1947,6 +1947,7 @@ deferred_grow_zone(struct zone *zone, un
first_deferred_pfn = spfn;
nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
+ touch_nmi_watchdog();
/* We should only stop along section boundaries */
if ((first_deferred_pfn ^ spfn) < PAGES_PER_SECTION)
_
Patches currently in -mm which might be from daniel.m.jordan(a)oracle.com are
mm-call-touch_nmi_watchdog-on-max-order-boundaries-in-deferred-init.patch