[This is a backport for 5.4 of 29daf869cbab69088fe1755d9dd224e99ba78b56]
The kernel expects pte_young() to work regardless of CONFIG_SWAP.
Make sure a minor fault is taken to set _PAGE_ACCESSED when it
is not already set, regardless of the selection of CONFIG_SWAP.
This adds at least 3 instructions to the TLB miss exception
handlers' fast path. A following patch will reduce this overhead.
Also update the rotation instruction to the correct number of bits
to reflect all changes done to _PAGE_ACCESSED over time.
Fixes: d069cb4373fe ("powerpc/8xx: Don't touch ACCESSED when no SWAP.")
Fixes: 5f356497c384 ("powerpc/8xx: remove unused _PAGE_WRITETHRU")
Fixes: e0a8e0d90a9f ("powerpc/8xx: Handle PAGE_USER via APG bits")
Fixes: 5b2753fc3e8a ("powerpc/8xx: Implementation of PAGE_EXEC")
Fixes: a891c43b97d3 ("powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/af834e8a0f1fa97bfae65664950f0984a70c4750.16024928…
---
arch/powerpc/kernel/head_8xx.S | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 98d8b6832fcb..f6428b90a6c7 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -229,9 +229,7 @@ SystemCall:
InstructionTLBMiss:
mtspr SPRN_SPRG_SCRATCH0, r10
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
mtspr SPRN_SPRG_SCRATCH1, r11
-#endif
/* If we are faulting a kernel address, we have to use the
* kernel page tables.
@@ -278,11 +276,9 @@ InstructionTLBMiss:
#ifdef ITLB_MISS_KERNEL
mtcr r11
#endif
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-7, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 20 and 23 must be clear.
* Software indicator bits 22, 24, 25, 26, and 27 must be
@@ -296,9 +292,7 @@ InstructionTLBMiss:
/* Restore registers */
0: mfspr r10, SPRN_SPRG_SCRATCH0
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
mfspr r11, SPRN_SPRG_SCRATCH1
-#endif
rfi
patch_site 0b, patch__itlbmiss_exit_1
@@ -308,9 +302,7 @@ InstructionTLBMiss:
addi r10, r10, 1
stw r10, (itlb_miss_counter - PAGE_OFFSET)@l(0)
mfspr r10, SPRN_SPRG_SCRATCH0
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
mfspr r11, SPRN_SPRG_SCRATCH1
-#endif
rfi
#endif
@@ -394,11 +386,9 @@ DataStoreTLBMiss:
* r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
* r10 = (r10 & ~PRESENT) | r11;
*/
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-7, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 24, 25, 26, and 27 must be
* set. All other Linux PTE bits control the behavior
--
2.25.0
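For readers less familiar with PowerPC rotate-and-mask instructions, the rlwinm/and/rlwimi sequence in the patch above can be sketched in C. This is only an illustration, not the kernel's code: the bit values below are assumptions inferred from the rotation count (a right-rotate by 7 onto the PRESENT mask implies ACCESSED sits 7 bits above PRESENT), and the function name is made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout, inferred from the 32-7 rotation in the patch. */
#define PAGE_PRESENT  0x001u
#define PAGE_ACCESSED 0x080u

/*
 * Hypothetical helper: returns the PTE value with PRESENT cleared
 * whenever ACCESSED is not set, so the hardware takes a fault and the
 * kernel gets a chance to set ACCESSED.
 */
static uint32_t clear_present_if_not_accessed(uint32_t pte)
{
	/* rlwinm r11, r10, 32-7, _PAGE_PRESENT:
	 * bring the ACCESSED bit down to the PRESENT bit position. */
	uint32_t tmp = (pte >> 7) & PAGE_PRESENT;

	/* and r11, r11, r10: keep it only if PRESENT is set as well. */
	tmp &= pte;

	/* rlwimi r10, r11, 0, _PAGE_PRESENT:
	 * write the result back into the PRESENT bit of the PTE. */
	return (pte & ~PAGE_PRESENT) | tmp;
}
```

With these assumed values, a PTE of 0x081 (present and accessed) is returned unchanged, while 0x001 (present but not yet accessed) comes back as 0x000, which is what forces the minor fault described in the commit message.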
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b02d9e770cd7087f34c743f85ccf5ea8372b047 Mon Sep 17 00:00:00 2001
From: Bartosz Golaszewski <bgolaszewski(a)baylibre.com>
Date: Tue, 8 Sep 2020 15:07:49 +0200
Subject: [PATCH] gpio: mockup: fix resource leak in error path
If the module init function fails after creating the debugfs directory,
it's never removed. Add proper cleanup calls to avoid this resource
leak.
Fixes: 9202ba2397d1 ("gpio: mockup: implement event injecting over debugfs")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski(a)baylibre.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
diff --git a/drivers/gpio/gpio-mockup.c b/drivers/gpio/gpio-mockup.c
index bc345185db26..1652897fdf90 100644
--- a/drivers/gpio/gpio-mockup.c
+++ b/drivers/gpio/gpio-mockup.c
@@ -552,6 +552,7 @@ static int __init gpio_mockup_init(void)
err = platform_driver_register(&gpio_mockup_driver);
if (err) {
gpio_mockup_err("error registering platform driver\n");
+ debugfs_remove_recursive(gpio_mockup_dbg_dir);
return err;
}
@@ -582,6 +583,7 @@ static int __init gpio_mockup_init(void)
gpio_mockup_err("error registering device");
platform_driver_unregister(&gpio_mockup_driver);
gpio_mockup_unregister_pdevs();
+ debugfs_remove_recursive(gpio_mockup_dbg_dir);
return PTR_ERR(pdev);
}
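The leak fixed by the gpio-mockup patch above is an instance of a general rule: every failure exit from an init function has to undo whatever was set up before it. The sketch below shows that rule in userspace C. All names are hypothetical stand-ins, and it uses goto-style unwinding rather than the inline cleanup calls the actual patch adds, so it should be read as an alternative way to express the same idea, not as the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the debugfs directory and the driver. */
static bool dbg_dir_exists;
static bool driver_registered;

static void create_dbg_dir(void) { dbg_dir_exists = true; }
static void remove_dbg_dir(void) { dbg_dir_exists = false; }
static void unregister_driver(void) { driver_registered = false; }

static int register_driver(bool fail)
{
	if (fail)
		return -1;
	driver_registered = true;
	return 0;
}

/* Init with two possible failure points; each unwinds what preceded it. */
static int mockup_init(bool fail_driver, bool fail_device)
{
	int err;

	create_dbg_dir();

	err = register_driver(fail_driver);
	if (err)
		goto err_remove_dir;	/* only the directory exists so far */

	if (fail_device) {
		err = -2;
		goto err_unregister;	/* directory and driver both exist */
	}
	return 0;

err_unregister:
	unregister_driver();
err_remove_dir:
	remove_dbg_dir();
	return err;
}
```

Calling mockup_init(true, false) or mockup_init(false, true) leaves both flags false afterwards, which corresponds to the directory being removed on each error path.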
Hi Greg, Sasha,
This was missing in 4.4-stable. It was easier to backport than picking
all the other commits needed to apply it cleanly. It has been manually
backported with an extra label for goto. I would prefer an Ack from
Wolfram or Krzysztof or Oleksij before you add it to your queue.
--
Regards
Sudip
Hi Greg, Sasha,
While backporting 37640adbefd6 ("MIPS: PCI: remember nasid changed by
set interrupt affinity") something went wrong and an extra 'n' was added.
So 'data->nasid' became 'data->nnasid' and the MIPS builds started failing.
Since v5.4.78 is already released, I assume you will need a patch to
fix it. Please consider applying the attached patch; it is only needed
for the 5.4-stable tree.
--
Regards
Sudip
IBM Power9 processors can speculatively operate on data in the L1
cache before it has been completely validated, via a way-prediction
mechanism. It is not possible for an attacker to determine the
contents of impermissible memory using this method, since these
systems implement a combination of hardware and software security
measures to prevent scenarios where protected data could be leaked.
However these measures don't address the scenario where an attacker
induces the operating system to speculatively execute instructions
using data that the attacker controls. This can be used for example to
speculatively bypass "kernel user access prevention" techniques, as
discovered by Anthony Steinhauser of Google's Safeside Project. This
is not an attack by itself, but there is a possibility it could be
used in conjunction with side-channels or other weaknesses in the
privileged code to construct an attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This series flushes the cache on kernel entry and
after kernel user accesses.
Thanks to Nick Piggin, Russell Currey, Christopher M. Riedl, Michael
Ellerman and Spoorthy S for their work in developing, optimising,
testing and backporting these fixes, and to the many others who helped
behind the scenes.
Daniel Axtens (1):
selftests/powerpc: entry flush test
Michael Ellerman (1):
powerpc: Only include kup-radix.h for 64-bit Book3S
Nicholas Piggin (2):
powerpc/64s: flush L1D on kernel entry
powerpc/64s: flush L1D after user accesses
Russell Currey (1):
selftests/powerpc: rfi_flush: disable entry flush if present
.../admin-guide/kernel-parameters.txt | 7 +
.../powerpc/include/asm/book3s/64/kup-radix.h | 66 +++---
arch/powerpc/include/asm/exception-64s.h | 12 +-
arch/powerpc/include/asm/feature-fixups.h | 19 ++
arch/powerpc/include/asm/kup.h | 26 ++-
arch/powerpc/include/asm/security_features.h | 7 +
arch/powerpc/include/asm/setup.h | 4 +
arch/powerpc/kernel/exceptions-64s.S | 80 +++----
arch/powerpc/kernel/setup_64.c | 122 ++++++++++-
arch/powerpc/kernel/syscall_64.c | 2 +-
arch/powerpc/kernel/vmlinux.lds.S | 14 ++
arch/powerpc/lib/feature-fixups.c | 104 +++++++++
arch/powerpc/platforms/powernv/setup.c | 17 ++
arch/powerpc/platforms/pseries/setup.c | 8 +
.../selftests/powerpc/security/.gitignore | 1 +
.../selftests/powerpc/security/Makefile | 2 +-
.../selftests/powerpc/security/entry_flush.c | 198 ++++++++++++++++++
.../selftests/powerpc/security/rfi_flush.c | 35 +++-
18 files changed, 646 insertions(+), 78 deletions(-)
create mode 100644 tools/testing/selftests/powerpc/security/entry_flush.c
--
2.25.1
This adds a crashkernel=auto feature that configures the memory reserved
for vmcore creation on both x86_64 and arm64 platforms, based on the total
memory size.
Cc: stable(a)vger.kernel.org
Signed-off-by: John Donnelly <john.p.donnelly(a)oracle.com>
Signed-off-by: Saeed Mirzamohammadi <saeed.mirzamohammadi(a)oracle.com>
---
Documentation/admin-guide/kdump/kdump.rst | 5 +++++
arch/arm64/Kconfig | 26 ++++++++++++++++++++++-
arch/arm64/configs/defconfig | 1 +
arch/x86/Kconfig | 26 ++++++++++++++++++++++-
arch/x86/configs/x86_64_defconfig | 1 +
kernel/crash_core.c | 20 +++++++++++++++--
6 files changed, 75 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index 75a9dd98e76e..f95a2af64f59 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -285,7 +285,12 @@ This would mean:
2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
3) if the RAM size is larger than 2G, then reserve 128M
+Or you can use crashkernel=auto if you have enough memory. The threshold
+is 1G on x86_64 and arm64. If your system memory is less than the threshold,
+crashkernel=auto will not reserve memory. The size changes according to
+the system memory size like below:
+ x86_64/arm64: 1G-64G:128M,64G-1T:256M,1T-:512M
Boot into System Kernel
=======================
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1515f6f153a0..d359dcffa80e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1124,7 +1124,7 @@ comment "Support for PE file signature verification disabled"
depends on KEXEC_SIG
depends on !EFI || !SIGNED_PE_FILE_VERIFICATION
-config CRASH_DUMP
+menuconfig CRASH_DUMP
bool "Build kdump crash kernel"
help
Generate crash dump after being started by kexec. This should
@@ -1135,6 +1135,30 @@ config CRASH_DUMP
For more details see Documentation/admin-guide/kdump/kdump.rst
+if CRASH_DUMP
+
+config CRASH_AUTO_STR
+ string "Memory reserved for crash kernel"
+ depends on CRASH_DUMP
+ default "1G-64G:128M,64G-1T:256M,1T-:512M"
+ help
+ This configures the reserved memory dependent
+ on the value of System RAM. The syntax is:
+ crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
+ range=start-[end]
+
+ For example:
+ crashkernel=512M-2G:64M,2G-:128M
+
+ This would mean:
+
+ 1) if the RAM is smaller than 512M, then don't reserve anything
+ (this is the "rescue" case)
+ 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
+ 3) if the RAM size is larger than 2G, then reserve 128M
+
+endif # CRASH_DUMP
+
config XEN_DOM0
def_bool y
depends on XEN
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 5cfe3cf6f2ac..899ef3b6a78f 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -69,6 +69,7 @@ CONFIG_SECCOMP=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
CONFIG_CRASH_DUMP=y
+# CONFIG_CRASH_AUTO_STR is not set
CONFIG_XEN=y
CONFIG_COMPAT=y
CONFIG_RANDOMIZE_BASE=y
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f6946b81f74a..bacd17312bb1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2035,7 +2035,7 @@ config KEXEC_BZIMAGE_VERIFY_SIG
help
Enable bzImage signature verification support.
-config CRASH_DUMP
+menuconfig CRASH_DUMP
bool "kernel crash dumps"
depends on X86_64 || (X86_32 && HIGHMEM)
help
@@ -2049,6 +2049,30 @@ config CRASH_DUMP
(CONFIG_RELOCATABLE=y).
For more details see Documentation/admin-guide/kdump/kdump.rst
+if CRASH_DUMP
+
+config CRASH_AUTO_STR
+ string "Memory reserved for crash kernel" if X86_64
+ depends on CRASH_DUMP
+ default "1G-64G:128M,64G-1T:256M,1T-:512M"
+ help
+ This configures the reserved memory dependent
+ on the value of System RAM. The syntax is:
+ crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
+ range=start-[end]
+
+ For example:
+ crashkernel=512M-2G:64M,2G-:128M
+
+ This would mean:
+
+ 1) if the RAM is smaller than 512M, then don't reserve anything
+ (this is the "rescue" case)
+ 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
+ 3) if the RAM size is larger than 2G, then reserve 128M
+
+endif # CRASH_DUMP
+
config KEXEC_JUMP
bool "kexec jump"
depends on KEXEC && HIBERNATION
diff --git a/arch/x86/configs/x86_64_defconfig b/arch/x86/configs/x86_64_defconfig
index 9936528e1939..7a87fbecf40b 100644
--- a/arch/x86/configs/x86_64_defconfig
+++ b/arch/x86/configs/x86_64_defconfig
@@ -33,6 +33,7 @@ CONFIG_EFI_MIXED=y
CONFIG_HZ_1000=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
+# CONFIG_CRASH_AUTO_STR is not set
CONFIG_HIBERNATION=y
CONFIG_PM_DEBUG=y
CONFIG_PM_TRACE_RTC=y
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 106e4500fd53..a44cd9cc12c4 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -7,6 +7,7 @@
#include <linux/crash_core.h>
#include <linux/utsname.h>
#include <linux/vmalloc.h>
+#include <linux/kexec.h>
#include <asm/page.h>
#include <asm/sections.h>
@@ -41,6 +42,15 @@ static int __init parse_crashkernel_mem(char *cmdline,
unsigned long long *crash_base)
{
char *cur = cmdline, *tmp;
+ unsigned long long total_mem = system_ram;
+
+ /*
+ * Firmware sometimes reserves some memory regions for its own use,
+ * so we get less than the actual system memory size.
+ * Work around this by rounding up the total size to a multiple of
+ * 128M, which is enough for most test cases.
+ */
+ total_mem = roundup(total_mem, SZ_128M);
/* for each entry of the comma-separated list */
do {
@@ -85,13 +95,13 @@ static int __init parse_crashkernel_mem(char *cmdline,
return -EINVAL;
}
cur = tmp;
- if (size >= system_ram) {
+ if (size >= total_mem) {
pr_warn("crashkernel: invalid size\n");
return -EINVAL;
}
/* match ? */
- if (system_ram >= start && system_ram < end) {
+ if (total_mem >= start && total_mem < end) {
*crash_size = size;
break;
}
@@ -250,6 +260,12 @@ static int __init __parse_crashkernel(char *cmdline,
if (suffix)
return parse_crashkernel_suffix(ck_cmdline, crash_size,
suffix);
+#ifdef CONFIG_CRASH_AUTO_STR
+ if (strncmp(ck_cmdline, "auto", 4) == 0) {
+ ck_cmdline = CONFIG_CRASH_AUTO_STR;
+ pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
+ }
+#endif
/*
* if the commandline contains a ':', then that's the extended
* syntax -- if not, it must be the classic syntax
--
2.18.4
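To make the range syntax used by the crashkernel=auto patch above more concrete, here is a minimal userspace sketch of how a string such as "1G-64G:128M,64G-1T:256M,1T-:512M" could be matched against the system memory size. It is a simplification under stated assumptions: parse_size() and pick_crash_size() are hypothetical helpers, not kernel functions, the optional @offset part is not handled, and the kernel's check that the size is smaller than total memory is left out. The rounding to a multiple of 128M follows what the patch does before matching.

```c
#include <assert.h>
#include <stdlib.h>

#define SZ_128M (128ULL << 20)

/* Hypothetical helper: parse a decimal number with an optional
 * K/M/G/T suffix, loosely modelled on the kernel's memparse(). */
static unsigned long long parse_size(const char *s, const char **end)
{
	char *e;
	unsigned long long v = strtoull(s, &e, 10);

	switch (*e) {
	case 'T':
		v <<= 10;
		/* fall through */
	case 'G':
		v <<= 10;
		/* fall through */
	case 'M':
		v <<= 10;
		/* fall through */
	case 'K':
		v <<= 10;
		e++;
		break;
	default:
		break;
	}
	*end = e;
	return v;
}

/* Return the reservation for system_ram, or 0 if no range matches
 * or the specification is malformed. */
static unsigned long long pick_crash_size(const char *spec,
					  unsigned long long system_ram)
{
	/* Round up to a multiple of 128M to absorb firmware reservations. */
	unsigned long long total =
		(system_ram + SZ_128M - 1) / SZ_128M * SZ_128M;
	const char *cur = spec;

	while (*cur) {
		unsigned long long start, end = ~0ULL, size;

		start = parse_size(cur, &cur);
		if (*cur++ != '-')
			return 0;
		if (*cur != ':')	/* an omitted end means "no upper bound" */
			end = parse_size(cur, &cur);
		if (*cur++ != ':')
			return 0;
		size = parse_size(cur, &cur);

		if (total >= start && total < end)
			return size;
		if (*cur != ',')
			break;
		cur++;
	}
	return 0;
}
```

With the default string from the patch, a 4G machine falls into the first range and a 64G machine into the second, because the upper bound of each range is exclusive. A machine reporting slightly less than 1G (for example 1008M) is rounded up to 1G and therefore still gets a reservation.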
IBM Power9 processors can speculatively operate on data in the L1
cache before it has been completely validated, via a way-prediction
mechanism. It is not possible for an attacker to determine the
contents of impermissible memory using this method, since these
systems implement a combination of hardware and software security
measures to prevent scenarios where protected data could be leaked.
However these measures don't address the scenario where an attacker
induces the operating system to speculatively execute instructions
using data that the attacker controls. This can be used for example to
speculatively bypass "kernel user access prevention" techniques, as
discovered by Anthony Steinhauser of Google's Safeside Project. This
is not an attack by itself, but there is a possibility it could be
used in conjunction with side-channels or other weaknesses in the
privileged code to construct an attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This series flushes the cache on kernel entry and
after kernel user accesses.
Thanks to Nick Piggin, Russell Currey, Christopher M. Riedl, Michael
Ellerman and Spoorthy S for their work in developing, optimising,
testing and backporting these fixes, and to the many others who helped
behind the scenes.
Andrew Donnellan (1):
powerpc: Fix __clear_user() with KUAP enabled
Christophe Leroy (2):
powerpc: Add a framework for user access tracking
powerpc: Implement user_access_begin and friends
Daniel Axtens (2):
powerpc/64s: Define MASKABLE_RELON_EXCEPTION_PSERIES_OOL
powerpc/64s: move some exception handlers out of line
Nicholas Piggin (3):
powerpc/64s: flush L1D on kernel entry
powerpc/uaccess: Evaluate macro arguments once, before user access is
allowed
powerpc/64s: flush L1D after user accesses
Documentation/kernel-parameters.txt | 7 +
.../powerpc/include/asm/book3s/64/kup-radix.h | 23 ++
arch/powerpc/include/asm/exception-64s.h | 15 +-
arch/powerpc/include/asm/feature-fixups.h | 19 ++
arch/powerpc/include/asm/futex.h | 4 +
arch/powerpc/include/asm/kup.h | 40 ++++
arch/powerpc/include/asm/security_features.h | 7 +
arch/powerpc/include/asm/setup.h | 4 +
arch/powerpc/include/asm/uaccess.h | 142 +++++++++---
arch/powerpc/kernel/exceptions-64s.S | 210 +++++++++++-------
arch/powerpc/kernel/ppc_ksyms.c | 10 +
arch/powerpc/kernel/setup_64.c | 138 ++++++++++++
arch/powerpc/kernel/vmlinux.lds.S | 14 ++
arch/powerpc/lib/checksum_wrappers_64.c | 4 +
arch/powerpc/lib/feature-fixups.c | 104 +++++++++
arch/powerpc/lib/string.S | 2 +-
arch/powerpc/lib/string_64.S | 4 +-
arch/powerpc/platforms/powernv/setup.c | 15 ++
arch/powerpc/platforms/pseries/setup.c | 8 +
19 files changed, 653 insertions(+), 117 deletions(-)
create mode 100644 arch/powerpc/include/asm/book3s/64/kup-radix.h
create mode 100644 arch/powerpc/include/asm/kup.h
--
2.25.1
IBM Power9 processors can speculatively operate on data in the L1
cache before it has been completely validated, via a way-prediction
mechanism. It is not possible for an attacker to determine the
contents of impermissible memory using this method, since these
systems implement a combination of hardware and software security
measures to prevent scenarios where protected data could be leaked.
However these measures don't address the scenario where an attacker
induces the operating system to speculatively execute instructions
using data that the attacker controls. This can be used for example to
speculatively bypass "kernel user access prevention" techniques, as
discovered by Anthony Steinhauser of Google's Safeside Project. This
is not an attack by itself, but there is a possibility it could be
used in conjunction with side-channels or other weaknesses in the
privileged code to construct an attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This series flushes the cache on kernel entry and
after kernel user accesses.
Thanks to Nick Piggin, Russell Currey, Christopher M. Riedl, Michael
Ellerman and Spoorthy S for their work in developing, optimising,
testing and backporting these fixes, and to the many others who helped
behind the scenes.
Andrew Donnellan (1):
powerpc: Fix __clear_user() with KUAP enabled
Christophe Leroy (2):
powerpc: Add a framework for user access tracking
powerpc: Implement user_access_begin and friends
Daniel Axtens (2):
powerpc/64s: Define MASKABLE_RELON_EXCEPTION_PSERIES_OOL
powerpc/64s: move some exception handlers out of line
Nicholas Piggin (3):
powerpc/64s: flush L1D on kernel entry
powerpc/uaccess: Evaluate macro arguments once, before user access is
allowed
powerpc/64s: flush L1D after user accesses
Documentation/kernel-parameters.txt | 7 +
.../powerpc/include/asm/book3s/64/kup-radix.h | 22 +++
arch/powerpc/include/asm/exception-64s.h | 13 +-
arch/powerpc/include/asm/feature-fixups.h | 19 +++
arch/powerpc/include/asm/futex.h | 4 +
arch/powerpc/include/asm/kup.h | 40 +++++
arch/powerpc/include/asm/security_features.h | 7 +
arch/powerpc/include/asm/setup.h | 4 +
arch/powerpc/include/asm/uaccess.h | 143 ++++++++++++++----
arch/powerpc/kernel/exceptions-64s.S | 130 ++++++++--------
arch/powerpc/kernel/setup_64.c | 120 +++++++++++++++
arch/powerpc/kernel/vmlinux.lds.S | 14 ++
arch/powerpc/lib/checksum_wrappers.c | 4 +
arch/powerpc/lib/feature-fixups.c | 104 +++++++++++++
arch/powerpc/lib/string.S | 4 +-
arch/powerpc/lib/string_64.S | 6 +-
arch/powerpc/platforms/powernv/setup.c | 15 ++
arch/powerpc/platforms/pseries/setup.c | 8 +
18 files changed, 567 insertions(+), 97 deletions(-)
create mode 100644 arch/powerpc/include/asm/book3s/64/kup-radix.h
create mode 100644 arch/powerpc/include/asm/kup.h
--
2.25.1
IBM Power9 processors can speculatively operate on data in the L1
cache before it has been completely validated, via a way-prediction
mechanism. It is not possible for an attacker to determine the
contents of impermissible memory using this method, since these
systems implement a combination of hardware and software security
measures to prevent scenarios where protected data could be leaked.
However these measures don't address the scenario where an attacker
induces the operating system to speculatively execute instructions
using data that the attacker controls. This can be used for example to
speculatively bypass "kernel user access prevention" techniques, as
discovered by Anthony Steinhauser of Google's Safeside Project. This
is not an attack by itself, but there is a possibility it could be
used in conjunction with side-channels or other weaknesses in the
privileged code to construct an attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This series flushes the cache on kernel entry and
after kernel user accesses.
Thanks to Nick Piggin, Russell Currey, Christopher M. Riedl, Michael
Ellerman and Spoorthy S for their work in developing, optimising,
testing and backporting these fixes, and to the many others who helped
behind the scenes.
Andrew Donnellan (1):
powerpc: Fix __clear_user() with KUAP enabled
Christophe Leroy (2):
powerpc: Add a framework for user access tracking
powerpc: Implement user_access_begin and friends
Daniel Axtens (2):
powerpc/64s: Define MASKABLE_RELON_EXCEPTION_PSERIES_OOL
powerpc/64s: move some exception handlers out of line
Nicholas Piggin (3):
powerpc/64s: flush L1D on kernel entry
powerpc/uaccess: Evaluate macro arguments once, before user access is
allowed
powerpc/64s: flush L1D after user accesses
.../admin-guide/kernel-parameters.txt | 7 +
.../powerpc/include/asm/book3s/64/kup-radix.h | 22 +++
arch/powerpc/include/asm/exception-64s.h | 13 +-
arch/powerpc/include/asm/feature-fixups.h | 19 +++
arch/powerpc/include/asm/futex.h | 4 +
arch/powerpc/include/asm/kup.h | 40 +++++
arch/powerpc/include/asm/security_features.h | 7 +
arch/powerpc/include/asm/setup.h | 4 +
arch/powerpc/include/asm/uaccess.h | 148 ++++++++++++++----
arch/powerpc/kernel/exceptions-64s.S | 96 +++++++-----
arch/powerpc/kernel/setup_64.c | 122 ++++++++++++++-
arch/powerpc/kernel/vmlinux.lds.S | 14 ++
arch/powerpc/lib/checksum_wrappers.c | 4 +
arch/powerpc/lib/feature-fixups.c | 104 ++++++++++++
arch/powerpc/lib/string.S | 4 +-
arch/powerpc/lib/string_64.S | 6 +-
arch/powerpc/platforms/powernv/setup.c | 17 ++
arch/powerpc/platforms/pseries/setup.c | 8 +
18 files changed, 558 insertions(+), 81 deletions(-)
create mode 100644 arch/powerpc/include/asm/book3s/64/kup-radix.h
create mode 100644 arch/powerpc/include/asm/kup.h
--
2.25.1
IBM Power9 processors can speculatively operate on data in the L1
cache before it has been completely validated, via a way-prediction
mechanism. It is not possible for an attacker to determine the
contents of impermissible memory using this method, since these
systems implement a combination of hardware and software security
measures to prevent scenarios where protected data could be leaked.
However these measures don't address the scenario where an attacker
induces the operating system to speculatively execute instructions
using data that the attacker controls. This can be used for example to
speculatively bypass "kernel user access prevention" techniques, as
discovered by Anthony Steinhauser of Google's Safeside Project. This
is not an attack by itself, but there is a possibility it could be
used in conjunction with side-channels or other weaknesses in the
privileged code to construct an attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This series flushes the cache on kernel entry and
after kernel user accesses.
Thanks to Nick Piggin, Russell Currey, Christopher M. Riedl, Michael
Ellerman and Spoorthy S for their work in developing, optimising,
testing and backporting these fixes, and to the many others who helped
behind the scenes.
Andrew Donnellan (1):
powerpc: Fix __clear_user() with KUAP enabled
Christophe Leroy (2):
powerpc: Add a framework for user access tracking
powerpc: Implement user_access_begin and friends
Daniel Axtens (1):
powerpc/64s: move some exception handlers out of line
Nicholas Piggin (3):
powerpc/64s: flush L1D on kernel entry
powerpc/uaccess: Evaluate macro arguments once, before user access is
allowed
powerpc/64s: flush L1D after user accesses
.../admin-guide/kernel-parameters.txt | 7 +
.../powerpc/include/asm/book3s/64/kup-radix.h | 22 +++
arch/powerpc/include/asm/exception-64s.h | 9 +-
arch/powerpc/include/asm/feature-fixups.h | 19 +++
arch/powerpc/include/asm/futex.h | 4 +
arch/powerpc/include/asm/kup.h | 40 +++++
arch/powerpc/include/asm/security_features.h | 7 +
arch/powerpc/include/asm/setup.h | 4 +
arch/powerpc/include/asm/uaccess.h | 147 ++++++++++++++----
arch/powerpc/kernel/exceptions-64s.S | 96 +++++++-----
arch/powerpc/kernel/setup_64.c | 122 ++++++++++++++-
arch/powerpc/kernel/vmlinux.lds.S | 14 ++
arch/powerpc/lib/checksum_wrappers.c | 4 +
arch/powerpc/lib/feature-fixups.c | 104 +++++++++++++
arch/powerpc/lib/string_32.S | 4 +-
arch/powerpc/lib/string_64.S | 6 +-
arch/powerpc/platforms/powernv/setup.c | 17 ++
arch/powerpc/platforms/pseries/setup.c | 8 +
18 files changed, 553 insertions(+), 81 deletions(-)
create mode 100644 arch/powerpc/include/asm/book3s/64/kup-radix.h
create mode 100644 arch/powerpc/include/asm/kup.h
--
2.25.1
IBM Power9 processors can speculatively operate on data in the L1
cache before it has been completely validated, via a way-prediction
mechanism. It is not possible for an attacker to determine the
contents of impermissible memory using this method, since these
systems implement a combination of hardware and software security
measures to prevent scenarios where protected data could be leaked.
However these measures don't address the scenario where an attacker
induces the operating system to speculatively execute instructions
using data that the attacker controls. This can be used for example to
speculatively bypass "kernel user access prevention" techniques, as
discovered by Anthony Steinhauser of Google's Safeside Project. This
is not an attack by itself, but there is a possibility it could be
used in conjunction with side-channels or other weaknesses in the
privileged code to construct an attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This series flushes the cache on kernel entry and
after kernel user accesses.
Thanks to Nick Piggin, Russell Currey, Christopher M. Riedl, Michael
Ellerman and Spoorthy S for their work in developing, optimising,
testing and backporting these fixes, and to the many others who helped
behind the scenes.
Daniel Axtens (1):
selftests/powerpc: entry flush test
Michael Ellerman (1):
powerpc: Only include kup-radix.h for 64-bit Book3S
Nicholas Piggin (2):
powerpc/64s: flush L1D on kernel entry
powerpc/64s: flush L1D after user accesses
Russell Currey (1):
selftests/powerpc: rfi_flush: disable entry flush if present
.../admin-guide/kernel-parameters.txt | 7 +
.../powerpc/include/asm/book3s/64/kup-radix.h | 29 ++--
arch/powerpc/include/asm/exception-64s.h | 12 +-
arch/powerpc/include/asm/feature-fixups.h | 19 ++
arch/powerpc/include/asm/kup.h | 27 ++-
arch/powerpc/include/asm/security_features.h | 7 +
arch/powerpc/include/asm/setup.h | 4 +
arch/powerpc/kernel/exceptions-64s.S | 88 +++++-----
arch/powerpc/kernel/setup_64.c | 122 ++++++++++++-
arch/powerpc/kernel/vmlinux.lds.S | 14 ++
arch/powerpc/lib/feature-fixups.c | 104 +++++++++++
arch/powerpc/platforms/powernv/setup.c | 17 ++
arch/powerpc/platforms/pseries/setup.c | 8 +
.../selftests/powerpc/security/.gitignore | 1 +
.../selftests/powerpc/security/Makefile | 2 +-
.../selftests/powerpc/security/entry_flush.c | 163 ++++++++++++++++++
.../selftests/powerpc/security/rfi_flush.c | 35 +++-
17 files changed, 592 insertions(+), 67 deletions(-)
create mode 100644 tools/testing/selftests/powerpc/security/entry_flush.c
--
2.25.1
Hi,
Please backport commit f9317ae5523f99999fb54c513ebabbb2bc887ddf ("net:
lantiq: Add locking for TX DMA channel") to kernel 5.4.
https://git.kernel.org/linus/f9317ae5523f99999fb54c513ebabbb2bc887ddf
The fix commit was added upstream with kernel 5.9 and fixes a problem
introduced in commit fe1a56420cf2 ("net: lantiq: Add Lantiq / Intel
VRX200 Ethernet driver") with kernel 4.20.
Multiple users reported, in the pull request that integrates this into
OpenWrt, that it fixes TX hangs for them.
https://github.com/openwrt/openwrt/pull/3085
Hauke
Hi,
Please backport "i2c: mux: pca954x: Add missing pca9546 definition to
chip_desc" to kernel 4.9.
This is upstream commit id dbe4d69d252e9e65c6c46826980b77b11a142065
https://git.kernel.org/linus/dbe4d69d252e9e65c6c46826980b77b11a142065
commit dbe4d69d252e9e65c6c46826980b77b11a142065
Author: Mike Looijmans <mike.looijmans(a)topic.nl>
Date: Thu Mar 23 10:00:36 2017 +0100
i2c: mux: pca954x: Add missing pca9546 definition to chip_desc
The pca954x_of_match table references the chips array at position
pca_9546, but this entry is not filled before.
When a device tree contains a compatible string with "nxp,pca9546", it
will not load successfully without this patch.
This problem was introduced in commit 8a191a7ad4ca ("i2c: pca954x: add
device tree binding") in v4.9 and is fixed upstream with kernel version
4.11.
The commit f8251f1dfda9 ("i2c: mux: pca954x: Add missing pca9542
definition to chip_desc") fixes a similar problem with the pca9542.
https://git.kernel.org/linus/f8251f1dfda9e1200545bf19270d9df2273bdfa1
The changes to pca954x_acpi_ids should not be backported, as that table
does not exist in 4.9.
Hauke
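The failure described in the pca954x request above, a match table pointing at an array slot that was never filled in, can be illustrated with a small C sketch. The types and tables here are simplified and hypothetical (the real driver's chip_desc has more fields, and these names do not mirror it exactly). The point is only that a slot skipped in a designated-initializer array is zero-initialized, so a lookup through it yields a chip with zero channels.

```c
#include <assert.h>

/* Simplified, hypothetical chip identifiers and descriptor. */
enum pca_type { pca_9540, pca_9546, pca_9548, pca_count };

struct chip_desc {
	unsigned int nchans;	/* number of downstream channels */
};

/* Table with the pca_9546 entry missing: that slot is all zeroes. */
static const struct chip_desc chips_broken[pca_count] = {
	[pca_9540] = { .nchans = 2 },
	[pca_9548] = { .nchans = 8 },
};

/* Table with the missing entry added. */
static const struct chip_desc chips_fixed[pca_count] = {
	[pca_9540] = { .nchans = 2 },
	[pca_9546] = { .nchans = 4 },
	[pca_9548] = { .nchans = 8 },
};

/* What a match-table lookup effectively does: index by chip type. */
static unsigned int channels_for(const struct chip_desc *table,
				 enum pca_type type)
{
	return table[type].nchans;
}
```

Here channels_for(chips_broken, pca_9546) returns 0 while channels_for(chips_fixed, pca_9546) returns 4, which is consistent with a device matched by "nxp,pca9546" failing to load until the entry is added.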
On Thu, Nov 19, 2020 at 1:44 PM Tao Zhou <ouwen210(a)hotmail.com> wrote:
> [...]
> That time I realized something, but..
> I try to remember something and get some impression.
>
> We need to update the below when do not need to enqueue entity because
> this is added for runnable_avg updating,
>
> update_load_avg(cfs_rq, se, UPDATE_TG);
> se_update_runnable(se);
>
> Earlier version do not introduce the above to only update runnable_avg.
> Use one *for loop* is enough though. Please correct me if I am wrong.
>
Thanks a lot Tao! I'm not sure, I'm definitely not an expert in the
scheduler. Will defer this one to Vincent / Peter / Phil / Ben.
Cheers!
I'm announcing the release of the 4.9.244 kernel.
All users of the 4.9 kernel series must upgrade.
The updated 4.9.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.9.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/kernel-parameters.txt | 8
Makefile | 2
arch/x86/events/intel/pt.c | 4
arch/x86/kernel/cpu/bugs.c | 52 +-
drivers/block/xen-blkback/blkback.c | 22 -
drivers/block/xen-blkback/xenbus.c | 5
drivers/char/random.c | 1
drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 27 -
drivers/gpu/drm/gma500/psb_irq.c | 34 -
drivers/iommu/amd_iommu_types.h | 6
drivers/misc/mei/client.h | 4
drivers/net/can/dev.c | 14
drivers/net/can/usb/peak_usb/pcan_usb_core.c | 51 ++
drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 48 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 32 +
drivers/net/geneve.c | 36 +
drivers/net/wan/cosa.c | 1
drivers/net/wireless/ath/ath9k/htc_drv_txrx.c | 2
drivers/net/xen-netback/common.h | 15
drivers/net/xen-netback/interface.c | 61 ++
drivers/net/xen-netback/netback.c | 11
drivers/net/xen-netback/rx.c | 13
drivers/of/address.c | 4
drivers/pinctrl/aspeed/pinctrl-aspeed.c | 7
drivers/pinctrl/devicetree.c | 26 -
drivers/pinctrl/pinctrl-amd.c | 6
drivers/regulator/core.c | 2
drivers/scsi/device_handler/scsi_dh_alua.c | 9
drivers/scsi/hpsa.c | 4
drivers/usb/class/cdc-acm.c | 9
drivers/usb/gadget/udc/goku_udc.c | 2
drivers/xen/events/events_2l.c | 9
drivers/xen/events/events_base.c | 422 +++++++++++++++++--
drivers/xen/events/events_fifo.c | 82 +--
drivers/xen/events/events_internal.h | 20
drivers/xen/evtchn.c | 7
drivers/xen/xen-pciback/pci_stub.c | 14
drivers/xen/xen-pciback/pciback.h | 12
drivers/xen/xen-pciback/pciback_ops.c | 48 +-
drivers/xen/xen-pciback/xenbus.c | 2
drivers/xen/xen-scsiback.c | 23 -
fs/btrfs/extent_io.c | 4
fs/btrfs/ioctl.c | 2
fs/cifs/cifs_unicode.c | 8
fs/ext4/inline.c | 1
fs/ext4/super.c | 5
fs/gfs2/glock.c | 3
fs/gfs2/rgrp.c | 5
fs/ocfs2/super.c | 1
fs/xfs/libxfs/xfs_rmap.c | 2
fs/xfs/libxfs/xfs_rmap_btree.c | 16
fs/xfs/xfs_iops.c | 10
fs/xfs/xfs_pnfs.c | 2
include/linux/can/skb.h | 20
include/linux/perf_event.h | 2
include/linux/prandom.h | 36 +
include/linux/time64.h | 4
include/xen/events.h | 29 +
kernel/events/core.c | 42 -
kernel/events/internal.h | 2
kernel/exit.c | 5
kernel/irq/Kconfig | 1
kernel/reboot.c | 28 -
kernel/time/timer.c | 7
kernel/trace/ring_buffer.c | 54 ++
lib/random32.c | 462 ++++++++++++---------
lib/swiotlb.c | 6
mm/mempolicy.c | 6
net/ipv4/syncookies.c | 9
net/ipv6/sit.c | 2
net/ipv6/syncookies.c | 10
net/iucv/af_iucv.c | 3
net/mac80211/tx.c | 35 +
net/wireless/reg.c | 2
net/x25/af_x25.c | 2
net/xfrm/xfrm_state.c | 8
sound/hda/ext/hdac_ext_controller.c | 2
tools/perf/util/session.c | 1
78 files changed, 1446 insertions(+), 548 deletions(-)
Al Viro (1):
don't dump the threads that had been already exiting when zapped.
Alexander Aring (1):
gfs2: Wake up when sd_glock_disposal becomes zero
Alexander Usyskin (1):
mei: protect mei_cl_mtu from null dereference
Anand K Mistry (1):
x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP
Billy Tsai (1):
pinctrl: aspeed: Fix GPI only function problem.
Bob Peterson (2):
gfs2: Free rd_bits later in gfs2_clear_rgrpd to fix use-after-free
gfs2: check for live vs. read-only file system in gfs2_fitrim
Boris Protopopov (1):
Convert trailing spaces and periods in path components
Brian Foster (1):
xfs: flush new eof page on truncate to avoid post-eof corruption
Chris Brandt (1):
usb: cdc-acm: Add DISABLE_ECHO for Renesas USB Download mode
Christoph Hellwig (1):
xfs: fix a missing unlock on error in xfs_fs_map_blocks
Christophe JAILLET (1):
i40e: Fix a potential NULL pointer dereference
Coiby Xu (2):
pinctrl: amd: use higher precision for 512 RtcClk
pinctrl: amd: fix incorrect way to disable debounce filter
Dan Carpenter (2):
ALSA: hda: prevent undefined shift in snd_hdac_ext_bus_get_link()
can: peak_usb: add range checking in decode operations
Darrick J. Wong (2):
xfs: fix flags argument to rmap lookup when converting shared file rmaps
xfs: fix rmap key and record comparison functions
Eric Biggers (1):
ext4: fix leaking sysfs kobject after failed mount
Evan Nimmo (1):
of/address: Fix of_node memory leak in of_dma_is_coherent
Evan Quan (1):
drm/amdgpu: perform srbm soft reset always on SDMA resume
Evgeny Novikov (1):
usb: gadget: goku_udc: fix potential crashes in probe
Filipe Manana (1):
Btrfs: fix missing error return if writeback for extent buffer never started
George Spelvin (1):
random32: make prandom_u32() output unpredictable
Greg Kroah-Hartman (1):
Linux 4.9.244
Grzegorz Siwik (1):
i40e: Wrong truncation from u16 to u8
Hannes Reinecke (1):
scsi: scsi_dh_alua: Avoid crash during alua_bus_detach()
Jiri Olsa (2):
perf tools: Add missing swap for ino_generation
perf/core: Fix race in the perf_mmap_close() function
Johannes Berg (1):
mac80211: fix use of skb payload instead of header
Johannes Thumshirn (1):
btrfs: reschedule when cloning lots of extents
Joseph Qi (1):
ext4: unlock xattr_sem properly in ext4_inline_data_truncate()
Juergen Gross (12):
xen/events: avoid removing an event channel while handling it
xen/events: add a proper barrier to 2-level uevent unmasking
xen/events: fix race in evtchn_fifo_unmask()
xen/events: add a new "late EOI" evtchn framework
xen/blkback: use lateeoi irq binding
xen/netback: use lateeoi irq binding
xen/scsiback: use lateeoi irq binding
xen/pciback: use lateeoi irq binding
xen/events: switch user event channels to lateeoi model
xen/events: use a common cpu hotplug hook for event channels
xen/events: defer eoi in case of excessive number of events
xen/events: block rogue events for some time
Kaixu Xia (1):
ext4: correctly report "not supported" for {usr,grp}jquota when !CONFIG_QUOTA
Keita Suzuki (1):
scsi: hpsa: Fix memory leak in hpsa_init_one()
Mao Wenan (1):
net: Update window_clamp if SOCK_RCVBUF is set
Marc Zyngier (1):
genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY
Mark Gray (1):
geneve: add transport ports in route lookup for geneve
Martin Schiller (1):
net/x25: Fix null-ptr-deref in x25_connect
Martyna Szapar (2):
i40e: Fix of memory leak and integer truncation in i40e_virtchnl.c
i40e: Memory leak in i40e_config_iwarp_qvlist
Masashi Honma (1):
ath9k_htc: Use appropriate rs_datalen type
Mathieu Poirier (1):
perf/core: Fix crash when using HW tracing kernel filters
Matteo Croce (2):
Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
reboot: fix overflow parsing reboot cpu number
Michał Mirosław (1):
regulator: defer probe when trying to get voltage from unresolved supply
Oleksij Rempel (1):
can: can_create_echo_skb(): fix echo skb generation: always use skb_clone()
Oliver Hartkopp (1):
can: dev: __can_get_echo_skb(): fix real payload length return value for RTR frames
Oliver Herms (1):
IPv6: Set SIT tunnel hard_header_len to zero
Peter Zijlstra (1):
perf: Fix get_recursion_context()
Sergey Nemov (1):
i40e: add num_vectors checker in iwarp handler
Shijie Luo (1):
mm: mempolicy: fix potential pte_unmap_unlock pte error
Song Liu (1):
perf/core: Fix bad use of igrab()
Stefano Stabellini (1):
swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"
Stephane Grosjean (1):
can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
Steven Rostedt (VMware) (1):
ring-buffer: Fix recursion protection transitions between interrupt context
Suravee Suthikulpanit (1):
iommu/amd: Increase interrupt remapping table limit to 512 entries
Thomas Zimmermann (1):
drm/gma500: Fix out-of-bounds access to struct drm_device.vblank[]
Ursula Braun (1):
net/af_iucv: fix null pointer dereference on shutdown
Vincent Mailhol (1):
can: dev: can_get_echo_skb(): prevent call to kfree_skb() in hard IRQ context
Wang Hai (1):
cosa: Add missing kfree in error path of cosa_write
Wengang Wang (1):
ocfs2: initialize ip_next_orphan
Will Deacon (1):
pinctrl: devicetree: Avoid taking direct reference to device name string
Ye Bin (1):
cfg80211: regulatory: Fix inconsistent format argument
Zeng Tao (1):
time: Prevent undefined behaviour in timespec64_to_ns()
kiyin(尹亮) (1):
perf/core: Fix a memory leak in perf_event_parse_addr_filter()
zhuoliang zhang (1):
net: xfrm: fix a race condition during allocing spi
I'm announcing the release of the 4.4.244 kernel.
All users of the 4.4 kernel series must upgrade.
The updated 4.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/kernel-parameters.txt | 8
Makefile | 2
arch/x86/kernel/cpu/bugs.c | 52 +-
drivers/block/xen-blkback/blkback.c | 22
drivers/block/xen-blkback/xenbus.c | 5
drivers/char/random.c | 2
drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 27 -
drivers/gpu/drm/gma500/psb_irq.c | 34 -
drivers/iommu/amd_iommu_types.h | 6
drivers/misc/mei/client.h | 4
drivers/net/can/dev.c | 14
drivers/net/can/usb/peak_usb/pcan_usb_core.c | 51 ++
drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 48 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 4
drivers/net/geneve.c | 36 +
drivers/net/wan/cosa.c | 1
drivers/net/wireless/ath/ath9k/htc_drv_txrx.c | 2
drivers/net/xen-netback/common.h | 39 +
drivers/net/xen-netback/interface.c | 59 ++
drivers/net/xen-netback/netback.c | 17
drivers/of/address.c | 4
drivers/pinctrl/devicetree.c | 26 -
drivers/pinctrl/pinctrl-amd.c | 6
drivers/usb/class/cdc-acm.c | 9
drivers/usb/gadget/udc/goku_udc.c | 2
drivers/xen/events/events_2l.c | 9
drivers/xen/events/events_base.c | 444 ++++++++++++++++++--
drivers/xen/events/events_fifo.c | 102 +---
drivers/xen/events/events_internal.h | 20
drivers/xen/evtchn.c | 7
drivers/xen/xen-pciback/pci_stub.c | 14
drivers/xen/xen-pciback/pciback.h | 12
drivers/xen/xen-pciback/pciback_ops.c | 48 +-
drivers/xen/xen-pciback/xenbus.c | 2
drivers/xen/xen-scsiback.c | 23 -
fs/btrfs/extent_io.c | 4
fs/btrfs/ioctl.c | 2
fs/cifs/cifs_unicode.c | 8
fs/ext4/inline.c | 1
fs/ext4/super.c | 5
fs/gfs2/glock.c | 3
fs/gfs2/rgrp.c | 5
fs/ocfs2/super.c | 1
fs/xfs/xfs_pnfs.c | 2
include/linux/can/skb.h | 20
include/linux/prandom.h | 36 +
include/linux/time64.h | 4
include/xen/events.h | 29 +
kernel/events/core.c | 7
kernel/events/internal.h | 2
kernel/exit.c | 5
kernel/reboot.c | 28 -
kernel/time/timer.c | 7
kernel/trace/ring_buffer.c | 54 +-
lib/random32.c | 463 ++++++++++++---------
lib/swiotlb.c | 6
mm/mempolicy.c | 6
net/ipv4/syncookies.c | 9
net/ipv6/sit.c | 2
net/ipv6/syncookies.c | 10
net/iucv/af_iucv.c | 3
net/mac80211/tx.c | 35 +
net/wireless/reg.c | 2
net/x25/af_x25.c | 2
net/xfrm/xfrm_state.c | 8
sound/hda/ext/hdac_ext_controller.c | 2
tools/perf/util/session.c | 1
67 files changed, 1412 insertions(+), 521 deletions(-)
Al Viro (1):
don't dump the threads that had been already exiting when zapped.
Alexander Aring (1):
gfs2: Wake up when sd_glock_disposal becomes zero
Alexander Usyskin (1):
mei: protect mei_cl_mtu from null dereference
Anand K Mistry (1):
x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP
Bob Peterson (2):
gfs2: Free rd_bits later in gfs2_clear_rgrpd to fix use-after-free
gfs2: check for live vs. read-only file system in gfs2_fitrim
Boris Protopopov (1):
Convert trailing spaces and periods in path components
Chris Brandt (1):
usb: cdc-acm: Add DISABLE_ECHO for Renesas USB Download mode
Christoph Hellwig (1):
xfs: fix a missing unlock on error in xfs_fs_map_blocks
Coiby Xu (2):
pinctrl: amd: use higher precision for 512 RtcClk
pinctrl: amd: fix incorrect way to disable debounce filter
Dan Carpenter (2):
ALSA: hda: prevent undefined shift in snd_hdac_ext_bus_get_link()
can: peak_usb: add range checking in decode operations
Eric Biggers (1):
ext4: fix leaking sysfs kobject after failed mount
Evan Nimmo (1):
of/address: Fix of_node memory leak in of_dma_is_coherent
Evan Quan (1):
drm/amdgpu: perform srbm soft reset always on SDMA resume
Evgeny Novikov (1):
usb: gadget: goku_udc: fix potential crashes in probe
Filipe Manana (1):
Btrfs: fix missing error return if writeback for extent buffer never started
George Spelvin (1):
random32: make prandom_u32() output unpredictable
Greg Kroah-Hartman (1):
Linux 4.4.244
Grzegorz Siwik (1):
i40e: Wrong truncation from u16 to u8
Jiri Olsa (2):
perf tools: Add missing swap for ino_generation
perf/core: Fix race in the perf_mmap_close() function
Johannes Berg (1):
mac80211: fix use of skb payload instead of header
Johannes Thumshirn (1):
btrfs: reschedule when cloning lots of extents
Joseph Qi (1):
ext4: unlock xattr_sem properly in ext4_inline_data_truncate()
Juergen Gross (12):
xen/events: avoid removing an event channel while handling it
xen/events: add a proper barrier to 2-level uevent unmasking
xen/events: fix race in evtchn_fifo_unmask()
xen/events: add a new "late EOI" evtchn framework
xen/blkback: use lateeoi irq binding
xen/netback: use lateeoi irq binding
xen/scsiback: use lateeoi irq binding
xen/pciback: use lateeoi irq binding
xen/events: switch user event channels to lateeoi model
xen/events: use a common cpu hotplug hook for event channels
xen/events: defer eoi in case of excessive number of events
xen/events: block rogue events for some time
Kaixu Xia (1):
ext4: correctly report "not supported" for {usr,grp}jquota when !CONFIG_QUOTA
Mao Wenan (1):
net: Update window_clamp if SOCK_RCVBUF is set
Mark Gray (1):
geneve: add transport ports in route lookup for geneve
Martin Schiller (1):
net/x25: Fix null-ptr-deref in x25_connect
Martyna Szapar (1):
i40e: Fix of memory leak and integer truncation in i40e_virtchnl.c
Masashi Honma (1):
ath9k_htc: Use appropriate rs_datalen type
Matteo Croce (2):
Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
reboot: fix overflow parsing reboot cpu number
Oleksij Rempel (1):
can: can_create_echo_skb(): fix echo skb generation: always use skb_clone()
Oliver Hartkopp (1):
can: dev: __can_get_echo_skb(): fix real payload length return value for RTR frames
Oliver Herms (1):
IPv6: Set SIT tunnel hard_header_len to zero
Peter Zijlstra (1):
perf: Fix get_recursion_context()
Shijie Luo (1):
mm: mempolicy: fix potential pte_unmap_unlock pte error
Stefano Stabellini (1):
swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"
Stephane Grosjean (1):
can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
Steven Rostedt (VMware) (1):
ring-buffer: Fix recursion protection transitions between interrupt context
Suravee Suthikulpanit (1):
iommu/amd: Increase interrupt remapping table limit to 512 entries
Thomas Zimmermann (1):
drm/gma500: Fix out-of-bounds access to struct drm_device.vblank[]
Ursula Braun (1):
net/af_iucv: fix null pointer dereference on shutdown
Vincent Mailhol (1):
can: dev: can_get_echo_skb(): prevent call to kfree_skb() in hard IRQ context
Wang Hai (1):
cosa: Add missing kfree in error path of cosa_write
Wengang Wang (1):
ocfs2: initialize ip_next_orphan
Will Deacon (1):
pinctrl: devicetree: Avoid taking direct reference to device name string
Ye Bin (1):
cfg80211: regulatory: Fix inconsistent format argument
Zeng Tao (1):
time: Prevent undefined behaviour in timespec64_to_ns()
zhuoliang zhang (1):
net: xfrm: fix a race condition during allocing spi
A reshape request should be blocked while a resync job is ongoing. In a
cluster environment, a node can start a resync job even if the resync
command wasn't executed on it, e.g., when the user executes "mdadm --grow"
on node A, node B will sometimes start the resync job. However, the current
update_raid_disks() only checks the local recovery status, which is
incomplete. As a result, "mdadm --grow" succeeds on the local node, while
the remote node refuses to do the reshape job because it is doing a resync
job. The inconsistent handling causes the array to enter an unexpected
state. If the user doesn't notice this issue and continues executing mdadm
commands, the array eventually stops working.
Fix this issue by blocking the reshape request. When a node executes
"--grow" and detects an ongoing resync, it should stop and report an error
to the user.
The following script reproduces the issue with ~100% probability
(two nodes share 3 iSCSI LUNs: sdg/sdh/sdi; each LUN is 1GB).
```
# on node1, node2 is the remote node.
ssh root@node2 "mdadm -S --scan"
mdadm -S --scan
for i in {g,h,i};do dd if=/dev/zero of=/dev/sd$i oflag=direct bs=1M \
count=20; done
mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdg /dev/sdh
ssh root@node2 "mdadm -A /dev/md0 /dev/sdg /dev/sdh"
sleep 5
mdadm --manage --add /dev/md0 /dev/sdi
mdadm --wait /dev/md0
mdadm --grow --raid-devices=3 /dev/md0
mdadm /dev/md0 --fail /dev/sdg
mdadm /dev/md0 --remove /dev/sdg
mdadm --grow --raid-devices=2 /dev/md0
```
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhao Heming <heming.zhao(a)suse.com>
---
drivers/md/md.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 98bac4f304ae..74280e353b8f 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7278,6 +7278,7 @@ static int update_raid_disks(struct mddev *mddev, int raid_disks)
return -EINVAL;
if (mddev->sync_thread ||
test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) ||
+ test_bit(MD_RESYNCING_REMOTE, &mddev->recovery) ||
mddev->reshape_position != MaxSector)
return -EBUSY;
@@ -9662,8 +9663,11 @@ static void check_sb_changes(struct mddev *mddev, struct md_rdev *rdev)
}
}
- if (mddev->raid_disks != le32_to_cpu(sb->raid_disks))
- update_raid_disks(mddev, le32_to_cpu(sb->raid_disks));
+ if (mddev->raid_disks != le32_to_cpu(sb->raid_disks)) {
+ ret = update_raid_disks(mddev, le32_to_cpu(sb->raid_disks));
+ if (ret)
+ pr_warn("md: updating array disks failed. %d\n", ret);
+ }
/*
* Since mddev->delta_disks has already updated in update_raid_disks,
--
2.27.0
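For readers who want the gist of the md change above without the kernel context, here is a minimal user-space sketch of the busy check. It is not the kernel code: `struct array_state`, its fields, and `model_update_raid_disks()` are hypothetical stand-ins for `struct mddev` and `update_raid_disks()`, and the only point shown is that a remote resync now counts as "busy" just like a local one.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the state that struct mddev tracks. */
struct array_state {
	bool sync_thread_active;   /* models mddev->sync_thread being set */
	bool recovery_running;     /* models the MD_RECOVERY_RUNNING bit */
	bool resyncing_remote;     /* models the MD_RESYNCING_REMOTE bit */
	bool reshape_in_progress;  /* models reshape_position != MaxSector */
	int raid_disks;
};

/* Returns 0 on success or a negative errno, following the kernel convention. */
static int model_update_raid_disks(struct array_state *st, int raid_disks)
{
	assert(st);
	if (raid_disks <= 0)
		return -EINVAL;
	/* Refuse the reshape while any resync, local or remote, is ongoing. */
	if (st->sync_thread_active || st->recovery_running ||
	    st->resyncing_remote || st->reshape_in_progress)
		return -EBUSY;
	st->raid_disks = raid_disks;
	return 0;
}
```

In this model a caller sees `-EBUSY` and leaves `raid_disks` untouched whenever `resyncing_remote` is set, which corresponds to the case the patch adds; the second hunk of the patch then makes the caller log that failure instead of ignoring it.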
[This is backport for 4.9 of 29daf869cbab69088fe1755d9dd224e99ba78b56]
The kernel expects pte_young() to work regardless of CONFIG_SWAP.
Make sure a minor fault is taken to set _PAGE_ACCESSED when it
is not already set, regardless of the selection of CONFIG_SWAP.
This adds at least 3 instructions to the TLB miss exception
handlers fast path. Following patch will reduce this overhead.
Also update the rotation instruction to the correct number of bits
to reflect all changes done to _PAGE_ACCESSED over time.
Fixes: d069cb4373fe ("powerpc/8xx: Don't touch ACCESSED when no SWAP.")
Fixes: 5f356497c384 ("powerpc/8xx: remove unused _PAGE_WRITETHRU")
Fixes: e0a8e0d90a9f ("powerpc/8xx: Handle PAGE_USER via APG bits")
Fixes: 5b2753fc3e8a ("powerpc/8xx: Implementation of PAGE_EXEC")
Fixes: a891c43b97d3 ("powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/af834e8a0f1fa97bfae65664950f0984a70c4750.16024928…
---
arch/powerpc/kernel/head_8xx.S | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 2274be535dda..3801b32b1642 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -359,11 +359,9 @@ InstructionTLBMiss:
/* Load the MI_TWC with the attributes for this "segment." */
MTSPR_CPU6(SPRN_MI_TWC, r11, r3) /* Set segment attributes */
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-11, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
li r11, RPN_PATTERN
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 20-23 and 28 must be clear.
@@ -443,11 +441,9 @@ _ENTRY(DTLBMiss_jmp)
* r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
* r10 = (r10 & ~PRESENT) | r11;
*/
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-11, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 22 and 28 must be clear.
* Software indicator bits 24, 25, 26, and 27 must be
--
2.25.0
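The three-instruction sequence in the powerpc/8xx backports can be modelled in plain C, which may help when checking why the rotation count differs between branches. This is a sketch under stated assumptions: `PTE_PRESENT` as bit 0 and the helper names are illustrative, and the only thing taken from the diffs is that the rotate amount (7 or 11, depending on the branch) is the distance between the ACCESSED and PRESENT bits.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT 0x0001u  /* assumed: PRESENT is the least significant bit */

/* Rotate a 32-bit value right by n bits (0 < n < 32). */
static uint32_t rotr32(uint32_t v, unsigned int n)
{
	return (v >> n) | (v << (32u - n));
}

/*
 * Clear PRESENT unless ACCESSED is also set. ACCESSED is assumed to sit
 * accessed_shift bits above PRESENT.
 */
static uint32_t pte_hide_if_not_young(uint32_t pte, unsigned int accessed_shift)
{
	uint32_t r11;

	assert(accessed_shift > 0 && accessed_shift < 32);
	/* rlwinm r11, r10, 32-shift, _PAGE_PRESENT: move ACCESSED onto PRESENT */
	r11 = rotr32(pte, accessed_shift) & PTE_PRESENT;
	/* and r11, r11, r10: keep it only if PRESENT was set as well */
	r11 &= pte;
	/* rlwimi r10, r11, 0, _PAGE_PRESENT: write the result back into PRESENT */
	return (pte & ~PTE_PRESENT) | r11;
}
```

With a shift that does not match the PTE layout, the model drops PRESENT even for pages that were already accessed, so every TLB miss on such a page would end in a fault; that is the kind of mismatch the corrected rotation count avoids.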
[This is backport for 4.4 of 29daf869cbab69088fe1755d9dd224e99ba78b56]
The kernel expects pte_young() to work regardless of CONFIG_SWAP.
Make sure a minor fault is taken to set _PAGE_ACCESSED when it
is not already set, regardless of the selection of CONFIG_SWAP.
This adds at least 3 instructions to the TLB miss exception
handlers fast path. Following patch will reduce this overhead.
Also update the rotation instruction to the correct number of bits
to reflect all changes done to _PAGE_ACCESSED over time.
Fixes: d069cb4373fe ("powerpc/8xx: Don't touch ACCESSED when no SWAP.")
Fixes: 5f356497c384 ("powerpc/8xx: remove unused _PAGE_WRITETHRU")
Fixes: e0a8e0d90a9f ("powerpc/8xx: Handle PAGE_USER via APG bits")
Fixes: 5b2753fc3e8a ("powerpc/8xx: Implementation of PAGE_EXEC")
Fixes: a891c43b97d3 ("powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/af834e8a0f1fa97bfae65664950f0984a70c4750.16024928…
---
arch/powerpc/kernel/head_8xx.S | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 01e274e6907b..3d7512e72900 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -361,11 +361,9 @@ InstructionTLBMiss:
/* Load the MI_TWC with the attributes for this "segment." */
MTSPR_CPU6(SPRN_MI_TWC, r11, r3) /* Set segment attributes */
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-11, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
li r11, RPN_PATTERN
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 20-23 and 28 must be clear.
@@ -436,11 +434,9 @@ DataStoreTLBMiss:
* r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
* r10 = (r10 & ~PRESENT) | r11;
*/
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-11, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 22 and 28 must be clear.
* Software indicator bits 24, 25, 26, and 27 must be
--
2.25.0
[This is backport for 4.19 of 29daf869cbab69088fe1755d9dd224e99ba78b56]
The kernel expects pte_young() to work regardless of CONFIG_SWAP.
Make sure a minor fault is taken to set _PAGE_ACCESSED when it
is not already set, regardless of the selection of CONFIG_SWAP.
This adds at least 3 instructions to the TLB miss exception
handlers fast path. Following patch will reduce this overhead.
Also update the rotation instruction to the correct number of bits
to reflect all changes done to _PAGE_ACCESSED over time.
Fixes: d069cb4373fe ("powerpc/8xx: Don't touch ACCESSED when no SWAP.")
Fixes: 5f356497c384 ("powerpc/8xx: remove unused _PAGE_WRITETHRU")
Fixes: e0a8e0d90a9f ("powerpc/8xx: Handle PAGE_USER via APG bits")
Fixes: 5b2753fc3e8a ("powerpc/8xx: Implementation of PAGE_EXEC")
Fixes: a891c43b97d3 ("powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/af834e8a0f1fa97bfae65664950f0984a70c4750.16024928…
---
arch/powerpc/kernel/head_8xx.S | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 9fd2ff28b8ff..dc99258f2e8c 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -356,11 +356,9 @@ _ENTRY(ITLBMiss_cmp)
/* Load the MI_TWC with the attributes for this "segment." */
mtspr SPRN_MI_TWC, r11 /* Set segment attributes */
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-7, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
li r11, RPN_PATTERN | 0x200
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 20 and 23 must be clear.
@@ -482,11 +480,9 @@ _ENTRY(DTLBMiss_jmp)
* r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
* r10 = (r10 & ~PRESENT) | r11;
*/
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-7, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 24, 25, 26, and 27 must be
* set. All other Linux PTE bits control the behavior
--
2.25.0
[This is backport for 4.14 of 29daf869cbab69088fe1755d9dd224e99ba78b56]
The kernel expects pte_young() to work regardless of CONFIG_SWAP.
Make sure a minor fault is taken to set _PAGE_ACCESSED when it
is not already set, regardless of the selection of CONFIG_SWAP.
This adds at least 3 instructions to the TLB miss exception
handlers fast path. Following patch will reduce this overhead.
Also update the rotation instruction to the correct number of bits
to reflect all changes done to _PAGE_ACCESSED over time.
Fixes: d069cb4373fe ("powerpc/8xx: Don't touch ACCESSED when no SWAP.")
Fixes: 5f356497c384 ("powerpc/8xx: remove unused _PAGE_WRITETHRU")
Fixes: e0a8e0d90a9f ("powerpc/8xx: Handle PAGE_USER via APG bits")
Fixes: 5b2753fc3e8a ("powerpc/8xx: Implementation of PAGE_EXEC")
Fixes: a891c43b97d3 ("powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/af834e8a0f1fa97bfae65664950f0984a70c4750.16024928…
---
arch/powerpc/kernel/head_8xx.S | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 2d0d89e2cb9a..43884af0e35c 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -398,11 +398,9 @@ _ENTRY(ITLBMiss_cmp)
#if defined (CONFIG_HUGETLB_PAGE) && defined (CONFIG_PPC_4K_PAGES)
rlwimi r10, r11, 1, MI_SPS16K
#endif
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-11, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
li r11, RPN_PATTERN
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 20-23 and 28 must be clear.
@@ -528,11 +526,9 @@ _ENTRY(DTLBMiss_jmp)
* r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
* r10 = (r10 & ~PRESENT) | r11;
*/
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-11, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 22 and 28 must be clear.
* Software indicator bits 24, 25, 26, and 27 must be
--
2.25.0
The ethernet driver may allocate skb (and skb->data) via napi_alloc_skb().
This ends up in page_frag_alloc(), which allocates skb->data from
page_frag_cache->va.
Under memory pressure, page_frag_cache->va may be allocated as a
pfmemalloc page. As a result, skb->pfmemalloc is always true because
skb->data comes from page_frag_cache->va. The skb will be dropped if the
sock (receiver) does not have SOCK_MEMALLOC. This is expected behaviour
under memory pressure.
However, once the kernel is no longer under memory pressure (suppose a large
number of memory pages have just been reclaimed), page_frag_alloc() may still
re-use the prior pfmemalloc page_frag_cache->va to allocate skb->data. As a
result, skb->pfmemalloc is always true unless page_frag_cache->va is
re-allocated, even if the kernel is no longer under memory pressure.
Here is how the kernel runs into the issue.
1. The kernel is under memory pressure and the allocation of
PAGE_FRAG_CACHE_MAX_ORDER in __page_frag_cache_refill() fails. Instead,
a pfmemalloc page is allocated for page_frag_cache->va.
2. All skb->data from page_frag_cache->va (pfmemalloc) will have
skb->pfmemalloc=true. The skb will always be dropped by a sock without
SOCK_MEMALLOC. This is expected behaviour.
3. Suppose a large number of pages are reclaimed and the kernel is no longer
under memory pressure. We expect that skb->pfmemalloc drops will not happen.
4. Unfortunately, page_frag_alloc() does not proactively re-allocate
page_frag_cache->va and will always re-use the prior pfmemalloc page.
skb->pfmemalloc is always true even though the kernel is no longer under
memory pressure.
Fix this by freeing and re-allocating the page instead of recycling it.
References: https://lore.kernel.org/lkml/20201103193239.1807-1-dongli.zhang@oracle.com/
References: https://lore.kernel.org/linux-mm/20201105042140.5253-1-willy@infradead.org/
Suggested-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Aruna Ramakrishna <aruna.ramakrishna(a)oracle.com>
Cc: Bert Barbe <bert.barbe(a)oracle.com>
Cc: Rama Nichanamatlu <rama.nichanamatlu(a)oracle.com>
Cc: Venkat Venkatsubra <venkat.x.venkatsubra(a)oracle.com>
Cc: Manjunath Patil <manjunath.b.patil(a)oracle.com>
Cc: Joe Jin <joe.jin(a)oracle.com>
Cc: SRINIVAS <srinivas.eeda(a)oracle.com>
Cc: stable(a)vger.kernel.org
Fixes: 79930f5892e ("net: do not deplete pfmemalloc reserve")
Signed-off-by: Dongli Zhang <dongli.zhang(a)oracle.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
---
Changed since v1:
- change author from Matthew to Dongli
- Add references to all prior discussions
- Add more details to commit message
Changed since v2:
- add unlikely (suggested by Eric Dumazet)
mm/page_alloc.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 23f5066bd4a5..91129ce75ed4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5103,6 +5103,11 @@ void *page_frag_alloc(struct page_frag_cache *nc,
if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
goto refill;
+ if (unlikely(nc->pfmemalloc)) {
+ free_the_page(page, compound_order(page));
+ goto refill;
+ }
+
#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
/* if size can vary use size else just use PAGE_SIZE */
size = nc->size;
--
2.17.1
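The recycling problem described in the page_frag_alloc() patch above can be illustrated with a toy model. Everything here is hypothetical (`struct toy_frag_cache`, `toy_refill()`, `toy_frag_alloc()`, a fixed four fragments per page), and the model ignores reference counting entirely; it only shows that a pfmemalloc-backed page is replaced rather than recycled once it is used up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; not the kernel's struct page_frag_cache. */
struct toy_frag_cache {
	bool has_page;    /* a backing page is currently cached */
	bool pfmemalloc;  /* the cached page came from the emergency reserves */
	int frags_left;   /* fragments still available in the cached page */
	int refills;      /* number of fresh pages obtained so far */
};

#define TOY_FRAGS_PER_PAGE 4

/* Get a fresh page; it is a pfmemalloc page only while under pressure. */
static void toy_refill(struct toy_frag_cache *nc, bool under_pressure)
{
	nc->has_page = true;
	nc->pfmemalloc = under_pressure;
	nc->frags_left = TOY_FRAGS_PER_PAGE;
	nc->refills++;
}

/*
 * Hand out one fragment and report whether it is pfmemalloc-backed.
 * When the page is used up, a pfmemalloc page is replaced, not recycled.
 */
static bool toy_frag_alloc(struct toy_frag_cache *nc, bool under_pressure)
{
	assert(nc);
	if (!nc->has_page) {
		toy_refill(nc, under_pressure);
	} else if (nc->frags_left == 0) {
		if (nc->pfmemalloc)
			toy_refill(nc, under_pressure);       /* drop and re-allocate */
		else
			nc->frags_left = TOY_FRAGS_PER_PAGE;  /* recycle the same page */
	}
	nc->frags_left--;
	return nc->pfmemalloc;
}
```

Without the `pfmemalloc` branch, the model would keep returning `true` after pressure has cleared, because the same page is recycled indefinitely; with it, the first allocation after the page is exhausted picks up a normal page.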
Hi Greg, Sasha,
This was missing in 4.9-stable. First patch is only needed so that
applying the second patch becomes easy. If it's not accepted I can manually
backport it. Please add it to your queue.
--
Regards
Sudip
Hi Greg, Sasha,
This was missing in 4.14-stable. First patch is only needed so that
applying the second patch becomes easy. If it's not accepted I can manually
backport it. Please add it to your queue.
--
Regards
Sudip
Please CC me in any replies as I am not subscribed to the list.
This is a legitimate request as I often need more than two days
especially on busy work days or weekends.
On Tue, 2020-11-17 at 09:01 +0100, Pavel Machek wrote:
> On Sat 2020-11-14 17:40:36, Hussam Al-Tayeb wrote:
> > Hello. I would like to suggest lengthening the review period for stable
> > releases from 48 hours to 7 days.
> > The rationale is that 48 hours is not enough for people to test those
> > stable releases and make sure there are no regressions for particular
> > workflows.
>
> You should probably cc stable list and Greg with this.
>
> And yes, I believe that would be a good idea.
>
> Plus the period is very often shorter than advertised, which might
> also be good to fix.
>
> Best regards,
> pavel
>
This is the start of the stable review cycle for the 4.9.244 release.
There are 78 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.244-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.244-rc1
Boris Protopopov <pboris(a)amazon.com>
Convert trailing spaces and periods in path components
Eric Biggers <ebiggers(a)google.com>
ext4: fix leaking sysfs kobject after failed mount
Matteo Croce <mcroce(a)microsoft.com>
reboot: fix overflow parsing reboot cpu number
Matteo Croce <mcroce(a)microsoft.com>
Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Jiri Olsa <jolsa(a)redhat.com>
perf/core: Fix race in the perf_mmap_close() function
Juergen Gross <jgross(a)suse.com>
xen/events: block rogue events for some time
Juergen Gross <jgross(a)suse.com>
xen/events: defer eoi in case of excessive number of events
Juergen Gross <jgross(a)suse.com>
xen/events: use a common cpu hotplug hook for event channels
Juergen Gross <jgross(a)suse.com>
xen/events: switch user event channels to lateeoi model
Juergen Gross <jgross(a)suse.com>
xen/pciback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/scsiback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/netback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/blkback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/events: add a new "late EOI" evtchn framework
Juergen Gross <jgross(a)suse.com>
xen/events: fix race in evtchn_fifo_unmask()
Juergen Gross <jgross(a)suse.com>
xen/events: add a proper barrier to 2-level uevent unmasking
Juergen Gross <jgross(a)suse.com>
xen/events: avoid removing an event channel while handling it
kiyin(尹亮) <kiyin(a)tencent.com>
perf/core: Fix a memory leak in perf_event_parse_addr_filter()
Mathieu Poirier <mathieu.poirier(a)linaro.org>
perf/core: Fix crash when using HW tracing kernel filters
Song Liu <songliubraving(a)fb.com>
perf/core: Fix bad use of igrab()
Anand K Mistry <amistry(a)google.com>
x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP
George Spelvin <lkml(a)sdf.org>
random32: make prandom_u32() output unpredictable
Mao Wenan <wenan.mao(a)linux.alibaba.com>
net: Update window_clamp if SOCK_RCVBUF is set
Martin Schiller <ms(a)dev.tdt.de>
net/x25: Fix null-ptr-deref in x25_connect
Ursula Braun <ubraun(a)linux.ibm.com>
net/af_iucv: fix null pointer dereference on shutdown
Oliver Herms <oliver.peter.herms(a)gmail.com>
IPv6: Set SIT tunnel hard_header_len to zero
Stefano Stabellini <stefano.stabellini(a)xilinx.com>
swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"
Coiby Xu <coiby.xu(a)gmail.com>
pinctrl: amd: fix incorrect way to disable debounce filter
Coiby Xu <coiby.xu(a)gmail.com>
pinctrl: amd: use higher precision for 512 RtcClk
Thomas Zimmermann <tzimmermann(a)suse.de>
drm/gma500: Fix out-of-bounds access to struct drm_device.vblank[]
Al Viro <viro(a)zeniv.linux.org.uk>
don't dump the threads that had been already exiting when zapped.
Wengang Wang <wen.gang.wang(a)oracle.com>
ocfs2: initialize ip_next_orphan
Alexander Usyskin <alexander.usyskin(a)intel.com>
mei: protect mei_cl_mtu from null dereference
Chris Brandt <chris.brandt(a)renesas.com>
usb: cdc-acm: Add DISABLE_ECHO for Renesas USB Download mode
Joseph Qi <joseph.qi(a)linux.alibaba.com>
ext4: unlock xattr_sem properly in ext4_inline_data_truncate()
Kaixu Xia <kaixuxia(a)tencent.com>
ext4: correctly report "not supported" for {usr,grp}jquota when !CONFIG_QUOTA
Peter Zijlstra <peterz(a)infradead.org>
perf: Fix get_recursion_context()
Wang Hai <wanghai38(a)huawei.com>
cosa: Add missing kfree in error path of cosa_write
Evan Nimmo <evan.nimmo(a)alliedtelesis.co.nz>
of/address: Fix of_node memory leak in of_dma_is_coherent
Christoph Hellwig <hch(a)lst.de>
xfs: fix a missing unlock on error in xfs_fs_map_blocks
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: fix rmap key and record comparison functions
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: fix flags argument to rmap lookup when converting shared file rmaps
Billy Tsai <billy_tsai(a)aspeedtech.com>
pinctrl: aspeed: Fix GPI only function problem.
Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
iommu/amd: Increase interrupt remapping table limit to 512 entries
Hannes Reinecke <hare(a)suse.de>
scsi: scsi_dh_alua: Avoid crash during alua_bus_detach()
Ye Bin <yebin10(a)huawei.com>
cfg80211: regulatory: Fix inconsistent format argument
Johannes Berg <johannes.berg(a)intel.com>
mac80211: always wind down STA state
Johannes Berg <johannes.berg(a)intel.com>
mac80211: fix use of skb payload instead of header
Evan Quan <evan.quan(a)amd.com>
drm/amdgpu: perform srbm soft reset always on SDMA resume
Keita Suzuki <keitasuzuki.park(a)sslab.ics.keio.ac.jp>
scsi: hpsa: Fix memory leak in hpsa_init_one()
Bob Peterson <rpeterso(a)redhat.com>
gfs2: check for live vs. read-only file system in gfs2_fitrim
Bob Peterson <rpeterso(a)redhat.com>
gfs2: Free rd_bits later in gfs2_clear_rgrpd to fix use-after-free
Evgeny Novikov <novikov(a)ispras.ru>
usb: gadget: goku_udc: fix potential crashes in probe
Masashi Honma <masashi.honma(a)gmail.com>
ath9k_htc: Use appropriate rs_datalen type
Mark Gray <mark.d.gray(a)redhat.com>
geneve: add transport ports in route lookup for geneve
Martyna Szapar <martyna.szapar(a)intel.com>
i40e: Memory leak in i40e_config_iwarp_qvlist
Martyna Szapar <martyna.szapar(a)intel.com>
i40e: Fix of memory leak and integer truncation in i40e_virtchnl.c
Grzegorz Siwik <grzegorz.siwik(a)intel.com>
i40e: Wrong truncation from u16 to u8
Sergey Nemov <sergey.nemov(a)intel.com>
i40e: add num_vectors checker in iwarp handler
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
i40e: Fix a potential NULL pointer dereference
Will Deacon <will(a)kernel.org>
pinctrl: devicetree: Avoid taking direct reference to device name string
Filipe Manana <fdmanana(a)suse.com>
Btrfs: fix missing error return if writeback for extent buffer never started
Brian Foster <bfoster(a)redhat.com>
xfs: flush new eof page on truncate to avoid post-eof corruption
Stephane Grosjean <s.grosjean(a)peak-system.com>
can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
Dan Carpenter <dan.carpenter(a)oracle.com>
can: peak_usb: add range checking in decode operations
Oleksij Rempel <o.rempel(a)pengutronix.de>
can: can_create_echo_skb(): fix echo skb generation: always use skb_clone()
Oliver Hartkopp <socketcan(a)hartkopp.net>
can: dev: __can_get_echo_skb(): fix real payload length return value for RTR frames
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: dev: can_get_echo_skb(): prevent call to kfree_skb() in hard IRQ context
Dan Carpenter <dan.carpenter(a)oracle.com>
ALSA: hda: prevent undefined shift in snd_hdac_ext_bus_get_link()
Jiri Olsa <jolsa(a)kernel.org>
perf tools: Add missing swap for ino_generation
zhuoliang zhang <zhuoliang.zhang(a)mediatek.com>
net: xfrm: fix a race condition during allocing spi
Marc Zyngier <maz(a)kernel.org>
genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY
Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
btrfs: reschedule when cloning lots of extents
Zeng Tao <prime.zeng(a)hisilicon.com>
time: Prevent undefined behaviour in timespec64_to_ns()
Shijie Luo <luoshijie1(a)huawei.com>
mm: mempolicy: fix potential pte_unmap_unlock pte error
Alexander Aring <aahringo(a)redhat.com>
gfs2: Wake up when sd_glock_disposal becomes zero
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
ring-buffer: Fix recursion protection transitions between interrupt context
Michał Mirosław <mirq-linux(a)rere.qmqm.pl>
regulator: defer probe when trying to get voltage from unresolved supply
-------------
Diffstat:
Documentation/kernel-parameters.txt | 8 +
Makefile | 4 +-
arch/x86/events/intel/pt.c | 4 +-
arch/x86/kernel/cpu/bugs.c | 52 ++-
drivers/block/xen-blkback/blkback.c | 22 +-
drivers/block/xen-blkback/xenbus.c | 5 +-
drivers/char/random.c | 1 -
drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 27 +-
drivers/gpu/drm/gma500/psb_irq.c | 34 +-
drivers/iommu/amd_iommu_types.h | 6 +-
drivers/misc/mei/client.h | 4 +-
drivers/net/can/dev.c | 14 +-
drivers/net/can/usb/peak_usb/pcan_usb_core.c | 51 ++-
drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 48 ++-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 32 +-
drivers/net/geneve.c | 36 +-
drivers/net/wan/cosa.c | 1 +
drivers/net/wireless/ath/ath9k/htc_drv_txrx.c | 2 +-
drivers/net/xen-netback/common.h | 15 +
drivers/net/xen-netback/interface.c | 61 ++-
drivers/net/xen-netback/netback.c | 11 +-
drivers/net/xen-netback/rx.c | 13 +-
drivers/of/address.c | 4 +-
drivers/pinctrl/aspeed/pinctrl-aspeed.c | 7 +-
drivers/pinctrl/devicetree.c | 26 +-
drivers/pinctrl/pinctrl-amd.c | 6 +-
drivers/regulator/core.c | 2 +
drivers/scsi/device_handler/scsi_dh_alua.c | 9 +-
drivers/scsi/hpsa.c | 4 +-
drivers/usb/class/cdc-acm.c | 9 +
drivers/usb/gadget/udc/goku_udc.c | 2 +-
drivers/xen/events/events_2l.c | 9 +-
drivers/xen/events/events_base.c | 422 +++++++++++++++++--
drivers/xen/events/events_fifo.c | 82 ++--
drivers/xen/events/events_internal.h | 20 +-
drivers/xen/evtchn.c | 7 +-
drivers/xen/xen-pciback/pci_stub.c | 14 +-
drivers/xen/xen-pciback/pciback.h | 12 +-
drivers/xen/xen-pciback/pciback_ops.c | 48 ++-
drivers/xen/xen-pciback/xenbus.c | 2 +-
drivers/xen/xen-scsiback.c | 23 +-
fs/btrfs/extent_io.c | 4 +
fs/btrfs/ioctl.c | 2 +
fs/cifs/cifs_unicode.c | 8 +-
fs/ext4/inline.c | 1 +
fs/ext4/super.c | 5 +-
fs/gfs2/glock.c | 3 +-
fs/gfs2/rgrp.c | 5 +-
fs/ocfs2/super.c | 1 +
fs/xfs/libxfs/xfs_rmap.c | 2 +-
fs/xfs/libxfs/xfs_rmap_btree.c | 16 +-
fs/xfs/xfs_iops.c | 10 +
fs/xfs/xfs_pnfs.c | 2 +-
include/linux/can/skb.h | 20 +-
include/linux/perf_event.h | 2 +-
include/linux/prandom.h | 36 +-
include/linux/time64.h | 4 +
include/xen/events.h | 29 +-
kernel/events/core.c | 42 +-
kernel/events/internal.h | 2 +-
kernel/exit.c | 5 +-
kernel/irq/Kconfig | 1 +
kernel/reboot.c | 28 +-
kernel/time/timer.c | 7 -
kernel/trace/ring_buffer.c | 54 ++-
lib/random32.c | 462 +++++++++++++--------
lib/swiotlb.c | 6 +-
mm/mempolicy.c | 6 +-
net/ipv4/syncookies.c | 9 +-
net/ipv6/sit.c | 2 -
net/ipv6/syncookies.c | 10 +-
net/iucv/af_iucv.c | 3 +-
net/mac80211/sta_info.c | 18 +
net/mac80211/tx.c | 35 +-
net/wireless/reg.c | 2 +-
net/x25/af_x25.c | 2 +-
net/xfrm/xfrm_state.c | 8 +-
sound/hda/ext/hdac_ext_controller.c | 2 +
tools/perf/util/session.c | 1 +
79 files changed, 1465 insertions(+), 549 deletions(-)
This is the start of the stable review cycle for the 4.4.244 release.
There are 64 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.244-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.4.244-rc1
Boris Protopopov <pboris(a)amazon.com>
Convert trailing spaces and periods in path components
Eric Biggers <ebiggers(a)google.com>
ext4: fix leaking sysfs kobject after failed mount
Matteo Croce <mcroce(a)microsoft.com>
reboot: fix overflow parsing reboot cpu number
Matteo Croce <mcroce(a)microsoft.com>
Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Jiri Olsa <jolsa(a)redhat.com>
perf/core: Fix race in the perf_mmap_close() function
Juergen Gross <jgross(a)suse.com>
xen/events: block rogue events for some time
Juergen Gross <jgross(a)suse.com>
xen/events: defer eoi in case of excessive number of events
Juergen Gross <jgross(a)suse.com>
xen/events: use a common cpu hotplug hook for event channels
Juergen Gross <jgross(a)suse.com>
xen/events: switch user event channels to lateeoi model
Juergen Gross <jgross(a)suse.com>
xen/pciback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/scsiback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/netback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/blkback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/events: add a new "late EOI" evtchn framework
Juergen Gross <jgross(a)suse.com>
xen/events: fix race in evtchn_fifo_unmask()
Juergen Gross <jgross(a)suse.com>
xen/events: add a proper barrier to 2-level uevent unmasking
Juergen Gross <jgross(a)suse.com>
xen/events: avoid removing an event channel while handling it
Anand K Mistry <amistry(a)google.com>
x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP
George Spelvin <lkml(a)sdf.org>
random32: make prandom_u32() output unpredictable
Mao Wenan <wenan.mao(a)linux.alibaba.com>
net: Update window_clamp if SOCK_RCVBUF is set
Martin Schiller <ms(a)dev.tdt.de>
net/x25: Fix null-ptr-deref in x25_connect
Ursula Braun <ubraun(a)linux.ibm.com>
net/af_iucv: fix null pointer dereference on shutdown
Oliver Herms <oliver.peter.herms(a)gmail.com>
IPv6: Set SIT tunnel hard_header_len to zero
Stefano Stabellini <stefano.stabellini(a)xilinx.com>
swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"
Coiby Xu <coiby.xu(a)gmail.com>
pinctrl: amd: fix incorrect way to disable debounce filter
Coiby Xu <coiby.xu(a)gmail.com>
pinctrl: amd: use higher precision for 512 RtcClk
Thomas Zimmermann <tzimmermann(a)suse.de>
drm/gma500: Fix out-of-bounds access to struct drm_device.vblank[]
Al Viro <viro(a)zeniv.linux.org.uk>
don't dump the threads that had been already exiting when zapped.
Wengang Wang <wen.gang.wang(a)oracle.com>
ocfs2: initialize ip_next_orphan
Alexander Usyskin <alexander.usyskin(a)intel.com>
mei: protect mei_cl_mtu from null dereference
Chris Brandt <chris.brandt(a)renesas.com>
usb: cdc-acm: Add DISABLE_ECHO for Renesas USB Download mode
Joseph Qi <joseph.qi(a)linux.alibaba.com>
ext4: unlock xattr_sem properly in ext4_inline_data_truncate()
Kaixu Xia <kaixuxia(a)tencent.com>
ext4: correctly report "not supported" for {usr,grp}jquota when !CONFIG_QUOTA
Peter Zijlstra <peterz(a)infradead.org>
perf: Fix get_recursion_context()
Wang Hai <wanghai38(a)huawei.com>
cosa: Add missing kfree in error path of cosa_write
Evan Nimmo <evan.nimmo(a)alliedtelesis.co.nz>
of/address: Fix of_node memory leak in of_dma_is_coherent
Christoph Hellwig <hch(a)lst.de>
xfs: fix a missing unlock on error in xfs_fs_map_blocks
Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
iommu/amd: Increase interrupt remapping table limit to 512 entries
Ye Bin <yebin10(a)huawei.com>
cfg80211: regulatory: Fix inconsistent format argument
Johannes Berg <johannes.berg(a)intel.com>
mac80211: always wind down STA state
Johannes Berg <johannes.berg(a)intel.com>
mac80211: fix use of skb payload instead of header
Evan Quan <evan.quan(a)amd.com>
drm/amdgpu: perform srbm soft reset always on SDMA resume
Bob Peterson <rpeterso(a)redhat.com>
gfs2: check for live vs. read-only file system in gfs2_fitrim
Bob Peterson <rpeterso(a)redhat.com>
gfs2: Free rd_bits later in gfs2_clear_rgrpd to fix use-after-free
Evgeny Novikov <novikov(a)ispras.ru>
usb: gadget: goku_udc: fix potential crashes in probe
Masashi Honma <masashi.honma(a)gmail.com>
ath9k_htc: Use appropriate rs_datalen type
Mark Gray <mark.d.gray(a)redhat.com>
geneve: add transport ports in route lookup for geneve
Martyna Szapar <martyna.szapar(a)intel.com>
i40e: Fix of memory leak and integer truncation in i40e_virtchnl.c
Grzegorz Siwik <grzegorz.siwik(a)intel.com>
i40e: Wrong truncation from u16 to u8
Will Deacon <will(a)kernel.org>
pinctrl: devicetree: Avoid taking direct reference to device name string
Filipe Manana <fdmanana(a)suse.com>
Btrfs: fix missing error return if writeback for extent buffer never started
Stephane Grosjean <s.grosjean(a)peak-system.com>
can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
Dan Carpenter <dan.carpenter(a)oracle.com>
can: peak_usb: add range checking in decode operations
Oleksij Rempel <o.rempel(a)pengutronix.de>
can: can_create_echo_skb(): fix echo skb generation: always use skb_clone()
Oliver Hartkopp <socketcan(a)hartkopp.net>
can: dev: __can_get_echo_skb(): fix real payload length return value for RTR frames
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: dev: can_get_echo_skb(): prevent call to kfree_skb() in hard IRQ context
Dan Carpenter <dan.carpenter(a)oracle.com>
ALSA: hda: prevent undefined shift in snd_hdac_ext_bus_get_link()
Jiri Olsa <jolsa(a)kernel.org>
perf tools: Add missing swap for ino_generation
zhuoliang zhang <zhuoliang.zhang(a)mediatek.com>
net: xfrm: fix a race condition during allocing spi
Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
btrfs: reschedule when cloning lots of extents
Zeng Tao <prime.zeng(a)hisilicon.com>
time: Prevent undefined behaviour in timespec64_to_ns()
Shijie Luo <luoshijie1(a)huawei.com>
mm: mempolicy: fix potential pte_unmap_unlock pte error
Alexander Aring <aahringo(a)redhat.com>
gfs2: Wake up when sd_glock_disposal becomes zero
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
ring-buffer: Fix recursion protection transitions between interrupt context
-------------
Diffstat:
Documentation/kernel-parameters.txt | 8 +
Makefile | 4 +-
arch/x86/kernel/cpu/bugs.c | 52 ++-
drivers/block/xen-blkback/blkback.c | 22 +-
drivers/block/xen-blkback/xenbus.c | 5 +-
drivers/char/random.c | 2 -
drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 27 +-
drivers/gpu/drm/gma500/psb_irq.c | 34 +-
drivers/iommu/amd_iommu_types.h | 6 +-
drivers/misc/mei/client.h | 4 +-
drivers/net/can/dev.c | 14 +-
drivers/net/can/usb/peak_usb/pcan_usb_core.c | 51 ++-
drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 48 ++-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 4 +-
drivers/net/geneve.c | 36 +-
drivers/net/wan/cosa.c | 1 +
drivers/net/wireless/ath/ath9k/htc_drv_txrx.c | 2 +-
drivers/net/xen-netback/common.h | 39 ++
drivers/net/xen-netback/interface.c | 59 ++-
drivers/net/xen-netback/netback.c | 17 +-
drivers/of/address.c | 4 +-
drivers/pinctrl/devicetree.c | 26 +-
drivers/pinctrl/pinctrl-amd.c | 6 +-
drivers/usb/class/cdc-acm.c | 9 +
drivers/usb/gadget/udc/goku_udc.c | 2 +-
drivers/xen/events/events_2l.c | 9 +-
drivers/xen/events/events_base.c | 444 ++++++++++++++++++--
drivers/xen/events/events_fifo.c | 102 ++---
drivers/xen/events/events_internal.h | 20 +-
drivers/xen/evtchn.c | 7 +-
drivers/xen/xen-pciback/pci_stub.c | 14 +-
drivers/xen/xen-pciback/pciback.h | 12 +-
drivers/xen/xen-pciback/pciback_ops.c | 48 ++-
drivers/xen/xen-pciback/xenbus.c | 2 +-
drivers/xen/xen-scsiback.c | 23 +-
fs/btrfs/extent_io.c | 4 +
fs/btrfs/ioctl.c | 2 +
fs/cifs/cifs_unicode.c | 8 +-
fs/ext4/inline.c | 1 +
fs/ext4/super.c | 5 +-
fs/gfs2/glock.c | 3 +-
fs/gfs2/rgrp.c | 5 +-
fs/ocfs2/super.c | 1 +
fs/xfs/xfs_pnfs.c | 2 +-
include/linux/can/skb.h | 20 +-
include/linux/prandom.h | 36 +-
include/linux/time64.h | 4 +
include/xen/events.h | 29 +-
kernel/events/core.c | 7 +-
kernel/events/internal.h | 2 +-
kernel/exit.c | 5 +-
kernel/reboot.c | 28 +-
kernel/time/timer.c | 7 -
kernel/trace/ring_buffer.c | 54 ++-
lib/random32.c | 463 +++++++++++++--------
lib/swiotlb.c | 6 +-
mm/mempolicy.c | 6 +-
net/ipv4/syncookies.c | 9 +-
net/ipv6/sit.c | 2 -
net/ipv6/syncookies.c | 10 +-
net/iucv/af_iucv.c | 3 +-
net/mac80211/sta_info.c | 18 +
net/mac80211/tx.c | 35 +-
net/wireless/reg.c | 2 +-
net/x25/af_x25.c | 2 +-
net/xfrm/xfrm_state.c | 8 +-
sound/hda/ext/hdac_ext_controller.c | 2 +
tools/perf/util/session.c | 1 +
68 files changed, 1431 insertions(+), 522 deletions(-)
DIR_INDEX has been introduced as a compat ext4 feature. That means that
even kernels / tools that don't understand the feature may modify the
filesystem. This works because for kernels not understanding indexed dir
format, internal htree nodes appear just as empty directory entries.
Index dir aware kernels then check the htree structure is still
consistent before using the data. This all worked reasonably well until
metadata checksums were introduced. The problem is that these
effectively made DIR_INDEX only ro-compatible because internal htree
nodes store checksums in a different place than normal directory blocks.
Thus any modification ignorant of DIR_INDEX (or just clearing
EXT4_INDEX_FL from the inode) will effectively cause a checksum mismatch
and trigger kernel errors. So we have to be more careful when dealing
with indexed directories on filesystems with checksumming enabled.
1) We just disallow loading any directory inodes with EXT4_INDEX_FL when
DIR_INDEX is not enabled. This is harsh but it should be very rare (it
means someone disabled DIR_INDEX on an existing filesystem and didn't run
e2fsck), e2fsck can fix the problem, and we don't want to answer the
difficult question: "Should we rather corrupt the directory more or
should we ignore that DIR_INDEX feature is not set?"
2) When we find out the htree structure is corrupted (but the filesystem and
the directory should in fact support htrees), we continue just ignoring htree
information for reading but we refuse to add new entries to the
directory to avoid corrupting it more.
CC: stable(a)vger.kernel.org
Fixes: dbe89444042a ("ext4: Calculate and verify checksums for htree nodes")
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/ext4/dir.c | 14 ++++++++------
fs/ext4/ext4.h | 5 ++++-
fs/ext4/inode.c | 13 +++++++++++++
fs/ext4/namei.c | 7 +++++++
4 files changed, 32 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index 9f00fc0bf21d..cb9ea593b544 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -129,12 +129,14 @@ static int ext4_readdir(struct file *file, struct dir_context *ctx)
if (err != ERR_BAD_DX_DIR) {
return err;
}
- /*
- * We don't set the inode dirty flag since it's not
- * critical that it get flushed back to the disk.
- */
- ext4_clear_inode_flag(file_inode(file),
- EXT4_INODE_INDEX);
+ /* Can we just clear INDEX flag to ignore htree information? */
+ if (!ext4_has_metadata_csum(sb)) {
+ /*
+ * We don't set the inode dirty flag since it's not
+ * critical that it gets flushed back to the disk.
+ */
+ ext4_clear_inode_flag(inode, EXT4_INODE_INDEX);
+ }
}
if (ext4_has_inline_data(inode)) {
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f8578caba40d..1fd6c1e2ce2a 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2482,8 +2482,11 @@ void ext4_insert_dentry(struct inode *inode,
struct ext4_filename *fname);
static inline void ext4_update_dx_flag(struct inode *inode)
{
- if (!ext4_has_feature_dir_index(inode->i_sb))
+ if (!ext4_has_feature_dir_index(inode->i_sb)) {
+ /* ext4_iget() should have caught this... */
+ WARN_ON_ONCE(ext4_has_feature_metadata_csum(inode->i_sb));
ext4_clear_inode_flag(inode, EXT4_INODE_INDEX);
+ }
}
static const unsigned char ext4_filetype_table[] = {
DT_UNKNOWN, DT_REG, DT_DIR, DT_CHR, DT_BLK, DT_FIFO, DT_SOCK, DT_LNK
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 629a25d999f0..d33135308c1b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4615,6 +4615,19 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
ret = -EFSCORRUPTED;
goto bad_inode;
}
+ /*
+ * If dir_index is not enabled but there's dir with INDEX flag set,
+ * we'd normally treat htree data as empty space. But with metadata
+ * checksumming that corrupts checksums so forbid that.
+ */
+ if (!ext4_has_feature_dir_index(sb) && ext4_has_metadata_csum(sb) &&
+ ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) {
+ ext4_error_inode(inode, function, line, 0,
+ "iget: Dir with htree data on filesystem "
+ "without dir_index feature.");
+ ret = -EFSCORRUPTED;
+ goto bad_inode;
+ }
ei->i_disksize = inode->i_size;
#ifdef CONFIG_QUOTA
ei->i_reserved_quota = 0;
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 1cb42d940784..deb9f7a02976 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2207,6 +2207,13 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
retval = ext4_dx_add_entry(handle, &fname, dir, inode);
if (!retval || (retval != ERR_BAD_DX_DIR))
goto out;
+ /* Can we just ignore htree data? */
+ if (ext4_has_metadata_csum(sb)) {
+ EXT4_ERROR_INODE(dir,
+ "Directory has corrupted htree index.");
+ retval = -EFSCORRUPTED;
+ goto out;
+ }
ext4_clear_inode_flag(dir, EXT4_INODE_INDEX);
dx_fallback++;
ext4_mark_inode_dirty(handle, dir);
--
2.16.4
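For readers following the ext4 patch above, the two rules from its
changelog can be modelled in plain userspace C. This is only a sketch
of the decision logic as described in the changelog, not the kernel
code: struct fs_model, enum dx_action and the model_*() helpers are
hypothetical names invented for illustration, and the real checks live
in __ext4_iget(), ext4_readdir() and ext4_add_entry().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct fs_model {
	bool has_dir_index;     /* DIR_INDEX feature enabled on the filesystem */
	bool has_metadata_csum; /* metadata checksumming enabled */
};

enum dx_action {
	DX_OK,               /* nothing special to do */
	DX_CLEAR_INDEX_FLAG, /* ignore htree data, fall back to linear handling */
	DX_REJECT_CORRUPTED  /* report corruption (-EFSCORRUPTED in the patch) */
};

/* Rule 1: loading a directory inode that may carry the INDEX flag. */
static enum dx_action model_iget_policy(const struct fs_model *fs,
					bool inode_index_flag)
{
	if (!fs->has_dir_index && fs->has_metadata_csum && inode_index_flag)
		return DX_REJECT_CORRUPTED;
	return DX_OK;
}

/* Rule 2, write side: the htree was found corrupted while adding an entry. */
static enum dx_action model_add_entry_on_bad_htree(const struct fs_model *fs)
{
	if (fs->has_metadata_csum)
		return DX_REJECT_CORRUPTED; /* refuse to corrupt it further */
	return DX_CLEAR_INDEX_FLAG;
}

/* Rule 2, read side: may readdir clear the INDEX flag after a bad htree? */
static bool model_readdir_may_clear_index(const struct fs_model *fs)
{
	return !fs->has_metadata_csum;
}
```

In this model, a filesystem with checksums but without DIR_INDEX
rejects an indexed directory at load time, while a filesystem without
checksums keeps the old behaviour of quietly dropping the INDEX flag.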
An active ref_node can always be found in ctx->files_data, so it's much
safer to get it this way instead of poking into files_data->ref_list.
Cc: stable(a)vger.kernel.org # v5.7+
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
---
fs/io_uring.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index b205c1df3f74..5cb194ca4fce 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6974,9 +6974,7 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
return -ENXIO;
spin_lock(&data->lock);
- if (!list_empty(&data->ref_list))
- ref_node = list_first_entry(&data->ref_list,
- struct fixed_file_ref_node, node);
+ ref_node = data->node;
spin_unlock(&data->lock);
if (ref_node)
percpu_ref_kill(&ref_node->refs);
--
2.24.0
Since commit 086d08725d34 ("remoteproc: create vdev subdevice with
specific dma memory pool"), every remoteproc has a DMA subdevice
("remoteprocX#vdevYbuffer") for each virtio device, which inherits
DMA capabilities from the corresponding platform device. This made it
possible to associate different DMA pools with each vdev, and required
virtio drivers to perform DMA operations with the parent device
(vdev->dev.parent) instead of the grandparent (vdev->dev.parent->parent).
virtio_rpmsg_bus was already changed in the same merge cycle with
commit d999b622fcfb ("rpmsg: virtio: allocate buffer from parent"),
but virtio_console was not. In fact, operations using the grandparent
worked fine while the grandparent was the platform device, but since
commit c774ad010873 ("remoteproc: Fix and restore the parenting
hierarchy for vdev") this was changed, and now the grandparent device
is the remoteproc device without any DMA capabilities.
So, starting with v5.8-rc1 the following warning is observed:
[ 2.483925] ------------[ cut here ]------------
[ 2.489148] WARNING: CPU: 3 PID: 101 at kernel/dma/mapping.c:427 0x80e7eee8
[ 2.489152] Modules linked in: virtio_console(+)
[ 2.503737] virtio_rpmsg_bus rpmsg_core
[ 2.508903]
[ 2.528898] <Other modules, stack and call trace here>
[ 2.913043]
[ 2.914907] ---[ end trace 93ac8746beab612c ]---
[ 2.920102] virtio-ports vport1p0: Error allocating inbufs
kernel/dma/mapping.c:427 is:
WARN_ON_ONCE(!dev->coherent_dma_mask);
obviously because the grandparent now is remoteproc dev without any
DMA caps:
[ 3.104943] Parent: remoteproc0#vdev1buffer, grandparent: remoteproc0
Fix this the same way as it was fixed for virtio_rpmsg_bus, using just the
parent device (vdev->dev.parent, "remoteprocX#vdevYbuffer") for DMA
operations.
This also makes it possible to reserve DMA pools/buffers for rproc serial
via Device Tree.
Fixes: c774ad010873 ("remoteproc: Fix and restore the parenting hierarchy for vdev")
Cc: stable(a)vger.kernel.org # 5.1+
Signed-off-by: Alexander Lobakin <alobakin(a)pm.me>
---
drivers/char/virtio_console.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index a2da8f768b94..1836cc56e357 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -435,12 +435,12 @@ static struct port_buffer *alloc_buf(struct virtio_device *vdev, size_t buf_size
/*
* Allocate DMA memory from ancestor. When a virtio
* device is created by remoteproc, the DMA memory is
- * associated with the grandparent device:
- * vdev => rproc => platform-dev.
+ * associated with the parent device:
+ * virtioY => remoteprocX#vdevYbuffer.
*/
- if (!vdev->dev.parent || !vdev->dev.parent->parent)
+ buf->dev = vdev->dev.parent;
+ if (!buf->dev)
goto free_buf;
- buf->dev = vdev->dev.parent->parent;
/* Increase device refcnt to avoid freeing it */
get_device(buf->dev);
--
2.29.2
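The device hierarchy issue described in the virtio_console patch above
can be illustrated with a small userspace model. This is a sketch under
stated assumptions, not the driver code: struct model_device, the
dma_dev_old()/dma_dev_new() helpers and the mask value used in the
example are hypothetical; only the device names and the "a zero
coherent_dma_mask triggers the warning" condition come from the
changelog.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct device; not a kernel type. */
struct model_device {
	const char *name;
	struct model_device *parent;
	uint64_t coherent_dma_mask; /* 0 models "no DMA capabilities" */
};

/* Old behaviour: take the DMA device from the grandparent. */
static struct model_device *dma_dev_old(struct model_device *vdev)
{
	if (!vdev->parent || !vdev->parent->parent)
		return NULL;
	return vdev->parent->parent;
}

/* New behaviour: take the DMA device from the parent (may be NULL). */
static struct model_device *dma_dev_new(struct model_device *vdev)
{
	return vdev->parent;
}

/* Mirrors the spirit of WARN_ON_ONCE(!dev->coherent_dma_mask). */
static int model_can_alloc_coherent(const struct model_device *dev)
{
	return dev != NULL && dev->coherent_dma_mask != 0;
}
```

With a chain virtio1 => remoteproc0#vdev1buffer => remoteproc0, where
only the vdev buffer device has a non-zero mask, dma_dev_old() picks
remoteproc0 and the allocation check fails, while dma_dev_new() picks
the vdev buffer device and the check passes.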
This is the start of the stable review cycle for the 4.14.207 release.
There are 85 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.207-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.207-rc1
Boris Protopopov <pboris(a)amazon.com>
Convert trailing spaces and periods in path components
Matteo Croce <mcroce(a)microsoft.com>
reboot: fix overflow parsing reboot cpu number
Matteo Croce <mcroce(a)microsoft.com>
Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Jiri Olsa <jolsa(a)redhat.com>
perf/core: Fix race in the perf_mmap_close() function
Juergen Gross <jgross(a)suse.com>
xen/events: block rogue events for some time
Juergen Gross <jgross(a)suse.com>
xen/events: defer eoi in case of excessive number of events
Juergen Gross <jgross(a)suse.com>
xen/events: use a common cpu hotplug hook for event channels
Juergen Gross <jgross(a)suse.com>
xen/events: switch user event channels to lateeoi model
Juergen Gross <jgross(a)suse.com>
xen/pciback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/pvcallsback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/scsiback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/netback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/blkback: use lateeoi irq binding
Juergen Gross <jgross(a)suse.com>
xen/events: add a new "late EOI" evtchn framework
Juergen Gross <jgross(a)suse.com>
xen/events: fix race in evtchn_fifo_unmask()
Juergen Gross <jgross(a)suse.com>
xen/events: add a proper barrier to 2-level uevent unmasking
Juergen Gross <jgross(a)suse.com>
xen/events: avoid removing an event channel while handling it
kiyin(尹亮) <kiyin(a)tencent.com>
perf/core: Fix a memory leak in perf_event_parse_addr_filter()
Mathieu Poirier <mathieu.poirier(a)linaro.org>
perf/core: Fix crash when using HW tracing kernel filters
Song Liu <songliubraving(a)fb.com>
perf/core: Fix bad use of igrab()
Anand K Mistry <amistry(a)google.com>
x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP
George Spelvin <lkml(a)sdf.org>
random32: make prandom_u32() output unpredictable
Mao Wenan <wenan.mao(a)linux.alibaba.com>
net: Update window_clamp if SOCK_RCVBUF is set
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: fix potential skb double free in an error path
Martin Willi <martin(a)strongswan.org>
vrf: Fix fast path output packet handling with async Netfilter rules
Martin Schiller <ms(a)dev.tdt.de>
net/x25: Fix null-ptr-deref in x25_connect
Ursula Braun <ubraun(a)linux.ibm.com>
net/af_iucv: fix null pointer dereference on shutdown
Oliver Herms <oliver.peter.herms(a)gmail.com>
IPv6: Set SIT tunnel hard_header_len to zero
Stefano Stabellini <stefano.stabellini(a)xilinx.com>
swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"
Coiby Xu <coiby.xu(a)gmail.com>
pinctrl: amd: fix incorrect way to disable debounce filter
Coiby Xu <coiby.xu(a)gmail.com>
pinctrl: amd: use higher precision for 512 RtcClk
Thomas Zimmermann <tzimmermann(a)suse.de>
drm/gma500: Fix out-of-bounds access to struct drm_device.vblank[]
Al Viro <viro(a)zeniv.linux.org.uk>
don't dump the threads that had been already exiting when zapped.
Chen Zhou <chenzhou10(a)huawei.com>
selinux: Fix error return code in sel_ib_pkey_sid_slow()
Wengang Wang <wen.gang.wang(a)oracle.com>
ocfs2: initialize ip_next_orphan
Dan Carpenter <dan.carpenter(a)oracle.com>
futex: Don't enable IRQs unconditionally in put_pi_state()
Alexander Usyskin <alexander.usyskin(a)intel.com>
mei: protect mei_cl_mtu from null dereference
Chris Brandt <chris.brandt(a)renesas.com>
usb: cdc-acm: Add DISABLE_ECHO for Renesas USB Download mode
Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
uio: Fix use-after-free in uio_unregister_device()
Jing Xiangfeng <jingxiangfeng(a)huawei.com>
thunderbolt: Add the missed ida_simple_remove() in ring_request_msix()
Joseph Qi <joseph.qi(a)linux.alibaba.com>
ext4: unlock xattr_sem properly in ext4_inline_data_truncate()
Kaixu Xia <kaixuxia(a)tencent.com>
ext4: correctly report "not supported" for {usr,grp}jquota when !CONFIG_QUOTA
Peter Zijlstra <peterz(a)infradead.org>
perf: Fix get_recursion_context()
Wang Hai <wanghai38(a)huawei.com>
cosa: Add missing kfree in error path of cosa_write
Evan Nimmo <evan.nimmo(a)alliedtelesis.co.nz>
of/address: Fix of_node memory leak in of_dma_is_coherent
Christoph Hellwig <hch(a)lst.de>
xfs: fix a missing unlock on error in xfs_fs_map_blocks
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: fix rmap key and record comparison functions
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: fix flags argument to rmap lookup when converting shared file rmaps
Christoph Hellwig <hch(a)lst.de>
nbd: fix a block_device refcount leak in nbd_release
Billy Tsai <billy_tsai(a)aspeedtech.com>
pinctrl: aspeed: Fix GPI only function problem.
Andrew Jeffery <andrew(a)aj.id.au>
ARM: 9019/1: kprobes: Avoid fortify_panic() when copying optprobe template
Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
pinctrl: intel: Set default bias in case no particular value given
Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
iommu/amd: Increase interrupt remapping table limit to 512 entries
Hannes Reinecke <hare(a)suse.de>
scsi: scsi_dh_alua: Avoid crash during alua_bus_detach()
Ye Bin <yebin10(a)huawei.com>
cfg80211: regulatory: Fix inconsistent format argument
Johannes Berg <johannes.berg(a)intel.com>
mac80211: always wind down STA state
Johannes Berg <johannes.berg(a)intel.com>
mac80211: fix use of skb payload instead of header
Evan Quan <evan.quan(a)amd.com>
drm/amdgpu: perform srbm soft reset always on SDMA resume
Keita Suzuki <keitasuzuki.park(a)sslab.ics.keio.ac.jp>
scsi: hpsa: Fix memory leak in hpsa_init_one()
Bob Peterson <rpeterso(a)redhat.com>
gfs2: check for live vs. read-only file system in gfs2_fitrim
Bob Peterson <rpeterso(a)redhat.com>
gfs2: Add missing truncate_inode_pages_final for sd_aspace
Bob Peterson <rpeterso(a)redhat.com>
gfs2: Free rd_bits later in gfs2_clear_rgrpd to fix use-after-free
Evgeny Novikov <novikov(a)ispras.ru>
usb: gadget: goku_udc: fix potential crashes in probe
Masashi Honma <masashi.honma(a)gmail.com>
ath9k_htc: Use appropriate rs_datalen type
Filipe Manana <fdmanana(a)suse.com>
Btrfs: fix missing error return if writeback for extent buffer never started
Brian Foster <bfoster(a)redhat.com>
xfs: flush new eof page on truncate to avoid post-eof corruption
Stephane Grosjean <s.grosjean(a)peak-system.com>
can: peak_canfd: pucan_handle_can_rx(): fix echo management when loopback is on
Stephane Grosjean <s.grosjean(a)peak-system.com>
can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
Dan Carpenter <dan.carpenter(a)oracle.com>
can: peak_usb: add range checking in decode operations
Oleksij Rempel <o.rempel(a)pengutronix.de>
can: can_create_echo_skb(): fix echo skb generation: always use skb_clone()
Oliver Hartkopp <socketcan(a)hartkopp.net>
can: dev: __can_get_echo_skb(): fix real payload length return value for RTR frames
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: dev: can_get_echo_skb(): prevent call to kfree_skb() in hard IRQ context
Marc Kleine-Budde <mkl(a)pengutronix.de>
can: rx-offload: don't call kfree_skb() from IRQ context
Dan Carpenter <dan.carpenter(a)oracle.com>
ALSA: hda: prevent undefined shift in snd_hdac_ext_bus_get_link()
Jiri Olsa <jolsa(a)kernel.org>
perf tools: Add missing swap for ino_generation
zhuoliang zhang <zhuoliang.zhang(a)mediatek.com>
net: xfrm: fix a race condition during allocing spi
Olaf Hering <olaf(a)aepfle.de>
hv_balloon: disable warning when floor reached
Marc Zyngier <maz(a)kernel.org>
genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY
Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
btrfs: reschedule when cloning lots of extents
Josef Bacik <josef(a)toxicpanda.com>
btrfs: sysfs: init devices outside of the chunk_mutex
Ming Lei <ming.lei(a)redhat.com>
nbd: don't update block size after device is started
Zeng Tao <prime.zeng(a)hisilicon.com>
time: Prevent undefined behaviour in timespec64_to_ns()
Shijie Luo <luoshijie1(a)huawei.com>
mm: mempolicy: fix potential pte_unmap_unlock pte error
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
ring-buffer: Fix recursion protection transitions between interrupt context
Michał Mirosław <mirq-linux(a)rere.qmqm.pl>
regulator: defer probe when trying to get voltage from unresolved supply
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 8 +
Makefile | 4 +-
arch/arm/include/asm/kprobes.h | 22 +-
arch/arm/probes/kprobes/opt-arm.c | 18 +-
arch/x86/events/intel/pt.c | 4 +-
arch/x86/kernel/cpu/bugs.c | 52 ++-
drivers/block/nbd.c | 10 +-
drivers/block/xen-blkback/blkback.c | 22 +-
drivers/block/xen-blkback/xenbus.c | 5 +-
drivers/char/random.c | 1 -
drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 27 +-
drivers/gpu/drm/gma500/psb_irq.c | 34 +-
drivers/hv/hv_balloon.c | 2 +-
drivers/iommu/amd_iommu_types.h | 6 +-
drivers/misc/mei/client.h | 4 +-
drivers/net/can/dev.c | 14 +-
drivers/net/can/peak_canfd/peak_canfd.c | 11 +-
drivers/net/can/rx-offload.c | 4 +-
drivers/net/can/usb/peak_usb/pcan_usb_core.c | 51 ++-
drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 48 ++-
drivers/net/ethernet/realtek/r8169.c | 3 +-
drivers/net/vrf.c | 92 +++--
drivers/net/wan/cosa.c | 1 +
drivers/net/wireless/ath/ath9k/htc_drv_txrx.c | 2 +-
drivers/net/xen-netback/common.h | 15 +
drivers/net/xen-netback/interface.c | 61 +++-
drivers/net/xen-netback/netback.c | 11 +-
drivers/net/xen-netback/rx.c | 13 +-
drivers/of/address.c | 4 +-
drivers/pinctrl/aspeed/pinctrl-aspeed.c | 7 +-
drivers/pinctrl/intel/pinctrl-intel.c | 8 +
drivers/pinctrl/pinctrl-amd.c | 6 +-
drivers/regulator/core.c | 2 +
drivers/scsi/device_handler/scsi_dh_alua.c | 9 +-
drivers/scsi/hpsa.c | 4 +-
drivers/thunderbolt/nhi.c | 19 +-
drivers/uio/uio.c | 10 +-
drivers/usb/class/cdc-acm.c | 9 +
drivers/usb/gadget/udc/goku_udc.c | 2 +-
drivers/xen/events/events_2l.c | 9 +-
drivers/xen/events/events_base.c | 422 ++++++++++++++++++++--
drivers/xen/events/events_fifo.c | 83 ++---
drivers/xen/events/events_internal.h | 20 +-
drivers/xen/evtchn.c | 7 +-
drivers/xen/pvcalls-back.c | 76 ++--
drivers/xen/xen-pciback/pci_stub.c | 14 +-
drivers/xen/xen-pciback/pciback.h | 12 +-
drivers/xen/xen-pciback/pciback_ops.c | 48 ++-
drivers/xen/xen-pciback/xenbus.c | 2 +-
drivers/xen/xen-scsiback.c | 23 +-
fs/btrfs/extent_io.c | 4 +
fs/btrfs/ioctl.c | 2 +
fs/btrfs/volumes.c | 7 +-
fs/cifs/cifs_unicode.c | 8 +-
fs/ext4/inline.c | 1 +
fs/ext4/super.c | 4 +-
fs/gfs2/rgrp.c | 5 +-
fs/gfs2/super.c | 1 +
fs/ocfs2/super.c | 1 +
fs/xfs/libxfs/xfs_rmap.c | 2 +-
fs/xfs/libxfs/xfs_rmap_btree.c | 16 +-
fs/xfs/xfs_iops.c | 10 +
fs/xfs/xfs_pnfs.c | 2 +-
include/linux/can/skb.h | 20 +-
include/linux/perf_event.h | 2 +-
include/linux/prandom.h | 36 +-
include/linux/time64.h | 4 +
include/xen/events.h | 29 +-
kernel/events/core.c | 44 +--
kernel/events/internal.h | 2 +-
kernel/exit.c | 5 +-
kernel/futex.c | 5 +-
kernel/irq/Kconfig | 1 +
kernel/reboot.c | 28 +-
kernel/time/itimer.c | 4 -
kernel/time/timer.c | 7 -
kernel/trace/ring_buffer.c | 54 ++-
lib/random32.c | 462 +++++++++++++++---------
lib/swiotlb.c | 6 +-
mm/mempolicy.c | 6 +-
net/ipv4/syncookies.c | 9 +-
net/ipv6/sit.c | 2 -
net/ipv6/syncookies.c | 10 +-
net/iucv/af_iucv.c | 3 +-
net/mac80211/sta_info.c | 18 +
net/mac80211/tx.c | 35 +-
net/wireless/reg.c | 2 +-
net/x25/af_x25.c | 2 +-
net/xfrm/xfrm_state.c | 8 +-
security/selinux/ibpkey.c | 4 +-
sound/hda/ext/hdac_ext_controller.c | 2 +
tools/perf/util/session.c | 1 +
92 files changed, 1585 insertions(+), 630 deletions(-)
From: Eric Biggers <ebiggers(a)google.com>
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.
Fix this bug on f2fs by rejecting no-key dentries in f2fs_add_link().
Note that the weird check for the current task in f2fs_do_add_link()
seems to make this bug difficult to reproduce on f2fs.
Fixes: 9ea97163c6da ("f2fs crypto: add filename encryption for f2fs_add_link")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
fs/f2fs/f2fs.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index cb700d797296..9a321c52face 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3251,6 +3251,8 @@ bool f2fs_empty_dir(struct inode *dir);
static inline int f2fs_add_link(struct dentry *dentry, struct inode *inode)
{
+ if (fscrypt_is_nokey_name(dentry))
+ return -ENOKEY;
return f2fs_do_add_link(d_inode(dentry->d_parent), &dentry->d_name,
inode, inode->i_ino, inode->i_mode);
}
--
2.29.2
From: Eric Biggers <ebiggers(a)google.com>
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.
Fix this bug on ext4 by rejecting no-key dentries in ext4_add_entry().
Note that the duplicate check in ext4_find_dest_de() sometimes prevented
this bug. However in many cases it didn't, since ext4_find_dest_de()
doesn't examine every dentry.
Fixes: 4461471107b7 ("ext4 crypto: enable filename encryption")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
fs/ext4/namei.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 33509266f5a0..793fc7db9d28 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2195,6 +2195,9 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
if (!dentry->d_name.len)
return -EINVAL;
+ if (fscrypt_is_nokey_name(dentry))
+ return -ENOKEY;
+
#ifdef CONFIG_UNICODE
if (sb_has_strict_encoding(sb) && IS_CASEFOLDED(dir) &&
sb->s_encoding && utf8_validate(sb->s_encoding, &dentry->d_name))
--
2.29.2
From: Eric Biggers <ebiggers(a)google.com>
It's possible to create a duplicate filename in an encrypted directory
by creating a file concurrently with adding the encryption key.
Specifically, sys_open(O_CREAT) (or sys_mkdir(), sys_mknod(), or
sys_symlink()) can lookup the target filename while the directory's
encryption key hasn't been added yet, resulting in a negative no-key
dentry. The VFS then calls ->create() (or ->mkdir(), ->mknod(), or
->symlink()) because the dentry is negative. Normally, ->create() would
return -ENOKEY due to the directory's key being unavailable. However,
if the key was added between the dentry lookup and ->create(), then the
filesystem will go ahead and try to create the file.
If the target filename happens to already exist as a normal name (not a
no-key name), a duplicate filename may be added to the directory.
In order to fix this, we need to fix the filesystems to prevent
->create(), ->mkdir(), ->mknod(), and ->symlink() on no-key names.
(->rename() and ->link() need it too, but those are already handled
correctly by fscrypt_prepare_rename() and fscrypt_prepare_link().)
In preparation for this, add a helper function fscrypt_is_nokey_name()
that filesystems can use to do this check. Use this helper function for
the existing checks that fs/crypto/ does for rename and link.
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
fs/crypto/hooks.c | 5 +++--
include/linux/fscrypt.h | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 37 insertions(+), 2 deletions(-)
diff --git a/fs/crypto/hooks.c b/fs/crypto/hooks.c
index 20b0df47fe6a..061418be4b08 100644
--- a/fs/crypto/hooks.c
+++ b/fs/crypto/hooks.c
@@ -61,7 +61,7 @@ int __fscrypt_prepare_link(struct inode *inode, struct inode *dir,
return err;
/* ... in case we looked up no-key name before key was added */
- if (dentry->d_flags & DCACHE_NOKEY_NAME)
+ if (fscrypt_is_nokey_name(dentry))
return -ENOKEY;
if (!fscrypt_has_permitted_context(dir, inode))
@@ -86,7 +86,8 @@ int __fscrypt_prepare_rename(struct inode *old_dir, struct dentry *old_dentry,
return err;
/* ... in case we looked up no-key name(s) before key was added */
- if ((old_dentry->d_flags | new_dentry->d_flags) & DCACHE_NOKEY_NAME)
+ if (fscrypt_is_nokey_name(old_dentry) ||
+ fscrypt_is_nokey_name(new_dentry))
return -ENOKEY;
if (old_dir != new_dir) {
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index a8f7a43f031b..8e1d31c959bf 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -111,6 +111,35 @@ static inline void fscrypt_handle_d_move(struct dentry *dentry)
dentry->d_flags &= ~DCACHE_NOKEY_NAME;
}
+/**
+ * fscrypt_is_nokey_name() - test whether a dentry is a no-key name
+ * @dentry: the dentry to check
+ *
+ * This returns true if the dentry is a no-key dentry. A no-key dentry is a
+ * dentry that was created in an encrypted directory that hasn't had its
+ * encryption key added yet. Such dentries may be either positive or negative.
+ *
+ * When a filesystem is asked to create a new filename in an encrypted directory
+ * and the new filename's dentry is a no-key dentry, it must fail the operation
+ * with ENOKEY. This includes ->create(), ->mkdir(), ->mknod(), ->symlink(),
+ * ->rename(), and ->link(). (However, ->rename() and ->link() are already
+ * handled by fscrypt_prepare_rename() and fscrypt_prepare_link().)
+ *
+ * This is necessary because creating a filename requires the directory's
+ * encryption key, but just checking for the key on the directory inode during
+ * the final filesystem operation doesn't guarantee that the key was available
+ * during the preceding dentry lookup. And the key must have already been
+ * available during the dentry lookup in order for it to have been checked
+ * whether the filename already exists in the directory and for the new file's
+ * dentry not to be invalidated due to it incorrectly having the no-key flag.
+ *
+ * Return: %true if the dentry is a no-key name
+ */
+static inline bool fscrypt_is_nokey_name(const struct dentry *dentry)
+{
+ return dentry->d_flags & DCACHE_NOKEY_NAME;
+}
+
/* crypto.c */
void fscrypt_enqueue_decrypt_work(struct work_struct *);
@@ -244,6 +273,11 @@ static inline void fscrypt_handle_d_move(struct dentry *dentry)
{
}
+static inline bool fscrypt_is_nokey_name(const struct dentry *dentry)
+{
+ return false;
+}
+
/* crypto.c */
static inline void fscrypt_enqueue_decrypt_work(struct work_struct *work)
{
--
2.29.2
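The check this series adds can be illustrated outside the kernel. What follows is a minimal userspace sketch, not kernel code: `struct toy_dentry`, `TOY_NOKEY_NAME`, `toy_is_nokey_name()` and `toy_create()` are hypothetical stand-ins for `struct dentry`, `DCACHE_NOKEY_NAME`, `fscrypt_is_nokey_name()` and a filesystem's ->create(). It only shows why checking for the key at create time is not enough when the dentry was looked up before the key was added.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Fallback for C libraries that do not define ENOKEY (126 on Linux). */
#ifndef ENOKEY
#define ENOKEY 126
#endif

/* Hypothetical stand-in for DCACHE_NOKEY_NAME; the value is arbitrary. */
#define TOY_NOKEY_NAME 0x1u

/* Hypothetical, stripped-down model of a dentry. */
struct toy_dentry {
	unsigned int d_flags;
};

/* Same shape as fscrypt_is_nokey_name(): test the no-key flag. */
static inline bool toy_is_nokey_name(const struct toy_dentry *dentry)
{
	return dentry->d_flags & TOY_NOKEY_NAME;
}

/*
 * Model of a create-type operation. dir_has_key describes the directory
 * at the time of the call; the dentry flag describes it at lookup time.
 */
static int toy_create(const struct toy_dentry *dentry, bool dir_has_key)
{
	assert(dentry != NULL);

	/* Pre-existing behaviour: no key now, so the create fails. */
	if (!dir_has_key)
		return -ENOKEY;

	/* New check: key is present now, but the lookup was done without it. */
	if (toy_is_nokey_name(dentry))
		return -ENOKEY;

	return 0;
}
```

In this model, a dentry flagged at lookup time is rejected even when `dir_has_key` is true, which corresponds to the window between the dentry lookup and ->create() described in the commit message.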
The HUAWEI USB-C headset (VID:0x12d1, PID:0x3a07) reports that it supports
96kHz. However, random issues occur when it runs at 96kHz.
It is not clear whether any alternate setting could be applied instead.
Hence it is suggested to use 48kHz for the time being.
Signed-off-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
Signed-off-by: Eddie Hung <eddie.hung(a)mediatek.com>
Cc: stable(a)vger.kernel.org
---
Changes for v2:
- Fix build error.
- Add Cc: stable(a)vger.kernel.org
Changes for v3:
- Replace "udev" with "chip->dev" according to Takashi's suggestion. Thanks.
sound/usb/format.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/sound/usb/format.c b/sound/usb/format.c
index 1b28d01..0aff774 100644
--- a/sound/usb/format.c
+++ b/sound/usb/format.c
@@ -217,6 +217,11 @@ static int parse_audio_format_rates_v1(struct snd_usb_audio *chip, struct audiof
(chip->usb_id == USB_ID(0x041e, 0x4064) ||
chip->usb_id == USB_ID(0x041e, 0x4068)))
rate = 8000;
+ /* Huawei headset can't support 96kHz fully */
+ if (rate == 96000 &&
+ chip->usb_id == USB_ID(0x12d1, 0x3a07) &&
+ le16_to_cpu(chip->dev->descriptor.bcdDevice) == 0x49)
+ continue;
fp->rate_table[fp->nr_rates] = rate;
if (!fp->rate_min || rate < fp->rate_min)
--
1.7.9.5
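The effect of the quirk above can be sketched as a standalone filter. This is a hedged userspace illustration only: `toy_filter_rates()` and `TOY_USB_ID()` are hypothetical names, the macro is assumed to pack vendor and product the way the driver's `USB_ID()` does, and the real parser works on USB descriptors rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed packing: vendor ID in the high 16 bits, product ID in the low. */
#define TOY_USB_ID(vendor, product) \
	(((uint32_t)(vendor) << 16) | (uint32_t)(product))

/*
 * Copy the advertised rates into out[], skipping 96000 Hz for the one
 * device and firmware revision named in the patch. Returns the number of
 * rates kept. out[] must have room for n entries.
 */
size_t toy_filter_rates(uint32_t usb_id, uint16_t bcd_device,
			const unsigned int *rates, size_t n,
			unsigned int *out)
{
	size_t kept = 0;

	assert(rates != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		/* Huawei headset quirk: drop 96 kHz, keep everything else. */
		if (rates[i] == 96000 &&
		    usb_id == TOY_USB_ID(0x12d1, 0x3a07) &&
		    bcd_device == 0x49)
			continue;
		out[kept++] = rates[i];
	}
	return kept;
}
```

Matching on bcdDevice as well as the USB ID keeps the quirk narrow: in this sketch, a headset with the same IDs but a different revision still gets its full rate list.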
Retry.
On Wed, Oct 28, 2020 at 10:10:35AM -0700, Guenter Roeck wrote:
> On Tue, Oct 27, 2020 at 02:50:58PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.19.153 release.
> > There are 264 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 29 Oct 2020 13:53:47 +0000.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 155 pass: 152 fail: 3
> Failed builds:
> i386:tools/perf
> powerpc:ppc6xx_defconfig
> x86_64:tools/perf
> Qemu test results:
> total: 417 pass: 417 fail: 0
>
> perf failures are as usual. powerpc:
>
> arch/powerpc/kernel/tau_6xx.c: In function 'TAU_init':
> include/linux/workqueue.h:427:24: error: too many arguments for format
>
> Tested-by: Guenter Roeck <linux(a)roeck-us.net>
>
> Guenter
The patch titled
Subject: mm/zsmalloc.c: drop ZSMALLOC_PGTABLE_MAPPING
has been added to the -mm tree. Its filename is
mm-zsmallocc-drop-zsmalloc_pgtable_mapping.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-zsmallocc-drop-zsmalloc_pgtabl…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-zsmallocc-drop-zsmalloc_pgtabl…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Minchan Kim <minchan(a)kernel.org>
Subject: mm/zsmalloc.c: drop ZSMALLOC_PGTABLE_MAPPING
While I was doing zram testing, I found that decompression sometimes failed
because the compression buffer was corrupted. After investigating, I found
that the commit below calls cond_resched() unconditionally, which can cause a
problem in atomic context if the task is rescheduled.
[ 55.109012] BUG: sleeping function called from invalid context at mm/vmalloc.c:108
[ 55.110774] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 946, name: memhog
[ 55.111973] 3 locks held by memhog/946:
[ 55.112807] #0: ffff9d01d4b193e8 (&mm->mmap_lock#2){++++}-{4:4}, at: __mm_populate+0x103/0x160
[ 55.114151] #1: ffffffffa3d53de0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0xa98/0x1160
[ 55.115848] #2: ffff9d01d56b8110 (&zspage->lock){.+.+}-{3:3}, at: zs_map_object+0x8e/0x1f0
[ 55.118947] CPU: 0 PID: 946 Comm: memhog Not tainted 5.9.3-00011-gc5bfc0287345-dirty #316
[ 55.121265] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
[ 55.122540] Call Trace:
[ 55.122974] dump_stack+0x8b/0xb8
[ 55.123588] ___might_sleep.cold+0xb6/0xc6
[ 55.124328] unmap_kernel_range_noflush+0x2eb/0x350
[ 55.125198] unmap_kernel_range+0x14/0x30
[ 55.125920] zs_unmap_object+0xd5/0xe0
[ 55.126604] zram_bvec_rw.isra.0+0x38c/0x8e0
[ 55.127462] zram_rw_page+0x90/0x101
[ 55.128199] bdev_write_page+0x92/0xe0
[ 55.128957] ? swap_slot_free_notify+0xb0/0xb0
[ 55.129841] __swap_writepage+0x94/0x4a0
[ 55.130636] ? do_raw_spin_unlock+0x4b/0xa0
[ 55.131462] ? _raw_spin_unlock+0x1f/0x30
[ 55.132261] ? page_swapcount+0x6c/0x90
[ 55.133038] pageout+0xe3/0x3a0
[ 55.133702] shrink_page_list+0xb94/0xd60
[ 55.134626] shrink_inactive_list+0x158/0x460
We can fix this by removing the ZSMALLOC_PGTABLE_MAPPING feature (which
contains the offending calling code) from zsmalloc.
Even though this option showed some improvement (e.g., 30%) on some
arm32 platforms, it has been a headache to maintain since it has abused
APIs[1] (e.g., unmap_kernel_range() in atomic context).
Since we are moving toward deprecating 32-bit machines, the config option
has only been available for built-in builds since v5.8, and it has not been
the default option in zsmalloc, it's time to drop the option for better
maintenance.
[1] http://lore.kernel.org/linux-mm/20201105170249.387069-1-minchan@kernel.org
Link: https://lkml.kernel.org/r/20201117202916.GA3856507@google.com
Fixes: e47110e90584 ("mm/vunmap: add cond_resched() in vunmap_pmd_range")
Signed-off-by: Minchan Kim <minchan(a)kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
Cc: Tony Lindgren <tony(a)atomide.com>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Harish Sriram <harish(a)linux.ibm.com>
Cc: Uladzislau Rezki <urezki(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/arm/configs/omap2plus_defconfig | 1
include/linux/zsmalloc.h | 1
mm/Kconfig | 13 ------
mm/zsmalloc.c | 54 -------------------------
4 files changed, 69 deletions(-)
--- a/arch/arm/configs/omap2plus_defconfig~mm-zsmallocc-drop-zsmalloc_pgtable_mapping
+++ a/arch/arm/configs/omap2plus_defconfig
@@ -81,7 +81,6 @@ CONFIG_PARTITION_ADVANCED=y
CONFIG_BINFMT_MISC=y
CONFIG_CMA=y
CONFIG_ZSMALLOC=m
-CONFIG_ZSMALLOC_PGTABLE_MAPPING=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
--- a/include/linux/zsmalloc.h~mm-zsmallocc-drop-zsmalloc_pgtable_mapping
+++ a/include/linux/zsmalloc.h
@@ -20,7 +20,6 @@
* zsmalloc mapping modes
*
* NOTE: These only make a difference when a mapped object spans pages.
- * They also have no effect when ZSMALLOC_PGTABLE_MAPPING is selected.
*/
enum zs_mapmode {
ZS_MM_RW, /* normal read-write mapping */
--- a/mm/Kconfig~mm-zsmallocc-drop-zsmalloc_pgtable_mapping
+++ a/mm/Kconfig
@@ -707,19 +707,6 @@ config ZSMALLOC
returned by an alloc(). This handle must be mapped in order to
access the allocated space.
-config ZSMALLOC_PGTABLE_MAPPING
- bool "Use page table mapping to access object in zsmalloc"
- depends on ZSMALLOC=y
- help
- By default, zsmalloc uses a copy-based object mapping method to
- access allocations that span two pages. However, if a particular
- architecture (ex, ARM) performs VM mapping faster than copying,
- then you should select this. This causes zsmalloc to use page table
- mapping rather than copying for object mapping.
-
- You can check speed with zsmalloc benchmark:
- https://github.com/spartacus06/zsmapbench
-
config ZSMALLOC_STAT
bool "Export zsmalloc statistics"
depends on ZSMALLOC
--- a/mm/zsmalloc.c~mm-zsmallocc-drop-zsmalloc_pgtable_mapping
+++ a/mm/zsmalloc.c
@@ -293,11 +293,7 @@ struct zspage {
};
struct mapping_area {
-#ifdef CONFIG_ZSMALLOC_PGTABLE_MAPPING
- struct vm_struct *vm; /* vm area for mapping object that span pages */
-#else
char *vm_buf; /* copy buffer for objects that span pages */
-#endif
char *vm_addr; /* address of kmap_atomic()'ed pages */
enum zs_mapmode vm_mm; /* mapping mode */
};
@@ -1113,54 +1109,6 @@ static struct zspage *find_get_zspage(st
return zspage;
}
-#ifdef CONFIG_ZSMALLOC_PGTABLE_MAPPING
-static inline int __zs_cpu_up(struct mapping_area *area)
-{
- /*
- * Make sure we don't leak memory if a cpu UP notification
- * and zs_init() race and both call zs_cpu_up() on the same cpu
- */
- if (area->vm)
- return 0;
- area->vm = get_vm_area(PAGE_SIZE * 2, 0);
- if (!area->vm)
- return -ENOMEM;
-
- /*
- * Populate ptes in advance to avoid pte allocation with GFP_KERNEL
- * in non-preemtible context of zs_map_object.
- */
- return apply_to_page_range(&init_mm, (unsigned long)area->vm->addr,
- PAGE_SIZE * 2, NULL, NULL);
-}
-
-static inline void __zs_cpu_down(struct mapping_area *area)
-{
- if (area->vm)
- free_vm_area(area->vm);
- area->vm = NULL;
-}
-
-static inline void *__zs_map_object(struct mapping_area *area,
- struct page *pages[2], int off, int size)
-{
- unsigned long addr = (unsigned long)area->vm->addr;
-
- BUG_ON(map_kernel_range(addr, PAGE_SIZE * 2, PAGE_KERNEL, pages) < 0);
- area->vm_addr = area->vm->addr;
- return area->vm_addr + off;
-}
-
-static inline void __zs_unmap_object(struct mapping_area *area,
- struct page *pages[2], int off, int size)
-{
- unsigned long addr = (unsigned long)area->vm_addr;
-
- unmap_kernel_range(addr, PAGE_SIZE * 2);
-}
-
-#else /* CONFIG_ZSMALLOC_PGTABLE_MAPPING */
-
static inline int __zs_cpu_up(struct mapping_area *area)
{
/*
@@ -1241,8 +1189,6 @@ out:
pagefault_enable();
}
-#endif /* CONFIG_ZSMALLOC_PGTABLE_MAPPING */
-
static int zs_cpu_prepare(unsigned int cpu)
{
struct mapping_area *area;
_
Patches currently in -mm which might be from minchan(a)kernel.org are
mm-zsmallocc-drop-zsmalloc_pgtable_mapping.patch
zram-support-a-page-writeback.patch
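With the page-table variant removed, only the copy-based mapping described in the deleted Kconfig help text remains for objects that span two pages. The following is a rough userspace sketch of that idea, not the kernel implementation: `toy_map_object()`, `toy_unmap_object()` and `TOY_PAGE_SIZE` are hypothetical, and the real code additionally deals with kmap_atomic(), per-CPU areas and the zs_mapmode flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u

/*
 * "Map" an object that starts at offset off in pages[0] and continues
 * into pages[1] by copying both pieces into one contiguous buffer.
 */
void *toy_map_object(char *buf, char *pages[2], size_t off, size_t size)
{
	size_t first = TOY_PAGE_SIZE - off;	/* bytes held by pages[0] */

	/* Only objects that really cross the page boundary belong here. */
	assert(off < TOY_PAGE_SIZE && size > first && size <= TOY_PAGE_SIZE);

	memcpy(buf, pages[0] + off, first);
	memcpy(buf + first, pages[1], size - first);
	return buf;
}

/* "Unmap" by copying the possibly modified buffer back to both pages. */
void toy_unmap_object(const char *buf, char *pages[2], size_t off, size_t size)
{
	size_t first = TOY_PAGE_SIZE - off;

	assert(off < TOY_PAGE_SIZE && size > first && size <= TOY_PAGE_SIZE);

	memcpy(pages[0] + off, buf, first);
	memcpy(pages[1], buf + first, size - first);
}
```

Nothing in this scheme touches page tables, so in this sketch there is no call that could sleep; the cost is two copies per map/unmap pair.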
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: b93aafd91105 - Convert trailing spaces and periods in path components
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
ppc64le:
Host 1:
✅ Boot test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ✅ Storage block - queue scheduler test
🚧 ✅ Storage nvme - tcp
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 2:
✅ Boot test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ ACPI table test
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking: igmp conformance test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: sanity smoke test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ storage: SCSI VPD
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ Firmware test suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Ethernet drivers sanity
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ kdump - kexec_boot
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 💥 Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Frequency Driver Test
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - ext4
🚧 ⚡⚡⚡ xfstests - xfs
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ ACPI table test
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking: igmp conformance test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: sanity smoke test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ storage: SCSI VPD
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ Firmware test suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Ethernet drivers sanity
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ kdump - kexec_boot
Host 5:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ ACPI table test
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking: igmp conformance test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: sanity smoke test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ storage: SCSI VPD
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ Firmware test suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Ethernet drivers sanity
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within a reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
This series replaces all uses of security_capable(current_cred(),
...) with ns_capable{,_noaudit}(), which set PF_SUPERPRIV.
This initially came from a review of Landlock by Jann Horn:
https://lore.kernel.org/lkml/CAG48ez1FQVkt78129WozBwFbVhAPyAr9oJAHFHAbbNxEB…
Mickaël Salaün (2):
ptrace: Set PF_SUPERPRIV when checking capability
seccomp: Set PF_SUPERPRIV when checking capability
kernel/ptrace.c | 18 ++++++------------
kernel/seccomp.c | 5 ++---
2 files changed, 8 insertions(+), 15 deletions(-)
base-commit: 3650b228f83adda7e5ee532e2b90429c03f7b9ec
--
2.28.0
When the HW is powered down, the register state and links are lost. This
may be an issue in the firmware, or in the code expectations; whatever
it is, it is expected behaviour now for Tigerlake; stop warning!
References: https://gitlab.freedesktop.org/drm/intel/-/issues/2411
Fixes: 239bef676d8e ("drm/i915/display: Implement new combo phy initialization step")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Clinton A Taylor <clinton.a.taylor(a)intel.com>
Cc: Lucas De Marchi <lucas.demarchi(a)intel.com>
Cc: Matt Roper <matthew.d.roper(a)intel.com>
Cc: José Roberto de Souza <jose.souza(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.9+
---
drivers/gpu/drm/i915/display/intel_combo_phy.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_combo_phy.c b/drivers/gpu/drm/i915/display/intel_combo_phy.c
index d5ad61e4083e..9a87df982af8 100644
--- a/drivers/gpu/drm/i915/display/intel_combo_phy.c
+++ b/drivers/gpu/drm/i915/display/intel_combo_phy.c
@@ -428,9 +428,9 @@ static void icl_combo_phys_uninit(struct drm_i915_private *dev_priv)
if (phy == PHY_A &&
!icl_combo_phy_verify_state(dev_priv, phy))
- drm_warn(&dev_priv->drm,
- "Combo PHY %c HW state changed unexpectedly\n",
- phy_name(phy));
+ drm_dbg_kms(&dev_priv->drm,
+ "Combo PHY %c HW state changed unexpectedly\n",
+ phy_name(phy));
if (!has_phy_misc(dev_priv, phy))
goto skip_phy_misc;
--
2.20.1
commit 2fb541c862c9 ("net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc")
When the above upstream commit was backported to the stable kernel,
one assignment was missing, which causes the two problems reported
by Joakim and Vishwanath; see [1] and [2].
So add the assignment back to fix it.
1. https://www.spinics.net/lists/netdev/msg693916.html
2. https://www.spinics.net/lists/netdev/msg695131.html
Fixes: 749cc0b0c7f3 ("net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc")
Signed-off-by: Yunsheng Lin <linyunsheng(a)huawei.com>
---
net/sched/sch_generic.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 0e275e1..6e6147a 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1127,10 +1127,13 @@ static void dev_deactivate_queue(struct net_device *dev,
void *_qdisc_default)
{
struct Qdisc *qdisc = rtnl_dereference(dev_queue->qdisc);
+ struct Qdisc *qdisc_default = _qdisc_default;
if (qdisc) {
if (!(qdisc->flags & TCQ_F_BUILTIN))
set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
+
+ rcu_assign_pointer(dev_queue->qdisc, qdisc_default);
}
}
--
2.7.4
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cb8d53d2c97369029cc638c9274ac7be0a316c75 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 22 Sep 2020 09:24:56 -0700
Subject: [PATCH] ext4: fix leaking sysfs kobject after failed mount
ext4_unregister_sysfs() only deletes the kobject. The reference to it
needs to be put separately, like ext4_put_super() does.
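The distinction between deleting a kobject and dropping its reference can be illustrated with a small userspace model. This is only a sketch: struct toy_kobj and the toy_* helpers are hypothetical stand-ins, not the kernel kobject API, and the model assumes that deleting merely hides the object while the final put is what releases it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a reference-counted object. */
struct toy_kobj {
	int refcount;	/* starts at 1 when the object is initialised */
	bool visible;	/* stands in for the sysfs entry */
	bool released;	/* set once the last reference is dropped */
};

/* Initialise the object and make it visible (one reference held). */
static void toy_init_and_add(struct toy_kobj *k)
{
	k->refcount = 1;
	k->visible = true;
	k->released = false;
}

/* Models the "delete" step: hides the object, keeps the reference. */
static void toy_del(struct toy_kobj *k)
{
	k->visible = false;
}

/* Models the "put" step: drops one reference, releases on zero. */
static void toy_put(struct toy_kobj *k)
{
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		k->released = true;
}
```

Calling toy_del() alone leaves refcount at 1 and released false, which corresponds to the leak described above; only the following toy_put() releases the object.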
This addresses the syzbot report
"memory leak in kobject_set_name_vargs (3)"
(https://syzkaller.appspot.com/bug?extid=9f864abad79fae7c17e1).
Reported-by: syzbot+9f864abad79fae7c17e1(a)syzkaller.appspotmail.com
Fixes: 72ba74508b28 ("ext4: release sysfs kobject when failing to enable quotas on mount")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Link: https://lore.kernel.org/r/20200922162456.93657-1-ebiggers@kernel.org
Reviewed-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ea425b49b345..41953b86ffe3 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4872,6 +4872,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
failed_mount8:
ext4_unregister_sysfs(sb);
+ kobject_put(&sbi->s_kobj);
failed_mount7:
ext4_unregister_li_request(sb);
failed_mount6:
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8b92c4ff4423aa9900cf838d3294fcade4dbda35 Mon Sep 17 00:00:00 2001
From: Matteo Croce <mcroce(a)microsoft.com>
Date: Fri, 13 Nov 2020 22:52:02 -0800
Subject: [PATCH] Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Patch series "fix parsing of reboot= cmdline", v3.
The parsing of the reboot= cmdline has two major errors:
- a missing bound check can crash the system on reboot
- parsing of the cpu number only works if specified last
Fix both.
This patch (of 2):
This reverts commit 616feab753972b97.
kstrtoint() and simple_strtoul() have a subtle difference which makes
them non-interchangeable: if a non-digit character is found during
parsing, the former returns an error, while the latter just stops
parsing, e.g. simple_strtoul("123xyx") = 123.
The kernel cmdline reboot= argument allows specifying the CPU used for
rebooting, with the syntax `s####` among the other flags, e.g.
"reboot=warm,s31,force", so if this flag is not the last one given, it is
silently ignored, along with all subsequent flags.
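The difference can be reproduced in userspace with the C library. The sketch below is an approximation under stated assumptions: parse_lenient() and parse_strict() are hypothetical helpers that mimic the behaviour described above using strtoul() and strtol(); they are not the kernel functions, and the strict variant collapses every failure into -EINVAL for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Lenient parse, in the spirit of simple_strtoul(): stop at the first
 * character that is not part of the number and return what was parsed.
 */
static unsigned long parse_lenient(const char *s)
{
	assert(s != NULL);
	return strtoul(s, NULL, 0);
}

/*
 * Strict parse, in the spirit of kstrtoint(): reject the whole string
 * if anything follows the number. Returns 0 on success and -EINVAL on
 * any failure; *out is only written on success.
 */
static int parse_strict(const char *s, int *out)
{
	char *end;
	long val;

	assert(s != NULL && out != NULL);
	errno = 0;
	val = strtol(s, &end, 0);
	if (end == s || *end != '\0' || errno == ERANGE ||
	    val < INT_MIN || val > INT_MAX)
		return -EINVAL;

	*out = (int)val;
	return 0;
}
```

With the tail of "reboot=warm,s31,force" as input, parse_lenient("31,force") yields 31, whereas parse_strict("31,force", &cpu) fails because of the trailing ",force".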
Fixes: 616feab75397 ("kernel/reboot.c: convert simple_strtoul to kstrtoint")
Signed-off-by: Matteo Croce <mcroce(a)microsoft.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Cc: Petr Mladek <pmladek(a)suse.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Robin Holt <robinmholt(a)gmail.com>
Cc: Fabian Frederick <fabf(a)skynet.be>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/kernel/reboot.c b/kernel/reboot.c
index e7b78d5ae1ab..8fbba433725e 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -551,22 +551,15 @@ static int __init reboot_setup(char *str)
break;
case 's':
- {
- int rc;
-
- if (isdigit(*(str+1))) {
- rc = kstrtoint(str+1, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else if (str[1] == 'm' && str[2] == 'p' &&
- isdigit(*(str+3))) {
- rc = kstrtoint(str+3, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else
+ if (isdigit(*(str+1)))
+ reboot_cpu = simple_strtoul(str+1, NULL, 0);
+ else if (str[1] == 'm' && str[2] == 'p' &&
+ isdigit(*(str+3)))
+ reboot_cpu = simple_strtoul(str+3, NULL, 0);
+ else
*mode = REBOOT_SOFT;
break;
- }
+
case 'g':
*mode = REBOOT_GPIO;
break;
Hi Greg, Sasha,
Please consider the attached backport of
f91072ed1b72 ("perf/core: Fix race in the perf_mmap_close() function")
for v4.14.y, v4.19.y and v5.4.y
This will not apply to v4.9.y and v4.4.y, so I am sending a separate backport
for those.
--
Regards
Sudip
Hi Greg, Sasha,
This was missing from v4.4.y, v4.9.y and v4.14.y. Please consider
the attached backported patch.
Missed adding stable in the previous mail.
--
Regards
Sudip
commit 1978b3a53a74e3230cd46932b149c6e62e832e9a upstream.
On AMD CPUs which have the feature X86_FEATURE_AMD_STIBP_ALWAYS_ON,
STIBP is set to on and
spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED
At the same time, IBPB can be set to conditional.
However, this leads to the case where it's impossible to turn on IBPB
for a process because in the PR_SPEC_DISABLE case in ib_prctl_set() the
spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED
condition leads to a return before the task flag is set. Similarly,
ib_prctl_get() will return PR_SPEC_DISABLE even though IBPB is set to
conditional.
More generally, the following cases are possible:
1. STIBP = conditional && IBPB = on for spectre_v2_user=seccomp,ibpb
2. STIBP = on && IBPB = conditional for AMD CPUs with
X86_FEATURE_AMD_STIBP_ALWAYS_ON
The first case functions correctly today, but only because
spectre_v2_user_ibpb isn't updated to reflect the IBPB mode.
At a high level, this change does one thing. If either STIBP or IBPB
is set to conditional, allow the prctl to change the task flag.
Also, reflect that capability when querying the state. This isn't
perfect since it doesn't take into account if only STIBP or IBPB is
unconditionally on. But it allows the conditional feature to work as
expected, without affecting the unconditional one.
[ bp: Massage commit message and comment; space out statements for
better readability. ]
Fixes: 21998a351512 ("x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.")
Signed-off-by: Anand K Mistry <amistry(a)google.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Link: https://lkml.kernel.org/r/20201105163246.v2.1.Ifd7243cd3e2c2206a893ad0a5b9a…
Conflicts:
arch/x86/kernel/cpu/bugs.c
Superfluous newline which was removed in upstream commit a5ce9f2bb665
---
The one conflict is a newline in a comment which was removed in upstream
commit a5ce9f2bb665, but was not merged into the stable trees.
This patch applies cleanly on the stable trees 5.4, 4.19, 4.14, 4.9, and
4.4, which are affected by this bug.
arch/x86/kernel/cpu/bugs.c | 52 ++++++++++++++++++++++++--------------
1 file changed, 33 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index acbf3dbb8bf2..bdc1ed7ff669 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1252,6 +1252,14 @@ static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
return 0;
}
+static bool is_spec_ib_user_controlled(void)
+{
+ return spectre_v2_user_ibpb == SPECTRE_V2_USER_PRCTL ||
+ spectre_v2_user_ibpb == SPECTRE_V2_USER_SECCOMP ||
+ spectre_v2_user_stibp == SPECTRE_V2_USER_PRCTL ||
+ spectre_v2_user_stibp == SPECTRE_V2_USER_SECCOMP;
+}
+
static int ib_prctl_set(struct task_struct *task, unsigned long ctrl)
{
switch (ctrl) {
@@ -1259,17 +1267,26 @@ static int ib_prctl_set(struct task_struct *task, unsigned long ctrl)
if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE &&
spectre_v2_user_stibp == SPECTRE_V2_USER_NONE)
return 0;
- /*
- * Indirect branch speculation is always disabled in strict
- * mode. It can neither be enabled if it was force-disabled
- * by a previous prctl call.
+ /*
+ * With strict mode for both IBPB and STIBP, the instruction
+ * code paths avoid checking this task flag and instead,
+ * unconditionally run the instruction. However, STIBP and IBPB
+ * are independent and either can be set to conditionally
+ * enabled regardless of the mode of the other.
+ *
+ * If either is set to conditional, allow the task flag to be
+ * updated, unless it was force-disabled by a previous prctl
+ * call. Currently, this is possible on an AMD CPU which has the
+ * feature X86_FEATURE_AMD_STIBP_ALWAYS_ON. In this case, if the
+ * kernel is booted with 'spectre_v2_user=seccomp', then
+ * spectre_v2_user_ibpb == SPECTRE_V2_USER_SECCOMP and
+ * spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED.
*/
- if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ||
+ if (!is_spec_ib_user_controlled() ||
task_spec_ib_force_disable(task))
return -EPERM;
+
task_clear_spec_ib_disable(task);
task_update_spec_tif(task);
break;
@@ -1282,10 +1299,10 @@ static int ib_prctl_set(struct task_struct *task, unsigned long ctrl)
if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE &&
spectre_v2_user_stibp == SPECTRE_V2_USER_NONE)
return -EPERM;
- if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED)
+
+ if (!is_spec_ib_user_controlled())
return 0;
+
task_set_spec_ib_disable(task);
if (ctrl == PR_SPEC_FORCE_DISABLE)
task_set_spec_ib_force_disable(task);
@@ -1350,20 +1367,17 @@ static int ib_prctl_get(struct task_struct *task)
if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE &&
spectre_v2_user_stibp == SPECTRE_V2_USER_NONE)
return PR_SPEC_ENABLE;
- else if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED)
- return PR_SPEC_DISABLE;
- else if (spectre_v2_user_ibpb == SPECTRE_V2_USER_PRCTL ||
- spectre_v2_user_ibpb == SPECTRE_V2_USER_SECCOMP ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_PRCTL ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_SECCOMP) {
+ else if (is_spec_ib_user_controlled()) {
if (task_spec_ib_force_disable(task))
return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
if (task_spec_ib_disable(task))
return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
- } else
+ } else if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT ||
+ spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
+ spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED)
+ return PR_SPEC_DISABLE;
+ else
return PR_SPEC_NOT_AFFECTED;
}
--
2.29.2.222.g5d2a92d10f8-goog
The patch below does not apply to the 5.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e1777d099728a76a8f8090f89649aac961e7e530 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)wdc.com>
Date: Fri, 6 Nov 2020 20:01:41 +0900
Subject: [PATCH] null_blk: Fix scheduling in atomic with zoned mode
Commit aa1c09cb65e2 ("null_blk: Fix locking in zoned mode") changed
zone locking to use the potentially sleeping wait_on_bit_lock_io()
function. This is acceptable when memory backing is enabled as the
device queue is in that case marked as blocking, but this triggers a
scheduling while in atomic context with memory backing disabled.
Fix this by relying solely on the device zone spinlock for zone
information protection without temporarily releasing this lock around
null_process_cmd() execution in null_zone_write(). This is OK to do
since when memory backing is disabled, command processing does not
block and the memory backing lock nullb->lock is unused. This solution
avoids the overhead of having to mark a zoned null_blk device queue as
blocking when memory backing is unused.
This patch also adds comments to the zone locking code to explain the
unusual locking scheme.
Fixes: aa1c09cb65e2 ("null_blk: Fix locking in zoned mode")
Reported-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/drivers/block/null_blk.h b/drivers/block/null_blk.h
index cfd00ad40355..c24d9b5ad81a 100644
--- a/drivers/block/null_blk.h
+++ b/drivers/block/null_blk.h
@@ -47,7 +47,7 @@ struct nullb_device {
unsigned int nr_zones_closed;
struct blk_zone *zones;
sector_t zone_size_sects;
- spinlock_t zone_dev_lock;
+ spinlock_t zone_lock;
unsigned long *zone_locks;
unsigned long size; /* device size in MB */
diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
index 8775acbb4f8f..beb34b4f76b0 100644
--- a/drivers/block/null_blk_zoned.c
+++ b/drivers/block/null_blk_zoned.c
@@ -46,11 +46,20 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
if (!dev->zones)
return -ENOMEM;
- spin_lock_init(&dev->zone_dev_lock);
- dev->zone_locks = bitmap_zalloc(dev->nr_zones, GFP_KERNEL);
- if (!dev->zone_locks) {
- kvfree(dev->zones);
- return -ENOMEM;
+ /*
+ * With memory backing, the zone_lock spinlock needs to be temporarily
+ * released to avoid scheduling in atomic context. To guarantee zone
+ * information protection, use a bitmap to lock zones with
+ * wait_on_bit_lock_io(). Sleeping on the lock is OK as memory backing
+ * implies that the queue is marked with BLK_MQ_F_BLOCKING.
+ */
+ spin_lock_init(&dev->zone_lock);
+ if (dev->memory_backed) {
+ dev->zone_locks = bitmap_zalloc(dev->nr_zones, GFP_KERNEL);
+ if (!dev->zone_locks) {
+ kvfree(dev->zones);
+ return -ENOMEM;
+ }
}
if (dev->zone_nr_conv >= dev->nr_zones) {
@@ -137,12 +146,17 @@ void null_free_zoned_dev(struct nullb_device *dev)
static inline void null_lock_zone(struct nullb_device *dev, unsigned int zno)
{
- wait_on_bit_lock_io(dev->zone_locks, zno, TASK_UNINTERRUPTIBLE);
+ if (dev->memory_backed)
+ wait_on_bit_lock_io(dev->zone_locks, zno, TASK_UNINTERRUPTIBLE);
+ spin_lock_irq(&dev->zone_lock);
}
static inline void null_unlock_zone(struct nullb_device *dev, unsigned int zno)
{
- clear_and_wake_up_bit(zno, dev->zone_locks);
+ spin_unlock_irq(&dev->zone_lock);
+
+ if (dev->memory_backed)
+ clear_and_wake_up_bit(zno, dev->zone_locks);
}
int null_report_zones(struct gendisk *disk, sector_t sector,
@@ -322,7 +336,6 @@ static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector,
return null_process_cmd(cmd, REQ_OP_WRITE, sector, nr_sectors);
null_lock_zone(dev, zno);
- spin_lock(&dev->zone_dev_lock);
switch (zone->cond) {
case BLK_ZONE_COND_FULL:
@@ -375,9 +388,17 @@ static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector,
if (zone->cond != BLK_ZONE_COND_EXP_OPEN)
zone->cond = BLK_ZONE_COND_IMP_OPEN;
- spin_unlock(&dev->zone_dev_lock);
+ /*
+ * Memory backing allocation may sleep: release the zone_lock spinlock
+ * to avoid scheduling in atomic context. Zone operation atomicity is
+ * still guaranteed through the zone_locks bitmap.
+ */
+ if (dev->memory_backed)
+ spin_unlock_irq(&dev->zone_lock);
ret = null_process_cmd(cmd, REQ_OP_WRITE, sector, nr_sectors);
- spin_lock(&dev->zone_dev_lock);
+ if (dev->memory_backed)
+ spin_lock_irq(&dev->zone_lock);
+
if (ret != BLK_STS_OK)
goto unlock;
@@ -392,7 +413,6 @@ static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector,
ret = BLK_STS_OK;
unlock:
- spin_unlock(&dev->zone_dev_lock);
null_unlock_zone(dev, zno);
return ret;
@@ -516,9 +536,7 @@ static blk_status_t null_zone_mgmt(struct nullb_cmd *cmd, enum req_opf op,
null_lock_zone(dev, i);
zone = &dev->zones[i];
if (zone->cond != BLK_ZONE_COND_EMPTY) {
- spin_lock(&dev->zone_dev_lock);
null_reset_zone(dev, zone);
- spin_unlock(&dev->zone_dev_lock);
trace_nullb_zone_op(cmd, i, zone->cond);
}
null_unlock_zone(dev, i);
@@ -530,7 +548,6 @@ static blk_status_t null_zone_mgmt(struct nullb_cmd *cmd, enum req_opf op,
zone = &dev->zones[zone_no];
null_lock_zone(dev, zone_no);
- spin_lock(&dev->zone_dev_lock);
switch (op) {
case REQ_OP_ZONE_RESET:
@@ -550,8 +567,6 @@ static blk_status_t null_zone_mgmt(struct nullb_cmd *cmd, enum req_opf op,
break;
}
- spin_unlock(&dev->zone_dev_lock);
-
if (ret == BLK_STS_OK)
trace_nullb_zone_op(cmd, zone_no, zone->cond);
Hi,
Please backport 44492e70adc8 to 5.4 stable.
The commit fixes broken WiFi for many users.
Currently only stable 5.4 misses this patch; older kernels don't have rtw88.
Kai-Heng
From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota(a)intel.com>
Commit 5ce6861d36ed5207aff9e5eead4c7cc38a986586 upstream.
This backport targets stable version 5.4, since the original patch fails
to apply there, due to a variable having moved from one struct to another.
The only change required for the patch to apply to 5.4 is to use the
correct structure:
- (engine->gt->info.vdbox_sfc_access &
++ (RUNTIME_INFO(i915)->vdbox_sfc_access &
Original commit message below.
SFC capability of video engines is not set correctly because i915
is testing for incorrect bits.
Fixes: c5d3e39caa45 ("drm/i915: Engine discovery query")
Cc: Matt Roper <matthew.d.roper(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota(a)intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio(a)intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.3+
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106011842.36203-1-daniel…
(cherry picked from commit ad18fa0f5f052046cad96fee762b5c64f42dd86a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
---
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4ce8626b140e..8073758d1036 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -354,7 +354,8 @@ static void __setup_engine_capabilities(struct intel_engine_cs *engine)
* instances.
*/
if ((INTEL_GEN(i915) >= 11 &&
- RUNTIME_INFO(i915)->vdbox_sfc_access & engine->mask) ||
+ (RUNTIME_INFO(i915)->vdbox_sfc_access &
+ BIT(engine->instance))) ||
(INTEL_GEN(i915) >= 9 && engine->instance == 0))
engine->uabi_capabilities |=
I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC;
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio: accel: kxcjk1013: Add support for KIOX010A ACPI DSM for setting
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From e5b1032a656e9aa4c7a4df77cb9156a2a651a5f9 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Tue, 10 Nov 2020 14:38:35 +0100
Subject: iio: accel: kxcjk1013: Add support for KIOX010A ACPI DSM for setting
tablet-mode
Some 360 degree hinges (yoga) style 2-in-1 devices use 2 KXCJ91008-s
to allow the OS to determine the angle between the display and the base
of the device, so that the OS can determine if the 2-in-1 is in laptop
or in tablet-mode.
On Windows both accelerometers are read by a special HingeAngleService
process; and this process calls a DSM (Device Specific Method) on the
ACPI KIOX010A device node for the sensor in the display, to let the
embedded-controller (EC) know about the mode so that it can disable the
kbd and touchpad to avoid spurious input while folded into tablet-mode.
This notifying of the EC is problematic because sometimes the EC comes up
thinking that the device is in tablet-mode and the kbd and touchpad do not
work. This happens for example on Irbis NB111 devices after a suspend /
resume cycle (after a complete battery drain / hard reset without having
booted Windows at least once). Other 2-in-1s which are likely affected
too are e.g. the Teclast F5 and F6 series.
The kxcjk-1013 driver may seem like a strange place to deal with this,
but since it is *the* driver for the ACPI KIOX010A device, it is also
the driver which has access to the ACPI handle needed by the DSM.
Add support for calling the DSM and, on probe, unconditionally tell the
EC that the device is in laptop mode, fixing the kbd and touchpad sometimes
not working.
Fixes: 7f6232e69539 ("iio: accel: kxcjk1013: Add KIOX010A ACPI Hardware-ID")
Reported-and-tested-by: russianneuromancer <russianneuromancer(a)ya.ru>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Cc: <Stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201110133835.129080-3-hdegoede@redhat.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/accel/kxcjk-1013.c | 36 ++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/drivers/iio/accel/kxcjk-1013.c b/drivers/iio/accel/kxcjk-1013.c
index abeb0d254046..560a3373ff20 100644
--- a/drivers/iio/accel/kxcjk-1013.c
+++ b/drivers/iio/accel/kxcjk-1013.c
@@ -129,6 +129,7 @@ enum kx_chipset {
enum kx_acpi_type {
ACPI_GENERIC,
ACPI_SMO8500,
+ ACPI_KIOX010A,
};
struct kxcjk1013_data {
@@ -275,6 +276,32 @@ static const struct {
{19163, 1, 0},
{38326, 0, 1} };
+#ifdef CONFIG_ACPI
+enum kiox010a_fn_index {
+ KIOX010A_SET_LAPTOP_MODE = 1,
+ KIOX010A_SET_TABLET_MODE = 2,
+};
+
+static int kiox010a_dsm(struct device *dev, int fn_index)
+{
+ acpi_handle handle = ACPI_HANDLE(dev);
+ guid_t kiox010a_dsm_guid;
+ union acpi_object *obj;
+
+ if (!handle)
+ return -ENODEV;
+
+ guid_parse("1f339696-d475-4e26-8cad-2e9f8e6d7a91", &kiox010a_dsm_guid);
+
+ obj = acpi_evaluate_dsm(handle, &kiox010a_dsm_guid, 1, fn_index, NULL);
+ if (!obj)
+ return -EIO;
+
+ ACPI_FREE(obj);
+ return 0;
+}
+#endif
+
static int kxcjk1013_set_mode(struct kxcjk1013_data *data,
enum kxcjk1013_mode mode)
{
@@ -352,6 +379,13 @@ static int kxcjk1013_chip_init(struct kxcjk1013_data *data)
{
int ret;
+#ifdef CONFIG_ACPI
+ if (data->acpi_type == ACPI_KIOX010A) {
+ /* Make sure the kbd and touchpad on 2-in-1s using 2 KXCJ91008-s work */
+ kiox010a_dsm(&data->client->dev, KIOX010A_SET_LAPTOP_MODE);
+ }
+#endif
+
ret = i2c_smbus_read_byte_data(data->client, KXCJK1013_REG_WHO_AM_I);
if (ret < 0) {
dev_err(&data->client->dev, "Error reading who_am_i\n");
@@ -1262,6 +1296,8 @@ static const char *kxcjk1013_match_acpi_device(struct device *dev,
if (strcmp(id->id, "SMO8500") == 0)
*acpi_type = ACPI_SMO8500;
+ else if (strcmp(id->id, "KIOX010A") == 0)
+ *acpi_type = ACPI_KIOX010A;
*chipset = (enum kx_chipset)id->driver_data;
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio: accel: kxcjk1013: Replace is_smo8500_device with an acpi_type
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 11e94f28c3de35d5ad1ac6a242a5b30f4378991a Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Tue, 10 Nov 2020 14:38:34 +0100
Subject: iio: accel: kxcjk1013: Replace is_smo8500_device with an acpi_type
enum
Replace the boolean is_smo8500_device variable with an acpi_type enum.
For now this can be either ACPI_GENERIC or ACPI_SMO8500. This is a
preparation patch for adding special handling for the KIOX010A ACPI HID,
which will add an ACPI_KIOX010A acpi_type to the introduced enum.
Needed in stable as a precursor for the next patch.
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Fixes: 7f6232e69539 ("iio: accel: kxcjk1013: Add KIOX010A ACPI Hardware-ID")
Cc: <Stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201110133835.129080-2-hdegoede@redhat.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/accel/kxcjk-1013.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/iio/accel/kxcjk-1013.c b/drivers/iio/accel/kxcjk-1013.c
index beb38d9d607d..abeb0d254046 100644
--- a/drivers/iio/accel/kxcjk-1013.c
+++ b/drivers/iio/accel/kxcjk-1013.c
@@ -126,6 +126,11 @@ enum kx_chipset {
KX_MAX_CHIPS /* this must be last */
};
+enum kx_acpi_type {
+ ACPI_GENERIC,
+ ACPI_SMO8500,
+};
+
struct kxcjk1013_data {
struct i2c_client *client;
struct iio_trigger *dready_trig;
@@ -143,7 +148,7 @@ struct kxcjk1013_data {
bool motion_trigger_on;
int64_t timestamp;
enum kx_chipset chipset;
- bool is_smo8500_device;
+ enum kx_acpi_type acpi_type;
};
enum kxcjk1013_axis {
@@ -1247,7 +1252,7 @@ static irqreturn_t kxcjk1013_data_rdy_trig_poll(int irq, void *private)
static const char *kxcjk1013_match_acpi_device(struct device *dev,
enum kx_chipset *chipset,
- bool *is_smo8500_device)
+ enum kx_acpi_type *acpi_type)
{
const struct acpi_device_id *id;
@@ -1256,7 +1261,7 @@ static const char *kxcjk1013_match_acpi_device(struct device *dev,
return NULL;
if (strcmp(id->id, "SMO8500") == 0)
- *is_smo8500_device = true;
+ *acpi_type = ACPI_SMO8500;
*chipset = (enum kx_chipset)id->driver_data;
@@ -1299,7 +1304,7 @@ static int kxcjk1013_probe(struct i2c_client *client,
} else if (ACPI_HANDLE(&client->dev)) {
name = kxcjk1013_match_acpi_device(&client->dev,
&data->chipset,
- &data->is_smo8500_device);
+ &data->acpi_type);
} else
return -ENODEV;
@@ -1316,7 +1321,7 @@ static int kxcjk1013_probe(struct i2c_client *client,
indio_dev->modes = INDIO_DIRECT_MODE;
indio_dev->info = &kxcjk1013_info;
- if (client->irq > 0 && !data->is_smo8500_device) {
+ if (client->irq > 0 && data->acpi_type != ACPI_SMO8500) {
ret = devm_request_threaded_irq(&client->dev, client->irq,
kxcjk1013_data_rdy_trig_poll,
kxcjk1013_event_handler,
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio: light: fix kconfig dependency bug for VCNL4035
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 44a146a44f656fc03d368c1b9248d29a128cd053 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
Date: Tue, 3 Nov 2020 01:35:24 +0300
Subject: iio: light: fix kconfig dependency bug for VCNL4035
When VCNL4035 is enabled and IIO_BUFFER is disabled, it results in the
following Kbuild warning:
WARNING: unmet direct dependencies detected for IIO_TRIGGERED_BUFFER
Depends on [n]: IIO [=y] && IIO_BUFFER [=n]
Selected by [y]:
- VCNL4035 [=y] && IIO [=y] && I2C [=y]
The reason is that VCNL4035 selects IIO_TRIGGERED_BUFFER without depending
on or selecting IIO_BUFFER while IIO_TRIGGERED_BUFFER depends on
IIO_BUFFER. This can also cause the kernel build to fail.
Honor the kconfig dependency to remove unmet direct dependency warnings
and avoid any potential build failures.
Fixes: 55707294c4eb ("iio: light: Add support for vishay vcnl4035")
Signed-off-by: Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=209883
Link: https://lore.kernel.org/r/20201102223523.572461-1-fazilyildiran@gmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/light/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/iio/light/Kconfig b/drivers/iio/light/Kconfig
index cade6dc0305b..33ad4dd0b5c7 100644
--- a/drivers/iio/light/Kconfig
+++ b/drivers/iio/light/Kconfig
@@ -544,6 +544,7 @@ config VCNL4000
config VCNL4035
tristate "VCNL4035 combined ALS and proximity sensor"
+ select IIO_BUFFER
select IIO_TRIGGERED_BUFFER
select REGMAP_I2C
depends on I2C
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio/adc: ingenic: Fix AUX/VBAT readings when touchscreen is used
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 6d6aa2907d59ddd3c0ebb2b93e1ddc84e474485b Mon Sep 17 00:00:00 2001
From: Paul Cercueil <paul(a)crapouillou.net>
Date: Tue, 3 Nov 2020 20:12:38 +0000
Subject: iio/adc: ingenic: Fix AUX/VBAT readings when touchscreen is used
When the command feature of the ADC is used, it is possible to program
the ADC, and specify at each step what input should be processed, and in
comparison to what reference.
This broke the AUX and battery readings when the touchscreen was
enabled, most likely because the CMD feature would change the VREF all
the time.
Now, when AUX or battery are read, we temporarily disable the CMD
feature, which means that we won't get touchscreen readings in that time
frame. But it now gives correct values for AUX / battery, and the
touchscreen isn't disabled for long enough to be an actual issue.
Fixes: b96952f498db ("IIO: Ingenic JZ47xx: Add touchscreen mode.")
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
Acked-by: Artur Rojek <contact(a)artur-rojek.eu>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201103201238.161083-1-paul@crapouillou.net
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/ingenic-adc.c | 32 ++++++++++++++++++++++++++------
1 file changed, 26 insertions(+), 6 deletions(-)
diff --git a/drivers/iio/adc/ingenic-adc.c b/drivers/iio/adc/ingenic-adc.c
index 973e84deebea..1aafbe2cfe67 100644
--- a/drivers/iio/adc/ingenic-adc.c
+++ b/drivers/iio/adc/ingenic-adc.c
@@ -177,13 +177,12 @@ static void ingenic_adc_set_config(struct ingenic_adc *adc,
mutex_unlock(&adc->lock);
}
-static void ingenic_adc_enable(struct ingenic_adc *adc,
- int engine,
- bool enabled)
+static void ingenic_adc_enable_unlocked(struct ingenic_adc *adc,
+ int engine,
+ bool enabled)
{
u8 val;
- mutex_lock(&adc->lock);
val = readb(adc->base + JZ_ADC_REG_ENABLE);
if (enabled)
@@ -192,20 +191,41 @@ static void ingenic_adc_enable(struct ingenic_adc *adc,
val &= ~BIT(engine);
writeb(val, adc->base + JZ_ADC_REG_ENABLE);
+}
+
+static void ingenic_adc_enable(struct ingenic_adc *adc,
+ int engine,
+ bool enabled)
+{
+ mutex_lock(&adc->lock);
+ ingenic_adc_enable_unlocked(adc, engine, enabled);
mutex_unlock(&adc->lock);
}
static int ingenic_adc_capture(struct ingenic_adc *adc,
int engine)
{
+ u32 cfg;
u8 val;
int ret;
- ingenic_adc_enable(adc, engine, true);
+ /*
+ * Disable CMD_SEL temporarily, because it causes wrong VBAT readings,
+ * probably due to the switch of VREF. We must keep the lock here to
+ * avoid races with the buffer enable/disable functions.
+ */
+ mutex_lock(&adc->lock);
+ cfg = readl(adc->base + JZ_ADC_REG_CFG);
+ writel(cfg & ~JZ_ADC_REG_CFG_CMD_SEL, adc->base + JZ_ADC_REG_CFG);
+
+ ingenic_adc_enable_unlocked(adc, engine, true);
ret = readb_poll_timeout(adc->base + JZ_ADC_REG_ENABLE, val,
!(val & BIT(engine)), 250, 1000);
if (ret)
- ingenic_adc_enable(adc, engine, false);
+ ingenic_adc_enable_unlocked(adc, engine, false);
+
+ writel(cfg, adc->base + JZ_ADC_REG_CFG);
+ mutex_unlock(&adc->lock);
return ret;
}
--
2.29.2
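The save, clear, and restore sequence that the ingenic_adc_capture() change above applies to the CFG register can be sketched in plain C roughly as follows. This is only an illustration, not the driver code: `fake_cfg_reg`, `cfg_seen_during_capture`, `capture_once()` and the bit position chosen for `CFG_CMD_SEL` are hypothetical, and the real driver performs these steps with readl()/writel() while holding adc->lock.

```c
#include <stdint.h>

/* Hypothetical bit position, chosen for illustration only. */
#define CFG_CMD_SEL (1u << 22)

/* Stands in for the memory-mapped CFG register. */
static uint32_t fake_cfg_reg;
/* Records what the "hardware" would see while a conversion runs. */
static uint32_t cfg_seen_during_capture;

static void capture_once(void)
{
	uint32_t cfg = fake_cfg_reg;            /* save the current value */

	fake_cfg_reg = cfg & ~CFG_CMD_SEL;      /* CMD feature off for the capture */
	cfg_seen_during_capture = fake_cfg_reg; /* the conversion would run here */
	fake_cfg_reg = cfg;                     /* restore exactly what was there */
}
```

Restoring the saved value, rather than unconditionally setting the bit again, keeps the register unchanged for callers that never had the CMD feature enabled.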
This is a note to let you know that I've just added the patch titled
iio: imu: st_lsm6dsx: set 10ms as min shub slave timeout
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From fe0b980ffd1dd8b10c09f82385514819ba2a661d Mon Sep 17 00:00:00 2001
From: Lorenzo Bianconi <lorenzo(a)kernel.org>
Date: Sun, 1 Nov 2020 17:21:18 +0100
Subject: iio: imu: st_lsm6dsx: set 10ms as min shub slave timeout
Set 10ms as the minimum i2c slave configuration timeout, since st_lsm6dsx
relies on the accel ODR for the i2c master clock. At high sample rates
(e.g. 833Hz or 416Hz) the slave sensor may occasionally need more cycles
than the i2c master timeout (2s/833Hz + 1 ~ 3ms) to apply the configuration,
resulting in an incomplete slave configuration and a constant reading
from the i2c slave connected to the st_lsm6dsx i2c master.
Fixes: 8f9a5249e3d9 ("iio: imu: st_lsm6dsx: enable 833Hz sample frequency for tagged sensors")
Fixes: c91c1c844ebd ("iio: imu: st_lsm6dsx: add i2c embedded controller support")
Signed-off-by: Lorenzo Bianconi <lorenzo(a)kernel.org>
Cc: <Stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/a69c8236bf16a1569966815ed71710af2722ed7d.16042472…
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c
index 8c8d8870ca07..99562ba85ee4 100644
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c
@@ -156,11 +156,13 @@ static const struct st_lsm6dsx_ext_dev_settings st_lsm6dsx_ext_dev_table[] = {
static void st_lsm6dsx_shub_wait_complete(struct st_lsm6dsx_hw *hw)
{
struct st_lsm6dsx_sensor *sensor;
- u32 odr;
+ u32 odr, timeout;
sensor = iio_priv(hw->iio_devs[ST_LSM6DSX_ID_ACC]);
odr = (hw->enable_mask & BIT(ST_LSM6DSX_ID_ACC)) ? sensor->odr : 12500;
- msleep((2000000U / odr) + 1);
+ /* set 10ms as minimum timeout for i2c slave configuration */
+ timeout = max_t(u32, 2000000U / odr + 1, 10);
+ msleep(timeout);
}
/*
--
2.29.2
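The effect of the 10ms floor in the st_lsm6dsx patch above can be checked with a small standalone sketch. It assumes that the ODR is expressed in mHz, which is what the 12500 default and the "~ 3ms" figure in the commit message suggest; `shub_wait_ms()` is a hypothetical name, not a driver function.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the patched wait computation.
 * odr_mhz: accel output data rate in mHz (assumed unit).
 * Returns the sleep time in milliseconds.
 */
static uint32_t shub_wait_ms(uint32_t odr_mhz)
{
	/* Two ODR periods, rounded down to whole ms, plus one ms. */
	uint32_t t = 2000000U / odr_mhz + 1;

	/* Never wait less than 10 ms. */
	return t < 10 ? 10 : t;
}
```

Under that assumption, 833Hz and 416Hz previously yielded 3ms and 5ms and are now both raised to 10ms, while the 12.5Hz default keeps its 161ms wait.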
This is a note to let you know that I've just added the patch titled
iio/adc: ingenic: Fix battery VREF for JZ4770 SoC
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From c91ebcc578e09783cfa4d85c1b437790f140f29a Mon Sep 17 00:00:00 2001
From: Paul Cercueil <paul(a)crapouillou.net>
Date: Wed, 4 Nov 2020 19:28:43 +0000
Subject: iio/adc: ingenic: Fix battery VREF for JZ4770 SoC
The reference voltage for the battery is clearly marked as 1.2V in the
programming manual. With this fixed, the battery channel now returns
correct values.
Fixes: a515d6488505 ("IIO: Ingenic JZ47xx: Add support for JZ4770 SoC ADC.")
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
Acked-by: Artur Rojek <contact(a)artur-rojek.eu>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201104192843.67187-1-paul@crapouillou.net
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/ingenic-adc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/adc/ingenic-adc.c b/drivers/iio/adc/ingenic-adc.c
index 92b25083e23f..973e84deebea 100644
--- a/drivers/iio/adc/ingenic-adc.c
+++ b/drivers/iio/adc/ingenic-adc.c
@@ -71,7 +71,7 @@
#define JZ4725B_ADC_BATTERY_HIGH_VREF_BITS 10
#define JZ4740_ADC_BATTERY_HIGH_VREF (7500 * 0.986)
#define JZ4740_ADC_BATTERY_HIGH_VREF_BITS 12
-#define JZ4770_ADC_BATTERY_VREF 6600
+#define JZ4770_ADC_BATTERY_VREF 1200
#define JZ4770_ADC_BATTERY_VREF_BITS 12
#define JZ_ADC_IRQ_AUX BIT(0)
--
2.29.2
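To see how much the JZ4770 VREF constant above matters, the following sketch converts a raw 12-bit sample to millivolts. It assumes the driver scales readings as raw * VREF / 2^bits, which is the usual meaning of a VREF/bits pair in IIO but is not shown in this patch; `raw_to_mv()` is a hypothetical helper.

```c
#include <stdint.h>

#define BATTERY_VREF_BITS 12

/* Hypothetical conversion: raw ADC code to millivolts for a given VREF. */
static uint32_t raw_to_mv(uint32_t raw, uint32_t vref_mv)
{
	/* 64-bit intermediate avoids overflow for large raw * vref products. */
	return (uint32_t)(((uint64_t)raw * vref_mv) >> BATTERY_VREF_BITS);
}
```

With a mid-scale code of 2048, the old 6600 constant reports 3300mV where the corrected 1200 constant reports 600mV, a factor of 5.5 between the two.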
This is a note to let you know that I've just added the patch titled
iio: adc: mediatek: fix unset field
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 15207a92e019803d62687455d8aa2ff9eb3dc82c Mon Sep 17 00:00:00 2001
From: Fabien Parent <fparent(a)baylibre.com>
Date: Sun, 18 Oct 2020 21:46:44 +0200
Subject: iio: adc: mediatek: fix unset field
The dev_comp field is used in a couple of places but it is never set. This
results in a kernel oops when dereferencing a NULL pointer. Set the
`dev_comp` field correctly in the probe function.
Fixes: 6d97024dce23 ("iio: adc: mediatek: mt6577-auxadc, add mt6765 support")
Signed-off-by: Fabien Parent <fparent(a)baylibre.com>
Reviewed-by: Matthias Brugger <matthias.bgg(a)gmail.com>
Cc: <Stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201018194644.3366846-1-fparent@baylibre.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/mt6577_auxadc.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/adc/mt6577_auxadc.c b/drivers/iio/adc/mt6577_auxadc.c
index ac415cb089cd..79c1dd68b909 100644
--- a/drivers/iio/adc/mt6577_auxadc.c
+++ b/drivers/iio/adc/mt6577_auxadc.c
@@ -9,9 +9,9 @@
#include <linux/err.h>
#include <linux/kernel.h>
#include <linux/module.h>
-#include <linux/of.h>
-#include <linux/of_device.h>
+#include <linux/mod_devicetable.h>
#include <linux/platform_device.h>
+#include <linux/property.h>
#include <linux/iopoll.h>
#include <linux/io.h>
#include <linux/iio/iio.h>
@@ -276,6 +276,8 @@ static int mt6577_auxadc_probe(struct platform_device *pdev)
goto err_disable_clk;
}
+ adc_dev->dev_comp = device_get_match_data(&pdev->dev);
+
mutex_init(&adc_dev->lock);
mt6577_auxadc_mod_reg(adc_dev->reg_base + MT6577_AUXADC_MISC,
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio: cros_ec: Use default frequencies when EC returns invalid
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 56e4f2dda23c6d39d327944faa89efaa4eb290d1 Mon Sep 17 00:00:00 2001
From: Gwendal Grignou <gwendal(a)chromium.org>
Date: Tue, 30 Jun 2020 08:37:30 -0700
Subject: iio: cros_ec: Use default frequencies when EC returns invalid
information
The minimal and maximal frequencies supported by a sensor are queried.
On some older machines, these frequencies are not returned properly and
the EC returns 0 instead.
When the returned maximal frequency is 0, ignore the information and use
default frequencies instead.
Fixes: ae7b02ad2f32 ("iio: common: cros_ec_sensors: Expose cros_ec_sensors frequency range via iio sysfs")
Signed-off-by: Gwendal Grignou <gwendal(a)chromium.org>
Reviewed-by: Enric Balletbo i Serra <enric.balletbo(a)collabora.com>
Link: https://lore.kernel.org/r/20200630153730.3302889-1-gwendal@chromium.org
CC: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
.../cros_ec_sensors/cros_ec_sensors_core.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
index c62cacc04672..e3f507771f17 100644
--- a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
+++ b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
@@ -256,7 +256,7 @@ int cros_ec_sensors_core_init(struct platform_device *pdev,
struct cros_ec_sensorhub *sensor_hub = dev_get_drvdata(dev->parent);
struct cros_ec_dev *ec = sensor_hub->ec;
struct cros_ec_sensor_platform *sensor_platform = dev_get_platdata(dev);
- u32 ver_mask;
+ u32 ver_mask, temp;
int frequencies[ARRAY_SIZE(state->frequencies) / 2] = { 0 };
int ret, i;
@@ -311,10 +311,16 @@ int cros_ec_sensors_core_init(struct platform_device *pdev,
&frequencies[2],
&state->fifo_max_event_count);
} else {
- frequencies[1] = state->resp->info_3.min_frequency;
- frequencies[2] = state->resp->info_3.max_frequency;
- state->fifo_max_event_count =
- state->resp->info_3.fifo_max_event_count;
+ if (state->resp->info_3.max_frequency == 0) {
+ get_default_min_max_freq(state->resp->info.type,
+ &frequencies[1],
+ &frequencies[2],
+ &temp);
+ } else {
+ frequencies[1] = state->resp->info_3.min_frequency;
+ frequencies[2] = state->resp->info_3.max_frequency;
+ }
+ state->fifo_max_event_count = state->resp->info_3.fifo_max_event_count;
}
for (i = 0; i < ARRAY_SIZE(frequencies); i++) {
state->frequencies[2 * i] = frequencies[i] / 1000;
--
2.29.2
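The fallback rule in the cros_ec patch above reduces to a single check on the reported maximum. A minimal sketch, assuming a simple pair of frequencies; `struct freq_range` and `pick_range()` are hypothetical and the default values would come from the driver's per-sensor-type table, which is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical container for a sensor's frequency range. */
struct freq_range {
	uint32_t min;
	uint32_t max;
};

/*
 * Use the EC-reported range unless its maximum is 0, which older ECs
 * return when they do not fill in the field; then use the defaults.
 */
static struct freq_range pick_range(struct freq_range reported,
				    struct freq_range defaults)
{
	if (reported.max == 0)
		return defaults;
	return reported;
}
```

Only the maximum is tested, matching the patch: a reported minimum of 0 with a non-zero maximum is still taken at face value.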
From: Alexander Sverdlin <alexander.sverdlin(a)nokia.com>
Linux doesn't own the memory immediately after the kernel image. On Octeon,
the bootloader places a shared structure right after the kernel _end;
refer to "struct cvmx_bootinfo *octeon_bootinfo" in cavium-octeon/setup.c.
If check_kernel_sections_mem() rounds the PFNs up, the first memblock_alloc()
inside early_init_dt_alloc_memory_arch() <= device_tree_init() returns a
memory block overlapping the above octeon_bootinfo structure, which
is then overwritten.
Cc: stable(a)vger.kernel.org
Fixes: a94e4f24ec83 ("MIPS: init: Drop boot_mem_map")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin(a)nokia.com>
---
arch/mips/kernel/setup.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 0d42532..f6cf2f6 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -504,6 +504,12 @@ static void __init check_kernel_sections_mem(void)
if (!memblock_is_region_memory(start, size)) {
pr_info("Kernel sections are not in the memory maps\n");
memblock_add(start, size);
+ /*
+ * Octeon bootloader places shared data structure right after
+ * the kernel => make sure it will not be corrupted.
+ */
+ memblock_reserve(__pa_symbol(&_end),
+ start + size - __pa_symbol(&_end));
}
}
--
2.10.2
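The range passed to memblock_reserve() in the MIPS patch above is the gap between the real end of the kernel image and the end of the page-rounded region that was added. A small sketch of that arithmetic, with made-up addresses; `struct range` and `tail_to_reserve()` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical physical address range. */
struct range {
	uint64_t base;
	uint64_t size;
};

/*
 * start/size: the region added to memblock (assumed page-rounded).
 * end_pa:     physical address of the kernel's _end symbol.
 * Returns the tail [end_pa, start + size) that must stay reserved.
 */
static struct range tail_to_reserve(uint64_t start, uint64_t size,
				    uint64_t end_pa)
{
	struct range r = { end_pa, start + size - end_pa };

	return r;
}
```

If _end happens to be page aligned, the computed size is 0 and nothing beyond the kernel image is covered.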
Currently scan_microcode() leverages microcode_matches() to check whether the
microcode matches the CPU by comparing the family and model. However, before
saving the microcode in scan_microcode(), the processor stepping and flags
of the microcode signature should also be considered, in order to avoid
saving an incompatible update and causing the microcode update to fail.
For example, on one platform the microcode failed to be updated to the
latest revision on APs during resume from S3 due to an incompatible CPU
stepping and signature->pf. This is because scan_microcode() saved an
incompatible copy of intel_ucode_patch in
save_microcode_in_initrd_intel() after bootup. This intel_ucode_patch
is then used by APs during early resume from S3, which results in an
unchecked MSR access error during resume from S3:
[ 95.519390] unchecked MSR access error: RDMSR from 0x123 at
rIP: 0xffffffffb7676208 (native_read_msr+0x8/0x40)
[ 95.519391] Call Trace:
[ 95.519395] update_srbds_msr+0x38/0x80
[ 95.519396] identify_secondary_cpu+0x7a/0x90
[ 95.519397] smp_store_cpu_info+0x4e/0x60
[ 95.519398] start_secondary+0x49/0x150
[ 95.519399] secondary_startup_64_no_verify+0xa6/0xab
The system keeps running on old microcode during resume:
[ 210.366757] microcode: load_ucode_intel_ap: CPU1, enter, intel_ucode_patch: 0xffff9bf2816e0000
[ 210.366757] microcode: load_ucode_intel_ap: CPU1, p: 0xffff9bf2816e0000, rev: 0xd6
[ 210.366759] microcode: apply_microcode_early: rev: 0x84
[ 210.367826] microcode: apply_microcode_early: rev after upgrade: 0x84
until mc_cpu_starting() is invoked on each AP during resume and the
correct microcode is updated via apply_microcode_intel().
To fix this issue, make scan_microcode() use find_matching_signature()
instead of microcode_matches() to compare the (family, model, stepping,
processor flag) tuple, and only save the microcode that matches. As there is
no other place invoking microcode_matches(), remove it accordingly.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208535
Fixes: 06b8534cb728 ("x86/microcode: Rework microcode loading")
Cc: stable(a)vger.kernel.org#v4.10+
Reviewed-by: Ashok Raj <ashok.raj(a)intel.com>
Signed-off-by: Chen Yu <yu.c.chen(a)intel.com>
---
v2: Remove RFC tag and Cc the stable mailing list.
---
arch/x86/kernel/cpu/microcode/intel.c | 50 ++-------------------------
1 file changed, 2 insertions(+), 48 deletions(-)
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 6a99535d7f37..923853f79099 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -100,53 +100,6 @@ static int has_newer_microcode(void *mc, unsigned int csig, int cpf, int new_rev
return find_matching_signature(mc, csig, cpf);
}
-/*
- * Given CPU signature and a microcode patch, this function finds if the
- * microcode patch has matching family and model with the CPU.
- *
- * %true - if there's a match
- * %false - otherwise
- */
-static bool microcode_matches(struct microcode_header_intel *mc_header,
- unsigned long sig)
-{
- unsigned long total_size = get_totalsize(mc_header);
- unsigned long data_size = get_datasize(mc_header);
- struct extended_sigtable *ext_header;
- unsigned int fam_ucode, model_ucode;
- struct extended_signature *ext_sig;
- unsigned int fam, model;
- int ext_sigcount, i;
-
- fam = x86_family(sig);
- model = x86_model(sig);
-
- fam_ucode = x86_family(mc_header->sig);
- model_ucode = x86_model(mc_header->sig);
-
- if (fam == fam_ucode && model == model_ucode)
- return true;
-
- /* Look for ext. headers: */
- if (total_size <= data_size + MC_HEADER_SIZE)
- return false;
-
- ext_header = (void *) mc_header + data_size + MC_HEADER_SIZE;
- ext_sig = (void *)ext_header + EXT_HEADER_SIZE;
- ext_sigcount = ext_header->count;
-
- for (i = 0; i < ext_sigcount; i++) {
- fam_ucode = x86_family(ext_sig->sig);
- model_ucode = x86_model(ext_sig->sig);
-
- if (fam == fam_ucode && model == model_ucode)
- return true;
-
- ext_sig++;
- }
- return false;
-}
-
static struct ucode_patch *memdup_patch(void *data, unsigned int size)
{
struct ucode_patch *p;
@@ -344,7 +297,8 @@ scan_microcode(void *data, size_t size, struct ucode_cpu_info *uci, bool save)
size -= mc_size;
- if (!microcode_matches(mc_header, uci->cpu_sig.sig)) {
+ if (!find_matching_signature(data, uci->cpu_sig.sig,
+ uci->cpu_sig.pf)) {
data += mc_size;
continue;
}
--
2.17.1
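The difference between the removed family/model check and a full signature check can be illustrated in userspace. The sketch below decodes family and model from a CPUID signature in the standard way and compares it with a stricter test on the whole signature plus processor flags. The function names are hypothetical, and `full_match()` is only modeled on the behaviour described in the commit message; it ignores the extended signature table that the real find_matching_signature() also walks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode the display family from a CPUID(1).EAX style signature. */
static unsigned int sig_family(uint32_t sig)
{
	unsigned int fam = (sig >> 8) & 0xf;

	if (fam == 0xf)
		fam += (sig >> 20) & 0xff;
	return fam;
}

/* Decode the display model, including the extended model bits. */
static unsigned int sig_model(uint32_t sig)
{
	unsigned int model = (sig >> 4) & 0xf;

	if (sig_family(sig) >= 0x6)
		model += ((sig >> 16) & 0xf) << 4;
	return model;
}

/* Old-style check: stepping and processor flags are ignored. */
static bool family_model_match(uint32_t cpu_sig, uint32_t ucode_sig)
{
	return sig_family(cpu_sig) == sig_family(ucode_sig) &&
	       sig_model(cpu_sig) == sig_model(ucode_sig);
}

/* Stricter check: identical signature and overlapping processor flags. */
static bool full_match(uint32_t cpu_sig, uint32_t cpu_pf,
		       uint32_t ucode_sig, uint32_t ucode_pf)
{
	if (cpu_sig != ucode_sig)
		return false;
	if (!cpu_pf && !ucode_pf)
		return true;
	return (cpu_pf & ucode_pf) != 0;
}
```

Two signatures that differ only in stepping, such as 0x806ea and 0x806e9, pass the family/model check but fail the full one, which is the case the patch is meant to reject.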
Write buffers use a kmalloc()'ed buffer, so they can leak
up to seven bytes of kernel memory to flash if writes are not
aligned.
So use ubifs_pad() to fill these gaps with padding bytes.
This was never a problem while scanning because the scanner logic
manually aligns node lengths and skips over these gaps.
Cc: <stable(a)vger.kernel.org>
Fixes: 1e51764a3c2ac05a2 ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard(a)nod.at>
---
fs/ubifs/io.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/fs/ubifs/io.c b/fs/ubifs/io.c
index 7e4bfaf2871f..eae9cf5a57b0 100644
--- a/fs/ubifs/io.c
+++ b/fs/ubifs/io.c
@@ -319,7 +319,7 @@ void ubifs_pad(const struct ubifs_info *c, void *buf, int pad)
{
uint32_t crc;
- ubifs_assert(c, pad >= 0 && !(pad & 7));
+ ubifs_assert(c, pad >= 0);
if (pad >= UBIFS_PAD_NODE_SZ) {
struct ubifs_ch *ch = buf;
@@ -764,6 +764,10 @@ int ubifs_wbuf_write_nolock(struct ubifs_wbuf *wbuf, void *buf, int len)
* write-buffer.
*/
memcpy(wbuf->buf + wbuf->used, buf, len);
+ if (aligned_len > len) {
+ ubifs_assert(c, aligned_len - len < 8);
+ ubifs_pad(c, wbuf->buf + wbuf->used + len, aligned_len - len);
+ }
if (aligned_len == wbuf->avail) {
dbg_io("flush jhead %s wbuf to LEB %d:%d",
@@ -856,13 +860,18 @@ int ubifs_wbuf_write_nolock(struct ubifs_wbuf *wbuf, void *buf, int len)
}
spin_lock(&wbuf->lock);
- if (aligned_len)
+ if (aligned_len) {
/*
* And now we have what's left and what does not take whole
* max. write unit, so write it to the write-buffer and we are
* done.
*/
memcpy(wbuf->buf, buf + written, len);
+ if (aligned_len > len) {
+ ubifs_assert(c, aligned_len - len < 8);
+ ubifs_pad(c, wbuf->buf + len, aligned_len - len);
+ }
+ }
if (c->leb_size - wbuf->offs >= c->max_write_size)
wbuf->size = c->max_write_size;
--
2.26.2
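The gap that the UBIFS patch above fills can be reproduced with an ordinary buffer. In this sketch the destination starts out full of stale bytes, and the copy helper overwrites the bytes between len and the next 8-byte boundary with a pad value. `copy_padded()` and `align8()` are hypothetical helpers, and `PAD_BYTE` is assumed to match the value UBIFS uses for padding; the real ubifs_pad() also handles larger gaps with a padding node, which is not modeled here.

```c
#include <stdint.h>
#include <string.h>

/* Assumed to match the UBIFS padding byte; treat as illustrative. */
#define PAD_BYTE 0xCE

/* Round n up to the next multiple of 8. */
static size_t align8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/*
 * Copy len bytes into dst and fill the gap up to the next 8-byte
 * boundary with PAD_BYTE, so no stale buffer contents remain there.
 * Returns the aligned length. dst must have room for align8(len) bytes.
 */
static size_t copy_padded(uint8_t *dst, const uint8_t *src, size_t len)
{
	size_t aligned = align8(len);

	memcpy(dst, src, len);
	if (aligned > len)
		memset(dst + len, PAD_BYTE, aligned - len);
	return aligned;
}
```

Without the memset(), the up to seven bytes after the payload would keep whatever the allocator left in the buffer and be written out along with the data.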
From: Siarhei Liakh <siarhei.liakh(a)concurrent-rt.com>
TL;DR:
There are two places in unlz4() function where reads beyond the end of a buffer
might happen under certain conditions which had been observed in real life on
stock Ubuntu 20.04 x86_64 with several vanilla mainline kernels, including 5.10.
As a result of this issue, the kernel fails to decompress LZ4-compressed
initramfs with following message showing up in the logs:
initramfs unpacking failed: Decoding failed
Note that in most cases the affected system is still able to proceed with the
boot process to completion.
LONG STORY:
Background.
Not so long ago we've noticed that some of our Ubuntu 20.04 x86_64 test systems
often fail to boot newly generated initramfs image. After extensive
investigation we determined that a failure required the following combination
for our 5.4.66-rt38 kernel with some additional custom patches:
Real x86_64 hardware or QEMU
UEFI boot
Ubuntu 20.04 (or 20.04.1) x86_64
CONFIG_BLK_DEV_RAM=y in .config
COMPRESS=lz4 in initramfs.conf
Freshly compiled and installed kernel
Freshly generated and installed initramfs image
In our testing, such a combination would often produce a non-bootable system. It
is important to note that [un]bootability of the system was later tracked down
to particular instances of initramfs images, and would follow them if they were
to be switched around/transferred to other systems. What is even more important
is that consecutive re-generations of initramfs images from the same source and
binary materials would yield about 75% of "bad" images. Further, once the image
is identified as "bad", it always stays "bad"; once one is "good" it always stays
"good". Reverting CONFIG_BLK_DEV_RAM to "m" (default in Ubuntu), or changing
COMPRESS to "gzip" yields a 100% bootable system. Decompressing "bad" initramfs
image with "unmkinitramfs" yields *exactly* the same set of binaries, as
verified by matching MD5 sums to those from "good" image.
Speculation.
Based on general observations, it appears that Ubuntu's userland toolchain
cannot consistently generate exactly the same compressed initramfs image, likely
due to some variations in timestamps between the runs. This causes variations in
compressed lz4 data stream. Further, either initramfs tools or lz4 libraries
appear to pad compressed lz4 output to closest 4-byte boundary. lz4 v1.9.2 that
ships with Ubuntu 20.04 appears to be able to handle such padding just fine,
while lz4 (supposedly v1.8.3) within Linux kernel cannot.
Several reports of somewhat similar behavior have recently been circulating
through different bug tracking systems and discussion forums [1-4].
I also suspect that only systems which can mount permanent root directly (or
with help of modules contained in first, supposedly uncompressed, part of
initramfs, or the ones with statically linked modules) can actually complete the
boot when LZ4 decompression fails. This would certainly explain why most of
Ubuntu systems still manage to boot even after failing to decompress the image.
The facts.
Regardless of whether Ubuntu 20.04 toolchain produces a valid lz4-compressed
initramfs image or not, current version of unlz4() function in kernel has two
code paths which had been observed attempting to read beyond the buffer end when
presented with one of the "padded"/"bad" initramfs images generated by stock
Ubuntu 20.04 toolchain. Some configurations of some 5.4 kernels are known to
fail to boot in such cases. This behavior also becomes evident on vanilla
5.10.0-rc3 and 5.10.0-rc4 kernels with addition of two logging statements for
corresponding edge cases, even though it does not prevent system from booting in
most generic configurations.
Further investigation is likely warranted to confirm whether userland toolchain
contains any bugs and/or whether any of these cases constitute violation of LZ4
and/or initramfs specification.
References
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1835660
[2] https://github.com/linuxmint/mint20-beta/issues/90
[3] https://askubuntu.com/questions/1245458/getting-the-message-0-283078-initra…
[4] https://forums.linuxmint.com/viewtopic.php?t=323152
Signed-off-by: Siarhei Liakh <siarhei.liakh(a)concurrent-rt.com>
---
Please CC: me directly on all replies.
lib/decompress_unlz4.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/lib/decompress_unlz4.c b/lib/decompress_unlz4.c
index c0cfcfd486be..a016643a6dc5 100644
--- a/lib/decompress_unlz4.c
+++ b/lib/decompress_unlz4.c
@@ -125,6 +125,21 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
continue;
}
+ if (chunksize == 0) {
+ /*
+ * Nothing to decode...
+ * FIXME: this could be an error condition due
+ * to invalid or corrupt data. However, some
+ * userspace tools had been observed producing
+ * otherwise valid initramfs images which happen
+ * to hit this condition.
+ * TODO: need to figure out whether the latest
+ * LZ4 and initramfs specifications allows for
+ * zero-sized chunks.
+ * See similar message below.
+ */
+ break;
+ }
if (posp)
*posp += 4;
@@ -179,6 +194,20 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
else if (size < 0) {
error("data corrupted");
goto exit_2;
+ } else if (size < 4) {
+ /*
+ * Ignore any undersized junk/padding...
+ * FIXME: this could be an error condition due
+ * to invalid or corrupt data. However, some
+ * userspace tools had been observed producing
+ * otherwise valid initramfs images which happen
+ * to hit this condition.
+ * TODO: need to figure out whether the latest
+ * LZ4 and initramfs specifications allows for
+ * small padding at the end of the chunk.
+ * See similar message above.
+ */
+ break;
}
inp += chunksize;
}
--
2.17.1
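The two guards added to unlz4() above can be seen in isolation in a simplified chunk walker. This sketch assumes a bare stream of chunks, each prefixed by a 4-byte little-endian length, and does not implement the actual LZ4 legacy framing or any decompression; `read_le32()` and `count_chunks()` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Walk length-prefixed chunks in in[0..len).
 * Returns the number of complete chunks, or -1 if a chunk claims more
 * bytes than remain. Trailing padding is tolerated in two forms:
 * fewer than 4 leftover bytes, or a zero-sized chunk header.
 */
static int count_chunks(const uint8_t *in, size_t len)
{
	size_t pos = 0;
	int chunks = 0;

	while (pos < len) {
		uint32_t chunksize;

		if (len - pos < 4)
			break;		/* too short to hold a length field */
		chunksize = read_le32(in + pos);
		if (chunksize == 0)
			break;		/* nothing to decode */
		pos += 4;
		if (chunksize > len - pos)
			return -1;	/* would read past the buffer end */
		pos += chunksize;
		chunks++;
	}
	return chunks;
}
```

Both break conditions are checked before any read of the length field or payload, so padded input never causes an access past the end of the buffer.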
Clang's integrated assembler produces the warning for assembly files:
warning: DWARF2 only supports one section per compilation unit
If -Wa,-gdwarf-* is unspecified, then debug info is not emitted. This
will be re-enabled for new DWARF versions in a follow up patch.
Enables defconfig+CONFIG_DEBUG_INFO to build cleanly with
LLVM=1 LLVM_IAS=1 for x86_64 and arm64.
Cc: <stable(a)vger.kernel.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/716
Reported-by: Nathan Chancellor <natechancellor(a)gmail.com>
Suggested-by: Dmitry Golovin <dima(a)golovin.in>
Suggested-by: Sedat Dilek <sedat.dilek(a)gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
---
Makefile | 2 ++
1 file changed, 2 insertions(+)
diff --git a/Makefile b/Makefile
index f353886dbf44..75b1a3dcbf30 100644
--- a/Makefile
+++ b/Makefile
@@ -826,7 +826,9 @@ else
DEBUG_CFLAGS += -g
endif
+ifndef LLVM_IAS
KBUILD_AFLAGS += -Wa,-gdwarf-2
+endif
ifdef CONFIG_DEBUG_INFO_DWARF4
DEBUG_CFLAGS += -gdwarf-4
--
2.29.1.341.ge80a0c044ae-goog
The patch titled
Subject: mm, page_frag: recover from memory pressure
has been added to the -mm tree. Its filename is
page_frag-recover-from-memory-pressure.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/page_frag-recover-from-memory-pre…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/page_frag-recover-from-memory-pre…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Dongli Zhang <dongli.zhang(a)oracle.com>
Subject: mm, page_frag: recover from memory pressure
The ethernet driver may allocate skb (and skb->data) via napi_alloc_skb().
This ends up calling page_frag_alloc() to allocate skb->data from
page_frag_cache->va.
Under memory pressure, page_frag_cache->va may be allocated as a
pfmemalloc page. As a result, skb->pfmemalloc is always true as
skb->data is from page_frag_cache->va. The skb will be dropped if the
sock (receiver) does not have SOCK_MEMALLOC. This is expected behaviour
under memory pressure.
However, once the kernel is no longer under memory pressure (suppose a
large amount of memory pages has just been reclaimed), page_frag_alloc()
may still re-use the prior pfmemalloc page_frag_cache->va to allocate
skb->data. As a result, skb->pfmemalloc is always true unless
page_frag_cache->va is re-allocated, even if the kernel is no longer
under memory pressure.
Here is how the kernel runs into the issue.
1. The kernel is under memory pressure and allocation of
PAGE_FRAG_CACHE_MAX_ORDER in __page_frag_cache_refill() will fail.
Instead, the pfmemalloc page is allocated for page_frag_cache->va.
2. All skb->data from page_frag_cache->va (pfmemalloc) will have
skb->pfmemalloc=true. The skb will always be dropped by sock without
SOCK_MEMALLOC. This is an expected behaviour.
3. Suppose a large amount of pages are reclaimed and the kernel is no
longer under memory pressure. We expect that skb->pfmemalloc drops will
no longer happen.
4. Unfortunately, page_frag_alloc() does not proactively re-allocate
page_frag_cache->va and will always re-use the prior pfmemalloc page.
skb->pfmemalloc is always true even when the kernel is no longer under
memory pressure.
Fix this by freeing and re-allocating the page instead of recycling it.
Link: https://lore.kernel.org/lkml/20201103193239.1807-1-dongli.zhang@oracle.com/
Link: https://lore.kernel.org/linux-mm/20201105042140.5253-1-willy@infradead.org/
Link: https://lkml.kernel.org/r/20201115201029.11903-1-dongli.zhang@oracle.com
Fixes: 79930f5892e ("net: do not deplete pfmemalloc reserve")
Signed-off-by: Dongli Zhang <dongli.zhang(a)oracle.com>
Suggested-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Cc: Aruna Ramakrishna <aruna.ramakrishna(a)oracle.com>
Cc: Bert Barbe <bert.barbe(a)oracle.com>
Cc: Rama Nichanamatlu <rama.nichanamatlu(a)oracle.com>
Cc: Venkat Venkatsubra <venkat.x.venkatsubra(a)oracle.com>
Cc: Manjunath Patil <manjunath.b.patil(a)oracle.com>
Cc: Joe Jin <joe.jin(a)oracle.com>
Cc: SRINIVAS <srinivas.eeda(a)oracle.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/mm/page_alloc.c~page_frag-recover-from-memory-pressure
+++ a/mm/page_alloc.c
@@ -5103,6 +5103,11 @@ refill:
if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
goto refill;
+ if (unlikely(nc->pfmemalloc)) {
+ free_the_page(page, compound_order(page));
+ goto refill;
+ }
+
#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
/* if size can vary use size else just use PAGE_SIZE */
size = nc->size;
_
Patches currently in -mm which might be from dongli.zhang(a)oracle.com are
page_frag-recover-from-memory-pressure.patch
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
EDID can declare the maximum supported bpc up to 16,
and apparently there are displays that do so. Currently
we assume 12 bpc is the max. Fix the assumption and
toss in a MISSING_CASE() for any other value we don't
expect to see.
This fixes modesets with a display with EDID max bpc > 12.
Previously any modeset would just silently fail on platforms
that didn't otherwise limit this via the max_bpc property.
In particular we don't add the max_bpc property to HDMI
ports on gmch platforms, and thus we would see the raw
max_bpc coming from the EDID.
I suppose we could already adjust this to also allow 16bpc,
but seeing as no current platform supports that there is
little point.
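A minimal sketch of the resulting mapping, written as a standalone function.
The function name is hypothetical, and the 6..9 ranges are an assumption
about the part of the switch that the hunk below does not show; only the
10..11 and 12..16 ranges come from this patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: map the sink's max bpc to a pipe bpp. */
static int sink_bpp_from_max_bpc(int max_bpc)
{
	assert(max_bpc >= 0);

	/* Assumed lower ranges, not visible in the hunk. */
	if (max_bpc >= 6 && max_bpc <= 7)
		return 6 * 3;
	if (max_bpc >= 8 && max_bpc <= 9)
		return 8 * 3;
	/* Ranges taken from the hunk. */
	if (max_bpc >= 10 && max_bpc <= 11)
		return 10 * 3;
	if (max_bpc >= 12 && max_bpc <= 16)
		return 12 * 3;	/* clamp: 16 bpc sinks are driven at 12 bpc */

	/* Anything else is unexpected; the patch also logs MISSING_CASE(). */
	return -EINVAL;
}
```

With the old "case 12:" only, a sink declaring 16 bpc would fall through to
the error path, which is why every modeset failed.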
Cc: stable(a)vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2632
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_display.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 2729c852c668..2a6eb1ca9c8e 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -13060,10 +13060,11 @@ compute_sink_pipe_bpp(const struct drm_connector_state *conn_state,
case 10 ... 11:
bpp = 10 * 3;
break;
- case 12:
+ case 12 ... 16:
bpp = 12 * 3;
break;
default:
+ MISSING_CASE(conn_state->max_bpc);
return -EINVAL;
}
--
2.26.2
The patch titled
Subject: ocfs2: initialize ip_next_orphan
has been removed from the -mm tree. Its filename was
ocfs2-initialize-ip_next_orphan.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Wengang Wang <wen.gang.wang(a)oracle.com>
Subject: ocfs2: initialize ip_next_orphan
Though the problem was found on an older 4.1.12 kernel, I think upstream has
the same issue.
On one node in the cluster, there is the following call trace:
# cat /proc/21473/stack
[<ffffffffc09a2f06>] __ocfs2_cluster_lock.isra.36+0x336/0x9e0 [ocfs2]
[<ffffffffc09a4481>] ocfs2_inode_lock_full_nested+0x121/0x520 [ocfs2]
[<ffffffffc09b2ce2>] ocfs2_evict_inode+0x152/0x820 [ocfs2]
[<ffffffff8122b36e>] evict+0xae/0x1a0
[<ffffffff8122bd26>] iput+0x1c6/0x230
[<ffffffffc09b60ed>] ocfs2_orphan_filldir+0x5d/0x100 [ocfs2]
[<ffffffffc0992ae0>] ocfs2_dir_foreach_blk+0x490/0x4f0 [ocfs2]
[<ffffffffc099a1e9>] ocfs2_dir_foreach+0x29/0x30 [ocfs2]
[<ffffffffc09b7716>] ocfs2_recover_orphans+0x1b6/0x9a0 [ocfs2]
[<ffffffffc09b9b4e>] ocfs2_complete_recovery+0x1de/0x5c0 [ocfs2]
[<ffffffff810a1399>] process_one_work+0x169/0x4a0
[<ffffffff810a1bcb>] worker_thread+0x5b/0x560
[<ffffffff810a7a2b>] kthread+0xcb/0xf0
[<ffffffff816f5d21>] ret_from_fork+0x61/0x90
[<ffffffffffffffff>] 0xffffffffffffffff
The above stack is unexpected: the final iput() shouldn't happen in
ocfs2_orphan_filldir(). Looking at the code,
2067 /* Skip inodes which are already added to recover list, since dio may
2068 * happen concurrently with unlink/rename */
2069 if (OCFS2_I(iter)->ip_next_orphan) {
2070 iput(iter);
2071 return 0;
2072 }
2073
The logic assumes the inode is already in the recover list when it sees that
ip_next_orphan is non-NULL, so it skips this inode after dropping the
reference that was taken in ocfs2_iget().
However, if the inode were already in the recover list, it would hold another
reference, and the iput() at line 2070 would not be the final iput
(dropping the last reference). So I don't think the inode is really in
the recover list (no vmcore to confirm).
Note that ocfs2_queue_orphans(), though not shown in the call
trace, holds the cluster lock on the orphan directory while looking up
unlinked inodes. The on-disk inode eviction can involve a lot of I/O,
which may take a long time to finish. That means this node can hold the
cluster lock for a very long time, which can make lock requests (from
other nodes) on the orphan directory hang for a long time.
Looking further at ip_next_orphan, I found that it is not initialized when
a new ocfs2_inode_info structure is allocated.
This causes the reflink operations from some nodes to hang for a very long
time waiting for the cluster lock on the orphan directory.
Fix: initialize ip_next_orphan as NULL.
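The effect of the missing initialization can be sketched in userspace. This
is an illustration only: the struct, the model_*() names and the 0xA5 fill
pattern standing in for stale slab memory are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant part of ocfs2_inode_info. */
struct inode_info_model {
	unsigned long long blkno;
	struct inode_info_model *next_orphan;
};

/*
 * Models the "init once" constructor. init_next_orphan = false is the
 * unpatched behaviour, true is the behaviour with this patch.
 */
static void model_init_once(struct inode_info_model *oi, bool init_next_orphan)
{
	assert(oi != NULL);
	oi->blkno = 0;
	if (init_next_orphan)
		oi->next_orphan = NULL;
}

/* Models the check in ocfs2_orphan_filldir(): non-NULL means "skip". */
static bool model_filldir_skips(const struct inode_info_model *oi)
{
	return oi->next_orphan != NULL;
}
```

Without the initialization, whatever bytes were in the object before decide
whether the inode is skipped, so an inode that was never queued can be
treated as already on the recover list.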
Link: https://lkml.kernel.org/r/20201109171746.27884-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/super.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/ocfs2/super.c~ocfs2-initialize-ip_next_orphan
+++ a/fs/ocfs2/super.c
@@ -1713,6 +1713,7 @@ static void ocfs2_inode_init_once(void *
oi->ip_blkno = 0ULL;
oi->ip_clusters = 0;
+ oi->ip_next_orphan = NULL;
ocfs2_resv_init_once(&oi->ip_la_data_resv);
_
Patches currently in -mm which might be from wen.gang.wang(a)oracle.com are
The patch titled
Subject: hugetlbfs: fix anon huge page migration race
has been removed from the -mm tree. Its filename was
hugetlbfs-fix-anon-huge-page-migration-race.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: fix anon huge page migration race
Qian Cai reported the following BUG in [1]
[ 6147.019063][T45242] LTP: starting move_pages12
[ 6147.475680][T64921] BUG: unable to handle page fault for address: ffffffffffffffe0
...
[ 6147.525866][T64921] RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170
avc_start_pgoff at mm/interval_tree.c:63
[ 6147.620914][T64921] Call Trace:
[ 6147.624078][T64921] rmap_walk_anon+0x141/0xa30
rmap_walk_anon at mm/rmap.c:1864
[ 6147.628639][T64921] try_to_unmap+0x209/0x2d0
try_to_unmap at mm/rmap.c:1763
[ 6147.633026][T64921] ? rmap_walk_locked+0x140/0x140
[ 6147.637936][T64921] ? page_remove_rmap+0x1190/0x1190
[ 6147.643020][T64921] ? page_not_mapped+0x10/0x10
[ 6147.647668][T64921] ? page_get_anon_vma+0x290/0x290
[ 6147.652664][T64921] ? page_mapcount_is_zero+0x10/0x10
[ 6147.657838][T64921] ? hugetlb_page_mapping_lock_write+0x97/0x180
[ 6147.663972][T64921] migrate_pages+0x1005/0x1fb0
[ 6147.668617][T64921] ? remove_migration_pte+0xac0/0xac0
[ 6147.673875][T64921] move_pages_and_store_status.isra.47+0xd7/0x1a0
[ 6147.680181][T64921] ? migrate_pages+0x1fb0/0x1fb0
[ 6147.685002][T64921] __x64_sys_move_pages+0xa5c/0x1100
[ 6147.690176][T64921] ? trace_hardirqs_on+0x20/0x1b5
[ 6147.695084][T64921] ? move_pages_and_store_status.isra.47+0x1a0/0x1a0
[ 6147.701653][T64921] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 6147.707088][T64921] ? switch_fpu_return+0x196/0x400
[ 6147.712083][T64921] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 6147.717954][T64921] ? do_syscall_64+0x24/0x310
[ 6147.722513][T64921] do_syscall_64+0x5f/0x310
[ 6147.726897][T64921] ? trace_hardirqs_off+0x12/0x1a0
[ 6147.731894][T64921] ? asm_exc_page_fault+0x8/0x30
[ 6147.736714][T64921] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Hugh Dickins diagnosed this as a migration bug caused by code introduced
to use i_mmap_rwsem for pmd sharing synchronization. Specifically, the
routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED
flag to try_to_unmap() while holding i_mmap_rwsem. This is wrong for
anon pages as the anon_vma_lock should be held in this case. Further
analysis suggested that i_mmap_rwsem was not required to be held at all
when calling try_to_unmap for anon pages as an anon page could never be
part of a shared pmd mapping.
Discussion also revealed that the hack in hugetlb_page_mapping_lock_write
to drop page lock and acquire i_mmap_rwsem is wrong. There is no way to
keep mapping valid while dropping page lock.
This patch does the following:
- Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when
calling try_to_unmap.
- Remove the hacky code in hugetlb_page_mapping_lock_write. The routine
will now simply do a 'trylock' while still holding the page lock. If
the trylock fails, it will return NULL. This could impact the callers:
- migration calling code will receive -EAGAIN and retry up to the hard
coded limit (10).
- memory error code will treat the page as BUSY. This will force
killing (SIGKILL) any mapping tasks instead of sending SIGBUS.
Do note that this change in behavior only happens when there is a race.
None of the standard kernel testing suites actually hit this race, but
it is possible.
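The new locking rule can be sketched in userspace as follows. This is a
simplified model, not the kernel code: the mapping lock is reduced to a C11
atomic_flag, the retry loop stands in for the retries done by the migration
code, and all model_*() names and MODEL_MAX_RETRIES are hypothetical (the
value 10 is the hard coded limit mentioned above).

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_RETRIES 10

/* Hypothetical stand-in for an address_space with its i_mmap_rwsem. */
struct mapping_model {
	atomic_flag locked;
};

static bool model_trylock_write(struct mapping_model *m)
{
	return !atomic_flag_test_and_set(&m->locked);
}

static void model_unlock_write(struct mapping_model *m)
{
	atomic_flag_clear(&m->locked);
}

/* One unmap attempt: 0 on success, -EAGAIN if the trylock failed. */
static int model_unmap_once(struct mapping_model *m, bool is_anon)
{
	assert(m != NULL);
	if (!is_anon) {
		/* Shared mapping: only trylock, never drop the page lock. */
		if (!model_trylock_write(m))
			return -EAGAIN;
		/* ... unmap with the equivalent of TTU_RMAP_LOCKED ... */
		model_unlock_write(m);
	}
	/* Anon page: unmap without touching the mapping lock at all. */
	return 0;
}

/* The caller retries a bounded number of times on -EAGAIN. */
static int model_migrate(struct mapping_model *m, bool is_anon)
{
	int rc = -EAGAIN;

	for (int i = 0; i < MODEL_MAX_RETRIES && rc == -EAGAIN; i++)
		rc = model_unmap_once(m, is_anon);
	return rc;
}
```

The point of the design is that a failed trylock is reported to the caller
instead of being papered over by dropping and re-taking the page lock.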
[1] https://lore.kernel.org/lkml/20200708012044.GC992@lca.pw/
[2] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.a…
Link: https://lkml.kernel.org/r/20201105195058.78401-1-mike.kravetz@oracle.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Qian Cai <cai(a)lca.pw>
Suggested-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 90 ++----------------------------------------
mm/memory-failure.c | 36 +++++++---------
mm/migrate.c | 46 +++++++++++----------
mm/rmap.c | 5 --
4 files changed, 48 insertions(+), 129 deletions(-)
--- a/mm/hugetlb.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/hugetlb.c
@@ -1568,103 +1568,23 @@ int PageHeadHuge(struct page *page_head)
}
/*
- * Find address_space associated with hugetlbfs page.
- * Upon entry page is locked and page 'was' mapped although mapped state
- * could change. If necessary, use anon_vma to find vma and associated
- * address space. The returned mapping may be stale, but it can not be
- * invalid as page lock (which is held) is required to destroy mapping.
- */
-static struct address_space *_get_hugetlb_page_mapping(struct page *hpage)
-{
- struct anon_vma *anon_vma;
- pgoff_t pgoff_start, pgoff_end;
- struct anon_vma_chain *avc;
- struct address_space *mapping = page_mapping(hpage);
-
- /* Simple file based mapping */
- if (mapping)
- return mapping;
-
- /*
- * Even anonymous hugetlbfs mappings are associated with an
- * underlying hugetlbfs file (see hugetlb_file_setup in mmap
- * code). Find a vma associated with the anonymous vma, and
- * use the file pointer to get address_space.
- */
- anon_vma = page_lock_anon_vma_read(hpage);
- if (!anon_vma)
- return mapping; /* NULL */
-
- /* Use first found vma */
- pgoff_start = page_to_pgoff(hpage);
- pgoff_end = pgoff_start + pages_per_huge_page(page_hstate(hpage)) - 1;
- anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root,
- pgoff_start, pgoff_end) {
- struct vm_area_struct *vma = avc->vma;
-
- mapping = vma->vm_file->f_mapping;
- break;
- }
-
- anon_vma_unlock_read(anon_vma);
- return mapping;
-}
-
-/*
* Find and lock address space (mapping) in write mode.
*
- * Upon entry, the page is locked which allows us to find the mapping
- * even in the case of an anon page. However, locking order dictates
- * the i_mmap_rwsem be acquired BEFORE the page lock. This is hugetlbfs
- * specific. So, we first try to lock the sema while still holding the
- * page lock. If this works, great! If not, then we need to drop the
- * page lock and then acquire i_mmap_rwsem and reacquire page lock. Of
- * course, need to revalidate state along the way.
+ * Upon entry, the page is locked which means that page_mapping() is
+ * stable. Due to locking order, we can only trylock_write. If we can
+ * not get the lock, simply return NULL to caller.
*/
struct address_space *hugetlb_page_mapping_lock_write(struct page *hpage)
{
- struct address_space *mapping, *mapping2;
+ struct address_space *mapping = page_mapping(hpage);
- mapping = _get_hugetlb_page_mapping(hpage);
-retry:
if (!mapping)
return mapping;
- /*
- * If no contention, take lock and return
- */
if (i_mmap_trylock_write(mapping))
return mapping;
- /*
- * Must drop page lock and wait on mapping sema.
- * Note: Once page lock is dropped, mapping could become invalid.
- * As a hack, increase map count until we lock page again.
- */
- atomic_inc(&hpage->_mapcount);
- unlock_page(hpage);
- i_mmap_lock_write(mapping);
- lock_page(hpage);
- atomic_add_negative(-1, &hpage->_mapcount);
-
- /* verify page is still mapped */
- if (!page_mapped(hpage)) {
- i_mmap_unlock_write(mapping);
- return NULL;
- }
-
- /*
- * Get address space again and verify it is the same one
- * we locked. If not, drop lock and retry.
- */
- mapping2 = _get_hugetlb_page_mapping(hpage);
- if (mapping2 != mapping) {
- i_mmap_unlock_write(mapping);
- mapping = mapping2;
- goto retry;
- }
-
- return mapping;
+ return NULL;
}
pgoff_t __basepage_index(struct page *page)
--- a/mm/memory-failure.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/memory-failure.c
@@ -1057,27 +1057,25 @@ static bool hwpoison_user_mappings(struc
if (!PageHuge(hpage)) {
unmap_success = try_to_unmap(hpage, ttu);
} else {
- /*
- * For hugetlb pages, try_to_unmap could potentially call
- * huge_pmd_unshare. Because of this, take semaphore in
- * write mode here and set TTU_RMAP_LOCKED to indicate we
- * have taken the lock at this higer level.
- *
- * Note that the call to hugetlb_page_mapping_lock_write
- * is necessary even if mapping is already set. It handles
- * ugliness of potentially having to drop page lock to obtain
- * i_mmap_rwsem.
- */
- mapping = hugetlb_page_mapping_lock_write(hpage);
-
- if (mapping) {
- unmap_success = try_to_unmap(hpage,
+ if (!PageAnon(hpage)) {
+ /*
+ * For hugetlb pages in shared mappings, try_to_unmap
+ * could potentially call huge_pmd_unshare. Because of
+ * this, take semaphore in write mode here and set
+ * TTU_RMAP_LOCKED to indicate we have taken the lock
+ * at this higer level.
+ */
+ mapping = hugetlb_page_mapping_lock_write(hpage);
+ if (mapping) {
+ unmap_success = try_to_unmap(hpage,
ttu|TTU_RMAP_LOCKED);
- i_mmap_unlock_write(mapping);
+ i_mmap_unlock_write(mapping);
+ } else {
+ pr_info("Memory failure: %#lx: could not lock mapping for mapped huge page\n", pfn);
+ unmap_success = false;
+ }
} else {
- pr_info("Memory failure: %#lx: could not find mapping for mapped huge page\n",
- pfn);
- unmap_success = false;
+ unmap_success = try_to_unmap(hpage, ttu);
}
}
if (!unmap_success)
--- a/mm/migrate.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/migrate.c
@@ -1328,34 +1328,38 @@ static int unmap_and_move_huge_page(new_
goto put_anon;
if (page_mapped(hpage)) {
- /*
- * try_to_unmap could potentially call huge_pmd_unshare.
- * Because of this, take semaphore in write mode here and
- * set TTU_RMAP_LOCKED to let lower levels know we have
- * taken the lock.
- */
- mapping = hugetlb_page_mapping_lock_write(hpage);
- if (unlikely(!mapping))
- goto unlock_put_anon;
-
- try_to_unmap(hpage,
- TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS|
- TTU_RMAP_LOCKED);
+ bool mapping_locked = false;
+ enum ttu_flags ttu = TTU_MIGRATION|TTU_IGNORE_MLOCK|
+ TTU_IGNORE_ACCESS;
+
+ if (!PageAnon(hpage)) {
+ /*
+ * In shared mappings, try_to_unmap could potentially
+ * call huge_pmd_unshare. Because of this, take
+ * semaphore in write mode here and set TTU_RMAP_LOCKED
+ * to let lower levels know we have taken the lock.
+ */
+ mapping = hugetlb_page_mapping_lock_write(hpage);
+ if (unlikely(!mapping))
+ goto unlock_put_anon;
+
+ mapping_locked = true;
+ ttu |= TTU_RMAP_LOCKED;
+ }
+
+ try_to_unmap(hpage, ttu);
page_was_mapped = 1;
- /*
- * Leave mapping locked until after subsequent call to
- * remove_migration_ptes()
- */
+
+ if (mapping_locked)
+ i_mmap_unlock_write(mapping);
}
if (!page_mapped(hpage))
rc = move_to_new_page(new_hpage, hpage, mode);
- if (page_was_mapped) {
+ if (page_was_mapped)
remove_migration_ptes(hpage,
- rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, true);
- i_mmap_unlock_write(mapping);
- }
+ rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, false);
unlock_put_anon:
unlock_page(new_hpage);
--- a/mm/rmap.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/rmap.c
@@ -1413,9 +1413,6 @@ static bool try_to_unmap_one(struct page
/*
* If sharing is possible, start and end will be adjusted
* accordingly.
- *
- * If called for a huge page, caller must hold i_mmap_rwsem
- * in write mode as it is possible to call huge_pmd_unshare.
*/
adjust_range_if_pmd_sharing_possible(vma, &range.start,
&range.end);
@@ -1462,7 +1459,7 @@ static bool try_to_unmap_one(struct page
subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
address = pvmw.address;
- if (PageHuge(page)) {
+ if (PageHuge(page) && !PageAnon(page)) {
/*
* To call huge_pmd_unshare, i_mmap_rwsem must be
* held in write mode. Caller needs to explicitly
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
The patch titled
Subject: Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
has been removed from the -mm tree. Its filename was
revert-kernel-rebootc-convert-simple_strtoul-to-kstrtoint.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Matteo Croce <mcroce(a)microsoft.com>
Subject: Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Patch series "fix parsing of reboot= cmdline", v3.
The parsing of the reboot= cmdline has two major errors:
- a missing bound check can crash the system on reboot
- parsing of the cpu number only works if specified last
Fix both.
This patch (of 2):
This reverts commit 616feab753972b97.
kstrtoint() and simple_strtoul() have a subtle difference which makes them
non-interchangeable: if a non-digit character is found during parsing,
the former will return an error, while the latter will just stop parsing,
e.g. simple_strtoul("123xyx") = 123.
The kernel cmdline reboot= argument allows specifying the CPU used for
rebooting, with the syntax `s####` among the other flags, e.g.
"reboot=warm,s31,force", so if this flag is not the last one given, it is
silently ignored, as are the subsequent ones.
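The difference can be reproduced in userspace with the C library. This is a
sketch by analogy: parse_lenient() and parse_strict() are hypothetical
helpers built on strtoul()/strtol() that only mimic the behaviour of
simple_strtoul() and kstrtoint() described above; they are not the kernel
implementations.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Lenient, like simple_strtoul(): parse leading digits, ignore the rest. */
static unsigned long parse_lenient(const char *s)
{
	return strtoul(s, NULL, 0);
}

/* Strict, in the spirit of kstrtoint(): trailing characters are an error. */
static int parse_strict(const char *s, int *out)
{
	char *end;
	long v;

	assert(s && out);
	errno = 0;
	v = strtol(s, &end, 0);
	if (end == s || *end != '\0')
		return -EINVAL;
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -ERANGE;
	*out = (int)v;
	return 0;
}
```

For "reboot=warm,s31,force" the parser sees "31,force" after the 's'. The
lenient variant yields 31 and parsing continues with the next flag, while
the strict variant fails, which is why the flag and everything after it
were dropped.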
Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com
Fixes: 616feab75397 ("kernel/reboot.c: convert simple_strtoul to kstrtoint")
Signed-off-by: Matteo Croce <mcroce(a)microsoft.com>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Cc: Petr Mladek <pmladek(a)suse.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Robin Holt <robinmholt(a)gmail.com>
Cc: Fabian Frederick <fabf(a)skynet.be>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/reboot.c | 21 +++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)
--- a/kernel/reboot.c~revert-kernel-rebootc-convert-simple_strtoul-to-kstrtoint
+++ a/kernel/reboot.c
@@ -551,22 +551,15 @@ static int __init reboot_setup(char *str
break;
case 's':
- {
- int rc;
-
- if (isdigit(*(str+1))) {
- rc = kstrtoint(str+1, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else if (str[1] == 'm' && str[2] == 'p' &&
- isdigit(*(str+3))) {
- rc = kstrtoint(str+3, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else
+ if (isdigit(*(str+1)))
+ reboot_cpu = simple_strtoul(str+1, NULL, 0);
+ else if (str[1] == 'm' && str[2] == 'p' &&
+ isdigit(*(str+3)))
+ reboot_cpu = simple_strtoul(str+3, NULL, 0);
+ else
*mode = REBOOT_SOFT;
break;
- }
+
case 'g':
*mode = REBOOT_GPIO;
break;
_
Patches currently in -mm which might be from mcroce(a)microsoft.com are
reboot-refactor-and-comment-the-cpu-selection-code.patch
reboot-allow-to-specify-reboot-mode-via-sysfs.patch
reboot-remove-cf9_safe-from-allowed-types-and-rename-cf9_force.patch
The patch titled
Subject: compiler.h: fix barrier_data() on clang
has been removed from the -mm tree. Its filename was
compilerh-fix-barrier_data-on-clang.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Arvind Sankar <nivedita(a)alum.mit.edu>
Subject: compiler.h: fix barrier_data() on clang
Commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") neglected to copy barrier_data() from compiler-gcc.h
into compiler-clang.h. The definition in compiler-gcc.h was really to
work around clang's more aggressive optimization, so this broke
barrier_data() on clang, and consequently memzero_explicit() as well.
For example, this results in at least the memzero_explicit() call in
lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
Fix this by moving the definition of barrier_data() into compiler.h.
Also move the gcc/clang definition of barrier() into compiler.h:
__memory_barrier() is icc-specific (and barrier() is already defined using
it in compiler-intel.h) and doesn't belong in compiler.h.
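A userspace sketch of why barrier_data() matters for memzero_explicit(),
assuming a gcc or clang toolchain. The macro body is the one from this
patch; memzero_explicit_sketch() and use_and_wipe_key() are hypothetical
names for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same definition as the one this patch moves into compiler.h. */
#define barrier_data(ptr) __asm__ __volatile__("" : : "r"(ptr) : "memory")

/* Zero a buffer in a way the compiler may not optimize away. */
static void memzero_explicit_sketch(void *s, size_t count)
{
	assert(s != NULL);
	memset(s, 0, count);
	/* Tell the compiler the asm may read *s, so the stores must stay. */
	barrier_data(s);
}

/* A local buffer that is never read again after being wiped. */
static int use_and_wipe_key(void)
{
	unsigned char key[32];
	int sum = 0;

	for (size_t i = 0; i < sizeof(key); i++) {
		key[i] = (unsigned char)i;
		sum += key[i];
	}
	/* With a plain memset() this would be a removable dead store. */
	memzero_explicit_sketch(key, sizeof(key));
	return sum;
}
```

Passing the pointer as an asm input is what keeps the wipe alive; a plain
barrier() with only a "memory" clobber is the variant that clang was able
to see through for a buffer that never escapes.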
[rdunlap(a)infradead.org: fix ALPHA builds when SMP is not enabled]
Link: https://lkml.kernel.org/r/20201101231835.4589-1-rdunlap@infradead.org
Link: https://lkml.kernel.org/r/20201014212631.207844-1-nivedita@alum.mit.edu
Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
Signed-off-by: Arvind Sankar <nivedita(a)alum.mit.edu>
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Reviewed-by: Nick Desaulniers <ndesaulniers(a)google.com>
Tested-by: Nick Desaulniers <ndesaulniers(a)google.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/asm-generic/barrier.h | 1 +
include/linux/compiler-clang.h | 6 ------
include/linux/compiler-gcc.h | 19 -------------------
include/linux/compiler.h | 18 ++++++++++++++++--
4 files changed, 17 insertions(+), 27 deletions(-)
--- a/include/asm-generic/barrier.h~compilerh-fix-barrier_data-on-clang
+++ a/include/asm-generic/barrier.h
@@ -13,6 +13,7 @@
#ifndef __ASSEMBLY__
+#include <linux/compiler.h>
#include <asm/rwonce.h>
#ifndef nop
--- a/include/linux/compiler-clang.h~compilerh-fix-barrier_data-on-clang
+++ a/include/linux/compiler-clang.h
@@ -60,12 +60,6 @@
#define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
#endif
-/* The following are for compatibility with GCC, from compiler-gcc.h,
- * and may be redefined here because they should not be shared with other
- * compilers, like ICC.
- */
-#define barrier() __asm__ __volatile__("" : : : "memory")
-
#if __has_feature(shadow_call_stack)
# define __noscs __attribute__((__no_sanitize__("shadow-call-stack")))
#endif
--- a/include/linux/compiler-gcc.h~compilerh-fix-barrier_data-on-clang
+++ a/include/linux/compiler-gcc.h
@@ -15,25 +15,6 @@
# error Sorry, your version of GCC is too old - please use 4.9 or newer.
#endif
-/* Optimization barrier */
-
-/* The "volatile" is due to gcc bugs */
-#define barrier() __asm__ __volatile__("": : :"memory")
-/*
- * This version is i.e. to prevent dead stores elimination on @ptr
- * where gcc and llvm may behave differently when otherwise using
- * normal barrier(): while gcc behavior gets along with a normal
- * barrier(), llvm needs an explicit input variable to be assumed
- * clobbered. The issue is as follows: while the inline asm might
- * access any memory it wants, the compiler could have fit all of
- * @ptr into memory registers instead, and since @ptr never escaped
- * from that, it proved that the inline asm wasn't touching any of
- * it. This version works well with both compilers, i.e. we're telling
- * the compiler that the inline asm absolutely may see the contents
- * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
- */
-#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
-
/*
* This macro obfuscates arithmetic on a variable address so that gcc
* shouldn't recognize the original var, and make assumptions about it.
--- a/include/linux/compiler.h~compilerh-fix-barrier_data-on-clang
+++ a/include/linux/compiler.h
@@ -80,11 +80,25 @@ void ftrace_likely_update(struct ftrace_
/* Optimization barrier */
#ifndef barrier
-# define barrier() __memory_barrier()
+/* The "volatile" is due to gcc bugs */
+# define barrier() __asm__ __volatile__("": : :"memory")
#endif
#ifndef barrier_data
-# define barrier_data(ptr) barrier()
+/*
+ * This version is i.e. to prevent dead stores elimination on @ptr
+ * where gcc and llvm may behave differently when otherwise using
+ * normal barrier(): while gcc behavior gets along with a normal
+ * barrier(), llvm needs an explicit input variable to be assumed
+ * clobbered. The issue is as follows: while the inline asm might
+ * access any memory it wants, the compiler could have fit all of
+ * @ptr into memory registers instead, and since @ptr never escaped
+ * from that, it proved that the inline asm wasn't touching any of
+ * it. This version works well with both compilers, i.e. we're telling
+ * the compiler that the inline asm absolutely may see the contents
+ * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
+ */
+# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
#endif
/* workaround for GCC PR82365 if needed */
_
Patches currently in -mm which might be from nivedita(a)alum.mit.edu are
The patch titled
Subject: mm/gup: use unpin_user_pages() in __gup_longterm_locked()
has been removed from the -mm tree. Its filename was
mm-gup-use-unpin_user_pages-in-__gup_longterm_locked.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Jason Gunthorpe <jgg(a)nvidia.com>
Subject: mm/gup: use unpin_user_pages() in __gup_longterm_locked()
When FOLL_PIN is passed to __get_user_pages() the page list must be put
back using unpin_user_pages(), otherwise the page pin reference persists in
a corrupted state.
There are two places in the unwind of __gup_longterm_locked() that put the
pages back without checking the flag. Normally on error this function would return
the partial page list making this the caller's responsibility, but in
these two cases the caller is not allowed to see these pages at all.
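The accounting mismatch can be sketched with a toy reference counter. This
is a simplified model under an assumption: a pin is represented as a large
bias added to the refcount (MODEL_PIN_BIAS is meant to mirror the kernel's
pin counting bias for small pages, but the value and all model_*() names
are illustrative only).

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PIN_BIAS 1024	/* assumed bias, for illustration */

/* Hypothetical stand-in for struct page's reference count. */
struct page_model {
	int refcount;
};

/* Take a reference: a pin adds the bias, a plain get adds 1. */
static void model_grab(struct page_model *p, bool pin)
{
	p->refcount += pin ? MODEL_PIN_BIAS : 1;
}

static void model_put_page(struct page_model *p)
{
	assert(p->refcount >= 1);
	p->refcount -= 1;
}

static void model_unpin(struct page_model *p)
{
	assert(p->refcount >= MODEL_PIN_BIAS);
	p->refcount -= MODEL_PIN_BIAS;
}

/* The release path after this patch: pick the call matching the grab. */
static void model_release(struct page_model *p, bool pinned)
{
	if (pinned)
		model_unpin(p);
	else
		model_put_page(p);
}
```

In this model, releasing a pinned page with model_put_page() leaves most of
the bias behind, so the page still looks pinned although nobody holds it.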
Link: https://lkml.kernel.org/r/0-v2-3ae7d9d162e2+2a7-gup_cma_fix_jgg@nvidia.com
Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reported-by: Ira Weiny <ira.weiny(a)intel.com>
Reviewed-by: Ira Weiny <ira.weiny(a)intel.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
--- a/mm/gup.c~mm-gup-use-unpin_user_pages-in-__gup_longterm_locked
+++ a/mm/gup.c
@@ -1647,8 +1647,11 @@ check_again:
/*
* drop the above get_user_pages reference.
*/
- for (i = 0; i < nr_pages; i++)
- put_page(pages[i]);
+ if (gup_flags & FOLL_PIN)
+ unpin_user_pages(pages, nr_pages);
+ else
+ for (i = 0; i < nr_pages; i++)
+ put_page(pages[i]);
if (migrate_pages(&cma_page_list, alloc_migration_target, NULL,
(unsigned long)&mtc, MIGRATE_SYNC, MR_CONTIG_RANGE)) {
@@ -1728,8 +1731,11 @@ static long __gup_longterm_locked(struct
goto out;
if (check_dax_vmas(vmas_tmp, rc)) {
- for (i = 0; i < rc; i++)
- put_page(pages[i]);
+ if (gup_flags & FOLL_PIN)
+ unpin_user_pages(pages, rc);
+ else
+ for (i = 0; i < rc; i++)
+ put_page(pages[i]);
rc = -EOPNOTSUPP;
goto out;
}
_
Patches currently in -mm which might be from jgg(a)nvidia.com are
mm-reorganize-internal_get_user_pages_fast.patch
mm-prevent-gup_fast-from-racing-with-cow-during-fork.patch
The patch titled
Subject: mm/slub: fix panic in slab_alloc_node()
has been removed from the -mm tree. Its filename was
mm-slub-fix-panic-in-slab_alloc_node.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Subject: mm/slub: fix panic in slab_alloc_node()
While doing memory hot-unplug operation on a PowerPC VM running 1024 CPUs
with 11TB of ram, I hit the following panic:
BUG: Kernel NULL pointer dereference on read at 0x00000007
Faulting instruction address: 0xc000000000456048
Oops: Kernel access of bad area, sig: 11 [#2]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS= 2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp
CPU: 160 PID: 1 Comm: systemd Tainted: G D 5.9.0 #1
NIP: c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350
REGS: c00006028d1b77a0 TRAP: 0300 Tainted: G D (5.9.0)
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24004228 XER: 00000000
CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0
GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000
GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320
GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000
GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a
GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1
GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8
GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000
GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001
NIP [c000000000456048] __kmalloc_node+0x108/0x790
LR [c000000000455fd4] __kmalloc_node+0x94/0x790
Call Trace:
[c00006028d1b7a30] [c00007c51af92000] 0xc00007c51af92000 (unreliable)
[c00006028d1b7aa0] [c0000000003c0ff8] kvmalloc_node+0x58/0x110
[c00006028d1b7ae0] [c00000000047b45c] mem_cgroup_css_online+0x10c/0x270
[c00006028d1b7b30] [c000000000241fd8] online_css+0x48/0xd0
[c00006028d1b7b60] [c00000000024af14] cgroup_apply_control_enable+0x2c4/0x470
[c00006028d1b7c40] [c00000000024e838] cgroup_mkdir+0x408/0x5f0
[c00006028d1b7cb0] [c0000000005a4ef0] kernfs_iop_mkdir+0x90/0x100
[c00006028d1b7cf0] [c0000000004b8168] vfs_mkdir+0x138/0x250
[c00006028d1b7d40] [c0000000004baf04] do_mkdirat+0x154/0x1c0
[c00006028d1b7dc0] [c000000000032b38] system_call_exception+0xf8/0x200
[c00006028d1b7e20] [c00000000000c740] system_call_common+0xf0/0x27c
Instruction dump:
e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a
2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78
This points to the following code:
mm/slub.c:2851
if (unlikely(!object || !node_match(page, node))) {
c000000000456038: 00 00 bc 2f cmpdi cr7,r28,0
c00000000045603c: 18 00 9e 41 beq cr7,c000000000456054 <__kmalloc_node+0x114>
node_match():
mm/slub.c:2491
if (node != NUMA_NO_NODE && page_to_nid(page) != node)
c000000000456040: 30 02 92 41 beq cr4,c000000000456270 <__kmalloc_node+0x330>
page_to_nid():
include/linux/mm.h:1294
c000000000456044: 10 00 27 e9 ld r9,16(r7)
c000000000456048: 07 00 29 89 lbz r9,7(r9) <<<< r9 = NULL
node_match():
mm/slub.c:2491
c00000000045604c: 00 48 99 7f cmpw cr7,r25,r9
c000000000456050: 20 02 9e 41 beq cr7,c000000000456270 <__kmalloc_node+0x330>
The panic occurred in slab_alloc_node() when checking for the page's node:
object = c->freelist;
page = c->page;
if (unlikely(!object || !node_match(page, node))) {
object = __slab_alloc(s, gfpflags, node, addr, c);
stat(s, ALLOC_SLOWPATH);
The issue is that object is not NULL while page is NULL, which is odd but
may happen if the cache flush occurs after object is loaded but before
page is loaded. The page pointer therefore needs to be checked too.
The cache flush is done through an inter-processor interrupt when a piece
of memory is off-lined. That interrupt is triggered when a memory
hot-unplug operation is initiated: offline_pages() calls slub's
MEM_GOING_OFFLINE callback, slab_mem_going_offline_callback(), which
calls flush_cpu_slab(). If that interrupt arrives between the read of
c->freelist and the read of c->page, it can lead to this situation. The
situation is expected, and the later call to this_cpu_cmpxchg_double()
will detect the change to c->freelist and redo the whole operation.
Commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in
node_match()") removed the check on the page pointer, assuming that page
is always valid when node_match() is called. That is not true in this
particular case, so check page before calling node_match() here.
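The fixed condition can be sketched in user space as follows. This is a
minimal model, not the kernel code: struct page_model and
needs_slow_path() are hypothetical stand-ins for struct page and the test
in slab_alloc_node(), and the assert() only models the NULL dereference
seen in the oops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for struct page: only the node id matters here. */
struct page_model {
	int nid;
};

/* Like node_match() after 6159d0f5c03e: page is assumed to be valid. */
static bool node_match(const struct page_model *page, int node)
{
	assert(page != NULL);	/* models the NULL dereference in the oops */
	return node == NUMA_NO_NODE || page->nid == node;
}

/*
 * Fixed fast-path test: page is checked before node_match() runs, so a
 * NULL page (flush between the two loads) takes the slow path instead of
 * being dereferenced.
 */
static bool needs_slow_path(const void *object,
			    const struct page_model *page, int node)
{
	return !object || !page || !node_match(page, node);
}
```

With a non-NULL object and a NULL page, needs_slow_path() returns true
without ever reaching node_match(), which is the case the one-line fix
covers.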
Link: https://lkml.kernel.org/r/20201027190406.33283-1-ldufour@linux.ibm.com
Fixes: 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()")
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Christoph Lameter <cl(a)linux.com>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/slub.c~mm-slub-fix-panic-in-slab_alloc_node
+++ a/mm/slub.c
@@ -2852,7 +2852,7 @@ redo:
object = c->freelist;
page = c->page;
- if (unlikely(!object || !node_match(page, node))) {
+ if (unlikely(!object || !page || !node_match(page, node))) {
object = __slab_alloc(s, gfpflags, node, addr, c);
} else {
void *next_object = get_freepointer_safe(s, object);
_
Patches currently in -mm which might be from ldufour(a)linux.ibm.com are
The patch titled
Subject: mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
has been removed from the -mm tree. Its filename was
mm-vmscan-fix-nr_isolated_file-corruption-on-64-bit.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Nicholas Piggin <npiggin(a)gmail.com>
Subject: mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
Previously the negated unsigned long would be cast back to signed long
which would have the correct negative value. After commit 730ec8c01a2b
("mm/vmscan.c: change prototype for shrink_page_list"), the large unsigned
int converts to a large positive signed long.
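The conversion problem can be reproduced outside the kernel. The sketch
below is an illustration only: apply_delta(), delta_buggy() and
delta_fixed() are hypothetical names, uint32_t stands in for unsigned int
and int64_t for long on a 64-bit build (both are assumptions about the
platform).

```c
#include <assert.h>
#include <stdint.h>

/* Models a counter update that takes a signed 64-bit delta. */
static int64_t apply_delta(int64_t counter, int64_t delta)
{
	return counter + delta;
}

/* Buggy: the negation wraps in 32-bit unsigned arithmetic, then widens. */
static int64_t delta_buggy(uint32_t nr_reclaimed)
{
	return -nr_reclaimed;	/* 2^32 - nr_reclaimed when nr_reclaimed != 0 */
}

/* Fixed: widen to a signed type first, then negate. */
static int64_t delta_fixed(uint32_t nr_reclaimed)
{
	return -(int64_t)nr_reclaimed;
}

void isolated_delta_selfcheck(void)
{
	/* The fixed delta decrements the counter. */
	assert(apply_delta(100, delta_fixed(5)) == 95);
	/* The buggy delta adds almost 2^32 to it instead. */
	assert(apply_delta(100, delta_buggy(5)) > 100);
}
```

A counter inflated this way is consistent with too_many_isolated()
staying true, which is the hang described in the symptoms.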
Symptoms include CMA allocations hanging forever holding the cma_mutex due
to alloc_contig_range->...->isolate_migratepages_block waiting forever in
"while (unlikely(too_many_isolated(pgdat)))".
[akpm(a)linux-foundation.org: fix -stat.nr_lazyfree_fail as well, per Michal]
Link: https://lkml.kernel.org/r/20201029032320.1448441-1-npiggin@gmail.com
Fixes: 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list")
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Vaneet Narang <v.narang(a)samsung.com>
Cc: Maninder Singh <maninder1.s(a)samsung.com>
Cc: Amit Sahrawat <a.sahrawat(a)samsung.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-fix-nr_isolated_file-corruption-on-64-bit
+++ a/mm/vmscan.c
@@ -1516,7 +1516,8 @@ unsigned int reclaim_clean_pages_from_li
nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
TTU_IGNORE_ACCESS, &stat, true);
list_splice(&clean_pages, page_list);
- mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -nr_reclaimed);
+ mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
+ -(long)nr_reclaimed);
/*
* Since lazyfree pages are isolated from file LRU from the beginning,
* they will rotate back to anonymous LRU in the end if it failed to
@@ -1526,7 +1527,7 @@ unsigned int reclaim_clean_pages_from_li
mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON,
stat.nr_lazyfree_fail);
mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
- -stat.nr_lazyfree_fail);
+ -(long)stat.nr_lazyfree_fail);
return nr_reclaimed;
}
_
Patches currently in -mm which might be from npiggin(a)gmail.com are
The patch titled
Subject: mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate
has been removed from the -mm tree. Its filename was
mm-compaction-stop-isolation-if-too-many-pages-are-isolated-and-we-have-pages-to-migrate.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Zi Yan <ziy(a)nvidia.com>
Subject: mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate
In isolate_migratepages_block, if we have too many isolated pages and
nr_migratepages is not zero, we should try to migrate what we have without
wasting time on isolating.
In theory it's possible that multiple parallel compactions will cause
too_many_isolated() to become true even if each has isolated less than
COMPACT_CLUSTER_MAX, and loop forever in the while loop. Bailing
immediately prevents that.
[vbabka(a)suse.cz: changelog addition]
Link: https://lkml.kernel.org/r/20201030183809.3616803-2-zi.yan@sent.com
Fixes: 1da2f328fa64 ("mm,thp,compaction,cma: allow THP migration for CMA allocations")
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
Suggested-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/compaction.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/mm/compaction.c~mm-compaction-stop-isolation-if-too-many-pages-are-isolated-and-we-have-pages-to-migrate
+++ a/mm/compaction.c
@@ -817,6 +817,10 @@ isolate_migratepages_block(struct compac
* delay for some time until fewer pages are isolated
*/
while (unlikely(too_many_isolated(pgdat))) {
+ /* stop isolation if there are still pages not migrated */
+ if (cc->nr_migratepages)
+ return 0;
+
/* async migration should just abort */
if (cc->mode == MIGRATE_ASYNC)
return 0;
_
Patches currently in -mm which might be from ziy(a)nvidia.com are
The patch titled
Subject: mm/compaction: count pages and stop correctly during page isolation
has been removed from the -mm tree. Its filename was
mm-compaction-count-pages-and-stop-correctly-during-page-isolation.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Zi Yan <ziy(a)nvidia.com>
Subject: mm/compaction: count pages and stop correctly during page isolation
In isolate_migratepages_block, when cc->alloc_contig is true, we are able
to isolate compound pages. But nr_migratepages and nr_isolated did not
count compound pages correctly, causing us to isolate more pages than we
thought. Count compound pages as the number of base pages they contain.
Otherwise, we might be trapped in the too_many_isolated while loop: we
stop isolation only after cc->nr_migratepages reaches
COMPACT_CLUSTER_MAX (32), so the number of pages actually isolated can go
up to COMPACT_CLUSTER_MAX*512=16384.
In addition, after we fix the issue above, cc->nr_migratepages may step
past COMPACT_CLUSTER_MAX without ever being equal to it if compound pages
are isolated, so page isolation would not stop as we intended. Change the
isolation stop condition to >=.
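The interaction between the two changes can be sketched as below. This is
a hypothetical user-space model, not the compaction code:
count_isolated(), stop_old() and stop_new() are made-up names, and the 512
base pages per compound page is taken from the figure quoted above.

```c
#include <assert.h>
#include <stdbool.h>

#define COMPACT_CLUSTER_MAX 32UL

/* Account one isolated page as the number of base pages it contains. */
static unsigned long count_isolated(unsigned long nr_migratepages,
				    unsigned long nr_base)
{
	return nr_migratepages + nr_base;
}

/* Old stop condition: only fires on an exact match. */
static bool stop_old(unsigned long nr_migratepages)
{
	return nr_migratepages == COMPACT_CLUSTER_MAX;
}

/* New stop condition: also fires once the limit has been stepped over. */
static bool stop_new(unsigned long nr_migratepages)
{
	return nr_migratepages >= COMPACT_CLUSTER_MAX;
}

void isolation_stop_selfcheck(void)
{
	/* 31 base pages followed by one 512-page compound page. */
	unsigned long n = count_isolated(31, 512);

	assert(n == 543);
	assert(!stop_old(n));	/* stepped over 32, "==" does not fire */
	assert(stop_new(n));
}
```

Once compound pages are counted by size, the counter can jump from below
the limit to well above it in one step, which is why the comparison has to
change together with the accounting.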
The issue can be triggered as follows: In a system with 16GB memory and an
8GB CMA region reserved by hugetlb_cma, if we first allocate 10GB THPs and
mlock them (so some THPs are allocated in the CMA region and mlocked),
reserving 6 1GB hugetlb pages via
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages will get stuck
(looping in too_many_isolated function) until we kill either task. With
the patch applied, oom will kill the application with 10GB THPs and let
hugetlb page reservation finish.
[ziy(a)nvidia.com: v3]
Link: https://lkml.kernel.org/r/20201030183809.3616803-1-zi.yan@sent.com
Link: https://lkml.kernel.org/r/20201029200435.3386066-1-zi.yan@sent.com
Fixes: 1da2f328fa64 ("mm,thp,compaction,cma: allow THP migration for CMA allocations")
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/compaction.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/mm/compaction.c~mm-compaction-count-pages-and-stop-correctly-during-page-isolation
+++ a/mm/compaction.c
@@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compac
isolate_success:
list_add(&page->lru, &cc->migratepages);
- cc->nr_migratepages++;
- nr_isolated++;
+ cc->nr_migratepages += compound_nr(page);
+ nr_isolated += compound_nr(page);
/*
* Avoid isolating too much unless this block is being
@@ -1021,7 +1021,7 @@ isolate_success:
* or a lock is contended. For contention, isolate quickly to
* potentially remove one source of contention.
*/
- if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
+ if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
!cc->rescan && !cc->contended) {
++low_pfn;
break;
@@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compac
if (!pfn)
break;
- if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
+ if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
break;
}
_
Patches currently in -mm which might be from ziy(a)nvidia.com are
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d3938ee23e97bfcac2e0eb6b356875da73d700df Mon Sep 17 00:00:00 2001
From: Gao Xiang <hsiangkao(a)redhat.com>
Date: Sun, 1 Nov 2020 03:51:02 +0800
Subject: [PATCH] erofs: derive atime instead of leaving it empty
EROFS has _only one_ on-disk timestamp for each extended inode (ctime is
currently documented and recorded; we might record mtime instead with a
new compat feature if needed). EROFS isn't mainly for archival purposes,
so there is no need to keep all timestamps on disk, especially for
Android scenarios, due to security concerns. Also, romfs/cramfs don't
have their own on-disk timestamp, and squashfs only records mtime.
Let's also derive access time from the on-disk timestamp rather than
leaving it empty. If mtime/atime for each file are really needed for
specific scenarios, we can use xattrs to record them.
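The derivation can be sketched in user space as follows. The types and
the derive_times() helper are hypothetical stand-ins for the inode time
fields and for the assignments in erofs_read_inode(); the timestamp values
are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a seconds/nanoseconds timestamp. */
struct ts_model {
	int64_t tv_sec;
	long tv_nsec;
};

/* Hypothetical stand-in for the three inode time fields. */
struct inode_model {
	struct ts_model i_atime, i_mtime, i_ctime;
};

/*
 * Set ctime from the single on-disk timestamp (or the build time for
 * compact inodes), then derive mtime and atime from it.
 */
static void derive_times(struct inode_model *inode, int64_t sec, long nsec)
{
	inode->i_ctime.tv_sec = sec;
	inode->i_ctime.tv_nsec = nsec;
	inode->i_mtime = inode->i_ctime;
	inode->i_atime = inode->i_ctime;
}

void derive_times_selfcheck(void)
{
	struct inode_model inode = { { 0, 0 }, { 0, 0 }, { 0, 0 } };

	derive_times(&inode, 1604173862, 7);
	/* atime is no longer left empty. */
	assert(inode.i_atime.tv_sec == inode.i_ctime.tv_sec);
	assert(inode.i_atime.tv_nsec == inode.i_ctime.tv_nsec);
}
```

Doing the derivation once, after both inode layouts have filled in ctime,
keeps the extended and compact paths from each having to set three fields.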
Link: https://lore.kernel.org/r/20201031195102.21221-1-hsiangkao@aol.com
[ Gao Xiang: It'd be better to backport this for user-friendliness. ]
Fixes: 431339ba9042 ("staging: erofs: add inode operations")
Cc: stable <stable(a)vger.kernel.org> # 4.19+
Reported-by: nl6720 <nl6720(a)gmail.com>
Reviewed-by: Chao Yu <yuchao0(a)huawei.com>
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index 139d0bed42f8..3e21c0e8adae 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -107,11 +107,9 @@ static struct page *erofs_read_inode(struct inode *inode,
i_gid_write(inode, le32_to_cpu(die->i_gid));
set_nlink(inode, le32_to_cpu(die->i_nlink));
- /* ns timestamp */
- inode->i_mtime.tv_sec = inode->i_ctime.tv_sec =
- le64_to_cpu(die->i_ctime);
- inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec =
- le32_to_cpu(die->i_ctime_nsec);
+ /* extended inode has its own timestamp */
+ inode->i_ctime.tv_sec = le64_to_cpu(die->i_ctime);
+ inode->i_ctime.tv_nsec = le32_to_cpu(die->i_ctime_nsec);
inode->i_size = le64_to_cpu(die->i_size);
@@ -149,11 +147,9 @@ static struct page *erofs_read_inode(struct inode *inode,
i_gid_write(inode, le16_to_cpu(dic->i_gid));
set_nlink(inode, le16_to_cpu(dic->i_nlink));
- /* use build time to derive all file time */
- inode->i_mtime.tv_sec = inode->i_ctime.tv_sec =
- sbi->build_time;
- inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec =
- sbi->build_time_nsec;
+ /* use build time for compact inodes */
+ inode->i_ctime.tv_sec = sbi->build_time;
+ inode->i_ctime.tv_nsec = sbi->build_time_nsec;
inode->i_size = le32_to_cpu(dic->i_size);
if (erofs_inode_is_data_compressed(vi->datalayout))
@@ -167,6 +163,11 @@ static struct page *erofs_read_inode(struct inode *inode,
goto err_out;
}
+ inode->i_mtime.tv_sec = inode->i_ctime.tv_sec;
+ inode->i_atime.tv_sec = inode->i_ctime.tv_sec;
+ inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec;
+ inode->i_atime.tv_nsec = inode->i_ctime.tv_nsec;
+
if (!nblks)
/* measure inode.i_blocks as generic filesystems */
inode->i_blocks = roundup(inode->i_size, EROFS_BLKSIZ) >> 9;
From: Aili Yao <yaoaili(a)kingsoft.com>
>From commit 6915564dc5a8 ("ACPI: OSL: Change the type of
acpi_os_map_generic_address() return value"),
acpi_os_map_generic_address() returns a logical address, or NULL on
error. For the ACPI_ADR_SPACE_SYSTEM_IO case, apei_map_generic_address()
should still return 0, as that is a normal case, but it now returns
-ENXIO. Check for that case explicitly to avoid einj module
initialization failure.
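The control flow can be sketched in user space as below. Everything here
is a hypothetical model: struct gas_model, map_model() and the two
map_generic_address_*() functions stand in for the ACPI structures and
helpers, and map_model() returning NULL for anything other than memory
space is an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum { SPACE_SYSTEM_MEMORY = 0, SPACE_SYSTEM_IO = 1 };

/* Hypothetical stand-in for struct acpi_generic_address. */
struct gas_model {
	int space_id;
	unsigned long address;
};

/* Assumed mapper: returns a non-NULL cookie only for memory space. */
static void *map_model(struct gas_model *reg)
{
	if (reg->space_id != SPACE_SYSTEM_MEMORY || reg->address == 0)
		return NULL;
	return reg;
}

/* Before the fix: a NULL result is always treated as an error. */
static int map_generic_address_old(struct gas_model *reg)
{
	return map_model(reg) ? 0 : -ENXIO;
}

/* After the fix: I/O space needs no mapping, so it succeeds early. */
static int map_generic_address_fixed(struct gas_model *reg)
{
	if (reg->space_id == SPACE_SYSTEM_IO)
		return 0;
	return map_model(reg) ? 0 : -ENXIO;
}

void map_generic_address_selfcheck(void)
{
	struct gas_model io = { SPACE_SYSTEM_IO, 0xcf8 };

	assert(map_generic_address_old(&io) == -ENXIO);
	assert(map_generic_address_fixed(&io) == 0);
}
```

The early return keeps NULL meaning "mapping failed" for memory space
while no longer treating "nothing to map" as a failure.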
Fixes: 6915564dc5a8 ("ACPI: OSL: Change the type of
acpi_os_map_generic_address() return value")
Cc: <stable(a)vger.kernel.org>
Reviewed-by: James Morse <james.morse(a)arm.com>
Tested-by: Tony Luck <tony.luck(a)intel.com>
Signed-off-by: Aili Yao <yaoaili(a)kingsoft.com>
---
drivers/acpi/apei/apei-base.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/acpi/apei/apei-base.c b/drivers/acpi/apei/apei-base.c
index 552fd9f..3294cc8 100644
--- a/drivers/acpi/apei/apei-base.c
+++ b/drivers/acpi/apei/apei-base.c
@@ -633,6 +633,10 @@ int apei_map_generic_address(struct acpi_generic_address *reg)
if (rc)
return rc;
+ /* IO space doesn't need mapping */
+ if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO)
+ return 0;
+
if (!acpi_os_map_generic_address(reg))
return -ENXIO;
--
2.9.5
This is a note to let you know that I've just added the patch titled
tty: serial: imx: fix potential deadlock
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 33f16855dcb973f745c51882d0e286601ff3be2b Mon Sep 17 00:00:00 2001
From: Sam Nobs <samuel.nobs(a)taitradio.com>
Date: Tue, 10 Nov 2020 09:50:06 +1300
Subject: tty: serial: imx: fix potential deadlock
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Enabling the lock dependency validator has revealed
that the way spinlocks are used in the IMX serial
port could result in a deadlock.
Specifically, imx_uart_int() acquires a spinlock
without disabling the interrupts, meaning that another
interrupt could come along and try to acquire the same
spinlock, potentially causing the two to wait for each
other indefinitely.
Use spin_lock_irqsave() instead to disable interrupts
upon acquisition of the spinlock.
Fixes: c974991d2620 ("tty:serial:imx: use spin_lock instead of spin_lock_irqsave in isr")
Reviewed-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Signed-off-by: Sam Nobs <samuel.nobs(a)taitradio.com>
Link: https://lore.kernel.org/r/1604955006-9363-1-git-send-email-samuel.nobs@tait…
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/imx.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 1731d9728865..3c53a3c89959 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -942,8 +942,14 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
struct imx_port *sport = dev_id;
unsigned int usr1, usr2, ucr1, ucr2, ucr3, ucr4;
irqreturn_t ret = IRQ_NONE;
+ unsigned long flags = 0;
- spin_lock(&sport->port.lock);
+ /*
+ * IRQs might not be disabled upon entering this interrupt handler,
+ * e.g. when interrupt handlers are forced to be threaded. To support
+ * this scenario as well, disable IRQs when acquiring the spinlock.
+ */
+ spin_lock_irqsave(&sport->port.lock, flags);
usr1 = imx_uart_readl(sport, USR1);
usr2 = imx_uart_readl(sport, USR2);
@@ -1013,7 +1019,7 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
ret = IRQ_HANDLED;
}
- spin_unlock(&sport->port.lock);
+ spin_unlock_irqrestore(&sport->port.lock, flags);
return ret;
}
--
2.29.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5ce6861d36ed5207aff9e5eead4c7cc38a986586 Mon Sep 17 00:00:00 2001
From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota(a)intel.com>
Date: Thu, 5 Nov 2020 17:18:42 -0800
Subject: [PATCH] drm/i915: Correctly set SFC capability for video engines
SFC capability of video engines is not set correctly because i915
is testing for incorrect bits.
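A rough illustration of the mismatch, under stated assumptions: the
sketch assumes vdbox_sfc_access is indexed by the instance number within
the video engine class, while engine->mask is a single bit indexed by a
global engine id. The struct, the id values and the function names are
hypothetical and are not taken from i915.

```c
#include <assert.h>
#include <stdbool.h>

#define BIT(n) (1u << (n))

/* Hypothetical model of one video engine. */
struct engine_model {
	unsigned int id;	/* global engine id (assumed numbering) */
	unsigned int instance;	/* index within the video engine class */
};

/* Old test: compares against the global-id bit, modelling engine->mask. */
static bool has_sfc_old(unsigned int vdbox_sfc_access,
			const struct engine_model *e)
{
	return vdbox_sfc_access & BIT(e->id);
}

/* New test: compares against the per-class instance bit. */
static bool has_sfc_new(unsigned int vdbox_sfc_access,
			const struct engine_model *e)
{
	return vdbox_sfc_access & BIT(e->instance);
}

void sfc_capability_selfcheck(void)
{
	/* First video engine: instance 0, global id 2 in this model. */
	struct engine_model vcs0 = { .id = 2, .instance = 0 };
	unsigned int access = BIT(0);	/* only instance 0 can reach SFC */

	assert(has_sfc_new(access, &vcs0));
	assert(!has_sfc_old(access, &vcs0));	/* capability is missed */
}
```

Under these assumptions the old test can both miss an engine that has SFC
access and report one that does not, depending on how the two numberings
happen to line up.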
Fixes: c5d3e39caa45 ("drm/i915: Engine discovery query")
Cc: Matt Roper <matthew.d.roper(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota(a)intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio(a)intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.3+
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106011842.36203-1-daniel…
(cherry picked from commit ad18fa0f5f052046cad96fee762b5c64f42dd86a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 5bfb5f7ed02c..efdeb7b7b2a0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -371,7 +371,8 @@ static void __setup_engine_capabilities(struct intel_engine_cs *engine)
* instances.
*/
if ((INTEL_GEN(i915) >= 11 &&
- engine->gt->info.vdbox_sfc_access & engine->mask) ||
+ (engine->gt->info.vdbox_sfc_access &
+ BIT(engine->instance))) ||
(INTEL_GEN(i915) >= 9 && engine->instance == 0))
engine->uabi_capabilities |=
I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC;
With the current implementation the following race can happen:
* blk_pre_runtime_suspend() calls blk_freeze_queue_start() and
blk_mq_unfreeze_queue().
* blk_queue_enter() calls blk_queue_pm_only() and that function returns
true.
* blk_queue_enter() calls blk_pm_request_resume() and that function does
not call pm_request_resume() because the queue runtime status is
RPM_ACTIVE.
* blk_pre_runtime_suspend() changes the queue status into RPM_SUSPENDING.
Fix this race by changing the queue runtime status into RPM_SUSPENDING
before switching q_usage_counter to atomic mode.
Acked-by: Alan Stern <stern(a)rowland.harvard.edu>
Acked-by: Stanley Chu <stanley.chu(a)mediatek.com>
Cc: Ming Lei <ming.lei(a)redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable <stable(a)vger.kernel.org>
Fixes: 986d413b7c15 ("blk-mq: Enable support for runtime power management")
Signed-off-by: Can Guo <cang(a)codeaurora.org>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
---
block/blk-pm.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/block/blk-pm.c b/block/blk-pm.c
index b85234d758f7..17bd020268d4 100644
--- a/block/blk-pm.c
+++ b/block/blk-pm.c
@@ -67,6 +67,10 @@ int blk_pre_runtime_suspend(struct request_queue *q)
WARN_ON_ONCE(q->rpm_status != RPM_ACTIVE);
+ spin_lock_irq(&q->queue_lock);
+ q->rpm_status = RPM_SUSPENDING;
+ spin_unlock_irq(&q->queue_lock);
+
/*
* Increase the pm_only counter before checking whether any
* non-PM blk_queue_enter() calls are in progress to avoid that any
@@ -89,15 +93,14 @@ int blk_pre_runtime_suspend(struct request_queue *q)
/* Switch q_usage_counter back to per-cpu mode. */
blk_mq_unfreeze_queue(q);
- spin_lock_irq(&q->queue_lock);
- if (ret < 0)
+ if (ret < 0) {
+ spin_lock_irq(&q->queue_lock);
+ q->rpm_status = RPM_ACTIVE;
pm_runtime_mark_last_busy(q->dev);
- else
- q->rpm_status = RPM_SUSPENDING;
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(&q->queue_lock);
- if (ret)
blk_clear_pm_only(q);
+ }
return ret;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e8973201d9b281375b5a8c66093de5679423021a Mon Sep 17 00:00:00 2001
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Date: Fri, 6 Nov 2020 18:25:30 +0900
Subject: [PATCH] mmc: renesas_sdhi_core: Add missing tmio_mmc_host_free() at
remove
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Commit 94b110aff867 ("mmc: tmio: add tmio_mmc_host_alloc/free()")
added tmio_mmc_host_free(), but did not add a call to it in
sh_mobile_sdhi_remove() at that time. Add the missing call. Otherwise,
we cannot rebind the sdhi/mmc devices when mmc aliases are used.
Fixes: 94b110aff867 ("mmc: tmio: add tmio_mmc_host_alloc/free()")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Reviewed-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas(a)ragnatech.se>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/1604654730-29914-1-git-send-email-yoshihiro.shimo…
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index 414314151d0a..03c905a781a7 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -1160,6 +1160,7 @@ int renesas_sdhi_remove(struct platform_device *pdev)
tmio_mmc_host_remove(host);
renesas_sdhi_clk_disable(host);
+ tmio_mmc_host_free(host);
return 0;
}