Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: b93aafd91105 - Convert trailing spaces and periods in path components
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
ppc64le:
Host 1:
✅ Boot test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ✅ Storage block - queue scheduler test
🚧 ✅ Storage nvme - tcp
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 2:
✅ Boot test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ ACPI table test
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking: igmp conformance test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: sanity smoke test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ storage: SCSI VPD
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ Firmware test suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Ethernet drivers sanity
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ kdump - kexec_boot
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 💥 Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Frequency Driver Test
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - ext4
🚧 ⚡⚡⚡ xfstests - xfs
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ ACPI table test
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking: igmp conformance test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: sanity smoke test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ storage: SCSI VPD
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ Firmware test suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Ethernet drivers sanity
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ kdump - kexec_boot
Host 5:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ ACPI table test
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking: igmp conformance test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: sanity smoke test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ storage: SCSI VPD
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ Firmware test suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Ethernet drivers sanity
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
This series replaces all the use of security_capable(current_cred(),
...) with ns_capable{,_noaudit}() which set PF_SUPERPRIV.
This initially come from a review of Landlock by Jann Horn:
https://lore.kernel.org/lkml/CAG48ez1FQVkt78129WozBwFbVhAPyAr9oJAHFHAbbNxEB…
Mickaël Salaün (2):
ptrace: Set PF_SUPERPRIV when checking capability
seccomp: Set PF_SUPERPRIV when checking capability
kernel/ptrace.c | 18 ++++++------------
kernel/seccomp.c | 5 ++---
2 files changed, 8 insertions(+), 15 deletions(-)
base-commit: 3650b228f83adda7e5ee532e2b90429c03f7b9ec
--
2.28.0
When the HW is powered down, the register state and links are lost. This
may be an issue in the firmware, or in the code expectations; whatever
it is, it is expected behaviour now for Tigerlake; stop warning!
References: https://gitlab.freedesktop.org/drm/intel/-/issues/2411
Fixes: 239bef676d8e ("drm/i915/display: Implement new combo phy initialization step")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Clinton A Taylor <clinton.a.taylor(a)intel.com>
Cc: Lucas De Marchi <lucas.demarchi(a)intel.com>
Cc: Matt Roper <matthew.d.roper(a)intel.com>
Cc: José Roberto de Souza <jose.souza(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.9+
---
drivers/gpu/drm/i915/display/intel_combo_phy.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_combo_phy.c b/drivers/gpu/drm/i915/display/intel_combo_phy.c
index d5ad61e4083e..9a87df982af8 100644
--- a/drivers/gpu/drm/i915/display/intel_combo_phy.c
+++ b/drivers/gpu/drm/i915/display/intel_combo_phy.c
@@ -428,9 +428,9 @@ static void icl_combo_phys_uninit(struct drm_i915_private *dev_priv)
if (phy == PHY_A &&
!icl_combo_phy_verify_state(dev_priv, phy))
- drm_warn(&dev_priv->drm,
- "Combo PHY %c HW state changed unexpectedly\n",
- phy_name(phy));
+ drm_dbg_kms(&dev_priv->drm,
+ "Combo PHY %c HW state changed unexpectedly\n",
+ phy_name(phy));
if (!has_phy_misc(dev_priv, phy))
goto skip_phy_misc;
--
2.20.1
commit 2fb541c862c9 ("net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc")
When the above upstream commit is backported to stable kernel,
one assignment is missing, which causes two problems reported
by Joakim and Vishwanath, see [1] and [2].
So add the assignment back to fix it.
1. https://www.spinics.net/lists/netdev/msg693916.html
2. https://www.spinics.net/lists/netdev/msg695131.html
Fixes: 749cc0b0c7f3 ("net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc")
Signed-off-by: Yunsheng Lin <linyunsheng(a)huawei.com>
---
net/sched/sch_generic.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 0e275e1..6e6147a 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1127,10 +1127,13 @@ static void dev_deactivate_queue(struct net_device *dev,
void *_qdisc_default)
{
struct Qdisc *qdisc = rtnl_dereference(dev_queue->qdisc);
+ struct Qdisc *qdisc_default = _qdisc_default;
if (qdisc) {
if (!(qdisc->flags & TCQ_F_BUILTIN))
set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
+
+ rcu_assign_pointer(dev_queue->qdisc, qdisc_default);
}
}
--
2.7.4
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From cb8d53d2c97369029cc638c9274ac7be0a316c75 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 22 Sep 2020 09:24:56 -0700
Subject: [PATCH] ext4: fix leaking sysfs kobject after failed mount
ext4_unregister_sysfs() only deletes the kobject. The reference to it
needs to be put separately, like ext4_put_super() does.
This addresses the syzbot report
"memory leak in kobject_set_name_vargs (3)"
(https://syzkaller.appspot.com/bug?extid=9f864abad79fae7c17e1).
Reported-by: syzbot+9f864abad79fae7c17e1(a)syzkaller.appspotmail.com
Fixes: 72ba74508b28 ("ext4: release sysfs kobject when failing to enable quotas on mount")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Link: https://lore.kernel.org/r/20200922162456.93657-1-ebiggers@kernel.org
Reviewed-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ea425b49b345..41953b86ffe3 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4872,6 +4872,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
failed_mount8:
ext4_unregister_sysfs(sb);
+ kobject_put(&sbi->s_kobj);
failed_mount7:
ext4_unregister_li_request(sb);
failed_mount6:
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8b92c4ff4423aa9900cf838d3294fcade4dbda35 Mon Sep 17 00:00:00 2001
From: Matteo Croce <mcroce(a)microsoft.com>
Date: Fri, 13 Nov 2020 22:52:02 -0800
Subject: [PATCH] Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Patch series "fix parsing of reboot= cmdline", v3.
The parsing of the reboot= cmdline has two major errors:
- a missing bound check can crash the system on reboot
- parsing of the cpu number only works if specified last
Fix both.
This patch (of 2):
This reverts commit 616feab753972b97.
kstrtoint() and simple_strtoul() have a subtle difference which makes
them non interchangeable: if a non digit character is found amid the
parsing, the former will return an error, while the latter will just
stop parsing, e.g. simple_strtoul("123xyx") = 123.
The kernel cmdline reboot= argument allows to specify the CPU used for
rebooting, with the syntax `s####` among the other flags, e.g.
"reboot=warm,s31,force", so if this flag is not the last given, it's
silently ignored as well as the subsequent ones.
Fixes: 616feab75397 ("kernel/reboot.c: convert simple_strtoul to kstrtoint")
Signed-off-by: Matteo Croce <mcroce(a)microsoft.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Cc: Petr Mladek <pmladek(a)suse.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Robin Holt <robinmholt(a)gmail.com>
Cc: Fabian Frederick <fabf(a)skynet.be>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/kernel/reboot.c b/kernel/reboot.c
index e7b78d5ae1ab..8fbba433725e 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -551,22 +551,15 @@ static int __init reboot_setup(char *str)
break;
case 's':
- {
- int rc;
-
- if (isdigit(*(str+1))) {
- rc = kstrtoint(str+1, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else if (str[1] == 'm' && str[2] == 'p' &&
- isdigit(*(str+3))) {
- rc = kstrtoint(str+3, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else
+ if (isdigit(*(str+1)))
+ reboot_cpu = simple_strtoul(str+1, NULL, 0);
+ else if (str[1] == 'm' && str[2] == 'p' &&
+ isdigit(*(str+3)))
+ reboot_cpu = simple_strtoul(str+3, NULL, 0);
+ else
*mode = REBOOT_SOFT;
break;
- }
+
case 'g':
*mode = REBOOT_GPIO;
break;
Hi Greg, Sasha,
Please consider the attached backport of
f91072ed1b72 ("perf/core: Fix race in the perf_mmap_close() function")
for v4.14.y, v4.19.y and v5.4.y
This will not apply in v4.9.y and v4.4.y and I am sending separate backport
for that.
--
Regards
Sudip
Hi Greg, Sasha,
This was missing from v4.4.y, v4.9.y and v4.14.y. Please consider
the attached backported patch.
Missed adding stable in the previous mail.
--
Regards
Sudip
commit 1978b3a53a74e3230cd46932b149c6e62e832e9a upstream.
On AMD CPUs which have the feature X86_FEATURE_AMD_STIBP_ALWAYS_ON,
STIBP is set to on and
spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED
At the same time, IBPB can be set to conditional.
However, this leads to the case where it's impossible to turn on IBPB
for a process because in the PR_SPEC_DISABLE case in ib_prctl_set() the
spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED
condition leads to a return before the task flag is set. Similarly,
ib_prctl_get() will return PR_SPEC_DISABLE even though IBPB is set to
conditional.
More generally, the following cases are possible:
1. STIBP = conditional && IBPB = on for spectre_v2_user=seccomp,ibpb
2. STIBP = on && IBPB = conditional for AMD CPUs with
X86_FEATURE_AMD_STIBP_ALWAYS_ON
The first case functions correctly today, but only because
spectre_v2_user_ibpb isn't updated to reflect the IBPB mode.
At a high level, this change does one thing. If either STIBP or IBPB
is set to conditional, allow the prctl to change the task flag.
Also, reflect that capability when querying the state. This isn't
perfect since it doesn't take into account if only STIBP or IBPB is
unconditionally on. But it allows the conditional feature to work as
expected, without affecting the unconditional one.
[ bp: Massage commit message and comment; space out statements for
better readability. ]
Fixes: 21998a351512 ("x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.")
Signed-off-by: Anand K Mistry <amistry(a)google.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Link: https://lkml.kernel.org/r/20201105163246.v2.1.Ifd7243cd3e2c2206a893ad0a5b9a…
Conflicts:
arch/x86/kernel/cpu/bugs.c
Superfluous newline which was removed in upstream commit a5ce9f2bb665
---
The one conflict is a newline in a comment which was removed in upstream
commit a5ce9f2bb665, but was not merged into the stable trees.
This patch applies cleanly on the stable trees 5.4, 4.19, 4.14, 4.9, and
4.4, which are affected by this bug.
arch/x86/kernel/cpu/bugs.c | 52 ++++++++++++++++++++++++--------------
1 file changed, 33 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index acbf3dbb8bf2..bdc1ed7ff669 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1252,6 +1252,14 @@ static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
return 0;
}
+static bool is_spec_ib_user_controlled(void)
+{
+ return spectre_v2_user_ibpb == SPECTRE_V2_USER_PRCTL ||
+ spectre_v2_user_ibpb == SPECTRE_V2_USER_SECCOMP ||
+ spectre_v2_user_stibp == SPECTRE_V2_USER_PRCTL ||
+ spectre_v2_user_stibp == SPECTRE_V2_USER_SECCOMP;
+}
+
static int ib_prctl_set(struct task_struct *task, unsigned long ctrl)
{
switch (ctrl) {
@@ -1259,17 +1267,26 @@ static int ib_prctl_set(struct task_struct *task, unsigned long ctrl)
if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE &&
spectre_v2_user_stibp == SPECTRE_V2_USER_NONE)
return 0;
- /*
- * Indirect branch speculation is always disabled in strict
- * mode. It can neither be enabled if it was force-disabled
- * by a previous prctl call.
+ /*
+ * With strict mode for both IBPB and STIBP, the instruction
+ * code paths avoid checking this task flag and instead,
+ * unconditionally run the instruction. However, STIBP and IBPB
+ * are independent and either can be set to conditionally
+ * enabled regardless of the mode of the other.
+ *
+ * If either is set to conditional, allow the task flag to be
+ * updated, unless it was force-disabled by a previous prctl
+ * call. Currently, this is possible on an AMD CPU which has the
+ * feature X86_FEATURE_AMD_STIBP_ALWAYS_ON. In this case, if the
+ * kernel is booted with 'spectre_v2_user=seccomp', then
+ * spectre_v2_user_ibpb == SPECTRE_V2_USER_SECCOMP and
+ * spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED.
*/
- if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ||
+ if (!is_spec_ib_user_controlled() ||
task_spec_ib_force_disable(task))
return -EPERM;
+
task_clear_spec_ib_disable(task);
task_update_spec_tif(task);
break;
@@ -1282,10 +1299,10 @@ static int ib_prctl_set(struct task_struct *task, unsigned long ctrl)
if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE &&
spectre_v2_user_stibp == SPECTRE_V2_USER_NONE)
return -EPERM;
- if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED)
+
+ if (!is_spec_ib_user_controlled())
return 0;
+
task_set_spec_ib_disable(task);
if (ctrl == PR_SPEC_FORCE_DISABLE)
task_set_spec_ib_force_disable(task);
@@ -1350,20 +1367,17 @@ static int ib_prctl_get(struct task_struct *task)
if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE &&
spectre_v2_user_stibp == SPECTRE_V2_USER_NONE)
return PR_SPEC_ENABLE;
- else if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED)
- return PR_SPEC_DISABLE;
- else if (spectre_v2_user_ibpb == SPECTRE_V2_USER_PRCTL ||
- spectre_v2_user_ibpb == SPECTRE_V2_USER_SECCOMP ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_PRCTL ||
- spectre_v2_user_stibp == SPECTRE_V2_USER_SECCOMP) {
+ else if (is_spec_ib_user_controlled()) {
if (task_spec_ib_force_disable(task))
return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
if (task_spec_ib_disable(task))
return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
- } else
+ } else if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT ||
+ spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
+ spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED)
+ return PR_SPEC_DISABLE;
+ else
return PR_SPEC_NOT_AFFECTED;
}
--
2.29.2.222.g5d2a92d10f8-goog
The patch below does not apply to the 5.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e1777d099728a76a8f8090f89649aac961e7e530 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)wdc.com>
Date: Fri, 6 Nov 2020 20:01:41 +0900
Subject: [PATCH] null_blk: Fix scheduling in atomic with zoned mode
Commit aa1c09cb65e2 ("null_blk: Fix locking in zoned mode") changed
zone locking to using the potentially sleeping wait_on_bit_io()
function. This is acceptable when memory backing is enabled as the
device queue is in that case marked as blocking, but this triggers a
scheduling while in atomic context with memory backing disabled.
Fix this by relying solely on the device zone spinlock for zone
information protection without temporarily releasing this lock around
null_process_cmd() execution in null_zone_write(). This is OK to do
since when memory backing is disabled, command processing does not
block and the memory backing lock nullb->lock is unused. This solution
avoids the overhead of having to mark a zoned null_blk device queue as
blocking when memory backing is unused.
This patch also adds comments to the zone locking code to explain the
unusual locking scheme.
Fixes: aa1c09cb65e2 ("null_blk: Fix locking in zoned mode")
Reported-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/drivers/block/null_blk.h b/drivers/block/null_blk.h
index cfd00ad40355..c24d9b5ad81a 100644
--- a/drivers/block/null_blk.h
+++ b/drivers/block/null_blk.h
@@ -47,7 +47,7 @@ struct nullb_device {
unsigned int nr_zones_closed;
struct blk_zone *zones;
sector_t zone_size_sects;
- spinlock_t zone_dev_lock;
+ spinlock_t zone_lock;
unsigned long *zone_locks;
unsigned long size; /* device size in MB */
diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
index 8775acbb4f8f..beb34b4f76b0 100644
--- a/drivers/block/null_blk_zoned.c
+++ b/drivers/block/null_blk_zoned.c
@@ -46,11 +46,20 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
if (!dev->zones)
return -ENOMEM;
- spin_lock_init(&dev->zone_dev_lock);
- dev->zone_locks = bitmap_zalloc(dev->nr_zones, GFP_KERNEL);
- if (!dev->zone_locks) {
- kvfree(dev->zones);
- return -ENOMEM;
+ /*
+ * With memory backing, the zone_lock spinlock needs to be temporarily
+ * released to avoid scheduling in atomic context. To guarantee zone
+ * information protection, use a bitmap to lock zones with
+ * wait_on_bit_lock_io(). Sleeping on the lock is OK as memory backing
+ * implies that the queue is marked with BLK_MQ_F_BLOCKING.
+ */
+ spin_lock_init(&dev->zone_lock);
+ if (dev->memory_backed) {
+ dev->zone_locks = bitmap_zalloc(dev->nr_zones, GFP_KERNEL);
+ if (!dev->zone_locks) {
+ kvfree(dev->zones);
+ return -ENOMEM;
+ }
}
if (dev->zone_nr_conv >= dev->nr_zones) {
@@ -137,12 +146,17 @@ void null_free_zoned_dev(struct nullb_device *dev)
static inline void null_lock_zone(struct nullb_device *dev, unsigned int zno)
{
- wait_on_bit_lock_io(dev->zone_locks, zno, TASK_UNINTERRUPTIBLE);
+ if (dev->memory_backed)
+ wait_on_bit_lock_io(dev->zone_locks, zno, TASK_UNINTERRUPTIBLE);
+ spin_lock_irq(&dev->zone_lock);
}
static inline void null_unlock_zone(struct nullb_device *dev, unsigned int zno)
{
- clear_and_wake_up_bit(zno, dev->zone_locks);
+ spin_unlock_irq(&dev->zone_lock);
+
+ if (dev->memory_backed)
+ clear_and_wake_up_bit(zno, dev->zone_locks);
}
int null_report_zones(struct gendisk *disk, sector_t sector,
@@ -322,7 +336,6 @@ static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector,
return null_process_cmd(cmd, REQ_OP_WRITE, sector, nr_sectors);
null_lock_zone(dev, zno);
- spin_lock(&dev->zone_dev_lock);
switch (zone->cond) {
case BLK_ZONE_COND_FULL:
@@ -375,9 +388,17 @@ static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector,
if (zone->cond != BLK_ZONE_COND_EXP_OPEN)
zone->cond = BLK_ZONE_COND_IMP_OPEN;
- spin_unlock(&dev->zone_dev_lock);
+ /*
+ * Memory backing allocation may sleep: release the zone_lock spinlock
+ * to avoid scheduling in atomic context. Zone operation atomicity is
+ * still guaranteed through the zone_locks bitmap.
+ */
+ if (dev->memory_backed)
+ spin_unlock_irq(&dev->zone_lock);
ret = null_process_cmd(cmd, REQ_OP_WRITE, sector, nr_sectors);
- spin_lock(&dev->zone_dev_lock);
+ if (dev->memory_backed)
+ spin_lock_irq(&dev->zone_lock);
+
if (ret != BLK_STS_OK)
goto unlock;
@@ -392,7 +413,6 @@ static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector,
ret = BLK_STS_OK;
unlock:
- spin_unlock(&dev->zone_dev_lock);
null_unlock_zone(dev, zno);
return ret;
@@ -516,9 +536,7 @@ static blk_status_t null_zone_mgmt(struct nullb_cmd *cmd, enum req_opf op,
null_lock_zone(dev, i);
zone = &dev->zones[i];
if (zone->cond != BLK_ZONE_COND_EMPTY) {
- spin_lock(&dev->zone_dev_lock);
null_reset_zone(dev, zone);
- spin_unlock(&dev->zone_dev_lock);
trace_nullb_zone_op(cmd, i, zone->cond);
}
null_unlock_zone(dev, i);
@@ -530,7 +548,6 @@ static blk_status_t null_zone_mgmt(struct nullb_cmd *cmd, enum req_opf op,
zone = &dev->zones[zone_no];
null_lock_zone(dev, zone_no);
- spin_lock(&dev->zone_dev_lock);
switch (op) {
case REQ_OP_ZONE_RESET:
@@ -550,8 +567,6 @@ static blk_status_t null_zone_mgmt(struct nullb_cmd *cmd, enum req_opf op,
break;
}
- spin_unlock(&dev->zone_dev_lock);
-
if (ret == BLK_STS_OK)
trace_nullb_zone_op(cmd, zone_no, zone->cond);
Hi,
Please backport 44492e70adc8 to 5.4 stable.
The commit fixes broken WiFi for many users.
Currently only stable 5.4 misses this patch, older kernels don't have rtw88.
Kai-Heng
From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota(a)intel.com>
Commit 5ce6861d36ed5207aff9e5eead4c7cc38a986586 upstream.
This backport targets stable version 5.4, since the original patch fails
to apply there, due to a variable having moved from one struct to another.
The only change required for the patch to apply to 5.4 is to use the
correct structure:
- (engine->gt->info.vdbox_sfc_access &
++ (RUNTIME_INFO(i915)->vdbox_sfc_access &
Original commit message below.
SFC capability of video engines is not set correctly because i915
is testing for incorrect bits.
Fixes: c5d3e39caa45 ("drm/i915: Engine discovery query")
Cc: Matt Roper <matthew.d.roper(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota(a)intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio(a)intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.3+
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106011842.36203-1-daniel…
(cherry picked from commit ad18fa0f5f052046cad96fee762b5c64f42dd86a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
---
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4ce8626b140e..8073758d1036 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -354,7 +354,8 @@ static void __setup_engine_capabilities(struct intel_engine_cs *engine)
* instances.
*/
if ((INTEL_GEN(i915) >= 11 &&
- RUNTIME_INFO(i915)->vdbox_sfc_access & engine->mask) ||
+ (RUNTIME_INFO(i915)->vdbox_sfc_access &
+ BIT(engine->instance))) ||
(INTEL_GEN(i915) >= 9 && engine->instance == 0))
engine->uabi_capabilities |=
I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC;
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio: accel: kxcjk1013: Add support for KIOX010A ACPI DSM for setting
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From e5b1032a656e9aa4c7a4df77cb9156a2a651a5f9 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Tue, 10 Nov 2020 14:38:35 +0100
Subject: iio: accel: kxcjk1013: Add support for KIOX010A ACPI DSM for setting
tablet-mode
Some 360 degree hinges (yoga) style 2-in-1 devices use 2 KXCJ91008-s
to allow the OS to determine the angle between the display and the base
of the device, so that the OS can determine if the 2-in-1 is in laptop
or in tablet-mode.
On Windows both accelerometers are read by a special HingeAngleService
process; and this process calls a DSM (Device Specific Method) on the
ACPI KIOX010A device node for the sensor in the display, to let the
embedded-controller (EC) know about the mode so that it can disable the
kbd and touchpad to avoid spurious input while folded into tablet-mode.
This notifying of the EC is problematic because sometimes the EC comes up
thinking that device is in tablet-mode and the kbd and touchpad do not
work. This happens for example on Irbis NB111 devices after a suspend /
resume cycle (after a complete battery drain / hard reset without having
booted Windows at least once). Other 2-in-1s which are likely affected
too are e.g. the Teclast F5 and F6 series.
The kxcjk-1013 driver may seem like a strange place to deal with this,
but since it is *the* driver for the ACPI KIOX010A device, it is also
the driver which has access to the ACPI handle needed by the DSM.
Add support for calling the DSM and on probe unconditionally tell the
EC that the device is laptop mode, fixing the kbd and touchpad sometimes
not working.
Fixes: 7f6232e69539 ("iio: accel: kxcjk1013: Add KIOX010A ACPI Hardware-ID")
Reported-and-tested-by: russianneuromancer <russianneuromancer(a)ya.ru>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Cc: <Stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201110133835.129080-3-hdegoede@redhat.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/accel/kxcjk-1013.c | 36 ++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/drivers/iio/accel/kxcjk-1013.c b/drivers/iio/accel/kxcjk-1013.c
index abeb0d254046..560a3373ff20 100644
--- a/drivers/iio/accel/kxcjk-1013.c
+++ b/drivers/iio/accel/kxcjk-1013.c
@@ -129,6 +129,7 @@ enum kx_chipset {
enum kx_acpi_type {
ACPI_GENERIC,
ACPI_SMO8500,
+ ACPI_KIOX010A,
};
struct kxcjk1013_data {
@@ -275,6 +276,32 @@ static const struct {
{19163, 1, 0},
{38326, 0, 1} };
+#ifdef CONFIG_ACPI
+enum kiox010a_fn_index {
+ KIOX010A_SET_LAPTOP_MODE = 1,
+ KIOX010A_SET_TABLET_MODE = 2,
+};
+
+static int kiox010a_dsm(struct device *dev, int fn_index)
+{
+ acpi_handle handle = ACPI_HANDLE(dev);
+ guid_t kiox010a_dsm_guid;
+ union acpi_object *obj;
+
+ if (!handle)
+ return -ENODEV;
+
+ guid_parse("1f339696-d475-4e26-8cad-2e9f8e6d7a91", &kiox010a_dsm_guid);
+
+ obj = acpi_evaluate_dsm(handle, &kiox010a_dsm_guid, 1, fn_index, NULL);
+ if (!obj)
+ return -EIO;
+
+ ACPI_FREE(obj);
+ return 0;
+}
+#endif
+
static int kxcjk1013_set_mode(struct kxcjk1013_data *data,
enum kxcjk1013_mode mode)
{
@@ -352,6 +379,13 @@ static int kxcjk1013_chip_init(struct kxcjk1013_data *data)
{
int ret;
+#ifdef CONFIG_ACPI
+ if (data->acpi_type == ACPI_KIOX010A) {
+ /* Make sure the kbd and touchpad on 2-in-1s using 2 KXCJ91008-s work */
+ kiox010a_dsm(&data->client->dev, KIOX010A_SET_LAPTOP_MODE);
+ }
+#endif
+
ret = i2c_smbus_read_byte_data(data->client, KXCJK1013_REG_WHO_AM_I);
if (ret < 0) {
dev_err(&data->client->dev, "Error reading who_am_i\n");
@@ -1262,6 +1296,8 @@ static const char *kxcjk1013_match_acpi_device(struct device *dev,
if (strcmp(id->id, "SMO8500") == 0)
*acpi_type = ACPI_SMO8500;
+ else if (strcmp(id->id, "KIOX010A") == 0)
+ *acpi_type = ACPI_KIOX010A;
*chipset = (enum kx_chipset)id->driver_data;
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio: accel: kxcjk1013: Replace is_smo8500_device with an acpi_type
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 11e94f28c3de35d5ad1ac6a242a5b30f4378991a Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Tue, 10 Nov 2020 14:38:34 +0100
Subject: iio: accel: kxcjk1013: Replace is_smo8500_device with an acpi_type
enum
Replace the boolean is_smo8500_device variable with an acpi_type enum.
For now this can be either ACPI_GENERIC or ACPI_SMO8500, this is a
preparation patch for adding special handling for the KIOX010A ACPI HID,
which will add a ACPI_KIOX010A acpi_type to the introduced enum.
For stable as needed as precursor for next patch.
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Fixes: 7f6232e69539 ("iio: accel: kxcjk1013: Add KIOX010A ACPI Hardware-ID")
Cc: <Stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201110133835.129080-2-hdegoede@redhat.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/accel/kxcjk-1013.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/iio/accel/kxcjk-1013.c b/drivers/iio/accel/kxcjk-1013.c
index beb38d9d607d..abeb0d254046 100644
--- a/drivers/iio/accel/kxcjk-1013.c
+++ b/drivers/iio/accel/kxcjk-1013.c
@@ -126,6 +126,11 @@ enum kx_chipset {
KX_MAX_CHIPS /* this must be last */
};
+enum kx_acpi_type {
+ ACPI_GENERIC,
+ ACPI_SMO8500,
+};
+
struct kxcjk1013_data {
struct i2c_client *client;
struct iio_trigger *dready_trig;
@@ -143,7 +148,7 @@ struct kxcjk1013_data {
bool motion_trigger_on;
int64_t timestamp;
enum kx_chipset chipset;
- bool is_smo8500_device;
+ enum kx_acpi_type acpi_type;
};
enum kxcjk1013_axis {
@@ -1247,7 +1252,7 @@ static irqreturn_t kxcjk1013_data_rdy_trig_poll(int irq, void *private)
static const char *kxcjk1013_match_acpi_device(struct device *dev,
enum kx_chipset *chipset,
- bool *is_smo8500_device)
+ enum kx_acpi_type *acpi_type)
{
const struct acpi_device_id *id;
@@ -1256,7 +1261,7 @@ static const char *kxcjk1013_match_acpi_device(struct device *dev,
return NULL;
if (strcmp(id->id, "SMO8500") == 0)
- *is_smo8500_device = true;
+ *acpi_type = ACPI_SMO8500;
*chipset = (enum kx_chipset)id->driver_data;
@@ -1299,7 +1304,7 @@ static int kxcjk1013_probe(struct i2c_client *client,
} else if (ACPI_HANDLE(&client->dev)) {
name = kxcjk1013_match_acpi_device(&client->dev,
&data->chipset,
- &data->is_smo8500_device);
+ &data->acpi_type);
} else
return -ENODEV;
@@ -1316,7 +1321,7 @@ static int kxcjk1013_probe(struct i2c_client *client,
indio_dev->modes = INDIO_DIRECT_MODE;
indio_dev->info = &kxcjk1013_info;
- if (client->irq > 0 && !data->is_smo8500_device) {
+ if (client->irq > 0 && data->acpi_type != ACPI_SMO8500) {
ret = devm_request_threaded_irq(&client->dev, client->irq,
kxcjk1013_data_rdy_trig_poll,
kxcjk1013_event_handler,
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio: light: fix kconfig dependency bug for VCNL4035
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 44a146a44f656fc03d368c1b9248d29a128cd053 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
Date: Tue, 3 Nov 2020 01:35:24 +0300
Subject: iio: light: fix kconfig dependency bug for VCNL4035
When VCNL4035 is enabled and IIO_BUFFER is disabled, it results in the
following Kbuild warning:
WARNING: unmet direct dependencies detected for IIO_TRIGGERED_BUFFER
Depends on [n]: IIO [=y] && IIO_BUFFER [=n]
Selected by [y]:
- VCNL4035 [=y] && IIO [=y] && I2C [=y]
The reason is that VCNL4035 selects IIO_TRIGGERED_BUFFER without depending
on or selecting IIO_BUFFER while IIO_TRIGGERED_BUFFER depends on
IIO_BUFFER. This can also fail building the kernel.
Honor the kconfig dependency to remove unmet direct dependency warnings
and avoid any potential build failures.
Fixes: 55707294c4eb ("iio: light: Add support for vishay vcnl4035")
Signed-off-by: Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=209883
Link: https://lore.kernel.org/r/20201102223523.572461-1-fazilyildiran@gmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/light/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/iio/light/Kconfig b/drivers/iio/light/Kconfig
index cade6dc0305b..33ad4dd0b5c7 100644
--- a/drivers/iio/light/Kconfig
+++ b/drivers/iio/light/Kconfig
@@ -544,6 +544,7 @@ config VCNL4000
config VCNL4035
tristate "VCNL4035 combined ALS and proximity sensor"
+ select IIO_BUFFER
select IIO_TRIGGERED_BUFFER
select REGMAP_I2C
depends on I2C
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio/adc: ingenic: Fix AUX/VBAT readings when touchscreen is used
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 6d6aa2907d59ddd3c0ebb2b93e1ddc84e474485b Mon Sep 17 00:00:00 2001
From: Paul Cercueil <paul(a)crapouillou.net>
Date: Tue, 3 Nov 2020 20:12:38 +0000
Subject: iio/adc: ingenic: Fix AUX/VBAT readings when touchscreen is used
When the command feature of the ADC is used, it is possible to program
the ADC, and specify at each step what input should be processed, and in
comparison to what reference.
This broke the AUX and battery readings when the touchscreen was
enabled, most likely because the CMD feature would change the VREF all
the time.
Now, when AUX or battery are read, we temporarily disable the CMD
feature, which means that we won't get touchscreen readings in that time
frame. But it now gives correct values for AUX / battery, and the
touchscreen isn't disabled for long enough to be an actual issue.
Fixes: b96952f498db ("IIO: Ingenic JZ47xx: Add touchscreen mode.")
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
Acked-by: Artur Rojek <contact(a)artur-rojek.eu>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201103201238.161083-1-paul@crapouillou.net
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/ingenic-adc.c | 32 ++++++++++++++++++++++++++------
1 file changed, 26 insertions(+), 6 deletions(-)
diff --git a/drivers/iio/adc/ingenic-adc.c b/drivers/iio/adc/ingenic-adc.c
index 973e84deebea..1aafbe2cfe67 100644
--- a/drivers/iio/adc/ingenic-adc.c
+++ b/drivers/iio/adc/ingenic-adc.c
@@ -177,13 +177,12 @@ static void ingenic_adc_set_config(struct ingenic_adc *adc,
mutex_unlock(&adc->lock);
}
-static void ingenic_adc_enable(struct ingenic_adc *adc,
- int engine,
- bool enabled)
+static void ingenic_adc_enable_unlocked(struct ingenic_adc *adc,
+ int engine,
+ bool enabled)
{
u8 val;
- mutex_lock(&adc->lock);
val = readb(adc->base + JZ_ADC_REG_ENABLE);
if (enabled)
@@ -192,20 +191,41 @@ static void ingenic_adc_enable(struct ingenic_adc *adc,
val &= ~BIT(engine);
writeb(val, adc->base + JZ_ADC_REG_ENABLE);
+}
+
+static void ingenic_adc_enable(struct ingenic_adc *adc,
+ int engine,
+ bool enabled)
+{
+ mutex_lock(&adc->lock);
+ ingenic_adc_enable_unlocked(adc, engine, enabled);
mutex_unlock(&adc->lock);
}
static int ingenic_adc_capture(struct ingenic_adc *adc,
int engine)
{
+ u32 cfg;
u8 val;
int ret;
- ingenic_adc_enable(adc, engine, true);
+ /*
+ * Disable CMD_SEL temporarily, because it causes wrong VBAT readings,
+ * probably due to the switch of VREF. We must keep the lock here to
+ * avoid races with the buffer enable/disable functions.
+ */
+ mutex_lock(&adc->lock);
+ cfg = readl(adc->base + JZ_ADC_REG_CFG);
+ writel(cfg & ~JZ_ADC_REG_CFG_CMD_SEL, adc->base + JZ_ADC_REG_CFG);
+
+ ingenic_adc_enable_unlocked(adc, engine, true);
ret = readb_poll_timeout(adc->base + JZ_ADC_REG_ENABLE, val,
!(val & BIT(engine)), 250, 1000);
if (ret)
- ingenic_adc_enable(adc, engine, false);
+ ingenic_adc_enable_unlocked(adc, engine, false);
+
+ writel(cfg, adc->base + JZ_ADC_REG_CFG);
+ mutex_unlock(&adc->lock);
return ret;
}
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio/adc: ingenic: Fix battery VREF for JZ4770 SoC
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From c91ebcc578e09783cfa4d85c1b437790f140f29a Mon Sep 17 00:00:00 2001
From: Paul Cercueil <paul(a)crapouillou.net>
Date: Wed, 4 Nov 2020 19:28:43 +0000
Subject: iio/adc: ingenic: Fix battery VREF for JZ4770 SoC
The reference voltage for the battery is clearly marked as 1.2V in the
programming manual. With this fixed, the battery channel now returns
correct values.
Fixes: a515d6488505 ("IIO: Ingenic JZ47xx: Add support for JZ4770 SoC ADC.")
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
Acked-by: Artur Rojek <contact(a)artur-rojek.eu>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201104192843.67187-1-paul@crapouillou.net
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/ingenic-adc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/adc/ingenic-adc.c b/drivers/iio/adc/ingenic-adc.c
index 92b25083e23f..973e84deebea 100644
--- a/drivers/iio/adc/ingenic-adc.c
+++ b/drivers/iio/adc/ingenic-adc.c
@@ -71,7 +71,7 @@
#define JZ4725B_ADC_BATTERY_HIGH_VREF_BITS 10
#define JZ4740_ADC_BATTERY_HIGH_VREF (7500 * 0.986)
#define JZ4740_ADC_BATTERY_HIGH_VREF_BITS 12
-#define JZ4770_ADC_BATTERY_VREF 6600
+#define JZ4770_ADC_BATTERY_VREF 1200
#define JZ4770_ADC_BATTERY_VREF_BITS 12
#define JZ_ADC_IRQ_AUX BIT(0)
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio: imu: st_lsm6dsx: set 10ms as min shub slave timeout
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From fe0b980ffd1dd8b10c09f82385514819ba2a661d Mon Sep 17 00:00:00 2001
From: Lorenzo Bianconi <lorenzo(a)kernel.org>
Date: Sun, 1 Nov 2020 17:21:18 +0100
Subject: iio: imu: st_lsm6dsx: set 10ms as min shub slave timeout
Set 10ms as minimum i2c slave configuration timeout since st_lsm6dsx
relies on accel ODR for i2c master clock and at high sample rates
(e.g. 833Hz or 416Hz) the slave sensor occasionally may need more cycles
than i2c master timeout (2s/833Hz + 1 ~ 3ms) to apply the configuration
resulting in an uncomplete slave configuration and a constant reading
from the i2c slave connected to st_lsm6dsx i2c master.
Fixes: 8f9a5249e3d9 ("iio: imu: st_lsm6dsx: enable 833Hz sample frequency for tagged sensors")
Fixes: c91c1c844ebd ("iio: imu: st_lsm6dsx: add i2c embedded controller support")
Signed-off-by: Lorenzo Bianconi <lorenzo(a)kernel.org>
Cc: <Stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/a69c8236bf16a1569966815ed71710af2722ed7d.16042472…
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c
index 8c8d8870ca07..99562ba85ee4 100644
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c
@@ -156,11 +156,13 @@ static const struct st_lsm6dsx_ext_dev_settings st_lsm6dsx_ext_dev_table[] = {
static void st_lsm6dsx_shub_wait_complete(struct st_lsm6dsx_hw *hw)
{
struct st_lsm6dsx_sensor *sensor;
- u32 odr;
+ u32 odr, timeout;
sensor = iio_priv(hw->iio_devs[ST_LSM6DSX_ID_ACC]);
odr = (hw->enable_mask & BIT(ST_LSM6DSX_ID_ACC)) ? sensor->odr : 12500;
- msleep((2000000U / odr) + 1);
+ /* set 10ms as minimum timeout for i2c slave configuration */
+ timeout = max_t(u32, 2000000U / odr + 1, 10);
+ msleep(timeout);
}
/*
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio: adc: mediatek: fix unset field
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 15207a92e019803d62687455d8aa2ff9eb3dc82c Mon Sep 17 00:00:00 2001
From: Fabien Parent <fparent(a)baylibre.com>
Date: Sun, 18 Oct 2020 21:46:44 +0200
Subject: iio: adc: mediatek: fix unset field
dev_comp field is used in a couple of places but it is never set. This
results in kernel oops when dereferencing a NULL pointer. Set the
`dev_comp` field correctly in the probe function.
Fixes: 6d97024dce23 ("iio: adc: mediatek: mt6577-auxadc, add mt6765 support")
Signed-off-by: Fabien Parent <fparent(a)baylibre.com>
Reviewed-by: Matthias Brugger <matthias.bgg(a)gmail.com>
Cc: <Stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201018194644.3366846-1-fparent@baylibre.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/mt6577_auxadc.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/adc/mt6577_auxadc.c b/drivers/iio/adc/mt6577_auxadc.c
index ac415cb089cd..79c1dd68b909 100644
--- a/drivers/iio/adc/mt6577_auxadc.c
+++ b/drivers/iio/adc/mt6577_auxadc.c
@@ -9,9 +9,9 @@
#include <linux/err.h>
#include <linux/kernel.h>
#include <linux/module.h>
-#include <linux/of.h>
-#include <linux/of_device.h>
+#include <linux/mod_devicetable.h>
#include <linux/platform_device.h>
+#include <linux/property.h>
#include <linux/iopoll.h>
#include <linux/io.h>
#include <linux/iio/iio.h>
@@ -276,6 +276,8 @@ static int mt6577_auxadc_probe(struct platform_device *pdev)
goto err_disable_clk;
}
+ adc_dev->dev_comp = device_get_match_data(&pdev->dev);
+
mutex_init(&adc_dev->lock);
mt6577_auxadc_mod_reg(adc_dev->reg_base + MT6577_AUXADC_MISC,
--
2.29.2
This is a note to let you know that I've just added the patch titled
iio: cros_ec: Use default frequencies when EC returns invalid
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 56e4f2dda23c6d39d327944faa89efaa4eb290d1 Mon Sep 17 00:00:00 2001
From: Gwendal Grignou <gwendal(a)chromium.org>
Date: Tue, 30 Jun 2020 08:37:30 -0700
Subject: iio: cros_ec: Use default frequencies when EC returns invalid
information
Minimal and maximal frequencies supported by a sensor is queried.
On some older machines, these frequencies are not returned properly and
the EC returns 0 instead.
When returned maximal frequency is 0, ignore the information and use
default frequencies instead.
Fixes: ae7b02ad2f32 ("iio: common: cros_ec_sensors: Expose cros_ec_sensors frequency range via iio sysfs")
Signed-off-by: Gwendal Grignou <gwendal(a)chromium.org>
Reviewed-by: Enric Balletbo i Serra <enric.balletbo(a)collabora.com>
Link: https://lore.kernel.org/r/20200630153730.3302889-1-gwendal@chromium.org
CC: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
.../cros_ec_sensors/cros_ec_sensors_core.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
index c62cacc04672..e3f507771f17 100644
--- a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
+++ b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
@@ -256,7 +256,7 @@ int cros_ec_sensors_core_init(struct platform_device *pdev,
struct cros_ec_sensorhub *sensor_hub = dev_get_drvdata(dev->parent);
struct cros_ec_dev *ec = sensor_hub->ec;
struct cros_ec_sensor_platform *sensor_platform = dev_get_platdata(dev);
- u32 ver_mask;
+ u32 ver_mask, temp;
int frequencies[ARRAY_SIZE(state->frequencies) / 2] = { 0 };
int ret, i;
@@ -311,10 +311,16 @@ int cros_ec_sensors_core_init(struct platform_device *pdev,
&frequencies[2],
&state->fifo_max_event_count);
} else {
- frequencies[1] = state->resp->info_3.min_frequency;
- frequencies[2] = state->resp->info_3.max_frequency;
- state->fifo_max_event_count =
- state->resp->info_3.fifo_max_event_count;
+ if (state->resp->info_3.max_frequency == 0) {
+ get_default_min_max_freq(state->resp->info.type,
+ &frequencies[1],
+ &frequencies[2],
+ &temp);
+ } else {
+ frequencies[1] = state->resp->info_3.min_frequency;
+ frequencies[2] = state->resp->info_3.max_frequency;
+ }
+ state->fifo_max_event_count = state->resp->info_3.fifo_max_event_count;
}
for (i = 0; i < ARRAY_SIZE(frequencies); i++) {
state->frequencies[2 * i] = frequencies[i] / 1000;
--
2.29.2
From: Alexander Sverdlin <alexander.sverdlin(a)nokia.com>
Linux doesn't own the memory immediately after the kernel image. On Octeon
bootloader places a shared structure right close after the kernel _end,
refer to "struct cvmx_bootinfo *octeon_bootinfo" in cavium-octeon/setup.c.
If check_kernel_sections_mem() rounds the PFNs up, first memblock_alloc()
inside early_init_dt_alloc_memory_arch() <= device_tree_init() returns
memory block overlapping with the above octeon_bootinfo structure, which
is being overwritten afterwards.
Cc: stable(a)vger.kernel.org
Fixes: a94e4f24ec83 ("MIPS: init: Drop boot_mem_map")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin(a)nokia.com>
---
arch/mips/kernel/setup.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 0d42532..f6cf2f6 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -504,6 +504,12 @@ static void __init check_kernel_sections_mem(void)
if (!memblock_is_region_memory(start, size)) {
pr_info("Kernel sections are not in the memory maps\n");
memblock_add(start, size);
+ /*
+ * Octeon bootloader places shared data structure right after
+ * the kernel => make sure it will not be corrupted.
+ */
+ memblock_reserve(__pa_symbol(&_end),
+ start + size - __pa_symbol(&_end));
}
}
--
2.10.2
Currently scan_microcode() leverages microcode_matches() to check if the
microcode matches the CPU by comparing the family and model. However before
saving the microcode in scan_microcode(), the processor stepping and flag
of the microcode signature should also be considered in order to avoid
incompatible update and caused the failure of microcode update.
For example on one platform the microcode failed to be updated to the
latest revison on APs during resume from S3 due to incompatible cpu
stepping and signature->pf. This is because the scan_microcode() has
saved an incompatible copy of intel_ucode_patch in
save_microcode_in_initrd_intel() after bootup. And this intel_ucode_patch
is used by APs during early resume from S3 which results in unchecked MSR
access error during resume from S3:
[ 95.519390] unchecked MSR access error: RDMSR from 0x123 at
rIP: 0xffffffffb7676208 (native_read_msr+0x8/0x40)
[ 95.519391] Call Trace:
[ 95.519395] update_srbds_msr+0x38/0x80
[ 95.519396] identify_secondary_cpu+0x7a/0x90
[ 95.519397] smp_store_cpu_info+0x4e/0x60
[ 95.519398] start_secondary+0x49/0x150
[ 95.519399] secondary_startup_64_no_verify+0xa6/0xab
The system keeps running on old microcode during resume:
[ 210.366757] microcode: load_ucode_intel_ap: CPU1, enter, intel_ucode_patch: 0xffff9bf2816e0000
[ 210.366757] microcode: load_ucode_intel_ap: CPU1, p: 0xffff9bf2816e0000, rev: 0xd6
[ 210.366759] microcode: apply_microcode_early: rev: 0x84
[ 210.367826] microcode: apply_microcode_early: rev after upgrade: 0x84
until mc_cpu_starting() is invoked on each AP during resume and the
correct microcode is updated via apply_microcode_intel().
To fix this issue, the scan_microcode() uses find_matching_signature()
instead of microcode_matches() to compare the (family, model, stepping,
processor flag), and only save the microcode that matches. As there is
no other place invoking microcode_matches(), remove it accordingly.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208535
Fixes: 06b8534cb728 ("x86/microcode: Rework microcode loading")
Cc: stable(a)vger.kernel.org#v4.10+
Reviewed-by: Ashok Raj <ashok.raj(a)intel.com>
Signed-off-by: Chen Yu <yu.c.chen(a)intel.com>
---
v2: Remove RFC tag and Cc the stable mailing list.
---
arch/x86/kernel/cpu/microcode/intel.c | 50 ++-------------------------
1 file changed, 2 insertions(+), 48 deletions(-)
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 6a99535d7f37..923853f79099 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -100,53 +100,6 @@ static int has_newer_microcode(void *mc, unsigned int csig, int cpf, int new_rev
return find_matching_signature(mc, csig, cpf);
}
-/*
- * Given CPU signature and a microcode patch, this function finds if the
- * microcode patch has matching family and model with the CPU.
- *
- * %true - if there's a match
- * %false - otherwise
- */
-static bool microcode_matches(struct microcode_header_intel *mc_header,
- unsigned long sig)
-{
- unsigned long total_size = get_totalsize(mc_header);
- unsigned long data_size = get_datasize(mc_header);
- struct extended_sigtable *ext_header;
- unsigned int fam_ucode, model_ucode;
- struct extended_signature *ext_sig;
- unsigned int fam, model;
- int ext_sigcount, i;
-
- fam = x86_family(sig);
- model = x86_model(sig);
-
- fam_ucode = x86_family(mc_header->sig);
- model_ucode = x86_model(mc_header->sig);
-
- if (fam == fam_ucode && model == model_ucode)
- return true;
-
- /* Look for ext. headers: */
- if (total_size <= data_size + MC_HEADER_SIZE)
- return false;
-
- ext_header = (void *) mc_header + data_size + MC_HEADER_SIZE;
- ext_sig = (void *)ext_header + EXT_HEADER_SIZE;
- ext_sigcount = ext_header->count;
-
- for (i = 0; i < ext_sigcount; i++) {
- fam_ucode = x86_family(ext_sig->sig);
- model_ucode = x86_model(ext_sig->sig);
-
- if (fam == fam_ucode && model == model_ucode)
- return true;
-
- ext_sig++;
- }
- return false;
-}
-
static struct ucode_patch *memdup_patch(void *data, unsigned int size)
{
struct ucode_patch *p;
@@ -344,7 +297,8 @@ scan_microcode(void *data, size_t size, struct ucode_cpu_info *uci, bool save)
size -= mc_size;
- if (!microcode_matches(mc_header, uci->cpu_sig.sig)) {
+ if (!find_matching_signature(data, uci->cpu_sig.sig,
+ uci->cpu_sig.pf)) {
data += mc_size;
continue;
}
--
2.17.1
Write buffers use a kmalloc()'ed buffer, they can leak
up to seven bytes of kernel memory to flash if writes are not
aligned.
So use ubifs_pad() to fill these gaps with padding bytes.
This was never a problem while scanning because the scanner logic
manually aligns node lengths and skips over these gaps.
Cc: <stable(a)vger.kernel.org>
Fixes: 1e51764a3c2ac05a2 ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard(a)nod.at>
---
fs/ubifs/io.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/fs/ubifs/io.c b/fs/ubifs/io.c
index 7e4bfaf2871f..eae9cf5a57b0 100644
--- a/fs/ubifs/io.c
+++ b/fs/ubifs/io.c
@@ -319,7 +319,7 @@ void ubifs_pad(const struct ubifs_info *c, void *buf, int pad)
{
uint32_t crc;
- ubifs_assert(c, pad >= 0 && !(pad & 7));
+ ubifs_assert(c, pad >= 0);
if (pad >= UBIFS_PAD_NODE_SZ) {
struct ubifs_ch *ch = buf;
@@ -764,6 +764,10 @@ int ubifs_wbuf_write_nolock(struct ubifs_wbuf *wbuf, void *buf, int len)
* write-buffer.
*/
memcpy(wbuf->buf + wbuf->used, buf, len);
+ if (aligned_len > len) {
+ ubifs_assert(c, aligned_len - len < 8);
+ ubifs_pad(c, wbuf->buf + wbuf->used + len, aligned_len - len);
+ }
if (aligned_len == wbuf->avail) {
dbg_io("flush jhead %s wbuf to LEB %d:%d",
@@ -856,13 +860,18 @@ int ubifs_wbuf_write_nolock(struct ubifs_wbuf *wbuf, void *buf, int len)
}
spin_lock(&wbuf->lock);
- if (aligned_len)
+ if (aligned_len) {
/*
* And now we have what's left and what does not take whole
* max. write unit, so write it to the write-buffer and we are
* done.
*/
memcpy(wbuf->buf, buf + written, len);
+ if (aligned_len > len) {
+ ubifs_assert(c, aligned_len - len < 8);
+ ubifs_pad(c, wbuf->buf + len, aligned_len - len);
+ }
+ }
if (c->leb_size - wbuf->offs >= c->max_write_size)
wbuf->size = c->max_write_size;
--
2.26.2
From: Siarhei Liakh <siarhei.liakh(a)concurrent-rt.com>
TL;DR:
There are two places in unlz4() function where reads beyond the end of a buffer
might happen under certain conditions which had been observed in real life on
stock Ubuntu 20.04 x86_64 with several vanilla mainline kernels, including 5.10.
As a result of this issue, the kernel fails to decompress LZ4-compressed
initramfs with following message showing up in the logs:
initramfs unpacking failed: Decoding failed
Note that in most cases the affected system is still able to proceed with the
boot process to completion.
LONG STORY:
Background.
Not so long ago we've noticed that some of our Ubuntu 20.04 x86_64 test systems
often fail to boot newly generated initramfs image. After extensive
investigation we determined that a failure required the following combination
for our 5.4.66-rt38 kernel with some additional custom patches:
Real x86_64 hardware or QEMU
UEFI boot
Ubunutu 20.04 (or 20.04.1) x86_64
CONFIG_BLK_DEV_RAM=y in .config
COMPRESS=lz4 in initramfs.conf
Freshly compiled and installed kernel
Freshly generated and installed initramfs image
In our testing, such a combination would often produce a non-bootable system. It
is important to note that [un]bootability of the system was later tracked down
to particular instances of initramfs images, and would follow them if they were
to be switched around/transferred to other systems. What is even more important
is that consecutive re-generations of initramfs images from the same source and
binary materials would yield about 75% of "bad" images. Further, once the image
is identified as "bad",it always stays "bad"; once one is "good" it always stays
"good". Reverting CONFIG_BLK_DEV_RAM to "m" (default in Ubuntu), or changing
COMPRESS to "gzip" yields a 100% bootable system. Decompressing "bad" initramfs
image with "unmkinitramfs" yields *exactly* the same set of binaries, as
verified by matching MD5 sums to those from "good" image.
Speculation.
Based on general observations, it appears that Ubuntu's userland toolchain
cannot consistently generate exactly the same compressed initramfs image, likely
due to some variations in timestamps between the runs. This causes variations in
compressed lz4 data stream. Further, either initramfs tools or lz4 libraries
appear to pad compressed lz4 output to closest 4-byte boundary. lz4 v1.9.2 that
ships with Ubuntu 20.04 appears to be able to handle such padding just fine,
while lz4 (supposedly v1.8.3) within Linux kernel cannot.
Several reports of somewhat similar behavior had been recently circulation
through different bug tracking systems and discussion forums [1-4].
I also suspect only that systems which can mount permanent root directly (or
with help of modules contained in first, supposedly uncompressed, part of
initramfs, or the ones with statically linked modules) can actually complete the
boot when LZ4 decompression fails. This would certainly explain why most of
Ubuntu systems still manage to boot even after failing to decompress the image.
The facts.
Regardless of whether Ubuntu 20.04 toolchain produces a valid lz4-compressed
initramfs image or not, current version of unlz4() function in kernel has two
code paths which had been observed attempting to read beyond the buffer end when
presented with one of the "padded"/"bad" initramfs images generated by stock
Ubuntu 20.04 toolchain. Some configurations of some 5.4 kernels are known to
fail to boot in such cases. This behavior also becomes evident on vanilla
5.10.0-rc3 and 5.10.0-rc4 kernels with addition of two logging statements for
corresponding edge cases, even though it does not prevent system from booting in
most generic configurations.
Further investigation is likely warranted to confirm whether userland toolchain
contains any bugs and/or whether any of these cases constitute violation of LZ4
and/or initramfs specification.
References
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1835660
[2] https://github.com/linuxmint/mint20-beta/issues/90
[3] https://askubuntu.com/questions/1245458/getting-the-message-0-283078-initra…
[4] https://forums.linuxmint.com/viewtopic.php?t=323152
Signed-off-by: Siarhei Liakh <siarhei.liakh(a)concurrent-rt.com>
---
Please CC: me directly on all replies.
lib/decompress_unlz4.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/lib/decompress_unlz4.c b/lib/decompress_unlz4.c
index c0cfcfd486be..a016643a6dc5 100644
--- a/lib/decompress_unlz4.c
+++ b/lib/decompress_unlz4.c
@@ -125,6 +125,21 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
continue;
}
+ if (chunksize == 0) {
+ /*
+ * Nothing to decode...
+ * FIXME: this could be an error condition due
+ * to invalid or corrupt data. However, some
+ * userspace tools had been observed producing
+ * otherwise valid initramfs images which happen
+ * to hit this condition.
+ * TODO: need to figure out whether the latest
+ * LZ4 and initramfs specifications allows for
+ * zero-sized chunks.
+ * See similar message below.
+ */
+ break;
+ }
if (posp)
*posp += 4;
@@ -179,6 +194,20 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
else if (size < 0) {
error("data corrupted");
goto exit_2;
+ } else if (size < 4) {
+ /*
+ * Ignore any undesized junk/padding...
+ * FIXME: this could be an error condition due
+ * to invalid or corrupt data. However, some
+ * userspace tools had been observed producing
+ * otherwise valid initramfs images which happen
+ * to hit this condition.
+ * TODO: need to figure out whether the latest
+ * LZ4 and initramfs specifications allows for
+ * small padding at the end of the chunk.
+ * See similar message above.
+ */
+ break;
}
inp += chunksize;
}
--
2.17.1
Clang's integrated assembler produces the warning for assembly files:
warning: DWARF2 only supports one section per compilation unit
If -Wa,-gdwarf-* is unspecified, then debug info is not emitted. This
will be re-enabled for new DWARF versions in a follow up patch.
Enables defconfig+CONFIG_DEBUG_INFO to build cleanly with
LLVM=1 LLVM_IAS=1 for x86_64 and arm64.
Cc: <stable(a)vger.kernel.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/716
Reported-by: Nathan Chancellor <natechancellor(a)gmail.com>
Suggested-by: Dmitry Golovin <dima(a)golovin.in>
Suggested-by: Sedat Dilek <sedat.dilek(a)gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
---
Makefile | 2 ++
1 file changed, 2 insertions(+)
diff --git a/Makefile b/Makefile
index f353886dbf44..75b1a3dcbf30 100644
--- a/Makefile
+++ b/Makefile
@@ -826,7 +826,9 @@ else
DEBUG_CFLAGS += -g
endif
+ifndef LLVM_IAS
KBUILD_AFLAGS += -Wa,-gdwarf-2
+endif
ifdef CONFIG_DEBUG_INFO_DWARF4
DEBUG_CFLAGS += -gdwarf-4
--
2.29.1.341.ge80a0c044ae-goog
The patch titled
Subject: mm, page_frag: recover from memory pressure
has been added to the -mm tree. Its filename is
page_frag-recover-from-memory-pressure.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/page_frag-recover-from-memory-pre…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/page_frag-recover-from-memory-pre…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Dongli Zhang <dongli.zhang(a)oracle.com>
Subject: mm, page_frag: recover from memory pressure
The ethernet driver may allocate skb (and skb->data) via napi_alloc_skb().
This ends up to page_frag_alloc() to allocate skb->data from
page_frag_cache->va.
During the memory pressure, page_frag_cache->va may be allocated as
pfmemalloc page. As a result, the skb->pfmemalloc is always true as
skb->data is from page_frag_cache->va. The skb will be dropped if the
sock (receiver) does not have SOCK_MEMALLOC. This is expected behaviour
under memory pressure.
However, once kernel is not under memory pressure any longer (suppose
large amount of memory pages are just reclaimed), the page_frag_alloc()
may still re-use the prior pfmemalloc page_frag_cache->va to allocate
skb->data. As a result, the skb->pfmemalloc is always true unless
page_frag_cache->va is re-allocated, even if the kernel is not under
memory pressure any longer.
Here is how kernel runs into issue.
1. The kernel is under memory pressure and allocation of
PAGE_FRAG_CACHE_MAX_ORDER in __page_frag_cache_refill() will fail.
Instead, the pfmemalloc page is allocated for page_frag_cache->va.
2. All skb->data from page_frag_cache->va (pfmemalloc) will have
skb->pfmemalloc=true. The skb will always be dropped by sock without
SOCK_MEMALLOC. This is an expected behaviour.
3. Suppose a large amount of pages are reclaimed and kernel is not
under memory pressure any longer. We expect skb->pfmemalloc drop will
not happen.
4. Unfortunately, page_frag_alloc() does not proactively re-allocate
page_frag_alloc->va and will always re-use the prior pfmemalloc page.
The skb->pfmemalloc is always true even kernel is not under memory
pressure any longer.
Fix this by freeing and re-allocating the page instead of recycling it.
Link: https://lore.kernel.org/lkml/20201103193239.1807-1-dongli.zhang@oracle.com/
Link: https://lore.kernel.org/linux-mm/20201105042140.5253-1-willy@infradead.org/
Link: https://lkml.kernel.org/r/20201115201029.11903-1-dongli.zhang@oracle.com
Fixes: 79930f5892e ("net: do not deplete pfmemalloc reserve")
Signed-off-by: Dongli Zhang <dongli.zhang(a)oracle.com>
Suggested-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Cc: Aruna Ramakrishna <aruna.ramakrishna(a)oracle.com>
Cc: Bert Barbe <bert.barbe(a)oracle.com>
Cc: Rama Nichanamatlu <rama.nichanamatlu(a)oracle.com>
Cc: Venkat Venkatsubra <venkat.x.venkatsubra(a)oracle.com>
Cc: Manjunath Patil <manjunath.b.patil(a)oracle.com>
Cc: Joe Jin <joe.jin(a)oracle.com>
Cc: SRINIVAS <srinivas.eeda(a)oracle.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/mm/page_alloc.c~page_frag-recover-from-memory-pressure
+++ a/mm/page_alloc.c
@@ -5103,6 +5103,11 @@ refill:
if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
goto refill;
+ if (unlikely(nc->pfmemalloc)) {
+ free_the_page(page, compound_order(page));
+ goto refill;
+ }
+
#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
/* if size can vary use size else just use PAGE_SIZE */
size = nc->size;
_
Patches currently in -mm which might be from dongli.zhang(a)oracle.com are
page_frag-recover-from-memory-pressure.patch
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
EDID can declare the maximum supported bpc up to 16,
and apparently there are displays that do so. Currently
we assume 12 bpc is tha max. Fix the assumption and
toss in a MISSING_CASE() for any other value we don't
expect to see.
This fixes modesets with a display with EDID max bpc > 12.
Previously any modeset would just silently fail on platforms
that didn't otherwise limit this via the max_bpc property.
In particular we don't add the max_bpc property to HDMI
ports on gmch platforms, and thus we would see the raw
max_bpc coming from the EDID.
I suppose we could already adjust this to also allow 16bpc,
but seeing as no current platform supports that there is
little point.
Cc: stable(a)vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2632
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_display.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 2729c852c668..2a6eb1ca9c8e 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -13060,10 +13060,11 @@ compute_sink_pipe_bpp(const struct drm_connector_state *conn_state,
case 10 ... 11:
bpp = 10 * 3;
break;
- case 12:
+ case 12 ... 16:
bpp = 12 * 3;
break;
default:
+ MISSING_CASE(conn_state->max_bpc);
return -EINVAL;
}
--
2.26.2
The patch titled
Subject: ocfs2: initialize ip_next_orphan
has been removed from the -mm tree. Its filename was
ocfs2-initialize-ip_next_orphan.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Wengang Wang <wen.gang.wang(a)oracle.com>
Subject: ocfs2: initialize ip_next_orphan
Though problem if found on a lower 4.1.12 kernel, I think upstream has
same issue.
In one node in the cluster, there is the following callback trace:
# cat /proc/21473/stack
[<ffffffffc09a2f06>] __ocfs2_cluster_lock.isra.36+0x336/0x9e0 [ocfs2]
[<ffffffffc09a4481>] ocfs2_inode_lock_full_nested+0x121/0x520 [ocfs2]
[<ffffffffc09b2ce2>] ocfs2_evict_inode+0x152/0x820 [ocfs2]
[<ffffffff8122b36e>] evict+0xae/0x1a0
[<ffffffff8122bd26>] iput+0x1c6/0x230
[<ffffffffc09b60ed>] ocfs2_orphan_filldir+0x5d/0x100 [ocfs2]
[<ffffffffc0992ae0>] ocfs2_dir_foreach_blk+0x490/0x4f0 [ocfs2]
[<ffffffffc099a1e9>] ocfs2_dir_foreach+0x29/0x30 [ocfs2]
[<ffffffffc09b7716>] ocfs2_recover_orphans+0x1b6/0x9a0 [ocfs2]
[<ffffffffc09b9b4e>] ocfs2_complete_recovery+0x1de/0x5c0 [ocfs2]
[<ffffffff810a1399>] process_one_work+0x169/0x4a0
[<ffffffff810a1bcb>] worker_thread+0x5b/0x560
[<ffffffff810a7a2b>] kthread+0xcb/0xf0
[<ffffffff816f5d21>] ret_from_fork+0x61/0x90
[<ffffffffffffffff>] 0xffffffffffffffff
The above stack is not reasonable, the final iput shouldn't happen in
ocfs2_orphan_filldir() function. Looking at the code,
2067 /* Skip inodes which are already added to recover list, since dio may
2068 * happen concurrently with unlink/rename */
2069 if (OCFS2_I(iter)->ip_next_orphan) {
2070 iput(iter);
2071 return 0;
2072 }
2073
The logic thinks the inode is already in recover list on seeing
ip_next_orphan is non-NULL, so it skip this inode after dropping a
reference which incremented in ocfs2_iget().
While, if the inode is already in recover list, it should have another
reference and the iput() at line 2070 should not be the final iput
(dropping the last reference). So I don't think the inode is really in
the recover list (no vmcore to confirm).
Note that ocfs2_queue_orphans(), though not shown up in the call back
trace, is holding cluster lock on the orphan directory when looking up for
unlinked inodes. The on disk inode eviction could involve a lot of IOs
which may need long time to finish. That means this node could hold the
cluster lock for very long time, that can lead to the lock requests (from
other nodes) to the orhpan directory hang for long time.
Looking at more on ip_next_orphan, I found it's not initialized when
allocating a new ocfs2_inode_info structure.
This causes te reflink operations from some nodes hang for very long
time waiting for the cluster lock on the orphan directory.
Fix: initialize ip_next_orphan as NULL.
Link: https://lkml.kernel.org/r/20201109171746.27884-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/super.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/ocfs2/super.c~ocfs2-initialize-ip_next_orphan
+++ a/fs/ocfs2/super.c
@@ -1713,6 +1713,7 @@ static void ocfs2_inode_init_once(void *
oi->ip_blkno = 0ULL;
oi->ip_clusters = 0;
+ oi->ip_next_orphan = NULL;
ocfs2_resv_init_once(&oi->ip_la_data_resv);
_
Patches currently in -mm which might be from wen.gang.wang(a)oracle.com are
The patch titled
Subject: hugetlbfs: fix anon huge page migration race
has been removed from the -mm tree. Its filename was
hugetlbfs-fix-anon-huge-page-migration-race.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: fix anon huge page migration race
Qian Cai reported the following BUG in [1]
[ 6147.019063][T45242] LTP: starting move_pages12
[ 6147.475680][T64921] BUG: unable to handle page fault for address: ffffffffffffffe0
...
[ 6147.525866][T64921] RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170
avc_start_pgoff at mm/interval_tree.c:63
[ 6147.620914][T64921] Call Trace:
[ 6147.624078][T64921] rmap_walk_anon+0x141/0xa30
rmap_walk_anon at mm/rmap.c:1864
[ 6147.628639][T64921] try_to_unmap+0x209/0x2d0
try_to_unmap at mm/rmap.c:1763
[ 6147.633026][T64921] ? rmap_walk_locked+0x140/0x140
[ 6147.637936][T64921] ? page_remove_rmap+0x1190/0x1190
[ 6147.643020][T64921] ? page_not_mapped+0x10/0x10
[ 6147.647668][T64921] ? page_get_anon_vma+0x290/0x290
[ 6147.652664][T64921] ? page_mapcount_is_zero+0x10/0x10
[ 6147.657838][T64921] ? hugetlb_page_mapping_lock_write+0x97/0x180
[ 6147.663972][T64921] migrate_pages+0x1005/0x1fb0
[ 6147.668617][T64921] ? remove_migration_pte+0xac0/0xac0
[ 6147.673875][T64921] move_pages_and_store_status.isra.47+0xd7/0x1a0
[ 6147.680181][T64921] ? migrate_pages+0x1fb0/0x1fb0
[ 6147.685002][T64921] __x64_sys_move_pages+0xa5c/0x1100
[ 6147.690176][T64921] ? trace_hardirqs_on+0x20/0x1b5
[ 6147.695084][T64921] ? move_pages_and_store_status.isra.47+0x1a0/0x1a0
[ 6147.701653][T64921] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 6147.707088][T64921] ? switch_fpu_return+0x196/0x400
[ 6147.712083][T64921] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 6147.717954][T64921] ? do_syscall_64+0x24/0x310
[ 6147.722513][T64921] do_syscall_64+0x5f/0x310
[ 6147.726897][T64921] ? trace_hardirqs_off+0x12/0x1a0
[ 6147.731894][T64921] ? asm_exc_page_fault+0x8/0x30
[ 6147.736714][T64921] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Hugh Dickens diagnosed this as a migration bug caused by code introduced
to use i_mmap_rwsem for pmd sharing synchronization. Specifically, the
routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED
flag to try_to_unmap() while holding i_mmap_rwsem. This is wrong for
anon pages as the anon_vma_lock should be held in this case. Further
analysis suggested that i_mmap_rwsem was not required to he held at all
when calling try_to_unmap for anon pages as an anon page could never be
part of a shared pmd mapping.
Discussion also revealed that the hack in hugetlb_page_mapping_lock_write
to drop page lock and acquire i_mmap_rwsem is wrong. There is no way to
keep mapping valid while dropping page lock.
This patch does the following:
- Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when
calling try_to_unmap.
- Remove the hacky code in hugetlb_page_mapping_lock_write. The routine
will now simply do a 'trylock' while still holding the page lock. If
the trylock fails, it will return NULL. This could impact the callers:
- migration calling code will receive -EAGAIN and retry up to the hard
coded limit (10).
- memory error code will treat the page as BUSY. This will force
killing (SIGKILL) instead of SIGBUS any mapping tasks.
Do note that this change in behavior only happens when there is a race.
None of the standard kernel testing suites actually hit this race, but
it is possible.
[1] https://lore.kernel.org/lkml/20200708012044.GC992@lca.pw/
[2] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.a…
Link: https://lkml.kernel.org/r/20201105195058.78401-1-mike.kravetz@oracle.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Qian Cai <cai(a)lca.pw>
Suggested-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 90 ++----------------------------------------
mm/memory-failure.c | 36 +++++++---------
mm/migrate.c | 46 +++++++++++----------
mm/rmap.c | 5 --
4 files changed, 48 insertions(+), 129 deletions(-)
--- a/mm/hugetlb.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/hugetlb.c
@@ -1568,103 +1568,23 @@ int PageHeadHuge(struct page *page_head)
}
/*
- * Find address_space associated with hugetlbfs page.
- * Upon entry page is locked and page 'was' mapped although mapped state
- * could change. If necessary, use anon_vma to find vma and associated
- * address space. The returned mapping may be stale, but it can not be
- * invalid as page lock (which is held) is required to destroy mapping.
- */
-static struct address_space *_get_hugetlb_page_mapping(struct page *hpage)
-{
- struct anon_vma *anon_vma;
- pgoff_t pgoff_start, pgoff_end;
- struct anon_vma_chain *avc;
- struct address_space *mapping = page_mapping(hpage);
-
- /* Simple file based mapping */
- if (mapping)
- return mapping;
-
- /*
- * Even anonymous hugetlbfs mappings are associated with an
- * underlying hugetlbfs file (see hugetlb_file_setup in mmap
- * code). Find a vma associated with the anonymous vma, and
- * use the file pointer to get address_space.
- */
- anon_vma = page_lock_anon_vma_read(hpage);
- if (!anon_vma)
- return mapping; /* NULL */
-
- /* Use first found vma */
- pgoff_start = page_to_pgoff(hpage);
- pgoff_end = pgoff_start + pages_per_huge_page(page_hstate(hpage)) - 1;
- anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root,
- pgoff_start, pgoff_end) {
- struct vm_area_struct *vma = avc->vma;
-
- mapping = vma->vm_file->f_mapping;
- break;
- }
-
- anon_vma_unlock_read(anon_vma);
- return mapping;
-}
-
-/*
* Find and lock address space (mapping) in write mode.
*
- * Upon entry, the page is locked which allows us to find the mapping
- * even in the case of an anon page. However, locking order dictates
- * the i_mmap_rwsem be acquired BEFORE the page lock. This is hugetlbfs
- * specific. So, we first try to lock the sema while still holding the
- * page lock. If this works, great! If not, then we need to drop the
- * page lock and then acquire i_mmap_rwsem and reacquire page lock. Of
- * course, need to revalidate state along the way.
+ * Upon entry, the page is locked which means that page_mapping() is
+ * stable. Due to locking order, we can only trylock_write. If we can
+ * not get the lock, simply return NULL to caller.
*/
struct address_space *hugetlb_page_mapping_lock_write(struct page *hpage)
{
- struct address_space *mapping, *mapping2;
+ struct address_space *mapping = page_mapping(hpage);
- mapping = _get_hugetlb_page_mapping(hpage);
-retry:
if (!mapping)
return mapping;
- /*
- * If no contention, take lock and return
- */
if (i_mmap_trylock_write(mapping))
return mapping;
- /*
- * Must drop page lock and wait on mapping sema.
- * Note: Once page lock is dropped, mapping could become invalid.
- * As a hack, increase map count until we lock page again.
- */
- atomic_inc(&hpage->_mapcount);
- unlock_page(hpage);
- i_mmap_lock_write(mapping);
- lock_page(hpage);
- atomic_add_negative(-1, &hpage->_mapcount);
-
- /* verify page is still mapped */
- if (!page_mapped(hpage)) {
- i_mmap_unlock_write(mapping);
- return NULL;
- }
-
- /*
- * Get address space again and verify it is the same one
- * we locked. If not, drop lock and retry.
- */
- mapping2 = _get_hugetlb_page_mapping(hpage);
- if (mapping2 != mapping) {
- i_mmap_unlock_write(mapping);
- mapping = mapping2;
- goto retry;
- }
-
- return mapping;
+ return NULL;
}
pgoff_t __basepage_index(struct page *page)
--- a/mm/memory-failure.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/memory-failure.c
@@ -1057,27 +1057,25 @@ static bool hwpoison_user_mappings(struc
if (!PageHuge(hpage)) {
unmap_success = try_to_unmap(hpage, ttu);
} else {
- /*
- * For hugetlb pages, try_to_unmap could potentially call
- * huge_pmd_unshare. Because of this, take semaphore in
- * write mode here and set TTU_RMAP_LOCKED to indicate we
- * have taken the lock at this higer level.
- *
- * Note that the call to hugetlb_page_mapping_lock_write
- * is necessary even if mapping is already set. It handles
- * ugliness of potentially having to drop page lock to obtain
- * i_mmap_rwsem.
- */
- mapping = hugetlb_page_mapping_lock_write(hpage);
-
- if (mapping) {
- unmap_success = try_to_unmap(hpage,
+ if (!PageAnon(hpage)) {
+ /*
+ * For hugetlb pages in shared mappings, try_to_unmap
+ * could potentially call huge_pmd_unshare. Because of
+ * this, take semaphore in write mode here and set
+ * TTU_RMAP_LOCKED to indicate we have taken the lock
+ * at this higer level.
+ */
+ mapping = hugetlb_page_mapping_lock_write(hpage);
+ if (mapping) {
+ unmap_success = try_to_unmap(hpage,
ttu|TTU_RMAP_LOCKED);
- i_mmap_unlock_write(mapping);
+ i_mmap_unlock_write(mapping);
+ } else {
+ pr_info("Memory failure: %#lx: could not lock mapping for mapped huge page\n", pfn);
+ unmap_success = false;
+ }
} else {
- pr_info("Memory failure: %#lx: could not find mapping for mapped huge page\n",
- pfn);
- unmap_success = false;
+ unmap_success = try_to_unmap(hpage, ttu);
}
}
if (!unmap_success)
--- a/mm/migrate.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/migrate.c
@@ -1328,34 +1328,38 @@ static int unmap_and_move_huge_page(new_
goto put_anon;
if (page_mapped(hpage)) {
- /*
- * try_to_unmap could potentially call huge_pmd_unshare.
- * Because of this, take semaphore in write mode here and
- * set TTU_RMAP_LOCKED to let lower levels know we have
- * taken the lock.
- */
- mapping = hugetlb_page_mapping_lock_write(hpage);
- if (unlikely(!mapping))
- goto unlock_put_anon;
-
- try_to_unmap(hpage,
- TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS|
- TTU_RMAP_LOCKED);
+ bool mapping_locked = false;
+ enum ttu_flags ttu = TTU_MIGRATION|TTU_IGNORE_MLOCK|
+ TTU_IGNORE_ACCESS;
+
+ if (!PageAnon(hpage)) {
+ /*
+ * In shared mappings, try_to_unmap could potentially
+ * call huge_pmd_unshare. Because of this, take
+ * semaphore in write mode here and set TTU_RMAP_LOCKED
+ * to let lower levels know we have taken the lock.
+ */
+ mapping = hugetlb_page_mapping_lock_write(hpage);
+ if (unlikely(!mapping))
+ goto unlock_put_anon;
+
+ mapping_locked = true;
+ ttu |= TTU_RMAP_LOCKED;
+ }
+
+ try_to_unmap(hpage, ttu);
page_was_mapped = 1;
- /*
- * Leave mapping locked until after subsequent call to
- * remove_migration_ptes()
- */
+
+ if (mapping_locked)
+ i_mmap_unlock_write(mapping);
}
if (!page_mapped(hpage))
rc = move_to_new_page(new_hpage, hpage, mode);
- if (page_was_mapped) {
+ if (page_was_mapped)
remove_migration_ptes(hpage,
- rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, true);
- i_mmap_unlock_write(mapping);
- }
+ rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, false);
unlock_put_anon:
unlock_page(new_hpage);
--- a/mm/rmap.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/rmap.c
@@ -1413,9 +1413,6 @@ static bool try_to_unmap_one(struct page
/*
* If sharing is possible, start and end will be adjusted
* accordingly.
- *
- * If called for a huge page, caller must hold i_mmap_rwsem
- * in write mode as it is possible to call huge_pmd_unshare.
*/
adjust_range_if_pmd_sharing_possible(vma, &range.start,
&range.end);
@@ -1462,7 +1459,7 @@ static bool try_to_unmap_one(struct page
subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
address = pvmw.address;
- if (PageHuge(page)) {
+ if (PageHuge(page) && !PageAnon(page)) {
/*
* To call huge_pmd_unshare, i_mmap_rwsem must be
* held in write mode. Caller needs to explicitly
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
The patch titled
Subject: Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
has been removed from the -mm tree. Its filename was
revert-kernel-rebootc-convert-simple_strtoul-to-kstrtoint.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Matteo Croce <mcroce(a)microsoft.com>
Subject: Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Patch series "fix parsing of reboot= cmdline", v3.
The parsing of the reboot= cmdline has two major errors:
- a missing bound check can crash the system on reboot
- parsing of the cpu number only works if specified last
Fix both.
This patch (of 2):
This reverts commit 616feab753972b97.
kstrtoint() and simple_strtoul() have a subtle difference which makes them
non interchangeable: if a non digit character is found amid the parsing,
the former will return an error, while the latter will just stop parsing,
e.g. simple_strtoul("123xyx") = 123.
The kernel cmdline reboot= argument allows to specify the CPU used for
rebooting, with the syntax `s####` among the other flags, e.g.
"reboot=warm,s31,force", so if this flag is not the last given, it's
silently ignored as well as the subsequent ones.
Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com
Fixes: 616feab75397 ("kernel/reboot.c: convert simple_strtoul to kstrtoint")
Signed-off-by: Matteo Croce <mcroce(a)microsoft.com>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Cc: Petr Mladek <pmladek(a)suse.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Robin Holt <robinmholt(a)gmail.com>
Cc: Fabian Frederick <fabf(a)skynet.be>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/reboot.c | 21 +++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)
--- a/kernel/reboot.c~revert-kernel-rebootc-convert-simple_strtoul-to-kstrtoint
+++ a/kernel/reboot.c
@@ -551,22 +551,15 @@ static int __init reboot_setup(char *str
break;
case 's':
- {
- int rc;
-
- if (isdigit(*(str+1))) {
- rc = kstrtoint(str+1, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else if (str[1] == 'm' && str[2] == 'p' &&
- isdigit(*(str+3))) {
- rc = kstrtoint(str+3, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else
+ if (isdigit(*(str+1)))
+ reboot_cpu = simple_strtoul(str+1, NULL, 0);
+ else if (str[1] == 'm' && str[2] == 'p' &&
+ isdigit(*(str+3)))
+ reboot_cpu = simple_strtoul(str+3, NULL, 0);
+ else
*mode = REBOOT_SOFT;
break;
- }
+
case 'g':
*mode = REBOOT_GPIO;
break;
_
Patches currently in -mm which might be from mcroce(a)microsoft.com are
reboot-refactor-and-comment-the-cpu-selection-code.patch
reboot-allow-to-specify-reboot-mode-via-sysfs.patch
reboot-remove-cf9_safe-from-allowed-types-and-rename-cf9_force.patch
The patch titled
Subject: compiler.h: fix barrier_data() on clang
has been removed from the -mm tree. Its filename was
compilerh-fix-barrier_data-on-clang.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Arvind Sankar <nivedita(a)alum.mit.edu>
Subject: compiler.h: fix barrier_data() on clang
Commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") neglected to copy barrier_data() from compiler-gcc.h
into compiler-clang.h. The definition in compiler-gcc.h was really to
work around clang's more aggressive optimization, so this broke
barrier_data() on clang, and consequently memzero_explicit() as well.
For example, this results in at least the memzero_explicit() call in
lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
Fix this by moving the definition of barrier_data() into compiler.h.
Also move the gcc/clang definition of barrier() into compiler.h,
__memory_barrier() is icc-specific (and barrier() is already defined using
it in compiler-intel.h) and doesn't belong in compiler.h.
[rdunlap(a)infradead.org: fix ALPHA builds when SMP is not enabled]
Link: https://lkml.kernel.org/r/20201101231835.4589-1-rdunlap@infradead.org
Link: https://lkml.kernel.org/r/20201014212631.207844-1-nivedita@alum.mit.edu
Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
Signed-off-by: Arvind Sankar <nivedita(a)alum.mit.edu>
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Reviewed-by: Nick Desaulniers <ndesaulniers(a)google.com>
Tested-by: Nick Desaulniers <ndesaulniers(a)google.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/asm-generic/barrier.h | 1 +
include/linux/compiler-clang.h | 6 ------
include/linux/compiler-gcc.h | 19 -------------------
include/linux/compiler.h | 18 ++++++++++++++++--
4 files changed, 17 insertions(+), 27 deletions(-)
--- a/include/asm-generic/barrier.h~compilerh-fix-barrier_data-on-clang
+++ a/include/asm-generic/barrier.h
@@ -13,6 +13,7 @@
#ifndef __ASSEMBLY__
+#include <linux/compiler.h>
#include <asm/rwonce.h>
#ifndef nop
--- a/include/linux/compiler-clang.h~compilerh-fix-barrier_data-on-clang
+++ a/include/linux/compiler-clang.h
@@ -60,12 +60,6 @@
#define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
#endif
-/* The following are for compatibility with GCC, from compiler-gcc.h,
- * and may be redefined here because they should not be shared with other
- * compilers, like ICC.
- */
-#define barrier() __asm__ __volatile__("" : : : "memory")
-
#if __has_feature(shadow_call_stack)
# define __noscs __attribute__((__no_sanitize__("shadow-call-stack")))
#endif
--- a/include/linux/compiler-gcc.h~compilerh-fix-barrier_data-on-clang
+++ a/include/linux/compiler-gcc.h
@@ -15,25 +15,6 @@
# error Sorry, your version of GCC is too old - please use 4.9 or newer.
#endif
-/* Optimization barrier */
-
-/* The "volatile" is due to gcc bugs */
-#define barrier() __asm__ __volatile__("": : :"memory")
-/*
- * This version is i.e. to prevent dead stores elimination on @ptr
- * where gcc and llvm may behave differently when otherwise using
- * normal barrier(): while gcc behavior gets along with a normal
- * barrier(), llvm needs an explicit input variable to be assumed
- * clobbered. The issue is as follows: while the inline asm might
- * access any memory it wants, the compiler could have fit all of
- * @ptr into memory registers instead, and since @ptr never escaped
- * from that, it proved that the inline asm wasn't touching any of
- * it. This version works well with both compilers, i.e. we're telling
- * the compiler that the inline asm absolutely may see the contents
- * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
- */
-#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
-
/*
* This macro obfuscates arithmetic on a variable address so that gcc
* shouldn't recognize the original var, and make assumptions about it.
--- a/include/linux/compiler.h~compilerh-fix-barrier_data-on-clang
+++ a/include/linux/compiler.h
@@ -80,11 +80,25 @@ void ftrace_likely_update(struct ftrace_
/* Optimization barrier */
#ifndef barrier
-# define barrier() __memory_barrier()
+/* The "volatile" is due to gcc bugs */
+# define barrier() __asm__ __volatile__("": : :"memory")
#endif
#ifndef barrier_data
-# define barrier_data(ptr) barrier()
+/*
+ * This version is i.e. to prevent dead stores elimination on @ptr
+ * where gcc and llvm may behave differently when otherwise using
+ * normal barrier(): while gcc behavior gets along with a normal
+ * barrier(), llvm needs an explicit input variable to be assumed
+ * clobbered. The issue is as follows: while the inline asm might
+ * access any memory it wants, the compiler could have fit all of
+ * @ptr into memory registers instead, and since @ptr never escaped
+ * from that, it proved that the inline asm wasn't touching any of
+ * it. This version works well with both compilers, i.e. we're telling
+ * the compiler that the inline asm absolutely may see the contents
+ * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
+ */
+# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
#endif
/* workaround for GCC PR82365 if needed */
_
Patches currently in -mm which might be from nivedita(a)alum.mit.edu are
The patch titled
Subject: mm/gup: use unpin_user_pages() in __gup_longterm_locked()
has been removed from the -mm tree. Its filename was
mm-gup-use-unpin_user_pages-in-__gup_longterm_locked.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Jason Gunthorpe <jgg(a)nvidia.com>
Subject: mm/gup: use unpin_user_pages() in __gup_longterm_locked()
When FOLL_PIN is passed to __get_user_pages() the page list must be put
back using unpin_user_pages() otherwise the page pin reference persists in
a corrupted state.
There are two places in the unwind of __gup_longterm_locked() that put the
pages back without checking. Normally on error this function would return
the partial page list making this the caller's responsibility, but in
these two cases the caller is not allowed to see these pages at all.
Link: https://lkml.kernel.org/r/0-v2-3ae7d9d162e2+2a7-gup_cma_fix_jgg@nvidia.com
Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reported-by: Ira Weiny <ira.weiny(a)intel.com>
Reviewed-by: Ira Weiny <ira.weiny(a)intel.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
--- a/mm/gup.c~mm-gup-use-unpin_user_pages-in-__gup_longterm_locked
+++ a/mm/gup.c
@@ -1647,8 +1647,11 @@ check_again:
/*
* drop the above get_user_pages reference.
*/
- for (i = 0; i < nr_pages; i++)
- put_page(pages[i]);
+ if (gup_flags & FOLL_PIN)
+ unpin_user_pages(pages, nr_pages);
+ else
+ for (i = 0; i < nr_pages; i++)
+ put_page(pages[i]);
if (migrate_pages(&cma_page_list, alloc_migration_target, NULL,
(unsigned long)&mtc, MIGRATE_SYNC, MR_CONTIG_RANGE)) {
@@ -1728,8 +1731,11 @@ static long __gup_longterm_locked(struct
goto out;
if (check_dax_vmas(vmas_tmp, rc)) {
- for (i = 0; i < rc; i++)
- put_page(pages[i]);
+ if (gup_flags & FOLL_PIN)
+ unpin_user_pages(pages, rc);
+ else
+ for (i = 0; i < rc; i++)
+ put_page(pages[i]);
rc = -EOPNOTSUPP;
goto out;
}
_
Patches currently in -mm which might be from jgg(a)nvidia.com are
mm-reorganize-internal_get_user_pages_fast.patch
mm-prevent-gup_fast-from-racing-with-cow-during-fork.patch
The patch titled
Subject: mm/slub: fix panic in slab_alloc_node()
has been removed from the -mm tree. Its filename was
mm-slub-fix-panic-in-slab_alloc_node.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Subject: mm/slub: fix panic in slab_alloc_node()
While doing memory hot-unplug operation on a PowerPC VM running 1024 CPUs
with 11TB of ram, I hit the following panic:
BUG: Kernel NULL pointer dereference on read at 0x00000007
Faulting instruction address: 0xc000000000456048
Oops: Kernel access of bad area, sig: 11 [#2]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS= 2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp
CPU: 160 PID: 1 Comm: systemd Tainted: G D 5.9.0 #1
NIP: c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350
REGS: c00006028d1b77a0 TRAP: 0300 Tainted: G D (5.9.0)
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24004228 XER: 00000000
CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0
GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000
GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320
GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000
GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a
GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1
GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8
GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000
GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001
NIP [c000000000456048] __kmalloc_node+0x108/0x790
LR [c000000000455fd4] __kmalloc_node+0x94/0x790
Call Trace:
[c00006028d1b7a30] [c00007c51af92000] 0xc00007c51af92000 (unreliable)
[c00006028d1b7aa0] [c0000000003c0ff8] kvmalloc_node+0x58/0x110
[c00006028d1b7ae0] [c00000000047b45c] mem_cgroup_css_online+0x10c/0x270
[c00006028d1b7b30] [c000000000241fd8] online_css+0x48/0xd0
[c00006028d1b7b60] [c00000000024af14] cgroup_apply_control_enable+0x2c4/0x470
[c00006028d1b7c40] [c00000000024e838] cgroup_mkdir+0x408/0x5f0
[c00006028d1b7cb0] [c0000000005a4ef0] kernfs_iop_mkdir+0x90/0x100
[c00006028d1b7cf0] [c0000000004b8168] vfs_mkdir+0x138/0x250
[c00006028d1b7d40] [c0000000004baf04] do_mkdirat+0x154/0x1c0
[c00006028d1b7dc0] [c000000000032b38] system_call_exception+0xf8/0x200
[c00006028d1b7e20] [c00000000000c740] system_call_common+0xf0/0x27c
Instruction dump:
e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a
2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78
This pointing to the following code:
mm/slub.c:2851
if (unlikely(!object || !node_match(page, node))) {
c000000000456038: 00 00 bc 2f cmpdi cr7,r28,0
c00000000045603c: 18 00 9e 41 beq cr7,c000000000456054 <__kmalloc_node+0x114>
node_match():
mm/slub.c:2491
if (node != NUMA_NO_NODE && page_to_nid(page) != node)
c000000000456040: 30 02 92 41 beq cr4,c000000000456270 <__kmalloc_node+0x330>
page_to_nid():
include/linux/mm.h:1294
c000000000456044: 10 00 27 e9 ld r9,16(r7)
c000000000456048: 07 00 29 89 lbz r9,7(r9) <<<< r9 = NULL
node_match():
mm/slub.c:2491
c00000000045604c: 00 48 99 7f cmpw cr7,r25,r9
c000000000456050: 20 02 9e 41 beq cr7,c000000000456270 <__kmalloc_node+0x330>
The panic occurred in slab_alloc_node() when checking for the page's node:
object = c->freelist;
page = c->page;
if (unlikely(!object || !node_match(page, node))) {
object = __slab_alloc(s, gfpflags, node, addr, c);
stat(s, ALLOC_SLOWPATH);
The issue is that object is not NULL while page is NULL which is odd but
may happen if the cache flush happened after loading object but before
loading page. Thus checking for the page pointer is required too.
The cache flush is done through an inter processor interrupt when a piece
of memory is off-lined. That interrupt is triggered when a memory
hot-unplug operation is initiated and offline_pages() is calling the
slub's MEM_GOING_OFFLINE callback slab_mem_going_offline_callback() which
is calling flush_cpu_slab(). If that interrupt is caught between the
reading of c->freelist and the reading of c->page, this could lead to such
a situation. That situation is expected and the later call to
this_cpu_cmpxchg_double() will detect the change to c->freelist and redo
the whole operation.
In commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in
node_match()") check on the page pointer has been removed assuming that
page is always valid when it is called. It happens that this is not true
in that particular case, so check for page before calling node_match()
here.
Link: https://lkml.kernel.org/r/20201027190406.33283-1-ldufour@linux.ibm.com
Fixes: 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()")
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Christoph Lameter <cl(a)linux.com>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/slub.c~mm-slub-fix-panic-in-slab_alloc_node
+++ a/mm/slub.c
@@ -2852,7 +2852,7 @@ redo:
object = c->freelist;
page = c->page;
- if (unlikely(!object || !node_match(page, node))) {
+ if (unlikely(!object || !page || !node_match(page, node))) {
object = __slab_alloc(s, gfpflags, node, addr, c);
} else {
void *next_object = get_freepointer_safe(s, object);
_
Patches currently in -mm which might be from ldufour(a)linux.ibm.com are
The patch titled
Subject: mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
has been removed from the -mm tree. Its filename was
mm-vmscan-fix-nr_isolated_file-corruption-on-64-bit.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Nicholas Piggin <npiggin(a)gmail.com>
Subject: mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
Previously the negated unsigned long would be cast back to signed long
which would have the correct negative value. After commit 730ec8c01a2b
("mm/vmscan.c: change prototype for shrink_page_list"), the large unsigned
int converts to a large positive signed long.
Symptoms include CMA allocations hanging forever holding the cma_mutex due
to alloc_contig_range->...->isolate_migratepages_block waiting forever in
"while (unlikely(too_many_isolated(pgdat)))".
[akpm(a)linux-foundation.org: fix -stat.nr_lazyfree_fail as well, per Michal]
Link: https://lkml.kernel.org/r/20201029032320.1448441-1-npiggin@gmail.com
Fixes: 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list")
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Vaneet Narang <v.narang(a)samsung.com>
Cc: Maninder Singh <maninder1.s(a)samsung.com>
Cc: Amit Sahrawat <a.sahrawat(a)samsung.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-fix-nr_isolated_file-corruption-on-64-bit
+++ a/mm/vmscan.c
@@ -1516,7 +1516,8 @@ unsigned int reclaim_clean_pages_from_li
nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
TTU_IGNORE_ACCESS, &stat, true);
list_splice(&clean_pages, page_list);
- mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -nr_reclaimed);
+ mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
+ -(long)nr_reclaimed);
/*
* Since lazyfree pages are isolated from file LRU from the beginning,
* they will rotate back to anonymous LRU in the end if it failed to
@@ -1526,7 +1527,7 @@ unsigned int reclaim_clean_pages_from_li
mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON,
stat.nr_lazyfree_fail);
mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
- -stat.nr_lazyfree_fail);
+ -(long)stat.nr_lazyfree_fail);
return nr_reclaimed;
}
_
Patches currently in -mm which might be from npiggin(a)gmail.com are
The patch titled
Subject: mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate
has been removed from the -mm tree. Its filename was
mm-compaction-stop-isolation-if-too-many-pages-are-isolated-and-we-have-pages-to-migrate.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Zi Yan <ziy(a)nvidia.com>
Subject: mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate
In isolate_migratepages_block, if we have too many isolated pages and
nr_migratepages is not zero, we should try to migrate what we have without
wasting time on isolating.
In theory it's possible that multiple parallel compactions will cause
too_many_isolated() to become true even if each has isolated less than
COMPACT_CLUSTER_MAX, and loop forever in the while loop. Bailing
immediately prevents that.
[vbabka(a)suse.cz: changelog addition]
Link: https://lkml.kernel.org/r/20201030183809.3616803-2-zi.yan@sent.com
Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
Suggested-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/compaction.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/mm/compaction.c~mm-compaction-stop-isolation-if-too-many-pages-are-isolated-and-we-have-pages-to-migrate
+++ a/mm/compaction.c
@@ -817,6 +817,10 @@ isolate_migratepages_block(struct compac
* delay for some time until fewer pages are isolated
*/
while (unlikely(too_many_isolated(pgdat))) {
+ /* stop isolation if there are still pages not migrated */
+ if (cc->nr_migratepages)
+ return 0;
+
/* async migration should just abort */
if (cc->mode == MIGRATE_ASYNC)
return 0;
_
Patches currently in -mm which might be from ziy(a)nvidia.com are
The patch titled
Subject: mm/compaction: count pages and stop correctly during page isolation
has been removed from the -mm tree. Its filename was
mm-compaction-count-pages-and-stop-correctly-during-page-isolation.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Zi Yan <ziy(a)nvidia.com>
Subject: mm/compaction: count pages and stop correctly during page isolation
In isolate_migratepages_block, when cc->alloc_contig is true, we are able
to isolate compound pages, nr_migratepages and nr_isolated did not count
compound pages correctly, causing us to isolate more pages than we
thought. Count compound pages as the number of base pages they contain.
Otherwise, we might be trapped in too_many_isolated while loop, since the
actual isolated pages can go up to COMPACT_CLUSTER_MAX*512=16384, where
COMPACT_CLUSTER_MAX is 32, since we stop isolation after
cc->nr_migratepages reaches to COMPACT_CLUSTER_MAX.
In addition, after we fix the issue above, cc->nr_migratepages could never
be equal to COMPACT_CLUSTER_MAX if compound pages are isolated, thus page
isolation could not stop as we intended. Change the isolation stop
condition to >=.
The issue can be triggered as follows: In a system with 16GB memory and an
8GB CMA region reserved by hugetlb_cma, if we first allocate 10GB THPs and
mlock them (so some THPs are allocated in the CMA region and mlocked),
reserving 6 1GB hugetlb pages via
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages will get stuck
(looping in too_many_isolated function) until we kill either task. With
the patch applied, oom will kill the application with 10GB THPs and let
hugetlb page reservation finish.
[ziy(a)nvidia.com: v3]
Link: https://lkml.kernel.org/r/20201030183809.3616803-1-zi.yan@sent.com
Link: https://lkml.kernel.org/r/20201029200435.3386066-1-zi.yan@sent.com
Fixes: 1da2f328fa64 ("cmm,thp,compaction,cma: allow THP migration for CMA allocations")
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/compaction.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/mm/compaction.c~mm-compaction-count-pages-and-stop-correctly-during-page-isolation
+++ a/mm/compaction.c
@@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compac
isolate_success:
list_add(&page->lru, &cc->migratepages);
- cc->nr_migratepages++;
- nr_isolated++;
+ cc->nr_migratepages += compound_nr(page);
+ nr_isolated += compound_nr(page);
/*
* Avoid isolating too much unless this block is being
@@ -1021,7 +1021,7 @@ isolate_success:
* or a lock is contended. For contention, isolate quickly to
* potentially remove one source of contention.
*/
- if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
+ if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
!cc->rescan && !cc->contended) {
++low_pfn;
break;
@@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compac
if (!pfn)
break;
- if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
+ if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
break;
}
_
Patches currently in -mm which might be from ziy(a)nvidia.com are
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d3938ee23e97bfcac2e0eb6b356875da73d700df Mon Sep 17 00:00:00 2001
From: Gao Xiang <hsiangkao(a)redhat.com>
Date: Sun, 1 Nov 2020 03:51:02 +0800
Subject: [PATCH] erofs: derive atime instead of leaving it empty
EROFS has _only one_ ondisk timestamp (ctime is currently
documented and recorded, we might also record mtime instead
with a new compat feature if needed) for each extended inode
since EROFS isn't mainly for archival purposes so no need to
keep all timestamps on disk especially for Android scenarios
due to security concerns. Also, romfs/cramfs don't have their
own on-disk timestamp, and squashfs only records mtime instead.
Let's also derive access time from ondisk timestamp rather than
leaving it empty, and if mtime/atime for each file are really
needed for specific scenarios as well, we can also use xattrs
to record them then.
Link: https://lore.kernel.org/r/20201031195102.21221-1-hsiangkao@aol.com
[ Gao Xiang: It'd be better to backport for user-friendly concern. ]
Fixes: 431339ba9042 ("staging: erofs: add inode operations")
Cc: stable <stable(a)vger.kernel.org> # 4.19+
Reported-by: nl6720 <nl6720(a)gmail.com>
Reviewed-by: Chao Yu <yuchao0(a)huawei.com>
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index 139d0bed42f8..3e21c0e8adae 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -107,11 +107,9 @@ static struct page *erofs_read_inode(struct inode *inode,
i_gid_write(inode, le32_to_cpu(die->i_gid));
set_nlink(inode, le32_to_cpu(die->i_nlink));
- /* ns timestamp */
- inode->i_mtime.tv_sec = inode->i_ctime.tv_sec =
- le64_to_cpu(die->i_ctime);
- inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec =
- le32_to_cpu(die->i_ctime_nsec);
+ /* extended inode has its own timestamp */
+ inode->i_ctime.tv_sec = le64_to_cpu(die->i_ctime);
+ inode->i_ctime.tv_nsec = le32_to_cpu(die->i_ctime_nsec);
inode->i_size = le64_to_cpu(die->i_size);
@@ -149,11 +147,9 @@ static struct page *erofs_read_inode(struct inode *inode,
i_gid_write(inode, le16_to_cpu(dic->i_gid));
set_nlink(inode, le16_to_cpu(dic->i_nlink));
- /* use build time to derive all file time */
- inode->i_mtime.tv_sec = inode->i_ctime.tv_sec =
- sbi->build_time;
- inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec =
- sbi->build_time_nsec;
+ /* use build time for compact inodes */
+ inode->i_ctime.tv_sec = sbi->build_time;
+ inode->i_ctime.tv_nsec = sbi->build_time_nsec;
inode->i_size = le32_to_cpu(dic->i_size);
if (erofs_inode_is_data_compressed(vi->datalayout))
@@ -167,6 +163,11 @@ static struct page *erofs_read_inode(struct inode *inode,
goto err_out;
}
+ inode->i_mtime.tv_sec = inode->i_ctime.tv_sec;
+ inode->i_atime.tv_sec = inode->i_ctime.tv_sec;
+ inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec;
+ inode->i_atime.tv_nsec = inode->i_ctime.tv_nsec;
+
if (!nblks)
/* measure inode.i_blocks as generic filesystems */
inode->i_blocks = roundup(inode->i_size, EROFS_BLKSIZ) >> 9;
From: Aili Yao <yaoaili(a)kingsoft.com>
>From commit 6915564dc5a8 ("ACPI: OSL: Change the type of
acpi_os_map_generic_address() return value"),
acpi_os_map_generic_address() will return logical address or NULL for
error, but for ACPI_ADR_SPACE_SYSTEM_IO case, it should be also return 0
as it's a normal case, but now it will return -ENXIO. So check it out for
such case to avoid einj module initialization fail.
Fixes: 6915564dc5a8 ("ACPI: OSL: Change the type of
acpi_os_map_generic_address() return value")
Cc: <stable(a)vger.kernel.org>
Reviewed-by: James Morse <james.morse(a)arm.com>
Tested-by: Tony Luck <tony.luck(a)intel.com>
Signed-off-by: Aili Yao <yaoaili(a)kingsoft.com>
---
drivers/acpi/apei/apei-base.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/acpi/apei/apei-base.c b/drivers/acpi/apei/apei-base.c
index 552fd9f..3294cc8 100644
--- a/drivers/acpi/apei/apei-base.c
+++ b/drivers/acpi/apei/apei-base.c
@@ -633,6 +633,10 @@ int apei_map_generic_address(struct acpi_generic_address *reg)
if (rc)
return rc;
+ /* IO space doesn't need mapping */
+ if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO)
+ return 0;
+
if (!acpi_os_map_generic_address(reg))
return -ENXIO;
--
2.9.5
This is a note to let you know that I've just added the patch titled
tty: serial: imx: fix potential deadlock
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 33f16855dcb973f745c51882d0e286601ff3be2b Mon Sep 17 00:00:00 2001
From: Sam Nobs <samuel.nobs(a)taitradio.com>
Date: Tue, 10 Nov 2020 09:50:06 +1300
Subject: tty: serial: imx: fix potential deadlock
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Enabling the lock dependency validator has revealed
that the way spinlocks are used in the IMX serial
port could result in a deadlock.
Specifically, imx_uart_int() acquires a spinlock
without disabling the interrupts, meaning that another
interrupt could come along and try to acquire the same
spinlock, potentially causing the two to wait for each
other indefinitely.
Use spin_lock_irqsave() instead to disable interrupts
upon acquisition of the spinlock.
Fixes: c974991d2620 ("tty:serial:imx: use spin_lock instead of spin_lock_irqsave in isr")
Reviewed-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Signed-off-by: Sam Nobs <samuel.nobs(a)taitradio.com>
Link: https://lore.kernel.org/r/1604955006-9363-1-git-send-email-samuel.nobs@tait…
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/imx.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 1731d9728865..3c53a3c89959 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -942,8 +942,14 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
struct imx_port *sport = dev_id;
unsigned int usr1, usr2, ucr1, ucr2, ucr3, ucr4;
irqreturn_t ret = IRQ_NONE;
+ unsigned long flags = 0;
- spin_lock(&sport->port.lock);
+ /*
+ * IRQs might not be disabled upon entering this interrupt handler,
+ * e.g. when interrupt handlers are forced to be threaded. To support
+ * this scenario as well, disable IRQs when acquiring the spinlock.
+ */
+ spin_lock_irqsave(&sport->port.lock, flags);
usr1 = imx_uart_readl(sport, USR1);
usr2 = imx_uart_readl(sport, USR2);
@@ -1013,7 +1019,7 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
ret = IRQ_HANDLED;
}
- spin_unlock(&sport->port.lock);
+ spin_unlock_irqrestore(&sport->port.lock, flags);
return ret;
}
--
2.29.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5ce6861d36ed5207aff9e5eead4c7cc38a986586 Mon Sep 17 00:00:00 2001
From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota(a)intel.com>
Date: Thu, 5 Nov 2020 17:18:42 -0800
Subject: [PATCH] drm/i915: Correctly set SFC capability for video engines
SFC capability of video engines is not set correctly because i915
is testing for incorrect bits.
Fixes: c5d3e39caa45 ("drm/i915: Engine discovery query")
Cc: Matt Roper <matthew.d.roper(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota(a)intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio(a)intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.3+
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106011842.36203-1-daniel…
(cherry picked from commit ad18fa0f5f052046cad96fee762b5c64f42dd86a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 5bfb5f7ed02c..efdeb7b7b2a0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -371,7 +371,8 @@ static void __setup_engine_capabilities(struct intel_engine_cs *engine)
* instances.
*/
if ((INTEL_GEN(i915) >= 11 &&
- engine->gt->info.vdbox_sfc_access & engine->mask) ||
+ (engine->gt->info.vdbox_sfc_access &
+ BIT(engine->instance))) ||
(INTEL_GEN(i915) >= 9 && engine->instance == 0))
engine->uabi_capabilities |=
I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC;
With the current implementation the following race can happen:
* blk_pre_runtime_suspend() calls blk_freeze_queue_start() and
blk_mq_unfreeze_queue().
* blk_queue_enter() calls blk_queue_pm_only() and that function returns
true.
* blk_queue_enter() calls blk_pm_request_resume() and that function does
not call pm_request_resume() because the queue runtime status is
RPM_ACTIVE.
* blk_pre_runtime_suspend() changes the queue status into RPM_SUSPENDING.
Fix this race by changing the queue runtime status into RPM_SUSPENDING
before switching q_usage_counter to atomic mode.
Acked-by: Alan Stern <stern(a)rowland.harvard.edu>
Acked-by: Stanley Chu <stanley.chu(a)mediatek.com>
Cc: Ming Lei <ming.lei(a)redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable <stable(a)vger.kernel.org>
Fixes: 986d413b7c15 ("blk-mq: Enable support for runtime power management")
Signed-off-by: Can Guo <cang(a)codeaurora.org>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
---
block/blk-pm.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/block/blk-pm.c b/block/blk-pm.c
index b85234d758f7..17bd020268d4 100644
--- a/block/blk-pm.c
+++ b/block/blk-pm.c
@@ -67,6 +67,10 @@ int blk_pre_runtime_suspend(struct request_queue *q)
WARN_ON_ONCE(q->rpm_status != RPM_ACTIVE);
+ spin_lock_irq(&q->queue_lock);
+ q->rpm_status = RPM_SUSPENDING;
+ spin_unlock_irq(&q->queue_lock);
+
/*
* Increase the pm_only counter before checking whether any
* non-PM blk_queue_enter() calls are in progress to avoid that any
@@ -89,15 +93,14 @@ int blk_pre_runtime_suspend(struct request_queue *q)
/* Switch q_usage_counter back to per-cpu mode. */
blk_mq_unfreeze_queue(q);
- spin_lock_irq(&q->queue_lock);
- if (ret < 0)
+ if (ret < 0) {
+ spin_lock_irq(&q->queue_lock);
+ q->rpm_status = RPM_ACTIVE;
pm_runtime_mark_last_busy(q->dev);
- else
- q->rpm_status = RPM_SUSPENDING;
- spin_unlock_irq(&q->queue_lock);
+ spin_unlock_irq(&q->queue_lock);
- if (ret)
blk_clear_pm_only(q);
+ }
return ret;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e8973201d9b281375b5a8c66093de5679423021a Mon Sep 17 00:00:00 2001
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Date: Fri, 6 Nov 2020 18:25:30 +0900
Subject: [PATCH] mmc: renesas_sdhi_core: Add missing tmio_mmc_host_free() at
remove
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The commit 94b110aff867 ("mmc: tmio: add tmio_mmc_host_alloc/free()")
added tmio_mmc_host_free(), but missed the function calling in
the sh_mobile_sdhi_remove() at that time. So, fix it. Otherwise,
we cannot rebind the sdhi/mmc devices when we use aliases of mmc.
Fixes: 94b110aff867 ("mmc: tmio: add tmio_mmc_host_alloc/free()")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Reviewed-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas(a)ragnatech.se>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/1604654730-29914-1-git-send-email-yoshihiro.shimo…
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index 414314151d0a..03c905a781a7 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -1160,6 +1160,7 @@ int renesas_sdhi_remove(struct platform_device *pdev)
tmio_mmc_host_remove(host);
renesas_sdhi_clk_disable(host);
+ tmio_mmc_host_free(host);
return 0;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e8973201d9b281375b5a8c66093de5679423021a Mon Sep 17 00:00:00 2001
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Date: Fri, 6 Nov 2020 18:25:30 +0900
Subject: [PATCH] mmc: renesas_sdhi_core: Add missing tmio_mmc_host_free() at
remove
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The commit 94b110aff867 ("mmc: tmio: add tmio_mmc_host_alloc/free()")
added tmio_mmc_host_free(), but missed the function calling in
the sh_mobile_sdhi_remove() at that time. So, fix it. Otherwise,
we cannot rebind the sdhi/mmc devices when we use aliases of mmc.
Fixes: 94b110aff867 ("mmc: tmio: add tmio_mmc_host_alloc/free()")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Reviewed-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas(a)ragnatech.se>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/1604654730-29914-1-git-send-email-yoshihiro.shimo…
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index 414314151d0a..03c905a781a7 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -1160,6 +1160,7 @@ int renesas_sdhi_remove(struct platform_device *pdev)
tmio_mmc_host_remove(host);
renesas_sdhi_clk_disable(host);
+ tmio_mmc_host_free(host);
return 0;
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e8973201d9b281375b5a8c66093de5679423021a Mon Sep 17 00:00:00 2001
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Date: Fri, 6 Nov 2020 18:25:30 +0900
Subject: [PATCH] mmc: renesas_sdhi_core: Add missing tmio_mmc_host_free() at
remove
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The commit 94b110aff867 ("mmc: tmio: add tmio_mmc_host_alloc/free()")
added tmio_mmc_host_free(), but missed the function calling in
the sh_mobile_sdhi_remove() at that time. So, fix it. Otherwise,
we cannot rebind the sdhi/mmc devices when we use aliases of mmc.
Fixes: 94b110aff867 ("mmc: tmio: add tmio_mmc_host_alloc/free()")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Reviewed-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas(a)ragnatech.se>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/1604654730-29914-1-git-send-email-yoshihiro.shimo…
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index 414314151d0a..03c905a781a7 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -1160,6 +1160,7 @@ int renesas_sdhi_remove(struct platform_device *pdev)
tmio_mmc_host_remove(host);
renesas_sdhi_clk_disable(host);
+ tmio_mmc_host_free(host);
return 0;
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8b92c4ff4423aa9900cf838d3294fcade4dbda35 Mon Sep 17 00:00:00 2001
From: Matteo Croce <mcroce(a)microsoft.com>
Date: Fri, 13 Nov 2020 22:52:02 -0800
Subject: [PATCH] Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Patch series "fix parsing of reboot= cmdline", v3.
The parsing of the reboot= cmdline has two major errors:
- a missing bound check can crash the system on reboot
- parsing of the cpu number only works if specified last
Fix both.
This patch (of 2):
This reverts commit 616feab753972b97.
kstrtoint() and simple_strtoul() have a subtle difference which makes
them non interchangeable: if a non digit character is found amid the
parsing, the former will return an error, while the latter will just
stop parsing, e.g. simple_strtoul("123xyx") = 123.
The kernel cmdline reboot= argument allows to specify the CPU used for
rebooting, with the syntax `s####` among the other flags, e.g.
"reboot=warm,s31,force", so if this flag is not the last given, it's
silently ignored as well as the subsequent ones.
Fixes: 616feab75397 ("kernel/reboot.c: convert simple_strtoul to kstrtoint")
Signed-off-by: Matteo Croce <mcroce(a)microsoft.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Cc: Petr Mladek <pmladek(a)suse.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Robin Holt <robinmholt(a)gmail.com>
Cc: Fabian Frederick <fabf(a)skynet.be>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/kernel/reboot.c b/kernel/reboot.c
index e7b78d5ae1ab..8fbba433725e 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -551,22 +551,15 @@ static int __init reboot_setup(char *str)
break;
case 's':
- {
- int rc;
-
- if (isdigit(*(str+1))) {
- rc = kstrtoint(str+1, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else if (str[1] == 'm' && str[2] == 'p' &&
- isdigit(*(str+3))) {
- rc = kstrtoint(str+3, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else
+ if (isdigit(*(str+1)))
+ reboot_cpu = simple_strtoul(str+1, NULL, 0);
+ else if (str[1] == 'm' && str[2] == 'p' &&
+ isdigit(*(str+3)))
+ reboot_cpu = simple_strtoul(str+3, NULL, 0);
+ else
*mode = REBOOT_SOFT;
break;
- }
+
case 'g':
*mode = REBOOT_GPIO;
break;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8b92c4ff4423aa9900cf838d3294fcade4dbda35 Mon Sep 17 00:00:00 2001
From: Matteo Croce <mcroce(a)microsoft.com>
Date: Fri, 13 Nov 2020 22:52:02 -0800
Subject: [PATCH] Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Patch series "fix parsing of reboot= cmdline", v3.
The parsing of the reboot= cmdline has two major errors:
- a missing bound check can crash the system on reboot
- parsing of the cpu number only works if specified last
Fix both.
This patch (of 2):
This reverts commit 616feab753972b97.
kstrtoint() and simple_strtoul() have a subtle difference which makes
them non interchangeable: if a non digit character is found amid the
parsing, the former will return an error, while the latter will just
stop parsing, e.g. simple_strtoul("123xyx") = 123.
The kernel cmdline reboot= argument allows to specify the CPU used for
rebooting, with the syntax `s####` among the other flags, e.g.
"reboot=warm,s31,force", so if this flag is not the last given, it's
silently ignored as well as the subsequent ones.
Fixes: 616feab75397 ("kernel/reboot.c: convert simple_strtoul to kstrtoint")
Signed-off-by: Matteo Croce <mcroce(a)microsoft.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Cc: Petr Mladek <pmladek(a)suse.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Robin Holt <robinmholt(a)gmail.com>
Cc: Fabian Frederick <fabf(a)skynet.be>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/kernel/reboot.c b/kernel/reboot.c
index e7b78d5ae1ab..8fbba433725e 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -551,22 +551,15 @@ static int __init reboot_setup(char *str)
break;
case 's':
- {
- int rc;
-
- if (isdigit(*(str+1))) {
- rc = kstrtoint(str+1, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else if (str[1] == 'm' && str[2] == 'p' &&
- isdigit(*(str+3))) {
- rc = kstrtoint(str+3, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else
+ if (isdigit(*(str+1)))
+ reboot_cpu = simple_strtoul(str+1, NULL, 0);
+ else if (str[1] == 'm' && str[2] == 'p' &&
+ isdigit(*(str+3)))
+ reboot_cpu = simple_strtoul(str+3, NULL, 0);
+ else
*mode = REBOOT_SOFT;
break;
- }
+
case 'g':
*mode = REBOOT_GPIO;
break;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8b92c4ff4423aa9900cf838d3294fcade4dbda35 Mon Sep 17 00:00:00 2001
From: Matteo Croce <mcroce(a)microsoft.com>
Date: Fri, 13 Nov 2020 22:52:02 -0800
Subject: [PATCH] Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Patch series "fix parsing of reboot= cmdline", v3.
The parsing of the reboot= cmdline has two major errors:
- a missing bound check can crash the system on reboot
- parsing of the cpu number only works if specified last
Fix both.
This patch (of 2):
This reverts commit 616feab753972b97.
kstrtoint() and simple_strtoul() have a subtle difference which makes
them non interchangeable: if a non digit character is found amid the
parsing, the former will return an error, while the latter will just
stop parsing, e.g. simple_strtoul("123xyx") = 123.
The kernel cmdline reboot= argument allows to specify the CPU used for
rebooting, with the syntax `s####` among the other flags, e.g.
"reboot=warm,s31,force", so if this flag is not the last given, it's
silently ignored as well as the subsequent ones.
Fixes: 616feab75397 ("kernel/reboot.c: convert simple_strtoul to kstrtoint")
Signed-off-by: Matteo Croce <mcroce(a)microsoft.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Cc: Petr Mladek <pmladek(a)suse.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Robin Holt <robinmholt(a)gmail.com>
Cc: Fabian Frederick <fabf(a)skynet.be>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/kernel/reboot.c b/kernel/reboot.c
index e7b78d5ae1ab..8fbba433725e 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -551,22 +551,15 @@ static int __init reboot_setup(char *str)
break;
case 's':
- {
- int rc;
-
- if (isdigit(*(str+1))) {
- rc = kstrtoint(str+1, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else if (str[1] == 'm' && str[2] == 'p' &&
- isdigit(*(str+3))) {
- rc = kstrtoint(str+3, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else
+ if (isdigit(*(str+1)))
+ reboot_cpu = simple_strtoul(str+1, NULL, 0);
+ else if (str[1] == 'm' && str[2] == 'p' &&
+ isdigit(*(str+3)))
+ reboot_cpu = simple_strtoul(str+3, NULL, 0);
+ else
*mode = REBOOT_SOFT;
break;
- }
+
case 'g':
*mode = REBOOT_GPIO;
break;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3347acc6fcd4ee71ad18a9ff9d9dac176b517329 Mon Sep 17 00:00:00 2001
From: Arvind Sankar <nivedita(a)alum.mit.edu>
Date: Fri, 13 Nov 2020 22:51:59 -0800
Subject: [PATCH] compiler.h: fix barrier_data() on clang
Commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") neglected to copy barrier_data() from
compiler-gcc.h into compiler-clang.h.
The definition in compiler-gcc.h was really to work around clang's more
aggressive optimization, so this broke barrier_data() on clang, and
consequently memzero_explicit() as well.
For example, this results in at least the memzero_explicit() call in
lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
Fix this by moving the definition of barrier_data() into compiler.h.
Also move the gcc/clang definition of barrier() into compiler.h,
__memory_barrier() is icc-specific (and barrier() is already defined
using it in compiler-intel.h) and doesn't belong in compiler.h.
[rdunlap(a)infradead.org: fix ALPHA builds when SMP is not enabled]
Link: https://lkml.kernel.org/r/20201101231835.4589-1-rdunlap@infradead.org
Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
Signed-off-by: Arvind Sankar <nivedita(a)alum.mit.edu>
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Tested-by: Nick Desaulniers <ndesaulniers(a)google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers(a)google.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201014212631.207844-1-nivedita@alum.mit.edu
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 798027bb89be..640f09479bdf 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -13,6 +13,7 @@
#ifndef __ASSEMBLY__
+#include <linux/compiler.h>
#include <asm/rwonce.h>
#ifndef nop
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 230604e7f057..dd7233c48bf3 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -60,12 +60,6 @@
#define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
#endif
-/* The following are for compatibility with GCC, from compiler-gcc.h,
- * and may be redefined here because they should not be shared with other
- * compilers, like ICC.
- */
-#define barrier() __asm__ __volatile__("" : : : "memory")
-
#if __has_feature(shadow_call_stack)
# define __noscs __attribute__((__no_sanitize__("shadow-call-stack")))
#endif
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 5deb37024574..74c6c0486eed 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -15,25 +15,6 @@
# error Sorry, your version of GCC is too old - please use 4.9 or newer.
#endif
-/* Optimization barrier */
-
-/* The "volatile" is due to gcc bugs */
-#define barrier() __asm__ __volatile__("": : :"memory")
-/*
- * This version is i.e. to prevent dead stores elimination on @ptr
- * where gcc and llvm may behave differently when otherwise using
- * normal barrier(): while gcc behavior gets along with a normal
- * barrier(), llvm needs an explicit input variable to be assumed
- * clobbered. The issue is as follows: while the inline asm might
- * access any memory it wants, the compiler could have fit all of
- * @ptr into memory registers instead, and since @ptr never escaped
- * from that, it proved that the inline asm wasn't touching any of
- * it. This version works well with both compilers, i.e. we're telling
- * the compiler that the inline asm absolutely may see the contents
- * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
- */
-#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
-
/*
* This macro obfuscates arithmetic on a variable address so that gcc
* shouldn't recognize the original var, and make assumptions about it.
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e512f5505dad..b8fe0c23cfff 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -80,11 +80,25 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
/* Optimization barrier */
#ifndef barrier
-# define barrier() __memory_barrier()
+/* The "volatile" is due to gcc bugs */
+# define barrier() __asm__ __volatile__("": : :"memory")
#endif
#ifndef barrier_data
-# define barrier_data(ptr) barrier()
+/*
+ * This version is i.e. to prevent dead stores elimination on @ptr
+ * where gcc and llvm may behave differently when otherwise using
+ * normal barrier(): while gcc behavior gets along with a normal
+ * barrier(), llvm needs an explicit input variable to be assumed
+ * clobbered. The issue is as follows: while the inline asm might
+ * access any memory it wants, the compiler could have fit all of
+ * @ptr into memory registers instead, and since @ptr never escaped
+ * from that, it proved that the inline asm wasn't touching any of
+ * it. This version works well with both compilers, i.e. we're telling
+ * the compiler that the inline asm absolutely may see the contents
+ * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
+ */
+# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
#endif
/* workaround for GCC PR82365 if needed */
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f3217d6f2f7a76b36a3326ad58c8897f4d5fbe31 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Mon, 26 Oct 2020 16:54:36 +0100
Subject: [PATCH] firmware: xilinx: fix out-of-bounds access
The zynqmp_pm_set_suspend_mode() and zynqmp_pm_get_trustzone_version()
functions pass values as api_id into zynqmp_pm_invoke_fn
that are beyond PM_API_MAX, resulting in an out-of-bounds access:
drivers/firmware/xilinx/zynqmp.c: In function 'zynqmp_pm_set_suspend_mode':
drivers/firmware/xilinx/zynqmp.c:150:24: warning: array subscript 2562 is above array bounds of 'u32[64]' {aka 'unsigned int[64]'} [-Warray-bounds]
150 | if (zynqmp_pm_features[api_id] != PM_FEATURE_UNCHECKED)
| ~~~~~~~~~~~~~~~~~~^~~~~~~~
drivers/firmware/xilinx/zynqmp.c:28:12: note: while referencing 'zynqmp_pm_features'
28 | static u32 zynqmp_pm_features[PM_API_MAX];
| ^~~~~~~~~~~~~~~~~~
Replace the resulting undefined behavior with an error return.
This may break some things that happen to work at the moment
but seems better than randomly overwriting kernel data.
I assume we need additional fixes for the two functions that now
return an error.
Fixes: 76582671eb5d ("firmware: xilinx: Add Zynqmp firmware driver")
Fixes: e178df31cf41 ("firmware: xilinx: Implement ZynqMP power management APIs")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Link: https://lore.kernel.org/r/20201026155449.3703142-1-arnd@kernel.org
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/firmware/xilinx/zynqmp.c b/drivers/firmware/xilinx/zynqmp.c
index 8d1ff2454e2e..efb8a66efc68 100644
--- a/drivers/firmware/xilinx/zynqmp.c
+++ b/drivers/firmware/xilinx/zynqmp.c
@@ -147,6 +147,9 @@ static int zynqmp_pm_feature(u32 api_id)
return 0;
/* Return value if feature is already checked */
+ if (api_id > ARRAY_SIZE(zynqmp_pm_features))
+ return PM_FEATURE_INVALID;
+
if (zynqmp_pm_features[api_id] != PM_FEATURE_UNCHECKED)
return zynqmp_pm_features[api_id];
The optimized cipher function need length multiple of 4 bytes.
But it get sometimes odd length.
This is due to SG data could be stored with an offset.
So the fix is to check also if the offset is aligned with 4 bytes.
Fixes: 6298e948215f2 ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Corentin Labbe <clabbe(a)baylibre.com>
---
drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
index 19f1aa577ed4..4dd736ee5a4d 100644
--- a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
+++ b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
@@ -186,12 +186,12 @@ static int sun4i_ss_cipher_poll(struct skcipher_request *areq)
* we can use the SS optimized function
*/
while (in_sg && no_chunk == 1) {
- if (in_sg->length % 4)
+ if (in_sg->length % 4 || !IS_ALIGNED(in_sg->offset, sizeof(u32)))
no_chunk = 0;
in_sg = sg_next(in_sg);
}
while (out_sg && no_chunk == 1) {
- if (out_sg->length % 4)
+ if (out_sg->length % 4 || !IS_ALIGNED(out_sg->offset, sizeof(u32)))
no_chunk = 0;
out_sg = sg_next(out_sg);
}
--
2.26.2
The need_fallback is never initialized and seem to be always true at runtime.
So all hardware operations are always bypassed.
Fixes: 0ae1f46c55f87 ("crypto: sun4i-ss - fallback when length is not multiple of blocksize")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Corentin Labbe <clabbe(a)baylibre.com>
---
drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
index 8f4621826330..7f4c97cc9627 100644
--- a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
+++ b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
@@ -179,7 +179,7 @@ static int sun4i_ss_cipher_poll(struct skcipher_request *areq)
unsigned int obo = 0; /* offset in bufo*/
unsigned int obl = 0; /* length of data in bufo */
unsigned long flags;
- bool need_fallback;
+ bool need_fallback = false;
if (!areq->cryptlen)
return 0;
--
2.26.2
We've fixed many races in panfrost_job_timedout() but some remain.
Instead of trying to fix it again, let's simplify the logic and move
the reset bits to a separate work scheduled when one of the queue
reports a timeout.
v5:
- Simplify panfrost_scheduler_stop() (Steven Price)
- Always restart the queue in panfrost_scheduler_start() even if
the status is corrupted (Steven Price)
v4:
- Rework the logic to prevent a race between drm_sched_start()
(reset work) and drm_sched_job_timedout() (timeout work)
- Drop Steven's R-b
- Add dma_fence annotation to the panfrost_reset() function (Daniel Vetter)
v3:
- Replace the atomic_cmpxchg() by an atomic_xchg() (Robin Murphy)
- Add Steven's R-b
v2:
- Use atomic_cmpxchg() to conditionally schedule the reset work (Steven Price)
Fixes: 1a11a88cfd9a ("drm/panfrost: Fix job timeout handling")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon(a)collabora.com>
---
drivers/gpu/drm/panfrost/panfrost_device.c | 1 -
drivers/gpu/drm/panfrost/panfrost_device.h | 6 +-
drivers/gpu/drm/panfrost/panfrost_job.c | 187 ++++++++++++++-------
3 files changed, 130 insertions(+), 64 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
index 1daf9322954a..fbcf5edbe367 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.c
+++ b/drivers/gpu/drm/panfrost/panfrost_device.c
@@ -200,7 +200,6 @@ int panfrost_device_init(struct panfrost_device *pfdev)
struct resource *res;
mutex_init(&pfdev->sched_lock);
- mutex_init(&pfdev->reset_lock);
INIT_LIST_HEAD(&pfdev->scheduled_jobs);
INIT_LIST_HEAD(&pfdev->as_lru_list);
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index 140e004a3790..597cf1459b0a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -106,7 +106,11 @@ struct panfrost_device {
struct panfrost_perfcnt *perfcnt;
struct mutex sched_lock;
- struct mutex reset_lock;
+
+ struct {
+ struct work_struct work;
+ atomic_t pending;
+ } reset;
struct mutex shrinker_lock;
struct list_head shrinker_list;
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index e75b7d2192f7..04e6f6f9b742 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -20,12 +20,21 @@
#include "panfrost_gpu.h"
#include "panfrost_mmu.h"
+#define JOB_TIMEOUT_MS 500
+
#define job_write(dev, reg, data) writel(data, dev->iomem + (reg))
#define job_read(dev, reg) readl(dev->iomem + (reg))
+enum panfrost_queue_status {
+ PANFROST_QUEUE_STATUS_ACTIVE,
+ PANFROST_QUEUE_STATUS_STOPPED,
+ PANFROST_QUEUE_STATUS_STARTING,
+ PANFROST_QUEUE_STATUS_FAULT_PENDING,
+};
+
struct panfrost_queue_state {
struct drm_gpu_scheduler sched;
- bool stopped;
+ atomic_t status;
struct mutex lock;
u64 fence_context;
u64 emit_seqno;
@@ -373,28 +382,61 @@ void panfrost_job_enable_interrupts(struct panfrost_device *pfdev)
static bool panfrost_scheduler_stop(struct panfrost_queue_state *queue,
struct drm_sched_job *bad)
{
+ enum panfrost_queue_status old_status;
bool stopped = false;
mutex_lock(&queue->lock);
- if (!queue->stopped) {
- drm_sched_stop(&queue->sched, bad);
- if (bad)
- drm_sched_increase_karma(bad);
- queue->stopped = true;
- stopped = true;
- }
+ old_status = atomic_xchg(&queue->status,
+ PANFROST_QUEUE_STATUS_STOPPED);
+ if (old_status == PANFROST_QUEUE_STATUS_STOPPED)
+ goto out;
+
+ WARN_ON(old_status != PANFROST_QUEUE_STATUS_ACTIVE);
+ drm_sched_stop(&queue->sched, bad);
+ if (bad)
+ drm_sched_increase_karma(bad);
+
+ stopped = true;
+
+ /*
+ * Set the timeout to max so the timer doesn't get started
+ * when we return from the timeout handler (restored in
+ * panfrost_scheduler_start()).
+ */
+ queue->sched.timeout = MAX_SCHEDULE_TIMEOUT;
+
+out:
mutex_unlock(&queue->lock);
return stopped;
}
+static void panfrost_scheduler_start(struct panfrost_queue_state *queue)
+{
+ enum panfrost_queue_status old_status;
+
+ mutex_lock(&queue->lock);
+ old_status = atomic_xchg(&queue->status,
+ PANFROST_QUEUE_STATUS_STARTING);
+ WARN_ON(old_status != PANFROST_QUEUE_STATUS_STOPPED);
+
+ /* Restore the original timeout before starting the scheduler. */
+ queue->sched.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS);
+ drm_sched_resubmit_jobs(&queue->sched);
+ drm_sched_start(&queue->sched, true);
+ old_status = atomic_xchg(&queue->status,
+ PANFROST_QUEUE_STATUS_ACTIVE);
+ if (old_status == PANFROST_QUEUE_STATUS_FAULT_PENDING)
+ drm_sched_fault(&queue->sched);
+
+ mutex_unlock(&queue->lock);
+}
+
static void panfrost_job_timedout(struct drm_sched_job *sched_job)
{
struct panfrost_job *job = to_panfrost_job(sched_job);
struct panfrost_device *pfdev = job->pfdev;
int js = panfrost_job_get_slot(job);
- unsigned long flags;
- int i;
/*
* If the GPU managed to complete this jobs fence, the timeout is
@@ -415,56 +457,9 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job))
return;
- if (!mutex_trylock(&pfdev->reset_lock))
- return;
-
- for (i = 0; i < NUM_JOB_SLOTS; i++) {
- struct drm_gpu_scheduler *sched = &pfdev->js->queue[i].sched;
-
- /*
- * If the queue is still active, make sure we wait for any
- * pending timeouts.
- */
- if (!pfdev->js->queue[i].stopped)
- cancel_delayed_work_sync(&sched->work_tdr);
-
- /*
- * If the scheduler was not already stopped, there's a tiny
- * chance a timeout has expired just before we stopped it, and
- * drm_sched_stop() does not flush pending works. Let's flush
- * them now so the timeout handler doesn't get called in the
- * middle of a reset.
- */
- if (panfrost_scheduler_stop(&pfdev->js->queue[i], NULL))
- cancel_delayed_work_sync(&sched->work_tdr);
-
- /*
- * Now that we cancelled the pending timeouts, we can safely
- * reset the stopped state.
- */
- pfdev->js->queue[i].stopped = false;
- }
-
- spin_lock_irqsave(&pfdev->js->job_lock, flags);
- for (i = 0; i < NUM_JOB_SLOTS; i++) {
- if (pfdev->jobs[i]) {
- pm_runtime_put_noidle(pfdev->dev);
- panfrost_devfreq_record_idle(&pfdev->pfdevfreq);
- pfdev->jobs[i] = NULL;
- }
- }
- spin_unlock_irqrestore(&pfdev->js->job_lock, flags);
-
- panfrost_device_reset(pfdev);
-
- for (i = 0; i < NUM_JOB_SLOTS; i++)
- drm_sched_resubmit_jobs(&pfdev->js->queue[i].sched);
-
- mutex_unlock(&pfdev->reset_lock);
-
- /* restart scheduler after GPU is usable again */
- for (i = 0; i < NUM_JOB_SLOTS; i++)
- drm_sched_start(&pfdev->js->queue[i].sched, true);
+ /* Schedule a reset if there's no reset in progress. */
+ if (!atomic_xchg(&pfdev->reset.pending, 1))
+ schedule_work(&pfdev->reset.work);
}
static const struct drm_sched_backend_ops panfrost_sched_ops = {
@@ -496,6 +491,8 @@ static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
job_write(pfdev, JOB_INT_CLEAR, mask);
if (status & JOB_INT_MASK_ERR(j)) {
+ enum panfrost_queue_status old_status;
+
job_write(pfdev, JS_COMMAND_NEXT(j), JS_COMMAND_NOP);
dev_err(pfdev->dev, "js fault, js=%d, status=%s, head=0x%x, tail=0x%x",
@@ -504,7 +501,18 @@ static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
job_read(pfdev, JS_HEAD_LO(j)),
job_read(pfdev, JS_TAIL_LO(j)));
- drm_sched_fault(&pfdev->js->queue[j].sched);
+ /*
+ * When the queue is being restarted we don't report
+ * faults directly to avoid races between the timeout
+ * and reset handlers. panfrost_scheduler_start() will
+ * call drm_sched_fault() after the queue has been
+ * started if status == FAULT_PENDING.
+ */
+ old_status = atomic_cmpxchg(&pfdev->js->queue[j].status,
+ PANFROST_QUEUE_STATUS_STARTING,
+ PANFROST_QUEUE_STATUS_FAULT_PENDING);
+ if (old_status == PANFROST_QUEUE_STATUS_ACTIVE)
+ drm_sched_fault(&pfdev->js->queue[j].sched);
}
if (status & JOB_INT_MASK_DONE(j)) {
@@ -531,11 +539,66 @@ static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
return IRQ_HANDLED;
}
+static void panfrost_reset(struct work_struct *work)
+{
+ struct panfrost_device *pfdev = container_of(work,
+ struct panfrost_device,
+ reset.work);
+ unsigned long flags;
+ unsigned int i;
+ bool cookie;
+
+ cookie = dma_fence_begin_signalling();
+ for (i = 0; i < NUM_JOB_SLOTS; i++) {
+ /*
+ * We want pending timeouts to be handled before we attempt
+ * to stop the scheduler. If we don't do that and the timeout
+ * handler is in flight, it might have removed the bad job
+ * from the list, and we'll lose this job if the reset handler
+ * enters the critical section in panfrost_scheduler_stop()
+ * before the timeout handler.
+ *
+ * Timeout is set to MAX_SCHEDULE_TIMEOUT - 1 because we need
+ * something big enough to make sure the timer will not expire
+ * before we manage to stop the scheduler, but we can't use
+ * MAX_SCHEDULE_TIMEOUT because drm_sched_get_cleanup_job()
+ * considers that as 'timer is not running' and will dequeue
+ * the job without making sure the timeout handler is not
+ * running.
+ */
+ pfdev->js->queue[i].sched.timeout = MAX_SCHEDULE_TIMEOUT - 1;
+ cancel_delayed_work_sync(&pfdev->js->queue[i].sched.work_tdr);
+ panfrost_scheduler_stop(&pfdev->js->queue[i], NULL);
+ }
+
+ /* All timers have been stopped, we can safely reset the pending state. */
+ atomic_set(&pfdev->reset.pending, 0);
+
+ spin_lock_irqsave(&pfdev->js->job_lock, flags);
+ for (i = 0; i < NUM_JOB_SLOTS; i++) {
+ if (pfdev->jobs[i]) {
+ pm_runtime_put_noidle(pfdev->dev);
+ panfrost_devfreq_record_idle(&pfdev->pfdevfreq);
+ pfdev->jobs[i] = NULL;
+ }
+ }
+ spin_unlock_irqrestore(&pfdev->js->job_lock, flags);
+
+ panfrost_device_reset(pfdev);
+
+ for (i = 0; i < NUM_JOB_SLOTS; i++)
+ panfrost_scheduler_start(&pfdev->js->queue[i]);
+
+ dma_fence_end_signalling(cookie);
+}
+
int panfrost_job_init(struct panfrost_device *pfdev)
{
struct panfrost_job_slot *js;
int ret, j, irq;
+ INIT_WORK(&pfdev->reset.work, panfrost_reset);
+
pfdev->js = js = devm_kzalloc(pfdev->dev, sizeof(*js), GFP_KERNEL);
if (!js)
return -ENOMEM;
@@ -560,7 +623,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
ret = drm_sched_init(&js->queue[j].sched,
&panfrost_sched_ops,
- 1, 0, msecs_to_jiffies(500),
+ 1, 0, msecs_to_jiffies(JOB_TIMEOUT_MS),
"pan_js");
if (ret) {
dev_err(pfdev->dev, "Failed to create scheduler: %d.", ret);
--
2.26.2
Hi Greg and Sasha,
Please apply commit bc7f2cd7559c ("spi: bcm2835: remove use of
uninitialized gpio flags variable") as a fix to commit
5e31ba0c0543 ("spi: bcm2835: fix gpio cs level inversion"), which
appears to be in 5.4 and 5.9 at the time of writing this. I did not try
to apply it but I did not see an outstanding failure email about it. If
it does not apply cleanly, let me know.
Cheers,
Nathan
The ethernet driver may allocate skb (and skb->data) via napi_alloc_skb().
This ends up to page_frag_alloc() to allocate skb->data from
page_frag_cache->va.
During the memory pressure, page_frag_cache->va may be allocated as
pfmemalloc page. As a result, the skb->pfmemalloc is always true as
skb->data is from page_frag_cache->va. The skb will be dropped if the
sock (receiver) does not have SOCK_MEMALLOC. This is expected behaviour
under memory pressure.
However, once kernel is not under memory pressure any longer (suppose large
amount of memory pages are just reclaimed), the page_frag_alloc() may still
re-use the prior pfmemalloc page_frag_cache->va to allocate skb->data. As a
result, the skb->pfmemalloc is always true unless page_frag_cache->va is
re-allocated, even if the kernel is not under memory pressure any longer.
Here is how kernel runs into issue.
1. The kernel is under memory pressure and allocation of
PAGE_FRAG_CACHE_MAX_ORDER in __page_frag_cache_refill() will fail. Instead,
the pfmemalloc page is allocated for page_frag_cache->va.
2: All skb->data from page_frag_cache->va (pfmemalloc) will have
skb->pfmemalloc=true. The skb will always be dropped by sock without
SOCK_MEMALLOC. This is an expected behaviour.
3. Suppose a large amount of pages are reclaimed and kernel is not under
memory pressure any longer. We expect skb->pfmemalloc drop will not happen.
4. Unfortunately, page_frag_alloc() does not proactively re-allocate
page_frag_alloc->va and will always re-use the prior pfmemalloc page. The
skb->pfmemalloc is always true even kernel is not under memory pressure any
longer.
Fix this by freeing and re-allocating the page instead of recycling it.
References: https://lore.kernel.org/lkml/20201103193239.1807-1-dongli.zhang@oracle.com/
References: https://lore.kernel.org/linux-mm/20201105042140.5253-1-willy@infradead.org/
Suggested-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Aruna Ramakrishna <aruna.ramakrishna(a)oracle.com>
Cc: Bert Barbe <bert.barbe(a)oracle.com>
Cc: Rama Nichanamatlu <rama.nichanamatlu(a)oracle.com>
Cc: Venkat Venkatsubra <venkat.x.venkatsubra(a)oracle.com>
Cc: Manjunath Patil <manjunath.b.patil(a)oracle.com>
Cc: Joe Jin <joe.jin(a)oracle.com>
Cc: SRINIVAS <srinivas.eeda(a)oracle.com>
Cc: stable(a)vger.kernel.org
Fixes: 79930f5892e ("net: do not deplete pfmemalloc reserve")
Signed-off-by: Dongli Zhang <dongli.zhang(a)oracle.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
---
Changed since v1:
- change author from Matthew to Dongli
- Add references to all prior discussions
- Add more details to commit message
mm/page_alloc.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 23f5066bd4a5..91129ce75ed4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5103,6 +5103,11 @@ void *page_frag_alloc(struct page_frag_cache *nc,
if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
goto refill;
+ if (nc->pfmemalloc) {
+ free_the_page(page, compound_order(page));
+ goto refill;
+ }
+
#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
/* if size can vary use size else just use PAGE_SIZE */
size = nc->size;
--
2.17.1
When the machine is under extreme memory pressure, the page_frag allocator
signals this to the networking stack by marking allocations with the
'pfmemalloc' flag, which causes non-essential packets to be dropped.
Unfortunately, even after the machine recovers from the low memory
condition, the page continues to be used by the page_frag allocator,
so all allocations from this page will continue to be dropped.
Fix this by freeing and re-allocating the page instead of recycling it.
Reported-by: Dongli Zhang <dongli.zhang(a)oracle.com>
Cc: Aruna Ramakrishna <aruna.ramakrishna(a)oracle.com>
Cc: Bert Barbe <bert.barbe(a)oracle.com>
Cc: Rama Nichanamatlu <rama.nichanamatlu(a)oracle.com>
Cc: Venkat Venkatsubra <venkat.x.venkatsubra(a)oracle.com>
Cc: Manjunath Patil <manjunath.b.patil(a)oracle.com>
Cc: Joe Jin <joe.jin(a)oracle.com>
Cc: SRINIVAS <srinivas.eeda(a)oracle.com>
Cc: stable(a)vger.kernel.org
Fixes: 79930f5892e ("net: do not deplete pfmemalloc reserve")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
---
mm/page_alloc.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 778e815130a6..631546ae1c53 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5139,6 +5139,10 @@ void *page_frag_alloc(struct page_frag_cache *nc,
if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
goto refill;
+ if (nc->pfmemalloc) {
+ free_the_page(page, compound_order(page));
+ goto refill;
+ }
#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
/* if size can vary use size else just use PAGE_SIZE */
--
2.28.0
From: Evgeny Novikov <novikov(a)ispras.ru>
[ Upstream commit 0d66e04875c5aae876cf3d4f4be7978fa2b00523 ]
goku_probe() goes to error label "err" and invokes goku_remove()
in case of failures of pci_enable_device(), pci_resource_start()
and ioremap(). goku_remove() gets a device from
pci_get_drvdata(pdev) and works with it without any checks, in
particular it dereferences a corresponding pointer. But
goku_probe() did not set this device yet. So, one can expect
various crashes. The patch moves setting the device just after
allocation of memory for it.
Found by Linux Driver Verification project (linuxtesting.org).
Reported-by: Pavel Andrianov <andrianov(a)ispras.ru>
Signed-off-by: Evgeny Novikov <novikov(a)ispras.ru>
Signed-off-by: Felipe Balbi <balbi(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/usb/gadget/udc/goku_udc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c
index c3721225b61ed..b706ad3034bc1 100644
--- a/drivers/usb/gadget/udc/goku_udc.c
+++ b/drivers/usb/gadget/udc/goku_udc.c
@@ -1757,6 +1757,7 @@ static int goku_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto err;
}
+ pci_set_drvdata(pdev, dev);
spin_lock_init(&dev->lock);
dev->pdev = pdev;
dev->gadget.ops = &goku_ops;
@@ -1790,7 +1791,6 @@ static int goku_probe(struct pci_dev *pdev, const struct pci_device_id *id)
}
dev->regs = (struct goku_udc_regs __iomem *) base;
- pci_set_drvdata(pdev, dev);
INFO(dev, "%s\n", driver_desc);
INFO(dev, "version: " DRIVER_VERSION " %s\n", dmastr());
INFO(dev, "irq %d, pci mem %p\n", pdev->irq, base);
--
2.27.0
From: Nicholas Piggin <npiggin(a)gmail.com>
Subject: mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
Previously the negated unsigned long would be cast back to signed long
which would have the correct negative value. After commit 730ec8c01a2b
("mm/vmscan.c: change prototype for shrink_page_list"), the large unsigned
int converts to a large positive signed long.
Symptoms include CMA allocations hanging forever holding the cma_mutex due
to alloc_contig_range->...->isolate_migratepages_block waiting forever in
"while (unlikely(too_many_isolated(pgdat)))".
[akpm(a)linux-foundation.org: fix -stat.nr_lazyfree_fail as well, per Michal]
Link: https://lkml.kernel.org/r/20201029032320.1448441-1-npiggin@gmail.com
Fixes: 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list")
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Vaneet Narang <v.narang(a)samsung.com>
Cc: Maninder Singh <maninder1.s(a)samsung.com>
Cc: Amit Sahrawat <a.sahrawat(a)samsung.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-fix-nr_isolated_file-corruption-on-64-bit
+++ a/mm/vmscan.c
@@ -1516,7 +1516,8 @@ unsigned int reclaim_clean_pages_from_li
nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
TTU_IGNORE_ACCESS, &stat, true);
list_splice(&clean_pages, page_list);
- mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -nr_reclaimed);
+ mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
+ -(long)nr_reclaimed);
/*
* Since lazyfree pages are isolated from file LRU from the beginning,
* they will rotate back to anonymous LRU in the end if it failed to
@@ -1526,7 +1527,7 @@ unsigned int reclaim_clean_pages_from_li
mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON,
stat.nr_lazyfree_fail);
mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
- -stat.nr_lazyfree_fail);
+ -(long)stat.nr_lazyfree_fail);
return nr_reclaimed;
}
_
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: f022ed13f4d9 - Linux 5.9.9-rc1
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
ppc64le:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Ethernet drivers sanity
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ✅ Storage block - queue scheduler test
🚧 ✅ Storage nvme - tcp
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 4:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 5:
✅ Boot test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
s390x:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 3:
✅ Boot test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
x86_64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Currently, commit e9e2eae89ddb dropped a (int) decoration from
XFS_LITINO(mp), and since sizeof() expression is also involved,
the result of XFS_LITINO(mp) is simply as the size_t type
(commonly unsigned long).
Considering the expression in xfs_attr_shortform_bytesfit():
offset = (XFS_LITINO(mp) - bytes) >> 3;
let "bytes" be (int)340, and
"XFS_LITINO(mp)" be (unsigned long)336.
on 64-bit platform, the expression is
offset = ((unsigned long)336 - (int)340) >> 8 =
(int)(0xfffffffffffffffcUL >> 3) = -1
but on 32-bit platform, the expression is
offset = ((unsigned long)336 - (int)340) >> 8 =
(int)(0xfffffffcUL >> 3) = 0x1fffffff
instead.
so offset becomes a large number on 32-bit platform, and cause
xfs_attr_shortform_bytesfit() returns maxforkoff rather than 0
Therefore, one result is
"ASSERT(new_size <= XFS_IFORK_SIZE(ip, whichfork));"
assertion failure in xfs_idata_realloc().
, which can be triggered with the following commands:
touch a;
setfattr -n user.0 -v "`seq 0 80`" a;
setfattr -n user.1 -v "`seq 0 80`" a
on 32-bit platform.
Fix it by restoring (int) decorator to XFS_LITINO(mp) since
int type for XFS_LITINO(mp) is safe and all pre-exist signed
calculations are correct.
Fixes: e9e2eae89ddb ("xfs: only check the superblock version for dinode size calculation")
Cc: <stable(a)vger.kernel.org> # 5.7+
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
---
I'm not sure this is the preferred way or just simply fix
xfs_attr_shortform_bytesfit() since I don't look into the
rest of XFS_LITINO(mp) users. Add (int) to XFS_LITINO(mp)
will avoid all potential regression at least.
fs/xfs/libxfs/xfs_format.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index dd764da08f6f..f58f0a44c024 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -1061,7 +1061,7 @@ enum xfs_dinode_fmt {
sizeof(struct xfs_dinode) : \
offsetof(struct xfs_dinode, di_crc))
#define XFS_LITINO(mp) \
- ((mp)->m_sb.sb_inodesize - XFS_DINODE_SIZE(&(mp)->m_sb))
+ ((int)((mp)->m_sb.sb_inodesize - XFS_DINODE_SIZE(&(mp)->m_sb)))
/*
* Inode data & attribute fork sizes, per inode.
--
2.18.4
Some quad/bluetooth keyboards, such as the Dinovo Edge (Y-RAY81) have a
builtin touchpad. In this case when asking the receiver for paired devices,
we get only 1 paired device with a device_type of REPORT_TYPE_KEYBOARD.
This means that we do not instantiate a second dj_hiddev for the mouse
(as we normally would) and thus there is no place for us to forward the
mouse input reports to, causing the touchpad part of the keyboard to not
work.
There is no way for us to detect these keyboards, so this commit adds
an array with device-ids for such keyboards and when a keyboard is on
this list it adds STD_MOUSE to the reports_supported bitmap for the
dj_hiddev created for the keyboard fixing the touchpad not working.
Using a list of device-ids for this is not ideal, but there are only
very few such keyboards so this should be fine. Besides the Dinovo Edge,
other known wireless Logitech keyboards with a builtin touchpad are:
* Dinovo Mini (TODO add its device-id to the list)
* K400 (uses a unifying receiver so is not affected)
* K600 (uses a unifying receiver so is not affected)
Cc: stable(a)vger.kernel.org
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1811424
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/hid/hid-logitech-dj.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/hid/hid-logitech-dj.c b/drivers/hid/hid-logitech-dj.c
index ea1e40530f85..9ed7260b9593 100644
--- a/drivers/hid/hid-logitech-dj.c
+++ b/drivers/hid/hid-logitech-dj.c
@@ -867,11 +867,23 @@ static void logi_dj_recv_queue_notification(struct dj_receiver_dev *djrcv_dev,
schedule_work(&djrcv_dev->work);
}
+/*
+ * Some quad/bluetooth keyboards have a builtin touchpad in this case we see
+ * only 1 paired device with a device_type of REPORT_TYPE_KEYBOARD. For the
+ * touchpad to work we must also forward mouse input reports to the dj_hiddev
+ * created for the keyboard (instead of forwarding them to a second paired
+ * device with a device_type of REPORT_TYPE_MOUSE as we normally would).
+ */
+static const u16 kbd_builtin_touchpad_ids[] = {
+ 0xb309, /* Dinovo Edge */
+};
+
static void logi_hidpp_dev_conn_notif_equad(struct hid_device *hdev,
struct hidpp_event *hidpp_report,
struct dj_workitem *workitem)
{
struct dj_receiver_dev *djrcv_dev = hid_get_drvdata(hdev);
+ int i, id;
workitem->type = WORKITEM_TYPE_PAIRED;
workitem->device_type = hidpp_report->params[HIDPP_PARAM_DEVICE_INFO] &
@@ -883,6 +895,13 @@ static void logi_hidpp_dev_conn_notif_equad(struct hid_device *hdev,
workitem->reports_supported |= STD_KEYBOARD | MULTIMEDIA |
POWER_KEYS | MEDIA_CENTER |
HIDPP;
+ id = (workitem->quad_id_msb << 8) | workitem->quad_id_lsb;
+ for (i = 0; i < ARRAY_SIZE(kbd_builtin_touchpad_ids); i++) {
+ if (id == kbd_builtin_touchpad_ids[i]) {
+ workitem->reports_supported |= STD_MOUSE;
+ break;
+ }
+ }
break;
case REPORT_TYPE_MOUSE:
workitem->reports_supported |= STD_MOUSE | HIDPP;
--
2.28.0
Dear Stable Kernel Maintainers,
Please consider cherry picking
commit ed6ed11830a9 ("crypto: arm64/aes-modes - get rid of literal
load of addend vector")
to linux-4.19.y. It first landed in v4.20-rc1 and cherry picks
cleanly. It will allow us to use LLVM_IAS=1 for ARCH=arm64 on
linux-4.19.y for Android and CrOS.
--
Thanks,
~Nick Desaulniers
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 567899f1eda3 - ath9k_htc: Use appropriate rs_datalen type
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
ppc64le:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Ethernet drivers sanity
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ xfstests - ext4
🚧 ⚡⚡⚡ xfstests - xfs
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 5:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ xfstests - ext4
🚧 ⚡⚡⚡ xfstests - xfs
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
x86_64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ✅ Storage block - queue scheduler test
🚧 ✅ Storage nvme - tcp
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
From: Wengang Wang <wen.gang.wang(a)oracle.com>
Subject: ocfs2: initialize ip_next_orphan
Though problem if found on a lower 4.1.12 kernel, I think upstream has
same issue.
In one node in the cluster, there is the following callback trace:
# cat /proc/21473/stack
[<ffffffffc09a2f06>] __ocfs2_cluster_lock.isra.36+0x336/0x9e0 [ocfs2]
[<ffffffffc09a4481>] ocfs2_inode_lock_full_nested+0x121/0x520 [ocfs2]
[<ffffffffc09b2ce2>] ocfs2_evict_inode+0x152/0x820 [ocfs2]
[<ffffffff8122b36e>] evict+0xae/0x1a0
[<ffffffff8122bd26>] iput+0x1c6/0x230
[<ffffffffc09b60ed>] ocfs2_orphan_filldir+0x5d/0x100 [ocfs2]
[<ffffffffc0992ae0>] ocfs2_dir_foreach_blk+0x490/0x4f0 [ocfs2]
[<ffffffffc099a1e9>] ocfs2_dir_foreach+0x29/0x30 [ocfs2]
[<ffffffffc09b7716>] ocfs2_recover_orphans+0x1b6/0x9a0 [ocfs2]
[<ffffffffc09b9b4e>] ocfs2_complete_recovery+0x1de/0x5c0 [ocfs2]
[<ffffffff810a1399>] process_one_work+0x169/0x4a0
[<ffffffff810a1bcb>] worker_thread+0x5b/0x560
[<ffffffff810a7a2b>] kthread+0xcb/0xf0
[<ffffffff816f5d21>] ret_from_fork+0x61/0x90
[<ffffffffffffffff>] 0xffffffffffffffff
The above stack is not reasonable, the final iput shouldn't happen in
ocfs2_orphan_filldir() function. Looking at the code,
2067 /* Skip inodes which are already added to recover list, since dio may
2068 * happen concurrently with unlink/rename */
2069 if (OCFS2_I(iter)->ip_next_orphan) {
2070 iput(iter);
2071 return 0;
2072 }
2073
The logic thinks the inode is already in recover list on seeing
ip_next_orphan is non-NULL, so it skip this inode after dropping a
reference which incremented in ocfs2_iget().
While, if the inode is already in recover list, it should have another
reference and the iput() at line 2070 should not be the final iput
(dropping the last reference). So I don't think the inode is really in
the recover list (no vmcore to confirm).
Note that ocfs2_queue_orphans(), though not shown up in the call back
trace, is holding cluster lock on the orphan directory when looking up for
unlinked inodes. The on disk inode eviction could involve a lot of IOs
which may need long time to finish. That means this node could hold the
cluster lock for very long time, that can lead to the lock requests (from
other nodes) to the orhpan directory hang for long time.
Looking at more on ip_next_orphan, I found it's not initialized when
allocating a new ocfs2_inode_info structure.
This causes te reflink operations from some nodes hang for very long
time waiting for the cluster lock on the orphan directory.
Fix: initialize ip_next_orphan as NULL.
Link: https://lkml.kernel.org/r/20201109171746.27884-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/super.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/ocfs2/super.c~ocfs2-initialize-ip_next_orphan
+++ a/fs/ocfs2/super.c
@@ -1713,6 +1713,7 @@ static void ocfs2_inode_init_once(void *
oi->ip_blkno = 0ULL;
oi->ip_clusters = 0;
+ oi->ip_next_orphan = NULL;
ocfs2_resv_init_once(&oi->ip_la_data_resv);
_
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: fix anon huge page migration race
Qian Cai reported the following BUG in [1]
[ 6147.019063][T45242] LTP: starting move_pages12
[ 6147.475680][T64921] BUG: unable to handle page fault for address: ffffffffffffffe0
...
[ 6147.525866][T64921] RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170
avc_start_pgoff at mm/interval_tree.c:63
[ 6147.620914][T64921] Call Trace:
[ 6147.624078][T64921] rmap_walk_anon+0x141/0xa30
rmap_walk_anon at mm/rmap.c:1864
[ 6147.628639][T64921] try_to_unmap+0x209/0x2d0
try_to_unmap at mm/rmap.c:1763
[ 6147.633026][T64921] ? rmap_walk_locked+0x140/0x140
[ 6147.637936][T64921] ? page_remove_rmap+0x1190/0x1190
[ 6147.643020][T64921] ? page_not_mapped+0x10/0x10
[ 6147.647668][T64921] ? page_get_anon_vma+0x290/0x290
[ 6147.652664][T64921] ? page_mapcount_is_zero+0x10/0x10
[ 6147.657838][T64921] ? hugetlb_page_mapping_lock_write+0x97/0x180
[ 6147.663972][T64921] migrate_pages+0x1005/0x1fb0
[ 6147.668617][T64921] ? remove_migration_pte+0xac0/0xac0
[ 6147.673875][T64921] move_pages_and_store_status.isra.47+0xd7/0x1a0
[ 6147.680181][T64921] ? migrate_pages+0x1fb0/0x1fb0
[ 6147.685002][T64921] __x64_sys_move_pages+0xa5c/0x1100
[ 6147.690176][T64921] ? trace_hardirqs_on+0x20/0x1b5
[ 6147.695084][T64921] ? move_pages_and_store_status.isra.47+0x1a0/0x1a0
[ 6147.701653][T64921] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 6147.707088][T64921] ? switch_fpu_return+0x196/0x400
[ 6147.712083][T64921] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 6147.717954][T64921] ? do_syscall_64+0x24/0x310
[ 6147.722513][T64921] do_syscall_64+0x5f/0x310
[ 6147.726897][T64921] ? trace_hardirqs_off+0x12/0x1a0
[ 6147.731894][T64921] ? asm_exc_page_fault+0x8/0x30
[ 6147.736714][T64921] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Hugh Dickens diagnosed this as a migration bug caused by code introduced
to use i_mmap_rwsem for pmd sharing synchronization. Specifically, the
routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED
flag to try_to_unmap() while holding i_mmap_rwsem. This is wrong for
anon pages as the anon_vma_lock should be held in this case. Further
analysis suggested that i_mmap_rwsem was not required to he held at all
when calling try_to_unmap for anon pages as an anon page could never be
part of a shared pmd mapping.
Discussion also revealed that the hack in hugetlb_page_mapping_lock_write
to drop page lock and acquire i_mmap_rwsem is wrong. There is no way to
keep mapping valid while dropping page lock.
This patch does the following:
- Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when
calling try_to_unmap.
- Remove the hacky code in hugetlb_page_mapping_lock_write. The routine
will now simply do a 'trylock' while still holding the page lock. If
the trylock fails, it will return NULL. This could impact the callers:
- migration calling code will receive -EAGAIN and retry up to the hard
coded limit (10).
- memory error code will treat the page as BUSY. This will force
killing (SIGKILL) instead of SIGBUS any mapping tasks.
Do note that this change in behavior only happens when there is a race.
None of the standard kernel testing suites actually hit this race, but
it is possible.
[1] https://lore.kernel.org/lkml/20200708012044.GC992@lca.pw/
[2] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.a…
Link: https://lkml.kernel.org/r/20201105195058.78401-1-mike.kravetz@oracle.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Qian Cai <cai(a)lca.pw>
Suggested-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 90 ++----------------------------------------
mm/memory-failure.c | 36 +++++++---------
mm/migrate.c | 46 +++++++++++----------
mm/rmap.c | 5 --
4 files changed, 48 insertions(+), 129 deletions(-)
--- a/mm/hugetlb.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/hugetlb.c
@@ -1568,103 +1568,23 @@ int PageHeadHuge(struct page *page_head)
}
/*
- * Find address_space associated with hugetlbfs page.
- * Upon entry page is locked and page 'was' mapped although mapped state
- * could change. If necessary, use anon_vma to find vma and associated
- * address space. The returned mapping may be stale, but it can not be
- * invalid as page lock (which is held) is required to destroy mapping.
- */
-static struct address_space *_get_hugetlb_page_mapping(struct page *hpage)
-{
- struct anon_vma *anon_vma;
- pgoff_t pgoff_start, pgoff_end;
- struct anon_vma_chain *avc;
- struct address_space *mapping = page_mapping(hpage);
-
- /* Simple file based mapping */
- if (mapping)
- return mapping;
-
- /*
- * Even anonymous hugetlbfs mappings are associated with an
- * underlying hugetlbfs file (see hugetlb_file_setup in mmap
- * code). Find a vma associated with the anonymous vma, and
- * use the file pointer to get address_space.
- */
- anon_vma = page_lock_anon_vma_read(hpage);
- if (!anon_vma)
- return mapping; /* NULL */
-
- /* Use first found vma */
- pgoff_start = page_to_pgoff(hpage);
- pgoff_end = pgoff_start + pages_per_huge_page(page_hstate(hpage)) - 1;
- anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root,
- pgoff_start, pgoff_end) {
- struct vm_area_struct *vma = avc->vma;
-
- mapping = vma->vm_file->f_mapping;
- break;
- }
-
- anon_vma_unlock_read(anon_vma);
- return mapping;
-}
-
-/*
* Find and lock address space (mapping) in write mode.
*
- * Upon entry, the page is locked which allows us to find the mapping
- * even in the case of an anon page. However, locking order dictates
- * the i_mmap_rwsem be acquired BEFORE the page lock. This is hugetlbfs
- * specific. So, we first try to lock the sema while still holding the
- * page lock. If this works, great! If not, then we need to drop the
- * page lock and then acquire i_mmap_rwsem and reacquire page lock. Of
- * course, need to revalidate state along the way.
+ * Upon entry, the page is locked which means that page_mapping() is
+ * stable. Due to locking order, we can only trylock_write. If we can
+ * not get the lock, simply return NULL to caller.
*/
struct address_space *hugetlb_page_mapping_lock_write(struct page *hpage)
{
- struct address_space *mapping, *mapping2;
+ struct address_space *mapping = page_mapping(hpage);
- mapping = _get_hugetlb_page_mapping(hpage);
-retry:
if (!mapping)
return mapping;
- /*
- * If no contention, take lock and return
- */
if (i_mmap_trylock_write(mapping))
return mapping;
- /*
- * Must drop page lock and wait on mapping sema.
- * Note: Once page lock is dropped, mapping could become invalid.
- * As a hack, increase map count until we lock page again.
- */
- atomic_inc(&hpage->_mapcount);
- unlock_page(hpage);
- i_mmap_lock_write(mapping);
- lock_page(hpage);
- atomic_add_negative(-1, &hpage->_mapcount);
-
- /* verify page is still mapped */
- if (!page_mapped(hpage)) {
- i_mmap_unlock_write(mapping);
- return NULL;
- }
-
- /*
- * Get address space again and verify it is the same one
- * we locked. If not, drop lock and retry.
- */
- mapping2 = _get_hugetlb_page_mapping(hpage);
- if (mapping2 != mapping) {
- i_mmap_unlock_write(mapping);
- mapping = mapping2;
- goto retry;
- }
-
- return mapping;
+ return NULL;
}
pgoff_t __basepage_index(struct page *page)
--- a/mm/memory-failure.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/memory-failure.c
@@ -1057,27 +1057,25 @@ static bool hwpoison_user_mappings(struc
if (!PageHuge(hpage)) {
unmap_success = try_to_unmap(hpage, ttu);
} else {
- /*
- * For hugetlb pages, try_to_unmap could potentially call
- * huge_pmd_unshare. Because of this, take semaphore in
- * write mode here and set TTU_RMAP_LOCKED to indicate we
- * have taken the lock at this higer level.
- *
- * Note that the call to hugetlb_page_mapping_lock_write
- * is necessary even if mapping is already set. It handles
- * ugliness of potentially having to drop page lock to obtain
- * i_mmap_rwsem.
- */
- mapping = hugetlb_page_mapping_lock_write(hpage);
-
- if (mapping) {
- unmap_success = try_to_unmap(hpage,
+ if (!PageAnon(hpage)) {
+ /*
+ * For hugetlb pages in shared mappings, try_to_unmap
+ * could potentially call huge_pmd_unshare. Because of
+ * this, take semaphore in write mode here and set
+ * TTU_RMAP_LOCKED to indicate we have taken the lock
+ * at this higer level.
+ */
+ mapping = hugetlb_page_mapping_lock_write(hpage);
+ if (mapping) {
+ unmap_success = try_to_unmap(hpage,
ttu|TTU_RMAP_LOCKED);
- i_mmap_unlock_write(mapping);
+ i_mmap_unlock_write(mapping);
+ } else {
+ pr_info("Memory failure: %#lx: could not lock mapping for mapped huge page\n", pfn);
+ unmap_success = false;
+ }
} else {
- pr_info("Memory failure: %#lx: could not find mapping for mapped huge page\n",
- pfn);
- unmap_success = false;
+ unmap_success = try_to_unmap(hpage, ttu);
}
}
if (!unmap_success)
--- a/mm/migrate.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/migrate.c
@@ -1328,34 +1328,38 @@ static int unmap_and_move_huge_page(new_
goto put_anon;
if (page_mapped(hpage)) {
- /*
- * try_to_unmap could potentially call huge_pmd_unshare.
- * Because of this, take semaphore in write mode here and
- * set TTU_RMAP_LOCKED to let lower levels know we have
- * taken the lock.
- */
- mapping = hugetlb_page_mapping_lock_write(hpage);
- if (unlikely(!mapping))
- goto unlock_put_anon;
-
- try_to_unmap(hpage,
- TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS|
- TTU_RMAP_LOCKED);
+ bool mapping_locked = false;
+ enum ttu_flags ttu = TTU_MIGRATION|TTU_IGNORE_MLOCK|
+ TTU_IGNORE_ACCESS;
+
+ if (!PageAnon(hpage)) {
+ /*
+ * In shared mappings, try_to_unmap could potentially
+ * call huge_pmd_unshare. Because of this, take
+ * semaphore in write mode here and set TTU_RMAP_LOCKED
+ * to let lower levels know we have taken the lock.
+ */
+ mapping = hugetlb_page_mapping_lock_write(hpage);
+ if (unlikely(!mapping))
+ goto unlock_put_anon;
+
+ mapping_locked = true;
+ ttu |= TTU_RMAP_LOCKED;
+ }
+
+ try_to_unmap(hpage, ttu);
page_was_mapped = 1;
- /*
- * Leave mapping locked until after subsequent call to
- * remove_migration_ptes()
- */
+
+ if (mapping_locked)
+ i_mmap_unlock_write(mapping);
}
if (!page_mapped(hpage))
rc = move_to_new_page(new_hpage, hpage, mode);
- if (page_was_mapped) {
+ if (page_was_mapped)
remove_migration_ptes(hpage,
- rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, true);
- i_mmap_unlock_write(mapping);
- }
+ rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, false);
unlock_put_anon:
unlock_page(new_hpage);
--- a/mm/rmap.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/rmap.c
@@ -1413,9 +1413,6 @@ static bool try_to_unmap_one(struct page
/*
* If sharing is possible, start and end will be adjusted
* accordingly.
- *
- * If called for a huge page, caller must hold i_mmap_rwsem
- * in write mode as it is possible to call huge_pmd_unshare.
*/
adjust_range_if_pmd_sharing_possible(vma, &range.start,
&range.end);
@@ -1462,7 +1459,7 @@ static bool try_to_unmap_one(struct page
subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
address = pvmw.address;
- if (PageHuge(page)) {
+ if (PageHuge(page) && !PageAnon(page)) {
/*
* To call huge_pmd_unshare, i_mmap_rwsem must be
* held in write mode. Caller needs to explicitly
_
From: Matteo Croce <mcroce(a)microsoft.com>
Subject: Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Patch series "fix parsing of reboot= cmdline", v3.
The parsing of the reboot= cmdline has two major errors:
- a missing bound check can crash the system on reboot
- parsing of the cpu number only works if specified last
Fix both.
This patch (of 2):
This reverts commit 616feab753972b97.
kstrtoint() and simple_strtoul() have a subtle difference which makes them
non interchangeable: if a non digit character is found amid the parsing,
the former will return an error, while the latter will just stop parsing,
e.g. simple_strtoul("123xyx") = 123.
The kernel cmdline reboot= argument allows to specify the CPU used for
rebooting, with the syntax `s####` among the other flags, e.g.
"reboot=warm,s31,force", so if this flag is not the last given, it's
silently ignored as well as the subsequent ones.
Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com
Fixes: 616feab75397 ("kernel/reboot.c: convert simple_strtoul to kstrtoint")
Signed-off-by: Matteo Croce <mcroce(a)microsoft.com>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Cc: Petr Mladek <pmladek(a)suse.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Robin Holt <robinmholt(a)gmail.com>
Cc: Fabian Frederick <fabf(a)skynet.be>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/reboot.c | 21 +++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)
--- a/kernel/reboot.c~revert-kernel-rebootc-convert-simple_strtoul-to-kstrtoint
+++ a/kernel/reboot.c
@@ -551,22 +551,15 @@ static int __init reboot_setup(char *str
break;
case 's':
- {
- int rc;
-
- if (isdigit(*(str+1))) {
- rc = kstrtoint(str+1, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else if (str[1] == 'm' && str[2] == 'p' &&
- isdigit(*(str+3))) {
- rc = kstrtoint(str+3, 0, &reboot_cpu);
- if (rc)
- return rc;
- } else
+ if (isdigit(*(str+1)))
+ reboot_cpu = simple_strtoul(str+1, NULL, 0);
+ else if (str[1] == 'm' && str[2] == 'p' &&
+ isdigit(*(str+3)))
+ reboot_cpu = simple_strtoul(str+3, NULL, 0);
+ else
*mode = REBOOT_SOFT;
break;
- }
+
case 'g':
*mode = REBOOT_GPIO;
break;
_
From: Arvind Sankar <nivedita(a)alum.mit.edu>
Subject: compiler.h: fix barrier_data() on clang
Commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") neglected to copy barrier_data() from compiler-gcc.h
into compiler-clang.h. The definition in compiler-gcc.h was really to
work around clang's more aggressive optimization, so this broke
barrier_data() on clang, and consequently memzero_explicit() as well.
For example, this results in at least the memzero_explicit() call in
lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
Fix this by moving the definition of barrier_data() into compiler.h.
Also move the gcc/clang definition of barrier() into compiler.h,
__memory_barrier() is icc-specific (and barrier() is already defined using
it in compiler-intel.h) and doesn't belong in compiler.h.
[rdunlap(a)infradead.org: fix ALPHA builds when SMP is not enabled]
Link: https://lkml.kernel.org/r/20201101231835.4589-1-rdunlap@infradead.org
Link: https://lkml.kernel.org/r/20201014212631.207844-1-nivedita@alum.mit.edu
Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
Signed-off-by: Arvind Sankar <nivedita(a)alum.mit.edu>
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Reviewed-by: Nick Desaulniers <ndesaulniers(a)google.com>
Tested-by: Nick Desaulniers <ndesaulniers(a)google.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/asm-generic/barrier.h | 1 +
include/linux/compiler-clang.h | 6 ------
include/linux/compiler-gcc.h | 19 -------------------
include/linux/compiler.h | 18 ++++++++++++++++--
4 files changed, 17 insertions(+), 27 deletions(-)
--- a/include/asm-generic/barrier.h~compilerh-fix-barrier_data-on-clang
+++ a/include/asm-generic/barrier.h
@@ -13,6 +13,7 @@
#ifndef __ASSEMBLY__
+#include <linux/compiler.h>
#include <asm/rwonce.h>
#ifndef nop
--- a/include/linux/compiler-clang.h~compilerh-fix-barrier_data-on-clang
+++ a/include/linux/compiler-clang.h
@@ -60,12 +60,6 @@
#define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
#endif
-/* The following are for compatibility with GCC, from compiler-gcc.h,
- * and may be redefined here because they should not be shared with other
- * compilers, like ICC.
- */
-#define barrier() __asm__ __volatile__("" : : : "memory")
-
#if __has_feature(shadow_call_stack)
# define __noscs __attribute__((__no_sanitize__("shadow-call-stack")))
#endif
--- a/include/linux/compiler-gcc.h~compilerh-fix-barrier_data-on-clang
+++ a/include/linux/compiler-gcc.h
@@ -15,25 +15,6 @@
# error Sorry, your version of GCC is too old - please use 4.9 or newer.
#endif
-/* Optimization barrier */
-
-/* The "volatile" is due to gcc bugs */
-#define barrier() __asm__ __volatile__("": : :"memory")
-/*
- * This version is i.e. to prevent dead stores elimination on @ptr
- * where gcc and llvm may behave differently when otherwise using
- * normal barrier(): while gcc behavior gets along with a normal
- * barrier(), llvm needs an explicit input variable to be assumed
- * clobbered. The issue is as follows: while the inline asm might
- * access any memory it wants, the compiler could have fit all of
- * @ptr into memory registers instead, and since @ptr never escaped
- * from that, it proved that the inline asm wasn't touching any of
- * it. This version works well with both compilers, i.e. we're telling
- * the compiler that the inline asm absolutely may see the contents
- * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
- */
-#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
-
/*
* This macro obfuscates arithmetic on a variable address so that gcc
* shouldn't recognize the original var, and make assumptions about it.
--- a/include/linux/compiler.h~compilerh-fix-barrier_data-on-clang
+++ a/include/linux/compiler.h
@@ -80,11 +80,25 @@ void ftrace_likely_update(struct ftrace_
/* Optimization barrier */
#ifndef barrier
-# define barrier() __memory_barrier()
+/* The "volatile" is due to gcc bugs */
+# define barrier() __asm__ __volatile__("": : :"memory")
#endif
#ifndef barrier_data
-# define barrier_data(ptr) barrier()
+/*
+ * This version is i.e. to prevent dead stores elimination on @ptr
+ * where gcc and llvm may behave differently when otherwise using
+ * normal barrier(): while gcc behavior gets along with a normal
+ * barrier(), llvm needs an explicit input variable to be assumed
+ * clobbered. The issue is as follows: while the inline asm might
+ * access any memory it wants, the compiler could have fit all of
+ * @ptr into memory registers instead, and since @ptr never escaped
+ * from that, it proved that the inline asm wasn't touching any of
+ * it. This version works well with both compilers, i.e. we're telling
+ * the compiler that the inline asm absolutely may see the contents
+ * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
+ */
+# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
#endif
/* workaround for GCC PR82365 if needed */
_
From: Jason Gunthorpe <jgg(a)nvidia.com>
Subject: mm/gup: use unpin_user_pages() in __gup_longterm_locked()
When FOLL_PIN is passed to __get_user_pages() the page list must be put
back using unpin_user_pages() otherwise the page pin reference persists in
a corrupted state.
There are two places in the unwind of __gup_longterm_locked() that put the
pages back without checking. Normally on error this function would return
the partial page list making this the caller's responsibility, but in
these two cases the caller is not allowed to see these pages at all.
Link: https://lkml.kernel.org/r/0-v2-3ae7d9d162e2+2a7-gup_cma_fix_jgg@nvidia.com
Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reported-by: Ira Weiny <ira.weiny(a)intel.com>
Reviewed-by: Ira Weiny <ira.weiny(a)intel.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
--- a/mm/gup.c~mm-gup-use-unpin_user_pages-in-__gup_longterm_locked
+++ a/mm/gup.c
@@ -1647,8 +1647,11 @@ check_again:
/*
* drop the above get_user_pages reference.
*/
- for (i = 0; i < nr_pages; i++)
- put_page(pages[i]);
+ if (gup_flags & FOLL_PIN)
+ unpin_user_pages(pages, nr_pages);
+ else
+ for (i = 0; i < nr_pages; i++)
+ put_page(pages[i]);
if (migrate_pages(&cma_page_list, alloc_migration_target, NULL,
(unsigned long)&mtc, MIGRATE_SYNC, MR_CONTIG_RANGE)) {
@@ -1728,8 +1731,11 @@ static long __gup_longterm_locked(struct
goto out;
if (check_dax_vmas(vmas_tmp, rc)) {
- for (i = 0; i < rc; i++)
- put_page(pages[i]);
+ if (gup_flags & FOLL_PIN)
+ unpin_user_pages(pages, rc);
+ else
+ for (i = 0; i < rc; i++)
+ put_page(pages[i]);
rc = -EOPNOTSUPP;
goto out;
}
_
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Subject: mm/slub: fix panic in slab_alloc_node()
While doing memory hot-unplug operation on a PowerPC VM running 1024 CPUs
with 11TB of ram, I hit the following panic:
BUG: Kernel NULL pointer dereference on read at 0x00000007
Faulting instruction address: 0xc000000000456048
Oops: Kernel access of bad area, sig: 11 [#2]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS= 2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp
CPU: 160 PID: 1 Comm: systemd Tainted: G D 5.9.0 #1
NIP: c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350
REGS: c00006028d1b77a0 TRAP: 0300 Tainted: G D (5.9.0)
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24004228 XER: 00000000
CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0
GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000
GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320
GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000
GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a
GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1
GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8
GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000
GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001
NIP [c000000000456048] __kmalloc_node+0x108/0x790
LR [c000000000455fd4] __kmalloc_node+0x94/0x790
Call Trace:
[c00006028d1b7a30] [c00007c51af92000] 0xc00007c51af92000 (unreliable)
[c00006028d1b7aa0] [c0000000003c0ff8] kvmalloc_node+0x58/0x110
[c00006028d1b7ae0] [c00000000047b45c] mem_cgroup_css_online+0x10c/0x270
[c00006028d1b7b30] [c000000000241fd8] online_css+0x48/0xd0
[c00006028d1b7b60] [c00000000024af14] cgroup_apply_control_enable+0x2c4/0x470
[c00006028d1b7c40] [c00000000024e838] cgroup_mkdir+0x408/0x5f0
[c00006028d1b7cb0] [c0000000005a4ef0] kernfs_iop_mkdir+0x90/0x100
[c00006028d1b7cf0] [c0000000004b8168] vfs_mkdir+0x138/0x250
[c00006028d1b7d40] [c0000000004baf04] do_mkdirat+0x154/0x1c0
[c00006028d1b7dc0] [c000000000032b38] system_call_exception+0xf8/0x200
[c00006028d1b7e20] [c00000000000c740] system_call_common+0xf0/0x27c
Instruction dump:
e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a
2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78
This pointing to the following code:
mm/slub.c:2851
if (unlikely(!object || !node_match(page, node))) {
c000000000456038: 00 00 bc 2f cmpdi cr7,r28,0
c00000000045603c: 18 00 9e 41 beq cr7,c000000000456054 <__kmalloc_node+0x114>
node_match():
mm/slub.c:2491
if (node != NUMA_NO_NODE && page_to_nid(page) != node)
c000000000456040: 30 02 92 41 beq cr4,c000000000456270 <__kmalloc_node+0x330>
page_to_nid():
include/linux/mm.h:1294
c000000000456044: 10 00 27 e9 ld r9,16(r7)
c000000000456048: 07 00 29 89 lbz r9,7(r9) <<<< r9 = NULL
node_match():
mm/slub.c:2491
c00000000045604c: 00 48 99 7f cmpw cr7,r25,r9
c000000000456050: 20 02 9e 41 beq cr7,c000000000456270 <__kmalloc_node+0x330>
The panic occurred in slab_alloc_node() when checking for the page's node:
object = c->freelist;
page = c->page;
if (unlikely(!object || !node_match(page, node))) {
object = __slab_alloc(s, gfpflags, node, addr, c);
stat(s, ALLOC_SLOWPATH);
The issue is that object is not NULL while page is NULL which is odd but
may happen if the cache flush happened after loading object but before
loading page. Thus checking for the page pointer is required too.
The cache flush is done through an inter processor interrupt when a piece
of memory is off-lined. That interrupt is triggered when a memory
hot-unplug operation is initiated and offline_pages() is calling the
slub's MEM_GOING_OFFLINE callback slab_mem_going_offline_callback() which
is calling flush_cpu_slab(). If that interrupt is caught between the
reading of c->freelist and the reading of c->page, this could lead to such
a situation. That situation is expected and the later call to
this_cpu_cmpxchg_double() will detect the change to c->freelist and redo
the whole operation.
In commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in
node_match()") check on the page pointer has been removed assuming that
page is always valid when it is called. It happens that this is not true
in that particular case, so check for page before calling node_match()
here.
Link: https://lkml.kernel.org/r/20201027190406.33283-1-ldufour@linux.ibm.com
Fixes: 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()")
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Christoph Lameter <cl(a)linux.com>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/slub.c~mm-slub-fix-panic-in-slab_alloc_node
+++ a/mm/slub.c
@@ -2852,7 +2852,7 @@ redo:
object = c->freelist;
page = c->page;
- if (unlikely(!object || !node_match(page, node))) {
+ if (unlikely(!object || !page || !node_match(page, node))) {
object = __slab_alloc(s, gfpflags, node, addr, c);
} else {
void *next_object = get_freepointer_safe(s, object);
_
From: Zi Yan <ziy(a)nvidia.com>
Subject: mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate
In isolate_migratepages_block, if we have too many isolated pages and
nr_migratepages is not zero, we should try to migrate what we have without
wasting time on isolating.
In theory it's possible that multiple parallel compactions will cause
too_many_isolated() to become true even if each has isolated less than
COMPACT_CLUSTER_MAX, and loop forever in the while loop. Bailing
immediately prevents that.
[vbabka(a)suse.cz: changelog addition]
Link: https://lkml.kernel.org/r/20201030183809.3616803-2-zi.yan@sent.com
Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
Suggested-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/compaction.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/mm/compaction.c~mm-compaction-stop-isolation-if-too-many-pages-are-isolated-and-we-have-pages-to-migrate
+++ a/mm/compaction.c
@@ -817,6 +817,10 @@ isolate_migratepages_block(struct compac
* delay for some time until fewer pages are isolated
*/
while (unlikely(too_many_isolated(pgdat))) {
+ /* stop isolation if there are still pages not migrated */
+ if (cc->nr_migratepages)
+ return 0;
+
/* async migration should just abort */
if (cc->mode == MIGRATE_ASYNC)
return 0;
_
From: Zi Yan <ziy(a)nvidia.com>
Subject: mm/compaction: count pages and stop correctly during page isolation
In isolate_migratepages_block, when cc->alloc_contig is true, we are able
to isolate compound pages, nr_migratepages and nr_isolated did not count
compound pages correctly, causing us to isolate more pages than we
thought. Count compound pages as the number of base pages they contain.
Otherwise, we might be trapped in too_many_isolated while loop, since the
actual isolated pages can go up to COMPACT_CLUSTER_MAX*512=16384, where
COMPACT_CLUSTER_MAX is 32, since we stop isolation after
cc->nr_migratepages reaches to COMPACT_CLUSTER_MAX.
In addition, after we fix the issue above, cc->nr_migratepages could never
be equal to COMPACT_CLUSTER_MAX if compound pages are isolated, thus page
isolation could not stop as we intended. Change the isolation stop
condition to >=.
The issue can be triggered as follows: In a system with 16GB memory and an
8GB CMA region reserved by hugetlb_cma, if we first allocate 10GB THPs and
mlock them (so some THPs are allocated in the CMA region and mlocked),
reserving 6 1GB hugetlb pages via
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages will get stuck
(looping in too_many_isolated function) until we kill either task. With
the patch applied, oom will kill the application with 10GB THPs and let
hugetlb page reservation finish.
[ziy(a)nvidia.com: v3]
Link: https://lkml.kernel.org/r/20201030183809.3616803-1-zi.yan@sent.com
Link: https://lkml.kernel.org/r/20201029200435.3386066-1-zi.yan@sent.com
Fixes: 1da2f328fa64 ("cmm,thp,compaction,cma: allow THP migration for CMA allocations")
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/compaction.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/mm/compaction.c~mm-compaction-count-pages-and-stop-correctly-during-page-isolation
+++ a/mm/compaction.c
@@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compac
isolate_success:
list_add(&page->lru, &cc->migratepages);
- cc->nr_migratepages++;
- nr_isolated++;
+ cc->nr_migratepages += compound_nr(page);
+ nr_isolated += compound_nr(page);
/*
* Avoid isolating too much unless this block is being
@@ -1021,7 +1021,7 @@ isolate_success:
* or a lock is contended. For contention, isolate quickly to
* potentially remove one source of contention.
*/
- if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
+ if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
!cc->rescan && !cc->contended) {
++low_pfn;
break;
@@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compac
if (!pfn)
break;
- if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
+ if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
break;
}
_
Hi Greg, Sasha,
Please consider the attached backport of
f91072ed1b72 ("perf/core: Fix race in the perf_mmap_close() function")
for v4.4.y and v4.9.y.
I have sent separate backport for v4.14.y, v4.19.y and v5.4.y.
--
Regards
Sudip
This is a note to let you know that I've just added the patch titled
usb: cdc-acm: Add DISABLE_ECHO for Renesas USB Download mode
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 6d853c9e4104b4fc8d55dc9cd3b99712aa347174 Mon Sep 17 00:00:00 2001
From: Chris Brandt <chris.brandt(a)renesas.com>
Date: Wed, 11 Nov 2020 08:12:09 -0500
Subject: usb: cdc-acm: Add DISABLE_ECHO for Renesas USB Download mode
Renesas R-Car and RZ/G SoCs have a firmware download mode over USB.
However, on reset a banner string is transmitted out which is not expected
to be echoed back and will corrupt the protocol.
Cc: stable <stable(a)vger.kernel.org>
Acked-by: Oliver Neukum <oneukum(a)suse.com>
Signed-off-by: Chris Brandt <chris.brandt(a)renesas.com>
Link: https://lore.kernel.org/r/20201111131209.3977903-1-chris.brandt@renesas.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/class/cdc-acm.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index 1e7568867910..f52f1bc0559f 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -1693,6 +1693,15 @@ static const struct usb_device_id acm_ids[] = {
{ USB_DEVICE(0x0870, 0x0001), /* Metricom GS Modem */
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
},
+ { USB_DEVICE(0x045b, 0x023c), /* Renesas USB Download mode */
+ .driver_info = DISABLE_ECHO, /* Don't echo banner */
+ },
+ { USB_DEVICE(0x045b, 0x0248), /* Renesas USB Download mode */
+ .driver_info = DISABLE_ECHO, /* Don't echo banner */
+ },
+ { USB_DEVICE(0x045b, 0x024D), /* Renesas USB Download mode */
+ .driver_info = DISABLE_ECHO, /* Don't echo banner */
+ },
{ USB_DEVICE(0x0e8d, 0x0003), /* FIREFLY, MediaTek Inc; andrey.arapov(a)gmail.com */
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
},
--
2.29.2
This is a note to let you know that I've just added the patch titled
xhci: hisilicon: fix refercence leak in xhci_histb_probe
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 76255470ffa2795a44032e8b3c1ced11d81aa2db Mon Sep 17 00:00:00 2001
From: Zhang Qilong <zhangqilong3(a)huawei.com>
Date: Fri, 6 Nov 2020 20:22:21 +0800
Subject: xhci: hisilicon: fix refercence leak in xhci_histb_probe
pm_runtime_get_sync() will increment pm usage at first and it
will resume the device later. We should decrease the usage count
whetever it succeeded or failed(maybe runtime of the device has
error, or device is in inaccessible state, or other error state).
If we do not call put operation to decrease the reference, it will
result in reference leak in xhci_histb_probe. Moreover, this
device cannot enter the idle state and always stay busy or other
non-idle state later. So we fixed it by jumping to error handling
branch.
Fixes: c508f41da0788 ("xhci: hisilicon: support HiSilicon STB xHCI host controller")
Signed-off-by: Zhang Qilong <zhangqilong3(a)huawei.com>
Link: https://lore.kernel.org/r/20201106122221.2304528-1-zhangqilong3@huawei.com
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci-histb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-histb.c b/drivers/usb/host/xhci-histb.c
index 5546e7e013a8..08369857686e 100644
--- a/drivers/usb/host/xhci-histb.c
+++ b/drivers/usb/host/xhci-histb.c
@@ -240,7 +240,7 @@ static int xhci_histb_probe(struct platform_device *pdev)
/* Initialize dma_mask and coherent_dma_mask to 32-bits */
ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
if (ret)
- return ret;
+ goto disable_pm;
hcd = usb_create_hcd(driver, dev, dev_name(dev));
if (!hcd) {
--
2.29.2
This is a note to let you know that I've just added the patch titled
Revert "usb: musb: convert to devm_platform_ioremap_resource_byname"
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From ffa13d2d94029882eca22a565551783787f121e5 Mon Sep 17 00:00:00 2001
From: Geert Uytterhoeven <geert+renesas(a)glider.be>
Date: Thu, 12 Nov 2020 14:59:00 +0100
Subject: Revert "usb: musb: convert to devm_platform_ioremap_resource_byname"
This reverts commit 2d30e408a2a6b3443d3232593e3d472584a3e9f8.
On Beaglebone Black, where each interface has 2 children:
musb-dsps 47401c00.usb: can't request region for resource [mem 0x47401800-0x474019ff]
musb-hdrc musb-hdrc.1: musb_init_controller failed with status -16
musb-hdrc: probe of musb-hdrc.1 failed with error -16
musb-dsps 47401400.usb: can't request region for resource [mem 0x47401000-0x474011ff]
musb-hdrc musb-hdrc.0: musb_init_controller failed with status -16
musb-hdrc: probe of musb-hdrc.0 failed with error -16
Before, devm_ioremap_resource() was called on "dev" ("musb-hdrc.0" or
"musb-hdrc.1"), after it is called on "&pdev->dev" ("47401400.usb" or
"47401c00.usb"), leading to a duplicate region request, which fails.
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Fixes: 2d30e408a2a6 ("usb: musb: convert to devm_platform_ioremap_resource_byname")
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20201112135900.3822599-1-geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/musb/musb_dsps.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/musb/musb_dsps.c b/drivers/usb/musb/musb_dsps.c
index 30085b2be7b9..5892f3ce0cdc 100644
--- a/drivers/usb/musb/musb_dsps.c
+++ b/drivers/usb/musb/musb_dsps.c
@@ -429,10 +429,12 @@ static int dsps_musb_init(struct musb *musb)
struct platform_device *parent = to_platform_device(dev->parent);
const struct dsps_musb_wrapper *wrp = glue->wrp;
void __iomem *reg_base;
+ struct resource *r;
u32 rev, val;
int ret;
- reg_base = devm_platform_ioremap_resource_byname(parent, "control");
+ r = platform_get_resource_byname(parent, IORESOURCE_MEM, "control");
+ reg_base = devm_ioremap_resource(dev, r);
if (IS_ERR(reg_base))
return PTR_ERR(reg_base);
musb->ctrl_base = reg_base;
--
2.29.2
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 058b3316514f - KVM: arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ xfstests - ext4
🚧 ⚡⚡⚡ xfstests - xfs
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
ppc64le:
Host 1:
✅ Boot test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ xfstests - ext4
🚧 ⚡⚡⚡ xfstests - xfs
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ xfstests - ext4
🚧 ⚡⚡⚡ xfstests - xfs
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 5:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ✅ Storage block - queue scheduler test
🚧 ✅ Storage nvme - tcp
s390x:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ Storage blktests
🚧 💥 Storage nvme - tcp
Host 3:
✅ Boot test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Frequency Driver Test
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - ext4
🚧 ⚡⚡⚡ xfstests - xfs
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Ethernet drivers sanity
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - ext4
🚧 ✅ xfstests - xfs
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
When processing request to add/replace user-defined element set, check
of given element identifier and decision of numeric identifier is done
in "__snd_ctl_add_replace()" helper function. When the result of check
is wrong, the helper function returns error code. The error code shall
be returned to userspace application.
Current implementation includes bug to return zero to userspace application
regardless of the result. This commit fixes the bug.
Cc: <stable(a)vger.kernel.org>
Fixes: e1a7bfe38079 ("ALSA: control: Fix race between adding and removing a user element")
Signed-off-by: Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
---
sound/core/control.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/core/control.c b/sound/core/control.c
index 421ddc7..f341fc4 100644
--- a/sound/core/control.c
+++ b/sound/core/control.c
@@ -1539,7 +1539,7 @@ static int snd_ctl_elem_add(struct snd_ctl_file *file,
unlock:
up_write(&card->controls_rwsem);
- return 0;
+ return err;
}
static int snd_ctl_elem_add_user(struct snd_ctl_file *file,
--
2.25.1
Hello stable(a)vger.kernel.org
We are Base Investment Company offering Corporate and Personal Loan at 3% Interest Rate for a duration of 10Years.
We also pay 1% commission to brokers, who introduce project owners for finance or other opportunities.
Please get back to me if you are interested for more
details.
Yours faithfully,
Hashim Murrah
Qian Cai reported the following BUG in [1]
[ 6147.019063][T45242] LTP: starting move_pages12
[ 6147.475680][T64921] BUG: unable to handle page fault for address: ffffffffffffffe0
...
[ 6147.525866][T64921] RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170
avc_start_pgoff at mm/interval_tree.c:63
[ 6147.620914][T64921] Call Trace:
[ 6147.624078][T64921] rmap_walk_anon+0x141/0xa30
rmap_walk_anon at mm/rmap.c:1864
[ 6147.628639][T64921] try_to_unmap+0x209/0x2d0
try_to_unmap at mm/rmap.c:1763
[ 6147.633026][T64921] ? rmap_walk_locked+0x140/0x140
[ 6147.637936][T64921] ? page_remove_rmap+0x1190/0x1190
[ 6147.643020][T64921] ? page_not_mapped+0x10/0x10
[ 6147.647668][T64921] ? page_get_anon_vma+0x290/0x290
[ 6147.652664][T64921] ? page_mapcount_is_zero+0x10/0x10
[ 6147.657838][T64921] ? hugetlb_page_mapping_lock_write+0x97/0x180
[ 6147.663972][T64921] migrate_pages+0x1005/0x1fb0
[ 6147.668617][T64921] ? remove_migration_pte+0xac0/0xac0
[ 6147.673875][T64921] move_pages_and_store_status.isra.47+0xd7/0x1a0
[ 6147.680181][T64921] ? migrate_pages+0x1fb0/0x1fb0
[ 6147.685002][T64921] __x64_sys_move_pages+0xa5c/0x1100
[ 6147.690176][T64921] ? trace_hardirqs_on+0x20/0x1b5
[ 6147.695084][T64921] ? move_pages_and_store_status.isra.47+0x1a0/0x1a0
[ 6147.701653][T64921] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 6147.707088][T64921] ? switch_fpu_return+0x196/0x400
[ 6147.712083][T64921] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 6147.717954][T64921] ? do_syscall_64+0x24/0x310
[ 6147.722513][T64921] do_syscall_64+0x5f/0x310
[ 6147.726897][T64921] ? trace_hardirqs_off+0x12/0x1a0
[ 6147.731894][T64921] ? asm_exc_page_fault+0x8/0x30
[ 6147.736714][T64921] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Hugh Dickens diagnosed this as a migration bug caused by code introduced
to use i_mmap_rwsem for pmd sharing synchronization. Specifically, the
routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED
flag to try_to_unmap() while holding i_mmap_rwsem. This is wrong for
anon pages as the anon_vma_lock should be held in this case. Further
analysis suggested that i_mmap_rwsem was not required to he held at all
when calling try_to_unmap for anon pages as an anon page could never be
part of a shared pmd mapping.
Discussion also revealed that the hack in hugetlb_page_mapping_lock_write
to drop page lock and acquire i_mmap_rwsem is wrong. There is no way to
keep mapping valid while dropping page lock.
This patch does the following:
- Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when
calling try_to_unmap.
- Remove the hacky code in hugetlb_page_mapping_lock_write. The routine
will now simply do a 'trylock' while still holding the page lock. If
the trylock fails, it will return NULL. This could impact the callers:
- migration calling code will receive -EAGAIN and retry up to the hard
coded limit (10).
- memory error code will treat the page as BUSY. This will force
killing (SIGKILL) instead of SIGBUS any mapping tasks.
Do note that this change in behavior only happens when there is a race.
None of the standard kernel testing suites actually hit this race, but
it is possible.
[1] https://lore.kernel.org/lkml/20200708012044.GC992@lca.pw/
[2] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.a…
Reported-by: Qian Cai <cai(a)lca.pw>
Suggested-by: Hugh Dickins <hughd(a)google.com>
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
---
mm/hugetlb.c | 90 +++------------------------------------------
mm/memory-failure.c | 36 +++++++++---------
mm/migrate.c | 44 ++++++++++++----------
mm/rmap.c | 5 +--
4 files changed, 47 insertions(+), 128 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fe76f8fd5a73..15fc4f210a72 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1568,104 +1568,24 @@ int PageHeadHuge(struct page *page_head)
return page_head[1].compound_dtor == HUGETLB_PAGE_DTOR;
}
-/*
- * Find address_space associated with hugetlbfs page.
- * Upon entry page is locked and page 'was' mapped although mapped state
- * could change. If necessary, use anon_vma to find vma and associated
- * address space. The returned mapping may be stale, but it can not be
- * invalid as page lock (which is held) is required to destroy mapping.
- */
-static struct address_space *_get_hugetlb_page_mapping(struct page *hpage)
-{
- struct anon_vma *anon_vma;
- pgoff_t pgoff_start, pgoff_end;
- struct anon_vma_chain *avc;
- struct address_space *mapping = page_mapping(hpage);
-
- /* Simple file based mapping */
- if (mapping)
- return mapping;
-
- /*
- * Even anonymous hugetlbfs mappings are associated with an
- * underlying hugetlbfs file (see hugetlb_file_setup in mmap
- * code). Find a vma associated with the anonymous vma, and
- * use the file pointer to get address_space.
- */
- anon_vma = page_lock_anon_vma_read(hpage);
- if (!anon_vma)
- return mapping; /* NULL */
-
- /* Use first found vma */
- pgoff_start = page_to_pgoff(hpage);
- pgoff_end = pgoff_start + pages_per_huge_page(page_hstate(hpage)) - 1;
- anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root,
- pgoff_start, pgoff_end) {
- struct vm_area_struct *vma = avc->vma;
-
- mapping = vma->vm_file->f_mapping;
- break;
- }
-
- anon_vma_unlock_read(anon_vma);
- return mapping;
-}
-
/*
* Find and lock address space (mapping) in write mode.
*
- * Upon entry, the page is locked which allows us to find the mapping
- * even in the case of an anon page. However, locking order dictates
- * the i_mmap_rwsem be acquired BEFORE the page lock. This is hugetlbfs
- * specific. So, we first try to lock the sema while still holding the
- * page lock. If this works, great! If not, then we need to drop the
- * page lock and then acquire i_mmap_rwsem and reacquire page lock. Of
- * course, need to revalidate state along the way.
+ * Upon entry, the page is locked which means that page_mapping() is
+ * stable. Due to locking order, we can only trylock_write. If we can
+ * not get the lock, simply return NULL to caller.
*/
struct address_space *hugetlb_page_mapping_lock_write(struct page *hpage)
{
- struct address_space *mapping, *mapping2;
+ struct address_space *mapping = page_mapping(hpage);
- mapping = _get_hugetlb_page_mapping(hpage);
-retry:
if (!mapping)
return mapping;
- /*
- * If no contention, take lock and return
- */
if (i_mmap_trylock_write(mapping))
return mapping;
- /*
- * Must drop page lock and wait on mapping sema.
- * Note: Once page lock is dropped, mapping could become invalid.
- * As a hack, increase map count until we lock page again.
- */
- atomic_inc(&hpage->_mapcount);
- unlock_page(hpage);
- i_mmap_lock_write(mapping);
- lock_page(hpage);
- atomic_add_negative(-1, &hpage->_mapcount);
-
- /* verify page is still mapped */
- if (!page_mapped(hpage)) {
- i_mmap_unlock_write(mapping);
- return NULL;
- }
-
- /*
- * Get address space again and verify it is the same one
- * we locked. If not, drop lock and retry.
- */
- mapping2 = _get_hugetlb_page_mapping(hpage);
- if (mapping2 != mapping) {
- i_mmap_unlock_write(mapping);
- mapping = mapping2;
- goto retry;
- }
-
- return mapping;
+ return NULL;
}
pgoff_t __basepage_index(struct page *page)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index c0bb186bba62..5d880d4eb9a2 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1057,27 +1057,25 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
if (!PageHuge(hpage)) {
unmap_success = try_to_unmap(hpage, ttu);
} else {
- /*
- * For hugetlb pages, try_to_unmap could potentially call
- * huge_pmd_unshare. Because of this, take semaphore in
- * write mode here and set TTU_RMAP_LOCKED to indicate we
- * have taken the lock at this higer level.
- *
- * Note that the call to hugetlb_page_mapping_lock_write
- * is necessary even if mapping is already set. It handles
- * ugliness of potentially having to drop page lock to obtain
- * i_mmap_rwsem.
- */
- mapping = hugetlb_page_mapping_lock_write(hpage);
-
- if (mapping) {
- unmap_success = try_to_unmap(hpage,
+ if (!PageAnon(hpage)) {
+ /*
+ * For hugetlb pages in shared mappings, try_to_unmap
+ * could potentially call huge_pmd_unshare. Because of
+ * this, take semaphore in write mode here and set
+ * TTU_RMAP_LOCKED to indicate we have taken the lock
+ * at this higer level.
+ */
+ mapping = hugetlb_page_mapping_lock_write(hpage);
+ if (mapping) {
+ unmap_success = try_to_unmap(hpage,
ttu|TTU_RMAP_LOCKED);
- i_mmap_unlock_write(mapping);
+ i_mmap_unlock_write(mapping);
+ } else {
+ pr_info("Memory failure: %#lx: could not lock mapping for mapped huge page\n", pfn);
+ unmap_success = false;
+ }
} else {
- pr_info("Memory failure: %#lx: could not find mapping for mapped huge page\n",
- pfn);
- unmap_success = false;
+ unmap_success = try_to_unmap(hpage, ttu);
}
}
if (!unmap_success)
diff --git a/mm/migrate.c b/mm/migrate.c
index 5ca5842df5db..5795cb82e27c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1328,34 +1328,38 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
goto put_anon;
if (page_mapped(hpage)) {
- /*
- * try_to_unmap could potentially call huge_pmd_unshare.
- * Because of this, take semaphore in write mode here and
- * set TTU_RMAP_LOCKED to let lower levels know we have
- * taken the lock.
- */
- mapping = hugetlb_page_mapping_lock_write(hpage);
- if (unlikely(!mapping))
- goto unlock_put_anon;
+ bool mapping_locked = false;
+ enum ttu_flags ttu = TTU_MIGRATION|TTU_IGNORE_MLOCK|
+ TTU_IGNORE_ACCESS;
+
+ if (!PageAnon(hpage)) {
+ /*
+ * In shared mappings, try_to_unmap could potentially
+ * call huge_pmd_unshare. Because of this, take
+ * semaphore in write mode here and set TTU_RMAP_LOCKED
+ * to let lower levels know we have taken the lock.
+ */
+ mapping = hugetlb_page_mapping_lock_write(hpage);
+ if (unlikely(!mapping))
+ goto unlock_put_anon;
+
+ mapping_locked = true;
+ ttu |= TTU_RMAP_LOCKED;
+ }
- try_to_unmap(hpage,
- TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS|
- TTU_RMAP_LOCKED);
+ try_to_unmap(hpage, ttu);
page_was_mapped = 1;
- /*
- * Leave mapping locked until after subsequent call to
- * remove_migration_ptes()
- */
+
+ if (mapping_locked)
+ i_mmap_unlock_write(mapping);
}
if (!page_mapped(hpage))
rc = move_to_new_page(new_hpage, hpage, mode);
- if (page_was_mapped) {
+ if (page_was_mapped)
remove_migration_ptes(hpage,
- rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, true);
- i_mmap_unlock_write(mapping);
- }
+ rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, false);
unlock_put_anon:
unlock_page(new_hpage);
diff --git a/mm/rmap.c b/mm/rmap.c
index 1b84945d655c..31b29321adfe 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1413,9 +1413,6 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
/*
* If sharing is possible, start and end will be adjusted
* accordingly.
- *
- * If called for a huge page, caller must hold i_mmap_rwsem
- * in write mode as it is possible to call huge_pmd_unshare.
*/
adjust_range_if_pmd_sharing_possible(vma, &range.start,
&range.end);
@@ -1462,7 +1459,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
address = pvmw.address;
- if (PageHuge(page)) {
+ if (PageHuge(page) && !PageAnon(page)) {
/*
* To call huge_pmd_unshare, i_mmap_rwsem must be
* held in write mode. Caller needs to explicitly
--
2.28.0
The patch titled
Subject: arm64: fix missing include in asm/uaccess.h
has been removed from the -mm tree. Its filename was
arm64-fix-missing-include-in-asm-uaccessh.patch
This patch was dropped because an alternative patch was merged
------------------------------------------------------
From: Ansuel Smith <ansuelsmth(a)gmail.com>
Subject: arm64: fix missing include in asm/uaccess.h
Fix a compilation error as PF_KTHREAD is defined in linux/sched.h and this
is missing.
[viro(a)zeniv.linux.org.uk: use linux/uaccess.h]
Link: https://lkml.kernel.org/r/20201111005826.GY3576660@ZenIV.linux.org.uk
Link: https://lkml.kernel.org/r/20201111004440.8783-1-ansuelsmth@gmail.com
Fixes: df325e05a682 ("arm64: Validate tagged addresses in access_ok() called from kernel threads")
Signed-off-by: Ansuel Smith <ansuelsmth(a)gmail.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/arm64/include/asm/uaccess.h | 2 ++
1 file changed, 2 insertions(+)
--- a/arch/arm64/include/asm/uaccess.h~arm64-fix-missing-include-in-asm-uaccessh
+++ a/arch/arm64/include/asm/uaccess.h
@@ -7,6 +7,8 @@
#ifndef __ASM_UACCESS_H
#define __ASM_UACCESS_H
+#include <linux/uaccess.h>
+
#include <asm/alternative.h>
#include <asm/kernel-pgtable.h>
#include <asm/sysreg.h>
_
Patches currently in -mm which might be from ansuelsmth(a)gmail.com are
The patch titled
Subject: arm64: fix missing include in asm/uaccess.h
has been added to the -mm tree. Its filename is
arm64-fix-missing-include-in-asm-uaccessh.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/arm64-fix-missing-include-in-asm-…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/arm64-fix-missing-include-in-asm-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Ansuel Smith <ansuelsmth(a)gmail.com>
Subject: arm64: fix missing include in asm/uaccess.h
Fix a compilation error as PF_KTHREAD is defined in linux/sched.h and this
is missing.
[viro(a)zeniv.linux.org.uk: use linux/uaccess.h]
Link: https://lkml.kernel.org/r/20201111005826.GY3576660@ZenIV.linux.org.uk
Link: https://lkml.kernel.org/r/20201111004440.8783-1-ansuelsmth@gmail.com
Fixes: df325e05a682 ("arm64: Validate tagged addresses in access_ok() called from kernel threads")
Signed-off-by: Ansuel Smith <ansuelsmth(a)gmail.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/arm64/include/asm/uaccess.h | 2 ++
1 file changed, 2 insertions(+)
--- a/arch/arm64/include/asm/uaccess.h~arm64-fix-missing-include-in-asm-uaccessh
+++ a/arch/arm64/include/asm/uaccess.h
@@ -7,6 +7,8 @@
#ifndef __ASM_UACCESS_H
#define __ASM_UACCESS_H
+#include <linux/uaccess.h>
+
#include <asm/alternative.h>
#include <asm/kernel-pgtable.h>
#include <asm/sysreg.h>
_
Patches currently in -mm which might be from ansuelsmth(a)gmail.com are
arm64-fix-missing-include-in-asm-uaccessh.patch
On Sun, Nov 08, 2020 at 03:04:14AM -0600, Luis Chamberlain wrote:
> I'd like some more time for this, to review on older kernels. Don't think
> it's the best yet
What is "this" and what is "the best yet"? Which patch(s) here are you
referring to in your context-less top-post? :)
thanks,
gre gk-h
From: Maxim Levitsky <mlevitsk(a)redhat.com>
This msr is only available when the host supports WAITPKG feature.
This breaks a nested guest, if the L1 hypervisor is set to ignore
unknown msrs, because the only other safety check that the
kernel does is that it attempts to read the msr and
rejects it if it gets an exception.
Cc: stable(a)vger.kernel.org
Fixes: 6e3ba4abce ("KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL")
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Message-Id: <20200523161455.3940-3-mlevitsk(a)redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson(a)intel.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
(cherry picked from commit f4cfcd2d5aea4e96c5d483c476f3057b6b7baf6a
use boot_cpu_has for checking the feature)
Signed-off-by: Jack Wang <jinpu.wang(a)cloud.ionos.com>
---
arch/x86/kvm/x86.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 708b37274cb5..4cacf4669235 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5226,6 +5226,10 @@ static void kvm_init_msr_list(void)
if (!kvm_x86_ops->rdtscp_supported())
continue;
break;
+ case MSR_IA32_UMWAIT_CONTROL:
+ if (!boot_cpu_has(X86_FEATURE_WAITPKG))
+ continue;
+ break;
case MSR_IA32_RTIT_CTL:
case MSR_IA32_RTIT_STATUS:
if (!kvm_x86_ops->pt_supported())
--
2.25.1
Here are backports of some fixes to the 4.4 stable branch.
I wasn't able to test the pinctrl fix (no idea how to reproduce it).
I wasn't able to test the i40e changes (no hardware and no reproducer
available).
I tested the geneve fix with libreswan as (roughly) described in the
commit message.
Ben.
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom
Here are backports of some fixes to the 4.9 stable branch.
I wasn't able to test the pinctrl fix (no idea how to reproduce it).
I wasn't able to test the i40e changes (no hardware and no reproducer
available).
I tested the geneve fix with libreswan as (roughly) described in the
commit message.
Ben.
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom
GPIOs that attempt to use interrupts get thwarted with a message like:
"pin 161 cannot be used as IRQ" (for instance with SD_CD). This is because
the HOSTSW_OWN offset is incorrect, so every GPIO looks like it's
owned by ACPI.
Fixes: e278dcb7048b1 ("pinctrl: intel: Add Intel Jasper Lake pin controller support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Evan Green <evgreen(a)chromium.org>
---
Changes in v2:
- Commit text rewording [Andy]
drivers/pinctrl/intel/pinctrl-jasperlake.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pinctrl/intel/pinctrl-jasperlake.c b/drivers/pinctrl/intel/pinctrl-jasperlake.c
index 9bd0e8e6310c3..283698cf0dc7d 100644
--- a/drivers/pinctrl/intel/pinctrl-jasperlake.c
+++ b/drivers/pinctrl/intel/pinctrl-jasperlake.c
@@ -16,7 +16,7 @@
#define JSL_PAD_OWN 0x020
#define JSL_PADCFGLOCK 0x080
-#define JSL_HOSTSW_OWN 0x0b0
+#define JSL_HOSTSW_OWN 0x0c0
#define JSL_GPI_IS 0x100
#define JSL_GPI_IE 0x120
--
2.26.2
Commit 9ce274630495 ("cpufreq: tegra20: Use generic cpufreq-dt driver
(Tegra30 supported now)") update the Tegra20 CPUFREQ driver to use the
generic CPUFREQ device-tree driver. Since this change CPUFREQ support
on the Tegra20 Ventana platform has been broken because the necessary
device-tree nodes with the operating point information are not populated
for this platform. Fix this by updating device-tree for Venata to
include the operating point informration for Tegra20.
Fixes: 9ce274630495 ("cpufreq: tegra20: Use generic cpufreq-dt driver (Tegra30 supported now)")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
---
arch/arm/boot/dts/tegra20-ventana.dts | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/arm/boot/dts/tegra20-ventana.dts b/arch/arm/boot/dts/tegra20-ventana.dts
index b158771ac0b7..055334ae3d28 100644
--- a/arch/arm/boot/dts/tegra20-ventana.dts
+++ b/arch/arm/boot/dts/tegra20-ventana.dts
@@ -3,6 +3,7 @@
#include <dt-bindings/input/input.h>
#include "tegra20.dtsi"
+#include "tegra20-cpu-opp.dtsi"
/ {
model = "NVIDIA Tegra20 Ventana evaluation board";
@@ -592,6 +593,16 @@ clk32k_in: clock@0 {
#clock-cells = <0>;
};
+ cpus {
+ cpu0: cpu@0 {
+ operating-points-v2 = <&cpu0_opp_table>;
+ };
+
+ cpu@1 {
+ operating-points-v2 = <&cpu0_opp_table>;
+ };
+ };
+
gpio-keys {
compatible = "gpio-keys";
--
2.25.1
Commit 9ce274630495 ("cpufreq: tegra20: Use generic cpufreq-dt driver
(Tegra30 supported now)") update the Tegra20 CPUFREQ driver to use the
generic CPUFREQ device-tree driver. Since this change CPUFREQ support
on the Tegra20 Ventana platform has been broken because the necessary
device-tree nodes with the operating point information are not populated
for this platform. Fix this by updating device-tree for Venata to
include the operating point informration for Tegra20.
Fixes: 9ce274630495 ("cpufreq: tegra20: Use generic cpufreq-dt driver (Tegra30 supported now)")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
---
Changes since V1:
- Remove unneeded 'cpu0' phandle
arch/arm/boot/dts/tegra20-ventana.dts | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/arm/boot/dts/tegra20-ventana.dts b/arch/arm/boot/dts/tegra20-ventana.dts
index b158771ac0b7..1b2a0dcd929a 100644
--- a/arch/arm/boot/dts/tegra20-ventana.dts
+++ b/arch/arm/boot/dts/tegra20-ventana.dts
@@ -3,6 +3,7 @@
#include <dt-bindings/input/input.h>
#include "tegra20.dtsi"
+#include "tegra20-cpu-opp.dtsi"
/ {
model = "NVIDIA Tegra20 Ventana evaluation board";
@@ -592,6 +593,16 @@ clk32k_in: clock@0 {
#clock-cells = <0>;
};
+ cpus {
+ cpu@0 {
+ operating-points-v2 = <&cpu0_opp_table>;
+ };
+
+ cpu@1 {
+ operating-points-v2 = <&cpu0_opp_table>;
+ };
+ };
+
gpio-keys {
compatible = "gpio-keys";
--
2.25.1
If a user holds a button down on a remote, then no ir idle interrupt will
be generated until the user releases the button, depending on how quickly
the remote repeats. No IR is processed until that point, which means that
holding down a button may not do anything.
This also resolves an issue on a Cubieboard 1 where the IR receiver is
picking up ambient infrared as IR and spews out endless
"rc rc0: IR event FIFO is full!" messages unless you choose to live in
the dark.
Cc: stable(a)vger.kernel.org
Reported-by: Hans Verkuil <hverkuil(a)xs4all.nl>
Signed-off-by: Sean Young <sean(a)mess.org>
---
drivers/media/rc/sunxi-cir.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/media/rc/sunxi-cir.c b/drivers/media/rc/sunxi-cir.c
index ddee6ee37bab1..4afc5895bee74 100644
--- a/drivers/media/rc/sunxi-cir.c
+++ b/drivers/media/rc/sunxi-cir.c
@@ -137,6 +137,8 @@ static irqreturn_t sunxi_ir_irq(int irqno, void *dev_id)
} else if (status & REG_RXSTA_RPE) {
ir_raw_event_set_idle(ir->rc, true);
ir_raw_event_handle(ir->rc);
+ } else {
+ ir_raw_event_handle(ir->rc);
}
spin_unlock(&ir->ir_lock);
--
2.28.0
From: Johannes Berg <johannes.berg(a)intel.com>
If sta_info_insert_finish() fails, we currently keep the station
around and free it only in the caller, but there's only one such
caller and it always frees it immediately.
As syzbot found, another consequence of this split is that we can
put things that sleep only into __cleanup_single_sta() and not in
sta_info_free(), but this is the only place that requires such of
sta_info_free() now.
Change this to free the station in sta_info_insert_finish(), in
which case we can still sleep. This will also let us unify the
cleanup code later.
Cc: stable(a)vger.kernel.org
Fixes: dcd479e10a05 ("mac80211: always wind down STA state")
Reported-by: syzbot+32c6c38c4812d22f2f0b(a)syzkaller.appspotmail.com
Reported-by: syzbot+4c81fe92e372d26c4246(a)syzkaller.appspotmail.com
Reported-by: syzbot+6a7fe9faf0d1d61bc24a(a)syzkaller.appspotmail.com
Reported-by: syzbot+abed06851c5ffe010921(a)syzkaller.appspotmail.com
Reported-by: syzbot+b7aeb9318541a1c709f1(a)syzkaller.appspotmail.com
Reported-by: syzbot+d5a9416c6cafe53b5dd0(a)syzkaller.appspotmail.com
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
---
net/mac80211/sta_info.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 4fe284ff1ea3..ec6973ee88ef 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -705,7 +705,7 @@ static int sta_info_insert_finish(struct sta_info *sta) __acquires(RCU)
out_drop_sta:
local->num_sta--;
synchronize_net();
- __cleanup_single_sta(sta);
+ cleanup_single_sta(sta);
out_err:
mutex_unlock(&local->sta_mtx);
kfree(sinfo);
@@ -724,19 +724,13 @@ int sta_info_insert_rcu(struct sta_info *sta) __acquires(RCU)
err = sta_info_insert_check(sta);
if (err) {
+ sta_info_free(local, sta);
mutex_unlock(&local->sta_mtx);
rcu_read_lock();
- goto out_free;
+ return err;
}
- err = sta_info_insert_finish(sta);
- if (err)
- goto out_free;
-
- return 0;
- out_free:
- sta_info_free(local, sta);
- return err;
+ return sta_info_insert_finish(sta);
}
int sta_info_insert(struct sta_info *sta)
--
2.26.2
This is a note to let you know that I've just added the patch titled
serial: ar933x_uart: disable clk on error handling path in probe
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 425af483523b76bc78e14674a430579d38b2a593 Mon Sep 17 00:00:00 2001
From: Zheng Zengkai <zhengzengkai(a)huawei.com>
Date: Wed, 11 Nov 2020 20:44:26 +0800
Subject: serial: ar933x_uart: disable clk on error handling path in probe
ar933x_uart_probe() does not invoke clk_disable_unprepare()
on one error handling path. This patch fixes that.
Fixes: 9be1064fe524 ("serial: ar933x_uart: add RS485 support")
Reported-by: Hulk Robot <hulkci(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
Link: https://lore.kernel.org/r/20201111124426.42638-1-zhengzengkai@huawei.com
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/ar933x_uart.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/ar933x_uart.c b/drivers/tty/serial/ar933x_uart.c
index 0c80a79d7442..c2be7cf91399 100644
--- a/drivers/tty/serial/ar933x_uart.c
+++ b/drivers/tty/serial/ar933x_uart.c
@@ -789,8 +789,10 @@ static int ar933x_uart_probe(struct platform_device *pdev)
goto err_disable_clk;
up->gpios = mctrl_gpio_init(port, 0);
- if (IS_ERR(up->gpios) && PTR_ERR(up->gpios) != -ENOSYS)
- return PTR_ERR(up->gpios);
+ if (IS_ERR(up->gpios) && PTR_ERR(up->gpios) != -ENOSYS) {
+ ret = PTR_ERR(up->gpios);
+ goto err_disable_clk;
+ }
up->rts_gpiod = mctrl_gpio_to_gpiod(up->gpios, UART_GPIO_RTS);
--
2.29.2
This is a note to let you know that I've just added the patch titled
tty: serial: imx: keep console clocks always on
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From e67c139c488e84e7eae6c333231e791f0e89b3fb Mon Sep 17 00:00:00 2001
From: Fugang Duan <fugang.duan(a)nxp.com>
Date: Wed, 11 Nov 2020 10:51:36 +0800
Subject: tty: serial: imx: keep console clocks always on
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
For below code, there has chance to cause deadlock in SMP system:
Thread 1:
clk_enable_lock();
pr_info("debug message");
clk_enable_unlock();
Thread 2:
imx_uart_console_write()
clk_enable()
clk_enable_lock();
Thread 1:
Acuired clk enable_lock -> printk -> console_trylock_spinning
Thread 2:
console_unlock() -> imx_uart_console_write -> clk_disable -> Acquite clk enable_lock
So the patch is to keep console port clocks always on like
other console drivers.
Fixes: 1cf93e0d5488 ("serial: imx: remove the uart_console() check")
Acked-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Signed-off-by: Fugang Duan <fugang.duan(a)nxp.com>
Link: https://lore.kernel.org/r/20201111025136.29818-1-fugang.duan@nxp.com
Cc: stable <stable(a)vger.kernel.org>
[fix up build warning - gregkh]
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/imx.c | 20 +++-----------------
1 file changed, 3 insertions(+), 17 deletions(-)
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 3c53a3c89959..cacf7266a262 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -2008,16 +2008,6 @@ imx_uart_console_write(struct console *co, const char *s, unsigned int count)
unsigned int ucr1;
unsigned long flags = 0;
int locked = 1;
- int retval;
-
- retval = clk_enable(sport->clk_per);
- if (retval)
- return;
- retval = clk_enable(sport->clk_ipg);
- if (retval) {
- clk_disable(sport->clk_per);
- return;
- }
if (sport->port.sysrq)
locked = 0;
@@ -2053,9 +2043,6 @@ imx_uart_console_write(struct console *co, const char *s, unsigned int count)
if (locked)
spin_unlock_irqrestore(&sport->port.lock, flags);
-
- clk_disable(sport->clk_ipg);
- clk_disable(sport->clk_per);
}
/*
@@ -2156,15 +2143,14 @@ imx_uart_console_setup(struct console *co, char *options)
retval = uart_set_options(&sport->port, co, baud, parity, bits, flow);
- clk_disable(sport->clk_ipg);
if (retval) {
- clk_unprepare(sport->clk_ipg);
+ clk_disable_unprepare(sport->clk_ipg);
goto error_console;
}
- retval = clk_prepare(sport->clk_per);
+ retval = clk_prepare_enable(sport->clk_per);
if (retval)
- clk_unprepare(sport->clk_ipg);
+ clk_disable_unprepare(sport->clk_ipg);
error_console:
return retval;
--
2.29.2
Hi,
I'm requesting a stable merge for commit
1978b3a53a74e3230cd46932b149c6e62e832e9a
("x86/speculation: Allow IBPB to be conditionally enabled on CPUs with
always-on STIBP")
into the stable branch for 5.4. Note, the commit is already queued for
inclusion into the next 5.9 stable release
(https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tre…).
The patch fixes an issue where a Spectre-v2-user mitigation could not
be enabled via prctl() on certain AMD CPUs. The issue was introduced
in commit 21998a351512eba4ed5969006f0c55882d995ada
("x86/speculation: Avoid force-disabling IBPB based on STIBP and
enhanced IBRS.")
which was merged into the 5.4 stable branch as commit
6d60d5462a91eb46fb88b016508edfa8ee0bc7c8. This commit also exists in
4.19, 4.14, 4.9, and 4.4, so those kernels are also likely affected by
this bug.
--
Anand K. Mistry
Software Engineer
Google Australia