From: "Joel Fernandes (Google)" <joel(a)joelfernandes.org>
There are instances where rcu_cpu_stall_reset() is called when jiffies
did not get a chance to update for a long time. Before jiffies is
updated, the CPU stall detector can go off triggering false-positives
where a just-started grace period appears to be ages old. In the past,
we disabled stall detection in rcu_cpu_stall_reset() however this got
changed [1]. This is resulting in false-positives in KGDB usecase [2].
Fix this by deferring the update of jiffies to the third run of the FQS
loop. This is more robust, as, even if rcu_cpu_stall_reset() is called
just before jiffies is read, we would end up pushing out the jiffies
read by 3 more FQS loops. Meanwhile the CPU stall detection will be
delayed and we will not get any false positives.
[1] https://lore.kernel.org/all/20210521155624.174524-2-senozhatsky@chromium.or…
[2] https://lore.kernel.org/all/20230814020045.51950-2-chenhuacai@loongson.cn/
Tested with rcutorture.cpu_stall option as well to verify stall behavior
with/without patch.
Tested-by: Huacai Chen <chenhuacai(a)loongson.cn>
Reported-by: Binbin Zhou <zhoubinbin(a)loongson.cn>
Closes: https://lore.kernel.org/all/20230814020045.51950-2-chenhuacai@loongson.cn/
Suggested-by: Paul McKenney <paulmck(a)kernel.org>
Cc: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Fixes: a80be428fbc1 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Frederic Weisbecker <frederic(a)kernel.org>
---
kernel/rcu/tree.c | 12 ++++++++++++
kernel/rcu/tree.h | 4 ++++
kernel/rcu/tree_stall.h | 20 ++++++++++++++++++--
3 files changed, 34 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index cb1caefa8bd0..d85779f67aea 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1556,10 +1556,22 @@ static bool rcu_gp_fqs_check_wake(int *gfp)
*/
static void rcu_gp_fqs(bool first_time)
{
+ int nr_fqs = READ_ONCE(rcu_state.nr_fqs_jiffies_stall);
struct rcu_node *rnp = rcu_get_root();
WRITE_ONCE(rcu_state.gp_activity, jiffies);
WRITE_ONCE(rcu_state.n_force_qs, rcu_state.n_force_qs + 1);
+
+ WARN_ON_ONCE(nr_fqs > 3);
+ /* Only countdown nr_fqs for stall purposes if jiffies moves. */
+ if (nr_fqs) {
+ if (nr_fqs == 1) {
+ WRITE_ONCE(rcu_state.jiffies_stall,
+ jiffies + rcu_jiffies_till_stall_check());
+ }
+ WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, --nr_fqs);
+ }
+
if (first_time) {
/* Collect dyntick-idle snapshots. */
force_qs_rnp(dyntick_save_progress_counter);
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 192536916f9a..e9821a8422db 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -386,6 +386,10 @@ struct rcu_state {
/* in jiffies. */
unsigned long jiffies_stall; /* Time at which to check */
/* for CPU stalls. */
+ int nr_fqs_jiffies_stall; /* Number of fqs loops after
+ * which read jiffies and set
+ * jiffies_stall. Stall
+ * warnings disabled if !0. */
unsigned long jiffies_resched; /* Time at which to resched */
/* a reluctant CPU. */
unsigned long n_force_qs_gpstart; /* Snapshot of n_force_qs at */
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 49544f932279..ac8e86babe44 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -150,12 +150,17 @@ static void panic_on_rcu_stall(void)
/**
* rcu_cpu_stall_reset - restart stall-warning timeout for current grace period
*
+ * To perform the reset request from the caller, disable stall detection until
+ * 3 fqs loops have passed. This is required to ensure a fresh jiffies is
+ * loaded. It should be safe to do from the fqs loop as enough timer
+ * interrupts and context switches should have passed.
+ *
* The caller must disable hard irqs.
*/
void rcu_cpu_stall_reset(void)
{
- WRITE_ONCE(rcu_state.jiffies_stall,
- jiffies + rcu_jiffies_till_stall_check());
+ WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 3);
+ WRITE_ONCE(rcu_state.jiffies_stall, ULONG_MAX);
}
//////////////////////////////////////////////////////////////////////////////
@@ -171,6 +176,7 @@ static void record_gp_stall_check_time(void)
WRITE_ONCE(rcu_state.gp_start, j);
j1 = rcu_jiffies_till_stall_check();
smp_mb(); // ->gp_start before ->jiffies_stall and caller's ->gp_seq.
+ WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 0);
WRITE_ONCE(rcu_state.jiffies_stall, j + j1);
rcu_state.jiffies_resched = j + j1 / 2;
rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
@@ -726,6 +732,16 @@ static void check_cpu_stall(struct rcu_data *rdp)
!rcu_gp_in_progress())
return;
rcu_stall_kick_kthreads();
+
+ /*
+ * Check if it was requested (via rcu_cpu_stall_reset()) that the FQS
+ * loop has to set jiffies to ensure a non-stale jiffies value. This
+ * is required to have good jiffies value after coming out of long
+ * breaks of jiffies updates. Not doing so can cause false positives.
+ */
+ if (READ_ONCE(rcu_state.nr_fqs_jiffies_stall) > 0)
+ return;
+
j = jiffies;
/*
--
2.34.1
From: Ajay Singh <ajay.kathat(a)microchip.com>
Enabling KASAN and running some iperf tests raises some memory issues with
vmm_table:
BUG: KASAN: slab-out-of-bounds in wilc_wlan_handle_txq+0x6ac/0xdb4
Write of size 4 at addr c3a61540 by task wlan0-tx/95
KASAN detects that we are writing data beyond range allocated to vmm_table.
There is indeed a mismatch between the size passed to allocator in
wilc_wlan_init, and the range of possible indexes used later: allocation
size is missing a multiplication by sizeof(u32)
Fixes: 40b717bfcefa ("wifi: wilc1000: fix DMA on stack objects")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ajay Singh <ajay.kathat(a)microchip.com>
Signed-off-by: Alexis Lothoré <alexis.lothore(a)bootlin.com>
---
Changes in v3:
- Use kcalloc instead of kzalloc
- Link to v2: https://lore.kernel.org/r/20231016-wilc1000_tx_oops-v2-1-8d1982a29ef1@bootl…
Changes in v2:
- keep dedicated dynamic allocation for vmm_table
- Link to v1: https://lore.kernel.org/r/20231013-wilc1000_tx_oops-v1-1-3761beb9524d@bootl…
---
drivers/net/wireless/microchip/wilc1000/wlan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.c b/drivers/net/wireless/microchip/wilc1000/wlan.c
index 58bbf50081e4..9eb115c79c90 100644
--- a/drivers/net/wireless/microchip/wilc1000/wlan.c
+++ b/drivers/net/wireless/microchip/wilc1000/wlan.c
@@ -1492,7 +1492,7 @@ int wilc_wlan_init(struct net_device *dev)
}
if (!wilc->vmm_table)
- wilc->vmm_table = kzalloc(WILC_VMM_TBL_SIZE, GFP_KERNEL);
+ wilc->vmm_table = kcalloc(WILC_VMM_TBL_SIZE, sizeof(u32), GFP_KERNEL);
if (!wilc->vmm_table) {
ret = -ENOBUFS;
---
base-commit: ea12d85cbfd6b08fff40a4fefccc011b6cfadf8e
change-id: 20231012-wilc1000_tx_oops-58ce91ee3e93
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
DisplayPort Alt Mode CTS test 10.3.8 states that both sides of the
connection shall be compatible with one another such that the connection
is not Source to Source or Sink to Sink.
The DisplayPort driver currently checks for a compatible pin configuration
that resolves into a source and sink combination. The CTS test is designed
to send a Discover Modes message that has a compatible pin configuration
but advertises the same port capability as the device; the current check
fails this.
Verify that the port and port partner resolve into a valid source and sink
combination before checking for a compatible pin configuration.
Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode")
Cc: stable(a)vger.kernel.org
Signed-off-by: RD Babiera <rdbabiera(a)google.com>
---
drivers/usb/typec/altmodes/displayport.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c
index 718da02036d8..3b35a6b8cb72 100644
--- a/drivers/usb/typec/altmodes/displayport.c
+++ b/drivers/usb/typec/altmodes/displayport.c
@@ -575,9 +575,18 @@ int dp_altmode_probe(struct typec_altmode *alt)
struct fwnode_handle *fwnode;
struct dp_altmode *dp;
int ret;
+ int port_cap, partner_cap;
/* FIXME: Port can only be DFP_U. */
+ /* Make sure that the port and partner can resolve into source and sink */
+ port_cap = DP_CAP_CAPABILITY(port->vdo);
+ partner_cap = DP_CAP_CAPABILITY(alt->vdo);
+ if (!((port_cap & DP_CAP_DFP_D) && (partner_cap & DP_CAP_UFP_D)) &&
+ !((port_cap & DP_CAP_UFP_D) && (partner_cap & DP_CAP_DFP_D))) {
+ return -ENODEV;
+ }
+
/* Make sure we have compatiple pin configurations */
if (!(DP_CAP_PIN_ASSIGN_DFP_D(port->vdo) &
DP_CAP_PIN_ASSIGN_UFP_D(alt->vdo)) &&
base-commit: d0d27ef87e1ca974ed93ed4f7d3c123cbd392ba6
--
2.42.0.655.g421f12c284-goog
This patch series was ultimately created to fix a bug reported via
BZ203687. Halil did some problem determination and concluded that the
problem was due to the fact that all queues removed from the guest's AP
configuration are not reset. For example, when an adapter or domain is
assigned to the mdev matrix, if a queue device associated with the adapter
or domain is not bound to the vfio_ap device driver, the adapter to which
the queue is attached will be removed from the guest's configuration. The
problem is, removing the adapter also implicitly removes all queues
attached to that adapter. Those queues also need to be reset to avert
leaking crypto data should any of those queues get assigned to another
guest or back to the host.
The original intent was to ensure that all queues removed from a guest's
AP configuration get reset. The testing included permutations of various
scenarios involving the binding/unbinding of queues either manually, or
due to dynamic host AP configuration changes initiated via an SE or HMC
attached to a DPM-enabled LPAR. This testing revealed several issues that
are also addressed via this patch series.
Note that several of the patches has a 'Fixes:' tag as well as a
Cc: <stable(a)vger.kernel.org> tag. I'm not sure whether this is necessary
because the patches not related to the reset issue are probably rarely
if ever encountered, so I'd like an opinion on that also.
Tony Krowiak (7):
s390/vfio-ap: always filter entire AP matrix
s390/vfio-ap: circumvent filtering for adapters/domains not in host
config
s390/vfio-ap: do not reset queue removed from host config
s390/vfio-ap: let 'on_scan_complete' callback filter matrix and update
guest's APCB
s390/vfio-ap: allow reset of subset of queues assigned to matrix mdev
s390/vfio-ap: reset queues filtered from the guest's AP config
s390/vfio-ap: reset queues associated with adapter for queue unbound
from driver
drivers/s390/crypto/vfio_ap_ops.c | 294 +++++++++++++++++++-------
drivers/s390/crypto/vfio_ap_private.h | 25 ++-
2 files changed, 234 insertions(+), 85 deletions(-)
--
2.41.0
The quilt patch titled
Subject: selftests/clone3: Fix broken test under !CONFIG_TIME_NS
has been removed from the -mm tree. Its filename was
selftests-clone3-fix-broken-test-under-config_time_ns.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Tiezhu Yang <yangtiezhu(a)loongson.cn>
Subject: selftests/clone3: Fix broken test under !CONFIG_TIME_NS
Date: Tue, 11 Jul 2023 17:13:34 +0800
When execute the following command to test clone3 under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
we can see the following error info:
# [7538] Trying clone3() with flags 0x80 (size 0)
# Invalid argument - Failed to create new process
# [7538] clone3() with flags says: -22 expected 0
not ok 18 [7538] Result (-22) is different than expected (0)
...
# Totals: pass:18 fail:1 xfail:0 xpass:0 skip:0 error:0
This is because if CONFIG_TIME_NS is not set, but the flag
CLONE_NEWTIME (0x80) is used to clone a time namespace, it
will return -EINVAL in copy_time_ns().
If kernel does not support CONFIG_TIME_NS, /proc/self/ns/time
will be not exist, and then we should skip clone3() test with
CLONE_NEWTIME.
With this patch under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
...
# Time namespaces are not supported
ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME
...
# Totals: pass:18 fail:0 xfail:0 xpass:0 skip:1 error:0
Link: https://lkml.kernel.org/r/1689066814-13295-1-git-send-email-yangtiezhu@loon…
Fixes: 515bddf0ec41 ("selftests/clone3: test clone3 with CLONE_NEWTIME")
Signed-off-by: Tiezhu Yang <yangtiezhu(a)loongson.cn>
Suggested-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/clone3/clone3.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/clone3/clone3.c~selftests-clone3-fix-broken-test-under-config_time_ns
+++ a/tools/testing/selftests/clone3/clone3.c
@@ -196,7 +196,12 @@ int main(int argc, char *argv[])
CLONE3_ARGS_NO_TEST);
/* Do a clone3() in a new time namespace */
- test_clone3(CLONE_NEWTIME, 0, 0, CLONE3_ARGS_NO_TEST);
+ if (access("/proc/self/ns/time", F_OK) == 0) {
+ test_clone3(CLONE_NEWTIME, 0, 0, CLONE3_ARGS_NO_TEST);
+ } else {
+ ksft_print_msg("Time namespaces are not supported\n");
+ ksft_test_result_skip("Skipping clone3() with CLONE_NEWTIME\n");
+ }
/* Do a clone3() with exit signal (SIGCHLD) in flags */
test_clone3(SIGCHLD, 0, -EINVAL, CLONE3_ARGS_NO_TEST);
_
Patches currently in -mm which might be from yangtiezhu(a)loongson.cn are
The quilt patch titled
Subject: selftests/mm: include mman header to access MREMAP_DONTUNMAP identifier
has been removed from the -mm tree. Its filename was
selftests-mm-include-mman-header-to-access-mremap_dontunmap-identifier.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Samasth Norway Ananda <samasth.norway.ananda(a)oracle.com>
Subject: selftests/mm: include mman header to access MREMAP_DONTUNMAP identifier
Date: Thu, 12 Oct 2023 08:52:57 -0700
Definition for MREMAP_DONTUNMAP is not present in glibc older than 2.32
thus throwing an undeclared error when running make on mm. Including
linux/mman.h solves the build error for people having older glibc.
Link: https://lkml.kernel.org/r/20231012155257.891776-1-samasth.norway.ananda@ora…
Fixes: 0183d777c29a ("selftests: mm: remove duplicate unneeded defines")
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda(a)oracle.com>
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Closes: https://lore.kernel.org/linux-mm/CA+G9fYvV-71XqpCr_jhdDfEtN701fBdG3q+=bafaZ…
Tested-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/mremap_dontunmap.c | 1 +
1 file changed, 1 insertion(+)
--- a/tools/testing/selftests/mm/mremap_dontunmap.c~selftests-mm-include-mman-header-to-access-mremap_dontunmap-identifier
+++ a/tools/testing/selftests/mm/mremap_dontunmap.c
@@ -7,6 +7,7 @@
*/
#define _GNU_SOURCE
#include <sys/mman.h>
+#include <linux/mman.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
_
Patches currently in -mm which might be from samasth.norway.ananda(a)oracle.com are