autoremove_wake_function() uses list_del_init_careful(), so epoll's
more aggressive variant should too. It currently does not only because
it was copied from an older version of wait.c rather than the most
recent one.
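For reference, a minimal sketch of the pairing this relies on
(illustrative only, not the fs/eventpoll.c code; both function names
here are made up): list_del_init_careful() on the waker side provides
the ordering that lets a waiter test its own entry locklessly with
list_empty_careful().

static int autoremove_style_wake(struct wait_queue_entry *wq_entry,
                                 unsigned int mode, int sync, void *key)
{
        int ret = default_wake_function(wq_entry, mode, sync, key);

        /* Pairs with list_empty_careful() in the waiter below. */
        list_del_init_careful(&wq_entry->entry);
        return ret;
}

/* Waiter side: lockless "was I already removed?" check. */
static bool wait_entry_detached(struct wait_queue_entry *wq_entry)
{
        return list_empty_careful(&wq_entry->entry);
}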
Fixes: a16ceb139610 ("epoll: autoremove wakers even more aggressively")
Signed-off-by: Ben Segall <bsegall(a)google.com>
Cc: stable(a)vger.kernel.org
---
fs/eventpoll.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 52954d4637b5..081df056398a 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1756,11 +1756,11 @@ static struct timespec64 *ep_timeout_to_timespec(struct timespec64 *to, long ms)
static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
unsigned int mode, int sync, void *key)
{
int ret = default_wake_function(wq_entry, mode, sync, key);
- list_del_init(&wq_entry->entry);
+ list_del_init_careful(&wq_entry->entry);
return ret;
}
/**
* ep_poll - Retrieves ready events, and delivers them to the caller-supplied
--
2.39.0.rc0.267.gcb52ba06e7-goog
The ICC_BWMON driver uses REGMAP_MMIO to access the hardware registers,
so select that dependency in Kconfig. Without it, building the driver
with COMPILE_TEST only fails with:
ERROR: modpost: "__devm_regmap_init_mmio_clk" [drivers/soc/qcom/icc-bwmon.ko] undefined!
make[1]: *** [scripts/Makefile.modpost:126: Module.symvers] Error 1
make: *** [Makefile:1944: modpost] Error 2
Cc: <stable(a)vger.kernel.org> # 6.0
Cc: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Fixes: b9c2ae6cac40 ("soc: qcom: icc-bwmon: Add bandwidth monitoring driver")
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
---
drivers/soc/qcom/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index 024e420f1bb7..75bfdb6f9705 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -236,6 +236,7 @@ config QCOM_ICC_BWMON
tristate "QCOM Interconnect Bandwidth Monitor driver"
depends on ARCH_QCOM || COMPILE_TEST
select PM_OPP
+ select REGMAP_MMIO
help
Sets up driver monitoring bandwidth on various interconnects and
based on that voting for interconnect bandwidth, adjusting their
--
2.25.1
commit 152fe65f300e1819d59b80477d3e0999b4d5d7d2 upstream.
When enabled, KASAN enlarges functions' stack frames, pushing quite a
few over the current threshold. This is mainly seen on 32-bit
architectures, where the present limit (when !GCC) is a lowly 1024 bytes.
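As a rough illustration (hypothetical function, not taken from any
patched file; use_buffers() is a made-up helper): generic KASAN
surrounds each stack variable with redzones, so a frame that fits
comfortably in 1024 bytes uninstrumented can blow past the limit once
instrumented.

static void example_frame(void)
{
        /* ~96 bytes of locals uninstrumented; with KASAN each array
         * is padded with redzones, multiplying the frame size. */
        char a[32], b[32], c[32];

        use_buffers(a, b, c);
}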
Link: https://lkml.kernel.org/r/20221125120750.3537134-3-lee@kernel.org
Signed-off-by: Lee Jones <lee(a)kernel.org>
Acked-by: Arnd Bergmann <arnd(a)arndb.de>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: David Airlie <airlied(a)gmail.com>
Cc: Harry Wentland <harry.wentland(a)amd.com>
Cc: Leo Li <sunpeng.li(a)amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: Nathan Chancellor <nathan(a)kernel.org>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: "Pan, Xinhui" <Xinhui.Pan(a)amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Tom Rix <trix(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
[Lee: Back-ported to linux-5.15.y]
Signed-off-by: Lee Jones <lee(a)kernel.org>
---
lib/Kconfig.debug | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 1699b21245586..cfbadc7c919d8 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -353,6 +353,7 @@ config FRAME_WARN
range 0 8192
default 2048 if GCC_PLUGIN_LATENT_ENTROPY
default 1536 if (!64BIT && (PARISC || XTENSA))
+ default 1280 if KASAN && !64BIT
default 1024 if (!64BIT && !PARISC)
default 2048 if 64BIT
help
--
2.39.0.rc0.267.gcb52ba06e7-goog
commit 152fe65f300e1819d59b80477d3e0999b4d5d7d2 upstream.
When enabled, KASAN enlarges functions' stack frames, pushing quite a
few over the current threshold. This is mainly seen on 32-bit
architectures, where the present limit (when !GCC) is a lowly 1024 bytes.
Link: https://lkml.kernel.org/r/20221125120750.3537134-3-lee@kernel.org
Signed-off-by: Lee Jones <lee(a)kernel.org>
Acked-by: Arnd Bergmann <arnd(a)arndb.de>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: David Airlie <airlied(a)gmail.com>
Cc: Harry Wentland <harry.wentland(a)amd.com>
Cc: Leo Li <sunpeng.li(a)amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: Nathan Chancellor <nathan(a)kernel.org>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: "Pan, Xinhui" <Xinhui.Pan(a)amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Tom Rix <trix(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
[Lee: Back-ported to linux-5.10.y]
Signed-off-by: Lee Jones <lee(a)kernel.org>
---
lib/Kconfig.debug | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ce796ca869c22..ea0628a0116d0 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -298,6 +298,7 @@ config FRAME_WARN
int "Warn for stack frames larger than"
range 0 8192
default 2048 if GCC_PLUGIN_LATENT_ENTROPY
+ default 1280 if KASAN && !64BIT
default 1280 if (!64BIT && PARISC)
default 1024 if (!64BIT && !PARISC)
default 2048 if 64BIT
--
2.39.0.rc0.267.gcb52ba06e7-goog
Fix two regressions introduced by the recent speculation mitigations
on the 4.14 branch:
- Crash on older 32-bit processors
- Build warning from objtool
Ben.
Ben Hutchings (1):
Revert "x86/speculation: Change FILL_RETURN_BUFFER to work with
objtool"
Peter Zijlstra (1):
x86/nospec: Fix i386 RSB stuffing
arch/x86/include/asm/nospec-branch.h | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)
This series fixes the following issues:
Patch 1:
This patch provides a fix to correctly report encapsulated LRO'ed
packets.
Patch 2:
This patch provides a fix to use the correct intrConf reference.
Changes in v2:
- declare generic descriptor to be used
- remove white spaces
- remove single quote around commit reference in patch 2
- remove if check for encap_lro
Ronak Doshi (2):
vmxnet3: correctly report encapsulated LRO packet
vmxnet3: use correct intrConf reference when using extended queues
drivers/net/vmxnet3/vmxnet3_drv.c | 27 +++++++++++++++++++++++----
1 file changed, 23 insertions(+), 4 deletions(-)
--
2.11.0
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b52be557e24c47286738276121177a41f54e3b83 Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Mon, 5 Dec 2022 17:59:27 +0100
Subject: [PATCH] ipc/sem: Fix dangling sem_array access in semtimedop race
When __do_semtimedop() goes to sleep because it has to wait for a
semaphore value becoming zero or becoming bigger than some threshold, it
links the on-stack sem_queue to the sem_array, then goes to sleep
without holding a reference on the sem_array.
When __do_semtimedop() comes back out of sleep, one of two things must
happen:
a) We prove that the on-stack sem_queue has been disconnected from the
(possibly freed) sem_array, making it safe to return from the stack
frame that the sem_queue exists in.
b) We stabilize our reference to the sem_array, lock the sem_array, and
detach the sem_queue from the sem_array ourselves.
sem_array has RCU lifetime, so for case (b), the reference can be
stabilized inside an RCU read-side critical section by locklessly
checking whether the sem_queue is still connected to the sem_array.
However, the current code does the lockless check on sem_queue before
starting an RCU read-side critical section, so the result of the
lockless check immediately becomes useless.
Fix it by doing rcu_read_lock() before the lockless check. Now RCU
ensures that if we observe the object being on our queue, the object
can't be freed until rcu_read_unlock().
This bug is only hittable on kernel builds with full preemption support
(either CONFIG_PREEMPT or PREEMPT_DYNAMIC with preempt=full).
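In generic form, the bug and the fix (an illustrative sketch condensed
from the diff below):

        /* Before: the lockless check ran outside any RCU read-side
         * critical section, so sma could be freed right after it. */
        error = READ_ONCE(queue.status);
        ...
        rcu_read_lock();

        /* After: enter the RCU section first; now observing the entry
         * still queued pins sma until rcu_read_unlock(). */
        rcu_read_lock();
        error = READ_ONCE(queue.status);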
Fixes: 370b262c896e ("ipc/sem: avoid idr tree lookup for interrupted semop")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/ipc/sem.c b/ipc/sem.c
index c8496f98b139..00f88aa01ac5 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2179,14 +2179,15 @@ long __do_semtimedop(int semid, struct sembuf *sops,
* scenarios where we were awakened externally, during the
* window between wake_q_add() and wake_up_q().
*/
+ rcu_read_lock();
error = READ_ONCE(queue.status);
if (error != -EINTR) {
/* see SEM_BARRIER_2 for purpose/pairing */
smp_acquire__after_ctrl_dep();
+ rcu_read_unlock();
goto out;
}
- rcu_read_lock();
locknum = sem_lock(sma, sops, nsops);
if (!ipc_valid_object(&sma->sem_perm))
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b52be557e24c47286738276121177a41f54e3b83 Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Mon, 5 Dec 2022 17:59:27 +0100
Subject: [PATCH] ipc/sem: Fix dangling sem_array access in semtimedop race
When __do_semtimedop() goes to sleep because it has to wait for a
semaphore value becoming zero or becoming bigger than some threshold, it
links the on-stack sem_queue to the sem_array, then goes to sleep
without holding a reference on the sem_array.
When __do_semtimedop() comes back out of sleep, one of two things must
happen:
a) We prove that the on-stack sem_queue has been disconnected from the
(possibly freed) sem_array, making it safe to return from the stack
frame that the sem_queue exists in.
b) We stabilize our reference to the sem_array, lock the sem_array, and
detach the sem_queue from the sem_array ourselves.
sem_array has RCU lifetime, so for case (b), the reference can be
stabilized inside an RCU read-side critical section by locklessly
checking whether the sem_queue is still connected to the sem_array.
However, the current code does the lockless check on sem_queue before
starting an RCU read-side critical section, so the result of the
lockless check immediately becomes useless.
Fix it by doing rcu_read_lock() before the lockless check. Now RCU
ensures that if we observe the object being on our queue, the object
can't be freed until rcu_read_unlock().
This bug is only hittable on kernel builds with full preemption support
(either CONFIG_PREEMPT or PREEMPT_DYNAMIC with preempt=full).
Fixes: 370b262c896e ("ipc/sem: avoid idr tree lookup for interrupted semop")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/ipc/sem.c b/ipc/sem.c
index c8496f98b139..00f88aa01ac5 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2179,14 +2179,15 @@ long __do_semtimedop(int semid, struct sembuf *sops,
* scenarios where we were awakened externally, during the
* window between wake_q_add() and wake_up_q().
*/
+ rcu_read_lock();
error = READ_ONCE(queue.status);
if (error != -EINTR) {
/* see SEM_BARRIER_2 for purpose/pairing */
smp_acquire__after_ctrl_dep();
+ rcu_read_unlock();
goto out;
}
- rcu_read_lock();
locknum = sem_lock(sma, sops, nsops);
if (!ipc_valid_object(&sma->sem_perm))
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b52be557e24c47286738276121177a41f54e3b83 Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Mon, 5 Dec 2022 17:59:27 +0100
Subject: [PATCH] ipc/sem: Fix dangling sem_array access in semtimedop race
When __do_semtimedop() goes to sleep because it has to wait for a
semaphore value becoming zero or becoming bigger than some threshold, it
links the on-stack sem_queue to the sem_array, then goes to sleep
without holding a reference on the sem_array.
When __do_semtimedop() comes back out of sleep, one of two things must
happen:
a) We prove that the on-stack sem_queue has been disconnected from the
(possibly freed) sem_array, making it safe to return from the stack
frame that the sem_queue exists in.
b) We stabilize our reference to the sem_array, lock the sem_array, and
detach the sem_queue from the sem_array ourselves.
sem_array has RCU lifetime, so for case (b), the reference can be
stabilized inside an RCU read-side critical section by locklessly
checking whether the sem_queue is still connected to the sem_array.
However, the current code does the lockless check on sem_queue before
starting an RCU read-side critical section, so the result of the
lockless check immediately becomes useless.
Fix it by doing rcu_read_lock() before the lockless check. Now RCU
ensures that if we observe the object being on our queue, the object
can't be freed until rcu_read_unlock().
This bug is only hittable on kernel builds with full preemption support
(either CONFIG_PREEMPT or PREEMPT_DYNAMIC with preempt=full).
Fixes: 370b262c896e ("ipc/sem: avoid idr tree lookup for interrupted semop")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/ipc/sem.c b/ipc/sem.c
index c8496f98b139..00f88aa01ac5 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2179,14 +2179,15 @@ long __do_semtimedop(int semid, struct sembuf *sops,
* scenarios where we were awakened externally, during the
* window between wake_q_add() and wake_up_q().
*/
+ rcu_read_lock();
error = READ_ONCE(queue.status);
if (error != -EINTR) {
/* see SEM_BARRIER_2 for purpose/pairing */
smp_acquire__after_ctrl_dep();
+ rcu_read_unlock();
goto out;
}
- rcu_read_lock();
locknum = sem_lock(sma, sops, nsops);
if (!ipc_valid_object(&sma->sem_perm))
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b52be557e24c47286738276121177a41f54e3b83 Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Mon, 5 Dec 2022 17:59:27 +0100
Subject: [PATCH] ipc/sem: Fix dangling sem_array access in semtimedop race
When __do_semtimedop() goes to sleep because it has to wait for a
semaphore value becoming zero or becoming bigger than some threshold, it
links the on-stack sem_queue to the sem_array, then goes to sleep
without holding a reference on the sem_array.
When __do_semtimedop() comes back out of sleep, one of two things must
happen:
a) We prove that the on-stack sem_queue has been disconnected from the
(possibly freed) sem_array, making it safe to return from the stack
frame that the sem_queue exists in.
b) We stabilize our reference to the sem_array, lock the sem_array, and
detach the sem_queue from the sem_array ourselves.
sem_array has RCU lifetime, so for case (b), the reference can be
stabilized inside an RCU read-side critical section by locklessly
checking whether the sem_queue is still connected to the sem_array.
However, the current code does the lockless check on sem_queue before
starting an RCU read-side critical section, so the result of the
lockless check immediately becomes useless.
Fix it by doing rcu_read_lock() before the lockless check. Now RCU
ensures that if we observe the object being on our queue, the object
can't be freed until rcu_read_unlock().
This bug is only hittable on kernel builds with full preemption support
(either CONFIG_PREEMPT or PREEMPT_DYNAMIC with preempt=full).
Fixes: 370b262c896e ("ipc/sem: avoid idr tree lookup for interrupted semop")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/ipc/sem.c b/ipc/sem.c
index c8496f98b139..00f88aa01ac5 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2179,14 +2179,15 @@ long __do_semtimedop(int semid, struct sembuf *sops,
* scenarios where we were awakened externally, during the
* window between wake_q_add() and wake_up_q().
*/
+ rcu_read_lock();
error = READ_ONCE(queue.status);
if (error != -EINTR) {
/* see SEM_BARRIER_2 for purpose/pairing */
smp_acquire__after_ctrl_dep();
+ rcu_read_unlock();
goto out;
}
- rcu_read_lock();
locknum = sem_lock(sma, sops, nsops);
if (!ipc_valid_object(&sma->sem_perm))
--
After calling napi_complete_done(), the NAPIF_STATE_SCHED bit may be
cleared, and another CPU can start the NAPI thread and access the
per-CQ variable cq->work_done. If the other thread (for example, from
busy_poll) sets it to a value >= budget, this thread will continue to
run when it should stop, causing memory corruption and a panic.
To fix this issue, save the per-CQ work_done variable in a local variable
before napi_complete_done(), so it won't be corrupted by a possible
concurrent thread after napi_complete_done().
Also, add a flag bit to advertise to the NIC firmware: the NAPI work_done
variable race is fixed, so the driver is able to reliably support features
like busy_poll.
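The race, as a condensed timeline (illustrative, not driver code):

/*
 * CPU0 (this poll)                        CPU1 (e.g. busy_poll)
 *
 * napi_complete_done(napi, cq->work_done);
 *   -> NAPIF_STATE_SCHED cleared
 *                                         reschedules NAPI, runs poll,
 *                                         sets cq->work_done >= budget
 * return min(cq->work_done, budget);
 *   -> returns CPU1's value, so this NAPI
 *      instance keeps running when it
 *      should have stopped
 *
 * Reading cq->work_done into a local before napi_complete_done()
 * closes the window.
 */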
Cc: stable(a)vger.kernel.org
Fixes: e1b5683ff62e ("net: mana: Move NAPI from EQ to CQ")
Signed-off-by: Haiyang Zhang <haiyangz(a)microsoft.com>
---
drivers/net/ethernet/microsoft/mana/gdma.h | 9 ++++++++-
drivers/net/ethernet/microsoft/mana/mana_en.c | 16 +++++++++++-----
2 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/gdma.h b/drivers/net/ethernet/microsoft/mana/gdma.h
index 4a6efe6ada08..65c24ee49efd 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma.h
+++ b/drivers/net/ethernet/microsoft/mana/gdma.h
@@ -498,7 +498,14 @@ enum {
#define GDMA_DRV_CAP_FLAG_1_EQ_SHARING_MULTI_VPORT BIT(0)
-#define GDMA_DRV_CAP_FLAGS1 GDMA_DRV_CAP_FLAG_1_EQ_SHARING_MULTI_VPORT
+/* Advertise to the NIC firmware: the NAPI work_done variable race is fixed,
+ * so the driver is able to reliably support features like busy_poll.
+ */
+#define GDMA_DRV_CAP_FLAG_1_NAPI_WKDONE_FIX BIT(2)
+
+#define GDMA_DRV_CAP_FLAGS1 \
+ (GDMA_DRV_CAP_FLAG_1_EQ_SHARING_MULTI_VPORT | \
+ GDMA_DRV_CAP_FLAG_1_NAPI_WKDONE_FIX)
#define GDMA_DRV_CAP_FLAGS2 0
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 9259a74eca40..27a0f3af8aab 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1303,10 +1303,11 @@ static void mana_poll_rx_cq(struct mana_cq *cq)
xdp_do_flush();
}
-static void mana_cq_handler(void *context, struct gdma_queue *gdma_queue)
+static int mana_cq_handler(void *context, struct gdma_queue *gdma_queue)
{
struct mana_cq *cq = context;
u8 arm_bit;
+ int w;
WARN_ON_ONCE(cq->gdma_cq != gdma_queue);
@@ -1315,26 +1316,31 @@ static void mana_cq_handler(void *context, struct gdma_queue *gdma_queue)
else
mana_poll_tx_cq(cq);
- if (cq->work_done < cq->budget &&
- napi_complete_done(&cq->napi, cq->work_done)) {
+ w = cq->work_done;
+
+ if (w < cq->budget &&
+ napi_complete_done(&cq->napi, w)) {
arm_bit = SET_ARM_BIT;
} else {
arm_bit = 0;
}
mana_gd_ring_cq(gdma_queue, arm_bit);
+
+ return w;
}
static int mana_poll(struct napi_struct *napi, int budget)
{
struct mana_cq *cq = container_of(napi, struct mana_cq, napi);
+ int w;
cq->work_done = 0;
cq->budget = budget;
- mana_cq_handler(cq, cq->gdma_cq);
+ w = mana_cq_handler(cq, cq->gdma_cq);
- return min(cq->work_done, budget);
+ return min(w, budget);
}
static void mana_schedule_napi(void *context, struct gdma_queue *gdma_queue)
--
2.25.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 97eea946b93961fffd29448dcda7398d0d51c4b2 ]
The bounds checks in snd_soc_put_volsw_sx() are only being applied to the
first channel, meaning it is possible to write out of bounds values to the
second channel in stereo controls. Add appropriate checks.
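Condensed view of the fix (illustrative; channel-0 validation happens
earlier in the function, as the text above notes):

        /* channel 1 value from userspace, previously written through
         * without bounds checking */
        val2 = (ucontrol->value.integer.value[1] + min) & mask;
        if (mc->platform_max && val2 > mc->platform_max)
                return -EINVAL;
        if (val2 > max)
                return -EINVAL;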
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220511134137.169575-2-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 4fda8c24be29..7129bb685b63 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -465,6 +465,12 @@ int snd_soc_put_volsw_sx(struct snd_kcontrol *kcontrol,
if (snd_soc_volsw_is_stereo(mc)) {
val_mask = mask << rshift;
val2 = (ucontrol->value.integer.value[1] + min) & mask;
+
+ if (mc->platform_max && val2 > mc->platform_max)
+ return -EINVAL;
+ if (val2 > max)
+ return -EINVAL;
+
val2 = val2 << rshift;
err = snd_soc_component_update_bits(component, reg2, val_mask,
--
2.35.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 97eea946b93961fffd29448dcda7398d0d51c4b2 ]
The bounds checks in snd_soc_put_volsw_sx() are only being applied to the
first channel, meaning it is possible to write out of bounds values to the
second channel in stereo controls. Add appropriate checks.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220511134137.169575-2-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 81c9ecfa7c7f..63c0e61b1754 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -465,6 +465,12 @@ int snd_soc_put_volsw_sx(struct snd_kcontrol *kcontrol,
if (snd_soc_volsw_is_stereo(mc)) {
val_mask = mask << rshift;
val2 = (ucontrol->value.integer.value[1] + min) & mask;
+
+ if (mc->platform_max && val2 > mc->platform_max)
+ return -EINVAL;
+ if (val2 > max)
+ return -EINVAL;
+
val2 = val2 << rshift;
err = snd_soc_component_update_bits(component, reg2, val_mask,
--
2.35.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 97eea946b93961fffd29448dcda7398d0d51c4b2 ]
The bounds checks in snd_soc_put_volsw_sx() are only being applied to the
first channel, meaning it is possible to write out of bounds values to the
second channel in stereo controls. Add appropriate checks.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220511134137.169575-2-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 453b61b42dd9..00e6a6e46fe5 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -460,6 +460,12 @@ int snd_soc_put_volsw_sx(struct snd_kcontrol *kcontrol,
if (snd_soc_volsw_is_stereo(mc)) {
val_mask = mask << rshift;
val2 = (ucontrol->value.integer.value[1] + min) & mask;
+
+ if (mc->platform_max && val2 > mc->platform_max)
+ return -EINVAL;
+ if (val2 > max)
+ return -EINVAL;
+
val2 = val2 << rshift;
err = snd_soc_component_update_bits(component, reg2, val_mask,
--
2.35.1
From: Shengjiu Wang <shengjiu.wang(a)nxp.com>
[ Upstream commit 292709b9cf3ba470af94b62c9bb60284cc581b79 ]
SRES is a self-clearing bit, but REG_MICFIL_CTRL1 is defined as a
non-volatile register, so SRES remains set in the regmap cache after it
is written; every later update of REG_MICFIL_CTRL1 then triggers a
software reset. To avoid this, clear the bit explicitly.
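An alternative would have been to stop regmap from caching the register
at all; a hedged sketch (the callback name is hypothetical, and keeping
the register cached may be preferable for other reasons):

static bool micfil_volatile_reg(struct device *dev, unsigned int reg)
{
        /* Never cache CTRL1, so the self-clearing SRES bit is always
         * read back from hardware as 0. */
        return reg == REG_MICFIL_CTRL1;
}

/* hooked up via .volatile_reg = micfil_volatile_reg in the driver's
 * struct regmap_config */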
Signed-off-by: Shengjiu Wang <shengjiu.wang(a)nxp.com>
Link: https://lore.kernel.org/r/1651925654-32060-1-git-send-email-shengjiu.wang@n…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/fsl/fsl_micfil.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c
index efc5daf53bba..ead4bfa13561 100644
--- a/sound/soc/fsl/fsl_micfil.c
+++ b/sound/soc/fsl/fsl_micfil.c
@@ -190,6 +190,17 @@ static int fsl_micfil_reset(struct device *dev)
return ret;
}
+ /*
+ * SRES is a self-clearing bit, but REG_MICFIL_CTRL1 is defined
+ * as a non-volatile register, so SRES remains set in the regmap
+ * cache after it is written; every later update of REG_MICFIL_CTRL1
+ * would then trigger a software reset. So clear it explicitly.
+ */
+ ret = regmap_clear_bits(micfil->regmap, REG_MICFIL_CTRL1,
+ MICFIL_CTRL1_SRES);
+ if (ret)
+ return ret;
+
return 0;
}
--
2.35.1
From: Shengjiu Wang <shengjiu.wang(a)nxp.com>
[ Upstream commit 292709b9cf3ba470af94b62c9bb60284cc581b79 ]
SRES is a self-clearing bit, but REG_MICFIL_CTRL1 is defined as a
non-volatile register, so SRES remains set in the regmap cache after it
is written; every later update of REG_MICFIL_CTRL1 then triggers a
software reset. To avoid this, clear the bit explicitly.
Signed-off-by: Shengjiu Wang <shengjiu.wang(a)nxp.com>
Link: https://lore.kernel.org/r/1651925654-32060-1-git-send-email-shengjiu.wang@n…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/fsl/fsl_micfil.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c
index 9f90989ac59a..cb84d95c3aac 100644
--- a/sound/soc/fsl/fsl_micfil.c
+++ b/sound/soc/fsl/fsl_micfil.c
@@ -191,6 +191,17 @@ static int fsl_micfil_reset(struct device *dev)
return ret;
}
+ /*
+ * SRES is a self-clearing bit, but REG_MICFIL_CTRL1 is defined
+ * as a non-volatile register, so SRES remains set in the regmap
+ * cache after it is written; every later update of REG_MICFIL_CTRL1
+ * would then trigger a software reset. So clear it explicitly.
+ */
+ ret = regmap_clear_bits(micfil->regmap, REG_MICFIL_CTRL1,
+ MICFIL_CTRL1_SRES);
+ if (ret)
+ return ret;
+
return 0;
}
--
2.35.1
[commit 6647e76ab623b2b3fb2efe03a86e9c9046c52c33 upstream]
The V4L2_MEMORY_USERPTR interface is long deprecated and shouldn't be
used (and is discouraged for any modern v4l drivers). And Seth Jenkins
points out that the fallback to VM_PFNMAP/VM_IO is fundamentally racy
and dangerous.
Note that it's not even a case that should trigger, since any normal
user pointer logic ends up just using the pin_user_pages_fast() call
that does the proper page reference counting. That's not the problem
case, only if you try to use special device mappings do you have any
issues.
Normally I'd just remove this during the merge window, but since Seth
pointed out the problem cases, we really want to know as soon as
possible if there are actually any users of this odd special case of a
legacy interface. Neither Hans nor Mauro seem to think that such
mis-uses of the old legacy interface should exist. As Mauro says:
"See, V4L2 has actually 4 streaming APIs:
- Kernel-allocated mmap (usually referred simply as just mmap);
- USERPTR mmap;
- read();
- dmabuf;
The USERPTR is one of the oldest ways to use it, coming from V4L
version 1 times, and by far the least used one"
And Hans chimed in on the USERPTR interface:
"To be honest, I wouldn't mind if it goes away completely, but that's a
bit of a pipe dream right now"
but while removing this legacy interface entirely may be a pipe dream we
can at least try to remove the unlikely (and actively broken) case of
using special device mappings for USERPTR accesses.
This replaces it with a WARN_ONCE() that we can remove once we've
hopefully confirmed that no actual users exist.
NOTE! Longer term, this means that a 'struct frame_vector' only ever
contains proper page pointers, and all the games we have with converting
them to pages can go away (grep for 'frame_vector_to_pages()' and the
uses of 'vec->is_pfns'). But this is just the first step, to verify
that this code really is all dead, and do so as quickly as possible.
Reported-by: Seth Jenkins <sethjenkins(a)google.com>
Acked-by: Hans Verkuil <hverkuil(a)xs4all.nl>
Acked-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jan Kara <jack(a)suse.cz>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
CC: stable(a)vger.kernel.org # 4.9
Signed-off-by: Sergey Senozhatsky <senozhatsky(a)chromium.org>
---
mm/frame_vector.c | 31 ++++++-------------------------
1 file changed, 6 insertions(+), 25 deletions(-)
diff --git a/mm/frame_vector.c b/mm/frame_vector.c
index d73eed0443f6..aa5526e62c5e 100644
--- a/mm/frame_vector.c
+++ b/mm/frame_vector.c
@@ -36,7 +36,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
int ret = 0;
- int err;
int locked;
if (nr_frames == 0)
@@ -71,32 +70,14 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
vec->is_pfns = false;
ret = get_user_pages_locked(start, nr_frames,
gup_flags, (struct page **)(vec->ptrs), &locked);
- goto out;
+ if (likely(ret > 0))
+ goto out;
}
- vec->got_ref = false;
- vec->is_pfns = true;
- do {
- unsigned long *nums = frame_vector_pfns(vec);
-
- while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
- err = follow_pfn(vma, start, &nums[ret]);
- if (err) {
- if (ret == 0)
- ret = err;
- goto out;
- }
- start += PAGE_SIZE;
- ret++;
- }
- /*
- * We stop if we have enough pages or if VMA doesn't completely
- * cover the tail page.
- */
- if (ret >= nr_frames || start < vma->vm_end)
- break;
- vma = find_vma_intersection(mm, start, start + 1);
- } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP));
+ /* This used to (racily) return non-refcounted pfns. Let people know */
+ WARN_ONCE(1, "get_vaddr_frames() cannot follow VM_IO mapping");
+ vec->nr_frames = 0;
+
out:
if (locked)
up_read(&mm->mmap_sem);
--
2.39.0.rc0.267.gcb52ba06e7-goog
Since commit 0f0101719138 ("usb: dwc3: Don't switch OTG -> peripheral
if extcon is present"), Dual Role support on the Intel Merrifield
platform has been broken by the rearranged call to dwc3_get_extcon().
This appears to be caused by ulpi_read_id() masking the timeout on the
first test write. In the past, dwc3 probe continued by calling
dwc3_core_soft_reset() followed by dwc3_get_extcon(), which happened to
return -EPROBE_DEFER. On deferred probe, ulpi_read_id() finally
succeeded. Due to the above-mentioned rearranging, -EPROBE_DEFER is no
longer returned and probe completes without a phy.
On Intel Merrifield the timeout on the first test write is reproducible,
but the root cause is difficult to find. Using a mainline kernel and a
buildroot rootfs, ulpi_read_id() succeeds. As soon as ftrace / bootconfig
is added to find out why, ulpi_read_id() fails and the flow cannot be
analyzed. Using another rootfs, ulpi_read_id() fails even without
ftrace. We suspect some kind of timing issue / race, but merely retrying
ulpi_read_id() does not resolve it.
As ulpi_read_id() now returns -ETIMEDOUT in this case, handle the error
by calling dwc3_core_soft_reset() and requesting -EPROBE_DEFER. On
deferred probe, ulpi_read_id() is retried and succeeds.
Fixes: ef6a7bcfb01c ("usb: ulpi: Support device discovery via DT")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ferry Toth <ftoth(a)exalondelft.nl>
---
drivers/usb/dwc3/core.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 648f1c570021..2779f17bffaf 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1106,8 +1106,13 @@ static int dwc3_core_init(struct dwc3 *dwc)
if (!dwc->ulpi_ready) {
ret = dwc3_core_ulpi_init(dwc);
- if (ret)
+ if (ret) {
+ if (ret == -ETIMEDOUT) {
+ dwc3_core_soft_reset(dwc);
+ ret = -EPROBE_DEFER;
+ }
goto err0;
+ }
dwc->ulpi_ready = true;
}
--
2.37.2
The quilt patch titled
Subject: mm/mempolicy: failed to disable numa balancing
has been removed from the -mm tree. Its filename was
mm-mempolicy-failed-to-disable-numa-balancing.patch
This patch was dropped because it was nacked
------------------------------------------------------
From: tzm <tcm1030(a)163.com>
Subject: mm/mempolicy: failed to disable numa balancing
Date: Fri, 2 Dec 2022 22:16:30 +0800
The kernel fails to permanently disable the NUMA balancing policy when
the user passes numa_balancing=disable on the boot command line. The
numabalancing_override variable is 1 for enable and -1 for disable, so
!numabalancing_override will always be true, which causes this bug.
Link: https://lkml.kernel.org/r/20221202141630.41220-1-tcm1030@163.com
Signed-off-by: tzm <tcm1030(a)163.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mempolicy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/mempolicy.c~mm-mempolicy-failed-to-disable-numa-balancing
+++ a/mm/mempolicy.c
@@ -2865,7 +2865,7 @@ static void __init check_numabalancing_e
if (numabalancing_override)
set_numabalancing_state(numabalancing_override == 1);
- if (num_online_nodes() > 1 && !numabalancing_override) {
+ if (num_online_nodes() > 1 && (numabalancing_override == 1)) {
pr_info("%s automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl\n",
numabalancing_default ? "Enabling" : "Disabling");
set_numabalancing_state(numabalancing_default);
_
Patches currently in -mm which might be from tcm1030(a)163.com are
The following commit has been merged into the x86/microcode branch of tip:
Commit-ID: be1b670f61443aa5d0d01782e9b8ea0ee825d018
Gitweb: https://git.kernel.org/tip/be1b670f61443aa5d0d01782e9b8ea0ee825d018
Author: Ashok Raj <ashok.raj(a)intel.com>
AuthorDate: Tue, 29 Nov 2022 13:08:27 -08:00
Committer: Borislav Petkov (AMD) <bp(a)alien8.de>
CommitterDate: Mon, 05 Dec 2022 21:22:21 +01:00
x86/microcode/intel: Do not retry microcode reloading on the APs
The retries in load_ucode_intel_ap() were in place to support systems
with mixed steppings. Mixed steppings are no longer supported and there is
only one microcode image at a time. Any retries will simply reattempt to
apply the same image over and over without making progress.
[ bp: Zap the circumstantial reasoning from the commit message. ]
Fixes: 06b8534cb728 ("x86/microcode: Rework microcode loading")
Signed-off-by: Ashok Raj <ashok.raj(a)intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20221129210832.107850-3-ashok.raj@intel.com
---
arch/x86/kernel/cpu/microcode/intel.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 4f93875..2dba4b5 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -495,7 +495,6 @@ void load_ucode_intel_ap(void)
else
iup = &intel_ucode_patch;
-reget:
if (!*iup) {
patch = __load_ucode_intel(&uci);
if (!patch)
@@ -506,12 +505,7 @@ reget:
uci.mc = *iup;
- if (apply_microcode_early(&uci, true)) {
- /* Mixed-silicon system? Try to refetch the proper patch: */
- *iup = NULL;
-
- goto reget;
- }
+ apply_microcode_early(&uci, true);
}
static struct microcode_intel *find_patch(struct ucode_cpu_info *uci)
The patch below does not apply to the 6.0-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
04ada095dcfc ("hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing")
21b85b09527c ("madvise: use zap_page_range_single for madvise dontneed")
ecfbd733878d ("hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer")
131a79b474e9 ("hugetlb: fix vma lock handling during split vma and range unmapping")
40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
378397ccb8e5 ("hugetlb: create hugetlb_unmap_file_folio to unmap single file folio")
8d9bfb260814 ("hugetlb: add vma based lock for pmd sharing")
12710fd69634 ("hugetlb: rename vma_shareable() and refactor code")
c86272287bc6 ("hugetlb: create remove_inode_single_folio to remove single file folio")
7e1813d48dd3 ("hugetlb: rename remove_huge_page to hugetlb_delete_from_page_cache")
3a47c54f09c4 ("hugetlbfs: revert use i_mmap_rwsem for more pmd sharing synchronization")
188a39725ad7 ("hugetlbfs: revert use i_mmap_rwsem to address page fault/truncate race")
763ecb035029 ("mm: remove the vma linked list")
8220543df148 ("nommu: remove uses of VMA linked list")
11f9a21ab655 ("mm/mmap: reorganize munmap to use maple states")
e99668a56430 ("mm/mmap: move mmap_region() below do_munmap()")
7964cf8caa4d ("mm: remove vmacache")
4dd1b84140c1 ("mm/mmap: use advanced maple tree API for mmap_region()")
abdba2dda0c4 ("mm: use maple tree operations for find_vma_intersection()")
2e7ce7d354f2 ("mm/mmap: change do_brk_flags() to expand existing VMA and add do_brk_munmap()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 04ada095dcfc4ae359418053c0be94453bdf1e84 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Date: Mon, 14 Nov 2022 15:55:06 -0800
Subject: [PATCH] hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED
processing
madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear page
tables associated with the address range. For hugetlb vmas,
zap_page_range will call __unmap_hugepage_range_final. However,
__unmap_hugepage_range_final assumes the passed vma is about to be removed
and deletes the vma_lock to prevent pmd sharing as the vma is on the way
out. In the case of madvise(MADV_DONTNEED) the vma remains, but the
missing vma_lock prevents pmd sharing and could potentially lead to issues
with truncation/fault races.
This issue was originally reported here [1] as a BUG triggered in
page_try_dup_anon_rmap. Prior to the introduction of the hugetlb
vma_lock, __unmap_hugepage_range_final cleared the VM_MAYSHARE flag to
prevent pmd sharing. Subsequent faults on this vma were confused as
VM_MAYSHARE indicates a sharable vma, but was not set so page_mapping was
not set in new pages added to the page table. This resulted in pages that
appeared anonymous in a VM_SHARED vma and triggered the BUG.
Address issue by adding a new zap flag ZAP_FLAG_UNMAP to indicate an unmap
call from unmap_vmas(). This is used to indicate the 'final' unmapping of
a hugetlb vma. When called via MADV_DONTNEED, this flag is not set and
the vma_lock is not deleted.
[1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6J…
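Condensed, the resulting control flow in __unmap_hugepage_range_final()
is (a sketch distilled from the hunks below, not the complete function):

        if (zap_flags & ZAP_FLAG_UNMAP) {
                /* final unmap from unmap_vmas(): the vma is going away,
                 * so free the vma_lock to make pmd sharing impossible */
                __hugetlb_vma_unlock_write_free(vma);
                i_mmap_unlock_write(vma->vm_file->f_mapping);
        } else {
                /* MADV_DONTNEED: the vma stays, keep its vma_lock */
                i_mmap_unlock_write(vma->vm_file->f_mapping);
                hugetlb_vma_unlock_write(vma);
        }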
Link: https://lkml.kernel.org/r/20221114235507.294320-3-mike.kravetz@oracle.com
Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Wei Chen <harperchen1110(a)gmail.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/mm.h b/include/linux/mm.h
index cbfb489d381c..974ccca609d2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1868,6 +1868,8 @@ struct zap_details {
* default, the flag is not set.
*/
#define ZAP_FLAG_DROP_MARKER ((__force zap_flags_t) BIT(0))
+/* Set in unmap_vmas() to indicate a final unmap call. Only used by hugetlb */
+#define ZAP_FLAG_UNMAP ((__force zap_flags_t) BIT(1))
#ifdef CONFIG_MMU
extern bool can_do_mlock(void);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f1385c3b6c96..e36ca75311a5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5206,17 +5206,22 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb,
__unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags);
- /*
- * Unlock and free the vma lock before releasing i_mmap_rwsem. When
- * the vma_lock is freed, this makes the vma ineligible for pmd
- * sharing. And, i_mmap_rwsem is required to set up pmd sharing.
- * This is important as page tables for this unmapped range will
- * be asynchrously deleted. If the page tables are shared, there
- * will be issues when accessed by someone else.
- */
- __hugetlb_vma_unlock_write_free(vma);
-
- i_mmap_unlock_write(vma->vm_file->f_mapping);
+ if (zap_flags & ZAP_FLAG_UNMAP) { /* final unmap */
+ /*
+ * Unlock and free the vma lock before releasing i_mmap_rwsem.
+ * When the vma_lock is freed, this makes the vma ineligible
+ * for pmd sharing. And, i_mmap_rwsem is required to set up
+ * pmd sharing. This is important as page tables for this
+ * unmapped range will be asynchrously deleted. If the page
+ * tables are shared, there will be issues when accessed by
+ * someone else.
+ */
+ __hugetlb_vma_unlock_write_free(vma);
+ i_mmap_unlock_write(vma->vm_file->f_mapping);
+ } else {
+ i_mmap_unlock_write(vma->vm_file->f_mapping);
+ hugetlb_vma_unlock_write(vma);
+ }
}
void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
diff --git a/mm/memory.c b/mm/memory.c
index 9bc5edc35725..8c8420934d60 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1711,7 +1711,7 @@ void unmap_vmas(struct mmu_gather *tlb, struct maple_tree *mt,
{
struct mmu_notifier_range range;
struct zap_details details = {
- .zap_flags = ZAP_FLAG_DROP_MARKER,
+ .zap_flags = ZAP_FLAG_DROP_MARKER | ZAP_FLAG_UNMAP,
/* Careful - we need to zap private pages too! */
.even_cows = true,
};
The patch below does not apply to the 6.0-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
21b85b09527c ("madvise: use zap_page_range_single for madvise dontneed")
763ecb035029 ("mm: remove the vma linked list")
8220543df148 ("nommu: remove uses of VMA linked list")
11f9a21ab655 ("mm/mmap: reorganize munmap to use maple states")
e99668a56430 ("mm/mmap: move mmap_region() below do_munmap()")
7964cf8caa4d ("mm: remove vmacache")
4dd1b84140c1 ("mm/mmap: use advanced maple tree API for mmap_region()")
abdba2dda0c4 ("mm: use maple tree operations for find_vma_intersection()")
2e7ce7d354f2 ("mm/mmap: change do_brk_flags() to expand existing VMA and add do_brk_munmap()")
3b0e81a1cdc9 ("mmap: change zeroing of maple tree in __vma_adjust()")
524e00b36e8c ("mm: remove rb tree.")
c9dbe82cb99d ("kernel/fork: use maple tree for dup_mmap() during forking")
3499a13168da ("mm/mmap: use maple tree for unmapped_area{_topdown}")
be8432e7166e ("mm/mmap: use the maple tree in find_vma() instead of the rbtree.")
2e3af1db1744 ("mmap: use the VMA iterator in count_vma_pages_range()")
f39af05949a4 ("mm: add VMA iterator")
d4af56c5c7c6 ("mm: start tracking VMAs with maple tree")
54a611b60590 ("Maple Tree: add new data structure")
bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
018ee47f1489 ("mm: multi-gen LRU: exploit locality in rmap")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 21b85b09527c28e242db55c1b751f7f7549b830c Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Date: Mon, 14 Nov 2022 15:55:05 -0800
Subject: [PATCH] madvise: use zap_page_range_single for madvise dontneed
This series addresses the issue first reported in [1], and fully described
in patch 2. Patches 1 and 2 address the user visible issue and are tagged
for stable backports.
While exploring solutions to this issue, related problems with mmu
notification calls were discovered. This is addressed in the patch
"hugetlb: remove duplicate mmu notifications:". Since there are no user
visible effects, this third is not tagged for stable backports.
Previous discussions suggested further cleanup by removing the
routine zap_page_range. This is possible because zap_page_range_single
is now exported, and all callers of zap_page_range pass ranges entirely
within a single vma. This work will be done in a later patch so as not
to distract from this bug fix.
[1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6J…
This patch (of 2):
Expose the routine zap_page_range_single to zap a range within a single
vma. The madvise routine madvise_dontneed_single_vma can use this routine
as it explicitly operates on a single vma. Also, update the mmu
notification range in zap_page_range_single to take hugetlb pmd sharing
into account. This is required as MADV_DONTNEED supports hugetlb vmas.
Link: https://lkml.kernel.org/r/20221114235507.294320-1-mike.kravetz@oracle.com
Link: https://lkml.kernel.org/r/20221114235507.294320-2-mike.kravetz@oracle.com
Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Wei Chen <harperchen1110(a)gmail.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8bbcccbc5565..cbfb489d381c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1852,6 +1852,23 @@ static void __maybe_unused show_free_areas(unsigned int flags, nodemask_t *nodem
__show_free_areas(flags, nodemask, MAX_NR_ZONES - 1);
}
+/*
+ * Parameter block passed down to zap_pte_range in exceptional cases.
+ */
+struct zap_details {
+ struct folio *single_folio; /* Locked folio to be unmapped */
+ bool even_cows; /* Zap COWed private pages too? */
+ zap_flags_t zap_flags; /* Extra flags for zapping */
+};
+
+/*
+ * Whether to drop the pte markers, for example, the uffd-wp information for
+ * file-backed memory. This should only be specified when we will completely
+ * drop the page in the mm, either by truncation or unmapping of the vma. By
+ * default, the flag is not set.
+ */
+#define ZAP_FLAG_DROP_MARKER ((__force zap_flags_t) BIT(0))
+
#ifdef CONFIG_MMU
extern bool can_do_mlock(void);
#else
@@ -1869,6 +1886,8 @@ void zap_vma_ptes(struct vm_area_struct *vma, unsigned long address,
unsigned long size);
void zap_page_range(struct vm_area_struct *vma, unsigned long address,
unsigned long size);
+void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
+ unsigned long size, struct zap_details *details);
void unmap_vmas(struct mmu_gather *tlb, struct maple_tree *mt,
struct vm_area_struct *start_vma, unsigned long start,
unsigned long end);
@@ -3467,12 +3486,4 @@ madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
}
#endif
-/*
- * Whether to drop the pte markers, for example, the uffd-wp information for
- * file-backed memory. This should only be specified when we will completely
- * drop the page in the mm, either by truncation or unmapping of the vma. By
- * default, the flag is not set.
- */
-#define ZAP_FLAG_DROP_MARKER ((__force zap_flags_t) BIT(0))
-
#endif /* _LINUX_MM_H */
diff --git a/mm/madvise.c b/mm/madvise.c
index c7105ec6d08c..b913ba6efc10 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -772,8 +772,8 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
* Application no longer needs these pages. If the pages are dirty,
* it's OK to just throw them away. The app will be more careful about
* data it wants to keep. Be sure to free swap resources too. The
- * zap_page_range call sets things up for shrink_active_list to actually free
- * these pages later if no one else has touched them in the meantime,
+ * zap_page_range_single call sets things up for shrink_active_list to actually
+ * free these pages later if no one else has touched them in the meantime,
* although we could add these pages to a global reuse list for
* shrink_active_list to pick up before reclaiming other pages.
*
@@ -790,7 +790,7 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
unsigned long start, unsigned long end)
{
- zap_page_range(vma, start, end - start);
+ zap_page_range_single(vma, start, end - start, NULL);
return 0;
}
diff --git a/mm/memory.c b/mm/memory.c
index 8a6d5c823f91..9bc5edc35725 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1341,15 +1341,6 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
return ret;
}
-/*
- * Parameter block passed down to zap_pte_range in exceptional cases.
- */
-struct zap_details {
- struct folio *single_folio; /* Locked folio to be unmapped */
- bool even_cows; /* Zap COWed private pages too? */
- zap_flags_t zap_flags; /* Extra flags for zapping */
-};
-
/* Whether we should zap all COWed (private) pages too */
static inline bool should_zap_cows(struct zap_details *details)
{
@@ -1774,19 +1765,27 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start,
*
* The range must fit into one VMA.
*/
-static void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
+void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
unsigned long size, struct zap_details *details)
{
+ const unsigned long end = address + size;
struct mmu_notifier_range range;
struct mmu_gather tlb;
lru_add_drain();
mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
- address, address + size);
+ address, end);
+ if (is_vm_hugetlb_page(vma))
+ adjust_range_if_pmd_sharing_possible(vma, &range.start,
+ &range.end);
tlb_gather_mmu(&tlb, vma->vm_mm);
update_hiwater_rss(vma->vm_mm);
mmu_notifier_invalidate_range_start(&range);
- unmap_single_vma(&tlb, vma, address, range.end, details);
+ /*
+ * unmap 'address-end' not 'range.start-range.end' as range
+ * could have been expanded for hugetlb pmd sharing.
+ */
+ unmap_single_vma(&tlb, vma, address, end, details);
mmu_notifier_invalidate_range_end(&range);
tlb_finish_mmu(&tlb);
}
v5.11 changed the blkdev lookup mechanism completely in commit
22ae8ce8b892 ("block: simplify bdev/disk lookup in blkdev_get"),
and a small part of that change is to unhash the part bdev inode when
deleting a partition. It turns out this kind of change also fixes one
nasty issue in case of BLOCK_EXT_MAJOR:
1) when one partition is deleted & closed, disk_put_part() is always
called before bdput(bdev), see blkdev_put(); so the part's devt can
be freed & re-used before the inode is dropped
2) a new partition with the same devt can then be created just before
the inode in 1) is dropped; the old inode/bdev structure in 1) is
then re-used for this new partition, which causes a use-after-free
and kernel panic.
It isn't possible to backport the whole big patchset of "merge struct
block_device and struct hd_struct v4" for addressing this issue.
https://lore.kernel.org/linux-block/20201128161510.347752-1-hch@lst.de/
So fix it by unhashing the part bdev inode in delete_partition(); this
way is actually aligned with v5.11+'s behavior.
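For clarity, a sketch of the race (reconstructed from the description
above; the timeline is illustrative, not literal code):
/*
 * task A: delete & close partition     task B: create new partition
 * --------------------------------     ----------------------------
 * blkdev_put()
 *   disk_put_part()  -> devt freed
 *                                      same devt re-allocated
 *                                      bdget() finds the stale bdev
 *   bdput()          -> inode dropped    inode still in the hash
 *                                      -> use-after-free / panic
 *
 * Unhashing the bdev inode in delete_partition() makes the later
 * lookup miss the stale inode, matching the v5.11+ behaviour.
 */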
Reported-by: Shiwei Cui <cuishw(a)inspur.com>
Tested-by: Shiwei Cui <cuishw(a)inspur.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Jan Kara <jack(a)suse.cz>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
---
V2:
- fix one typo and Shiwei's email format
block/partitions/core.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/block/partitions/core.c b/block/partitions/core.c
index a02e22411594..e3d61ec4a5a6 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -329,6 +329,7 @@ void delete_partition(struct hd_struct *part)
struct gendisk *disk = part_to_disk(part);
struct disk_part_tbl *ptbl =
rcu_dereference_protected(disk->part_tbl, 1);
+ struct block_device *bdev;
/*
* ->part_tbl is referenced in this part's release handler, so
@@ -346,6 +347,12 @@ void delete_partition(struct hd_struct *part)
* "in-use" until we really free the gendisk.
*/
blk_invalidate_devt(part_devt(part));
+
+ bdev = bdget_part(part);
+ if (bdev) {
+ remove_inode_hash(bdev->bd_inode);
+ bdput(bdev);
+ }
percpu_ref_kill(&part->ref);
}
--
2.37.3
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
c981cdfb9925 ("mmc: sdhci: Fix voltage switch delay")
fa0910107a9f ("mmc: sdhci: use FIELD_GET for preset value bit masks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c981cdfb9925f64a364f13c2b4f98f877308a408 Mon Sep 17 00:00:00 2001
From: Adrian Hunter <adrian.hunter(a)intel.com>
Date: Mon, 28 Nov 2022 15:32:56 +0200
Subject: [PATCH] mmc: sdhci: Fix voltage switch delay
Commit 20b92a30b561 ("mmc: sdhci: update signal voltage switch code")
removed voltage switch delays from sdhci because mmc core had been
enhanced to support them. However that assumed that sdhci_set_ios()
did a single clock change, which it did not, and so the delays in mmc
core, which should have come after the first clock change, were not
effective.
Fix by avoiding re-configuring UHS and preset settings when the clock
is turning on and the settings have not changed. That then also avoids
the associated clock changes, so that then sdhci_set_ios() does a single
clock change when voltage switching, and the mmc core delays become
effective.
To do that has meant keeping track of driver strength (host->drv_type),
and cases of reinitialization (host->reinit_uhs).
Note also, the 'turning_on_clk' restriction should not be necessary
but is done to minimize the impact of the change on stable kernels.
Fixes: 20b92a30b561 ("mmc: sdhci: update signal voltage switch code")
Cc: stable(a)vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com>
Link: https://lore.kernel.org/r/20221128133259.38305-2-adrian.hunter@intel.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index fef03de85b99..c7ad32a75b57 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -373,6 +373,7 @@ static void sdhci_init(struct sdhci_host *host, int soft)
if (soft) {
/* force clock reconfiguration */
host->clock = 0;
+ host->reinit_uhs = true;
mmc->ops->set_ios(mmc, &mmc->ios);
}
}
@@ -2293,11 +2294,46 @@ void sdhci_set_uhs_signaling(struct sdhci_host *host, unsigned timing)
}
EXPORT_SYMBOL_GPL(sdhci_set_uhs_signaling);
+static bool sdhci_timing_has_preset(unsigned char timing)
+{
+ switch (timing) {
+ case MMC_TIMING_UHS_SDR12:
+ case MMC_TIMING_UHS_SDR25:
+ case MMC_TIMING_UHS_SDR50:
+ case MMC_TIMING_UHS_SDR104:
+ case MMC_TIMING_UHS_DDR50:
+ case MMC_TIMING_MMC_DDR52:
+ return true;
+ };
+ return false;
+}
+
+static bool sdhci_preset_needed(struct sdhci_host *host, unsigned char timing)
+{
+ return !(host->quirks2 & SDHCI_QUIRK2_PRESET_VALUE_BROKEN) &&
+ sdhci_timing_has_preset(timing);
+}
+
+static bool sdhci_presetable_values_change(struct sdhci_host *host, struct mmc_ios *ios)
+{
+ /*
+ * Preset Values are: Driver Strength, Clock Generator and SDCLK/RCLK
+ * Frequency. Check if preset values need to be enabled, or the Driver
+ * Strength needs updating. Note, clock changes are handled separately.
+ */
+ return !host->preset_enabled &&
+ (sdhci_preset_needed(host, ios->timing) || host->drv_type != ios->drv_type);
+}
+
void sdhci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
{
struct sdhci_host *host = mmc_priv(mmc);
+ bool reinit_uhs = host->reinit_uhs;
+ bool turning_on_clk = false;
u8 ctrl;
+ host->reinit_uhs = false;
+
if (ios->power_mode == MMC_POWER_UNDEFINED)
return;
@@ -2323,6 +2359,8 @@ void sdhci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
sdhci_enable_preset_value(host, false);
if (!ios->clock || ios->clock != host->clock) {
+ turning_on_clk = ios->clock && !host->clock;
+
host->ops->set_clock(host, ios->clock);
host->clock = ios->clock;
@@ -2349,6 +2387,17 @@ void sdhci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
host->ops->set_bus_width(host, ios->bus_width);
+ /*
+ * Special case to avoid multiple clock changes during voltage
+ * switching.
+ */
+ if (!reinit_uhs &&
+ turning_on_clk &&
+ host->timing == ios->timing &&
+ host->version >= SDHCI_SPEC_300 &&
+ !sdhci_presetable_values_change(host, ios))
+ return;
+
ctrl = sdhci_readb(host, SDHCI_HOST_CONTROL);
if (!(host->quirks & SDHCI_QUIRK_NO_HISPD_BIT)) {
@@ -2392,6 +2441,7 @@ void sdhci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
}
sdhci_writew(host, ctrl_2, SDHCI_HOST_CONTROL2);
+ host->drv_type = ios->drv_type;
} else {
/*
* According to SDHC Spec v3.00, if the Preset Value
@@ -2419,19 +2469,14 @@ void sdhci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
host->ops->set_uhs_signaling(host, ios->timing);
host->timing = ios->timing;
- if (!(host->quirks2 & SDHCI_QUIRK2_PRESET_VALUE_BROKEN) &&
- ((ios->timing == MMC_TIMING_UHS_SDR12) ||
- (ios->timing == MMC_TIMING_UHS_SDR25) ||
- (ios->timing == MMC_TIMING_UHS_SDR50) ||
- (ios->timing == MMC_TIMING_UHS_SDR104) ||
- (ios->timing == MMC_TIMING_UHS_DDR50) ||
- (ios->timing == MMC_TIMING_MMC_DDR52))) {
+ if (sdhci_preset_needed(host, ios->timing)) {
u16 preset;
sdhci_enable_preset_value(host, true);
preset = sdhci_get_preset_value(host);
ios->drv_type = FIELD_GET(SDHCI_PRESET_DRV_MASK,
preset);
+ host->drv_type = ios->drv_type;
}
/* Re-enable SD Clock */
@@ -3768,6 +3813,7 @@ int sdhci_resume_host(struct sdhci_host *host)
sdhci_init(host, 0);
host->pwr = 0;
host->clock = 0;
+ host->reinit_uhs = true;
mmc->ops->set_ios(mmc, &mmc->ios);
} else {
sdhci_init(host, (mmc->pm_flags & MMC_PM_KEEP_POWER));
@@ -3830,6 +3876,7 @@ int sdhci_runtime_resume_host(struct sdhci_host *host, int soft_reset)
/* Force clock and power re-program */
host->pwr = 0;
host->clock = 0;
+ host->reinit_uhs = true;
mmc->ops->start_signal_voltage_switch(mmc, &mmc->ios);
mmc->ops->set_ios(mmc, &mmc->ios);
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index d750c464bd1e..87a3aaa07438 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -524,6 +524,8 @@ struct sdhci_host {
unsigned int clock; /* Current clock (MHz) */
u8 pwr; /* Current voltage */
+ u8 drv_type; /* Current UHS-I driver type */
+ bool reinit_uhs; /* Force UHS-related re-initialization */
bool runtime_suspended; /* Host is runtime suspended */
bool bus_on; /* Bus power prevents runtime suspend */
This bug is marked as fixed by commit:
net: core: netlink: add helper refcount dec and lock function
net: sched: add helper function to take reference to Qdisc
net: sched: extend Qdisc with rcu
net: sched: rename qdisc_destroy() to qdisc_put()
net: sched: use Qdisc rcu API instead of relying on rtnl lock
But I can't find it in any tested tree for more than 90 days.
Is it a correct commit? Please update it by replying:
#syz fix: exact-commit-title
Until then the bug is still considered open and
new crashes with the same signature are ignored.
Additionally, remove it from .ndo_stop().
This ensures that the worker is not called after being freed, and that
the UART TX queue remains active to send final commands when the netdev
is stopped.
Thanks to Jiri Slaby for finding this in slcan:
https://lore.kernel.org/linux-can/20221201073426.17328-1-jirislaby@kernel.o…
A variant of this patch for slcan, with the flush in .ndo_stop() still
present, has been tested successfully on physical hardware:
https://bugzilla.suse.com/show_bug.cgi?id=1205597
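For reference, the teardown ordering after this patch (summarised from
the diff below; comments only, not literal code):
/*
 * can327_ldisc_close()
 *   unregister_candev()          -> .ndo_stop(): stop the queue and
 *                                   queue final stop commands, but do
 *                                   NOT flush the UART TX work here
 *   flush_work(&elm->tx_work)    -> one final chance for the UART to
 *                                   flush; .write_wakeup() is
 *                                   serialised against .close(), so
 *                                   the work cannot be re-queued
 *   tty->disc_data = NULL        -> channel marked dead
 */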
Fixes: 43da2f07622f ("can: can327: CAN/ldisc driver for ELM327 based OBD-II adapters")
Cc: "Jiri Slaby (SUSE)" <jirislaby(a)kernel.org>
Cc: Max Staudt <max(a)enpas.org>
Cc: Wolfgang Grandegger <wg(a)grandegger.com>
Cc: Marc Kleine-Budde <mkl(a)pengutronix.de>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: linux-can(a)vger.kernel.org
Cc: netdev(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Max Staudt <max(a)enpas.org>
---
drivers/net/can/can327.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/drivers/net/can/can327.c b/drivers/net/can/can327.c
index ed3d0b8989a0..dc7192ecb001 100644
--- a/drivers/net/can/can327.c
+++ b/drivers/net/can/can327.c
@@ -796,9 +796,9 @@ static int can327_netdev_close(struct net_device *dev)
netif_stop_queue(dev);
- /* Give UART one final chance to flush. */
- clear_bit(TTY_DO_WRITE_WAKEUP, &elm->tty->flags);
- flush_work(&elm->tx_work);
+ /* We don't flush the UART TX queue here, as we want final stop
+ * commands (like the above dummy char) to be flushed out.
+ */
can_rx_offload_disable(&elm->offload);
elm->can.state = CAN_STATE_STOPPED;
@@ -1069,12 +1069,15 @@ static void can327_ldisc_close(struct tty_struct *tty)
{
struct can327 *elm = (struct can327 *)tty->disc_data;
- /* unregister_netdev() calls .ndo_stop() so we don't have to.
- * Our .ndo_stop() also flushes the TTY write wakeup handler,
- * so we can safely set elm->tty = NULL after this.
- */
+ /* unregister_netdev() calls .ndo_stop() so we don't have to. */
unregister_candev(elm->dev);
+ /* Give UART one final chance to flush.
+ * No need to clear TTY_DO_WRITE_WAKEUP since .write_wakeup() is
+ * serialised against .close() and will not be called once we return.
+ */
+ flush_work(&elm->tx_work);
+
/* Mark channel as dead */
spin_lock_bh(&elm->lock);
tty->disc_data = NULL;
--
2.30.2
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
89d21e259a94 ("powerpc/bpf/32: Fix Oops on tail call tests")
49c3af43e65f ("powerpc/bpf: Simplify bpf_to_ppc() and adopt it for powerpc64")
3a3fc9bf1039 ("powerpc64/bpf: Store temp registers' bpf to ppc mapping")
036d559c0bde ("powerpc/bpf: Use _Rn macros for GPRs")
576a6c3a00c1 ("powerpc/bpf: Move bpf_jit64.h into bpf_jit_comp64.c")
7b187dcdb5d3 ("powerpc/bpf: Cleanup bpf_jit.h")
794abc08d75e ("powerpc64/bpf: Get rid of PPC_BPF_[LL|STL|STLU] macros")
391c271f4deb ("powerpc64/bpf: Convert some of the uses of PPC_BPF_[LL|STL] to PPC_BPF_[LD|STD]")
feb6307289d8 ("powerpc64/bpf: Optimize instruction sequence used for function calls")
43d636f8b4fd ("powerpc64/bpf elfv1: Do not load TOC before calling functions")
b10cb163c4b3 ("powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry")
1d4866d5652f ("powerpc64/bpf: Use r12 for constant blinding")
c2067f7f8883 ("powerpc64/bpf: Do not save/restore LR on each call to bpf_stf_barrier()")
0ffdbce6f4a8 ("powerpc/bpf: Handle large branch ranges with BPF_EXIT")
bafb5898de5d ("powerpc/bpf: Emit a single branch instruction for known short branch ranges")
a8936569a07b ("powerpc/bpf: Always reallocate BPF_REG_5, BPF_REG_AX and TMP_REG when possible")
3f5f766d5f7f ("powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06")
f9320c49993c ("powerpc/bpf: Update ldimm64 instructions during extra pass")
29ec39fcf11e ("Merge tag 'powerpc-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 89d21e259a94f7d5582ec675aa445f5a79f347e4 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Thu, 24 Nov 2022 09:37:27 +0100
Subject: [PATCH] powerpc/bpf/32: Fix Oops on tail call tests
test_bpf tail call tests end up as:
test_bpf: #0 Tail call leaf jited:1 85 PASS
test_bpf: #1 Tail call 2 jited:1 111 PASS
test_bpf: #2 Tail call 3 jited:1 145 PASS
test_bpf: #3 Tail call 4 jited:1 170 PASS
test_bpf: #4 Tail call load/store leaf jited:1 190 PASS
test_bpf: #5 Tail call load/store jited:1
BUG: Unable to handle kernel data access on write at 0xf1b4e000
Faulting instruction address: 0xbe86b710
Oops: Kernel access of bad area, sig: 11 [#1]
BE PAGE_SIZE=4K MMU=Hash PowerMac
Modules linked in: test_bpf(+)
CPU: 0 PID: 97 Comm: insmod Not tainted 6.1.0-rc4+ #195
Hardware name: PowerMac3,1 750CL 0x87210 PowerMac
NIP: be86b710 LR: be857e88 CTR: be86b704
REGS: f1b4df20 TRAP: 0300 Not tainted (6.1.0-rc4+)
MSR: 00009032 <EE,ME,IR,DR,RI> CR: 28008242 XER: 00000000
DAR: f1b4e000 DSISR: 42000000
GPR00: 00000001 f1b4dfe0 c11d2280 00000000 00000000 00000000 00000002 00000000
GPR08: f1b4e000 be86b704 f1b4e000 00000000 00000000 100d816a f2440000 fe73baa8
GPR16: f2458000 00000000 c1941ae4 f1fe2248 00000045 c0de0000 f2458030 00000000
GPR24: 000003e8 0000000f f2458000 f1b4dc90 3e584b46 00000000 f24466a0 c1941a00
NIP [be86b710] 0xbe86b710
LR [be857e88] __run_one+0xec/0x264 [test_bpf]
Call Trace:
[f1b4dfe0] [00000002] 0x2 (unreliable)
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
---[ end trace 0000000000000000 ]---
This is an attempt to write above the stack. The problem is encountered
with tests added by commit 38608ee7b690 ("bpf, tests: Add load store
test case for tail call")
This happens because the tail call is done to a BPF prog with a different
stack_depth. At present, the stack is kept as is when the caller
tail calls its callee. But at exit, the callee restores the stack based
on its own properties. Therefore here, at each run, r1 is erroneously
increased by 32 - 16 = 16 bytes.
This was done that way in order to pass the tail call count from caller
to callee through the stack. As powerpc32 doesn't have a red zone in
the stack, it was necessary to maintain the stack as is for the tail
call. But it was not anticipated that the BPF frame size could be
different.
Let's take a new approach. Use register r4 to carry the tail call count
during the tail call, and save it into the stack at function entry if
required. This means the input parameter must be in r3, which is more
correct as it is a 32-bit parameter; a tail call then better matches a
normal BPF function entry, the downside being that we move that input
parameter back and forth between r3 and r4. That can be optimised later.
Doing that also has the advantage of maximising the common parts between
tail calls and a normal function exit.
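To summarise the new convention (a sketch distilled from the diff
below; the instruction sequence is simplified):
/*
 * prologue:
 *   li    r4, 0          <- tail_call_cnt; the single instruction
 *                           (BPF_TAILCALL_PROLOGUE_SIZE = 4 bytes)
 *                           skipped on tail-call re-entry
 *   stwu  r1, -frame(r1)
 *   stw   r4, TC(r1)     <- only if SEEN_TAILCALL
 *   mr    BPF_REG_1, r3  <- 32-bit input parameter arrives in r3
 *
 * tail call:
 *   mr    r3, BPF_REG_1  <- input parameter back in r3
 *   mr    r4, r0         <- carry tail_call_cnt in r4
 *   common epilogue, then branch to callee + 4
 */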
With the fix, tail call tests are now successful:
test_bpf: #0 Tail call leaf jited:1 53 PASS
test_bpf: #1 Tail call 2 jited:1 115 PASS
test_bpf: #2 Tail call 3 jited:1 154 PASS
test_bpf: #3 Tail call 4 jited:1 165 PASS
test_bpf: #4 Tail call load/store leaf jited:1 101 PASS
test_bpf: #5 Tail call load/store jited:1 141 PASS
test_bpf: #6 Tail call error path, max count reached jited:1 994 PASS
test_bpf: #7 Tail call count preserved across function calls jited:1 140975 PASS
test_bpf: #8 Tail call error path, NULL target jited:1 110 PASS
test_bpf: #9 Tail call error path, index out of range jited:1 69 PASS
test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed]
Suggested-by: Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
Fixes: 51c66ad849a7 ("powerpc/bpf: Implement extended BPF on PPC32")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Tested-by: Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/757acccb7fbfc78efa42dcf3c974b46678198905.16692788…
diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index 43f1c76d48ce..a379b0ce19ff 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -113,23 +113,19 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
{
int i;
- /* First arg comes in as a 32 bits pointer. */
- EMIT(PPC_RAW_MR(bpf_to_ppc(BPF_REG_1), _R3));
- EMIT(PPC_RAW_LI(bpf_to_ppc(BPF_REG_1) - 1, 0));
+ /* Initialize tail_call_cnt, to be skipped if we do tail calls. */
+ EMIT(PPC_RAW_LI(_R4, 0));
+
+#define BPF_TAILCALL_PROLOGUE_SIZE 4
+
EMIT(PPC_RAW_STWU(_R1, _R1, -BPF_PPC_STACKFRAME(ctx)));
- /*
- * Initialize tail_call_cnt in stack frame if we do tail calls.
- * Otherwise, put in NOPs so that it can be skipped when we are
- * invoked through a tail call.
- */
if (ctx->seen & SEEN_TAILCALL)
- EMIT(PPC_RAW_STW(bpf_to_ppc(BPF_REG_1) - 1, _R1,
- bpf_jit_stack_offsetof(ctx, BPF_PPC_TC)));
- else
- EMIT(PPC_RAW_NOP());
+ EMIT(PPC_RAW_STW(_R4, _R1, bpf_jit_stack_offsetof(ctx, BPF_PPC_TC)));
-#define BPF_TAILCALL_PROLOGUE_SIZE 16
+ /* First arg comes in as a 32 bits pointer. */
+ EMIT(PPC_RAW_MR(bpf_to_ppc(BPF_REG_1), _R3));
+ EMIT(PPC_RAW_LI(bpf_to_ppc(BPF_REG_1) - 1, 0));
/*
* We need a stack frame, but we don't necessarily need to
@@ -170,24 +166,24 @@ static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx
for (i = BPF_PPC_NVR_MIN; i <= 31; i++)
if (bpf_is_seen_register(ctx, i))
EMIT(PPC_RAW_LWZ(i, _R1, bpf_jit_stack_offsetof(ctx, i)));
-}
-
-void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx)
-{
- EMIT(PPC_RAW_MR(_R3, bpf_to_ppc(BPF_REG_0)));
-
- bpf_jit_emit_common_epilogue(image, ctx);
-
- /* Tear down our stack frame */
if (ctx->seen & SEEN_FUNC)
EMIT(PPC_RAW_LWZ(_R0, _R1, BPF_PPC_STACKFRAME(ctx) + PPC_LR_STKOFF));
+ /* Tear down our stack frame */
EMIT(PPC_RAW_ADDI(_R1, _R1, BPF_PPC_STACKFRAME(ctx)));
if (ctx->seen & SEEN_FUNC)
EMIT(PPC_RAW_MTLR(_R0));
+}
+
+void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx)
+{
+ EMIT(PPC_RAW_MR(_R3, bpf_to_ppc(BPF_REG_0)));
+
+ bpf_jit_emit_common_epilogue(image, ctx);
+
EMIT(PPC_RAW_BLR());
}
@@ -244,7 +240,6 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o
EMIT(PPC_RAW_RLWINM(_R3, b2p_index, 2, 0, 29));
EMIT(PPC_RAW_ADD(_R3, _R3, b2p_bpf_array));
EMIT(PPC_RAW_LWZ(_R3, _R3, offsetof(struct bpf_array, ptrs)));
- EMIT(PPC_RAW_STW(_R0, _R1, bpf_jit_stack_offsetof(ctx, BPF_PPC_TC)));
/*
* if (prog == NULL)
@@ -255,19 +250,14 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o
/* goto *(prog->bpf_func + prologue_size); */
EMIT(PPC_RAW_LWZ(_R3, _R3, offsetof(struct bpf_prog, bpf_func)));
-
- if (ctx->seen & SEEN_FUNC)
- EMIT(PPC_RAW_LWZ(_R0, _R1, BPF_PPC_STACKFRAME(ctx) + PPC_LR_STKOFF));
-
EMIT(PPC_RAW_ADDIC(_R3, _R3, BPF_TAILCALL_PROLOGUE_SIZE));
-
- if (ctx->seen & SEEN_FUNC)
- EMIT(PPC_RAW_MTLR(_R0));
-
EMIT(PPC_RAW_MTCTR(_R3));
EMIT(PPC_RAW_MR(_R3, bpf_to_ppc(BPF_REG_1)));
+ /* Put tail_call_cnt in r4 */
+ EMIT(PPC_RAW_MR(_R4, _R0));
+
/* tear restore NVRs, ... */
bpf_jit_emit_common_epilogue(image, ctx);
> On 3 Dec 2022, at 09:26, Sasha Levin <sashal(a)kernel.org> wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible
>
> to the 5.15-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> kbuild-fix-wimplicit-function-declaration-in-license.patch
> and it can be found in the queue-5.15 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit 3f07ef5743bd434f3689f9ceed7a24128dc6b5ae
> Author: Sam James <sam(a)gentoo.org>
> Date: Wed Nov 16 18:26:34 2022 +0000
>
> kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible
>
> [ Upstream commit 50c697215a8cc22f0e58c88f06f2716c05a26e85 ]
Hi Sasha,
Please yank this commit as it's been reverted upstream; it doesn't work
for cross-compilation as-is.
Best,
sam
The patch titled
Subject: tmpfs: fix data loss from failed fallocate
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
tmpfs-fix-data-loss-from-failed-fallocate.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: tmpfs: fix data loss from failed fallocate
Date: Sun, 4 Dec 2022 16:51:50 -0800 (PST)
Fix tmpfs data loss when the fallocate system call is interrupted by a
signal, or fails for some other reason. The partial folio handling in
shmem_undo_range() forgot to consider this unfalloc case, and was liable
to erase or truncate out data which had already been committed earlier.
It turns out that none of the partial folio handling there is appropriate
for the unfalloc case, which just wants to proceed to removal of whole
folios: which find_get_entries() provides, even when partially covered.
Original patch by Rui Wang.
Link: https://lore.kernel.org/linux-mm/33b85d82.7764.1842e9ab207.Coremail.chenguo…
Link: https://lkml.kernel.org/r/a5dac112-cf4b-7af-a33-f386e347fd38@google.com
Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Guoqi Chen <chenguoqic(a)163.com>
Link: https://lore.kernel.org/all/20221101032248.819360-1-kernel@hev.cc/
Cc: Rui Wang <kernel(a)hev.cc>
Cc: Huacai Chen <chenhuacai(a)loongson.cn>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [5.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/shmem.c | 11 +++++++++++
1 file changed, 11 insertions(+)
--- a/mm/shmem.c~tmpfs-fix-data-loss-from-failed-fallocate
+++ a/mm/shmem.c
@@ -948,6 +948,15 @@ static void shmem_undo_range(struct inod
index++;
}
+ /*
+ * When undoing a failed fallocate, we want none of the partial folio
+ * zeroing and splitting below, but shall want to truncate the whole
+ * folio when !uptodate indicates that it was added by this fallocate,
+ * even when [lstart, lend] covers only a part of the folio.
+ */
+ if (unfalloc)
+ goto whole_folios;
+
same_folio = (lstart >> PAGE_SHIFT) == (lend >> PAGE_SHIFT);
folio = shmem_get_partial_folio(inode, lstart >> PAGE_SHIFT);
if (folio) {
@@ -973,6 +982,8 @@ static void shmem_undo_range(struct inod
folio_put(folio);
}
+whole_folios:
+
index = start;
while (index < end) {
cond_resched();
_
Patches currently in -mm which might be from hughd(a)google.com are
tmpfs-fix-data-loss-from-failed-fallocate.patch
Here's a backported version of upstream commit ID
4dbd6a3e90e03130973688fd79e19425f720d999 that will work with
5.4, 4.19, 4.14, and 4.9.
Michael
-------------------------------------------------------------------------------------------
From 6f2e8eba629d29ffae0fc71c0cfd6b694ac4a5ec Mon Sep 17 00:00:00 2001
From: Michael Kelley <mikelley(a)microsoft.com>
Date: Sun, 4 Dec 2022 13:52:01 -0800
Subject: [PATCH v4 1/1] x86/ioremap: Fix page aligned size calculation in
__ioremap_caller()
Current code re-calculates the size after aligning the starting and
ending physical addresses on a page boundary. But the re-calculation
also embeds the masking of high order bits that exceed the size of
the physical address space (via PHYSICAL_PAGE_MASK). If the masking
removes any high order bits, the size calculation results in a huge
value that is likely to immediately fail.
Fix this by re-calculating the page-aligned size first. Then mask any
high order bits using PHYSICAL_PAGE_MASK.
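A worked example of the failure (the addresses and the position of the
encryption bit are hypothetical, chosen only to show the arithmetic):
/*
 * Assume an encryption bit at bit 47 (cleared by PHYSICAL_PAGE_MASK)
 * and a one-page mapping request:
 *
 *   phys_addr = 0x800012345000;                    // bit 47 set
 *   last_addr = phys_addr + 0xfff;
 *
 * Old order -- mask first, then compute the size:
 *
 *   phys_addr &= PHYSICAL_PAGE_MASK;               // 0x000012345000
 *   size = PAGE_ALIGN(last_addr + 1) - phys_addr;  // 0x800000001000
 *
 * i.e. ~128 TiB instead of the intended 0x1000, so the subsequent
 * reservation immediately fails.  Computing size before masking
 * yields the intended 0x1000.
 */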
Fixes: ffa71f33a820 ("x86, ioremap: Fix incorrect physical address handling in PAE mode")
Acked-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Michael Kelley <mikelley(a)microsoft.com>
---
arch/x86/mm/ioremap.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index ecae9ac..696fd6f 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -126,9 +126,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
* Mappings have to be page-aligned
*/
offset = phys_addr & ~PAGE_MASK;
- phys_addr &= PHYSICAL_PAGE_MASK;
+ phys_addr &= PAGE_MASK;
size = PAGE_ALIGN(last_addr+1) - phys_addr;
+ /*
+ * Mask out any bits not part of the actual physical
+ * address, like memory encryption bits.
+ */
+ phys_addr &= PHYSICAL_PAGE_MASK;
+
retval = reserve_memtype(phys_addr, (u64)phys_addr + size,
pcm, &new_pcm);
if (retval) {
--
1.8.3.1
Hello,
commit 711f8c3fb3db ("Bluetooth: L2CAP: Fix accepting connection request for
invalid SPSM") did not apply to 5.4-stable tree previously.
One of the notable dependencies is commit 15f02b910562 ("Bluetooth: L2CAP:
Add initial code for Enhanced Credit Based Mode") and that doesn't apply to
5.4-stable either due to a mismatch on `l2cap_sock_setsockopt_old` in
l2cap_sock.c.
After doing a comparison between upstream and older revisions, I merged the
changes and backported 15f02b910562 to 5.4-stable.
During compilation, I discovered another dependency commit 145720963b6c
("Bluetooth: L2CAP: Add definitions for Enhanced Credit Based Mode") and
added that to patchset.
All those combined will hopefully allow us to have the fix for CVE-2022-42896
in 5.4-stable.
Thank you.
Luiz Augusto von Dentz (3):
Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode
Bluetooth: L2CAP: Add definitions for Enhanced Credit Based Mode
Bluetooth: L2CAP: Fix accepting connection request for invalid SPSM
include/net/bluetooth/l2cap.h | 43 +++
net/bluetooth/l2cap_core.c | 570 +++++++++++++++++++++++++++++++++-
net/bluetooth/l2cap_sock.c | 24 +-
3 files changed, 617 insertions(+), 20 deletions(-)
--
2.37.2
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
48d4180939e1 ("ACPI: HMAT: Fix initiator registration for single-initiator systems")
14f16d47561b ("ACPI: HMAT: remove unnecessary variable initialization")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 48d4180939e12c4bd2846f984436d895bb9699ed Mon Sep 17 00:00:00 2001
From: Vishal Verma <vishal.l.verma(a)intel.com>
Date: Wed, 16 Nov 2022 16:37:37 -0700
Subject: [PATCH] ACPI: HMAT: Fix initiator registration for single-initiator
systems
In a system with a single initiator node, and one or more memory-only
'target' nodes, the memory-only node(s) would fail to register their
initiator node correctly. i.e. in sysfs:
# ls /sys/devices/system/node/node0/access0/targets/
node0
Where as the correct behavior should be:
# ls /sys/devices/system/node/node0/access0/targets/
node0 node1
This happened because hmat_register_target_initiators() uses list_sort()
to sort the initiator list, but the sort comparison function
(initiator_cmp()) is overloaded to also set the node mask's bits.
In a system with a single initiator, the list is singular, and list_sort
elides the comparison helper call. Thus the node mask never gets set,
and the subsequent search for the best initiator comes up empty.
Add a new helper to consume the sorted initiator list, and generate the
nodemask, decoupling it from the overloaded initiator_cmp() comparison
callback. This prevents the singular list corner case naturally, and
makes the code easier to follow as well.
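As a standalone illustration of the pitfall (plain userspace C, not
kernel code; a one-element sequence needs no comparisons, so a correct
sort never has to invoke the callback):
#include <stdio.h>
#include <stdlib.h>

static int calls;

static int cmp(const void *a, const void *b)
{
	calls++;	/* side effect, like setting the nodemask bits */
	return *(const int *)a - *(const int *)b;
}

int main(void)
{
	int initiators[1] = { 0 };	/* single-initiator system */

	qsort(initiators, 1, sizeof(initiators[0]), cmp);
	/* Common qsort implementations make zero cmp calls here, so
	 * the side effect never runs -- the same trap that left the
	 * HMAT nodemask empty. */
	printf("cmp called %d time(s)\n", calls);
	return 0;
}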
Cc: <stable(a)vger.kernel.org>
Cc: Rafael J. Wysocki <rafael(a)kernel.org>
Cc: Liu Shixin <liushixin2(a)huawei.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: Chris Piper <chris.d.piper(a)intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Link: https://lore.kernel.org/r/20221116-acpi_hmat_fix-v2-2-3712569be691@intel.com
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 144a84f429ed..6cceca64a6bc 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -562,17 +562,26 @@ static int initiator_cmp(void *priv, const struct list_head *a,
{
struct memory_initiator *ia;
struct memory_initiator *ib;
- unsigned long *p_nodes = priv;
ia = list_entry(a, struct memory_initiator, node);
ib = list_entry(b, struct memory_initiator, node);
- set_bit(ia->processor_pxm, p_nodes);
- set_bit(ib->processor_pxm, p_nodes);
-
return ia->processor_pxm - ib->processor_pxm;
}
+static int initiators_to_nodemask(unsigned long *p_nodes)
+{
+ struct memory_initiator *initiator;
+
+ if (list_empty(&initiators))
+ return -ENXIO;
+
+ list_for_each_entry(initiator, &initiators, node)
+ set_bit(initiator->processor_pxm, p_nodes);
+
+ return 0;
+}
+
static void hmat_register_target_initiators(struct memory_target *target)
{
static DECLARE_BITMAP(p_nodes, MAX_NUMNODES);
@@ -609,7 +618,10 @@ static void hmat_register_target_initiators(struct memory_target *target)
* initiators.
*/
bitmap_zero(p_nodes, MAX_NUMNODES);
- list_sort(p_nodes, &initiators, initiator_cmp);
+ list_sort(NULL, &initiators, initiator_cmp);
+ if (initiators_to_nodemask(p_nodes) < 0)
+ return;
+
if (!access0done) {
for (i = WRITE_LATENCY; i <= READ_BANDWIDTH; i++) {
loc = localities_types[i];
@@ -643,7 +655,9 @@ static void hmat_register_target_initiators(struct memory_target *target)
/* Access 1 ignores Generic Initiators */
bitmap_zero(p_nodes, MAX_NUMNODES);
- list_sort(p_nodes, &initiators, initiator_cmp);
+ if (initiators_to_nodemask(p_nodes) < 0)
+ return;
+
for (i = WRITE_LATENCY; i <= READ_BANDWIDTH; i++) {
loc = localities_types[i];
if (!loc)
The patch below does not apply to the 6.0-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
48d4180939e1 ("ACPI: HMAT: Fix initiator registration for single-initiator systems")
14f16d47561b ("ACPI: HMAT: remove unnecessary variable initialization")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 48d4180939e12c4bd2846f984436d895bb9699ed Mon Sep 17 00:00:00 2001
From: Vishal Verma <vishal.l.verma(a)intel.com>
Date: Wed, 16 Nov 2022 16:37:37 -0700
Subject: [PATCH] ACPI: HMAT: Fix initiator registration for single-initiator
systems
In a system with a single initiator node, and one or more memory-only
'target' nodes, the memory-only node(s) would fail to register their
initiator node correctly. i.e. in sysfs:
# ls /sys/devices/system/node/node0/access0/targets/
node0
Whereas the correct behavior should be:
# ls /sys/devices/system/node/node0/access0/targets/
node0 node1
This happened because hmat_register_target_initiators() uses list_sort()
to sort the initiator list, but the sort comparison function
(initiator_cmp()) is overloaded to also set the node mask's bits.
In a system with a single initiator, the list is singular, and list_sort
elides the comparison helper call. Thus the node mask never gets set,
and the subsequent search for the best initiator comes up empty.
Add a new helper to consume the sorted initiator list, and generate the
nodemask, decoupling it from the overloaded initiator_cmp() comparison
callback. This prevents the singular list corner case naturally, and
makes the code easier to follow as well.
Cc: <stable(a)vger.kernel.org>
Cc: Rafael J. Wysocki <rafael(a)kernel.org>
Cc: Liu Shixin <liushixin2(a)huawei.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: Chris Piper <chris.d.piper(a)intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Link: https://lore.kernel.org/r/20221116-acpi_hmat_fix-v2-2-3712569be691@intel.com
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 144a84f429ed..6cceca64a6bc 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -562,17 +562,26 @@ static int initiator_cmp(void *priv, const struct list_head *a,
{
struct memory_initiator *ia;
struct memory_initiator *ib;
- unsigned long *p_nodes = priv;
ia = list_entry(a, struct memory_initiator, node);
ib = list_entry(b, struct memory_initiator, node);
- set_bit(ia->processor_pxm, p_nodes);
- set_bit(ib->processor_pxm, p_nodes);
-
return ia->processor_pxm - ib->processor_pxm;
}
+static int initiators_to_nodemask(unsigned long *p_nodes)
+{
+ struct memory_initiator *initiator;
+
+ if (list_empty(&initiators))
+ return -ENXIO;
+
+ list_for_each_entry(initiator, &initiators, node)
+ set_bit(initiator->processor_pxm, p_nodes);
+
+ return 0;
+}
+
static void hmat_register_target_initiators(struct memory_target *target)
{
static DECLARE_BITMAP(p_nodes, MAX_NUMNODES);
@@ -609,7 +618,10 @@ static void hmat_register_target_initiators(struct memory_target *target)
* initiators.
*/
bitmap_zero(p_nodes, MAX_NUMNODES);
- list_sort(p_nodes, &initiators, initiator_cmp);
+ list_sort(NULL, &initiators, initiator_cmp);
+ if (initiators_to_nodemask(p_nodes) < 0)
+ return;
+
if (!access0done) {
for (i = WRITE_LATENCY; i <= READ_BANDWIDTH; i++) {
loc = localities_types[i];
@@ -643,7 +655,9 @@ static void hmat_register_target_initiators(struct memory_target *target)
/* Access 1 ignores Generic Initiators */
bitmap_zero(p_nodes, MAX_NUMNODES);
- list_sort(p_nodes, &initiators, initiator_cmp);
+ if (initiators_to_nodemask(p_nodes) < 0)
+ return;
+
for (i = WRITE_LATENCY; i <= READ_BANDWIDTH; i++) {
loc = localities_types[i];
if (!loc)
Portions of upstream commit a4055888629b ("mm/memcg: warning on !memcg
after readahead page charged") were backported as commit cfe575954ddd
("mm: add VM_WARN_ON_ONCE_PAGE() macro"). Unfortunately, the backport
did not account for the lack of commit 33def8498fdd ("treewide: Convert
macro and uses of __section(foo) to __section("foo")") in kernels prior
to 5.10, resulting in the following orphan section warnings on PowerPC
clang builds with CONFIG_DEBUG_VM=y:
powerpc64le-linux-gnu-ld: warning: orphan section `".data.once"' from `mm/huge_memory.o' being placed in section `".data.once"'
powerpc64le-linux-gnu-ld: warning: orphan section `".data.once"' from `mm/huge_memory.o' being placed in section `".data.once"'
powerpc64le-linux-gnu-ld: warning: orphan section `".data.once"' from `mm/huge_memory.o' being placed in section `".data.once"'
This is a difference between how clang and gcc handle macro
stringification, which was resolved for the kernel by not stringifying
the argument to the __section() macro. Since that change was deemed not
suitable for the stable kernels by commit 59f89518f510 ("once: fix
section mismatch on clang builds"), do the same thing as that change
and remove the quotes from the argument to __section().
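A sketch of the stringification issue (the macro definition is
paraphrased from pre-5.10 include/linux/compiler_attributes.h):
/* Pre-33def8498fdd definition (paraphrased): */
#define __section(S) __attribute__((__section__(#S)))

/* Intended use -- the bare token is stringified: */
static bool __section(.data.once) a;	/* section ".data.once" */

/* Quoted use -- #S yields "\".data.once\"", and clang keeps the
 * embedded quotes in the section name, producing the orphan section
 * `".data.once"' reported by the linker above; gcc builds were
 * unaffected. */
static bool __section(".data.once") b;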
Fixes: cfe575954ddd ("mm: add VM_WARN_ON_ONCE_PAGE() macro")
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
As far as I can tell, this should be applied to 5.4 and earlier. It
should apply cleanly but let me know if not.
include/linux/mmdebug.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
index 5d0767cb424a..4ed52879ce55 100644
--- a/include/linux/mmdebug.h
+++ b/include/linux/mmdebug.h
@@ -38,7 +38,7 @@ void dump_mm(const struct mm_struct *mm);
} \
} while (0)
#define VM_WARN_ON_ONCE_PAGE(cond, page) ({ \
- static bool __section(".data.once") __warned; \
+ static bool __section(.data.once) __warned; \
int __ret_warn_once = !!(cond); \
\
if (unlikely(__ret_warn_once && !__warned)) { \
base-commit: 4d2a309b5c28a2edc0900542d22fec3a5a22243b
--
2.38.1
From: Keith Busch <kbusch(a)kernel.org>
commit 23e085b2dead13b51fe86d27069895b740f749c0 upstream.
The passthrough commands already have this restriction, but the other
operations do not. Require the same capabilities for all users as all of
these operations, which include resets and rescans, can be disruptive.
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Ovidiu Panait <ovidiu.panait(a)windriver.com>
---
These backports are for CVE-2022-3169.
drivers/nvme/host/core.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6627fb531f33..3b5e5fb158be 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3027,11 +3027,17 @@ static long nvme_dev_ioctl(struct file *file, unsigned int cmd,
case NVME_IOCTL_IO_CMD:
return nvme_dev_user_cmd(ctrl, argp);
case NVME_IOCTL_RESET:
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
dev_warn(ctrl->device, "resetting controller\n");
return nvme_reset_ctrl_sync(ctrl);
case NVME_IOCTL_SUBSYS_RESET:
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
return nvme_reset_subsystem(ctrl);
case NVME_IOCTL_RESCAN:
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
nvme_queue_scan(ctrl);
return 0;
default:
--
2.38.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
42fb0a1e84ff ("tracing/ring-buffer: Have polling block on watermark")
7e9fbbb1b776 ("ring-buffer: Add ring_buffer_wake_waiters()")
13292494379f ("tracing: Make struct ring_buffer less ambiguous")
1c5eb4481e01 ("tracing: Rename trace_buffer to array_buffer")
2d6425af6116 ("tracing: Declare newly exported APIs in include/linux/trace.h")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 42fb0a1e84ff525ebe560e2baf9451ab69127e2b Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Date: Thu, 20 Oct 2022 23:14:27 -0400
Subject: [PATCH] tracing/ring-buffer: Have polling block on watermark
Currently the way polling works on the ring buffer is broken. It will
return immediately if there's any data in the ring buffer whereas a read
will block until the watermark (defined by the tracefs buffer_percent file)
is hit.
That is, a select() or poll() will return as if there's data available,
but then the following read will block. This is broken for the way
select()s and poll()s are supposed to work.
Have the polling on the ring buffer also block the same way reads and
splice does on the ring buffer.
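To make the watermark concrete (a worked instance of the full_hit()
check introduced in the diff below; the buffer geometry is made up):
/* With buffer_percent (full) = 50 and nr_pages = 8 on a CPU buffer,
 * the predicate first holds at dirty = 5, since 500 > 400: */
static bool full_hit_example(size_t dirty, size_t nr_pages, int full)
{
	if (!nr_pages || !full)
		return true;	/* no watermark: any data is enough */
	return (dirty * 100) > (full * nr_pages);
}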
Link: https://lkml.kernel.org/r/20221020231427.41be3f26@gandalf.local.home
Cc: Linux Trace Kernel <linux-trace-kernel(a)vger.kernel.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Primiano Tucci <primiano(a)google.com>
Cc: stable(a)vger.kernel.org
Fixes: 1e0d6714aceb7 ("ring-buffer: Do not wake up a splice waiter when page is not full")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 2504df9a0453..3c7d295746f6 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -100,7 +100,7 @@ __ring_buffer_alloc(unsigned long size, unsigned flags, struct lock_class_key *k
int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full);
__poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu,
- struct file *filp, poll_table *poll_table);
+ struct file *filp, poll_table *poll_table, int full);
void ring_buffer_wake_waiters(struct trace_buffer *buffer, int cpu);
#define RING_BUFFER_ALL_CPUS -1
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 9712083832f4..089b1ec9cb3b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -907,6 +907,21 @@ size_t ring_buffer_nr_dirty_pages(struct trace_buffer *buffer, int cpu)
return cnt - read;
}
+static __always_inline bool full_hit(struct trace_buffer *buffer, int cpu, int full)
+{
+ struct ring_buffer_per_cpu *cpu_buffer = buffer->buffers[cpu];
+ size_t nr_pages;
+ size_t dirty;
+
+ nr_pages = cpu_buffer->nr_pages;
+ if (!nr_pages || !full)
+ return true;
+
+ dirty = ring_buffer_nr_dirty_pages(buffer, cpu);
+
+ return (dirty * 100) > (full * nr_pages);
+}
+
/*
* rb_wake_up_waiters - wake up tasks waiting for ring buffer input
*
@@ -1046,22 +1061,20 @@ int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full)
!ring_buffer_empty_cpu(buffer, cpu)) {
unsigned long flags;
bool pagebusy;
- size_t nr_pages;
- size_t dirty;
+ bool done;
if (!full)
break;
raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
pagebusy = cpu_buffer->reader_page == cpu_buffer->commit_page;
- nr_pages = cpu_buffer->nr_pages;
- dirty = ring_buffer_nr_dirty_pages(buffer, cpu);
+ done = !pagebusy && full_hit(buffer, cpu, full);
+
if (!cpu_buffer->shortest_full ||
cpu_buffer->shortest_full > full)
cpu_buffer->shortest_full = full;
raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
- if (!pagebusy &&
- (!nr_pages || (dirty * 100) > full * nr_pages))
+ if (done)
break;
}
@@ -1087,6 +1100,7 @@ int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full)
* @cpu: the cpu buffer to wait on
* @filp: the file descriptor
* @poll_table: The poll descriptor
+ * @full: wait until the percentage of pages are available, if @cpu != RING_BUFFER_ALL_CPUS
*
* If @cpu == RING_BUFFER_ALL_CPUS then the task will wake up as soon
* as data is added to any of the @buffer's cpu buffers. Otherwise
@@ -1096,14 +1110,15 @@ int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full)
* zero otherwise.
*/
__poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu,
- struct file *filp, poll_table *poll_table)
+ struct file *filp, poll_table *poll_table, int full)
{
struct ring_buffer_per_cpu *cpu_buffer;
struct rb_irq_work *work;
- if (cpu == RING_BUFFER_ALL_CPUS)
+ if (cpu == RING_BUFFER_ALL_CPUS) {
work = &buffer->irq_work;
- else {
+ full = 0;
+ } else {
if (!cpumask_test_cpu(cpu, buffer->cpumask))
return -EINVAL;
@@ -1111,8 +1126,14 @@ __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu,
work = &cpu_buffer->irq_work;
}
- poll_wait(filp, &work->waiters, poll_table);
- work->waiters_pending = true;
+ if (full) {
+ poll_wait(filp, &work->full_waiters, poll_table);
+ work->full_waiters_pending = true;
+ } else {
+ poll_wait(filp, &work->waiters, poll_table);
+ work->waiters_pending = true;
+ }
+
/*
* There's a tight race between setting the waiters_pending and
* checking if the ring buffer is empty. Once the waiters_pending bit
@@ -1128,6 +1149,9 @@ __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu,
*/
smp_mb();
+ if (full)
+ return full_hit(buffer, cpu, full) ? EPOLLIN | EPOLLRDNORM : 0;
+
if ((cpu == RING_BUFFER_ALL_CPUS && !ring_buffer_empty(buffer)) ||
(cpu != RING_BUFFER_ALL_CPUS && !ring_buffer_empty_cpu(buffer, cpu)))
return EPOLLIN | EPOLLRDNORM;
@@ -3155,10 +3179,6 @@ static void rb_commit(struct ring_buffer_per_cpu *cpu_buffer,
static __always_inline void
rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
{
- size_t nr_pages;
- size_t dirty;
- size_t full;
-
if (buffer->irq_work.waiters_pending) {
buffer->irq_work.waiters_pending = false;
/* irq_work_queue() supplies it's own memory barriers */
@@ -3182,10 +3202,7 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
cpu_buffer->last_pages_touch = local_read(&cpu_buffer->pages_touched);
- full = cpu_buffer->shortest_full;
- nr_pages = cpu_buffer->nr_pages;
- dirty = ring_buffer_nr_dirty_pages(buffer, cpu_buffer->cpu);
- if (full && nr_pages && (dirty * 100) <= full * nr_pages)
+ if (!full_hit(buffer, cpu_buffer->cpu, cpu_buffer->shortest_full))
return;
cpu_buffer->irq_work.wakeup_full = true;
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 47a44b055a1d..c6c7a0af3ed2 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6681,7 +6681,7 @@ trace_poll(struct trace_iterator *iter, struct file *filp, poll_table *poll_tabl
return EPOLLIN | EPOLLRDNORM;
else
return ring_buffer_poll_wait(iter->array_buffer->buffer, iter->cpu_file,
- filp, poll_table);
+ filp, poll_table, iter->tr->buffer_percent);
}
static __poll_t
The patch below does not apply to the 6.0-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
d5082d386eee ("ipv4: Fix route deletion when nexthop info is not specified")
61b91eb33a69 ("ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d5082d386eee7e8ec46fa8581932c81a4961dcef Mon Sep 17 00:00:00 2001
From: Ido Schimmel <idosch(a)nvidia.com>
Date: Thu, 24 Nov 2022 23:09:32 +0200
Subject: [PATCH] ipv4: Fix route deletion when nexthop info is not specified
When the kernel receives a route deletion request from user space it
tries to delete a route that matches the route attributes specified in
the request.
If only prefix information is specified in the request, the kernel
should delete the first matching FIB alias regardless of its associated
FIB info. However, an error is currently returned when the FIB info is
backed by a nexthop object:
# ip nexthop add id 1 via 192.0.2.2 dev dummy10
# ip route add 198.51.100.0/24 nhid 1
# ip route del 198.51.100.0/24
RTNETLINK answers: No such process
Fix by matching on such a FIB info when legacy nexthop attributes are
not specified in the request. An earlier check already covers the case
where a nexthop ID is specified in the request.
Add tests that cover these flows. Before the fix:
# ./fib_nexthops.sh -t ipv4_fcnal
...
TEST: Delete route when not specifying nexthop attributes [FAIL]
Tests passed: 11
Tests failed: 1
After the fix:
# ./fib_nexthops.sh -t ipv4_fcnal
...
TEST: Delete route when not specifying nexthop attributes [ OK ]
Tests passed: 12
Tests failed: 0
No regressions in other tests:
# ./fib_nexthops.sh
...
Tests passed: 228
Tests failed: 0
# ./fib_tests.sh
...
Tests passed: 186
Tests failed: 0
Cc: stable(a)vger.kernel.org
Reported-by: Jonas Gorski <jonas.gorski(a)gmail.com>
Tested-by: Jonas Gorski <jonas.gorski(a)gmail.com>
Fixes: 493ced1ac47c ("ipv4: Allow routes to use nexthop objects")
Fixes: 6bf92d70e690 ("net: ipv4: fix route with nexthop object delete warning")
Fixes: 61b91eb33a69 ("ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference")
Signed-off-by: Ido Schimmel <idosch(a)nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor(a)blackwall.org>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://lore.kernel.org/r/20221124210932.2470010-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index f721c308248b..19a662003eef 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -888,9 +888,11 @@ int fib_nh_match(struct net *net, struct fib_config *cfg, struct fib_info *fi,
return 1;
}
- /* cannot match on nexthop object attributes */
- if (fi->nh)
- return 1;
+ if (fi->nh) {
+ if (cfg->fc_oif || cfg->fc_gw_family || cfg->fc_mp)
+ return 1;
+ return 0;
+ }
if (cfg->fc_oif || cfg->fc_gw_family) {
struct fib_nh *nh;
diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
index ee5e98204d3d..a47b26ab48f2 100755
--- a/tools/testing/selftests/net/fib_nexthops.sh
+++ b/tools/testing/selftests/net/fib_nexthops.sh
@@ -1228,6 +1228,17 @@ ipv4_fcnal()
run_cmd "$IP ro add 172.16.101.0/24 nhid 21"
run_cmd "$IP ro del 172.16.101.0/24 nexthop via 172.16.1.7 dev veth1 nexthop via 172.16.1.8 dev veth1"
log_test $? 2 "Delete multipath route with only nh id based entry"
+
+ run_cmd "$IP nexthop add id 22 via 172.16.1.6 dev veth1"
+ run_cmd "$IP ro add 172.16.102.0/24 nhid 22"
+ run_cmd "$IP ro del 172.16.102.0/24 dev veth1"
+ log_test $? 2 "Delete route when specifying only nexthop device"
+
+ run_cmd "$IP ro del 172.16.102.0/24 via 172.16.1.6"
+ log_test $? 2 "Delete route when specifying only gateway"
+
+ run_cmd "$IP ro del 172.16.102.0/24"
+ log_test $? 0 "Delete route when not specifying nexthop attributes"
}
ipv4_grp_fcnal()
Commit be36f9e7517e ("efi: READ_ONCE rng seed size before munmap")
added a READ_ONCE() and also changed the call to
add_bootloader_randomness() to use the local size variable. Neither
of these changes was actually needed and this was not backported to
the 4.14 stable branch.
Commit 161a438d730d ("efi: random: reduce seed size to 32 bytes")
reverted the addition of READ_ONCE() and added a limit to the value of
size. This depends on the earlier commit, because size can now differ
from seed->size, but it was wrongly backported to the 4.14 stable
branch by itself.
Apply the missing change to the add_bootloader_randomness() parameter
(except that here we are still using add_device_randomness()).
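To make the required interaction concrete, here is a hedged sketch of
the 4.14 flow once both changes are in place, based on the diff below
and my reading of the surrounding function; SEED_CAP is a stand-in for
the 32-byte limit introduced by the 161a438d730d backport, not the
actual identifier, and error handling on the first mapping is elided:

	/* First mapping: just the header, to learn the seed size. */
	seed = early_memremap(efi.rng_seed, sizeof(*seed));
	size = seed->size;
	early_memunmap(seed, sizeof(*seed));

	/* Backport of 161a438d730d: do not trust the size the
	 * firmware recorded in the seed header; cap it. */
	if (size > SEED_CAP)
		size = SEED_CAP;

	/* Second mapping: header plus (capped) payload. */
	seed = early_memremap(efi.rng_seed, sizeof(*seed) + size);
	if (seed != NULL) {
		/* This fix: pass the capped local size, not
		 * seed->size, which may exceed what was mapped
		 * just above. */
		add_device_randomness(seed->bits, size);
		early_memunmap(seed, sizeof(*seed) + size);
		pr_notice("seeding entropy pool\n");
	}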
Fixes: 700485f70e50 ("efi: random: reduce seed size to 32 bytes")
Signed-off-by: Ben Hutchings <ben(a)decadent.org.uk>
---
drivers/firmware/efi/efi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index ed981f5e29ae..cc64869d8420 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -541,7 +541,7 @@ int __init efi_config_parse_tables(void *config_tables, int count, int sz,
seed = early_memremap(efi.rng_seed,
sizeof(*seed) + size);
if (seed != NULL) {
- add_device_randomness(seed->bits, seed->size);
+ add_device_randomness(seed->bits, size);
early_memunmap(seed, sizeof(*seed) + size);
pr_notice("seeding entropy pool\n");
} else {
Hello!
Would the stable maintainers please consider backporting the following
series of patches?:
https://lore.kernel.org/lkml/20220801013834.156015-1-andres@anarazel.de/
Branches where this is needed are:
* 5.4
* 5.10
* 5.15
Branch 6.0.y is fine, as this series is present there.
The failure is seen in this form:
-----8<----------8<----------8<-----
util/annotate.c: In function 'symbol__disassemble_bpf':
util/annotate.c:1739:9: error: too few arguments to function
'init_disassemble_info'
1739 | init_disassemble_info(&info, s,
| ^~~~~~~~~~~~~~~~~~~~~
In file included from util/annotate.c:1692:
/usr/include/dis-asm.h:472:13: note: declared here
472 | extern void init_disassemble_info (struct disassemble_info
*dinfo, void *stream,
| ^~~~~~~~~~~~~~~~~~~~~
make[4]: *** [/builds/linux/tools/build/Makefile.build:96:
/home/tuxbuild/.cache/tuxmake/builds/1/build/util/annotate.o] Error 1
----->8---------->8---------->8-----
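For context, binutils >= 2.39 changed init_disassemble_info() to take
an extra styled-print callback, which is what the error above trips
over. The linked series resolves this with a compatibility wrapper; a
hedged sketch of that approach follows (the DISASM_INIT_STYLED probe
matches my reading of the series, but treat the names here as
illustrative, not the series' actual API):

#include <dis-asm.h>

#ifndef DISASM_INIT_STYLED
/* Old binutils: provide the styled-printf type ourselves so one
 * wrapper signature compiles everywhere (sketch; the series' compat
 * header does something equivalent). */
enum disassembler_style { DISASSEMBLER_STYLE_NOT_EMPTY };
typedef int (*fprintf_styled_ftype)(void *, enum disassembler_style,
				    const char *, ...);
#endif

static inline void init_disasm_compat(struct disassemble_info *info,
				      void *stream,
				      fprintf_ftype unstyled_printf,
				      fprintf_styled_ftype styled_printf)
{
#ifdef DISASM_INIT_STYLED
	/* binutils >= 2.39: four-argument form with styled callback. */
	init_disassemble_info(info, stream, unstyled_printf,
			      styled_printf);
#else
	/* binutils < 2.39: three-argument form; styled callback
	 * intentionally unused. */
	(void)styled_printf;
	init_disassemble_info(info, stream, unstyled_printf);
#endif
}

Callers such as symbol__disassemble_bpf() then use the wrapper instead
of calling init_disassemble_info() directly.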
The 5.15 backport is almost straightforward, with patches 7 and 8
requiring some small modifications. I could not get the 5.10 backport
to build, and adapting the series for 5.4 was even more difficult.
Thanks and greetings!
Daniel Díaz
daniel.diaz(a)linaro.org