The quilt patch titled
Subject: mm/damon/sysfs: dealloc commit test ctx always
has been removed from the -mm tree. Its filename was
mm-damon-sysfs-dealloc-commit-test-ctx-always.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/sysfs: dealloc commit test ctx always
Date: Fri, 3 Oct 2025 13:14:55 -0700
The damon_ctx for testing online DAMON parameters commit inputs is
deallocated only when the test fails. This means memory is leaked for
every successful online DAMON parameters commit. Fix the leak by always
deallocating it.
Link: https://lkml.kernel.org/r/20251003201455.41448-3-sj@kernel.org
Fixes: 4c9ea539ad59 ("mm/damon/sysfs: validate user inputs from damon_sysfs_commit_input()")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.15+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/damon/sysfs.c~mm-damon-sysfs-dealloc-commit-test-ctx-always
+++ a/mm/damon/sysfs.c
@@ -1476,12 +1476,11 @@ static int damon_sysfs_commit_input(void
if (!test_ctx)
return -ENOMEM;
err = damon_commit_ctx(test_ctx, param_ctx);
- if (err) {
- damon_destroy_ctx(test_ctx);
+ if (err)
goto out;
- }
err = damon_commit_ctx(kdamond->damon_ctx, param_ctx);
out:
+ damon_destroy_ctx(test_ctx);
damon_destroy_ctx(param_ctx);
return err;
}
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-core-fix-list_add_tail-call-on-damon_call.patch
mm-damon-core-use-damos_commit_quota_goal-for-new-goal-commit.patch
mm-zswap-remove-unnecessary-dlen-writes-for-incompressible-pages.patch
mm-zswap-fix-typos-s-zwap-zswap.patch
mm-zswap-s-red-black-tree-xarray.patch
docs-admin-guide-mm-zswap-s-red-black-tree-xarray.patch
The quilt patch titled
Subject: mm/damon/sysfs: catch commit test ctx alloc failure
has been removed from the -mm tree. Its filename was
mm-damon-sysfs-catch-commit-test-ctx-alloc-failure.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/sysfs: catch commit test ctx alloc failure
Date: Fri, 3 Oct 2025 13:14:54 -0700
Patch series "mm/damon/sysfs: fix commit test damon_ctx [de]allocation".
DAMON sysfs interface dynamically allocates and uses a damon_ctx object
for testing if given inputs for online DAMON parameters update is valid.
The object is being used without an allocation failure check, and leaked
when the test succeeds. Fix the two bugs.
This patch (of 2):
The damon_ctx for testing online DAMON parameters commit inputs is used
without its allocation failure check. This could result in an invalid
memory access. Fix it by directly returning an error when the allocation
failed.
Link: https://lkml.kernel.org/r/20251003201455.41448-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20251003201455.41448-2-sj@kernel.org
Fixes: 4c9ea539ad59 ("mm/damon/sysfs: validate user inputs from damon_sysfs_commit_input()")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.15+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/damon/sysfs.c~mm-damon-sysfs-catch-commit-test-ctx-alloc-failure
+++ a/mm/damon/sysfs.c
@@ -1473,6 +1473,8 @@ static int damon_sysfs_commit_input(void
if (IS_ERR(param_ctx))
return PTR_ERR(param_ctx);
test_ctx = damon_new_ctx();
+ if (!test_ctx)
+ return -ENOMEM;
err = damon_commit_ctx(test_ctx, param_ctx);
if (err) {
damon_destroy_ctx(test_ctx);
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-core-fix-list_add_tail-call-on-damon_call.patch
mm-damon-core-use-damos_commit_quota_goal-for-new-goal-commit.patch
mm-zswap-remove-unnecessary-dlen-writes-for-incompressible-pages.patch
mm-zswap-fix-typos-s-zwap-zswap.patch
mm-zswap-s-red-black-tree-xarray.patch
docs-admin-guide-mm-zswap-s-red-black-tree-xarray.patch
The quilt patch titled
Subject: hung_task: fix warnings caused by unaligned lock pointers
has been removed from the -mm tree. Its filename was
hung_task-fix-warnings-caused-by-unaligned-lock-pointers.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Lance Yang <lance.yang(a)linux.dev>
Subject: hung_task: fix warnings caused by unaligned lock pointers
Date: Tue, 9 Sep 2025 22:52:43 +0800
From: Lance Yang <lance.yang(a)linux.dev>
The blocker tracking mechanism assumes that lock pointers are at least
4-byte aligned to use their lower bits for type encoding.
However, as reported by Eero Tamminen, some architectures like m68k
only guarantee 2-byte alignment of 32-bit values. This breaks the
assumption and causes two related WARN_ON_ONCE checks to trigger.
To fix this, the runtime checks are adjusted to silently ignore any lock
that is not 4-byte aligned, effectively disabling the feature in such
cases and avoiding the related warnings.
Thanks to Geert Uytterhoeven for bisecting!
Link: https://lkml.kernel.org/r/20250909145243.17119-1-lance.yang@linux.dev
Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
Reported-by: Eero Tamminen <oak(a)helsinkinet.fi>
Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58R…
Reviewed-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Cc: John Paul Adrian Glaubitz <glaubitz(a)physik.fu-berlin.de>
Cc: Anna Schumaker <anna.schumaker(a)oracle.com>
Cc: Boqun Feng <boqun.feng(a)gmail.com>
Cc: Finn Thain <fthain(a)linux-m68k.org>
Cc: Geert Uytterhoeven <geert(a)linux-m68k.org>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Joel Granados <joel.granados(a)kernel.org>
Cc: John Stultz <jstultz(a)google.com>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Lance Yang <lance.yang(a)linux.dev>
Cc: Mingzhe Yang <mingzhe.yang(a)ly.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Tomasz Figa <tfiga(a)chromium.org>
Cc: Waiman Long <longman(a)redhat.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Yongliang Gao <leonylgao(a)tencent.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/hung_task.h | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--- a/include/linux/hung_task.h~hung_task-fix-warnings-caused-by-unaligned-lock-pointers
+++ a/include/linux/hung_task.h
@@ -20,6 +20,10 @@
* always zero. So we can use these bits to encode the specific blocking
* type.
*
+ * Note that on architectures where this is not guaranteed, or for any
+ * unaligned lock, this tracking mechanism is silently skipped for that
+ * lock.
+ *
* Type encoding:
* 00 - Blocked on mutex (BLOCKER_TYPE_MUTEX)
* 01 - Blocked on semaphore (BLOCKER_TYPE_SEM)
@@ -45,7 +49,7 @@ static inline void hung_task_set_blocker
* If the lock pointer matches the BLOCKER_TYPE_MASK, return
* without writing anything.
*/
- if (WARN_ON_ONCE(lock_ptr & BLOCKER_TYPE_MASK))
+ if (lock_ptr & BLOCKER_TYPE_MASK)
return;
WRITE_ONCE(current->blocker, lock_ptr | type);
@@ -53,8 +57,6 @@ static inline void hung_task_set_blocker
static inline void hung_task_clear_blocker(void)
{
- WARN_ON_ONCE(!READ_ONCE(current->blocker));
-
WRITE_ONCE(current->blocker, 0UL);
}
_
Patches currently in -mm which might be from lance.yang(a)linux.dev are
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x f97aef092e199c10a3da96ae79b571edd5362faa
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101528-barcode-doorstop-420a@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f97aef092e199c10a3da96ae79b571edd5362faa Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Fri, 26 Sep 2025 12:12:37 +0200
Subject: [PATCH] cpufreq: Make drivers using CPUFREQ_ETERNAL specify
transition latency
Commit a755d0e2d41b ("cpufreq: Honour transition_latency over
transition_delay_us") caused platforms where cpuinfo.transition_latency
is CPUFREQ_ETERNAL to get a very large transition latency whereas
previously it had been capped at 10 ms (and later at 2 ms).
This led to a user-observable regression between 6.6 and 6.12 as
described by Shawn:
"The dbs sampling_rate was 10000 us on 6.6 and suddently becomes
6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these platforms
because the default transition delay was dropped [...].
It slows down dbs governor's reacting to CPU loading change
dramatically. Also, as transition_delay_us is used by schedutil
governor as rate_limit_us, it shows a negative impact on device
idle power consumption, because the device gets slightly less time
in the lowest OPP."
Evidently, the expectation of the drivers using CPUFREQ_ETERNAL as
cpuinfo.transition_latency was that it would be capped by the core,
but they may as well return a default transition latency value instead
of CPUFREQ_ETERNAL and the core need not do anything with it.
Accordingly, introduce CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS and make
all of the drivers in question use it instead of CPUFREQ_ETERNAL. Also
update the related Rust binding.
Fixes: a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us")
Closes: https://lore.kernel.org/linux-pm/20250922125929.453444-1-shawnguo2@yeah.net/
Reported-by: Shawn Guo <shawnguo(a)kernel.org>
Reviewed-by: Mario Limonciello (AMD) <superm1(a)kernel.org>
Reviewed-by: Jie Zhan <zhanjie9(a)hisilicon.com>
Acked-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Cc: 6.6+ <stable(a)vger.kernel.org> # 6.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Link: https://patch.msgid.link/2264949.irdbgypaU6@rafael.j.wysocki
[ rjw: Fix typo in new symbol name, drop redundant type cast from Rust binding ]
Tested-by: Shawn Guo <shawnguo(a)kernel.org> # with cpufreq-dt driver
Reviewed-by: Qais Yousef <qyousef(a)layalina.io>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index 506437489b4d..7d5079fd1688 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -104,7 +104,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
transition_latency = dev_pm_opp_get_max_transition_latency(cpu_dev);
if (!transition_latency)
- transition_latency = CPUFREQ_ETERNAL;
+ transition_latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
cpumask_copy(policy->cpus, priv->cpus);
policy->driver_data = priv;
diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index db1c88e9d3f9..e93697d3edfd 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -442,7 +442,7 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
}
if (of_property_read_u32(np, "clock-latency", &transition_latency))
- transition_latency = CPUFREQ_ETERNAL;
+ transition_latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
/*
* Calculate the ramp time for max voltage change in the
diff --git a/drivers/cpufreq/mediatek-cpufreq-hw.c b/drivers/cpufreq/mediatek-cpufreq-hw.c
index fce5aa5ceea0..ae4500ab4891 100644
--- a/drivers/cpufreq/mediatek-cpufreq-hw.c
+++ b/drivers/cpufreq/mediatek-cpufreq-hw.c
@@ -309,7 +309,7 @@ static int mtk_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
latency = readl_relaxed(data->reg_bases[REG_FREQ_LATENCY]) * 1000;
if (!latency)
- latency = CPUFREQ_ETERNAL;
+ latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
policy->cpuinfo.transition_latency = latency;
policy->fast_switch_possible = true;
diff --git a/drivers/cpufreq/rcpufreq_dt.rs b/drivers/cpufreq/rcpufreq_dt.rs
index 7e1fbf9a091f..3909022e1c74 100644
--- a/drivers/cpufreq/rcpufreq_dt.rs
+++ b/drivers/cpufreq/rcpufreq_dt.rs
@@ -123,7 +123,7 @@ fn init(policy: &mut cpufreq::Policy) -> Result<Self::PData> {
let mut transition_latency = opp_table.max_transition_latency_ns() as u32;
if transition_latency == 0 {
- transition_latency = cpufreq::ETERNAL_LATENCY_NS;
+ transition_latency = cpufreq::DEFAULT_TRANSITION_LATENCY_NS;
}
policy
diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
index 38c165d526d1..d2a110079f5f 100644
--- a/drivers/cpufreq/scmi-cpufreq.c
+++ b/drivers/cpufreq/scmi-cpufreq.c
@@ -294,7 +294,7 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
latency = perf_ops->transition_latency_get(ph, domain);
if (!latency)
- latency = CPUFREQ_ETERNAL;
+ latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
policy->cpuinfo.transition_latency = latency;
diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
index dcbb0ae7dd47..e530345baddf 100644
--- a/drivers/cpufreq/scpi-cpufreq.c
+++ b/drivers/cpufreq/scpi-cpufreq.c
@@ -157,7 +157,7 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
latency = scpi_ops->get_transition_latency(cpu_dev);
if (!latency)
- latency = CPUFREQ_ETERNAL;
+ latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
policy->cpuinfo.transition_latency = latency;
diff --git a/drivers/cpufreq/spear-cpufreq.c b/drivers/cpufreq/spear-cpufreq.c
index 707c71090cc3..2a1550e1aa21 100644
--- a/drivers/cpufreq/spear-cpufreq.c
+++ b/drivers/cpufreq/spear-cpufreq.c
@@ -182,7 +182,7 @@ static int spear_cpufreq_probe(struct platform_device *pdev)
if (of_property_read_u32(np, "clock-latency",
&spear_cpufreq.transition_latency))
- spear_cpufreq.transition_latency = CPUFREQ_ETERNAL;
+ spear_cpufreq.transition_latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
cnt = of_property_count_u32_elems(np, "cpufreq_tbl");
if (cnt <= 0) {
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 40966512ea18..bc8c083bc16a 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -32,6 +32,9 @@
*/
#define CPUFREQ_ETERNAL (-1)
+
+#define CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS NSEC_PER_MSEC
+
#define CPUFREQ_NAME_LEN 16
/* Print length for names. Extra 1 space for accommodating '\n' in prints */
#define CPUFREQ_NAME_PLEN (CPUFREQ_NAME_LEN + 1)
diff --git a/rust/kernel/cpufreq.rs b/rust/kernel/cpufreq.rs
index eea57ba95f24..2ea735700ae7 100644
--- a/rust/kernel/cpufreq.rs
+++ b/rust/kernel/cpufreq.rs
@@ -39,7 +39,8 @@
const CPUFREQ_NAME_LEN: usize = bindings::CPUFREQ_NAME_LEN as usize;
/// Default transition latency value in nanoseconds.
-pub const ETERNAL_LATENCY_NS: u32 = bindings::CPUFREQ_ETERNAL as u32;
+pub const DEFAULT_TRANSITION_LATENCY_NS: u32 =
+ bindings::CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
/// CPU frequency driver flags.
pub mod flags {
@@ -400,13 +401,13 @@ pub fn to_table(mut self) -> Result<TableBox> {
/// The following example demonstrates how to create a CPU frequency table.
///
/// ```
-/// use kernel::cpufreq::{ETERNAL_LATENCY_NS, Policy};
+/// use kernel::cpufreq::{DEFAULT_TRANSITION_LATENCY_NS, Policy};
///
/// fn update_policy(policy: &mut Policy) {
/// policy
/// .set_dvfs_possible_from_any_cpu(true)
/// .set_fast_switch_possible(true)
-/// .set_transition_latency_ns(ETERNAL_LATENCY_NS);
+/// .set_transition_latency_ns(DEFAULT_TRANSITION_LATENCY_NS);
///
/// pr_info!("The policy details are: {:?}\n", (policy.cpu(), policy.cur()));
/// }
From: Yeoreum Yun <yeoreum.yun(a)arm.com>
[ Upstream commit 3b7a34aebbdf2a4b7295205bf0c654294283ec82 ]
Commit a3c3c66670ce ("perf/core: Fix child_total_time_enabled accounting
bug at task exit") moves the event->state update to before
list_del_event(). This makes the event->state test in list_del_event()
always false; never calling perf_cgroup_event_disable().
As a result, cpuctx->cgrp won't be cleared properly; causing havoc.
Cc: stable(a)vger.kernel.org # 6.6.x, 6.12.x
Fixes: a3c3c66670ce ("perf/core: Fix child_total_time_enabled accounting bug at task exit")
Signed-off-by: Chris J Arges <carges(a)cloudflare.com>
---
kernel/events/core.c | 21 ++++++---------------
1 file changed, 6 insertions(+), 15 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3cc06ffb60c1..6688660845d2 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2100,18 +2100,6 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
if (event->group_leader == event)
del_event_from_groups(event, ctx);
- /*
- * If event was in error state, then keep it
- * that way, otherwise bogus counts will be
- * returned on read(). The only way to get out
- * of error state is by explicit re-enabling
- * of the event
- */
- if (event->state > PERF_EVENT_STATE_OFF) {
- perf_cgroup_event_disable(event, ctx);
- perf_event_set_state(event, PERF_EVENT_STATE_OFF);
- }
-
ctx->generation++;
event->pmu_ctx->nr_events--;
}
@@ -2456,11 +2444,14 @@ __perf_remove_from_context(struct perf_event *event,
*/
if (flags & DETACH_EXIT)
state = PERF_EVENT_STATE_EXIT;
- if (flags & DETACH_DEAD) {
- event->pending_disable = 1;
+ if (flags & DETACH_DEAD)
state = PERF_EVENT_STATE_DEAD;
- }
+
event_sched_out(event, ctx);
+
+ if (event->state > PERF_EVENT_STATE_OFF)
+ perf_cgroup_event_disable(event, ctx);
+
perf_event_set_state(event, min(event->state, state));
if (flags & DETACH_GROUP)
perf_group_detach(event);
--
2.43.0
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 72d271a7baa7062cb27e774ac37c5459c6d20e22
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101539-racoon-uneasily-cfd4@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 72d271a7baa7062cb27e774ac37c5459c6d20e22 Mon Sep 17 00:00:00 2001
From: Aleksa Sarai <cyphar(a)cyphar.com>
Date: Thu, 7 Aug 2025 03:55:23 +1000
Subject: [PATCH] fscontext: do not consume log entries when returning
-EMSGSIZE
Userspace generally expects APIs that return -EMSGSIZE to allow for them
to adjust their buffer size and retry the operation. However, the
fscontext log would previously clear the message even in the -EMSGSIZE
case.
Given that it is very cheap for us to check whether the buffer is too
small before we remove the message from the ring buffer, let's just do
that instead. While we're at it, refactor some fscontext_read() into a
separate helper to make the ring buffer logic a bit easier to read.
Fixes: 007ec26cdc9f ("vfs: Implement logging through fs_context")
Cc: David Howells <dhowells(a)redhat.com>
Cc: stable(a)vger.kernel.org # v5.2+
Signed-off-by: Aleksa Sarai <cyphar(a)cyphar.com>
Link: https://lore.kernel.org/20250807-fscontext-log-cleanups-v3-1-8d91d6242dc3@c…
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/fsopen.c b/fs/fsopen.c
index 1aaf4cb2afb2..f645c99204eb 100644
--- a/fs/fsopen.c
+++ b/fs/fsopen.c
@@ -18,50 +18,56 @@
#include "internal.h"
#include "mount.h"
+static inline const char *fetch_message_locked(struct fc_log *log, size_t len,
+ bool *need_free)
+{
+ const char *p;
+ int index;
+
+ if (unlikely(log->head == log->tail))
+ return ERR_PTR(-ENODATA);
+
+ index = log->tail & (ARRAY_SIZE(log->buffer) - 1);
+ p = log->buffer[index];
+ if (unlikely(strlen(p) > len))
+ return ERR_PTR(-EMSGSIZE);
+
+ log->buffer[index] = NULL;
+ *need_free = log->need_free & (1 << index);
+ log->need_free &= ~(1 << index);
+ log->tail++;
+
+ return p;
+}
+
/*
* Allow the user to read back any error, warning or informational messages.
+ * Only one message is returned for each read(2) call.
*/
static ssize_t fscontext_read(struct file *file,
char __user *_buf, size_t len, loff_t *pos)
{
struct fs_context *fc = file->private_data;
- struct fc_log *log = fc->log.log;
- unsigned int logsize = ARRAY_SIZE(log->buffer);
- ssize_t ret;
- char *p;
+ ssize_t err;
+ const char *p __free(kfree) = NULL, *message;
bool need_free;
- int index, n;
+ int n;
- ret = mutex_lock_interruptible(&fc->uapi_mutex);
- if (ret < 0)
- return ret;
-
- if (log->head == log->tail) {
- mutex_unlock(&fc->uapi_mutex);
- return -ENODATA;
- }
-
- index = log->tail & (logsize - 1);
- p = log->buffer[index];
- need_free = log->need_free & (1 << index);
- log->buffer[index] = NULL;
- log->need_free &= ~(1 << index);
- log->tail++;
+ err = mutex_lock_interruptible(&fc->uapi_mutex);
+ if (err < 0)
+ return err;
+ message = fetch_message_locked(fc->log.log, len, &need_free);
mutex_unlock(&fc->uapi_mutex);
+ if (IS_ERR(message))
+ return PTR_ERR(message);
- ret = -EMSGSIZE;
- n = strlen(p);
- if (n > len)
- goto err_free;
- ret = -EFAULT;
- if (copy_to_user(_buf, p, n) != 0)
- goto err_free;
- ret = n;
-
-err_free:
if (need_free)
- kfree(p);
- return ret;
+ p = message;
+
+ n = strlen(message);
+ if (copy_to_user(_buf, message, n))
+ return -EFAULT;
+ return n;
}
static int fscontext_release(struct inode *inode, struct file *file)
commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection
in the CPU hotplug path")
adds a call to detect_cache_attributes() to populate the cacheinfo
before updating the siblings mask. detect_cache_attributes() allocates
memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
kernels, on secondary CPUs, this triggers a:
'BUG: sleeping function called from invalid context'
as the code is executed with preemption and interrupts disabled:
| BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
| in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
| preempt_count: 1, expected: 0
| RCU nest depth: 1, expected: 1
| 3 locks held by swapper/111/0:
| #0: (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
| #1: (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
| #2: (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
| irq event stamp: 0
| hardirqs last enabled at (0): 0x0
| hardirqs last disabled at (0): copy_process+0x5dc/0x1ab8
| softirqs last enabled at (0): copy_process+0x5dc/0x1ab8
| softirqs last disabled at (0): 0x0
| Preemption disabled at:
| migrate_enable+0x30/0x130
| CPU: 111 PID: 0 Comm: swapper/111 Tainted: G W 6.0.0-rc4-rt6-[...]
| Call trace:
| __kmalloc+0xbc/0x1e8
| detect_cache_attributes+0x2d4/0x5f0
| update_siblings_masks+0x30/0x368
| store_cpu_topology+0x78/0xb8
| secondary_start_kernel+0xd0/0x198
| __secondary_switched+0xb0/0xb4
Pierre fixed this issue in the upstream 6.3 and the original series is follows:
https://lore.kernel.org/all/167404285593.885445.6219705651301997538.b4-ty@a…
We also encountered the same issue on 6.1 stable branch, and need to backport this series.
Pierre Gondois (6):
cacheinfo: Use RISC-V's init_cache_level() as generic OF
implementation
cacheinfo: Return error code in init_of_cache_level()
cacheinfo: Check 'cache-unified' property to count cache leaves
ACPI: PPTT: Remove acpi_find_cache_levels()
ACPI: PPTT: Update acpi_find_last_cache_level() to
acpi_get_cache_info()
arch_topology: Build cacheinfo from primary CPU
arch/arm64/kernel/cacheinfo.c | 11 ++-
arch/riscv/kernel/cacheinfo.c | 42 -----------
drivers/acpi/pptt.c | 93 +++++++++++++----------
drivers/base/arch_topology.c | 12 ++-
drivers/base/cacheinfo.c | 134 +++++++++++++++++++++++++++++-----
include/linux/cacheinfo.h | 11 ++-
6 files changed, 196 insertions(+), 107 deletions(-)
--
2.25.1
From: Tim Hostetler <thostet(a)google.com>
The device returns a valid bit in the LSB of the low timestamp byte in
the completion descriptor that the driver should check before
setting the SKB's hardware timestamp. If the timestamp is not valid, do not
hardware timestamp the SKB.
Cc: stable(a)vger.kernel.org
Fixes: b2c7aeb49056 ("gve: Implement ndo_hwtstamp_get/set for RX timestamping")
Reviewed-by: Joshua Washington <joshwash(a)google.com>
Signed-off-by: Tim Hostetler <thostet(a)google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy(a)google.com>
---
drivers/net/ethernet/google/gve/gve.h | 2 ++
drivers/net/ethernet/google/gve/gve_desc_dqo.h | 3 ++-
drivers/net/ethernet/google/gve/gve_rx_dqo.c | 18 ++++++++++++------
3 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
index bceaf9b05cb4..4cc6dcbfd367 100644
--- a/drivers/net/ethernet/google/gve/gve.h
+++ b/drivers/net/ethernet/google/gve/gve.h
@@ -100,6 +100,8 @@
*/
#define GVE_DQO_QPL_ONDEMAND_ALLOC_THRESHOLD 96
+#define GVE_DQO_RX_HWTSTAMP_VALID 0x1
+
/* Each slot in the desc ring has a 1:1 mapping to a slot in the data ring */
struct gve_rx_desc_queue {
struct gve_rx_desc *desc_ring; /* the descriptor ring */
diff --git a/drivers/net/ethernet/google/gve/gve_desc_dqo.h b/drivers/net/ethernet/google/gve/gve_desc_dqo.h
index d17da841b5a0..f7786b03c744 100644
--- a/drivers/net/ethernet/google/gve/gve_desc_dqo.h
+++ b/drivers/net/ethernet/google/gve/gve_desc_dqo.h
@@ -236,7 +236,8 @@ struct gve_rx_compl_desc_dqo {
u8 status_error1;
- __le16 reserved5;
+ u8 reserved5;
+ u8 ts_sub_nsecs_low;
__le16 buf_id; /* Buffer ID which was sent on the buffer queue. */
union {
diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
index 7380c2b7a2d8..02e25be8a50d 100644
--- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
@@ -456,14 +456,20 @@ static void gve_rx_skb_hash(struct sk_buff *skb,
* Note that this means if the time delta between packet reception and the last
* clock read is greater than ~2 seconds, this will provide invalid results.
*/
-static void gve_rx_skb_hwtstamp(struct gve_rx_ring *rx, u32 hwts)
+static void gve_rx_skb_hwtstamp(struct gve_rx_ring *rx,
+ const struct gve_rx_compl_desc_dqo *desc)
{
u64 last_read = READ_ONCE(rx->gve->last_sync_nic_counter);
struct sk_buff *skb = rx->ctx.skb_head;
- u32 low = (u32)last_read;
- s32 diff = hwts - low;
-
- skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(last_read + diff);
+ u32 ts, low;
+ s32 diff;
+
+ if (desc->ts_sub_nsecs_low & GVE_DQO_RX_HWTSTAMP_VALID) {
+ ts = le32_to_cpu(desc->ts);
+ low = (u32)last_read;
+ diff = ts - low;
+ skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(last_read + diff);
+ }
}
static void gve_rx_free_skb(struct napi_struct *napi, struct gve_rx_ring *rx)
@@ -919,7 +925,7 @@ static int gve_rx_complete_skb(struct gve_rx_ring *rx, struct napi_struct *napi,
gve_rx_skb_csum(rx->ctx.skb_head, desc, ptype);
if (rx->gve->ts_config.rx_filter == HWTSTAMP_FILTER_ALL)
- gve_rx_skb_hwtstamp(rx, le32_to_cpu(desc->ts));
+ gve_rx_skb_hwtstamp(rx, desc);
/* RSC packets must set gso_size otherwise the TCP stack will complain
* that packets are larger than MTU.
--
2.51.0.740.g6adb054d12-goog
From: Lad Prabhakar <prabhakar.mahadev-lad.rj(a)bp.renesas.com>
On SoCs that only support the best-effort queue and not the network
control queue, calling alloc_etherdev_mqs() with fixed values for
TX/RX queues is not appropriate. Use the nc_queues flag from the
per-SoC match data to determine whether the network control queue
is available, and fall back to a single TX/RX queue when it is not.
This ensures correct queue allocation across all supported SoCs.
Fixes: a92f4f0662bf ("ravb: Add nc_queue to struct ravb_hw_info")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj(a)bp.renesas.com>
---
drivers/net/ethernet/renesas/ravb_main.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 69d382e8757d..a200e205825a 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -2926,13 +2926,14 @@ static int ravb_probe(struct platform_device *pdev)
return dev_err_probe(&pdev->dev, PTR_ERR(rstc),
"failed to get cpg reset\n");
+ info = of_device_get_match_data(&pdev->dev);
+
ndev = alloc_etherdev_mqs(sizeof(struct ravb_private),
- NUM_TX_QUEUE, NUM_RX_QUEUE);
+ info->nc_queues ? NUM_TX_QUEUE : 1,
+ info->nc_queues ? NUM_RX_QUEUE : 1);
if (!ndev)
return -ENOMEM;
- info = of_device_get_match_data(&pdev->dev);
-
ndev->features = info->net_features;
ndev->hw_features = info->net_hw_features;
ndev->vlan_features = info->vlan_features;
--
2.43.0
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 7b26da407420e5054e3f06c5d13271697add9423
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101534-attempt-stubbly-cf5f@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b26da407420e5054e3f06c5d13271697add9423 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Fri, 19 Sep 2025 14:33:23 +0930
Subject: [PATCH] btrfs: fix the incorrect max_bytes value for
find_lock_delalloc_range()
[BUG]
With my local branch to enable bs > ps support for btrfs, sometimes I
hit the following ASSERT() inside submit_one_sector():
ASSERT(block_start != EXTENT_MAP_HOLE);
Please note that it's not yet possible to hit this ASSERT() in the wild
yet, as it requires btrfs bs > ps support, which is not even in the
development branch.
But on the other hand, there is also a very low chance to hit above
ASSERT() with bs < ps cases, so this is an existing bug affect not only
the incoming bs > ps support but also the existing bs < ps support.
[CAUSE]
Firstly that ASSERT() means we're trying to submit a dirty block but
without a real extent map nor ordered extent map backing it.
Furthermore with extra debugging, the folio triggering such ASSERT() is
always larger than the fs block size in my bs > ps case.
(8K block size, 4K page size)
After some more debugging, the ASSERT() is trigger by the following
sequence:
extent_writepage()
| We got a 32K folio (4 fs blocks) at file offset 0, and the fs block
| size is 8K, page size is 4K.
| And there is another 8K folio at file offset 32K, which is also
| dirty.
| So the filemap layout looks like the following:
|
| "||" is the filio boundary in the filemap.
| "//| is the dirty range.
|
| 0 8K 16K 24K 32K 40K
| |////////| |//////////////////////||////////|
|
|- writepage_delalloc()
| |- find_lock_delalloc_range() for [0, 8K)
| | Now range [0, 8K) is properly locked.
| |
| |- find_lock_delalloc_range() for [16K, 40K)
| | |- btrfs_find_delalloc_range() returned range [16K, 40K)
| | |- lock_delalloc_folios() locked folio 0 successfully
| | |
| | | The filemap range [32K, 40K) got dropped from filemap.
| | |
| | |- lock_delalloc_folios() failed with -EAGAIN on folio 32K
| | | As the folio at 32K is dropped.
| | |
| | |- loops = 1;
| | |- max_bytes = PAGE_SIZE;
| | |- goto again;
| | | This will re-do the lookup for dirty delalloc ranges.
| | |
| | |- btrfs_find_delalloc_range() called with @max_bytes == 4K
| | | This is smaller than block size, so
| | | btrfs_find_delalloc_range() is unable to return any range.
| | \- return false;
| |
| \- Now only range [0, 8K) has an OE for it, but for dirty range
| [16K, 32K) it's dirty without an OE.
| This breaks the assumption that writepage_delalloc() will find
| and lock all dirty ranges inside the folio.
|
|- extent_writepage_io()
|- submit_one_sector() for [0, 8K)
| Succeeded
|
|- submit_one_sector() for [16K, 24K)
Triggering the ASSERT(), as there is no OE, and the original
extent map is a hole.
Please note that, this also exposed the same problem for bs < ps
support. E.g. with 64K page size and 4K block size.
If we failed to lock a folio, and falls back into the "loops = 1;"
branch, we will re-do the search using 64K as max_bytes.
Which may fail again to lock the next folio, and exit early without
handling all dirty blocks inside the folio.
[FIX]
Instead of using the fixed size PAGE_SIZE as @max_bytes, use
@sectorsize, so that we are ensured to find and lock any remaining
blocks inside the folio.
And since we're here, add an extra ASSERT() to
before calling btrfs_find_delalloc_range() to make sure the @max_bytes is
at least no smaller than a block to avoid false negative.
Cc: stable(a)vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0782533aad51..2b6027ebf265 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -393,6 +393,13 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
/* step one, find a bunch of delalloc bytes starting at start */
delalloc_start = *start;
delalloc_end = 0;
+
+ /*
+ * If @max_bytes is smaller than a block, btrfs_find_delalloc_range() can
+ * return early without handling any dirty ranges.
+ */
+ ASSERT(max_bytes >= fs_info->sectorsize);
+
found = btrfs_find_delalloc_range(tree, &delalloc_start, &delalloc_end,
max_bytes, &cached_state);
if (!found || delalloc_end <= *start || delalloc_start > orig_end) {
@@ -423,13 +430,14 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
delalloc_end);
ASSERT(!ret || ret == -EAGAIN);
if (ret == -EAGAIN) {
- /* some of the folios are gone, lets avoid looping by
- * shortening the size of the delalloc range we're searching
+ /*
+ * Some of the folios are gone, lets avoid looping by
+ * shortening the size of the delalloc range we're searching.
*/
btrfs_free_extent_state(cached_state);
cached_state = NULL;
if (!loops) {
- max_bytes = PAGE_SIZE;
+ max_bytes = fs_info->sectorsize;
loops = 1;
goto again;
} else {
The L1 substates support requires additional steps to work, see e.g.
section '11.6.6.4 L1 Substate' in the RK3588 TRM V1.0.
These steps are currently missing from the driver.
While this has always been a problem when using e.g.
CONFIG_PCIEASPM_POWER_SUPERSAVE=y, the problem became more apparent after
commit f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for
devicetree platforms"), which enabled ASPM also for
CONFIG_PCIEASPM_DEFAULT=y.
Disable L1 substates until proper support is added.
Cc: stable(a)vger.kernel.org
Fixes: 0e898eb8df4e ("PCI: rockchip-dwc: Add Rockchip RK356X host controller driver")
Fixes: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms")
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
drivers/pci/controller/dwc/pcie-dw-rockchip.c | 22 +++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
index 3e2752c7dd09..28e0fffe2542 100644
--- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
+++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
@@ -200,6 +200,26 @@ static bool rockchip_pcie_link_up(struct dw_pcie *pci)
return FIELD_GET(PCIE_LINKUP_MASK, val) == PCIE_LINKUP;
}
+/*
+ * See e.g. section '11.6.6.4 L1 Substate' in the RK3588 TRM V1.0 for the steps
+ * needed to support L1 substates. Currently, not a single rockchip platform
+ * performs these steps, so disable L1 substates until there is proper support.
+ */
+static void rockchip_pcie_disable_l1sub(struct dw_pcie *pci)
+{
+ u32 cap, l1subcap;
+
+ cap = dw_pcie_find_ext_capability(pci, PCI_EXT_CAP_ID_L1SS);
+ if (cap) {
+ l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
+ l1subcap &= ~(PCI_L1SS_CAP_L1_PM_SS | PCI_L1SS_CAP_ASPM_L1_1 |
+ PCI_L1SS_CAP_ASPM_L1_2 | PCI_L1SS_CAP_PCIPM_L1_1 |
+ PCI_L1SS_CAP_PCIPM_L1_2);
+ dw_pcie_writel_dbi(pci, cap + PCI_L1SS_CAP, l1subcap);
+ l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
+ }
+}
+
static void rockchip_pcie_enable_l0s(struct dw_pcie *pci)
{
u32 cap, lnkcap;
@@ -264,6 +284,7 @@ static int rockchip_pcie_host_init(struct dw_pcie_rp *pp)
irq_set_chained_handler_and_data(irq, rockchip_pcie_intx_handler,
rockchip);
+ rockchip_pcie_disable_l1sub(pci);
rockchip_pcie_enable_l0s(pci);
return 0;
@@ -301,6 +322,7 @@ static void rockchip_pcie_ep_init(struct dw_pcie_ep *ep)
struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
enum pci_barno bar;
+ rockchip_pcie_disable_l1sub(pci);
rockchip_pcie_enable_l0s(pci);
rockchip_pcie_ep_hide_broken_ats_cap_rk3588(ep);
--
2.51.0
According to Peter, we've had for a very long time an issue on some
mutltiouch touchpads where the fingers were stuck in a scrolling mode,
or 3 fingers gesture mode. I was unable to debug it because it was
rather hard to reproduce.
Recently, some people raised the issue again on libinput, and this time
added a recording of the actual bug.
It turns out that the sticky finger quirk that was introduced back in
2017 was only checking the last report, and that those missing releases
also happen when moving from 3 to 1 finger (only 1 is released instead
of 2).
This solution seems to me to be the most sensible, because we could also
add the NSMU quirk to win8 multitouch touchpads, but this would involve
a lot more computations at each report for rather annoying corner cases.
Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/1194
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Benjamin Tissoires (2):
HID: multitouch: fix sticky fingers
selftests/hid: add tests for missing release on the Dell Synaptics
drivers/hid/hid-multitouch.c | 27 ++++++-----
.../testing/selftests/hid/tests/test_multitouch.py | 55 ++++++++++++++++++++++
2 files changed, 69 insertions(+), 13 deletions(-)
---
base-commit: 54ba6d9b1393a0061600c0e49c8ebef65d60a8b2
change-id: 20250926-fix-sticky-fingers-8ae88436ae82
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 7b26da407420e5054e3f06c5d13271697add9423
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101530-composed-concave-1075@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b26da407420e5054e3f06c5d13271697add9423 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Fri, 19 Sep 2025 14:33:23 +0930
Subject: [PATCH] btrfs: fix the incorrect max_bytes value for
find_lock_delalloc_range()
[BUG]
With my local branch to enable bs > ps support for btrfs, sometimes I
hit the following ASSERT() inside submit_one_sector():
ASSERT(block_start != EXTENT_MAP_HOLE);
Please note that it's not yet possible to hit this ASSERT() in the wild
yet, as it requires btrfs bs > ps support, which is not even in the
development branch.
But on the other hand, there is also a very low chance to hit above
ASSERT() with bs < ps cases, so this is an existing bug affect not only
the incoming bs > ps support but also the existing bs < ps support.
[CAUSE]
Firstly that ASSERT() means we're trying to submit a dirty block but
without a real extent map nor ordered extent map backing it.
Furthermore with extra debugging, the folio triggering such ASSERT() is
always larger than the fs block size in my bs > ps case.
(8K block size, 4K page size)
After some more debugging, the ASSERT() is trigger by the following
sequence:
extent_writepage()
| We got a 32K folio (4 fs blocks) at file offset 0, and the fs block
| size is 8K, page size is 4K.
| And there is another 8K folio at file offset 32K, which is also
| dirty.
| So the filemap layout looks like the following:
|
| "||" is the filio boundary in the filemap.
| "//| is the dirty range.
|
| 0 8K 16K 24K 32K 40K
| |////////| |//////////////////////||////////|
|
|- writepage_delalloc()
| |- find_lock_delalloc_range() for [0, 8K)
| | Now range [0, 8K) is properly locked.
| |
| |- find_lock_delalloc_range() for [16K, 40K)
| | |- btrfs_find_delalloc_range() returned range [16K, 40K)
| | |- lock_delalloc_folios() locked folio 0 successfully
| | |
| | | The filemap range [32K, 40K) got dropped from filemap.
| | |
| | |- lock_delalloc_folios() failed with -EAGAIN on folio 32K
| | | As the folio at 32K is dropped.
| | |
| | |- loops = 1;
| | |- max_bytes = PAGE_SIZE;
| | |- goto again;
| | | This will re-do the lookup for dirty delalloc ranges.
| | |
| | |- btrfs_find_delalloc_range() called with @max_bytes == 4K
| | | This is smaller than block size, so
| | | btrfs_find_delalloc_range() is unable to return any range.
| | \- return false;
| |
| \- Now only range [0, 8K) has an OE for it, but for dirty range
| [16K, 32K) it's dirty without an OE.
| This breaks the assumption that writepage_delalloc() will find
| and lock all dirty ranges inside the folio.
|
|- extent_writepage_io()
|- submit_one_sector() for [0, 8K)
| Succeeded
|
|- submit_one_sector() for [16K, 24K)
Triggering the ASSERT(), as there is no OE, and the original
extent map is a hole.
Please note that, this also exposed the same problem for bs < ps
support. E.g. with 64K page size and 4K block size.
If we failed to lock a folio, and falls back into the "loops = 1;"
branch, we will re-do the search using 64K as max_bytes.
Which may fail again to lock the next folio, and exit early without
handling all dirty blocks inside the folio.
[FIX]
Instead of using the fixed size PAGE_SIZE as @max_bytes, use
@sectorsize, so that we are ensured to find and lock any remaining
blocks inside the folio.
And since we're here, add an extra ASSERT() to
before calling btrfs_find_delalloc_range() to make sure the @max_bytes is
at least no smaller than a block to avoid false negative.
Cc: stable(a)vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0782533aad51..2b6027ebf265 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -393,6 +393,13 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
/* step one, find a bunch of delalloc bytes starting at start */
delalloc_start = *start;
delalloc_end = 0;
+
+ /*
+ * If @max_bytes is smaller than a block, btrfs_find_delalloc_range() can
+ * return early without handling any dirty ranges.
+ */
+ ASSERT(max_bytes >= fs_info->sectorsize);
+
found = btrfs_find_delalloc_range(tree, &delalloc_start, &delalloc_end,
max_bytes, &cached_state);
if (!found || delalloc_end <= *start || delalloc_start > orig_end) {
@@ -423,13 +430,14 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
delalloc_end);
ASSERT(!ret || ret == -EAGAIN);
if (ret == -EAGAIN) {
- /* some of the folios are gone, lets avoid looping by
- * shortening the size of the delalloc range we're searching
+ /*
+ * Some of the folios are gone, lets avoid looping by
+ * shortening the size of the delalloc range we're searching.
*/
btrfs_free_extent_state(cached_state);
cached_state = NULL;
if (!loops) {
- max_bytes = PAGE_SIZE;
+ max_bytes = fs_info->sectorsize;
loops = 1;
goto again;
} else {
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 72d271a7baa7062cb27e774ac37c5459c6d20e22
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101543-quake-judicial-9e2e@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 72d271a7baa7062cb27e774ac37c5459c6d20e22 Mon Sep 17 00:00:00 2001
From: Aleksa Sarai <cyphar(a)cyphar.com>
Date: Thu, 7 Aug 2025 03:55:23 +1000
Subject: [PATCH] fscontext: do not consume log entries when returning
-EMSGSIZE
Userspace generally expects APIs that return -EMSGSIZE to allow for them
to adjust their buffer size and retry the operation. However, the
fscontext log would previously clear the message even in the -EMSGSIZE
case.
Given that it is very cheap for us to check whether the buffer is too
small before we remove the message from the ring buffer, let's just do
that instead. While we're at it, refactor some fscontext_read() into a
separate helper to make the ring buffer logic a bit easier to read.
Fixes: 007ec26cdc9f ("vfs: Implement logging through fs_context")
Cc: David Howells <dhowells(a)redhat.com>
Cc: stable(a)vger.kernel.org # v5.2+
Signed-off-by: Aleksa Sarai <cyphar(a)cyphar.com>
Link: https://lore.kernel.org/20250807-fscontext-log-cleanups-v3-1-8d91d6242dc3@c…
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/fsopen.c b/fs/fsopen.c
index 1aaf4cb2afb2..f645c99204eb 100644
--- a/fs/fsopen.c
+++ b/fs/fsopen.c
@@ -18,50 +18,56 @@
#include "internal.h"
#include "mount.h"
+static inline const char *fetch_message_locked(struct fc_log *log, size_t len,
+ bool *need_free)
+{
+ const char *p;
+ int index;
+
+ if (unlikely(log->head == log->tail))
+ return ERR_PTR(-ENODATA);
+
+ index = log->tail & (ARRAY_SIZE(log->buffer) - 1);
+ p = log->buffer[index];
+ if (unlikely(strlen(p) > len))
+ return ERR_PTR(-EMSGSIZE);
+
+ log->buffer[index] = NULL;
+ *need_free = log->need_free & (1 << index);
+ log->need_free &= ~(1 << index);
+ log->tail++;
+
+ return p;
+}
+
/*
* Allow the user to read back any error, warning or informational messages.
+ * Only one message is returned for each read(2) call.
*/
static ssize_t fscontext_read(struct file *file,
char __user *_buf, size_t len, loff_t *pos)
{
struct fs_context *fc = file->private_data;
- struct fc_log *log = fc->log.log;
- unsigned int logsize = ARRAY_SIZE(log->buffer);
- ssize_t ret;
- char *p;
+ ssize_t err;
+ const char *p __free(kfree) = NULL, *message;
bool need_free;
- int index, n;
+ int n;
- ret = mutex_lock_interruptible(&fc->uapi_mutex);
- if (ret < 0)
- return ret;
-
- if (log->head == log->tail) {
- mutex_unlock(&fc->uapi_mutex);
- return -ENODATA;
- }
-
- index = log->tail & (logsize - 1);
- p = log->buffer[index];
- need_free = log->need_free & (1 << index);
- log->buffer[index] = NULL;
- log->need_free &= ~(1 << index);
- log->tail++;
+ err = mutex_lock_interruptible(&fc->uapi_mutex);
+ if (err < 0)
+ return err;
+ message = fetch_message_locked(fc->log.log, len, &need_free);
mutex_unlock(&fc->uapi_mutex);
+ if (IS_ERR(message))
+ return PTR_ERR(message);
- ret = -EMSGSIZE;
- n = strlen(p);
- if (n > len)
- goto err_free;
- ret = -EFAULT;
- if (copy_to_user(_buf, p, n) != 0)
- goto err_free;
- ret = n;
-
-err_free:
if (need_free)
- kfree(p);
- return ret;
+ p = message;
+
+ n = strlen(message);
+ if (copy_to_user(_buf, message, n))
+ return -EFAULT;
+ return n;
}
static int fscontext_release(struct inode *inode, struct file *file)
There are no scenarios where a weak increment is invalid on binder_node.
The only possible case where it could be invalid is if the kernel
delivers BR_DECREFS to the process that owns the node, and then
increments the weak refcount again, effectively "reviving" a dead node.
However, that is not possible: when the BR_DECREFS command is delivered,
the kernel removes and frees the binder_node. The fact that you were
able to call binder_inc_node_nilocked() implies that the node is not yet
destroyed, which implies that BR_DECREFS has not been delivered to
userspace, so incrementing the weak refcount is valid.
Note that it's currently possible to trigger this condition if the owner
calls BINDER_THREAD_EXIT while node->has_weak_ref is true. This causes
BC_INCREFS on binder_ref instances to fail when they should not.
Cc: stable(a)vger.kernel.org
Fixes: 457b9a6f09f0 ("Staging: android: add binder driver")
Reported-by: Yu-Ting Tseng <yutingtseng(a)google.com>
Signed-off-by: Alice Ryhl <aliceryhl(a)google.com>
---
drivers/android/binder.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index 8c99ceaa303bad8751571a337770df857e72d981..3915d8d2d896d1e3a861c336d900d7a4657a0104 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -851,17 +851,8 @@ static int binder_inc_node_nilocked(struct binder_node *node, int strong,
} else {
if (!internal)
node->local_weak_refs++;
- if (!node->has_weak_ref && list_empty(&node->work.entry)) {
- if (target_list == NULL) {
- pr_err("invalid inc weak node for %d\n",
- node->debug_id);
- return -EINVAL;
- }
- /*
- * See comment above
- */
+ if (!node->has_weak_ref && target_list && list_empty(&node->work.entry))
binder_enqueue_work_ilocked(&node->work, target_list);
- }
}
return 0;
}
---
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
change-id: 20251015-binder-weak-inc-f294e62d2ec6
Best regards,
--
Alice Ryhl <aliceryhl(a)google.com>
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 7b26da407420e5054e3f06c5d13271697add9423
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101526-rush-bagel-d750@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b26da407420e5054e3f06c5d13271697add9423 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Fri, 19 Sep 2025 14:33:23 +0930
Subject: [PATCH] btrfs: fix the incorrect max_bytes value for
find_lock_delalloc_range()
[BUG]
With my local branch to enable bs > ps support for btrfs, sometimes I
hit the following ASSERT() inside submit_one_sector():
ASSERT(block_start != EXTENT_MAP_HOLE);
Please note that it's not yet possible to hit this ASSERT() in the wild
yet, as it requires btrfs bs > ps support, which is not even in the
development branch.
But on the other hand, there is also a very low chance to hit above
ASSERT() with bs < ps cases, so this is an existing bug affect not only
the incoming bs > ps support but also the existing bs < ps support.
[CAUSE]
Firstly that ASSERT() means we're trying to submit a dirty block but
without a real extent map nor ordered extent map backing it.
Furthermore with extra debugging, the folio triggering such ASSERT() is
always larger than the fs block size in my bs > ps case.
(8K block size, 4K page size)
After some more debugging, the ASSERT() is trigger by the following
sequence:
extent_writepage()
| We got a 32K folio (4 fs blocks) at file offset 0, and the fs block
| size is 8K, page size is 4K.
| And there is another 8K folio at file offset 32K, which is also
| dirty.
| So the filemap layout looks like the following:
|
| "||" is the filio boundary in the filemap.
| "//| is the dirty range.
|
| 0 8K 16K 24K 32K 40K
| |////////| |//////////////////////||////////|
|
|- writepage_delalloc()
| |- find_lock_delalloc_range() for [0, 8K)
| | Now range [0, 8K) is properly locked.
| |
| |- find_lock_delalloc_range() for [16K, 40K)
| | |- btrfs_find_delalloc_range() returned range [16K, 40K)
| | |- lock_delalloc_folios() locked folio 0 successfully
| | |
| | | The filemap range [32K, 40K) got dropped from filemap.
| | |
| | |- lock_delalloc_folios() failed with -EAGAIN on folio 32K
| | | As the folio at 32K is dropped.
| | |
| | |- loops = 1;
| | |- max_bytes = PAGE_SIZE;
| | |- goto again;
| | | This will re-do the lookup for dirty delalloc ranges.
| | |
| | |- btrfs_find_delalloc_range() called with @max_bytes == 4K
| | | This is smaller than block size, so
| | | btrfs_find_delalloc_range() is unable to return any range.
| | \- return false;
| |
| \- Now only range [0, 8K) has an OE for it, but for dirty range
| [16K, 32K) it's dirty without an OE.
| This breaks the assumption that writepage_delalloc() will find
| and lock all dirty ranges inside the folio.
|
|- extent_writepage_io()
|- submit_one_sector() for [0, 8K)
| Succeeded
|
|- submit_one_sector() for [16K, 24K)
Triggering the ASSERT(), as there is no OE, and the original
extent map is a hole.
Please note that, this also exposed the same problem for bs < ps
support. E.g. with 64K page size and 4K block size.
If we failed to lock a folio, and falls back into the "loops = 1;"
branch, we will re-do the search using 64K as max_bytes.
Which may fail again to lock the next folio, and exit early without
handling all dirty blocks inside the folio.
[FIX]
Instead of using the fixed size PAGE_SIZE as @max_bytes, use
@sectorsize, so that we are ensured to find and lock any remaining
blocks inside the folio.
And since we're here, add an extra ASSERT() to
before calling btrfs_find_delalloc_range() to make sure the @max_bytes is
at least no smaller than a block to avoid false negative.
Cc: stable(a)vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0782533aad51..2b6027ebf265 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -393,6 +393,13 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
/* step one, find a bunch of delalloc bytes starting at start */
delalloc_start = *start;
delalloc_end = 0;
+
+ /*
+ * If @max_bytes is smaller than a block, btrfs_find_delalloc_range() can
+ * return early without handling any dirty ranges.
+ */
+ ASSERT(max_bytes >= fs_info->sectorsize);
+
found = btrfs_find_delalloc_range(tree, &delalloc_start, &delalloc_end,
max_bytes, &cached_state);
if (!found || delalloc_end <= *start || delalloc_start > orig_end) {
@@ -423,13 +430,14 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
delalloc_end);
ASSERT(!ret || ret == -EAGAIN);
if (ret == -EAGAIN) {
- /* some of the folios are gone, lets avoid looping by
- * shortening the size of the delalloc range we're searching
+ /*
+ * Some of the folios are gone, lets avoid looping by
+ * shortening the size of the delalloc range we're searching.
*/
btrfs_free_extent_state(cached_state);
cached_state = NULL;
if (!loops) {
- max_bytes = PAGE_SIZE;
+ max_bytes = fs_info->sectorsize;
loops = 1;
goto again;
} else {
Do not stop a q6asm stream if its not started, this can result in
unnecessary dsp command which will timeout anyway something like below:
q6asm-dai ab00000.remoteproc:glink-edge:apr:service@7:dais: CMD 10bcd timeout
Fix this by correctly checking the state.
Fixes: 2a9e92d371db ("ASoC: qdsp6: q6asm: Add q6asm dai driver")
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)oss.qualcomm.com>
---
sound/soc/qcom/qdsp6/q6asm-dai.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/sound/soc/qcom/qdsp6/q6asm-dai.c b/sound/soc/qcom/qdsp6/q6asm-dai.c
index e8129510a734..0eae8c6e42b8 100644
--- a/sound/soc/qcom/qdsp6/q6asm-dai.c
+++ b/sound/soc/qcom/qdsp6/q6asm-dai.c
@@ -233,13 +233,14 @@ static int q6asm_dai_prepare(struct snd_soc_component *component,
prtd->pcm_count = snd_pcm_lib_period_bytes(substream);
prtd->pcm_irq_pos = 0;
/* rate and channels are sent to audio driver */
- if (prtd->state) {
+ if (prtd->state == Q6ASM_STREAM_RUNNING) {
/* clear the previous setup if any */
q6asm_cmd(prtd->audio_client, prtd->stream_id, CMD_CLOSE);
q6asm_unmap_memory_regions(substream->stream,
prtd->audio_client);
q6routing_stream_close(soc_prtd->dai_link->id,
substream->stream);
+ prtd->state = Q6ASM_STREAM_STOPPED;
}
ret = q6asm_map_memory_regions(substream->stream, prtd->audio_client,
--
2.51.0
DSP expects the periods to be aligned to fragment sizes, currently
setting up to hw constriants on periods bytes is not going to work
correctly as we can endup with periods sizes aligned to 32 bytes however
not aligned to fragment size.
Update the constriants to use fragment size, and also set at step of
10ms for period size to accommodate DSP requirements of 10ms latency.
Fixes: 2a9e92d371db ("ASoC: qdsp6: q6asm: Add q6asm dai driver")
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)oss.qualcomm.com>
---
sound/soc/qcom/qdsp6/q6asm-dai.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/qcom/qdsp6/q6asm-dai.c b/sound/soc/qcom/qdsp6/q6asm-dai.c
index b616ce316d2f..e8129510a734 100644
--- a/sound/soc/qcom/qdsp6/q6asm-dai.c
+++ b/sound/soc/qcom/qdsp6/q6asm-dai.c
@@ -403,13 +403,13 @@ static int q6asm_dai_open(struct snd_soc_component *component,
}
ret = snd_pcm_hw_constraint_step(runtime, 0,
- SNDRV_PCM_HW_PARAM_PERIOD_BYTES, 32);
+ SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 480);
if (ret < 0) {
dev_err(dev, "constraint for period bytes step ret = %d\n",
ret);
}
ret = snd_pcm_hw_constraint_step(runtime, 0,
- SNDRV_PCM_HW_PARAM_BUFFER_BYTES, 32);
+ SNDRV_PCM_HW_PARAM_BUFFER_SIZE, 480);
if (ret < 0) {
dev_err(dev, "constraint for buffer bytes step ret = %d\n",
ret);
--
2.51.0
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 7b26da407420e5054e3f06c5d13271697add9423
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101522-repugnant-demystify-deee@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b26da407420e5054e3f06c5d13271697add9423 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Fri, 19 Sep 2025 14:33:23 +0930
Subject: [PATCH] btrfs: fix the incorrect max_bytes value for
find_lock_delalloc_range()
[BUG]
With my local branch to enable bs > ps support for btrfs, sometimes I
hit the following ASSERT() inside submit_one_sector():
ASSERT(block_start != EXTENT_MAP_HOLE);
Please note that it's not yet possible to hit this ASSERT() in the wild
yet, as it requires btrfs bs > ps support, which is not even in the
development branch.
But on the other hand, there is also a very low chance to hit above
ASSERT() with bs < ps cases, so this is an existing bug affect not only
the incoming bs > ps support but also the existing bs < ps support.
[CAUSE]
Firstly that ASSERT() means we're trying to submit a dirty block but
without a real extent map nor ordered extent map backing it.
Furthermore with extra debugging, the folio triggering such ASSERT() is
always larger than the fs block size in my bs > ps case.
(8K block size, 4K page size)
After some more debugging, the ASSERT() is trigger by the following
sequence:
extent_writepage()
| We got a 32K folio (4 fs blocks) at file offset 0, and the fs block
| size is 8K, page size is 4K.
| And there is another 8K folio at file offset 32K, which is also
| dirty.
| So the filemap layout looks like the following:
|
| "||" is the filio boundary in the filemap.
| "//| is the dirty range.
|
| 0 8K 16K 24K 32K 40K
| |////////| |//////////////////////||////////|
|
|- writepage_delalloc()
| |- find_lock_delalloc_range() for [0, 8K)
| | Now range [0, 8K) is properly locked.
| |
| |- find_lock_delalloc_range() for [16K, 40K)
| | |- btrfs_find_delalloc_range() returned range [16K, 40K)
| | |- lock_delalloc_folios() locked folio 0 successfully
| | |
| | | The filemap range [32K, 40K) got dropped from filemap.
| | |
| | |- lock_delalloc_folios() failed with -EAGAIN on folio 32K
| | | As the folio at 32K is dropped.
| | |
| | |- loops = 1;
| | |- max_bytes = PAGE_SIZE;
| | |- goto again;
| | | This will re-do the lookup for dirty delalloc ranges.
| | |
| | |- btrfs_find_delalloc_range() called with @max_bytes == 4K
| | | This is smaller than block size, so
| | | btrfs_find_delalloc_range() is unable to return any range.
| | \- return false;
| |
| \- Now only range [0, 8K) has an OE for it, but for dirty range
| [16K, 32K) it's dirty without an OE.
| This breaks the assumption that writepage_delalloc() will find
| and lock all dirty ranges inside the folio.
|
|- extent_writepage_io()
|- submit_one_sector() for [0, 8K)
| Succeeded
|
|- submit_one_sector() for [16K, 24K)
Triggering the ASSERT(), as there is no OE, and the original
extent map is a hole.
Please note that, this also exposed the same problem for bs < ps
support. E.g. with 64K page size and 4K block size.
If we failed to lock a folio, and falls back into the "loops = 1;"
branch, we will re-do the search using 64K as max_bytes.
Which may fail again to lock the next folio, and exit early without
handling all dirty blocks inside the folio.
[FIX]
Instead of using the fixed size PAGE_SIZE as @max_bytes, use
@sectorsize, so that we are ensured to find and lock any remaining
blocks inside the folio.
And since we're here, add an extra ASSERT() to
before calling btrfs_find_delalloc_range() to make sure the @max_bytes is
at least no smaller than a block to avoid false negative.
Cc: stable(a)vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0782533aad51..2b6027ebf265 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -393,6 +393,13 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
/* step one, find a bunch of delalloc bytes starting at start */
delalloc_start = *start;
delalloc_end = 0;
+
+ /*
+ * If @max_bytes is smaller than a block, btrfs_find_delalloc_range() can
+ * return early without handling any dirty ranges.
+ */
+ ASSERT(max_bytes >= fs_info->sectorsize);
+
found = btrfs_find_delalloc_range(tree, &delalloc_start, &delalloc_end,
max_bytes, &cached_state);
if (!found || delalloc_end <= *start || delalloc_start > orig_end) {
@@ -423,13 +430,14 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
delalloc_end);
ASSERT(!ret || ret == -EAGAIN);
if (ret == -EAGAIN) {
- /* some of the folios are gone, lets avoid looping by
- * shortening the size of the delalloc range we're searching
+ /*
+ * Some of the folios are gone, lets avoid looping by
+ * shortening the size of the delalloc range we're searching.
*/
btrfs_free_extent_state(cached_state);
cached_state = NULL;
if (!loops) {
- max_bytes = PAGE_SIZE;
+ max_bytes = fs_info->sectorsize;
loops = 1;
goto again;
} else {
Hi Sasha,
On Wed, Oct 15, 2025 at 07:36:02AM -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> clk: at91: peripheral: fix return value
>
> to the 6.17-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> clk-at91-peripheral-fix-return-value.patch
> and it can be found in the queue-6.17 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit cbddc84a11e6112835fced8f7266d9810efc52f3
> Author: Brian Masney <bmasney(a)redhat.com>
> Date: Mon Aug 11 11:17:53 2025 -0400
>
> clk: at91: peripheral: fix return value
>
> [ Upstream commit 47b13635dabc14f1c2fdcaa5468b47ddadbdd1b5 ]
>
> determine_rate() is expected to return an error code, or 0 on success.
> clk_sam9x5_peripheral_determine_rate() has a branch that returns the
> parent rate on a certain case. This is the behavior of round_rate(),
> so let's go ahead and fix this by setting req->rate.
>
> Fixes: b4c115c76184f ("clk: at91: clk-peripheral: add support for changeable parent rate")
> Reviewed-by: Alexander Sverdlin <alexander.sverdlin(a)gmail.com>
> Acked-by: Nicolas Ferre <nicolas.ferre(a)microchip.com>
> Signed-off-by: Brian Masney <bmasney(a)redhat.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Please don't backport any of my round_rate() to determine_rate()
migrations to the stable kernels. I have maybe close to 200 of these
patches for various clk drivers, and stable can stay on round_rate().
There's no functional change.
Stephen mentioned this work on his pull to Linus about how this is all
prerequisite work to get to the real task of improving the clk rate
setting process.
https://lore.kernel.org/linux-clk/20251007051720.11386-1-sboyd@kernel.org/
Thanks,
Brian
Fix the incorrect usage of platform_set_drvdata and dev_set_drvdata. They
both are of the same data and overrides each other. This resulted in the
rmmod of the svc driver to fail and throw a kernel panic for kthread_stop
and fifo free.
Fixes: bf0e5bf68a20 ("firmware: stratix10-svc: extend svc to support new RSU features")
Signed-off-by: Ang Tien Sung <tiensung.ang(a)altera.com>
Signed-off-by: Khairul Anuar Romli <khairul.anuar.romli(a)altera.com>
---
drivers/firmware/stratix10-svc.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/firmware/stratix10-svc.c b/drivers/firmware/stratix10-svc.c
index e3f990d888d7..00f58e27f6de 100644
--- a/drivers/firmware/stratix10-svc.c
+++ b/drivers/firmware/stratix10-svc.c
@@ -134,6 +134,7 @@ struct stratix10_svc_data {
* @complete_status: state for completion
* @svc_fifo_lock: protect access to service message data queue
* @invoke_fn: function to issue secure monitor call or hypervisor call
+ * @svc: manages the list of client svc drivers
*
* This struct is used to create communication channels for service clients, to
* handle secure monitor or hypervisor call.
@@ -150,6 +151,7 @@ struct stratix10_svc_controller {
struct completion complete_status;
spinlock_t svc_fifo_lock;
svc_invoke_fn *invoke_fn;
+ struct stratix10_svc *svc;
};
/**
@@ -1206,6 +1208,7 @@ static int stratix10_svc_drv_probe(struct platform_device *pdev)
ret = -ENOMEM;
goto err_free_kfifo;
}
+ controller->svc = svc;
svc->stratix10_svc_rsu = platform_device_alloc(STRATIX10_RSU, 0);
if (!svc->stratix10_svc_rsu) {
@@ -1237,8 +1240,6 @@ static int stratix10_svc_drv_probe(struct platform_device *pdev)
if (ret)
goto err_unregister_fcs_dev;
- dev_set_drvdata(dev, svc);
-
pr_info("Intel Service Layer Driver Initialized\n");
return 0;
@@ -1256,8 +1257,8 @@ static int stratix10_svc_drv_probe(struct platform_device *pdev)
static void stratix10_svc_drv_remove(struct platform_device *pdev)
{
- struct stratix10_svc *svc = dev_get_drvdata(&pdev->dev);
struct stratix10_svc_controller *ctrl = platform_get_drvdata(pdev);
+ struct stratix10_svc *svc = ctrl->svc;
of_platform_depopulate(ctrl->dev);
--
2.35.3
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 2e454fb8056df6da4bba7d89a57bf60e217463c0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101518-thriving-lend-6587@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e454fb8056df6da4bba7d89a57bf60e217463c0 Mon Sep 17 00:00:00 2001
From: Dave Jiang <dave.jiang(a)intel.com>
Date: Fri, 29 Aug 2025 15:29:06 -0700
Subject: [PATCH] cxl, acpi/hmat: Update CXL access coordinates directly
instead of through HMAT
The current implementation of CXL memory hotplug notifier gets called
before the HMAT memory hotplug notifier. The CXL driver calculates the
access coordinates (bandwidth and latency values) for the CXL end to
end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL
memory hotplug notifier writes the access coordinates to the HMAT target
structs. Then the HMAT memory hotplug notifier is called and it creates
the access coordinates for the node sysfs attributes.
During testing on an Intel platform, it was found that although the
newly calculated coordinates were pushed to sysfs, the sysfs attributes for
the access coordinates showed up with the wrong initiator. The system has
4 nodes (0, 1, 2, 3) where node 0 and 1 are CPU nodes and node 2 and 3 are
CXL nodes. The expectation is that node 2 would show up as a target to node
0:
/sys/devices/system/node/node2/access0/initiators/node0
However it was observed that node 2 showed up as a target under node 1:
/sys/devices/system/node/node2/access0/initiators/node1
The original intent of the 'ext_updated' flag in HMAT handling code was to
stop HMAT memory hotplug callback from clobbering the access coordinates
after CXL has injected its calculated coordinates and replaced the generic
target access coordinates provided by the HMAT table in the HMAT target
structs. However the flag is hacky at best and blocks the updates from
other CXL regions that are onlined in the same node later on. Remove the
'ext_updated' flag usage and just update the access coordinates for the
nodes directly without touching HMAT target data.
The hotplug memory callback ordering is changed. Instead of changing CXL,
move HMAT back so there's room for the levels rather than have CXL share
the same level as SLAB_CALLBACK_PRI. The change will resulting in the CXL
callback to be executed after the HMAT callback.
With the change, the CXL hotplug memory notifier runs after the HMAT
callback. The HMAT callback will create the node sysfs attributes for
access coordinates. The CXL callback will write the access coordinates to
the now created node sysfs attributes directly and will not pollute the
HMAT target values.
A nodemask is introduced to keep track if a node has been updated and
prevents further updates.
Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region")
Cc: stable(a)vger.kernel.org
Tested-by: Marc Herbert <marc.herbert(a)linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams(a)intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron(a)huawei.com>
Link: https://patch.msgid.link/20250829222907.1290912-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang(a)intel.com>
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 4958301f5417..5d32490dc4ab 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -74,7 +74,6 @@ struct memory_target {
struct node_cache_attrs cache_attrs;
u8 gen_port_device_handle[ACPI_SRAT_DEVICE_HANDLE_SIZE];
bool registered;
- bool ext_updated; /* externally updated */
};
struct memory_initiator {
@@ -391,7 +390,6 @@ int hmat_update_target_coordinates(int nid, struct access_coordinate *coord,
coord->read_bandwidth, access);
hmat_update_target_access(target, ACPI_HMAT_WRITE_BANDWIDTH,
coord->write_bandwidth, access);
- target->ext_updated = true;
return 0;
}
@@ -773,10 +771,6 @@ static void hmat_update_target_attrs(struct memory_target *target,
u32 best = 0;
int i;
- /* Don't update if an external agent has changed the data. */
- if (target->ext_updated)
- return;
-
/* Don't update for generic port if there's no device handle */
if ((access == NODE_ACCESS_CLASS_GENPORT_SINK_LOCAL ||
access == NODE_ACCESS_CLASS_GENPORT_SINK_CPU) &&
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index c0af645425f4..c891fd618cfd 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -1081,8 +1081,3 @@ int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
{
return hmat_update_target_coordinates(nid, &cxlr->coord[access], access);
}
-
-bool cxl_need_node_perf_attrs_update(int nid)
-{
- return !acpi_node_backed_by_real_pxm(nid);
-}
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 2669f251d677..a253d308f3c9 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -139,7 +139,6 @@ long cxl_pci_get_latency(struct pci_dev *pdev);
int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c);
int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
enum access_coordinate_class access);
-bool cxl_need_node_perf_attrs_update(int nid);
int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
struct access_coordinate *c);
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 71cc42d05248..0ed95cbc5d5b 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -30,6 +30,12 @@
* 3. Decoder targets
*/
+/*
+ * nodemask that sets per node when the access_coordinates for the node has
+ * been updated by the CXL memory hotplug notifier.
+ */
+static nodemask_t nodemask_region_seen = NODE_MASK_NONE;
+
static struct cxl_region *to_cxl_region(struct device *dev);
#define __ACCESS_ATTR_RO(_level, _name) { \
@@ -2442,14 +2448,8 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
for (int i = 0; i < ACCESS_COORDINATE_MAX; i++) {
if (cxlr->coord[i].read_bandwidth) {
- rc = 0;
- if (cxl_need_node_perf_attrs_update(nid))
- node_set_perf_attrs(nid, &cxlr->coord[i], i);
- else
- rc = cxl_update_hmat_access_coordinates(nid, cxlr, i);
-
- if (rc == 0)
- cset++;
+ node_update_perf_attrs(nid, &cxlr->coord[i], i);
+ cset++;
}
}
@@ -2487,6 +2487,10 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
if (nid != region_nid)
return NOTIFY_DONE;
+ /* No action needed if node bit already set */
+ if (node_test_and_set(nid, nodemask_region_seen))
+ return NOTIFY_DONE;
+
if (!cxl_region_update_coordinates(cxlr, nid))
return NOTIFY_DONE;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 1305102688d0..0b755d1ef1ec 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -120,8 +120,8 @@ struct mem_section;
*/
#define DEFAULT_CALLBACK_PRI 0
#define SLAB_CALLBACK_PRI 1
-#define HMAT_CALLBACK_PRI 2
#define CXL_CALLBACK_PRI 5
+#define HMAT_CALLBACK_PRI 6
#define MM_COMPUTE_BATCH_PRI 10
#define CPUSET_CALLBACK_PRI 10
#define MEMTIER_HOTPLUG_PRI 100
The patch below does not apply to the 6.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.17.y
git checkout FETCH_HEAD
git cherry-pick -x 2e454fb8056df6da4bba7d89a57bf60e217463c0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101516-stowaway-bogged-2173@gregkh' --subject-prefix 'PATCH 6.17.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e454fb8056df6da4bba7d89a57bf60e217463c0 Mon Sep 17 00:00:00 2001
From: Dave Jiang <dave.jiang(a)intel.com>
Date: Fri, 29 Aug 2025 15:29:06 -0700
Subject: [PATCH] cxl, acpi/hmat: Update CXL access coordinates directly
instead of through HMAT
The current implementation of CXL memory hotplug notifier gets called
before the HMAT memory hotplug notifier. The CXL driver calculates the
access coordinates (bandwidth and latency values) for the CXL end to
end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL
memory hotplug notifier writes the access coordinates to the HMAT target
structs. Then the HMAT memory hotplug notifier is called and it creates
the access coordinates for the node sysfs attributes.
During testing on an Intel platform, it was found that although the
newly calculated coordinates were pushed to sysfs, the sysfs attributes for
the access coordinates showed up with the wrong initiator. The system has
4 nodes (0, 1, 2, 3) where node 0 and 1 are CPU nodes and node 2 and 3 are
CXL nodes. The expectation is that node 2 would show up as a target to node
0:
/sys/devices/system/node/node2/access0/initiators/node0
However it was observed that node 2 showed up as a target under node 1:
/sys/devices/system/node/node2/access0/initiators/node1
The original intent of the 'ext_updated' flag in HMAT handling code was to
stop HMAT memory hotplug callback from clobbering the access coordinates
after CXL has injected its calculated coordinates and replaced the generic
target access coordinates provided by the HMAT table in the HMAT target
structs. However the flag is hacky at best and blocks the updates from
other CXL regions that are onlined in the same node later on. Remove the
'ext_updated' flag usage and just update the access coordinates for the
nodes directly without touching HMAT target data.
The hotplug memory callback ordering is changed. Instead of changing CXL,
move HMAT back so there's room for the levels rather than have CXL share
the same level as SLAB_CALLBACK_PRI. The change will resulting in the CXL
callback to be executed after the HMAT callback.
With the change, the CXL hotplug memory notifier runs after the HMAT
callback. The HMAT callback will create the node sysfs attributes for
access coordinates. The CXL callback will write the access coordinates to
the now created node sysfs attributes directly and will not pollute the
HMAT target values.
A nodemask is introduced to keep track if a node has been updated and
prevents further updates.
Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region")
Cc: stable(a)vger.kernel.org
Tested-by: Marc Herbert <marc.herbert(a)linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams(a)intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron(a)huawei.com>
Link: https://patch.msgid.link/20250829222907.1290912-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang(a)intel.com>
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 4958301f5417..5d32490dc4ab 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -74,7 +74,6 @@ struct memory_target {
struct node_cache_attrs cache_attrs;
u8 gen_port_device_handle[ACPI_SRAT_DEVICE_HANDLE_SIZE];
bool registered;
- bool ext_updated; /* externally updated */
};
struct memory_initiator {
@@ -391,7 +390,6 @@ int hmat_update_target_coordinates(int nid, struct access_coordinate *coord,
coord->read_bandwidth, access);
hmat_update_target_access(target, ACPI_HMAT_WRITE_BANDWIDTH,
coord->write_bandwidth, access);
- target->ext_updated = true;
return 0;
}
@@ -773,10 +771,6 @@ static void hmat_update_target_attrs(struct memory_target *target,
u32 best = 0;
int i;
- /* Don't update if an external agent has changed the data. */
- if (target->ext_updated)
- return;
-
/* Don't update for generic port if there's no device handle */
if ((access == NODE_ACCESS_CLASS_GENPORT_SINK_LOCAL ||
access == NODE_ACCESS_CLASS_GENPORT_SINK_CPU) &&
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index c0af645425f4..c891fd618cfd 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -1081,8 +1081,3 @@ int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
{
return hmat_update_target_coordinates(nid, &cxlr->coord[access], access);
}
-
-bool cxl_need_node_perf_attrs_update(int nid)
-{
- return !acpi_node_backed_by_real_pxm(nid);
-}
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 2669f251d677..a253d308f3c9 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -139,7 +139,6 @@ long cxl_pci_get_latency(struct pci_dev *pdev);
int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c);
int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
enum access_coordinate_class access);
-bool cxl_need_node_perf_attrs_update(int nid);
int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
struct access_coordinate *c);
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 71cc42d05248..0ed95cbc5d5b 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -30,6 +30,12 @@
* 3. Decoder targets
*/
+/*
+ * nodemask that sets per node when the access_coordinates for the node has
+ * been updated by the CXL memory hotplug notifier.
+ */
+static nodemask_t nodemask_region_seen = NODE_MASK_NONE;
+
static struct cxl_region *to_cxl_region(struct device *dev);
#define __ACCESS_ATTR_RO(_level, _name) { \
@@ -2442,14 +2448,8 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
for (int i = 0; i < ACCESS_COORDINATE_MAX; i++) {
if (cxlr->coord[i].read_bandwidth) {
- rc = 0;
- if (cxl_need_node_perf_attrs_update(nid))
- node_set_perf_attrs(nid, &cxlr->coord[i], i);
- else
- rc = cxl_update_hmat_access_coordinates(nid, cxlr, i);
-
- if (rc == 0)
- cset++;
+ node_update_perf_attrs(nid, &cxlr->coord[i], i);
+ cset++;
}
}
@@ -2487,6 +2487,10 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
if (nid != region_nid)
return NOTIFY_DONE;
+ /* No action needed if node bit already set */
+ if (node_test_and_set(nid, nodemask_region_seen))
+ return NOTIFY_DONE;
+
if (!cxl_region_update_coordinates(cxlr, nid))
return NOTIFY_DONE;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 1305102688d0..0b755d1ef1ec 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -120,8 +120,8 @@ struct mem_section;
*/
#define DEFAULT_CALLBACK_PRI 0
#define SLAB_CALLBACK_PRI 1
-#define HMAT_CALLBACK_PRI 2
#define CXL_CALLBACK_PRI 5
+#define HMAT_CALLBACK_PRI 6
#define MM_COMPUTE_BATCH_PRI 10
#define CPUSET_CALLBACK_PRI 10
#define MEMTIER_HOTPLUG_PRI 100
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 6eb350a2233100a283f882c023e5ad426d0ed63b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101556-riches-vivacious-ee8f@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6eb350a2233100a283f882c023e5ad426d0ed63b Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Wed, 13 Aug 2025 17:02:30 +0200
Subject: [PATCH] rseq: Protect event mask against membarrier IPI
rseq_need_restart() reads and clears task::rseq_event_mask with preemption
disabled to guard against the scheduler.
But membarrier() uses an IPI and sets the PREEMPT bit in the event mask
from the IPI, which leaves that RMW operation unprotected.
Use guard(irq) if CONFIG_MEMBARRIER is enabled to fix that.
Fixes: 2a36ab717e8f ("rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ")
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Boqun Feng <boqun.feng(a)gmail.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: stable(a)vger.kernel.org
diff --git a/include/linux/rseq.h b/include/linux/rseq.h
index bc8af3eb5598..1fbeb61babeb 100644
--- a/include/linux/rseq.h
+++ b/include/linux/rseq.h
@@ -7,6 +7,12 @@
#include <linux/preempt.h>
#include <linux/sched.h>
+#ifdef CONFIG_MEMBARRIER
+# define RSEQ_EVENT_GUARD irq
+#else
+# define RSEQ_EVENT_GUARD preempt
+#endif
+
/*
* Map the event mask on the user-space ABI enum rseq_cs_flags
* for direct mask checks.
@@ -41,9 +47,8 @@ static inline void rseq_handle_notify_resume(struct ksignal *ksig,
static inline void rseq_signal_deliver(struct ksignal *ksig,
struct pt_regs *regs)
{
- preempt_disable();
- __set_bit(RSEQ_EVENT_SIGNAL_BIT, ¤t->rseq_event_mask);
- preempt_enable();
+ scoped_guard(RSEQ_EVENT_GUARD)
+ __set_bit(RSEQ_EVENT_SIGNAL_BIT, ¤t->rseq_event_mask);
rseq_handle_notify_resume(ksig, regs);
}
diff --git a/kernel/rseq.c b/kernel/rseq.c
index b7a1ec327e81..2452b7366b00 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -342,12 +342,12 @@ static int rseq_need_restart(struct task_struct *t, u32 cs_flags)
/*
* Load and clear event mask atomically with respect to
- * scheduler preemption.
+ * scheduler preemption and membarrier IPIs.
*/
- preempt_disable();
- event_mask = t->rseq_event_mask;
- t->rseq_event_mask = 0;
- preempt_enable();
+ scoped_guard(RSEQ_EVENT_GUARD) {
+ event_mask = t->rseq_event_mask;
+ t->rseq_event_mask = 0;
+ }
return !!event_mask;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 6eb350a2233100a283f882c023e5ad426d0ed63b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101556-district-timid-1eda@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6eb350a2233100a283f882c023e5ad426d0ed63b Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Wed, 13 Aug 2025 17:02:30 +0200
Subject: [PATCH] rseq: Protect event mask against membarrier IPI
rseq_need_restart() reads and clears task::rseq_event_mask with preemption
disabled to guard against the scheduler.
But membarrier() uses an IPI and sets the PREEMPT bit in the event mask
from the IPI, which leaves that RMW operation unprotected.
Use guard(irq) if CONFIG_MEMBARRIER is enabled to fix that.
Fixes: 2a36ab717e8f ("rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ")
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Boqun Feng <boqun.feng(a)gmail.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: stable(a)vger.kernel.org
diff --git a/include/linux/rseq.h b/include/linux/rseq.h
index bc8af3eb5598..1fbeb61babeb 100644
--- a/include/linux/rseq.h
+++ b/include/linux/rseq.h
@@ -7,6 +7,12 @@
#include <linux/preempt.h>
#include <linux/sched.h>
+#ifdef CONFIG_MEMBARRIER
+# define RSEQ_EVENT_GUARD irq
+#else
+# define RSEQ_EVENT_GUARD preempt
+#endif
+
/*
* Map the event mask on the user-space ABI enum rseq_cs_flags
* for direct mask checks.
@@ -41,9 +47,8 @@ static inline void rseq_handle_notify_resume(struct ksignal *ksig,
static inline void rseq_signal_deliver(struct ksignal *ksig,
struct pt_regs *regs)
{
- preempt_disable();
- __set_bit(RSEQ_EVENT_SIGNAL_BIT, ¤t->rseq_event_mask);
- preempt_enable();
+ scoped_guard(RSEQ_EVENT_GUARD)
+ __set_bit(RSEQ_EVENT_SIGNAL_BIT, ¤t->rseq_event_mask);
rseq_handle_notify_resume(ksig, regs);
}
diff --git a/kernel/rseq.c b/kernel/rseq.c
index b7a1ec327e81..2452b7366b00 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -342,12 +342,12 @@ static int rseq_need_restart(struct task_struct *t, u32 cs_flags)
/*
* Load and clear event mask atomically with respect to
- * scheduler preemption.
+ * scheduler preemption and membarrier IPIs.
*/
- preempt_disable();
- event_mask = t->rseq_event_mask;
- t->rseq_event_mask = 0;
- preempt_enable();
+ scoped_guard(RSEQ_EVENT_GUARD) {
+ event_mask = t->rseq_event_mask;
+ t->rseq_event_mask = 0;
+ }
return !!event_mask;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 5973a62efa34c80c9a4e5eac1fca6f6209b902af
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101551-unsealed-whisking-8514@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5973a62efa34c80c9a4e5eac1fca6f6209b902af Mon Sep 17 00:00:00 2001
From: Omar Sandoval <osandov(a)fb.com>
Date: Fri, 19 Sep 2025 14:27:51 -0700
Subject: [PATCH] arm64: map [_text, _stext) virtual address range
non-executable+read-only
Since the referenced fixes commit, the kernel's .text section is only
mapped starting from _stext; the region [_text, _stext) is omitted. As a
result, other vmalloc/vmap allocations may use the virtual addresses
nominally in the range [_text, _stext). This address reuse confuses
multiple things:
1. crash_prepare_elf64_headers() sets up a segment in /proc/vmcore
mapping the entire range [_text, _end) to
[__pa_symbol(_text), __pa_symbol(_end)). Reading an address in
[_text, _stext) from /proc/vmcore therefore gives the incorrect
result.
2. Tools doing symbolization (either by reading /proc/kallsyms or based
on the vmlinux ELF file) will incorrectly identify vmalloc/vmap
allocations in [_text, _stext) as kernel symbols.
In practice, both of these issues affect the drgn debugger.
Specifically, there were cases where the vmap IRQ stacks for some CPUs
were allocated in [_text, _stext). As a result, drgn could not get the
stack trace for a crash in an IRQ handler because the core dump
contained invalid data for the IRQ stack address. The stack addresses
were also symbolized as being in the _text symbol.
Fix this by bringing back the mapping of [_text, _stext), but now make
it non-executable and read-only. This prevents other allocations from
using it while still achieving the original goal of not mapping
unpredictable data as executable. Other than the changed protection,
this is effectively a revert of the fixes commit.
Fixes: e2a073dde921 ("arm64: omit [_text, _stext) from permanent kernel mapping")
Cc: stable(a)vger.kernel.org
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/kernel/pi/map_kernel.c b/arch/arm64/kernel/pi/map_kernel.c
index e6d35eff1486..e8ddbde31a83 100644
--- a/arch/arm64/kernel/pi/map_kernel.c
+++ b/arch/arm64/kernel/pi/map_kernel.c
@@ -78,6 +78,12 @@ static void __init map_kernel(u64 kaslr_offset, u64 va_offset, int root_level)
twopass |= enable_scs;
prot = twopass ? data_prot : text_prot;
+ /*
+ * [_stext, _text) isn't executed after boot and contains some
+ * non-executable, unpredictable data, so map it non-executable.
+ */
+ map_segment(init_pg_dir, &pgdp, va_offset, _text, _stext, data_prot,
+ false, root_level);
map_segment(init_pg_dir, &pgdp, va_offset, _stext, _etext, prot,
!twopass, root_level);
map_segment(init_pg_dir, &pgdp, va_offset, __start_rodata,
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 77c7926a4df6..23c05dc7a8f2 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -214,7 +214,7 @@ static void __init request_standard_resources(void)
unsigned long i = 0;
size_t res_size;
- kernel_code.start = __pa_symbol(_stext);
+ kernel_code.start = __pa_symbol(_text);
kernel_code.end = __pa_symbol(__init_begin - 1);
kernel_data.start = __pa_symbol(_sdata);
kernel_data.end = __pa_symbol(_end - 1);
@@ -280,7 +280,7 @@ u64 cpu_logical_map(unsigned int cpu)
void __init __no_sanitize_address setup_arch(char **cmdline_p)
{
- setup_initial_init_mm(_stext, _etext, _edata, _end);
+ setup_initial_init_mm(_text, _etext, _edata, _end);
*cmdline_p = boot_command_line;
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 70c2ca813c18..524d34a0e921 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -279,7 +279,7 @@ void __init arm64_memblock_init(void)
* Register the kernel text, kernel data, initrd, and initial
* pagetables with memblock.
*/
- memblock_reserve(__pa_symbol(_stext), _end - _stext);
+ memblock_reserve(__pa_symbol(_text), _end - _text);
if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_initrd_size) {
/* the generic initrd code expects virtual addresses */
initrd_start = __phys_to_virt(phys_initrd_start);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 0ba1a15e7e74..10c258099581 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -965,8 +965,8 @@ void __init mark_linear_text_alias_ro(void)
/*
* Remove the write permissions from the linear alias of .text/.rodata
*/
- update_mapping_prot(__pa_symbol(_stext), (unsigned long)lm_alias(_stext),
- (unsigned long)__init_begin - (unsigned long)_stext,
+ update_mapping_prot(__pa_symbol(_text), (unsigned long)lm_alias(_text),
+ (unsigned long)__init_begin - (unsigned long)_text,
PAGE_KERNEL_RO);
}
@@ -1037,7 +1037,7 @@ static inline bool force_pte_mapping(void)
static void __init map_mem(pgd_t *pgdp)
{
static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN);
- phys_addr_t kernel_start = __pa_symbol(_stext);
+ phys_addr_t kernel_start = __pa_symbol(_text);
phys_addr_t kernel_end = __pa_symbol(__init_begin);
phys_addr_t start, end;
phys_addr_t early_kfence_pool;
@@ -1086,7 +1086,7 @@ static void __init map_mem(pgd_t *pgdp)
}
/*
- * Map the linear alias of the [_stext, __init_begin) interval
+ * Map the linear alias of the [_text, __init_begin) interval
* as non-executable now, and remove the write permission in
* mark_linear_text_alias_ro() below (which will be called after
* alternative patching has completed). This makes the contents
@@ -1113,6 +1113,10 @@ void mark_rodata_ro(void)
WRITE_ONCE(rodata_is_rw, false);
update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
section_size, PAGE_KERNEL_RO);
+ /* mark the range between _text and _stext as read only. */
+ update_mapping_prot(__pa_symbol(_text), (unsigned long)_text,
+ (unsigned long)_stext - (unsigned long)_text,
+ PAGE_KERNEL_RO);
}
static void __init declare_vma(struct vm_struct *vma,
@@ -1183,7 +1187,7 @@ static void __init declare_kernel_vmas(void)
{
static struct vm_struct vmlinux_seg[KERNEL_SEGMENT_COUNT];
- declare_vma(&vmlinux_seg[0], _stext, _etext, VM_NO_GUARD);
+ declare_vma(&vmlinux_seg[0], _text, _etext, VM_NO_GUARD);
declare_vma(&vmlinux_seg[1], __start_rodata, __inittext_begin, VM_NO_GUARD);
declare_vma(&vmlinux_seg[2], __inittext_begin, __inittext_end, VM_NO_GUARD);
declare_vma(&vmlinux_seg[3], __initdata_begin, __initdata_end, VM_NO_GUARD);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 5973a62efa34c80c9a4e5eac1fca6f6209b902af
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101549-slurp-darkened-955a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5973a62efa34c80c9a4e5eac1fca6f6209b902af Mon Sep 17 00:00:00 2001
From: Omar Sandoval <osandov(a)fb.com>
Date: Fri, 19 Sep 2025 14:27:51 -0700
Subject: [PATCH] arm64: map [_text, _stext) virtual address range
non-executable+read-only
Since the referenced fixes commit, the kernel's .text section is only
mapped starting from _stext; the region [_text, _stext) is omitted. As a
result, other vmalloc/vmap allocations may use the virtual addresses
nominally in the range [_text, _stext). This address reuse confuses
multiple things:
1. crash_prepare_elf64_headers() sets up a segment in /proc/vmcore
mapping the entire range [_text, _end) to
[__pa_symbol(_text), __pa_symbol(_end)). Reading an address in
[_text, _stext) from /proc/vmcore therefore gives the incorrect
result.
2. Tools doing symbolization (either by reading /proc/kallsyms or based
on the vmlinux ELF file) will incorrectly identify vmalloc/vmap
allocations in [_text, _stext) as kernel symbols.
In practice, both of these issues affect the drgn debugger.
Specifically, there were cases where the vmap IRQ stacks for some CPUs
were allocated in [_text, _stext). As a result, drgn could not get the
stack trace for a crash in an IRQ handler because the core dump
contained invalid data for the IRQ stack address. The stack addresses
were also symbolized as being in the _text symbol.
Fix this by bringing back the mapping of [_text, _stext), but now make
it non-executable and read-only. This prevents other allocations from
using it while still achieving the original goal of not mapping
unpredictable data as executable. Other than the changed protection,
this is effectively a revert of the fixes commit.
Fixes: e2a073dde921 ("arm64: omit [_text, _stext) from permanent kernel mapping")
Cc: stable(a)vger.kernel.org
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/kernel/pi/map_kernel.c b/arch/arm64/kernel/pi/map_kernel.c
index e6d35eff1486..e8ddbde31a83 100644
--- a/arch/arm64/kernel/pi/map_kernel.c
+++ b/arch/arm64/kernel/pi/map_kernel.c
@@ -78,6 +78,12 @@ static void __init map_kernel(u64 kaslr_offset, u64 va_offset, int root_level)
twopass |= enable_scs;
prot = twopass ? data_prot : text_prot;
+ /*
+ * [_stext, _text) isn't executed after boot and contains some
+ * non-executable, unpredictable data, so map it non-executable.
+ */
+ map_segment(init_pg_dir, &pgdp, va_offset, _text, _stext, data_prot,
+ false, root_level);
map_segment(init_pg_dir, &pgdp, va_offset, _stext, _etext, prot,
!twopass, root_level);
map_segment(init_pg_dir, &pgdp, va_offset, __start_rodata,
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 77c7926a4df6..23c05dc7a8f2 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -214,7 +214,7 @@ static void __init request_standard_resources(void)
unsigned long i = 0;
size_t res_size;
- kernel_code.start = __pa_symbol(_stext);
+ kernel_code.start = __pa_symbol(_text);
kernel_code.end = __pa_symbol(__init_begin - 1);
kernel_data.start = __pa_symbol(_sdata);
kernel_data.end = __pa_symbol(_end - 1);
@@ -280,7 +280,7 @@ u64 cpu_logical_map(unsigned int cpu)
void __init __no_sanitize_address setup_arch(char **cmdline_p)
{
- setup_initial_init_mm(_stext, _etext, _edata, _end);
+ setup_initial_init_mm(_text, _etext, _edata, _end);
*cmdline_p = boot_command_line;
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 70c2ca813c18..524d34a0e921 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -279,7 +279,7 @@ void __init arm64_memblock_init(void)
* Register the kernel text, kernel data, initrd, and initial
* pagetables with memblock.
*/
- memblock_reserve(__pa_symbol(_stext), _end - _stext);
+ memblock_reserve(__pa_symbol(_text), _end - _text);
if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_initrd_size) {
/* the generic initrd code expects virtual addresses */
initrd_start = __phys_to_virt(phys_initrd_start);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 0ba1a15e7e74..10c258099581 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -965,8 +965,8 @@ void __init mark_linear_text_alias_ro(void)
/*
* Remove the write permissions from the linear alias of .text/.rodata
*/
- update_mapping_prot(__pa_symbol(_stext), (unsigned long)lm_alias(_stext),
- (unsigned long)__init_begin - (unsigned long)_stext,
+ update_mapping_prot(__pa_symbol(_text), (unsigned long)lm_alias(_text),
+ (unsigned long)__init_begin - (unsigned long)_text,
PAGE_KERNEL_RO);
}
@@ -1037,7 +1037,7 @@ static inline bool force_pte_mapping(void)
static void __init map_mem(pgd_t *pgdp)
{
static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN);
- phys_addr_t kernel_start = __pa_symbol(_stext);
+ phys_addr_t kernel_start = __pa_symbol(_text);
phys_addr_t kernel_end = __pa_symbol(__init_begin);
phys_addr_t start, end;
phys_addr_t early_kfence_pool;
@@ -1086,7 +1086,7 @@ static void __init map_mem(pgd_t *pgdp)
}
/*
- * Map the linear alias of the [_stext, __init_begin) interval
+ * Map the linear alias of the [_text, __init_begin) interval
* as non-executable now, and remove the write permission in
* mark_linear_text_alias_ro() below (which will be called after
* alternative patching has completed). This makes the contents
@@ -1113,6 +1113,10 @@ void mark_rodata_ro(void)
WRITE_ONCE(rodata_is_rw, false);
update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
section_size, PAGE_KERNEL_RO);
+ /* mark the range between _text and _stext as read only. */
+ update_mapping_prot(__pa_symbol(_text), (unsigned long)_text,
+ (unsigned long)_stext - (unsigned long)_text,
+ PAGE_KERNEL_RO);
}
static void __init declare_vma(struct vm_struct *vma,
@@ -1183,7 +1187,7 @@ static void __init declare_kernel_vmas(void)
{
static struct vm_struct vmlinux_seg[KERNEL_SEGMENT_COUNT];
- declare_vma(&vmlinux_seg[0], _stext, _etext, VM_NO_GUARD);
+ declare_vma(&vmlinux_seg[0], _text, _etext, VM_NO_GUARD);
declare_vma(&vmlinux_seg[1], __start_rodata, __inittext_begin, VM_NO_GUARD);
declare_vma(&vmlinux_seg[2], __inittext_begin, __inittext_end, VM_NO_GUARD);
declare_vma(&vmlinux_seg[3], __initdata_begin, __initdata_end, VM_NO_GUARD);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 5973a62efa34c80c9a4e5eac1fca6f6209b902af
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101548-stiffly-eagle-d9a5@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5973a62efa34c80c9a4e5eac1fca6f6209b902af Mon Sep 17 00:00:00 2001
From: Omar Sandoval <osandov(a)fb.com>
Date: Fri, 19 Sep 2025 14:27:51 -0700
Subject: [PATCH] arm64: map [_text, _stext) virtual address range
non-executable+read-only
Since the referenced fixes commit, the kernel's .text section is only
mapped starting from _stext; the region [_text, _stext) is omitted. As a
result, other vmalloc/vmap allocations may use the virtual addresses
nominally in the range [_text, _stext). This address reuse confuses
multiple things:
1. crash_prepare_elf64_headers() sets up a segment in /proc/vmcore
mapping the entire range [_text, _end) to
[__pa_symbol(_text), __pa_symbol(_end)). Reading an address in
[_text, _stext) from /proc/vmcore therefore gives the incorrect
result.
2. Tools doing symbolization (either by reading /proc/kallsyms or based
on the vmlinux ELF file) will incorrectly identify vmalloc/vmap
allocations in [_text, _stext) as kernel symbols.
In practice, both of these issues affect the drgn debugger.
Specifically, there were cases where the vmap IRQ stacks for some CPUs
were allocated in [_text, _stext). As a result, drgn could not get the
stack trace for a crash in an IRQ handler because the core dump
contained invalid data for the IRQ stack address. The stack addresses
were also symbolized as being in the _text symbol.
Fix this by bringing back the mapping of [_text, _stext), but now make
it non-executable and read-only. This prevents other allocations from
using it while still achieving the original goal of not mapping
unpredictable data as executable. Other than the changed protection,
this is effectively a revert of the fixes commit.
Fixes: e2a073dde921 ("arm64: omit [_text, _stext) from permanent kernel mapping")
Cc: stable(a)vger.kernel.org
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/kernel/pi/map_kernel.c b/arch/arm64/kernel/pi/map_kernel.c
index e6d35eff1486..e8ddbde31a83 100644
--- a/arch/arm64/kernel/pi/map_kernel.c
+++ b/arch/arm64/kernel/pi/map_kernel.c
@@ -78,6 +78,12 @@ static void __init map_kernel(u64 kaslr_offset, u64 va_offset, int root_level)
twopass |= enable_scs;
prot = twopass ? data_prot : text_prot;
+ /*
+ * [_stext, _text) isn't executed after boot and contains some
+ * non-executable, unpredictable data, so map it non-executable.
+ */
+ map_segment(init_pg_dir, &pgdp, va_offset, _text, _stext, data_prot,
+ false, root_level);
map_segment(init_pg_dir, &pgdp, va_offset, _stext, _etext, prot,
!twopass, root_level);
map_segment(init_pg_dir, &pgdp, va_offset, __start_rodata,
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 77c7926a4df6..23c05dc7a8f2 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -214,7 +214,7 @@ static void __init request_standard_resources(void)
unsigned long i = 0;
size_t res_size;
- kernel_code.start = __pa_symbol(_stext);
+ kernel_code.start = __pa_symbol(_text);
kernel_code.end = __pa_symbol(__init_begin - 1);
kernel_data.start = __pa_symbol(_sdata);
kernel_data.end = __pa_symbol(_end - 1);
@@ -280,7 +280,7 @@ u64 cpu_logical_map(unsigned int cpu)
void __init __no_sanitize_address setup_arch(char **cmdline_p)
{
- setup_initial_init_mm(_stext, _etext, _edata, _end);
+ setup_initial_init_mm(_text, _etext, _edata, _end);
*cmdline_p = boot_command_line;
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 70c2ca813c18..524d34a0e921 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -279,7 +279,7 @@ void __init arm64_memblock_init(void)
* Register the kernel text, kernel data, initrd, and initial
* pagetables with memblock.
*/
- memblock_reserve(__pa_symbol(_stext), _end - _stext);
+ memblock_reserve(__pa_symbol(_text), _end - _text);
if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_initrd_size) {
/* the generic initrd code expects virtual addresses */
initrd_start = __phys_to_virt(phys_initrd_start);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 0ba1a15e7e74..10c258099581 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -965,8 +965,8 @@ void __init mark_linear_text_alias_ro(void)
/*
* Remove the write permissions from the linear alias of .text/.rodata
*/
- update_mapping_prot(__pa_symbol(_stext), (unsigned long)lm_alias(_stext),
- (unsigned long)__init_begin - (unsigned long)_stext,
+ update_mapping_prot(__pa_symbol(_text), (unsigned long)lm_alias(_text),
+ (unsigned long)__init_begin - (unsigned long)_text,
PAGE_KERNEL_RO);
}
@@ -1037,7 +1037,7 @@ static inline bool force_pte_mapping(void)
static void __init map_mem(pgd_t *pgdp)
{
static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN);
- phys_addr_t kernel_start = __pa_symbol(_stext);
+ phys_addr_t kernel_start = __pa_symbol(_text);
phys_addr_t kernel_end = __pa_symbol(__init_begin);
phys_addr_t start, end;
phys_addr_t early_kfence_pool;
@@ -1086,7 +1086,7 @@ static void __init map_mem(pgd_t *pgdp)
}
/*
- * Map the linear alias of the [_stext, __init_begin) interval
+ * Map the linear alias of the [_text, __init_begin) interval
* as non-executable now, and remove the write permission in
* mark_linear_text_alias_ro() below (which will be called after
* alternative patching has completed). This makes the contents
@@ -1113,6 +1113,10 @@ void mark_rodata_ro(void)
WRITE_ONCE(rodata_is_rw, false);
update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
section_size, PAGE_KERNEL_RO);
+ /* mark the range between _text and _stext as read only. */
+ update_mapping_prot(__pa_symbol(_text), (unsigned long)_text,
+ (unsigned long)_stext - (unsigned long)_text,
+ PAGE_KERNEL_RO);
}
static void __init declare_vma(struct vm_struct *vma,
@@ -1183,7 +1187,7 @@ static void __init declare_kernel_vmas(void)
{
static struct vm_struct vmlinux_seg[KERNEL_SEGMENT_COUNT];
- declare_vma(&vmlinux_seg[0], _stext, _etext, VM_NO_GUARD);
+ declare_vma(&vmlinux_seg[0], _text, _etext, VM_NO_GUARD);
declare_vma(&vmlinux_seg[1], __start_rodata, __inittext_begin, VM_NO_GUARD);
declare_vma(&vmlinux_seg[2], __inittext_begin, __inittext_end, VM_NO_GUARD);
declare_vma(&vmlinux_seg[3], __initdata_begin, __initdata_end, VM_NO_GUARD);
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 8e7e265d558e0257d6dacc78ec64aff4ba75f61e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101502-elf-bootie-19bf@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8e7e265d558e0257d6dacc78ec64aff4ba75f61e Mon Sep 17 00:00:00 2001
From: Charalampos Mitrodimas <charmitro(a)posteo.net>
Date: Sat, 16 Aug 2025 14:14:37 +0000
Subject: [PATCH] debugfs: fix mount options not being applied
Mount options (uid, gid, mode) are silently ignored when debugfs is
mounted. This is a regression introduced during the conversion to the
new mount API.
When the mount API conversion was done, the parsed options were never
applied to the superblock when it was reused. As a result, the mount
options were ignored when debugfs was mounted.
Fix this by following the same pattern as the tracefs fix in commit
e4d32142d1de ("tracing: Fix tracefs mount options"). Call
debugfs_reconfigure() in debugfs_get_tree() to apply the mount options
to the superblock after it has been created or reused.
As an example, with the bug the "mode" mount option is ignored:
$ mount -o mode=0666 -t debugfs debugfs /tmp/debugfs_test
$ mount | grep debugfs_test
debugfs on /tmp/debugfs_test type debugfs (rw,relatime)
$ ls -ld /tmp/debugfs_test
drwx------ 25 root root 0 Aug 4 14:16 /tmp/debugfs_test
With the fix applied, it works as expected:
$ mount -o mode=0666 -t debugfs debugfs /tmp/debugfs_test
$ mount | grep debugfs_test
debugfs on /tmp/debugfs_test type debugfs (rw,relatime,mode=666)
$ ls -ld /tmp/debugfs_test
drw-rw-rw- 37 root root 0 Aug 2 17:28 /tmp/debugfs_test
Fixes: a20971c18752 ("vfs: Convert debugfs to use the new mount API")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220406
Cc: stable(a)vger.kernel.org
Reviewed-by: Eric Sandeen <sandeen(a)redhat.com>
Signed-off-by: Charalampos Mitrodimas <charmitro(a)posteo.net>
Link: https://lore.kernel.org/20250816-debugfs-mount-opts-v3-1-d271dad57b5b@poste…
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index a0357b0cf362..c12d649df6a5 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -183,6 +183,9 @@ static int debugfs_reconfigure(struct fs_context *fc)
struct debugfs_fs_info *sb_opts = sb->s_fs_info;
struct debugfs_fs_info *new_opts = fc->s_fs_info;
+ if (!new_opts)
+ return 0;
+
sync_filesystem(sb);
/* structure copy of new mount options to sb */
@@ -282,10 +285,16 @@ static int debugfs_fill_super(struct super_block *sb, struct fs_context *fc)
static int debugfs_get_tree(struct fs_context *fc)
{
+ int err;
+
if (!(debugfs_allow & DEBUGFS_ALLOW_API))
return -EPERM;
- return get_tree_single(fc, debugfs_fill_super);
+ err = get_tree_single(fc, debugfs_fill_super);
+ if (err)
+ return err;
+
+ return debugfs_reconfigure(fc);
}
static void debugfs_free_fc(struct fs_context *fc)
The patch below does not apply to the 6.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.17.y
git checkout FETCH_HEAD
git cherry-pick -x 8e7e265d558e0257d6dacc78ec64aff4ba75f61e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101559-usher-corroding-5b5a@gregkh' --subject-prefix 'PATCH 6.17.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8e7e265d558e0257d6dacc78ec64aff4ba75f61e Mon Sep 17 00:00:00 2001
From: Charalampos Mitrodimas <charmitro(a)posteo.net>
Date: Sat, 16 Aug 2025 14:14:37 +0000
Subject: [PATCH] debugfs: fix mount options not being applied
Mount options (uid, gid, mode) are silently ignored when debugfs is
mounted. This is a regression introduced during the conversion to the
new mount API.
When the mount API conversion was done, the parsed options were never
applied to the superblock when it was reused. As a result, the mount
options were ignored when debugfs was mounted.
Fix this by following the same pattern as the tracefs fix in commit
e4d32142d1de ("tracing: Fix tracefs mount options"). Call
debugfs_reconfigure() in debugfs_get_tree() to apply the mount options
to the superblock after it has been created or reused.
As an example, with the bug the "mode" mount option is ignored:
$ mount -o mode=0666 -t debugfs debugfs /tmp/debugfs_test
$ mount | grep debugfs_test
debugfs on /tmp/debugfs_test type debugfs (rw,relatime)
$ ls -ld /tmp/debugfs_test
drwx------ 25 root root 0 Aug 4 14:16 /tmp/debugfs_test
With the fix applied, it works as expected:
$ mount -o mode=0666 -t debugfs debugfs /tmp/debugfs_test
$ mount | grep debugfs_test
debugfs on /tmp/debugfs_test type debugfs (rw,relatime,mode=666)
$ ls -ld /tmp/debugfs_test
drw-rw-rw- 37 root root 0 Aug 2 17:28 /tmp/debugfs_test
Fixes: a20971c18752 ("vfs: Convert debugfs to use the new mount API")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220406
Cc: stable(a)vger.kernel.org
Reviewed-by: Eric Sandeen <sandeen(a)redhat.com>
Signed-off-by: Charalampos Mitrodimas <charmitro(a)posteo.net>
Link: https://lore.kernel.org/20250816-debugfs-mount-opts-v3-1-d271dad57b5b@poste…
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index a0357b0cf362..c12d649df6a5 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -183,6 +183,9 @@ static int debugfs_reconfigure(struct fs_context *fc)
struct debugfs_fs_info *sb_opts = sb->s_fs_info;
struct debugfs_fs_info *new_opts = fc->s_fs_info;
+ if (!new_opts)
+ return 0;
+
sync_filesystem(sb);
/* structure copy of new mount options to sb */
@@ -282,10 +285,16 @@ static int debugfs_fill_super(struct super_block *sb, struct fs_context *fc)
static int debugfs_get_tree(struct fs_context *fc)
{
+ int err;
+
if (!(debugfs_allow & DEBUGFS_ALLOW_API))
return -EPERM;
- return get_tree_single(fc, debugfs_fill_super);
+ err = get_tree_single(fc, debugfs_fill_super);
+ if (err)
+ return err;
+
+ return debugfs_reconfigure(fc);
}
static void debugfs_free_fc(struct fs_context *fc)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 72d271a7baa7062cb27e774ac37c5459c6d20e22
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101537-visor-thank-7800@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 72d271a7baa7062cb27e774ac37c5459c6d20e22 Mon Sep 17 00:00:00 2001
From: Aleksa Sarai <cyphar(a)cyphar.com>
Date: Thu, 7 Aug 2025 03:55:23 +1000
Subject: [PATCH] fscontext: do not consume log entries when returning
-EMSGSIZE
Userspace generally expects APIs that return -EMSGSIZE to allow for them
to adjust their buffer size and retry the operation. However, the
fscontext log would previously clear the message even in the -EMSGSIZE
case.
Given that it is very cheap for us to check whether the buffer is too
small before we remove the message from the ring buffer, let's just do
that instead. While we're at it, refactor some fscontext_read() into a
separate helper to make the ring buffer logic a bit easier to read.
Fixes: 007ec26cdc9f ("vfs: Implement logging through fs_context")
Cc: David Howells <dhowells(a)redhat.com>
Cc: stable(a)vger.kernel.org # v5.2+
Signed-off-by: Aleksa Sarai <cyphar(a)cyphar.com>
Link: https://lore.kernel.org/20250807-fscontext-log-cleanups-v3-1-8d91d6242dc3@c…
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/fsopen.c b/fs/fsopen.c
index 1aaf4cb2afb2..f645c99204eb 100644
--- a/fs/fsopen.c
+++ b/fs/fsopen.c
@@ -18,50 +18,56 @@
#include "internal.h"
#include "mount.h"
+static inline const char *fetch_message_locked(struct fc_log *log, size_t len,
+ bool *need_free)
+{
+ const char *p;
+ int index;
+
+ if (unlikely(log->head == log->tail))
+ return ERR_PTR(-ENODATA);
+
+ index = log->tail & (ARRAY_SIZE(log->buffer) - 1);
+ p = log->buffer[index];
+ if (unlikely(strlen(p) > len))
+ return ERR_PTR(-EMSGSIZE);
+
+ log->buffer[index] = NULL;
+ *need_free = log->need_free & (1 << index);
+ log->need_free &= ~(1 << index);
+ log->tail++;
+
+ return p;
+}
+
/*
* Allow the user to read back any error, warning or informational messages.
+ * Only one message is returned for each read(2) call.
*/
static ssize_t fscontext_read(struct file *file,
char __user *_buf, size_t len, loff_t *pos)
{
struct fs_context *fc = file->private_data;
- struct fc_log *log = fc->log.log;
- unsigned int logsize = ARRAY_SIZE(log->buffer);
- ssize_t ret;
- char *p;
+ ssize_t err;
+ const char *p __free(kfree) = NULL, *message;
bool need_free;
- int index, n;
+ int n;
- ret = mutex_lock_interruptible(&fc->uapi_mutex);
- if (ret < 0)
- return ret;
-
- if (log->head == log->tail) {
- mutex_unlock(&fc->uapi_mutex);
- return -ENODATA;
- }
-
- index = log->tail & (logsize - 1);
- p = log->buffer[index];
- need_free = log->need_free & (1 << index);
- log->buffer[index] = NULL;
- log->need_free &= ~(1 << index);
- log->tail++;
+ err = mutex_lock_interruptible(&fc->uapi_mutex);
+ if (err < 0)
+ return err;
+ message = fetch_message_locked(fc->log.log, len, &need_free);
mutex_unlock(&fc->uapi_mutex);
+ if (IS_ERR(message))
+ return PTR_ERR(message);
- ret = -EMSGSIZE;
- n = strlen(p);
- if (n > len)
- goto err_free;
- ret = -EFAULT;
- if (copy_to_user(_buf, p, n) != 0)
- goto err_free;
- ret = n;
-
-err_free:
if (need_free)
- kfree(p);
- return ret;
+ p = message;
+
+ n = strlen(message);
+ if (copy_to_user(_buf, message, n))
+ return -EFAULT;
+ return n;
}
static int fscontext_release(struct inode *inode, struct file *file)
The desc->len value can be set up to U32_MAX. If umem tx_metadata_len
option is also set, the value of the expression
'desc->len + pool->tx_metadata_len' can overflow and validation
of the incorrect descriptor will be successfully passed.
This can lead to a subsequent chain of arithmetic overflows
in the xsk_build_skb() function and incorrect sk_buff allocation.
To reproduce the overflow, this piece of userspace code can be used:
struct xdp_umem_reg umem_reg;
umem_reg.addr = (__u64)(void *)umem;
...
umem_reg.chunk_size = 4096;
umem_reg.tx_metadata_len = 16;
umem_reg.flags = XDP_UMEM_TX_METADATA_LEN;
setsockopt(sfd, SOL_XDP, XDP_UMEM_REG, &umem_reg, sizeof(umem_reg));
...
xsk_ring_prod__reserve(tq, batch_size, &idx);
for (i = 0; i < nr_packets; ++i) {
struct xdp_desc *tx_desc = xsk_ring_prod__tx_desc(tq, idx + i);
tx_desc->addr = packets[i].addr;
tx_desc->addr += umem->tx_metadata_len;
tx_desc->options = XDP_TX_METADATA;
tx_desc->len = UINT32_MAX;
}
xsk_ring_prod__submit(tq, nr_packets);
...
sendto(sfd, NULL, 0, MSG_DONTWAIT, NULL, 0);
Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with SVACE.
Fixes: 341ac980eab9 ("xsk: Support tx_metadata_len")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov(a)infotecs.ru>
---
v2: Add a repro
net/xdp/xsk_queue.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index f16f390370dc..b206a8839b39 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -144,7 +144,7 @@ static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool,
struct xdp_desc *desc)
{
u64 addr = desc->addr - pool->tx_metadata_len;
- u64 len = desc->len + pool->tx_metadata_len;
+ u64 len = (u64)desc->len + pool->tx_metadata_len;
u64 offset = addr & (pool->chunk_size - 1);
if (!desc->len)
@@ -165,7 +165,7 @@ static inline bool xp_unaligned_validate_desc(struct xsk_buff_pool *pool,
struct xdp_desc *desc)
{
u64 addr = xp_unaligned_add_offset_to_addr(desc->addr) - pool->tx_metadata_len;
- u64 len = desc->len + pool->tx_metadata_len;
+ u64 len = (u64)desc->len + pool->tx_metadata_len;
if (!desc->len)
return false;
--
2.39.5
Devices without the AWCC interface don't initialize `awcc`. Add a check
before dereferencing it in sleep handlers.
Cc: stable(a)vger.kernel.org
Reported-by: Gal Hammer <galhammer(a)gmail.com>
Tested-by: Gal Hammer <galhammer(a)gmail.com>
Fixes: 07ac275981b1 ("platform/x86: alienware-wmi-wmax: Add support for manual fan control")
Signed-off-by: Kurt Borja <kuurtb(a)gmail.com>
---
Changes in v3:
- Fix typo in title
- Go for a simpler approach because the last one prevented the old
driver interface from loading
- Link to v2: https://lore.kernel.org/r/20251013-sleep-fix-v2-1-1ad8bdb79585@gmail.com
Changes in v2:
- Little logic mistake in the `force_gmode` path... (oops)
- Link to v1: https://lore.kernel.org/r/20251013-sleep-fix-v1-1-92bc11b6ecae@gmail.com
---
drivers/platform/x86/dell/alienware-wmi-wmax.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/platform/x86/dell/alienware-wmi-wmax.c b/drivers/platform/x86/dell/alienware-wmi-wmax.c
index 31f9643a6a3b..b106e8e407b3 100644
--- a/drivers/platform/x86/dell/alienware-wmi-wmax.c
+++ b/drivers/platform/x86/dell/alienware-wmi-wmax.c
@@ -1639,7 +1639,7 @@ static int wmax_wmi_probe(struct wmi_device *wdev, const void *context)
static int wmax_wmi_suspend(struct device *dev)
{
- if (awcc->hwmon)
+ if (awcc && awcc->hwmon)
awcc_hwmon_suspend(dev);
return 0;
@@ -1647,7 +1647,7 @@ static int wmax_wmi_suspend(struct device *dev)
static int wmax_wmi_resume(struct device *dev)
{
- if (awcc->hwmon)
+ if (awcc && awcc->hwmon)
awcc_hwmon_resume(dev);
return 0;
---
base-commit: 3ed17349f18774c24505b0c21dfbd3cc4f126518
change-id: 20251012-sleep-fix-5d0596dd92a3
--
~ Kurt
When adding dependencies with drm_sched_job_add_dependency(), that
function consumes the fence reference both on success and failure, so in
the latter case the dma_fence_put() on the error path (xarray failed to
expand) is a double free.
Interestingly this bug appears to have been present ever since
ebd5f74255b9 ("drm/sched: Add dependency tracking"), since the code back
then looked like this:
drm_sched_job_add_implicit_dependencies():
...
for (i = 0; i < fence_count; i++) {
ret = drm_sched_job_add_dependency(job, fences[i]);
if (ret)
break;
}
for (; i < fence_count; i++)
dma_fence_put(fences[i]);
Which means for the failing 'i' the dma_fence_put was already a double
free. Possibly there were no users at that time, or the test cases were
insufficient to hit it.
The bug was then only noticed and fixed after
9c2ba265352a ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
landed, with its fixup of
4eaf02d6076c ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies").
At that point it was a slightly different flavour of a double free, which
963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
noticed and attempted to fix.
But it only moved the double free from happening inside the
drm_sched_job_add_dependency(), when releasing the reference not yet
obtained, to the caller, when releasing the reference already released by
the former in the failure case.
As such it is not easy to identify the right target for the fixes tag so
lets keep it simple and just continue the chain.
While fixing we also improve the comment and explain the reason for taking
the reference and not dropping it.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com>
Fixes: 963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Reference: https://lore.kernel.org/dri-devel/aNFbXq8OeYl3QSdm@stanley.mountain/
Cc: Christian König <christian.koenig(a)amd.com>
Cc: Rob Clark <robdclark(a)chromium.org>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Danilo Krummrich <dakr(a)kernel.org>
Cc: Philipp Stanner <phasta(a)kernel.org>
Cc: "Christian König" <ckoenig.leichtzumerken(a)gmail.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v5.16+
---
v2:
* Re-arrange commit text so discussion around sentences starting with
capital letters in all cases can be avoided.
* Keep double return for now.
* Improved comment instead of dropping it.
---
drivers/gpu/drm/scheduler/sched_main.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 46119aacb809..c39f0245e3a9 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -965,13 +965,14 @@ int drm_sched_job_add_resv_dependencies(struct drm_sched_job *job,
dma_resv_assert_held(resv);
dma_resv_for_each_fence(&cursor, resv, usage, fence) {
- /* Make sure to grab an additional ref on the added fence */
- dma_fence_get(fence);
- ret = drm_sched_job_add_dependency(job, fence);
- if (ret) {
- dma_fence_put(fence);
+ /*
+ * As drm_sched_job_add_dependency always consumes the fence
+ * reference (even when it fails), and dma_resv_for_each_fence
+ * is not obtaining one, we need to grab one before calling.
+ */
+ ret = drm_sched_job_add_dependency(job, dma_fence_get(fence));
+ if (ret)
return ret;
- }
}
return 0;
}
--
2.48.0
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x bcd1383516bb5a6f72b2d1e7f7ad42c4a14837d1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101352-joylessly-yoyo-b1e8@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bcd1383516bb5a6f72b2d1e7f7ad42c4a14837d1 Mon Sep 17 00:00:00 2001
From: Kai Vehmanen <kai.vehmanen(a)linux.intel.com>
Date: Thu, 2 Oct 2025 10:47:15 +0300
Subject: [PATCH] ASoC: SOF: ipc4-pcm: fix delay calculation when DSP resamples
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When the sampling rates going in (host) and out (dai) from the DSP
are different, the IPC4 delay reporting does not work correctly.
Add support for this case by scaling the all raw position values to
a common timebase before calculating real-time delay for the PCM.
Cc: stable(a)vger.kernel.org
Fixes: 0ea06680dfcb ("ASoC: SOF: ipc4-pcm: Correct the delay calculation")
Signed-off-by: Kai Vehmanen <kai.vehmanen(a)linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi(a)linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao(a)linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi(a)linux.intel.com>
Link: https://patch.msgid.link/20251002074719.2084-2-peter.ujfalusi@linux.intel.c…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/sound/soc/sof/ipc4-pcm.c b/sound/soc/sof/ipc4-pcm.c
index cb9a06792a47..769ba4fed56a 100644
--- a/sound/soc/sof/ipc4-pcm.c
+++ b/sound/soc/sof/ipc4-pcm.c
@@ -19,12 +19,14 @@
* struct sof_ipc4_timestamp_info - IPC4 timestamp info
* @host_copier: the host copier of the pcm stream
* @dai_copier: the dai copier of the pcm stream
- * @stream_start_offset: reported by fw in memory window (converted to frames)
- * @stream_end_offset: reported by fw in memory window (converted to frames)
+ * @stream_start_offset: reported by fw in memory window (converted to
+ * frames at host_copier sampling rate)
+ * @stream_end_offset: reported by fw in memory window (converted to
+ * frames at host_copier sampling rate)
* @llp_offset: llp offset in memory window
- * @boundary: wrap boundary should be used for the LLP frame counter
* @delay: Calculated and stored in pointer callback. The stored value is
- * returned in the delay callback.
+ * returned in the delay callback. Expressed in frames at host copier
+ * sampling rate.
*/
struct sof_ipc4_timestamp_info {
struct sof_ipc4_copier *host_copier;
@@ -33,7 +35,6 @@ struct sof_ipc4_timestamp_info {
u64 stream_end_offset;
u32 llp_offset;
- u64 boundary;
snd_pcm_sframes_t delay;
};
@@ -48,6 +49,16 @@ struct sof_ipc4_pcm_stream_priv {
bool chain_dma_allocated;
};
+/*
+ * Modulus to use to compare host and link position counters. The sampling
+ * rates may be different, so the raw hardware counters will wrap
+ * around at different times. To calculate differences, use
+ * DELAY_BOUNDARY as a common modulus. This value must be smaller than
+ * the wrap-around point of any hardware counter, and larger than any
+ * valid delay measurement.
+ */
+#define DELAY_BOUNDARY U32_MAX
+
static inline struct sof_ipc4_timestamp_info *
sof_ipc4_sps_to_time_info(struct snd_sof_pcm_stream *sps)
{
@@ -1049,6 +1060,35 @@ static int sof_ipc4_pcm_hw_params(struct snd_soc_component *component,
return 0;
}
+static u64 sof_ipc4_frames_dai_to_host(struct sof_ipc4_timestamp_info *time_info, u64 value)
+{
+ u64 dai_rate, host_rate;
+
+ if (!time_info->dai_copier || !time_info->host_copier)
+ return value;
+
+ /*
+ * copiers do not change sampling rate, so we can use the
+ * out_format independently of stream direction
+ */
+ dai_rate = time_info->dai_copier->data.out_format.sampling_frequency;
+ host_rate = time_info->host_copier->data.out_format.sampling_frequency;
+
+ if (!dai_rate || !host_rate || dai_rate == host_rate)
+ return value;
+
+ /* take care not to overflow u64, rates can be up to 768000 */
+ if (value > U32_MAX) {
+ value = div64_u64(value, dai_rate);
+ value *= host_rate;
+ } else {
+ value *= host_rate;
+ value = div64_u64(value, dai_rate);
+ }
+
+ return value;
+}
+
static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev,
struct snd_pcm_substream *substream,
struct snd_sof_pcm_stream *sps,
@@ -1099,14 +1139,13 @@ static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev,
time_info->stream_end_offset = ppl_reg.stream_end_offset;
do_div(time_info->stream_end_offset, dai_sample_size);
+ /* convert to host frame time */
+ time_info->stream_start_offset =
+ sof_ipc4_frames_dai_to_host(time_info, time_info->stream_start_offset);
+ time_info->stream_end_offset =
+ sof_ipc4_frames_dai_to_host(time_info, time_info->stream_end_offset);
+
out:
- /*
- * Calculate the wrap boundary need to be used for delay calculation
- * The host counter is in bytes, it will wrap earlier than the frames
- * based link counter.
- */
- time_info->boundary = div64_u64(~((u64)0),
- frames_to_bytes(substream->runtime, 1));
/* Initialize the delay value to 0 (no delay) */
time_info->delay = 0;
@@ -1149,6 +1188,8 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component,
/* For delay calculation we need the host counter */
host_cnt = snd_sof_pcm_get_host_byte_counter(sdev, component, substream);
+
+ /* Store the original value to host_ptr */
host_ptr = host_cnt;
/* convert the host_cnt to frames */
@@ -1167,6 +1208,8 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component,
sof_mailbox_read(sdev, time_info->llp_offset, &llp, sizeof(llp));
dai_cnt = ((u64)llp.reading.llp_u << 32) | llp.reading.llp_l;
}
+
+ dai_cnt = sof_ipc4_frames_dai_to_host(time_info, dai_cnt);
dai_cnt += time_info->stream_end_offset;
/* In two cases dai dma counter is not accurate
@@ -1200,8 +1243,9 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component,
dai_cnt -= time_info->stream_start_offset;
}
- /* Wrap the dai counter at the boundary where the host counter wraps */
- div64_u64_rem(dai_cnt, time_info->boundary, &dai_cnt);
+ /* Convert to a common base before comparisons */
+ dai_cnt &= DELAY_BOUNDARY;
+ host_cnt &= DELAY_BOUNDARY;
if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
head_cnt = host_cnt;
@@ -1211,14 +1255,11 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component,
tail_cnt = host_cnt;
}
- if (head_cnt < tail_cnt) {
- time_info->delay = time_info->boundary - tail_cnt + head_cnt;
- goto out;
- }
+ if (unlikely(head_cnt < tail_cnt))
+ time_info->delay = DELAY_BOUNDARY - tail_cnt + head_cnt;
+ else
+ time_info->delay = head_cnt - tail_cnt;
- time_info->delay = head_cnt - tail_cnt;
-
-out:
/*
* Convert the host byte counter to PCM pointer which wraps in buffer
* and it is in frames
The patch titled
Subject: mm/damon/core: use damos_commit_quota_goal() for new goal commit
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-core-use-damos_commit_quota_goal-for-new-goal-commit.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/core: use damos_commit_quota_goal() for new goal commit
Date: Mon, 13 Oct 2025 17:18:44 -0700
When damos_commit_quota_goals() is called for adding new DAMOS quota goals
of DAMOS_QUOTA_USER_INPUT metric, current_value fields of the new goals
should be also set as requested.
However, damos_commit_quota_goals() is not updating the field for the
case, since it is setting only metrics and target values using
damos_new_quota_goal(), and metric-optional union fields using
damos_commit_quota_goal_union(). As a result, users could see the first
current_value parameter that committed online with a new quota goal is
ignored. Users are assumed to commit the current_value for
DAMOS_QUOTA_USER_INPUT quota goals, since it is being used as a feedback.
Hence the real impact would be subtle. That said, this is obviously not
intended behavior.
Fix the issue by using damos_commit_quota_goal() which sets all quota goal
parameters, instead of damos_commit_quota_goal_union(), which sets only
the union fields.
Link: https://lkml.kernel.org/r/20251014001846.279282-1-sj@kernel.org
Fixes: 1aef9df0ee90 ("mm/damon/core: commit damos_quota_goal->nid")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.16+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/damon/core.c~mm-damon-core-use-damos_commit_quota_goal-for-new-goal-commit
+++ a/mm/damon/core.c
@@ -835,7 +835,7 @@ int damos_commit_quota_goals(struct damo
src_goal->metric, src_goal->target_value);
if (!new_goal)
return -ENOMEM;
- damos_commit_quota_goal_union(new_goal, src_goal);
+ damos_commit_quota_goal(new_goal, src_goal);
damos_add_quota_goal(dst, new_goal);
}
return 0;
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-sysfs-catch-commit-test-ctx-alloc-failure.patch
mm-damon-sysfs-dealloc-commit-test-ctx-always.patch
mm-damon-core-fix-list_add_tail-call-on-damon_call.patch
mm-damon-core-use-damos_commit_quota_goal-for-new-goal-commit.patch
mm-zswap-remove-unnecessary-dlen-writes-for-incompressible-pages.patch
mm-zswap-fix-typos-s-zwap-zswap.patch
mm-zswap-s-red-black-tree-xarray.patch
docs-admin-guide-mm-zswap-s-red-black-tree-xarray.patch
The patch titled
Subject: mm/damon/core: fix potential memory leak by cleaning ops_filter in damon_destroy_scheme
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-core-fix-potential-memory-leak-by-cleaning-ops_filter-in-damon_destroy_scheme.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Enze Li <lienze(a)kylinos.cn>
Subject: mm/damon/core: fix potential memory leak by cleaning ops_filter in damon_destroy_scheme
Date: Tue, 14 Oct 2025 16:42:25 +0800
Currently, damon_destroy_scheme() only cleans up the filter list but
leaves ops_filter untouched, which could lead to memory leaks when a
scheme is destroyed.
This patch ensures both filter and ops_filter are properly freed in
damon_destroy_scheme(), preventing potential memory leaks.
Link: https://lkml.kernel.org/r/20251014084225.313313-1-lienze@kylinos.cn
Fixes: ab82e57981d0 ("mm/damon/core: introduce damos->ops_filters")
Signed-off-by: Enze Li <lienze(a)kylinos.cn>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Tested-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/core.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/damon/core.c~mm-damon-core-fix-potential-memory-leak-by-cleaning-ops_filter-in-damon_destroy_scheme
+++ a/mm/damon/core.c
@@ -452,6 +452,9 @@ void damon_destroy_scheme(struct damos *
damos_for_each_filter_safe(f, next, s)
damos_destroy_filter(f);
+ damos_for_each_ops_filter_safe(f, next, s)
+ damos_destroy_filter(f);
+
kfree(s->migrate_dests.node_id_arr);
kfree(s->migrate_dests.weight_arr);
damon_del_scheme(s);
_
Patches currently in -mm which might be from lienze(a)kylinos.cn are
mm-damon-core-fix-potential-memory-leak-by-cleaning-ops_filter-in-damon_destroy_scheme.patch
The patch titled
Subject: vmw_balloon: indicate success when effectively deflating during migration
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
vmw_balloon-indicate-success-when-effectively-deflating-during-migration.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: vmw_balloon: indicate success when effectively deflating during migration
Date: Tue, 14 Oct 2025 14:44:55 +0200
When migrating a balloon page, we first deflate the old page to then
inflate the new page.
However, if inflating the new page succeeded, we effectively deflated the
old page, reducing the balloon size.
In that case, the migration actually worked: similar to migrating+
immediately deflating the new page. The old page will be freed back to
the buddy.
Right now, the core will leave the page be marked as isolated (as we
returned an error). When later trying to putback that page, we will run
into the WARN_ON_ONCE() in balloon_page_putback().
That handling was changed in commit 3544c4faccb8 ("mm/balloon_compaction:
stop using __ClearPageMovable()"); before that change, we would have
tolerated that way of handling it.
To fix it, let's just return 0 in that case, making the core effectively
just clear the "isolated" flag + freeing it back to the buddy as if the
migration succeeded. Note that the new page will also get freed when the
core puts the last reference.
Note that this also makes it all be more consistent: we will no longer
unisolate the page in the balloon driver while keeping it marked as being
isolated in migration core.
This was found by code inspection.
Link: https://lkml.kernel.org/r/20251014124455.478345-1-david@redhat.com
Fixes: 3544c4faccb8 ("mm/balloon_compaction: stop using __ClearPageMovable()")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Cc: Jerrin Shaji George <jerrin.shaji-george(a)broadcom.com>
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/misc/vmw_balloon.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
--- a/drivers/misc/vmw_balloon.c~vmw_balloon-indicate-success-when-effectively-deflating-during-migration
+++ a/drivers/misc/vmw_balloon.c
@@ -1737,7 +1737,7 @@ static int vmballoon_migratepage(struct
{
unsigned long status, flags;
struct vmballoon *b;
- int ret;
+ int ret = 0;
b = container_of(b_dev_info, struct vmballoon, b_dev_info);
@@ -1796,17 +1796,15 @@ static int vmballoon_migratepage(struct
* A failure happened. While we can deflate the page we just
* inflated, this deflation can also encounter an error. Instead
* we will decrease the size of the balloon to reflect the
- * change and report failure.
+ * change.
*/
atomic64_dec(&b->size);
- ret = -EBUSY;
} else {
/*
* Success. Take a reference for the page, and we will add it to
* the list after acquiring the lock.
*/
get_page(newpage);
- ret = 0;
}
/* Update the balloon list under the @pages_lock */
@@ -1817,7 +1815,7 @@ static int vmballoon_migratepage(struct
* If we succeed just insert it to the list and update the statistics
* under the lock.
*/
- if (!ret) {
+ if (status == VMW_BALLOON_SUCCESS) {
balloon_page_insert(&b->b_dev_info, newpage);
__count_vm_event(BALLOON_MIGRATE);
}
_
Patches currently in -mm which might be from david(a)redhat.com are
vmw_balloon-indicate-success-when-effectively-deflating-during-migration.patch
Using async-profiler
(https://github.com/async-profiler/async-profiler/) on Linux
6.17.1-arch1-1 causes a complete hang of the CPU. This has been
reported by many people at https://github.com/lucko/spark/issues/530.
spark is a piece of software that uses async-profiler internally.
As seen in https://github.com/lucko/spark/issues/530#issuecomment-3339974827,
this was bisected to 18dbcbfabfffc4a5d3ea10290c5ad27f22b0d240 perf:
Fix the POLL_HUP delivery breakage. Reverting this commit on 6.17.1
fixed the issue for me.
Steps to reproduce:
1. Get a copy of async-profiler. I tested both v3 (affects older spark
versions) and v4.1 (latest at time of writing). Unarchive it, this is
<async-profiler-dir>.
2. Set kernel parameters kernel.perf_event_paranoid=1 and
kernel.kptr_restrict=0 as instructed by
https://github.com/async-profiler/async-profiler/blob/fb673227c7fb311f872ce…
3. Install a version of Java that comes with jshell, i.e. Java 9 or
newer. Note: jshell is used for ease of reproduction. Any Java
application that is actively running will work.
4. Run `printf 'int acc; while (true) { acc++; }' | jshell -`. This
will start an infinitely running Java process.
5. Run `jps` and take the PID next to the text RemoteExecutionControl
-- this is the process that was just started.
6. Attach async-profiler to this process by running
`<async-profiler-dir>/bin/asprof -d 1 <PID>`. This will run for one
second, then the system should freeze entirely shortly thereafter.
I triggered a sysrq crash while the system was frozen, and the output
I found in journalctl afterwards is at
https://gist.github.com/octylFractal/76611ee76060051e5efc0c898dd0949e
I'm not sure if that text is actually from the triggered crash, but it
seems relevant. If needed, please tell me how to get the actual crash
report, I'm not sure where it is.
I'm using an AMD Ryzen 9 5900X 12-Core Processor. Given that I've seen
no Intel reports, it may be AMD specific. I don't have an Intel CPU on
hand to test with.
/proc/version: Linux version 6.17.1-arch1-1 (linux@archlinux) (gcc
(GCC) 15.2.1 20250813, GNU ld (GNU Binutils) 2.45.0) #1 SMP
PREEMPT_DYNAMIC Mon, 06 Oct 2025 18:48:29 +0000
Operating System: Arch Linux
uname -mi: x86_64 unknown
For idpf:
Milena fixes a memory leak in the idpf reset logic when the driver resets
with an outstanding Tx timestamp.
For ixgbe and ixgbevf:
Jedrzej fixes an issue with reporting link speed on E610 VFs.
Jedrzej also fixes the VF mailbox API incompatibilities caused by the
confusion with API v1.4, v1.5, and v1.6. The v1.4 API introduced IPSEC
offload, but this was only supported on Linux hosts. The v1.5 API
introduced a new mailbox API which is necessary to resolve issues on ESX
hosts. The v1.6 API introduced a new link management API for E610. Jedrzej
introduces a new v1.7 API with a feature negotiation which enables properly
checking if features such as IPSEC or the ESX mailbox APIs are supported.
This resolves issues with compatibility on different hosts, and aligns the
API across hosts instead of having Linux require custom mailbox API
versions for IPSEC offload.
Koichiro fixes a KASAN use-after-free bug in ixgbe_remove().
Signed-off-by: Jacob Keller <jacob.e.keller(a)intel.com>
---
Changes in v3:
- Rebase and verified compilation issues are resolved.
- Configured b4 to exclude undeliverable addresses.
- Link to v2: https://lore.kernel.org/r/20251003-jk-iwl-net-2025-10-01-v2-0-e59b4141c1b5@…
Changes in v2:
- Drop Emil's idpf_vport_open race fix for now.
- Add my signature.
- Link to v1: https://lore.kernel.org/r/20251001-jk-iwl-net-2025-10-01-v1-0-49fa99e86600@…
---
Jedrzej Jagielski (4):
ixgbevf: fix getting link speed data for E610 devices
ixgbe: handle IXGBE_VF_GET_PF_LINK_STATE mailbox operation
ixgbevf: fix mailbox API compatibility by negotiating supported features
ixgbe: handle IXGBE_VF_FEATURES_NEGOTIATE mbox cmd
Koichiro Den (1):
ixgbe: fix too early devlink_free() in ixgbe_remove()
Milena Olech (1):
idpf: cleanup remaining SKBs in PTP flows
drivers/net/ethernet/intel/ixgbe/ixgbe_mbx.h | 15 ++
drivers/net/ethernet/intel/ixgbevf/defines.h | 1 +
drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 7 +
drivers/net/ethernet/intel/ixgbevf/mbx.h | 8 +
drivers/net/ethernet/intel/ixgbevf/vf.h | 1 +
drivers/net/ethernet/intel/idpf/idpf_ptp.c | 3 +
.../net/ethernet/intel/idpf/idpf_virtchnl_ptp.c | 1 +
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 +-
drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 79 +++++++++
drivers/net/ethernet/intel/ixgbevf/ipsec.c | 10 ++
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 34 +++-
drivers/net/ethernet/intel/ixgbevf/vf.c | 182 +++++++++++++++++----
12 files changed, 310 insertions(+), 34 deletions(-)
---
base-commit: fea8cdf6738a8b25fccbb7b109b440795a0892cb
change-id: 20251001-jk-iwl-net-2025-10-01-92cd2a626ff7
Best regards,
--
Jacob Keller <jacob.e.keller(a)intel.com>
The patch titled
Subject: mm/damon/core: fix list_add_tail() call on damon_call()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-core-fix-list_add_tail-call-on-damon_call.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/core: fix list_add_tail() call on damon_call()
Date: Tue, 14 Oct 2025 13:59:36 -0700
Each damon_ctx maintains callback requests using a linked list
(damon_ctx->call_controls). When a new callback request is received via
damon_call(), the new request should be added to the list. However, the
function is making a mistake at list_add_tail() invocation: putting the
new item to add and the list head to add it before, in the opposite order.
Because of the linked list manipulation implementation, the new request
can still be reached from the context's list head. But the list items
that were added before the new request are dropped from the list.
As a result, the callbacks are unexpectedly not invocated. Worse yet, if
the dropped callback requests were dynamically allocated, the memory is
leaked. Actually DAMON sysfs interface is using a dynamically allocated
repeat-mode callback request for automatic essential stats update. And
because the online DAMON parameters commit is using a non-repeat-mode
callback request, the issue can easily be reproduced, like below.
# damo start --damos_action stat --refresh_stat 1s
# damo tune --damos_action stat --refresh_stat 1s
The first command dynamically allocates the repeat-mode callback request
for automatic essential stat update. Users can see the essential stats
are automatically updated for every second, using the sysfs interface.
The second command calls damon_commit() with a new callback request that
was made for the commit. As a result, the previously added repeat-mode
callback request is dropped from the list. The automatic stats refresh
stops working, and the memory for the repeat-mode callback request is
leaked. It can be confirmed using kmemleak.
Fix the mistake on the list_add_tail() call.
Link: https://lkml.kernel.org/r/20251014205939.1206-1-sj@kernel.org
Fixes: 004ded6bee11 ("mm/damon: accept parallel damon_call() requests")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/damon/core.c~mm-damon-core-fix-list_add_tail-call-on-damon_call
+++ a/mm/damon/core.c
@@ -1450,7 +1450,7 @@ int damon_call(struct damon_ctx *ctx, st
INIT_LIST_HEAD(&control->list);
mutex_lock(&ctx->call_controls_lock);
- list_add_tail(&ctx->call_controls, &control->list);
+ list_add_tail(&control->list, &ctx->call_controls);
mutex_unlock(&ctx->call_controls_lock);
if (!damon_is_running(ctx))
return -EINVAL;
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-sysfs-catch-commit-test-ctx-alloc-failure.patch
mm-damon-sysfs-dealloc-commit-test-ctx-always.patch
mm-damon-core-fix-list_add_tail-call-on-damon_call.patch
mm-zswap-remove-unnecessary-dlen-writes-for-incompressible-pages.patch
mm-zswap-fix-typos-s-zwap-zswap.patch
mm-zswap-s-red-black-tree-xarray.patch
docs-admin-guide-mm-zswap-s-red-black-tree-xarray.patch
We found some discrete AMD graphics devices hide funtion 0 and the whole
is not supposed to be probed.
Since our original purpose is to allow integrated devices (on bus 0) to
be probed without function 0, we can limit the islolated function probing
only on bus 0.
Cc: stable(a)vger.kernel.org
Fixes: a02fd05661d73a8 ("PCI: Extend isolated function probing to LoongArch")
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
Resend to correct the subject line.
drivers/pci/probe.c | 2 +-
include/linux/hypervisor.h | 8 +++++---
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c83e75a0ec12..da6a2aef823a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2883,7 +2883,7 @@ int pci_scan_slot(struct pci_bus *bus, int devfn)
* a hypervisor that passes through individual PCI
* functions.
*/
- if (!hypervisor_isolated_pci_functions())
+ if (!hypervisor_isolated_pci_functions(bus->number))
break;
}
fn = next_fn(bus, dev, fn);
diff --git a/include/linux/hypervisor.h b/include/linux/hypervisor.h
index be5417303ecf..30ece04a16d9 100644
--- a/include/linux/hypervisor.h
+++ b/include/linux/hypervisor.h
@@ -32,13 +32,15 @@ static inline bool jailhouse_paravirt(void)
#endif /* !CONFIG_X86 */
-static inline bool hypervisor_isolated_pci_functions(void)
+static inline bool hypervisor_isolated_pci_functions(int bus)
{
if (IS_ENABLED(CONFIG_S390))
return true;
- if (IS_ENABLED(CONFIG_LOONGARCH))
- return true;
+ if (IS_ENABLED(CONFIG_LOONGARCH)) {
+ if (bus == 0)
+ return true;
+ }
return jailhouse_paravirt();
}
--
2.47.3
Each damon_ctx maintains callback requests using a linked list
(damon_ctx->call_controls). When a new callback request is received via
damon_call(), the new request should be added to the list. However, the
function is making a mistake at list_add_tail() invocation: putting the
new item to add and the list head to add it before, in the opposite
order. Because of the linked list manipulation implementation, the new
request can still be reached from the context's list head. But the list
items that were added before the new request are dropped from the list.
As a result, the callbacks are unexpectedly not invocated. Worse yet,
if the dropped callback requests were dynamically allocated, the memory
is leaked. Actually DAMON sysfs interface is using a dynamically
allocated repeat-mode callback request for automatic essential stats
update. And because the online DAMON parameters commit is using
a non-repeat-mode callback request, the issue can easily be reproduced,
like below.
# damo start --damos_action stat --refresh_stat 1s
# damo tune --damos_action stat --refresh_stat 1s
The first command dynamically allocates the repeat-mode callback request
for automatic essential stat update. Users can see the essential stats
are automatically updated for every second, using the sysfs interface.
The second command calls damon_commit() with a new callback request that
was made for the commit. As a result, the previously added repeat-mode
callback request is dropped from the list. The automatic stats refresh
stops working, and the memory for the repeat-mode callback request is
leaked. It can be confirmed using kmemleak.
Fix the mistake on the list_add_tail() call.
Fixes: 004ded6bee11 ("mm/damon: accept parallel damon_call() requests")
Cc: <stable(a)vger.kernel.org> # 6.17.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 417f33a7868e..109b050c795a 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1453,7 +1453,7 @@ int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
INIT_LIST_HEAD(&control->list);
mutex_lock(&ctx->call_controls_lock);
- list_add_tail(&ctx->call_controls, &control->list);
+ list_add_tail(&control->list, &ctx->call_controls);
mutex_unlock(&ctx->call_controls_lock);
if (!damon_is_running(ctx))
return -EINVAL;
base-commit: 49926cfb24ad064c8c26f8652e8a61bbcde37701
--
2.47.3
When building for 32-bit platforms, for which 'size_t' is
'unsigned int', there are a couple instances of -Wformat:
fs/smb/client/misc.c:922:25: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat]
921 | "%s: header is malformed (size is %u, must be %lu)\n",
| ~~~
| %u
922 | __func__, rsp_size, sizeof(*rsp));
| ^~~~~~~~~~~~
fs/smb/client/misc.c:940:5: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat]
938 | "%s: malformed buffer (size is %u, must be at least %lu)\n",
| ~~~
| %u
939 | __func__, rsp_size,
940 | sizeof(*rsp) + *num_of_nodes * sizeof(REFERRAL3));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use the proper 'size_t' format specifier, '%zu', to clear up these
warnings.
Cc: stable(a)vger.kernel.org
Fixes: c1047752ed9f ("cifs: parse_dfs_referrals: prevent oob on malformed input")
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
Feel free to squash this into the original change to make backporting
easier. I included the tags in case rebasing was not an option.
---
fs/smb/client/misc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c
index 987f0ca73123..e10123d8cd7d 100644
--- a/fs/smb/client/misc.c
+++ b/fs/smb/client/misc.c
@@ -918,7 +918,7 @@ parse_dfs_referrals(struct get_dfs_referral_rsp *rsp, u32 rsp_size,
if (rsp_size < sizeof(*rsp)) {
cifs_dbg(VFS | ONCE,
- "%s: header is malformed (size is %u, must be %lu)\n",
+ "%s: header is malformed (size is %u, must be %zu)\n",
__func__, rsp_size, sizeof(*rsp));
rc = -EINVAL;
goto parse_DFS_referrals_exit;
@@ -935,7 +935,7 @@ parse_dfs_referrals(struct get_dfs_referral_rsp *rsp, u32 rsp_size,
if (sizeof(*rsp) + *num_of_nodes * sizeof(REFERRAL3) > rsp_size) {
cifs_dbg(VFS | ONCE,
- "%s: malformed buffer (size is %u, must be at least %lu)\n",
+ "%s: malformed buffer (size is %u, must be at least %zu)\n",
__func__, rsp_size,
sizeof(*rsp) + *num_of_nodes * sizeof(REFERRAL3));
rc = -EINVAL;
---
base-commit: 4e47319b091f90d5776efe96d6c198c139f34883
change-id: 20251014-smb-client-fix-wformat-32b-parse_dfs_referrals-189b8c6fdf75
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 3d3c4cd5c62f24bb3cb4511b7a95df707635e00a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101349-reptile-seldom-427d@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3d3c4cd5c62f24bb3cb4511b7a95df707635e00a Mon Sep 17 00:00:00 2001
From: Oleksij Rempel <o.rempel(a)pengutronix.de>
Date: Sun, 5 Oct 2025 10:12:03 +0200
Subject: [PATCH] net: usb: asix: hold PM usage ref to avoid PM/MDIO + RTNL
deadlock
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Prevent USB runtime PM (autosuspend) for AX88772* in bind.
usbnet enables runtime PM (autosuspend) by default, so disabling it via
the usb_driver flag is ineffective. On AX88772B, autosuspend shows no
measurable power saving with current driver (no link partner, admin
up/down). The ~0.453 W -> ~0.248 W drop on v6.1 comes from phylib powering
the PHY off on admin-down, not from USB autosuspend.
The real hazard is that with runtime PM enabled, ndo_open() (under RTNL)
may synchronously trigger autoresume (usb_autopm_get_interface()) into
asix_resume() while the USB PM lock is held. Resume paths then invoke
phylink/phylib and MDIO, which also expect RTNL, leading to possible
deadlocks or PM lock vs MDIO wake issues.
To avoid this, keep the device runtime-PM active by taking a usage
reference in ax88772_bind() and dropping it in unbind(). A non-zero PM
usage count blocks runtime suspend regardless of userspace policy
(.../power/control - pm_runtime_allow/forbid), making this approach
robust against sysfs overrides.
Holding a runtime-PM usage ref does not affect system-wide suspend;
system sleep/resume callbacks continue to run as before.
Fixes: 4a2c7217cd5a ("net: usb: asix: ax88772: manage PHY PM from MAC")
Reported-by: Hubert Wiśniewski <hubert.wisniewski.25632(a)gmail.com>
Closes: https://lore.kernel.org/all/DCGHG5UJT9G3.2K1GHFZ3H87T0@gmail.com
Tested-by: Hubert Wiśniewski <hubert.wisniewski.25632(a)gmail.com>
Reported-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Closes: https://lore.kernel.org/all/b5ea8296-f981-445d-a09a-2f389d7f6fdd@samsung.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Oleksij Rempel <o.rempel(a)pengutronix.de>
Link: https://patch.msgid.link/20251005081203.3067982-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index 792ddda1ad49..85bd5d845409 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -625,6 +625,21 @@ static void ax88772_suspend(struct usbnet *dev)
asix_read_medium_status(dev, 1));
}
+/* Notes on PM callbacks and locking context:
+ *
+ * - asix_suspend()/asix_resume() are invoked for both runtime PM and
+ * system-wide suspend/resume. For struct usb_driver the ->resume()
+ * callback does not receive pm_message_t, so the resume type cannot
+ * be distinguished here.
+ *
+ * - The MAC driver must hold RTNL when calling phylink interfaces such as
+ * phylink_suspend()/resume(). Those calls will also perform MDIO I/O.
+ *
+ * - Taking RTNL and doing MDIO from a runtime-PM resume callback (while
+ * the USB PM lock is held) is fragile. Since autosuspend brings no
+ * measurable power saving here, we block it by holding a PM usage
+ * reference in ax88772_bind().
+ */
static int asix_suspend(struct usb_interface *intf, pm_message_t message)
{
struct usbnet *dev = usb_get_intfdata(intf);
@@ -919,6 +934,13 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
if (ret)
goto initphy_err;
+ /* Keep this interface runtime-PM active by taking a usage ref.
+ * Prevents runtime suspend while bound and avoids resume paths
+ * that could deadlock (autoresume under RTNL while USB PM lock
+ * is held, phylink/MDIO wants RTNL).
+ */
+ pm_runtime_get_noresume(&intf->dev);
+
return 0;
initphy_err:
@@ -948,6 +970,8 @@ static void ax88772_unbind(struct usbnet *dev, struct usb_interface *intf)
phylink_destroy(priv->phylink);
ax88772_mdio_unregister(priv);
asix_rx_fixup_common_free(dev->driver_priv);
+ /* Drop the PM usage ref taken in bind() */
+ pm_runtime_put(&intf->dev);
}
static void ax88178_unbind(struct usbnet *dev, struct usb_interface *intf)
@@ -1600,6 +1624,11 @@ static struct usb_driver asix_driver = {
.resume = asix_resume,
.reset_resume = asix_resume,
.disconnect = usbnet_disconnect,
+ /* usbnet enables autosuspend by default (supports_autosuspend=1).
+ * We keep runtime-PM active for AX88772* by taking a PM usage
+ * reference in ax88772_bind() (pm_runtime_get_noresume()) and
+ * dropping it in unbind(), which effectively blocks autosuspend.
+ */
.supports_autosuspend = 1,
.disable_hub_initiated_lpm = 1,
};
Reduce la rotación y retén a tu talento clave
body {
margin: 0;
padding: 0;
font-family: Arial, Helvetica, sans-serif;
font-size: 14px;
color: #333;
background-color: #ffffff;
}
table {
border-spacing: 0;
width: 100%;
max-width: 600px;
margin: auto;
}
td {
padding: 12px 20px;
}
a {
color: #1a73e8;
text-decoration: none;
}
.footer {
font-size: 12px;
color: #888888;
text-align: center;
}
Evita la rotación de personal y retén talento clave con PsicoSmart.
Hola, ,
¿Te ha pasado que tus mejores empleados renuncian poco después de ser contratados, dejando vacantes y generando costos inesperados?
Confiar únicamente en entrevistas o currículums no siempre garantiza que un candidato encaje en tu empresa. Por eso quiero presentarte PsicoSmart, una herramienta que ayuda a tomar decisiones de selección basadas en datos, reduciendo sorpresas y rotación de personal.
Con PsicoSmart puedes:
Evaluar 31 competencias psicométricas, incluyendo liderazgo, comunicación, honestidad e inteligencia emocional.
Validar conocimientos técnicos con más de 2,500 exámenes especializados.
Verificar la identidad de quien responde mediante captura fotográfica automática durante la evaluación.
Gestionar todo desde una sola plataforma, accesible desde cualquier dispositivo.
Reducir la rotación y retener talento clave está al alcance de un clic. Si quieres conocer más, puedes responder este correo o contactarme directamente, mis datos están abajo.
Saludos,
--------------
Atte.: Valeria Pérez
Ciudad de México: (55) 5018 0565
WhatsApp: +52 33 1607 2089
Si no deseas recibir más correos, haz clic aquí para darte de baja.
Para remover su dirección de esta lista haga <a href="https://s1.arrobamail.com/unsuscribe.php?id=yiwtsrewispotseup">click aquí</a>
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 85afa9ea122dd9d4a2ead104a951d318975dcd25
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101341-gallon-ungloved-bc19@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85afa9ea122dd9d4a2ead104a951d318975dcd25 Mon Sep 17 00:00:00 2001
From: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Date: Tue, 16 Sep 2025 11:57:56 +0900
Subject: [PATCH] PCI: endpoint: pci-epf-test: Add NULL check for DMA channels
before release
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The fields dma_chan_tx and dma_chan_rx of the struct pci_epf_test can be
NULL even after EPF initialization. Then it is prudent to check that
they have non-NULL values before releasing the channels. Add the checks
in pci_epf_test_clean_dma_chan().
Without the checks, NULL pointer dereferences happen and they can lead
to a kernel panic in some cases:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
Call trace:
dma_release_channel+0x2c/0x120 (P)
pci_epf_test_epc_deinit+0x94/0xc0 [pci_epf_test]
pci_epc_deinit_notify+0x74/0xc0
tegra_pcie_ep_pex_rst_irq+0x250/0x5d8
irq_thread_fn+0x34/0xb8
irq_thread+0x18c/0x2e8
kthread+0x14c/0x210
ret_from_fork+0x10/0x20
Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Fixes: 5ebf3fc59bd2 ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
[mani: trimmed the stack trace]
Signed-off-by: Manivannan Sadhasivam <mani(a)kernel.org>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/20250916025756.34807-1-shinichiro.kawasaki@wdc.com
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 09e1b8b46b55..31617772ad51 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -301,15 +301,20 @@ static void pci_epf_test_clean_dma_chan(struct pci_epf_test *epf_test)
if (!epf_test->dma_supported)
return;
- dma_release_channel(epf_test->dma_chan_tx);
- if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
+ if (epf_test->dma_chan_tx) {
+ dma_release_channel(epf_test->dma_chan_tx);
+ if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
+ epf_test->dma_chan_tx = NULL;
+ epf_test->dma_chan_rx = NULL;
+ return;
+ }
epf_test->dma_chan_tx = NULL;
- epf_test->dma_chan_rx = NULL;
- return;
}
- dma_release_channel(epf_test->dma_chan_rx);
- epf_test->dma_chan_rx = NULL;
+ if (epf_test->dma_chan_rx) {
+ dma_release_channel(epf_test->dma_chan_rx);
+ epf_test->dma_chan_rx = NULL;
+ }
}
static void pci_epf_test_print_rate(struct pci_epf_test *epf_test,
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 85afa9ea122dd9d4a2ead104a951d318975dcd25
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101340-passably-rounding-59df@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85afa9ea122dd9d4a2ead104a951d318975dcd25 Mon Sep 17 00:00:00 2001
From: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Date: Tue, 16 Sep 2025 11:57:56 +0900
Subject: [PATCH] PCI: endpoint: pci-epf-test: Add NULL check for DMA channels
before release
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The fields dma_chan_tx and dma_chan_rx of the struct pci_epf_test can be
NULL even after EPF initialization. Then it is prudent to check that
they have non-NULL values before releasing the channels. Add the checks
in pci_epf_test_clean_dma_chan().
Without the checks, NULL pointer dereferences happen and they can lead
to a kernel panic in some cases:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
Call trace:
dma_release_channel+0x2c/0x120 (P)
pci_epf_test_epc_deinit+0x94/0xc0 [pci_epf_test]
pci_epc_deinit_notify+0x74/0xc0
tegra_pcie_ep_pex_rst_irq+0x250/0x5d8
irq_thread_fn+0x34/0xb8
irq_thread+0x18c/0x2e8
kthread+0x14c/0x210
ret_from_fork+0x10/0x20
Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Fixes: 5ebf3fc59bd2 ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
[mani: trimmed the stack trace]
Signed-off-by: Manivannan Sadhasivam <mani(a)kernel.org>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/20250916025756.34807-1-shinichiro.kawasaki@wdc.com
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 09e1b8b46b55..31617772ad51 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -301,15 +301,20 @@ static void pci_epf_test_clean_dma_chan(struct pci_epf_test *epf_test)
if (!epf_test->dma_supported)
return;
- dma_release_channel(epf_test->dma_chan_tx);
- if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
+ if (epf_test->dma_chan_tx) {
+ dma_release_channel(epf_test->dma_chan_tx);
+ if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
+ epf_test->dma_chan_tx = NULL;
+ epf_test->dma_chan_rx = NULL;
+ return;
+ }
epf_test->dma_chan_tx = NULL;
- epf_test->dma_chan_rx = NULL;
- return;
}
- dma_release_channel(epf_test->dma_chan_rx);
- epf_test->dma_chan_rx = NULL;
+ if (epf_test->dma_chan_rx) {
+ dma_release_channel(epf_test->dma_chan_rx);
+ epf_test->dma_chan_rx = NULL;
+ }
}
static void pci_epf_test_print_rate(struct pci_epf_test *epf_test,
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 85afa9ea122dd9d4a2ead104a951d318975dcd25
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101340-marbled-uneven-a896@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85afa9ea122dd9d4a2ead104a951d318975dcd25 Mon Sep 17 00:00:00 2001
From: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Date: Tue, 16 Sep 2025 11:57:56 +0900
Subject: [PATCH] PCI: endpoint: pci-epf-test: Add NULL check for DMA channels
before release
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The fields dma_chan_tx and dma_chan_rx of the struct pci_epf_test can be
NULL even after EPF initialization. Then it is prudent to check that
they have non-NULL values before releasing the channels. Add the checks
in pci_epf_test_clean_dma_chan().
Without the checks, NULL pointer dereferences happen and they can lead
to a kernel panic in some cases:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
Call trace:
dma_release_channel+0x2c/0x120 (P)
pci_epf_test_epc_deinit+0x94/0xc0 [pci_epf_test]
pci_epc_deinit_notify+0x74/0xc0
tegra_pcie_ep_pex_rst_irq+0x250/0x5d8
irq_thread_fn+0x34/0xb8
irq_thread+0x18c/0x2e8
kthread+0x14c/0x210
ret_from_fork+0x10/0x20
Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Fixes: 5ebf3fc59bd2 ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
[mani: trimmed the stack trace]
Signed-off-by: Manivannan Sadhasivam <mani(a)kernel.org>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/20250916025756.34807-1-shinichiro.kawasaki@wdc.com
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 09e1b8b46b55..31617772ad51 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -301,15 +301,20 @@ static void pci_epf_test_clean_dma_chan(struct pci_epf_test *epf_test)
if (!epf_test->dma_supported)
return;
- dma_release_channel(epf_test->dma_chan_tx);
- if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
+ if (epf_test->dma_chan_tx) {
+ dma_release_channel(epf_test->dma_chan_tx);
+ if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
+ epf_test->dma_chan_tx = NULL;
+ epf_test->dma_chan_rx = NULL;
+ return;
+ }
epf_test->dma_chan_tx = NULL;
- epf_test->dma_chan_rx = NULL;
- return;
}
- dma_release_channel(epf_test->dma_chan_rx);
- epf_test->dma_chan_rx = NULL;
+ if (epf_test->dma_chan_rx) {
+ dma_release_channel(epf_test->dma_chan_rx);
+ epf_test->dma_chan_rx = NULL;
+ }
}
static void pci_epf_test_print_rate(struct pci_epf_test *epf_test,
When fsl_edma_alloc_chan_resources() fails after clk_prepare_enable(),
the error paths only free IRQs and destroy the TCD pool, but forget to
call clk_disable_unprepare(). This causes the channel clock to remain
enabled, leaking power and resources.
Fix it by disabling the channel clock in the error unwind path.
Fixes: d8d4355861d8 ("dmaengine: fsl-edma: add i.MX8ULP edma support")
Cc: stable(a)vger.kernel.org
Suggested-by: Frank Li <Frank.Li(a)nxp.com>
Signed-off-by: Zhen Ni <zhen.ni(a)easystack.cn>
---
Changes in v2:
- Remove FSL_EDMA_DRV_HAS_CHCLK check
Changes in v3:
- Remove cleanup
Changes in v4:
- Re-send as a new thread
---
drivers/dma/fsl-edma-common.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
index 4976d7dde080..11655dcc4d6c 100644
--- a/drivers/dma/fsl-edma-common.c
+++ b/drivers/dma/fsl-edma-common.c
@@ -852,6 +852,7 @@ int fsl_edma_alloc_chan_resources(struct dma_chan *chan)
free_irq(fsl_chan->txirq, fsl_chan);
err_txirq:
dma_pool_destroy(fsl_chan->tcd_pool);
+ clk_disable_unprepare(fsl_chan->clk);
return ret;
}
--
2.20.1
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 4d6fc29f36341d7795db1d1819b4c15fe9be7b23
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101300-backspace-pulmonary-5620@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4d6fc29f36341d7795db1d1819b4c15fe9be7b23 Mon Sep 17 00:00:00 2001
From: Donet Tom <donettom(a)linux.ibm.com>
Date: Wed, 24 Sep 2025 00:16:59 +0530
Subject: [PATCH] mm/ksm: fix incorrect KSM counter handling in mm_struct
during fork
Patch series "mm/ksm: Fix incorrect accounting of KSM counters during
fork", v3.
The first patch in this series fixes the incorrect accounting of KSM
counters such as ksm_merging_pages, ksm_rmap_items, and the global
ksm_zero_pages during fork.
The following patch add a selftest to verify the ksm_merging_pages counter
was updated correctly during fork.
Test Results
============
Without the first patch
-----------------------
# [RUN] test_fork_ksm_merging_page_count
not ok 10 ksm_merging_page in child: 32
With the first patch
--------------------
# [RUN] test_fork_ksm_merging_page_count
ok 10 ksm_merging_pages is not inherited after fork
This patch (of 2):
Currently, the KSM-related counters in `mm_struct`, such as
`ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are inherited
by the child process during fork. This results in inconsistent
accounting.
When a process uses KSM, identical pages are merged and an rmap item is
created for each merged page. The `ksm_merging_pages` and
`ksm_rmap_items` counters are updated accordingly. However, after a fork,
these counters are copied to the child while the corresponding rmap items
are not. As a result, when the child later triggers an unmerge, there are
no rmap items present in the child, so the counters remain stale, leading
to incorrect accounting.
A similar issue exists with `ksm_zero_pages`, which maintains both a
global counter and a per-process counter. During fork, the per-process
counter is inherited by the child, but the global counter is not
incremented. Since the child also references zero pages, the global
counter should be updated as well. Otherwise, during zero-page unmerge,
both the global and per-process counters are decremented, causing the
global counter to become inconsistent.
To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0 during
fork, and the global ksm_zero_pages counter is updated with the
per-process ksm_zero_pages value inherited by the child. This ensures
that KSM statistics remain accurate and reflect the activity of each
process correctly.
Link: https://lkml.kernel.org/r/cover.1758648700.git.donettom@linux.ibm.com
Link: https://lkml.kernel.org/r/7b9870eb67ccc0d79593940d9dbd4a0b39b5d396.17586487…
Fixes: 7609385337a4 ("ksm: count ksm merging pages for each process")
Fixes: cb4df4cae4f2 ("ksm: count allocated ksm rmap_items for each process")
Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM")
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Aboorva Devarajan <aboorvad(a)linux.ibm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Donet Tom <donettom(a)linux.ibm.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list(a)gmail.com>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: xu xin <xu.xin16(a)zte.com.cn>
Cc: <stable(a)vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 22e67ca7cba3..067538fc4d58 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -56,8 +56,14 @@ static inline long mm_ksm_zero_pages(struct mm_struct *mm)
static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
/* Adding mm to ksm is best effort on fork. */
- if (mm_flags_test(MMF_VM_MERGEABLE, oldmm))
+ if (mm_flags_test(MMF_VM_MERGEABLE, oldmm)) {
+ long nr_ksm_zero_pages = atomic_long_read(&mm->ksm_zero_pages);
+
+ mm->ksm_merging_pages = 0;
+ mm->ksm_rmap_items = 0;
+ atomic_long_add(nr_ksm_zero_pages, &ksm_zero_pages);
__ksm_enter(mm);
+ }
}
static inline int ksm_execve(struct mm_struct *mm)
fsnotify_mmap_perm() requires a byte offset for the file about to be
mmap'ed. But it is called from vm_mmap_pgoff(), which has a page offset.
Previously the conversion was done incorrectly so let's fix it, being
careful not to overflow on 32-bit platforms.
Discovered during code review.
Cc: <stable(a)vger.kernel.org>
Fixes: 066e053fe208 ("fsnotify: add pre-content hooks on mmap()")
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---
Applies against today's mm-unstable (aa05a436eca8).
Thanks,
Ryan
mm/util.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/util.c b/mm/util.c
index 6c1d64ed0221..8989d5767528 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -566,6 +566,7 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
unsigned long len, unsigned long prot,
unsigned long flag, unsigned long pgoff)
{
+ loff_t off = (loff_t)pgoff << PAGE_SHIFT;
unsigned long ret;
struct mm_struct *mm = current->mm;
unsigned long populate;
@@ -573,7 +574,7 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
ret = security_mmap_file(file, prot, flag);
if (!ret)
- ret = fsnotify_mmap_perm(file, prot, pgoff >> PAGE_SHIFT, len);
+ ret = fsnotify_mmap_perm(file, prot, off, len);
if (!ret) {
if (mmap_write_lock_killable(mm))
return -EINTR;
--
2.43.0
Since commits
7b9eb53e8591 ("media: cx18: Access v4l2_fh from file")
9ba9d11544f9 ("media: ivtv: Access v4l2_fh from file")
All the ioctl handlers access their private data structures
from file *
The ivtv and cx18 drivers call the ioctl handlers from their
DVB layer without a valid file *, causing invalid memory access.
The issue has been reported by smatch in
"[bug report] media: cx18: Access v4l2_fh from file"
Fix this by providing wrappers for the ioctl handlers to be
used by the DVB layer that do not require a valid file *.
Signed-off-by: Jacopo Mondi <jacopo.mondi(a)ideasonboard.com>
---
Changes in v5:
- Remove stray link from cx18 patch
- Add Hans' tags
- Link to v4: https://lore.kernel.org/r/20250819-cx18-v4l2-fh-v4-0-9db1635d6787@ideasonbo…
Changes in v4:
- Slightly adjust commit messages
- Link to v3: https://lore.kernel.org/r/20250818-cx18-v4l2-fh-v3-0-5e2f08f3cadc@ideasonbo…
Changes in v3:
- Change helpers to accept the type they're going to operate on instead
of using the open_id wrapper type as suggested by Laurent
- Link to v2: https://lore.kernel.org/r/20250818-cx18-v4l2-fh-v2-0-3f53ce423663@ideasonbo…
Changes in v2:
- Add Cc: stable(a)vger.kernel.org per-patch
---
Jacopo Mondi (2):
media: cx18: Fix invalid access to file *
media: ivtv: Fix invalid access to file *
drivers/media/pci/cx18/cx18-driver.c | 9 +++------
drivers/media/pci/cx18/cx18-ioctl.c | 30 +++++++++++++++++++-----------
drivers/media/pci/cx18/cx18-ioctl.h | 8 +++++---
drivers/media/pci/ivtv/ivtv-driver.c | 11 ++++-------
drivers/media/pci/ivtv/ivtv-ioctl.c | 22 +++++++++++++++++-----
drivers/media/pci/ivtv/ivtv-ioctl.h | 6 ++++--
6 files changed, 52 insertions(+), 34 deletions(-)
---
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
change-id: 20250818-cx18-v4l2-fh-7eaa6199fdde
Best regards,
--
Jacopo Mondi <jacopo.mondi(a)ideasonboard.com>
In the IOMMU Shared Virtual Addressing (SVA) context, the IOMMU hardware
shares and walks the CPU's page tables. The x86 architecture maps the
kernel's virtual address space into the upper portion of every process's
page table. Consequently, in an SVA context, the IOMMU hardware can walk
and cache kernel page table entries.
The Linux kernel currently lacks a notification mechanism for kernel page
table changes, specifically when page table pages are freed and reused.
The IOMMU driver is only notified of changes to user virtual address
mappings. This can cause the IOMMU's internal caches to retain stale
entries for kernel VA.
A Use-After-Free (UAF) and Write-After-Free (WAF) condition arises when
kernel page table pages are freed and later reallocated. The IOMMU could
misinterpret the new data as valid page table entries. The IOMMU might
then walk into attacker-controlled memory, leading to arbitrary physical
memory DMA access or privilege escalation. This is also a Write-After-Free
issue, as the IOMMU will potentially continue to write Accessed and Dirty
bits to the freed memory while attempting to walk the stale page tables.
Currently, SVA contexts are unprivileged and cannot access kernel
mappings. However, the IOMMU will still walk kernel-only page tables
all the way down to the leaf entries, where it realizes the mapping
is for the kernel and errors out. This means the IOMMU still caches
these intermediate page table entries, making the described vulnerability
a real concern.
To mitigate this, a new IOMMU interface is introduced to flush IOTLB
entries for the kernel address space. This interface is invoked from the
x86 architecture code that manages combined user and kernel page tables,
specifically before any kernel page table page is freed and reused.
This addresses the main issue with vfree() which is a common occurrence
and can be triggered by unprivileged users. While this resolves the
primary problem, it doesn't address some extremely rare case related to
memory unplug of memory that was present as reserved memory at boot,
which cannot be triggered by unprivileged users. The discussion can be
found at the link below.
Fixes: 26b25a2b98e4 ("iommu: Bind process address spaces to devices")
Cc: stable(a)vger.kernel.org
Suggested-by: Jann Horn <jannh(a)google.com>
Co-developed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde(a)amd.com>
Reviewed-by: Kevin Tian <kevin.tian(a)intel.com>
Link: https://lore.kernel.org/linux-iommu/04983c62-3b1d-40d4-93ae-34ca04b827e5@in…
---
include/linux/iommu.h | 4 ++++
drivers/iommu/iommu-sva.c | 29 ++++++++++++++++++++++++++++-
mm/pgtable-generic.c | 2 ++
3 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c30d12e16473..66e4abb2df0d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1134,7 +1134,9 @@ struct iommu_sva {
struct iommu_mm_data {
u32 pasid;
+ struct mm_struct *mm;
struct list_head sva_domains;
+ struct list_head mm_list_elm;
};
int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode);
@@ -1615,6 +1617,7 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
struct mm_struct *mm);
void iommu_sva_unbind_device(struct iommu_sva *handle);
u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end);
#else
static inline struct iommu_sva *
iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
@@ -1639,6 +1642,7 @@ static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
}
static inline void mm_pasid_drop(struct mm_struct *mm) {}
+static inline void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end) {}
#endif /* CONFIG_IOMMU_SVA */
#ifdef CONFIG_IOMMU_IOPF
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 1a51cfd82808..d236aef80a8d 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -10,6 +10,8 @@
#include "iommu-priv.h"
static DEFINE_MUTEX(iommu_sva_lock);
+static bool iommu_sva_present;
+static LIST_HEAD(iommu_sva_mms);
static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm);
@@ -42,6 +44,7 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de
return ERR_PTR(-ENOSPC);
}
iommu_mm->pasid = pasid;
+ iommu_mm->mm = mm;
INIT_LIST_HEAD(&iommu_mm->sva_domains);
/*
* Make sure the write to mm->iommu_mm is not reordered in front of
@@ -132,8 +135,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
if (ret)
goto out_free_domain;
domain->users = 1;
- list_add(&domain->next, &mm->iommu_mm->sva_domains);
+ if (list_empty(&iommu_mm->sva_domains)) {
+ if (list_empty(&iommu_sva_mms))
+ iommu_sva_present = true;
+ list_add(&iommu_mm->mm_list_elm, &iommu_sva_mms);
+ }
+ list_add(&domain->next, &iommu_mm->sva_domains);
out:
refcount_set(&handle->users, 1);
mutex_unlock(&iommu_sva_lock);
@@ -175,6 +183,13 @@ void iommu_sva_unbind_device(struct iommu_sva *handle)
list_del(&domain->next);
iommu_domain_free(domain);
}
+
+ if (list_empty(&iommu_mm->sva_domains)) {
+ list_del(&iommu_mm->mm_list_elm);
+ if (list_empty(&iommu_sva_mms))
+ iommu_sva_present = false;
+ }
+
mutex_unlock(&iommu_sva_lock);
kfree(handle);
}
@@ -312,3 +327,15 @@ static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
return domain;
}
+
+void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end)
+{
+ struct iommu_mm_data *iommu_mm;
+
+ guard(mutex)(&iommu_sva_lock);
+ if (!iommu_sva_present)
+ return;
+
+ list_for_each_entry(iommu_mm, &iommu_sva_mms, mm_list_elm)
+ mmu_notifier_arch_invalidate_secondary_tlbs(iommu_mm->mm, start, end);
+}
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 1c7caa8ef164..8c22be79b734 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -13,6 +13,7 @@
#include <linux/swap.h>
#include <linux/swapops.h>
#include <linux/mm_inline.h>
+#include <linux/iommu.h>
#include <asm/pgalloc.h>
#include <asm/tlb.h>
@@ -430,6 +431,7 @@ static void kernel_pgtable_work_func(struct work_struct *work)
list_splice_tail_init(&kernel_pgtable_work.list, &page_list);
spin_unlock(&kernel_pgtable_work.lock);
+ iommu_sva_invalidate_kva_range(PAGE_OFFSET, TLB_FLUSH_ALL);
list_for_each_entry_safe(pt, next, &page_list, pt_list)
__pagetable_free(pt);
}
--
2.43.0
Since commits
7b9eb53e8591 ("media: cx18: Access v4l2_fh from file")
9ba9d11544f9 ("media: ivtv: Access v4l2_fh from file")
All the ioctl handlers access their private data structures
from file *
The ivtv and cx18 drivers call the ioctl handlers from their
DVB layer without a valid file *, causing invalid memory access.
The issue has been reported by smatch in
"[bug report] media: cx18: Access v4l2_fh from file"
Fix this by providing wrappers for the ioctl handlers to be
used by the DVB layer that do not require a valid file *.
Signed-off-by: Jacopo Mondi <jacopo.mondi(a)ideasonboard.com>
---
Changes in v4:
- Slightly adjust commit messages
- Link to v3: https://lore.kernel.org/r/20250818-cx18-v4l2-fh-v3-0-5e2f08f3cadc@ideasonbo…
Changes in v3:
- Change helpers to accept the type they're going to operate on instead
of using the open_id wrapper type as suggested by Laurent
- Link to v2: https://lore.kernel.org/r/20250818-cx18-v4l2-fh-v2-0-3f53ce423663@ideasonbo…
Changes in v2:
- Add Cc: stable(a)vger.kernel.org per-patch
---
Jacopo Mondi (2):
media: cx18: Fix invalid access to file *
media: ivtv: Fix invalid access to file *
drivers/media/pci/cx18/cx18-driver.c | 9 +++------
drivers/media/pci/cx18/cx18-ioctl.c | 30 +++++++++++++++++++-----------
drivers/media/pci/cx18/cx18-ioctl.h | 8 +++++---
drivers/media/pci/ivtv/ivtv-driver.c | 11 ++++-------
drivers/media/pci/ivtv/ivtv-ioctl.c | 22 +++++++++++++++++-----
drivers/media/pci/ivtv/ivtv-ioctl.h | 6 ++++--
6 files changed, 52 insertions(+), 34 deletions(-)
---
base-commit: a75b8d198c55e9eb5feb6f6e155496305caba2dc
change-id: 20250818-cx18-v4l2-fh-7eaa6199fdde
Best regards,
--
Jacopo Mondi <jacopo.mondi(a)ideasonboard.com>
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 88daf2f448aad05a2e6df738d66fe8b0cf85cee0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101324-echo-cardinal-3043@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 88daf2f448aad05a2e6df738d66fe8b0cf85cee0 Mon Sep 17 00:00:00 2001
From: Matvey Kovalev <matvey.kovalev(a)ispras.ru>
Date: Thu, 25 Sep 2025 15:12:34 +0300
Subject: [PATCH] ksmbd: fix error code overwriting in
smb2_get_info_filesystem()
If client doesn't negotiate with SMB3.1.1 POSIX Extensions,
then proper error code won't be returned due to overwriting.
Return error immediately.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: e2f34481b24db ("cifsd: add server-side procedures for SMB3")
Cc: stable(a)vger.kernel.org
Signed-off-by: Matvey Kovalev <matvey.kovalev(a)ispras.ru>
Acked-by: Namjae Jeon <linkinjeon(a)kernel.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
index 0c069eff80b7..133ca5beb7cf 100644
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -5629,7 +5629,8 @@ static int smb2_get_info_filesystem(struct ksmbd_work *work,
if (!work->tcon->posix_extensions) {
pr_err("client doesn't negotiate with SMB3.1.1 POSIX Extensions\n");
- rc = -EOPNOTSUPP;
+ path_put(&path);
+ return -EOPNOTSUPP;
} else {
info = (struct filesystem_posix_info *)(rsp->Buffer);
info->OptimalTransferSize = cpu_to_le32(stfs.f_bsize);
When migrating a balloon page, we first deflate the old page to then
inflate the new page.
However, if inflating the new page succeeded, we effectively deflated
the old page, reducing the balloon size.
In that case, the migration actually worked: similar to migrating+
immediately deflating the new page. The old page will be freed back to
the buddy.
Right now, the core will leave the page be marked as isolated (as
we returned an error). When later trying to putback that page, we will
run into the WARN_ON_ONCE() in balloon_page_putback().
That handling was changed in commit 3544c4faccb8 ("mm/balloon_compaction:
stop using __ClearPageMovable()"); before that change, we would have
tolerated that way of handling it.
To fix it, let's just return 0 in that case, making the core effectively
just clear the "isolated" flag + freeing it back to the buddy as if the
migration succeeded. Note that the new page will also get freed when the
core puts the last reference.
Note that this also makes it all be more consistent: we will no longer
unisolate the page in the balloon driver while keeping it marked as
being isolated in migration core.
This was found by code inspection.
Fixes: 3544c4faccb8 ("mm/balloon_compaction: stop using __ClearPageMovable()")
Cc: <stable(a)vger.kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Jerrin Shaji George <jerrin.shaji-george(a)broadcom.com>
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
I have no easy way to test this, and I assume it happens very very rarely
(inflation during migration failing).
I would prefer this to go through the MM-tree, as I have some follow-up
balloon_compaction reworks also mess with this code.
---
drivers/misc/vmw_balloon.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 6df51ee8db621..cc1d18b3df5ca 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1737,7 +1737,7 @@ static int vmballoon_migratepage(struct balloon_dev_info *b_dev_info,
{
unsigned long status, flags;
struct vmballoon *b;
- int ret;
+ int ret = 0;
b = container_of(b_dev_info, struct vmballoon, b_dev_info);
@@ -1796,17 +1796,15 @@ static int vmballoon_migratepage(struct balloon_dev_info *b_dev_info,
* A failure happened. While we can deflate the page we just
* inflated, this deflation can also encounter an error. Instead
* we will decrease the size of the balloon to reflect the
- * change and report failure.
+ * change.
*/
atomic64_dec(&b->size);
- ret = -EBUSY;
} else {
/*
* Success. Take a reference for the page, and we will add it to
* the list after acquiring the lock.
*/
get_page(newpage);
- ret = 0;
}
/* Update the balloon list under the @pages_lock */
@@ -1817,7 +1815,7 @@ static int vmballoon_migratepage(struct balloon_dev_info *b_dev_info,
* If we succeed just insert it to the list and update the statistics
* under the lock.
*/
- if (!ret) {
+ if (status == VMW_BALLOON_SUCCESS) {
balloon_page_insert(&b->b_dev_info, newpage);
__count_vm_event(BALLOON_MIGRATE);
}
base-commit: 1c58c31dc83e39f4790ae778626b6b8b59bc0db8
--
2.51.0
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 64cf7d058a005c5c31eb8a0b741f35dc12915d18
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101354-eject-groove-319c@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64cf7d058a005c5c31eb8a0b741f35dc12915d18 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Wed, 8 Oct 2025 12:45:10 -0400
Subject: [PATCH] tracing: Have trace_marker use per-cpu data to read user
space
It was reported that using __copy_from_user_inatomic() can actually
schedule. Which is bad when preemption is disabled. Even though there's
logic to check in_atomic() is set, but this is a nop when the kernel is
configured with PREEMPT_NONE. This is due to page faulting and the code
could schedule with preemption disabled.
Link: https://lore.kernel.org/all/20250819105152.2766363-1-luogengkun@huaweicloud…
The solution was to change the __copy_from_user_inatomic() to
copy_from_user_nofault(). But then it was reported that this caused a
regression in Android. There's several applications writing into
trace_marker() in Android, but now instead of showing the expected data,
it is showing:
tracing_mark_write: <faulted>
After reverting the conversion to copy_from_user_nofault(), Android was
able to get the data again.
Writes to the trace_marker is a way to efficiently and quickly enter data
into the Linux tracing buffer. It takes no locks and was designed to be as
non-intrusive as possible. This means it cannot allocate memory, and must
use pre-allocated data.
A method that is actively being worked on to have faultable system call
tracepoints read user space data is to allocate per CPU buffers, and use
them in the callback. The method uses a technique similar to seqcount.
That is something like this:
preempt_disable();
cpu = smp_processor_id();
buffer = this_cpu_ptr(&pre_allocated_cpu_buffers, cpu);
do {
cnt = nr_context_switches_cpu(cpu);
migrate_disable();
preempt_enable();
ret = copy_from_user(buffer, ptr, size);
preempt_disable();
migrate_enable();
} while (!ret && cnt != nr_context_switches_cpu(cpu));
if (!ret)
ring_buffer_write(buffer);
preempt_enable();
It's a little more involved than that, but the above is the basic logic.
The idea is to acquire the current CPU buffer, disable migration, and then
enable preemption. At this moment, it can safely use copy_from_user().
After reading the data from user space, it disables preemption again. It
then checks to see if there was any new scheduling on this CPU. If there
was, it must assume that the buffer was corrupted by another task. If
there wasn't, then the buffer is still valid as only tasks in preemptable
context can write to this buffer and only those that are running on the
CPU.
By using this method, where trace_marker open allocates the per CPU
buffers, trace_marker writes can access user space and even fault it in,
without having to allocate or take any locks of its own.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Luo Gengkun <luogengkun(a)huaweicloud.com>
Cc: Wattson CI <wattson-external(a)google.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Link: https://lore.kernel.org/20251008124510.6dba541a@gandalf.local.home
Fixes: 3d62ab32df065 ("tracing: Fix tracing_marker may trigger page fault during preempt_disable")
Reported-by: Runping Lai <runpinglai(a)google.com>
Tested-by: Runping Lai <runpinglai(a)google.com>
Closes: https://lore.kernel.org/linux-trace-kernel/20251007003417.3470979-2-runping…
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b3c94fbaf002..0fd582651293 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4791,12 +4791,6 @@ int tracing_single_release_file_tr(struct inode *inode, struct file *filp)
return single_release(inode, filp);
}
-static int tracing_mark_open(struct inode *inode, struct file *filp)
-{
- stream_open(inode, filp);
- return tracing_open_generic_tr(inode, filp);
-}
-
static int tracing_release(struct inode *inode, struct file *file)
{
struct trace_array *tr = inode->i_private;
@@ -7163,7 +7157,7 @@ tracing_free_buffer_release(struct inode *inode, struct file *filp)
#define TRACE_MARKER_MAX_SIZE 4096
-static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user *ubuf,
+static ssize_t write_marker_to_buffer(struct trace_array *tr, const char *buf,
size_t cnt, unsigned long ip)
{
struct ring_buffer_event *event;
@@ -7173,20 +7167,11 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
int meta_size;
ssize_t written;
size_t size;
- int len;
-
-/* Used in tracing_mark_raw_write() as well */
-#define FAULTED_STR "<faulted>"
-#define FAULTED_SIZE (sizeof(FAULTED_STR) - 1) /* '\0' is already accounted for */
meta_size = sizeof(*entry) + 2; /* add '\0' and possible '\n' */
again:
size = cnt + meta_size;
- /* If less than "<faulted>", then make sure we can still add that */
- if (cnt < FAULTED_SIZE)
- size += FAULTED_SIZE - cnt;
-
buffer = tr->array_buffer.buffer;
event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
tracing_gen_ctx());
@@ -7196,9 +7181,6 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
* make it smaller and try again.
*/
if (size > ring_buffer_max_event_size(buffer)) {
- /* cnt < FAULTED size should never be bigger than max */
- if (WARN_ON_ONCE(cnt < FAULTED_SIZE))
- return -EBADF;
cnt = ring_buffer_max_event_size(buffer) - meta_size;
/* The above should only happen once */
if (WARN_ON_ONCE(cnt + meta_size == size))
@@ -7212,14 +7194,8 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
entry = ring_buffer_event_data(event);
entry->ip = ip;
-
- len = copy_from_user_nofault(&entry->buf, ubuf, cnt);
- if (len) {
- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
- cnt = FAULTED_SIZE;
- written = -EFAULT;
- } else
- written = cnt;
+ memcpy(&entry->buf, buf, cnt);
+ written = cnt;
if (tr->trace_marker_file && !list_empty(&tr->trace_marker_file->triggers)) {
/* do not add \n before testing triggers, but add \0 */
@@ -7243,6 +7219,169 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
return written;
}
+struct trace_user_buf {
+ char *buf;
+};
+
+struct trace_user_buf_info {
+ struct trace_user_buf __percpu *tbuf;
+ int ref;
+};
+
+
+static DEFINE_MUTEX(trace_user_buffer_mutex);
+static struct trace_user_buf_info *trace_user_buffer;
+
+static void trace_user_fault_buffer_free(struct trace_user_buf_info *tinfo)
+{
+ char *buf;
+ int cpu;
+
+ for_each_possible_cpu(cpu) {
+ buf = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+ kfree(buf);
+ }
+ free_percpu(tinfo->tbuf);
+ kfree(tinfo);
+}
+
+static int trace_user_fault_buffer_enable(void)
+{
+ struct trace_user_buf_info *tinfo;
+ char *buf;
+ int cpu;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ if (trace_user_buffer) {
+ trace_user_buffer->ref++;
+ return 0;
+ }
+
+ tinfo = kmalloc(sizeof(*tinfo), GFP_KERNEL);
+ if (!tinfo)
+ return -ENOMEM;
+
+ tinfo->tbuf = alloc_percpu(struct trace_user_buf);
+ if (!tinfo->tbuf) {
+ kfree(tinfo);
+ return -ENOMEM;
+ }
+
+ tinfo->ref = 1;
+
+ /* Clear each buffer in case of error */
+ for_each_possible_cpu(cpu) {
+ per_cpu_ptr(tinfo->tbuf, cpu)->buf = NULL;
+ }
+
+ for_each_possible_cpu(cpu) {
+ buf = kmalloc_node(TRACE_MARKER_MAX_SIZE, GFP_KERNEL,
+ cpu_to_node(cpu));
+ if (!buf) {
+ trace_user_fault_buffer_free(tinfo);
+ return -ENOMEM;
+ }
+ per_cpu_ptr(tinfo->tbuf, cpu)->buf = buf;
+ }
+
+ trace_user_buffer = tinfo;
+
+ return 0;
+}
+
+static void trace_user_fault_buffer_disable(void)
+{
+ struct trace_user_buf_info *tinfo;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ tinfo = trace_user_buffer;
+
+ if (WARN_ON_ONCE(!tinfo))
+ return;
+
+ if (--tinfo->ref)
+ return;
+
+ trace_user_fault_buffer_free(tinfo);
+ trace_user_buffer = NULL;
+}
+
+/* Must be called with preemption disabled */
+static char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
+ const char __user *ptr, size_t size,
+ size_t *read_size)
+{
+ int cpu = smp_processor_id();
+ char *buffer = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+ unsigned int cnt;
+ int trys = 0;
+ int ret;
+
+ if (size > TRACE_MARKER_MAX_SIZE)
+ size = TRACE_MARKER_MAX_SIZE;
+ *read_size = 0;
+
+ /*
+ * This acts similar to a seqcount. The per CPU context switches are
+ * recorded, migration is disabled and preemption is enabled. The
+ * read of the user space memory is copied into the per CPU buffer.
+ * Preemption is disabled again, and if the per CPU context switches count
+ * is still the same, it means the buffer has not been corrupted.
+ * If the count is different, it is assumed the buffer is corrupted
+ * and reading must be tried again.
+ */
+
+ do {
+ /*
+ * If for some reason, copy_from_user() always causes a context
+ * switch, this would then cause an infinite loop.
+ * If this task is preempted by another user space task, it
+ * will cause this task to try again. But just in case something
+ * changes where the copying from user space causes another task
+ * to run, prevent this from going into an infinite loop.
+ * 100 tries should be plenty.
+ */
+ if (WARN_ONCE(trys++ > 100, "Error: Too many tries to read user space"))
+ return NULL;
+
+ /* Read the current CPU context switch counter */
+ cnt = nr_context_switches_cpu(cpu);
+
+ /*
+ * Preemption is going to be enabled, but this task must
+ * remain on this CPU.
+ */
+ migrate_disable();
+
+ /*
+ * Now preemption is being enabed and another task can come in
+ * and use the same buffer and corrupt our data.
+ */
+ preempt_enable_notrace();
+
+ ret = __copy_from_user(buffer, ptr, size);
+
+ preempt_disable_notrace();
+ migrate_enable();
+
+ /* if it faulted, no need to test if the buffer was corrupted */
+ if (ret)
+ return NULL;
+
+ /*
+ * Preemption is disabled again, now check the per CPU context
+ * switch counter. If it doesn't match, then another user space
+ * process may have schedule in and corrupted our buffer. In that
+ * case the copying must be retried.
+ */
+ } while (nr_context_switches_cpu(cpu) != cnt);
+
+ *read_size = size;
+ return buffer;
+}
+
static ssize_t
tracing_mark_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *fpos)
@@ -7250,6 +7389,8 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
unsigned long ip;
+ size_t size;
+ char *buf;
if (tracing_disabled)
return -EINVAL;
@@ -7263,6 +7404,16 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (cnt > TRACE_MARKER_MAX_SIZE)
cnt = TRACE_MARKER_MAX_SIZE;
+ /* Must have preemption disabled while having access to the buffer */
+ guard(preempt_notrace)();
+
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ if (!buf)
+ return -EFAULT;
+
+ if (cnt > size)
+ cnt = size;
+
/* The selftests expect this function to be the IP address */
ip = _THIS_IP_;
@@ -7270,32 +7421,27 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (tr == &global_trace) {
guard(rcu)();
list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
+ written = write_marker_to_buffer(tr, buf, cnt, ip);
if (written < 0)
break;
}
} else {
- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
+ written = write_marker_to_buffer(tr, buf, cnt, ip);
}
return written;
}
static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
- const char __user *ubuf, size_t cnt)
+ const char *buf, size_t cnt)
{
struct ring_buffer_event *event;
struct trace_buffer *buffer;
struct raw_data_entry *entry;
ssize_t written;
- int size;
- int len;
-
-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
+ size_t size;
size = sizeof(*entry) + cnt;
- if (cnt < FAULT_SIZE_ID)
- size += FAULT_SIZE_ID - cnt;
buffer = tr->array_buffer.buffer;
@@ -7309,14 +7455,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
return -EBADF;
entry = ring_buffer_event_data(event);
-
- len = copy_from_user_nofault(&entry->id, ubuf, cnt);
- if (len) {
- entry->id = -1;
- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
- written = -EFAULT;
- } else
- written = cnt;
+ memcpy(&entry->id, buf, cnt);
+ written = cnt;
__buffer_unlock_commit(buffer, event);
@@ -7329,8 +7469,8 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
{
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
-
-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
+ size_t size;
+ char *buf;
if (tracing_disabled)
return -EINVAL;
@@ -7342,6 +7482,17 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
if (cnt < sizeof(unsigned int))
return -EINVAL;
+ /* Must have preemption disabled while having access to the buffer */
+ guard(preempt_notrace)();
+
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ if (!buf)
+ return -EFAULT;
+
+ /* raw write is all or nothing */
+ if (cnt > size)
+ return -EINVAL;
+
/* The global trace_marker_raw can go to multiple instances */
if (tr == &global_trace) {
guard(rcu)();
@@ -7357,6 +7508,27 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
return written;
}
+static int tracing_mark_open(struct inode *inode, struct file *filp)
+{
+ int ret;
+
+ ret = trace_user_fault_buffer_enable();
+ if (ret < 0)
+ return ret;
+
+ stream_open(inode, filp);
+ ret = tracing_open_generic_tr(inode, filp);
+ if (ret < 0)
+ trace_user_fault_buffer_disable();
+ return ret;
+}
+
+static int tracing_mark_release(struct inode *inode, struct file *file)
+{
+ trace_user_fault_buffer_disable();
+ return tracing_release_generic_tr(inode, file);
+}
+
static int tracing_clock_show(struct seq_file *m, void *v)
{
struct trace_array *tr = m->private;
@@ -7764,13 +7936,13 @@ static const struct file_operations tracing_free_buffer_fops = {
static const struct file_operations tracing_mark_fops = {
.open = tracing_mark_open,
.write = tracing_mark_write,
- .release = tracing_release_generic_tr,
+ .release = tracing_mark_release,
};
static const struct file_operations tracing_mark_raw_fops = {
.open = tracing_mark_open,
.write = tracing_mark_raw_write,
- .release = tracing_release_generic_tr,
+ .release = tracing_mark_release,
};
static const struct file_operations trace_clock_fops = {
From: Lance Yang <lance.yang(a)linux.dev>
When splitting an mTHP and replacing a zero-filled subpage with the shared
zeropage, try_to_map_unused_to_zeropage() currently drops several important
PTE bits.
For userspace tools like CRIU, which rely on the soft-dirty mechanism for
incremental snapshots, losing the soft-dirty bit means modified pages are
missed, leading to inconsistent memory state after restore.
As pointed out by David, the more critical uffd-wp bit is also dropped.
This breaks the userfaultfd write-protection mechanism, causing writes
to be silently missed by monitoring applications, which can lead to data
corruption.
Preserve both the soft-dirty and uffd-wp bits from the old PTE when
creating the new zeropage mapping to ensure they are correctly tracked.
Cc: <stable(a)vger.kernel.org>
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Suggested-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: Dev Jain <dev.jain(a)arm.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Dev Jain <dev.jain(a)arm.com>
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
---
v4 -> v5:
- Move ptep_get() call after the !pvmw.pte check, which handles PMD-mapped
THP migration entries.
- https://lore.kernel.org/linux-mm/20250930071053.36158-1-lance.yang@linux.de…
v3 -> v4:
- Minor formatting tweak in try_to_map_unused_to_zeropage() function
signature (per David and Dev)
- Collect Reviewed-by from Dev - thanks!
- https://lore.kernel.org/linux-mm/20250930060557.85133-1-lance.yang@linux.de…
v2 -> v3:
- ptep_get() gets called only once per iteration (per Dev)
- https://lore.kernel.org/linux-mm/20250930043351.34927-1-lance.yang@linux.de…
v1 -> v2:
- Avoid calling ptep_get() multiple times (per Dev)
- Double-check the uffd-wp bit (per David)
- Collect Acked-by from David - thanks!
- https://lore.kernel.org/linux-mm/20250928044855.76359-1-lance.yang@linux.de…
mm/migrate.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index ce83c2c3c287..e3065c9edb55 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -296,8 +296,7 @@ bool isolate_folio_to_list(struct folio *folio, struct list_head *list)
}
static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
- struct folio *folio,
- unsigned long idx)
+ struct folio *folio, pte_t old_pte, unsigned long idx)
{
struct page *page = folio_page(folio, idx);
pte_t newpte;
@@ -306,7 +305,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
return false;
VM_BUG_ON_PAGE(!PageAnon(page), page);
VM_BUG_ON_PAGE(!PageLocked(page), page);
- VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
+ VM_BUG_ON_PAGE(pte_present(old_pte), page);
if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
mm_forbids_zeropage(pvmw->vma->vm_mm))
@@ -322,6 +321,12 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
pvmw->vma->vm_page_prot));
+
+ if (pte_swp_soft_dirty(old_pte))
+ newpte = pte_mksoft_dirty(newpte);
+ if (pte_swp_uffd_wp(old_pte))
+ newpte = pte_mkuffd_wp(newpte);
+
set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
@@ -364,13 +369,13 @@ static bool remove_migration_pte(struct folio *folio,
continue;
}
#endif
+ old_pte = ptep_get(pvmw.pte);
if (rmap_walk_arg->map_unused_to_zeropage &&
- try_to_map_unused_to_zeropage(&pvmw, folio, idx))
+ try_to_map_unused_to_zeropage(&pvmw, folio, old_pte, idx))
continue;
folio_get(folio);
pte = mk_pte(new, READ_ONCE(vma->vm_page_prot));
- old_pte = ptep_get(pvmw.pte);
entry = pte_to_swp_entry(old_pte);
if (!is_migration_entry_young(entry))
--
2.49.0
From: Celeste Liu <uwu(a)coelacanthus.name>
The gs_usb driver supports USB devices with more than 1 CAN channel.
In old kernel before 3.15, it uses net_device->dev_id to distinguish
different channel in userspace, which was done in commit
acff76fa45b4 ("can: gs_usb: gs_make_candev(): set netdev->dev_id").
But since 3.15, the correct way is populating net_device->dev_port.
And according to documentation, if network device support multiple
interface, lack of net_device->dev_port SHALL be treated as a bug.
Fixes: acff76fa45b4 ("can: gs_usb: gs_make_candev(): set netdev->dev_id")
Cc: stable(a)vger.kernel.org
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
Link: https://patch.msgid.link/20250930-gs-usb-populate-net_device-dev_port-v1-1-…
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/usb/gs_usb.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index 9fb4cbbd6d6d..69b8d6da651b 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -1245,6 +1245,7 @@ static struct gs_can *gs_make_candev(unsigned int channel,
netdev->flags |= IFF_ECHO; /* we support full roundtrip echo */
netdev->dev_id = channel;
+ netdev->dev_port = channel;
/* dev setup */
strcpy(dev->bt_const.name, KBUILD_MODNAME);
--
2.51.0
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 4d6fc29f36341d7795db1d1819b4c15fe9be7b23
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101358-bucket-cadet-c4d4@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4d6fc29f36341d7795db1d1819b4c15fe9be7b23 Mon Sep 17 00:00:00 2001
From: Donet Tom <donettom(a)linux.ibm.com>
Date: Wed, 24 Sep 2025 00:16:59 +0530
Subject: [PATCH] mm/ksm: fix incorrect KSM counter handling in mm_struct
during fork
Patch series "mm/ksm: Fix incorrect accounting of KSM counters during
fork", v3.
The first patch in this series fixes the incorrect accounting of KSM
counters such as ksm_merging_pages, ksm_rmap_items, and the global
ksm_zero_pages during fork.
The following patch add a selftest to verify the ksm_merging_pages counter
was updated correctly during fork.
Test Results
============
Without the first patch
-----------------------
# [RUN] test_fork_ksm_merging_page_count
not ok 10 ksm_merging_page in child: 32
With the first patch
--------------------
# [RUN] test_fork_ksm_merging_page_count
ok 10 ksm_merging_pages is not inherited after fork
This patch (of 2):
Currently, the KSM-related counters in `mm_struct`, such as
`ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are inherited
by the child process during fork. This results in inconsistent
accounting.
When a process uses KSM, identical pages are merged and an rmap item is
created for each merged page. The `ksm_merging_pages` and
`ksm_rmap_items` counters are updated accordingly. However, after a fork,
these counters are copied to the child while the corresponding rmap items
are not. As a result, when the child later triggers an unmerge, there are
no rmap items present in the child, so the counters remain stale, leading
to incorrect accounting.
A similar issue exists with `ksm_zero_pages`, which maintains both a
global counter and a per-process counter. During fork, the per-process
counter is inherited by the child, but the global counter is not
incremented. Since the child also references zero pages, the global
counter should be updated as well. Otherwise, during zero-page unmerge,
both the global and per-process counters are decremented, causing the
global counter to become inconsistent.
To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0 during
fork, and the global ksm_zero_pages counter is updated with the
per-process ksm_zero_pages value inherited by the child. This ensures
that KSM statistics remain accurate and reflect the activity of each
process correctly.
Link: https://lkml.kernel.org/r/cover.1758648700.git.donettom@linux.ibm.com
Link: https://lkml.kernel.org/r/7b9870eb67ccc0d79593940d9dbd4a0b39b5d396.17586487…
Fixes: 7609385337a4 ("ksm: count ksm merging pages for each process")
Fixes: cb4df4cae4f2 ("ksm: count allocated ksm rmap_items for each process")
Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM")
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Aboorva Devarajan <aboorvad(a)linux.ibm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Donet Tom <donettom(a)linux.ibm.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list(a)gmail.com>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: xu xin <xu.xin16(a)zte.com.cn>
Cc: <stable(a)vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 22e67ca7cba3..067538fc4d58 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -56,8 +56,14 @@ static inline long mm_ksm_zero_pages(struct mm_struct *mm)
static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
/* Adding mm to ksm is best effort on fork. */
- if (mm_flags_test(MMF_VM_MERGEABLE, oldmm))
+ if (mm_flags_test(MMF_VM_MERGEABLE, oldmm)) {
+ long nr_ksm_zero_pages = atomic_long_read(&mm->ksm_zero_pages);
+
+ mm->ksm_merging_pages = 0;
+ mm->ksm_rmap_items = 0;
+ atomic_long_add(nr_ksm_zero_pages, &ksm_zero_pages);
__ksm_enter(mm);
+ }
}
static inline int ksm_execve(struct mm_struct *mm)
This patch series refactors the az6007 driver to address root causes of
persistent bugs that have persisted for some time.
Jeongjun Park (2):
media: az6007: fix out-of-bounds in az6007_i2c_xfer()
media: az6007: refactor to properly use dvb-usb-v2
drivers/media/usb/dvb-usb-v2/az6007.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------------------------------------------------------------
1 file changed, 107 insertions(+), 104 deletions(-)
The patch below does not apply to the 6.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.17.y
git checkout FETCH_HEAD
git cherry-pick -x 4d6fc29f36341d7795db1d1819b4c15fe9be7b23
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101357-monorail-juror-352b@gregkh' --subject-prefix 'PATCH 6.17.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4d6fc29f36341d7795db1d1819b4c15fe9be7b23 Mon Sep 17 00:00:00 2001
From: Donet Tom <donettom(a)linux.ibm.com>
Date: Wed, 24 Sep 2025 00:16:59 +0530
Subject: [PATCH] mm/ksm: fix incorrect KSM counter handling in mm_struct
during fork
Patch series "mm/ksm: Fix incorrect accounting of KSM counters during
fork", v3.
The first patch in this series fixes the incorrect accounting of KSM
counters such as ksm_merging_pages, ksm_rmap_items, and the global
ksm_zero_pages during fork.
The following patch add a selftest to verify the ksm_merging_pages counter
was updated correctly during fork.
Test Results
============
Without the first patch
-----------------------
# [RUN] test_fork_ksm_merging_page_count
not ok 10 ksm_merging_page in child: 32
With the first patch
--------------------
# [RUN] test_fork_ksm_merging_page_count
ok 10 ksm_merging_pages is not inherited after fork
This patch (of 2):
Currently, the KSM-related counters in `mm_struct`, such as
`ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are inherited
by the child process during fork. This results in inconsistent
accounting.
When a process uses KSM, identical pages are merged and an rmap item is
created for each merged page. The `ksm_merging_pages` and
`ksm_rmap_items` counters are updated accordingly. However, after a fork,
these counters are copied to the child while the corresponding rmap items
are not. As a result, when the child later triggers an unmerge, there are
no rmap items present in the child, so the counters remain stale, leading
to incorrect accounting.
A similar issue exists with `ksm_zero_pages`, which maintains both a
global counter and a per-process counter. During fork, the per-process
counter is inherited by the child, but the global counter is not
incremented. Since the child also references zero pages, the global
counter should be updated as well. Otherwise, during zero-page unmerge,
both the global and per-process counters are decremented, causing the
global counter to become inconsistent.
To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0 during
fork, and the global ksm_zero_pages counter is updated with the
per-process ksm_zero_pages value inherited by the child. This ensures
that KSM statistics remain accurate and reflect the activity of each
process correctly.
Link: https://lkml.kernel.org/r/cover.1758648700.git.donettom@linux.ibm.com
Link: https://lkml.kernel.org/r/7b9870eb67ccc0d79593940d9dbd4a0b39b5d396.17586487…
Fixes: 7609385337a4 ("ksm: count ksm merging pages for each process")
Fixes: cb4df4cae4f2 ("ksm: count allocated ksm rmap_items for each process")
Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM")
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Aboorva Devarajan <aboorvad(a)linux.ibm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Donet Tom <donettom(a)linux.ibm.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list(a)gmail.com>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: xu xin <xu.xin16(a)zte.com.cn>
Cc: <stable(a)vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 22e67ca7cba3..067538fc4d58 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -56,8 +56,14 @@ static inline long mm_ksm_zero_pages(struct mm_struct *mm)
static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
/* Adding mm to ksm is best effort on fork. */
- if (mm_flags_test(MMF_VM_MERGEABLE, oldmm))
+ if (mm_flags_test(MMF_VM_MERGEABLE, oldmm)) {
+ long nr_ksm_zero_pages = atomic_long_read(&mm->ksm_zero_pages);
+
+ mm->ksm_merging_pages = 0;
+ mm->ksm_rmap_items = 0;
+ atomic_long_add(nr_ksm_zero_pages, &ksm_zero_pages);
__ksm_enter(mm);
+ }
}
static inline int ksm_execve(struct mm_struct *mm)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 8d33a030c566e1f105cd5bf27f37940b6367f3be
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101316-handmade-imaginary-edf0@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8d33a030c566e1f105cd5bf27f37940b6367f3be Mon Sep 17 00:00:00 2001
From: Zheng Qixing <zhengqixing(a)huawei.com>
Date: Tue, 26 Aug 2025 15:42:04 +0800
Subject: [PATCH] dm: fix NULL pointer dereference in __dm_suspend()
There is a race condition between dm device suspend and table load that
can lead to null pointer dereference. The issue occurs when suspend is
invoked before table load completes:
BUG: kernel NULL pointer dereference, address: 0000000000000054
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50
Call Trace:
<TASK>
blk_mq_quiesce_queue+0x2c/0x50
dm_stop_queue+0xd/0x20
__dm_suspend+0x130/0x330
dm_suspend+0x11a/0x180
dev_suspend+0x27e/0x560
ctl_ioctl+0x4cf/0x850
dm_ctl_ioctl+0xd/0x20
vfs_ioctl+0x1d/0x50
__se_sys_ioctl+0x9b/0xc0
__x64_sys_ioctl+0x19/0x30
x64_sys_call+0x2c4a/0x4620
do_syscall_64+0x9e/0x1b0
The issue can be triggered as below:
T1 T2
dm_suspend table_load
__dm_suspend dm_setup_md_queue
dm_mq_init_request_queue
blk_mq_init_allocated_queue
=> q->mq_ops = set->ops; (1)
dm_stop_queue / dm_wait_for_completion
=> q->tag_set NULL pointer! (2)
=> q->tag_set = set; (3)
Fix this by checking if a valid table (map) exists before performing
request-based suspend and waiting for target I/O. When map is NULL,
skip these table-dependent suspend steps.
Even when map is NULL, no I/O can reach any target because there is
no table loaded; I/O submitted in this state will fail early in the
DM layer. Skipping the table-dependent suspend logic in this case
is safe and avoids NULL pointer dereferences.
Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zheng Qixing <zhengqixing(a)huawei.com>
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 7222f20c1a83..66dd5f6ce778 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2908,7 +2908,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
{
bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG;
bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG;
- int r;
+ int r = 0;
lockdep_assert_held(&md->suspend_lock);
@@ -2960,7 +2960,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
* Stop md->queue before flushing md->wq in case request-based
* dm defers requests to md->wq from md->queue.
*/
- if (dm_request_based(md)) {
+ if (map && dm_request_based(md)) {
dm_stop_queue(md->queue);
set_bit(DMF_QUEUE_STOPPED, &md->flags);
}
@@ -2972,7 +2972,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
* We call dm_wait_for_completion to wait for all existing requests
* to finish.
*/
- r = dm_wait_for_completion(md, task_state);
+ if (map)
+ r = dm_wait_for_completion(md, task_state);
if (!r)
set_bit(dmf_suspended_flag, &md->flags);
Greetings,
I hope this email finds you well.
I am Mrs. Diana Owen contacting you from Reality Trading Group Ltd India.
.
As a new growing company, we are looking for reliable company of your product as seen in your website for urgent purchase order basis.
Could you please give me more details and specification of your products.
We also need your products catalog, MOQ, Production Lead Time and Unit price?
Best Regards,
Mrs. Diana Owen
Purchase Manager
Company: Reality Trading Group Ltd
According to documentation, the DP PHY on x1e80100 has another clock
called ref.
The current X Elite devices supported upstream work fine without this
clock, because the boot firmware leaves this clock enabled. But we should
not rely on that. Also, when it comes to power management, this clock
needs to be also disabled on suspend. So even though this change breaks
the ABI, it is needed in order to make we disable this clock on runtime
PM, when that is going to be enabled in the driver.
So rework the driver to allow different number of clocks, fix the
dt-bindings schema and add the clock to the DT node as well.
Signed-off-by: Abel Vesa <abel.vesa(a)linaro.org>
---
Changes in v3 (resend)
- picked-up Krzysztof's R-b tag for bindings patch
Changes in v3:
- Use dev_err_probe() on clocks parsing failure.
- Explain why the ABI break is necessary.
- Drop the extra 'clk' suffix from the clock name. So ref instead of
refclk.
- Link to v2: https://lore.kernel.org/r/20250903-phy-qcom-edp-add-missing-refclk-v2-0-d88…
Changes in v2:
- Fix schema by adding the minItems, as suggested by Krzysztof.
- Use devm_clk_bulk_get_all, as suggested by Konrad.
- Rephrase the commit messages to reflect the flexible number of clocks.
- Link to v1: https://lore.kernel.org/r/20250730-phy-qcom-edp-add-missing-refclk-v1-0-6f7…
---
Abel Vesa (3):
dt-bindings: phy: qcom-edp: Add missing clock for X Elite
phy: qcom: edp: Make the number of clocks flexible
arm64: dts: qcom: Add missing TCSR ref clock to the DP PHYs
.../devicetree/bindings/phy/qcom,edp-phy.yaml | 28 +++++++++++++++++++++-
arch/arm64/boot/dts/qcom/x1e80100.dtsi | 12 ++++++----
drivers/phy/qualcomm/phy-qcom-edp.c | 16 ++++++-------
3 files changed, 43 insertions(+), 13 deletions(-)
---
base-commit: 52ba76324a9d7c39830c850999210a36ef023cde
change-id: 20250730-phy-qcom-edp-add-missing-refclk-5ab82828f8e7
Best regards,
--
Abel Vesa <abel.vesa(a)linaro.org>
From: Lance Yang <lance.yang(a)linux.dev>
The blocker tracking mechanism assumes that lock pointers are at least
4-byte aligned to use their lower bits for type encoding.
However, as reported by Eero Tamminen, some architectures like m68k
only guarantee 2-byte alignment of 32-bit values. This breaks the
assumption and causes two related WARN_ON_ONCE checks to trigger.
To fix this, the runtime checks are adjusted to silently ignore any lock
that is not 4-byte aligned, effectively disabling the feature in such
cases and avoiding the related warnings.
Thanks to Geert Uytterhoeven for bisecting!
Reported-by: Eero Tamminen <oak(a)helsinkinet.fi>
Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58R…
Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker")
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
---
v1 -> v2:
- Pick RB from Masami - thanks!
- Update the changelog and comments
- https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev/
include/linux/hung_task.h | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/include/linux/hung_task.h b/include/linux/hung_task.h
index 34e615c76ca5..c4403eeb7144 100644
--- a/include/linux/hung_task.h
+++ b/include/linux/hung_task.h
@@ -20,6 +20,10 @@
* always zero. So we can use these bits to encode the specific blocking
* type.
*
+ * Note that on architectures where this is not guaranteed, or for any
+ * unaligned lock, this tracking mechanism is silently skipped for that
+ * lock.
+ *
* Type encoding:
* 00 - Blocked on mutex (BLOCKER_TYPE_MUTEX)
* 01 - Blocked on semaphore (BLOCKER_TYPE_SEM)
@@ -45,7 +49,7 @@ static inline void hung_task_set_blocker(void *lock, unsigned long type)
* If the lock pointer matches the BLOCKER_TYPE_MASK, return
* without writing anything.
*/
- if (WARN_ON_ONCE(lock_ptr & BLOCKER_TYPE_MASK))
+ if (lock_ptr & BLOCKER_TYPE_MASK)
return;
WRITE_ONCE(current->blocker, lock_ptr | type);
@@ -53,8 +57,6 @@ static inline void hung_task_set_blocker(void *lock, unsigned long type)
static inline void hung_task_clear_blocker(void)
{
- WARN_ON_ONCE(!READ_ONCE(current->blocker));
-
WRITE_ONCE(current->blocker, 0UL);
}
--
2.49.0
In as102_usb driver, the following race condition occurs:
```
CPU0 CPU1
as102_usb_probe()
kzalloc(); // alloc as102_dev_t
....
usb_register_dev();
open("/path/to/dev"); // open as102 dev
....
usb_deregister_dev();
....
kfree(); // free as102_dev_t
....
close(fd);
as102_release() // UAF!!
as102_usb_release()
kfree(); // DFB!!
```
When a USB character device registered with usb_register_dev() is later
unregistered (via usb_deregister_dev() or disconnect), the device node is
removed so new open() calls fail. However, file descriptors that are
already open do not go away immediately: they remain valid until the last
reference is dropped and the driver's .release() is invoked.
In as102, as102_usb_probe() calls usb_register_dev() and then, on an
error path, does usb_deregister_dev() and frees as102_dev_t right away.
If userspace raced a successful open() before the deregistration, that
open FD will later hit as102_release() --> as102_usb_release() and access
or free as102_dev_t again, occur a race to use-after-free and
double-free vuln.
The fix is to never kfree(as102_dev_t) directly once usb_register_dev()
has succeeded. After deregistration, defer freeing memory to .release().
In other words, let release() perform the last kfree when the final open
FD is closed.
Cc: <stable(a)vger.kernel.org>
Reported-by: syzbot+47321e8fd5a4c84088db(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=47321e8fd5a4c84088db
Fixes: cd19f7d3e39b ("[media] as102: fix leaks at failure paths in as102_usb_probe()")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
v2: Fix incorrect patch description style and CC stable mailing list
- Link to v1: https://lore.kernel.org/all/20250822143539.1157329-1-aha310510@gmail.com/
---
drivers/media/usb/as102/as102_usb_drv.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/usb/as102/as102_usb_drv.c b/drivers/media/usb/as102/as102_usb_drv.c
index e0ef66a522e2..abde5666b2ee 100644
--- a/drivers/media/usb/as102/as102_usb_drv.c
+++ b/drivers/media/usb/as102/as102_usb_drv.c
@@ -404,6 +404,7 @@ static int as102_usb_probe(struct usb_interface *intf,
as102_free_usb_stream_buffer(as102_dev);
failed_stream:
usb_deregister_dev(intf, &as102_usb_class_driver);
+ return ret;
failed:
usb_put_dev(as102_dev->bus_adap.usb_dev);
usb_set_intfdata(intf, NULL);
--
In hackrf driver, the following race condition occurs:
```
CPU0 CPU1
hackrf_probe()
kzalloc(); // alloc hackrf_dev
....
v4l2_device_register();
....
open("/path/to/dev"); // open hackrf dev
....
v4l2_device_unregister();
....
kfree(); // free hackrf_dev
....
ioctl(fd, ...);
v4l2_ioctl();
video_is_registered() // UAF!!
....
close(fd);
v4l2_release() // UAF!!
hackrf_video_release()
kfree(); // DFB!!
```
When a V4L2 or video device is unregistered, the device node is removed so
new open() calls are blocked.
However, file descriptors that are already open-and any in-flight I/O-do
not terminate immediately; they remain valid until the last reference is
dropped and the driver's release() is invoked.
Therefore, freeing device memory on the error path after hackrf_probe()
has registered dev it will lead to a race to use-after-free vuln, since
those already-open handles haven't been released yet.
And since release() free memory too, race to use-after-free and
double-free vuln occur.
To prevent this, if device is registered from probe(), it should be
modified to free memory only through release() rather than calling
kfree() directly.
Cc: <stable(a)vger.kernel.org>
Reported-by: syzbot+6ffd76b5405c006a46b7(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6ffd76b5405c006a46b7
Reported-by: syzbot+f1b20958f93d2d250727(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f1b20958f93d2d250727
Fixes: 8bc4a9ed8504 ("[media] hackrf: add support for transmitter")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
v2: Fix incorrect patch description style and CC stable mailing list
- Link to v1: https://lore.kernel.org/all/20250822142729.1156816-1-aha310510@gmail.com/
---
drivers/media/usb/hackrf/hackrf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/media/usb/hackrf/hackrf.c b/drivers/media/usb/hackrf/hackrf.c
index 0b50de8775a3..d7a84422193d 100644
--- a/drivers/media/usb/hackrf/hackrf.c
+++ b/drivers/media/usb/hackrf/hackrf.c
@@ -1515,6 +1515,8 @@ static int hackrf_probe(struct usb_interface *intf,
video_unregister_device(&dev->rx_vdev);
err_v4l2_device_unregister:
v4l2_device_unregister(&dev->v4l2_dev);
+ dev_dbg(&intf->dev, "failed=%d\n", ret);
+ return ret;
err_v4l2_ctrl_handler_free_tx:
v4l2_ctrl_handler_free(&dev->tx_ctrl_handler);
err_v4l2_ctrl_handler_free_rx:
--
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: ebfc8542ad62d066771e46c8aa30f5624b89cad8
Gitweb: https://git.kernel.org/tip/ebfc8542ad62d066771e46c8aa30f5624b89cad8
Author: Adrian Hunter <adrian.hunter(a)intel.com>
AuthorDate: Mon, 13 Oct 2025 10:22:42 +03:00
Committer: Peter Zijlstra <peterz(a)infradead.org>
CommitterDate: Tue, 14 Oct 2025 10:38:09 +02:00
perf/core: Fix address filter match with backing files
It was reported that Intel PT address filters do not work in Docker
containers. That relates to the use of overlayfs.
overlayfs records the backing file in struct vm_area_struct vm_file,
instead of the user file that the user mmapped. In order for an address
filter to match, it must compare to the user file inode. There is an
existing helper file_user_inode() for that situation.
Use file_user_inode() instead of file_inode() to get the inode for address
filter matching.
Example:
Setup:
# cd /root
# mkdir test ; cd test ; mkdir lower upper work merged
# cp `which cat` lower
# mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
# perf record --buildid-mmap -e intel_pt//u --filter 'filter * @ /root/test/merged/cat' -- /root/test/merged/cat /proc/self/maps
...
55d61d246000-55d61d2e1000 r-xp 00018000 00:1a 3418 /root/test/merged/cat
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB perf.data ]
# perf buildid-cache --add /root/test/merged/cat
Before:
Address filter does not match so there are no control flow packets
# perf script --itrace=e
# perf script --itrace=b | wc -l
0
# perf script -D | grep 'TIP.PGE' | wc -l
0
#
After:
Address filter does match so there are control flow packets
# perf script --itrace=e
# perf script --itrace=b | wc -l
235
# perf script -D | grep 'TIP.PGE' | wc -l
57
#
With respect to stable kernels, overlayfs mmap function ovl_mmap() was
added in v4.19 but file_user_inode() was not added until v6.8 and never
back-ported to stable kernels. FMODE_BACKING that it depends on was added
in v6.5. This issue has gone largely unnoticed, so back-porting before
v6.8 is probably not worth it, so put 6.8 as the stable kernel prerequisite
version, although in practice the next long term kernel is 6.12.
Closes: https://lore.kernel.org/linux-perf-users/aBCwoq7w8ohBRQCh@fremen.lan
Reported-by: Edd Barrett <edd(a)theunixzoo.co.uk>
Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Amir Goldstein <amir73il(a)gmail.com>
Cc: stable(a)vger.kernel.org # 6.8
---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7541f6f..cd63ec8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9492,7 +9492,7 @@ static bool perf_addr_filter_match(struct perf_addr_filter *filter,
if (!filter->path.dentry)
return false;
- if (d_inode(filter->path.dentry) != file_inode(file))
+ if (d_inode(filter->path.dentry) != file_user_inode(file))
return false;
if (filter->offset > offset + size)
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: 8818f507a9391019a3ec7c57b1a32e4b386e48a5
Gitweb: https://git.kernel.org/tip/8818f507a9391019a3ec7c57b1a32e4b386e48a5
Author: Adrian Hunter <adrian.hunter(a)intel.com>
AuthorDate: Mon, 13 Oct 2025 10:22:43 +03:00
Committer: Peter Zijlstra <peterz(a)infradead.org>
CommitterDate: Tue, 14 Oct 2025 10:38:09 +02:00
perf/core: Fix MMAP event path names with backing files
Some file systems like FUSE-based ones or overlayfs may record the backing
file in struct vm_area_struct vm_file, instead of the user file that the
user mmapped.
Since commit def3ae83da02f ("fs: store real path instead of fake path in
backing file f_path"), file_path() no longer returns the user file path
when applied to a backing file. There is an existing helper
file_user_path() for that situation.
Use file_user_path() instead of file_path() to get the path for MMAP
and MMAP2 events.
Example:
Setup:
# cd /root
# mkdir test ; cd test ; mkdir lower upper work merged
# cp `which cat` lower
# mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
# perf record -e intel_pt//u -- /root/test/merged/cat /proc/self/maps
...
55b0ba399000-55b0ba434000 r-xp 00018000 00:1a 3419 /root/test/merged/cat
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.060 MB perf.data ]
#
Before:
File name is wrong (/cat), so decoding fails:
# perf script --no-itrace --show-mmap-events
cat 367 [016] 100.491492: PERF_RECORD_MMAP2 367/367: [0x55b0ba399000(0x9b000) @ 0x18000 00:02 3419 489959280]: r-xp /cat
...
# perf script --itrace=e | wc -l
Warning:
19 instruction trace errors
19
#
After:
File name is correct (/root/test/merged/cat), so decoding is ok:
# perf script --no-itrace --show-mmap-events
cat 364 [016] 72.153006: PERF_RECORD_MMAP2 364/364: [0x55ce4003d000(0x9b000) @ 0x18000 00:02 3419 3132534314]: r-xp /root/test/merged/cat
# perf script --itrace=e
# perf script --itrace=e | wc -l
0
#
Fixes: def3ae83da02f ("fs: store real path instead of fake path in backing file f_path")
Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Amir Goldstein <amir73il(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index cd63ec8..7b5c237 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9416,7 +9416,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
* need to add enough zero bytes after the string to handle
* the 64bit alignment we do later.
*/
- name = file_path(file, buf, PATH_MAX - sizeof(u64));
+ name = d_path(file_user_path(file), buf, PATH_MAX - sizeof(u64));
if (IS_ERR(name)) {
name = "//toolong";
goto cpy_name;
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: fa4f4bae893fbce8a3edfff1ab7ece0c01dc1328
Gitweb: https://git.kernel.org/tip/fa4f4bae893fbce8a3edfff1ab7ece0c01dc1328
Author: Adrian Hunter <adrian.hunter(a)intel.com>
AuthorDate: Mon, 13 Oct 2025 10:22:44 +03:00
Committer: Peter Zijlstra <peterz(a)infradead.org>
CommitterDate: Tue, 14 Oct 2025 10:38:10 +02:00
perf/core: Fix MMAP2 event device with backing files
Some file systems like FUSE-based ones or overlayfs may record the backing
file in struct vm_area_struct vm_file, instead of the user file that the
user mmapped.
That causes perf to misreport the device major/minor numbers of the file
system of the file, and the generation of the file, and potentially other
inode details. There is an existing helper file_user_inode() for that
situation.
Use file_user_inode() instead of file_inode() to get the inode for MMAP2
events.
Example:
Setup:
# cd /root
# mkdir test ; cd test ; mkdir lower upper work merged
# cp `which cat` lower
# mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
# perf record -e cycles:u -- /root/test/merged/cat /proc/self/maps
...
55b2c91d0000-55b2c926b000 r-xp 00018000 00:1a 3419 /root/test/merged/cat
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.004 MB perf.data (5 samples) ]
#
# stat /root/test/merged/cat
File: /root/test/merged/cat
Size: 1127792 Blocks: 2208 IO Block: 4096 regular file
Device: 0,26 Inode: 3419 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2025-09-08 12:23:59.453309624 +0000
Modify: 2025-09-08 12:23:59.454309624 +0000
Change: 2025-09-08 12:23:59.454309624 +0000
Birth: 2025-09-08 12:23:59.453309624 +0000
Before:
Device reported 00:02 differs from stat output and /proc/self/maps
# perf script --show-mmap-events | grep /root/test/merged/cat
cat 377 [-01] 243.078558: PERF_RECORD_MMAP2 377/377: [0x55b2c91d0000(0x9b000) @ 0x18000 00:02 3419 2068525940]: r-xp /root/test/merged/cat
After:
Device reported 00:1a is the same as stat output and /proc/self/maps
# perf script --show-mmap-events | grep /root/test/merged/cat
cat 362 [-01] 127.755167: PERF_RECORD_MMAP2 362/362: [0x55ba6e781000(0x9b000) @ 0x18000 00:1a 3419 0]: r-xp /root/test/merged/cat
With respect to stable kernels, overlayfs mmap function ovl_mmap() was
added in v4.19 but file_user_inode() was not added until v6.8 and never
back-ported to stable kernels. FMODE_BACKING that it depends on was added
in v6.5. This issue has gone largely unnoticed, so back-porting before
v6.8 is probably not worth it, so put 6.8 as the stable kernel prerequisite
version, although in practice the next long term kernel is 6.12.
Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Amir Goldstein <amir73il(a)gmail.com>
Cc: stable(a)vger.kernel.org # 6.8
---
kernel/events/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7b5c237..177e57c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9403,7 +9403,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
flags |= MAP_HUGETLB;
if (file) {
- struct inode *inode;
+ const struct inode *inode;
dev_t dev;
buf = kmalloc(PATH_MAX, GFP_KERNEL);
@@ -9421,7 +9421,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
name = "//toolong";
goto cpy_name;
}
- inode = file_inode(vma->vm_file);
+ inode = file_user_inode(vma->vm_file);
dev = inode->i_sb->s_dev;
ino = inode->i_ino;
gen = inode->i_generation;
We found some discrete AMD graphics devices hide funtion 0 and the whole
is not supposed to be probed.
Since our original purpose is to allow integrated devices (on bus 0) to
be probed without function 0, we can limit the islolated function probing
only on bus 0.
Cc: stable(a)vger.kernel.org
Fixes: a02fd05661d73a8 ("PCI: Extend isolated function probing to LoongArch")
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
drivers/pci/probe.c | 2 +-
include/linux/hypervisor.h | 8 +++++---
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c83e75a0ec12..da6a2aef823a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2883,7 +2883,7 @@ int pci_scan_slot(struct pci_bus *bus, int devfn)
* a hypervisor that passes through individual PCI
* functions.
*/
- if (!hypervisor_isolated_pci_functions())
+ if (!hypervisor_isolated_pci_functions(bus->number))
break;
}
fn = next_fn(bus, dev, fn);
diff --git a/include/linux/hypervisor.h b/include/linux/hypervisor.h
index be5417303ecf..30ece04a16d9 100644
--- a/include/linux/hypervisor.h
+++ b/include/linux/hypervisor.h
@@ -32,13 +32,15 @@ static inline bool jailhouse_paravirt(void)
#endif /* !CONFIG_X86 */
-static inline bool hypervisor_isolated_pci_functions(void)
+static inline bool hypervisor_isolated_pci_functions(int bus)
{
if (IS_ENABLED(CONFIG_S390))
return true;
- if (IS_ENABLED(CONFIG_LOONGARCH))
- return true;
+ if (IS_ENABLED(CONFIG_LOONGARCH)) {
+ if (bus == 0)
+ return true;
+ }
return jailhouse_paravirt();
}
--
2.47.3
Since the len argument value passed to exfat_ioctl_set_volume_label()
from exfat_nls_to_utf16() is passed 1 too large, an out-of-bounds read
occurs when dereferencing p_cstring in exfat_nls_to_ucs2() later.
And because of the NLS_NAME_OVERLEN macro, another error occurs when
creating a file with a period at the end using utf8 and other iocharsets,
so the NLS_NAME_OVERLEN macro should be removed and the len argument value
should be passed as FSLABEL_MAX - 1.
Cc: <stable(a)vger.kernel.org>
Reported-by: syzbot+98cc76a76de46b3714d4(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=98cc76a76de46b3714d4
Fixes: 370e812b3ec1 ("exfat: add nls operations")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
fs/exfat/file.c | 2 +-
fs/exfat/nls.c | 3 ---
2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index f246cf439588..7ce0fb6f2564 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -521,7 +521,7 @@ static int exfat_ioctl_set_volume_label(struct super_block *sb,
memset(&uniname, 0, sizeof(uniname));
if (label[0]) {
- ret = exfat_nls_to_utf16(sb, label, FSLABEL_MAX,
+ ret = exfat_nls_to_utf16(sb, label, FSLABEL_MAX - 1,
&uniname, &lossy);
if (ret < 0)
return ret;
diff --git a/fs/exfat/nls.c b/fs/exfat/nls.c
index 8243d94ceaf4..57db08a5271c 100644
--- a/fs/exfat/nls.c
+++ b/fs/exfat/nls.c
@@ -616,9 +616,6 @@ static int exfat_nls_to_ucs2(struct super_block *sb,
unilen++;
}
- if (p_cstring[i] != '\0')
- lossy |= NLS_NAME_OVERLEN;
-
*uniname = '\0';
p_uniname->name_len = unilen;
p_uniname->name_hash = exfat_calc_chksum16(upname, unilen << 1, 0,
--
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 0389c305ef56cbadca4cbef44affc0ec3213ed30
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101330-hemstitch-crimson-1681@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0389c305ef56cbadca4cbef44affc0ec3213ed30 Mon Sep 17 00:00:00 2001
From: Lance Yang <lance.yang(a)linux.dev>
Date: Wed, 17 Sep 2025 21:31:37 +0800
Subject: [PATCH] selftests/mm: skip soft-dirty tests when
CONFIG_MEM_SOFT_DIRTY is disabled
The madv_populate and soft-dirty kselftests currently fail on systems
where CONFIG_MEM_SOFT_DIRTY is disabled.
Introduce a new helper softdirty_supported() into vm_util.c/h to ensure
tests are properly skipped when the feature is not enabled.
Link: https://lkml.kernel.org/r/20250917133137.62802-1-lance.yang@linux.dev
Fixes: 9f3265db6ae8 ("selftests: vm: add test for Soft-Dirty PTE bit")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Gabriel Krisman Bertazi <krisman(a)collabora.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/tools/testing/selftests/mm/madv_populate.c b/tools/testing/selftests/mm/madv_populate.c
index b6fabd5c27ed..d8d11bc67ddc 100644
--- a/tools/testing/selftests/mm/madv_populate.c
+++ b/tools/testing/selftests/mm/madv_populate.c
@@ -264,23 +264,6 @@ static void test_softdirty(void)
munmap(addr, SIZE);
}
-static int system_has_softdirty(void)
-{
- /*
- * There is no way to check if the kernel supports soft-dirty, other
- * than by writing to a page and seeing if the bit was set. But the
- * tests are intended to check that the bit gets set when it should, so
- * doing that check would turn a potentially legitimate fail into a
- * skip. Fortunately, we know for sure that arm64 does not support
- * soft-dirty. So for now, let's just use the arch as a corse guide.
- */
-#if defined(__aarch64__)
- return 0;
-#else
- return 1;
-#endif
-}
-
int main(int argc, char **argv)
{
int nr_tests = 16;
@@ -288,7 +271,7 @@ int main(int argc, char **argv)
pagesize = getpagesize();
- if (system_has_softdirty())
+ if (softdirty_supported())
nr_tests += 5;
ksft_print_header();
@@ -300,7 +283,7 @@ int main(int argc, char **argv)
test_holes();
test_populate_read();
test_populate_write();
- if (system_has_softdirty())
+ if (softdirty_supported())
test_softdirty();
err = ksft_get_fail_cnt();
diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
index 8a3f2b4b2186..4ee4db3750c1 100644
--- a/tools/testing/selftests/mm/soft-dirty.c
+++ b/tools/testing/selftests/mm/soft-dirty.c
@@ -200,8 +200,11 @@ int main(int argc, char **argv)
int pagesize;
ksft_print_header();
- ksft_set_plan(15);
+ if (!softdirty_supported())
+ ksft_exit_skip("soft-dirty is not support\n");
+
+ ksft_set_plan(15);
pagemap_fd = open(PAGEMAP_FILE_PATH, O_RDONLY);
if (pagemap_fd < 0)
ksft_exit_fail_msg("Failed to open %s\n", PAGEMAP_FILE_PATH);
diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index 56e9bd541edd..e33cda301dad 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -449,6 +449,23 @@ bool check_vmflag_pfnmap(void *addr)
return check_vmflag(addr, "pf");
}
+bool softdirty_supported(void)
+{
+ char *addr;
+ bool supported = false;
+ const size_t pagesize = getpagesize();
+
+ /* New mappings are expected to be marked with VM_SOFTDIRTY (sd). */
+ addr = mmap(0, pagesize, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
+ if (!addr)
+ ksft_exit_fail_msg("mmap failed\n");
+
+ supported = check_vmflag(addr, "sd");
+ munmap(addr, pagesize);
+ return supported;
+}
+
/*
* Open an fd at /proc/$pid/maps and configure procmap_out ready for
* PROCMAP_QUERY query. Returns 0 on success, or an error code otherwise.
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 07c4acfd84b6..26c30fdc0241 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -104,6 +104,7 @@ bool find_vma_procmap(struct procmap_fd *procmap, void *address);
int close_procmap(struct procmap_fd *procmap);
int write_sysfs(const char *file_path, unsigned long val);
int read_sysfs(const char *file_path, unsigned long *val);
+bool softdirty_supported(void);
static inline int open_self_procmap(struct procmap_fd *procmap_out)
{
We add pmd folio into ds_queue on the first page fault in
__do_huge_pmd_anonymous_page(), so that we can split it in case of
memory pressure. This should be the same for a pmd folio during wp
page fault.
Commit 1ced09e0331f ("mm: allocate THP on hugezeropage wp-fault") miss
to add it to ds_queue, which means system may not reclaim enough memory
in case of memory pressure even the pmd folio is under used.
Move deferred_split_folio() into map_anon_folio_pmd() to make the pmd
folio installation consistent.
Fixes: 1ced09e0331f ("mm: allocate THP on hugezeropage wp-fault")
Signed-off-by: Wei Yang <richard.weiyang(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Lance Yang <lance.yang(a)linux.dev>
Cc: Dev Jain <dev.jain(a)arm.com>
Cc: <stable(a)vger.kernel.org>
---
v2:
* add fix, cc stable and put description about the flow of current
code
* move deferred_split_folio() into map_anon_folio_pmd()
---
mm/huge_memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1b81680b4225..f13de93637bf 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1232,6 +1232,7 @@ static void map_anon_folio_pmd(struct folio *folio, pmd_t *pmd,
count_vm_event(THP_FAULT_ALLOC);
count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
+ deferred_split_folio(folio, false);
}
static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
@@ -1272,7 +1273,6 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
map_anon_folio_pmd(folio, vmf->pmd, vma, haddr);
mm_inc_nr_ptes(vma->vm_mm);
- deferred_split_folio(folio, false);
spin_unlock(vmf->ptl);
}
--
2.34.1
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 0389c305ef56cbadca4cbef44affc0ec3213ed30
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101329-unplanted-language-2cc7@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0389c305ef56cbadca4cbef44affc0ec3213ed30 Mon Sep 17 00:00:00 2001
From: Lance Yang <lance.yang(a)linux.dev>
Date: Wed, 17 Sep 2025 21:31:37 +0800
Subject: [PATCH] selftests/mm: skip soft-dirty tests when
CONFIG_MEM_SOFT_DIRTY is disabled
The madv_populate and soft-dirty kselftests currently fail on systems
where CONFIG_MEM_SOFT_DIRTY is disabled.
Introduce a new helper softdirty_supported() into vm_util.c/h to ensure
tests are properly skipped when the feature is not enabled.
Link: https://lkml.kernel.org/r/20250917133137.62802-1-lance.yang@linux.dev
Fixes: 9f3265db6ae8 ("selftests: vm: add test for Soft-Dirty PTE bit")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Gabriel Krisman Bertazi <krisman(a)collabora.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/tools/testing/selftests/mm/madv_populate.c b/tools/testing/selftests/mm/madv_populate.c
index b6fabd5c27ed..d8d11bc67ddc 100644
--- a/tools/testing/selftests/mm/madv_populate.c
+++ b/tools/testing/selftests/mm/madv_populate.c
@@ -264,23 +264,6 @@ static void test_softdirty(void)
munmap(addr, SIZE);
}
-static int system_has_softdirty(void)
-{
- /*
- * There is no way to check if the kernel supports soft-dirty, other
- * than by writing to a page and seeing if the bit was set. But the
- * tests are intended to check that the bit gets set when it should, so
- * doing that check would turn a potentially legitimate fail into a
- * skip. Fortunately, we know for sure that arm64 does not support
- * soft-dirty. So for now, let's just use the arch as a corse guide.
- */
-#if defined(__aarch64__)
- return 0;
-#else
- return 1;
-#endif
-}
-
int main(int argc, char **argv)
{
int nr_tests = 16;
@@ -288,7 +271,7 @@ int main(int argc, char **argv)
pagesize = getpagesize();
- if (system_has_softdirty())
+ if (softdirty_supported())
nr_tests += 5;
ksft_print_header();
@@ -300,7 +283,7 @@ int main(int argc, char **argv)
test_holes();
test_populate_read();
test_populate_write();
- if (system_has_softdirty())
+ if (softdirty_supported())
test_softdirty();
err = ksft_get_fail_cnt();
diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
index 8a3f2b4b2186..4ee4db3750c1 100644
--- a/tools/testing/selftests/mm/soft-dirty.c
+++ b/tools/testing/selftests/mm/soft-dirty.c
@@ -200,8 +200,11 @@ int main(int argc, char **argv)
int pagesize;
ksft_print_header();
- ksft_set_plan(15);
+ if (!softdirty_supported())
+ ksft_exit_skip("soft-dirty is not support\n");
+
+ ksft_set_plan(15);
pagemap_fd = open(PAGEMAP_FILE_PATH, O_RDONLY);
if (pagemap_fd < 0)
ksft_exit_fail_msg("Failed to open %s\n", PAGEMAP_FILE_PATH);
diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index 56e9bd541edd..e33cda301dad 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -449,6 +449,23 @@ bool check_vmflag_pfnmap(void *addr)
return check_vmflag(addr, "pf");
}
+bool softdirty_supported(void)
+{
+ char *addr;
+ bool supported = false;
+ const size_t pagesize = getpagesize();
+
+ /* New mappings are expected to be marked with VM_SOFTDIRTY (sd). */
+ addr = mmap(0, pagesize, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
+ if (!addr)
+ ksft_exit_fail_msg("mmap failed\n");
+
+ supported = check_vmflag(addr, "sd");
+ munmap(addr, pagesize);
+ return supported;
+}
+
/*
* Open an fd at /proc/$pid/maps and configure procmap_out ready for
* PROCMAP_QUERY query. Returns 0 on success, or an error code otherwise.
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 07c4acfd84b6..26c30fdc0241 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -104,6 +104,7 @@ bool find_vma_procmap(struct procmap_fd *procmap, void *address);
int close_procmap(struct procmap_fd *procmap);
int write_sysfs(const char *file_path, unsigned long val);
int read_sysfs(const char *file_path, unsigned long *val);
+bool softdirty_supported(void);
static inline int open_self_procmap(struct procmap_fd *procmap_out)
{
When fsl_edma_alloc_chan_resources() fails after clk_prepare_enable(),
the error paths only free IRQs and destroy the TCD pool, but forget to
call clk_disable_unprepare(). This causes the channel clock to remain
enabled, leaking power and resources.
Fix it by disabling the channel clock in the error unwind path.
Fixes: d8d4355861d8 ("dmaengine: fsl-edma: add i.MX8ULP edma support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhen Ni <zhen.ni(a)easystack.cn>
---
drivers/dma/fsl-edma-common.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
index 4976d7dde080..bd673f08f610 100644
--- a/drivers/dma/fsl-edma-common.c
+++ b/drivers/dma/fsl-edma-common.c
@@ -852,6 +852,8 @@ int fsl_edma_alloc_chan_resources(struct dma_chan *chan)
free_irq(fsl_chan->txirq, fsl_chan);
err_txirq:
dma_pool_destroy(fsl_chan->tcd_pool);
+ if (fsl_edma_drvflags(fsl_chan) & FSL_EDMA_DRV_HAS_CHCLK)
+ clk_disable_unprepare(fsl_chan->clk);
return ret;
}
--
2.20.1
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 0389c305ef56cbadca4cbef44affc0ec3213ed30
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101329-cyclic-cylinder-9e6b@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0389c305ef56cbadca4cbef44affc0ec3213ed30 Mon Sep 17 00:00:00 2001
From: Lance Yang <lance.yang(a)linux.dev>
Date: Wed, 17 Sep 2025 21:31:37 +0800
Subject: [PATCH] selftests/mm: skip soft-dirty tests when
CONFIG_MEM_SOFT_DIRTY is disabled
The madv_populate and soft-dirty kselftests currently fail on systems
where CONFIG_MEM_SOFT_DIRTY is disabled.
Introduce a new helper softdirty_supported() into vm_util.c/h to ensure
tests are properly skipped when the feature is not enabled.
Link: https://lkml.kernel.org/r/20250917133137.62802-1-lance.yang@linux.dev
Fixes: 9f3265db6ae8 ("selftests: vm: add test for Soft-Dirty PTE bit")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Gabriel Krisman Bertazi <krisman(a)collabora.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/tools/testing/selftests/mm/madv_populate.c b/tools/testing/selftests/mm/madv_populate.c
index b6fabd5c27ed..d8d11bc67ddc 100644
--- a/tools/testing/selftests/mm/madv_populate.c
+++ b/tools/testing/selftests/mm/madv_populate.c
@@ -264,23 +264,6 @@ static void test_softdirty(void)
munmap(addr, SIZE);
}
-static int system_has_softdirty(void)
-{
- /*
- * There is no way to check if the kernel supports soft-dirty, other
- * than by writing to a page and seeing if the bit was set. But the
- * tests are intended to check that the bit gets set when it should, so
- * doing that check would turn a potentially legitimate fail into a
- * skip. Fortunately, we know for sure that arm64 does not support
- * soft-dirty. So for now, let's just use the arch as a corse guide.
- */
-#if defined(__aarch64__)
- return 0;
-#else
- return 1;
-#endif
-}
-
int main(int argc, char **argv)
{
int nr_tests = 16;
@@ -288,7 +271,7 @@ int main(int argc, char **argv)
pagesize = getpagesize();
- if (system_has_softdirty())
+ if (softdirty_supported())
nr_tests += 5;
ksft_print_header();
@@ -300,7 +283,7 @@ int main(int argc, char **argv)
test_holes();
test_populate_read();
test_populate_write();
- if (system_has_softdirty())
+ if (softdirty_supported())
test_softdirty();
err = ksft_get_fail_cnt();
diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
index 8a3f2b4b2186..4ee4db3750c1 100644
--- a/tools/testing/selftests/mm/soft-dirty.c
+++ b/tools/testing/selftests/mm/soft-dirty.c
@@ -200,8 +200,11 @@ int main(int argc, char **argv)
int pagesize;
ksft_print_header();
- ksft_set_plan(15);
+ if (!softdirty_supported())
+ ksft_exit_skip("soft-dirty is not support\n");
+
+ ksft_set_plan(15);
pagemap_fd = open(PAGEMAP_FILE_PATH, O_RDONLY);
if (pagemap_fd < 0)
ksft_exit_fail_msg("Failed to open %s\n", PAGEMAP_FILE_PATH);
diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index 56e9bd541edd..e33cda301dad 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -449,6 +449,23 @@ bool check_vmflag_pfnmap(void *addr)
return check_vmflag(addr, "pf");
}
+bool softdirty_supported(void)
+{
+ char *addr;
+ bool supported = false;
+ const size_t pagesize = getpagesize();
+
+ /* New mappings are expected to be marked with VM_SOFTDIRTY (sd). */
+ addr = mmap(0, pagesize, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
+ if (!addr)
+ ksft_exit_fail_msg("mmap failed\n");
+
+ supported = check_vmflag(addr, "sd");
+ munmap(addr, pagesize);
+ return supported;
+}
+
/*
* Open an fd at /proc/$pid/maps and configure procmap_out ready for
* PROCMAP_QUERY query. Returns 0 on success, or an error code otherwise.
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 07c4acfd84b6..26c30fdc0241 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -104,6 +104,7 @@ bool find_vma_procmap(struct procmap_fd *procmap, void *address);
int close_procmap(struct procmap_fd *procmap);
int write_sysfs(const char *file_path, unsigned long val);
int read_sysfs(const char *file_path, unsigned long *val);
+bool softdirty_supported(void);
static inline int open_self_procmap(struct procmap_fd *procmap_out)
{
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 8d33a030c566e1f105cd5bf27f37940b6367f3be
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101315-graveness-treason-be2b@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8d33a030c566e1f105cd5bf27f37940b6367f3be Mon Sep 17 00:00:00 2001
From: Zheng Qixing <zhengqixing(a)huawei.com>
Date: Tue, 26 Aug 2025 15:42:04 +0800
Subject: [PATCH] dm: fix NULL pointer dereference in __dm_suspend()
There is a race condition between dm device suspend and table load that
can lead to null pointer dereference. The issue occurs when suspend is
invoked before table load completes:
BUG: kernel NULL pointer dereference, address: 0000000000000054
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50
Call Trace:
<TASK>
blk_mq_quiesce_queue+0x2c/0x50
dm_stop_queue+0xd/0x20
__dm_suspend+0x130/0x330
dm_suspend+0x11a/0x180
dev_suspend+0x27e/0x560
ctl_ioctl+0x4cf/0x850
dm_ctl_ioctl+0xd/0x20
vfs_ioctl+0x1d/0x50
__se_sys_ioctl+0x9b/0xc0
__x64_sys_ioctl+0x19/0x30
x64_sys_call+0x2c4a/0x4620
do_syscall_64+0x9e/0x1b0
The issue can be triggered as below:
T1 T2
dm_suspend table_load
__dm_suspend dm_setup_md_queue
dm_mq_init_request_queue
blk_mq_init_allocated_queue
=> q->mq_ops = set->ops; (1)
dm_stop_queue / dm_wait_for_completion
=> q->tag_set NULL pointer! (2)
=> q->tag_set = set; (3)
Fix this by checking if a valid table (map) exists before performing
request-based suspend and waiting for target I/O. When map is NULL,
skip these table-dependent suspend steps.
Even when map is NULL, no I/O can reach any target because there is
no table loaded; I/O submitted in this state will fail early in the
DM layer. Skipping the table-dependent suspend logic in this case
is safe and avoids NULL pointer dereferences.
Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zheng Qixing <zhengqixing(a)huawei.com>
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 7222f20c1a83..66dd5f6ce778 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2908,7 +2908,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
{
bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG;
bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG;
- int r;
+ int r = 0;
lockdep_assert_held(&md->suspend_lock);
@@ -2960,7 +2960,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
* Stop md->queue before flushing md->wq in case request-based
* dm defers requests to md->wq from md->queue.
*/
- if (dm_request_based(md)) {
+ if (map && dm_request_based(md)) {
dm_stop_queue(md->queue);
set_bit(DMF_QUEUE_STOPPED, &md->flags);
}
@@ -2972,7 +2972,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
* We call dm_wait_for_completion to wait for all existing requests
* to finish.
*/
- r = dm_wait_for_completion(md, task_state);
+ if (map)
+ r = dm_wait_for_completion(md, task_state);
if (!r)
set_bit(dmf_suspended_flag, &md->flags);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 1efbee6852f1ff698a9981bd731308dd027189fb
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101336-abrasion-hatchling-01bc@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1efbee6852f1ff698a9981bd731308dd027189fb Mon Sep 17 00:00:00 2001
From: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Date: Mon, 11 Aug 2025 15:36:16 +0200
Subject: [PATCH] mfd: vexpress-sysreg: Check the return value of
devm_gpiochip_add_data()
Commit 974cc7b93441 ("mfd: vexpress: Define the device as MFD cells")
removed the return value check from the call to gpiochip_add_data() (or
rather gpiochip_add() back then and later converted to devres) with no
explanation. This function however can still fail, so check the return
value and bail-out if it does.
Cc: stable(a)vger.kernel.org
Fixes: 974cc7b93441 ("mfd: vexpress: Define the device as MFD cells")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Link: https://lore.kernel.org/r/20250811-gpio-mmio-mfd-conv-v1-1-68c5c958cf80@lin…
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/mfd/vexpress-sysreg.c b/drivers/mfd/vexpress-sysreg.c
index fc2daffc4352..77245c1e5d7d 100644
--- a/drivers/mfd/vexpress-sysreg.c
+++ b/drivers/mfd/vexpress-sysreg.c
@@ -99,6 +99,7 @@ static int vexpress_sysreg_probe(struct platform_device *pdev)
struct resource *mem;
void __iomem *base;
struct gpio_chip *mmc_gpio_chip;
+ int ret;
mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!mem)
@@ -119,7 +120,10 @@ static int vexpress_sysreg_probe(struct platform_device *pdev)
bgpio_init(mmc_gpio_chip, &pdev->dev, 0x4, base + SYS_MCI,
NULL, NULL, NULL, NULL, 0);
mmc_gpio_chip->ngpio = 2;
- devm_gpiochip_add_data(&pdev->dev, mmc_gpio_chip, NULL);
+
+ ret = devm_gpiochip_add_data(&pdev->dev, mmc_gpio_chip, NULL);
+ if (ret)
+ return ret;
return devm_mfd_add_devices(&pdev->dev, PLATFORM_DEVID_AUTO,
vexpress_sysreg_cells,
When a damos_commit_quota_goals() is called for adding new DAMOS quota
goals of DAMOS_QUOTA_USER_INPUT metric, current_value fields of the new
goals should be also set as requested.
However, damos_commit_quota_goals() is not updating the field for the
case, since it is setting only metrics and target values using
damos_new_quota_goal(), and metric-optional union fields using
damos_commit_quota_goal_union(). As a result, users could see the first
current_value parameter that committed online with a new quota goal is
ignored. Users are assumed to commit the current_value for
DAMOS_QUOTA_USER_INPUT quota goals, since it is being used as a
feedback. Hence the real impact would be subtle. That said, this is
obviously not intended behavior.
Fix the issue by using damos_commit_quota_goal() which sets all quota
goal parameters, instead of damos_commit_quota_goal_union(), which sets
only the union fields.
Fixes: 1aef9df0ee90 ("mm/damon/core: commit damos_quota_goal->nid")
Cc: <stable(a)vger.kernel.org> # 6.16.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 93848b4c6944..e72dc49d501c 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -832,7 +832,7 @@ int damos_commit_quota_goals(struct damos_quota *dst, struct damos_quota *src)
src_goal->metric, src_goal->target_value);
if (!new_goal)
return -ENOMEM;
- damos_commit_quota_goal_union(new_goal, src_goal);
+ damos_commit_quota_goal(new_goal, src_goal);
damos_add_quota_goal(dst, new_goal);
}
return 0;
base-commit: ccb48f0d949e274d388e66c8f80f7d1ff234ce46
--
2.47.3
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101316-disarm-pried-7a1f@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Mon, 4 Aug 2025 15:32:40 +0200
Subject: [PATCH] mfd: intel_soc_pmic_chtdc_ti: Set use_single_read
regmap_config flag
Testing has shown that reading multiple registers at once (for 10-bit
ADC values) does not work. Set the use_single_read regmap_config flag
to make regmap split these for us.
This should fix temperature opregion accesses done by
drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for
the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC")
Cc: stable(a)vger.kernel.org
Reviewed-by: Andy Shevchenko <andy(a)kernel.org>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Link: https://lore.kernel.org/r/20250804133240.312383-1-hansg@kernel.org
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
index 4c1a68c9f575..6daf33e07ea0 100644
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c
+++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
@@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
.max_register = 0xff,
+ /* The hardware does not support reading multiple registers at once */
+ .use_single_read = true,
};
static const struct regmap_irq chtdc_ti_irqs[] = {
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101315-glue-daylight-739b@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Mon, 4 Aug 2025 15:32:40 +0200
Subject: [PATCH] mfd: intel_soc_pmic_chtdc_ti: Set use_single_read
regmap_config flag
Testing has shown that reading multiple registers at once (for 10-bit
ADC values) does not work. Set the use_single_read regmap_config flag
to make regmap split these for us.
This should fix temperature opregion accesses done by
drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for
the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC")
Cc: stable(a)vger.kernel.org
Reviewed-by: Andy Shevchenko <andy(a)kernel.org>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Link: https://lore.kernel.org/r/20250804133240.312383-1-hansg@kernel.org
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
index 4c1a68c9f575..6daf33e07ea0 100644
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c
+++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
@@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
.max_register = 0xff,
+ /* The hardware does not support reading multiple registers at once */
+ .use_single_read = true,
};
static const struct regmap_irq chtdc_ti_irqs[] = {
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101308-pedometer-broadness-3e95@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Mon, 4 Aug 2025 15:32:40 +0200
Subject: [PATCH] mfd: intel_soc_pmic_chtdc_ti: Set use_single_read
regmap_config flag
Testing has shown that reading multiple registers at once (for 10-bit
ADC values) does not work. Set the use_single_read regmap_config flag
to make regmap split these for us.
This should fix temperature opregion accesses done by
drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for
the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC")
Cc: stable(a)vger.kernel.org
Reviewed-by: Andy Shevchenko <andy(a)kernel.org>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Link: https://lore.kernel.org/r/20250804133240.312383-1-hansg@kernel.org
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
index 4c1a68c9f575..6daf33e07ea0 100644
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c
+++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
@@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
.max_register = 0xff,
+ /* The hardware does not support reading multiple registers at once */
+ .use_single_read = true,
};
static const struct regmap_irq chtdc_ti_irqs[] = {
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101307-little-reseal-5a13@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Mon, 4 Aug 2025 15:32:40 +0200
Subject: [PATCH] mfd: intel_soc_pmic_chtdc_ti: Set use_single_read
regmap_config flag
Testing has shown that reading multiple registers at once (for 10-bit
ADC values) does not work. Set the use_single_read regmap_config flag
to make regmap split these for us.
This should fix temperature opregion accesses done by
drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for
the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC")
Cc: stable(a)vger.kernel.org
Reviewed-by: Andy Shevchenko <andy(a)kernel.org>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Link: https://lore.kernel.org/r/20250804133240.312383-1-hansg@kernel.org
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
index 4c1a68c9f575..6daf33e07ea0 100644
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c
+++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
@@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
.max_register = 0xff,
+ /* The hardware does not support reading multiple registers at once */
+ .use_single_read = true,
};
static const struct regmap_irq chtdc_ti_irqs[] = {
The patch titled
Subject: mm/mremap: correctly account old mapping after MREMAP_DONTUNMAP remap
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-mremap-correctly-account-old-mapping-after-mremap_dontunmap-remap.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm/mremap: correctly account old mapping after MREMAP_DONTUNMAP remap
Date: Mon, 13 Oct 2025 17:58:36 +0100
Commit b714ccb02a76 ("mm/mremap: complete refactor of move_vma()")
mistakenly introduced a new behaviour - clearing the VM_ACCOUNT flag of
the old mapping when a mapping is mremap()'d with the MREMAP_DONTUNMAP
flag set.
While we always clear the VM_LOCKED and VM_LOCKONFAULT flags for the old
mapping (the page tables have been moved, so there is no data that could
possibly be locked in memory), there is no reason to touch any other VMA
flags.
This is because after the move the old mapping is in a state as if it were
freshly mapped. This implies that the attributes of the mapping ought to
remain the same, including whether or not the mapping is accounted.
Link: https://lkml.kernel.org/r/20251013165836.273113-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Fixes: b714ccb02a76 ("mm/mremap: complete refactor of move_vma()")
Cc: Jann Horn <jannh(a)google.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mremap.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
--- a/mm/mremap.c~mm-mremap-correctly-account-old-mapping-after-mremap_dontunmap-remap
+++ a/mm/mremap.c
@@ -1237,10 +1237,10 @@ static int copy_vma_and_data(struct vma_
}
/*
- * Perform final tasks for MADV_DONTUNMAP operation, clearing mlock() and
- * account flags on remaining VMA by convention (it cannot be mlock()'d any
- * longer, as pages in range are no longer mapped), and removing anon_vma_chain
- * links from it (if the entire VMA was copied over).
+ * Perform final tasks for MADV_DONTUNMAP operation, clearing mlock() flag on
+ * remaining VMA by convention (it cannot be mlock()'d any longer, as pages in
+ * range are no longer mapped), and removing anon_vma_chain links from it if the
+ * entire VMA was copied over.
*/
static void dontunmap_complete(struct vma_remap_struct *vrm,
struct vm_area_struct *new_vma)
@@ -1250,11 +1250,8 @@ static void dontunmap_complete(struct vm
unsigned long old_start = vrm->vma->vm_start;
unsigned long old_end = vrm->vma->vm_end;
- /*
- * We always clear VM_LOCKED[ONFAULT] | VM_ACCOUNT on the old
- * vma.
- */
- vm_flags_clear(vrm->vma, VM_LOCKED_MASK | VM_ACCOUNT);
+ /* We always clear VM_LOCKED[ONFAULT] on the old VMA. */
+ vm_flags_clear(vrm->vma, VM_LOCKED_MASK);
/*
* anon_vma links of the old vma is no longer needed after its page
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-mremap-correctly-account-old-mapping-after-mremap_dontunmap-remap.patch
mm-shmem-update-shmem-to-use-mmap_prepare.patch
device-dax-update-devdax-to-use-mmap_prepare.patch
mm-add-vma_desc_size-vma_desc_pages-helpers.patch
relay-update-relay-to-use-mmap_prepare.patch
mm-vma-rename-__mmap_prepare-function-to-avoid-confusion.patch
mm-add-remap_pfn_range_prepare-remap_pfn_range_complete.patch
mm-abstract-io_remap_pfn_range-based-on-pfn.patch
mm-introduce-io_remap_pfn_range_.patch
mm-introduce-io_remap_pfn_range_-fix.patch
mm-add-ability-to-take-further-action-in-vm_area_desc.patch
doc-update-porting-vfs-documentation-for-mmap_prepare-actions.patch
mm-hugetlbfs-update-hugetlbfs-to-use-mmap_prepare.patch
mm-add-shmem_zero_setup_desc.patch
mm-update-mem-char-driver-to-use-mmap_prepare.patch
mm-update-resctl-to-use-mmap_prepare.patch
DbC is currently only enabled back if it's in configured state during
suspend.
If system is suspended after DbC is enabled, but before the device is
properly enumerated by the host, then DbC would not be enabled back in
resume.
Always enable DbC back in resume if it's suspended in enabled,
connected, or configured state
Cc: stable(a)vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-dbgcap.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index 023a8ec6f305..ecda964e018a 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -1392,8 +1392,15 @@ int xhci_dbc_suspend(struct xhci_hcd *xhci)
if (!dbc)
return 0;
- if (dbc->state == DS_CONFIGURED)
+ switch (dbc->state) {
+ case DS_ENABLED:
+ case DS_CONNECTED:
+ case DS_CONFIGURED:
dbc->resume_required = 1;
+ break;
+ default:
+ break;
+ }
xhci_dbc_stop(dbc);
--
2.43.0
DbC may add 1024 bogus bytes to the beginneing of the receiving endpoint
if DbC hw triggers a STALL event before any Transfer Blocks (TRBs) for
incoming data are queued, but driver handles the event after it queued
the TRBs.
This is possible as xHCI DbC hardware may trigger spurious STALL transfer
events even if endpoint is empty. The STALL event contains a pointer
to the stalled TRB, and "remaining" untransferred data length.
As there are no TRBs queued yet the STALL event will just point to first
TRB position of the empty ring, with '0' bytes remaining untransferred.
DbC driver is polling for events, and may not handle the STALL event
before /dev/ttyDBC0 is opened and incoming data TRBs are queued.
The DbC event handler will now assume the first queued TRB (length 1024)
has stalled with '0' bytes remaining untransferred, and copies the data
This race situation can be practically mitigated by making sure the event
handler handles all pending transfer events when DbC reaches configured
state, and only then create dev/ttyDbC0, and start queueing transfers.
The event handler can this way detect the STALL events on empty rings
and discard them before any transfers are queued.
This does in practice solve the issue, but still leaves a small possible
gap for the race to trigger.
We still need a way to distinguish spurious STALLs on empty rings with '0'
bytes remaing, from actual STALL events with all bytes transmitted.
Cc: stable(a)vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-dbgcap.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index 63edf2d8f245..023a8ec6f305 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -892,7 +892,8 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC configured\n");
portsc = readl(&dbc->regs->portsc);
writel(portsc, &dbc->regs->portsc);
- return EVT_GSER;
+ ret = EVT_GSER;
+ break;
}
return EVT_DONE;
@@ -954,7 +955,8 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
break;
case TRB_TYPE(TRB_TRANSFER):
dbc_handle_xfer_event(dbc, evt);
- ret = EVT_XFER_DONE;
+ if (ret != EVT_GSER)
+ ret = EVT_XFER_DONE;
break;
default:
break;
--
2.43.0
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101306-cufflink-fidgeting-4c7b@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Mon, 4 Aug 2025 15:32:40 +0200
Subject: [PATCH] mfd: intel_soc_pmic_chtdc_ti: Set use_single_read
regmap_config flag
Testing has shown that reading multiple registers at once (for 10-bit
ADC values) does not work. Set the use_single_read regmap_config flag
to make regmap split these for us.
This should fix temperature opregion accesses done by
drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for
the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC")
Cc: stable(a)vger.kernel.org
Reviewed-by: Andy Shevchenko <andy(a)kernel.org>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Link: https://lore.kernel.org/r/20250804133240.312383-1-hansg@kernel.org
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
index 4c1a68c9f575..6daf33e07ea0 100644
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c
+++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
@@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
.max_register = 0xff,
+ /* The hardware does not support reading multiple registers at once */
+ .use_single_read = true,
};
static const struct regmap_irq chtdc_ti_irqs[] = {
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 8cfc8cec1b4da88a47c243a11f384baefd092a50
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101340-boned-upright-7693@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cfc8cec1b4da88a47c243a11f384baefd092a50 Mon Sep 17 00:00:00 2001
From: Edward Adam Davis <eadavis(a)qq.com>
Date: Wed, 10 Sep 2025 09:15:27 +0800
Subject: [PATCH] media: mc: Clear minor number before put device
The device minor should not be cleared after the device is released.
Fixes: 9e14868dc952 ("media: mc: Clear minor number reservation at unregistration time")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=031d0cfd7c362817963f
Tested-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis(a)qq.com>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco(a)kernel.org>
diff --git a/drivers/media/mc/mc-devnode.c b/drivers/media/mc/mc-devnode.c
index 0d01cbae98f2..6daa7aa99442 100644
--- a/drivers/media/mc/mc-devnode.c
+++ b/drivers/media/mc/mc-devnode.c
@@ -276,13 +276,10 @@ void media_devnode_unregister(struct media_devnode *devnode)
/* Delete the cdev on this minor as well */
cdev_device_del(&devnode->cdev, &devnode->dev);
devnode->media_dev = NULL;
+ clear_bit(devnode->minor, media_devnode_nums);
mutex_unlock(&media_devnode_lock);
put_device(&devnode->dev);
-
- mutex_lock(&media_devnode_lock);
- clear_bit(devnode->minor, media_devnode_nums);
- mutex_unlock(&media_devnode_lock);
}
/*
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101305-cheek-copartner-c523@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Mon, 4 Aug 2025 15:32:40 +0200
Subject: [PATCH] mfd: intel_soc_pmic_chtdc_ti: Set use_single_read
regmap_config flag
Testing has shown that reading multiple registers at once (for 10-bit
ADC values) does not work. Set the use_single_read regmap_config flag
to make regmap split these for us.
This should fix temperature opregion accesses done by
drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for
the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC")
Cc: stable(a)vger.kernel.org
Reviewed-by: Andy Shevchenko <andy(a)kernel.org>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Link: https://lore.kernel.org/r/20250804133240.312383-1-hansg@kernel.org
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
index 4c1a68c9f575..6daf33e07ea0 100644
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c
+++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
@@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
.max_register = 0xff,
+ /* The hardware does not support reading multiple registers at once */
+ .use_single_read = true,
};
static const struct regmap_irq chtdc_ti_irqs[] = {
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 8cfc8cec1b4da88a47c243a11f384baefd092a50
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101339-polygraph-crept-0130@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cfc8cec1b4da88a47c243a11f384baefd092a50 Mon Sep 17 00:00:00 2001
From: Edward Adam Davis <eadavis(a)qq.com>
Date: Wed, 10 Sep 2025 09:15:27 +0800
Subject: [PATCH] media: mc: Clear minor number before put device
The device minor should not be cleared after the device is released.
Fixes: 9e14868dc952 ("media: mc: Clear minor number reservation at unregistration time")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=031d0cfd7c362817963f
Tested-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis(a)qq.com>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco(a)kernel.org>
diff --git a/drivers/media/mc/mc-devnode.c b/drivers/media/mc/mc-devnode.c
index 0d01cbae98f2..6daa7aa99442 100644
--- a/drivers/media/mc/mc-devnode.c
+++ b/drivers/media/mc/mc-devnode.c
@@ -276,13 +276,10 @@ void media_devnode_unregister(struct media_devnode *devnode)
/* Delete the cdev on this minor as well */
cdev_device_del(&devnode->cdev, &devnode->dev);
devnode->media_dev = NULL;
+ clear_bit(devnode->minor, media_devnode_nums);
mutex_unlock(&media_devnode_lock);
put_device(&devnode->dev);
-
- mutex_lock(&media_devnode_lock);
- clear_bit(devnode->minor, media_devnode_nums);
- mutex_unlock(&media_devnode_lock);
}
/*
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 8cfc8cec1b4da88a47c243a11f384baefd092a50
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101338-swab-rut-c1d4@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cfc8cec1b4da88a47c243a11f384baefd092a50 Mon Sep 17 00:00:00 2001
From: Edward Adam Davis <eadavis(a)qq.com>
Date: Wed, 10 Sep 2025 09:15:27 +0800
Subject: [PATCH] media: mc: Clear minor number before put device
The device minor should not be cleared after the device is released.
Fixes: 9e14868dc952 ("media: mc: Clear minor number reservation at unregistration time")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=031d0cfd7c362817963f
Tested-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis(a)qq.com>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco(a)kernel.org>
diff --git a/drivers/media/mc/mc-devnode.c b/drivers/media/mc/mc-devnode.c
index 0d01cbae98f2..6daa7aa99442 100644
--- a/drivers/media/mc/mc-devnode.c
+++ b/drivers/media/mc/mc-devnode.c
@@ -276,13 +276,10 @@ void media_devnode_unregister(struct media_devnode *devnode)
/* Delete the cdev on this minor as well */
cdev_device_del(&devnode->cdev, &devnode->dev);
devnode->media_dev = NULL;
+ clear_bit(devnode->minor, media_devnode_nums);
mutex_unlock(&media_devnode_lock);
put_device(&devnode->dev);
-
- mutex_lock(&media_devnode_lock);
- clear_bit(devnode->minor, media_devnode_nums);
- mutex_unlock(&media_devnode_lock);
}
/*
From: Brian Norris <briannorris(a)google.com>
When transitioning to D3cold, __pci_set_power_state() will first
transition a device to D3hot. If the device was already in D3hot, this
will add excess work:
(a) read/modify/write PMCSR; and
(b) excess delay (pci_dev_d3_sleep()).
For (b), we already performed the necessary delay on the previous D3hot
entry; this was extra noticeable when evaluating runtime PM transition
latency.
Check whether we're already in the target state before continuing.
Note that __pci_set_power_state() already does this same check for other
state transitions, but D3cold is special because __pci_set_power_state()
converts it to D3hot for the purposes of PMCSR.
This seems to be an oversight in commit 0aacdc957401 ("PCI/PM: Clean up
pci_set_low_power_state()").
Fixes: 0aacdc957401 ("PCI/PM: Clean up pci_set_low_power_state()")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Brian Norris <briannorris(a)google.com>
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
---
drivers/pci/pci.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b0f4d98036cd..7517f1380201 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1539,6 +1539,9 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool
|| (state == PCI_D2 && !dev->d2_support))
return -EIO;
+ if (state == dev->current_state)
+ return 0;
+
pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
if (PCI_POSSIBLE_ERROR(pmcsr)) {
pci_err(dev, "Unable to change power state from %s to %s, device inaccessible\n",
--
2.51.0.618.g983fd99d29-goog
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101322-kept-undone-f6f6@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Fri, 26 Sep 2025 22:59:35 +0100
Subject: [PATCH] Squashfs: reject negative file sizes in squashfs_read_inode()
Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file
system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
[phillip(a)squashfs.org.uk: only need to check 64 bit quantity]
Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk
Fixes: 6545b246a2c8 ("Squashfs: inode operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+f754e01116421e9754b9(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index ddc65d006063..cceae3b78698 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -197,6 +197,10 @@ int squashfs_read_inode(struct inode *inode, long long ino)
goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 8cfc8cec1b4da88a47c243a11f384baefd092a50
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101338-exalted-uncorrupt-96aa@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cfc8cec1b4da88a47c243a11f384baefd092a50 Mon Sep 17 00:00:00 2001
From: Edward Adam Davis <eadavis(a)qq.com>
Date: Wed, 10 Sep 2025 09:15:27 +0800
Subject: [PATCH] media: mc: Clear minor number before put device
The device minor should not be cleared after the device is released.
Fixes: 9e14868dc952 ("media: mc: Clear minor number reservation at unregistration time")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=031d0cfd7c362817963f
Tested-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis(a)qq.com>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco(a)kernel.org>
diff --git a/drivers/media/mc/mc-devnode.c b/drivers/media/mc/mc-devnode.c
index 0d01cbae98f2..6daa7aa99442 100644
--- a/drivers/media/mc/mc-devnode.c
+++ b/drivers/media/mc/mc-devnode.c
@@ -276,13 +276,10 @@ void media_devnode_unregister(struct media_devnode *devnode)
/* Delete the cdev on this minor as well */
cdev_device_del(&devnode->cdev, &devnode->dev);
devnode->media_dev = NULL;
+ clear_bit(devnode->minor, media_devnode_nums);
mutex_unlock(&media_devnode_lock);
put_device(&devnode->dev);
-
- mutex_lock(&media_devnode_lock);
- clear_bit(devnode->minor, media_devnode_nums);
- mutex_unlock(&media_devnode_lock);
}
/*
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101321-ripening-subscript-11b6@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Fri, 26 Sep 2025 22:59:35 +0100
Subject: [PATCH] Squashfs: reject negative file sizes in squashfs_read_inode()
Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file
system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
[phillip(a)squashfs.org.uk: only need to check 64 bit quantity]
Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk
Fixes: 6545b246a2c8 ("Squashfs: inode operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+f754e01116421e9754b9(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index ddc65d006063..cceae3b78698 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -197,6 +197,10 @@ int squashfs_read_inode(struct inode *inode, long long ino)
goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101321-update-disprove-8836@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Fri, 26 Sep 2025 22:59:35 +0100
Subject: [PATCH] Squashfs: reject negative file sizes in squashfs_read_inode()
Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file
system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
[phillip(a)squashfs.org.uk: only need to check 64 bit quantity]
Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk
Fixes: 6545b246a2c8 ("Squashfs: inode operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+f754e01116421e9754b9(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index ddc65d006063..cceae3b78698 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -197,6 +197,10 @@ int squashfs_read_inode(struct inode *inode, long long ino)
goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 3bd5e45c2ce30e239d596becd5db720f7eb83c99
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101311-karate-spur-a795@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3bd5e45c2ce30e239d596becd5db720f7eb83c99 Mon Sep 17 00:00:00 2001
From: Larshin Sergey <Sergey.Larshin(a)kaspersky.com>
Date: Mon, 22 Sep 2025 16:13:58 +0300
Subject: [PATCH] fs: udf: fix OOB read in lengthAllocDescs handling
When parsing Allocation Extent Descriptor, lengthAllocDescs comes from
on-disk data and must be validated against the block size. Crafted or
corrupted images may set lengthAllocDescs so that the total descriptor
length (sizeof(allocExtDesc) + lengthAllocDescs) exceeds the buffer,
leading udf_update_tag() to call crc_itu_t() on out-of-bounds memory and
trigger a KASAN use-after-free read.
BUG: KASAN: use-after-free in crc_itu_t+0x1d5/0x2b0 lib/crc-itu-t.c:60
Read of size 1 at addr ffff888041e7d000 by task syz-executor317/5309
CPU: 0 UID: 0 PID: 5309 Comm: syz-executor317 Not tainted 6.12.0-rc4-syzkaller-00261-g850925a8133c #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
crc_itu_t+0x1d5/0x2b0 lib/crc-itu-t.c:60
udf_update_tag+0x70/0x6a0 fs/udf/misc.c:261
udf_write_aext+0x4d8/0x7b0 fs/udf/inode.c:2179
extent_trunc+0x2f7/0x4a0 fs/udf/truncate.c:46
udf_truncate_tail_extent+0x527/0x7e0 fs/udf/truncate.c:106
udf_release_file+0xc1/0x120 fs/udf/file.c:185
__fput+0x23f/0x880 fs/file_table.c:431
task_work_run+0x24f/0x310 kernel/task_work.c:239
exit_task_work include/linux/task_work.h:43 [inline]
do_exit+0xa2f/0x28e0 kernel/exit.c:939
do_group_exit+0x207/0x2c0 kernel/exit.c:1088
__do_sys_exit_group kernel/exit.c:1099 [inline]
__se_sys_exit_group kernel/exit.c:1097 [inline]
__x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1097
x64_sys_call+0x2634/0x2640 arch/x86/include/generated/asm/syscalls_64.h:232
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Validate the computed total length against epos->bh->b_size.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Reported-by: syzbot+8743fca924afed42f93e(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8743fca924afed42f93e
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Larshin Sergey <Sergey.Larshin(a)kaspersky.com>
Link: https://patch.msgid.link/20250922131358.745579-1-Sergey.Larshin@kaspersky.c…
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/udf/inode.c b/fs/udf/inode.c
index f24aa98e6869..a79d73f28aa7 100644
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -2272,6 +2272,9 @@ int udf_current_aext(struct inode *inode, struct extent_position *epos,
if (check_add_overflow(sizeof(struct allocExtDesc),
le32_to_cpu(header->lengthAllocDescs), &alen))
return -1;
+
+ if (alen > epos->bh->b_size)
+ return -1;
}
switch (iinfo->i_alloc_type) {
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 8cfc8cec1b4da88a47c243a11f384baefd092a50
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101337-robust-deepness-48c6@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cfc8cec1b4da88a47c243a11f384baefd092a50 Mon Sep 17 00:00:00 2001
From: Edward Adam Davis <eadavis(a)qq.com>
Date: Wed, 10 Sep 2025 09:15:27 +0800
Subject: [PATCH] media: mc: Clear minor number before put device
The device minor should not be cleared after the device is released.
Fixes: 9e14868dc952 ("media: mc: Clear minor number reservation at unregistration time")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=031d0cfd7c362817963f
Tested-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis(a)qq.com>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco(a)kernel.org>
diff --git a/drivers/media/mc/mc-devnode.c b/drivers/media/mc/mc-devnode.c
index 0d01cbae98f2..6daa7aa99442 100644
--- a/drivers/media/mc/mc-devnode.c
+++ b/drivers/media/mc/mc-devnode.c
@@ -276,13 +276,10 @@ void media_devnode_unregister(struct media_devnode *devnode)
/* Delete the cdev on this minor as well */
cdev_device_del(&devnode->cdev, &devnode->dev);
devnode->media_dev = NULL;
+ clear_bit(devnode->minor, media_devnode_nums);
mutex_unlock(&media_devnode_lock);
put_device(&devnode->dev);
-
- mutex_lock(&media_devnode_lock);
- clear_bit(devnode->minor, media_devnode_nums);
- mutex_unlock(&media_devnode_lock);
}
/*
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101320-padding-swagger-0208@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Fri, 26 Sep 2025 22:59:35 +0100
Subject: [PATCH] Squashfs: reject negative file sizes in squashfs_read_inode()
Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file
system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
[phillip(a)squashfs.org.uk: only need to check 64 bit quantity]
Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk
Fixes: 6545b246a2c8 ("Squashfs: inode operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+f754e01116421e9754b9(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index ddc65d006063..cceae3b78698 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -197,6 +197,10 @@ int squashfs_read_inode(struct inode *inode, long long ino)
goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 3bd5e45c2ce30e239d596becd5db720f7eb83c99
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101309-prescribe-imprison-4896@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3bd5e45c2ce30e239d596becd5db720f7eb83c99 Mon Sep 17 00:00:00 2001
From: Larshin Sergey <Sergey.Larshin(a)kaspersky.com>
Date: Mon, 22 Sep 2025 16:13:58 +0300
Subject: [PATCH] fs: udf: fix OOB read in lengthAllocDescs handling
When parsing Allocation Extent Descriptor, lengthAllocDescs comes from
on-disk data and must be validated against the block size. Crafted or
corrupted images may set lengthAllocDescs so that the total descriptor
length (sizeof(allocExtDesc) + lengthAllocDescs) exceeds the buffer,
leading udf_update_tag() to call crc_itu_t() on out-of-bounds memory and
trigger a KASAN use-after-free read.
BUG: KASAN: use-after-free in crc_itu_t+0x1d5/0x2b0 lib/crc-itu-t.c:60
Read of size 1 at addr ffff888041e7d000 by task syz-executor317/5309
CPU: 0 UID: 0 PID: 5309 Comm: syz-executor317 Not tainted 6.12.0-rc4-syzkaller-00261-g850925a8133c #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
crc_itu_t+0x1d5/0x2b0 lib/crc-itu-t.c:60
udf_update_tag+0x70/0x6a0 fs/udf/misc.c:261
udf_write_aext+0x4d8/0x7b0 fs/udf/inode.c:2179
extent_trunc+0x2f7/0x4a0 fs/udf/truncate.c:46
udf_truncate_tail_extent+0x527/0x7e0 fs/udf/truncate.c:106
udf_release_file+0xc1/0x120 fs/udf/file.c:185
__fput+0x23f/0x880 fs/file_table.c:431
task_work_run+0x24f/0x310 kernel/task_work.c:239
exit_task_work include/linux/task_work.h:43 [inline]
do_exit+0xa2f/0x28e0 kernel/exit.c:939
do_group_exit+0x207/0x2c0 kernel/exit.c:1088
__do_sys_exit_group kernel/exit.c:1099 [inline]
__se_sys_exit_group kernel/exit.c:1097 [inline]
__x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1097
x64_sys_call+0x2634/0x2640 arch/x86/include/generated/asm/syscalls_64.h:232
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Validate the computed total length against epos->bh->b_size.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Reported-by: syzbot+8743fca924afed42f93e(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8743fca924afed42f93e
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Larshin Sergey <Sergey.Larshin(a)kaspersky.com>
Link: https://patch.msgid.link/20250922131358.745579-1-Sergey.Larshin@kaspersky.c…
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/udf/inode.c b/fs/udf/inode.c
index f24aa98e6869..a79d73f28aa7 100644
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -2272,6 +2272,9 @@ int udf_current_aext(struct inode *inode, struct extent_position *epos,
if (check_add_overflow(sizeof(struct allocExtDesc),
le32_to_cpu(header->lengthAllocDescs), &alen))
return -1;
+
+ if (alen > epos->bh->b_size)
+ return -1;
}
switch (iinfo->i_alloc_type) {
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 8cfc8cec1b4da88a47c243a11f384baefd092a50
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101336-tannery-reverb-5975@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cfc8cec1b4da88a47c243a11f384baefd092a50 Mon Sep 17 00:00:00 2001
From: Edward Adam Davis <eadavis(a)qq.com>
Date: Wed, 10 Sep 2025 09:15:27 +0800
Subject: [PATCH] media: mc: Clear minor number before put device
The device minor should not be cleared after the device is released.
Fixes: 9e14868dc952 ("media: mc: Clear minor number reservation at unregistration time")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=031d0cfd7c362817963f
Tested-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis(a)qq.com>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco(a)kernel.org>
diff --git a/drivers/media/mc/mc-devnode.c b/drivers/media/mc/mc-devnode.c
index 0d01cbae98f2..6daa7aa99442 100644
--- a/drivers/media/mc/mc-devnode.c
+++ b/drivers/media/mc/mc-devnode.c
@@ -276,13 +276,10 @@ void media_devnode_unregister(struct media_devnode *devnode)
/* Delete the cdev on this minor as well */
cdev_device_del(&devnode->cdev, &devnode->dev);
devnode->media_dev = NULL;
+ clear_bit(devnode->minor, media_devnode_nums);
mutex_unlock(&media_devnode_lock);
put_device(&devnode->dev);
-
- mutex_lock(&media_devnode_lock);
- clear_bit(devnode->minor, media_devnode_nums);
- mutex_unlock(&media_devnode_lock);
}
/*
The patch below does not apply to the 6.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.17.y
git checkout FETCH_HEAD
git cherry-pick -x 8cfc8cec1b4da88a47c243a11f384baefd092a50
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101335-script-feeble-03de@gregkh' --subject-prefix 'PATCH 6.17.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cfc8cec1b4da88a47c243a11f384baefd092a50 Mon Sep 17 00:00:00 2001
From: Edward Adam Davis <eadavis(a)qq.com>
Date: Wed, 10 Sep 2025 09:15:27 +0800
Subject: [PATCH] media: mc: Clear minor number before put device
The device minor should not be cleared after the device is released.
Fixes: 9e14868dc952 ("media: mc: Clear minor number reservation at unregistration time")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=031d0cfd7c362817963f
Tested-by: syzbot+031d0cfd7c362817963f(a)syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis(a)qq.com>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco(a)kernel.org>
diff --git a/drivers/media/mc/mc-devnode.c b/drivers/media/mc/mc-devnode.c
index 0d01cbae98f2..6daa7aa99442 100644
--- a/drivers/media/mc/mc-devnode.c
+++ b/drivers/media/mc/mc-devnode.c
@@ -276,13 +276,10 @@ void media_devnode_unregister(struct media_devnode *devnode)
/* Delete the cdev on this minor as well */
cdev_device_del(&devnode->cdev, &devnode->dev);
devnode->media_dev = NULL;
+ clear_bit(devnode->minor, media_devnode_nums);
mutex_unlock(&media_devnode_lock);
put_device(&devnode->dev);
-
- mutex_lock(&media_devnode_lock);
- clear_bit(devnode->minor, media_devnode_nums);
- mutex_unlock(&media_devnode_lock);
}
/*
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101320-sulphate-crafty-c0c4@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Fri, 26 Sep 2025 22:59:35 +0100
Subject: [PATCH] Squashfs: reject negative file sizes in squashfs_read_inode()
Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file
system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
[phillip(a)squashfs.org.uk: only need to check 64 bit quantity]
Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk
Fixes: 6545b246a2c8 ("Squashfs: inode operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+f754e01116421e9754b9(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index ddc65d006063..cceae3b78698 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -197,6 +197,10 @@ int squashfs_read_inode(struct inode *inode, long long ino)
goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 4e65bda8273c938039403144730923e77916a3d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101306-ploy-alkalize-b950@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4e65bda8273c938039403144730923e77916a3d7 Mon Sep 17 00:00:00 2001
From: Ma Ke <make24(a)iscas.ac.cn>
Date: Tue, 23 Sep 2025 14:52:12 +0800
Subject: [PATCH] ASoC: wcd934x: fix error handling in
wcd934x_codec_parse_data()
wcd934x_codec_parse_data() contains a device reference count leak in
of_slim_get_device() where device_find_child() increases the reference
count of the device but this reference is not properly decreased in
the success path. Add put_device() in wcd934x_codec_parse_data() and
add devm_add_action_or_reset() in the probe function, which ensures
that the reference count of the device is correctly managed.
Memory leak in regmap_init_slimbus() as the allocated regmap is not
released when the device is removed. Using devm_regmap_init_slimbus()
instead of regmap_init_slimbus() to ensure automatic regmap cleanup on
device removal.
Calling path: of_slim_get_device() -> of_find_slim_device() ->
device_find_child(). As comment of device_find_child() says, 'NOTE:
you will need to drop the reference with put_device() after use.'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: a61f3b4f476e ("ASoC: wcd934x: add support to wcd9340/wcd9341 codec")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Link: https://patch.msgid.link/20250923065212.26660-1-make24@iscas.ac.cn
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c
index 1bb7e1dc7e6b..e92939068bf7 100644
--- a/sound/soc/codecs/wcd934x.c
+++ b/sound/soc/codecs/wcd934x.c
@@ -5831,6 +5831,13 @@ static const struct snd_soc_component_driver wcd934x_component_drv = {
.endianness = 1,
};
+static void wcd934x_put_device_action(void *data)
+{
+ struct device *dev = data;
+
+ put_device(dev);
+}
+
static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
{
struct device *dev = &wcd->sdev->dev;
@@ -5847,11 +5854,13 @@ static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
return dev_err_probe(dev, -EINVAL, "Unable to get SLIM Interface device\n");
slim_get_logical_addr(wcd->sidev);
- wcd->if_regmap = regmap_init_slimbus(wcd->sidev,
+ wcd->if_regmap = devm_regmap_init_slimbus(wcd->sidev,
&wcd934x_ifc_regmap_config);
- if (IS_ERR(wcd->if_regmap))
+ if (IS_ERR(wcd->if_regmap)) {
+ put_device(&wcd->sidev->dev);
return dev_err_probe(dev, PTR_ERR(wcd->if_regmap),
"Failed to allocate ifc register map\n");
+ }
of_property_read_u32(dev->parent->of_node, "qcom,dmic-sample-rate",
&wcd->dmic_sample_rate);
@@ -5893,6 +5902,10 @@ static int wcd934x_codec_probe(struct platform_device *pdev)
if (ret)
return ret;
+ ret = devm_add_action_or_reset(dev, wcd934x_put_device_action, &wcd->sidev->dev);
+ if (ret)
+ return ret;
+
/* set default rate 9P6MHz */
regmap_update_bits(wcd->regmap, WCD934X_CODEC_RPM_CLK_MCLK_CFG,
WCD934X_CODEC_RPM_CLK_MCLK_CFG_MCLK_MASK,
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101320-sliceable-electable-e1b9@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Fri, 26 Sep 2025 22:59:35 +0100
Subject: [PATCH] Squashfs: reject negative file sizes in squashfs_read_inode()
Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file
system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
[phillip(a)squashfs.org.uk: only need to check 64 bit quantity]
Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk
Fixes: 6545b246a2c8 ("Squashfs: inode operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+f754e01116421e9754b9(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index ddc65d006063..cceae3b78698 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -197,6 +197,10 @@ int squashfs_read_inode(struct inode *inode, long long ino)
goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 4e65bda8273c938039403144730923e77916a3d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101305-until-selector-fc19@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4e65bda8273c938039403144730923e77916a3d7 Mon Sep 17 00:00:00 2001
From: Ma Ke <make24(a)iscas.ac.cn>
Date: Tue, 23 Sep 2025 14:52:12 +0800
Subject: [PATCH] ASoC: wcd934x: fix error handling in
wcd934x_codec_parse_data()
wcd934x_codec_parse_data() contains a device reference count leak in
of_slim_get_device() where device_find_child() increases the reference
count of the device but this reference is not properly decreased in
the success path. Add put_device() in wcd934x_codec_parse_data() and
add devm_add_action_or_reset() in the probe function, which ensures
that the reference count of the device is correctly managed.
Memory leak in regmap_init_slimbus() as the allocated regmap is not
released when the device is removed. Using devm_regmap_init_slimbus()
instead of regmap_init_slimbus() to ensure automatic regmap cleanup on
device removal.
Calling path: of_slim_get_device() -> of_find_slim_device() ->
device_find_child(). As comment of device_find_child() says, 'NOTE:
you will need to drop the reference with put_device() after use.'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: a61f3b4f476e ("ASoC: wcd934x: add support to wcd9340/wcd9341 codec")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Link: https://patch.msgid.link/20250923065212.26660-1-make24@iscas.ac.cn
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c
index 1bb7e1dc7e6b..e92939068bf7 100644
--- a/sound/soc/codecs/wcd934x.c
+++ b/sound/soc/codecs/wcd934x.c
@@ -5831,6 +5831,13 @@ static const struct snd_soc_component_driver wcd934x_component_drv = {
.endianness = 1,
};
+static void wcd934x_put_device_action(void *data)
+{
+ struct device *dev = data;
+
+ put_device(dev);
+}
+
static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
{
struct device *dev = &wcd->sdev->dev;
@@ -5847,11 +5854,13 @@ static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
return dev_err_probe(dev, -EINVAL, "Unable to get SLIM Interface device\n");
slim_get_logical_addr(wcd->sidev);
- wcd->if_regmap = regmap_init_slimbus(wcd->sidev,
+ wcd->if_regmap = devm_regmap_init_slimbus(wcd->sidev,
&wcd934x_ifc_regmap_config);
- if (IS_ERR(wcd->if_regmap))
+ if (IS_ERR(wcd->if_regmap)) {
+ put_device(&wcd->sidev->dev);
return dev_err_probe(dev, PTR_ERR(wcd->if_regmap),
"Failed to allocate ifc register map\n");
+ }
of_property_read_u32(dev->parent->of_node, "qcom,dmic-sample-rate",
&wcd->dmic_sample_rate);
@@ -5893,6 +5902,10 @@ static int wcd934x_codec_probe(struct platform_device *pdev)
if (ret)
return ret;
+ ret = devm_add_action_or_reset(dev, wcd934x_put_device_action, &wcd->sidev->dev);
+ if (ret)
+ return ret;
+
/* set default rate 9P6MHz */
regmap_update_bits(wcd->regmap, WCD934X_CODEC_RPM_CLK_MCLK_CFG,
WCD934X_CODEC_RPM_CLK_MCLK_CFG_MCLK_MASK,
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 4e65bda8273c938039403144730923e77916a3d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101303-agency-job-17ce@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4e65bda8273c938039403144730923e77916a3d7 Mon Sep 17 00:00:00 2001
From: Ma Ke <make24(a)iscas.ac.cn>
Date: Tue, 23 Sep 2025 14:52:12 +0800
Subject: [PATCH] ASoC: wcd934x: fix error handling in
wcd934x_codec_parse_data()
wcd934x_codec_parse_data() contains a device reference count leak in
of_slim_get_device() where device_find_child() increases the reference
count of the device but this reference is not properly decreased in
the success path. Add put_device() in wcd934x_codec_parse_data() and
add devm_add_action_or_reset() in the probe function, which ensures
that the reference count of the device is correctly managed.
Memory leak in regmap_init_slimbus() as the allocated regmap is not
released when the device is removed. Using devm_regmap_init_slimbus()
instead of regmap_init_slimbus() to ensure automatic regmap cleanup on
device removal.
Calling path: of_slim_get_device() -> of_find_slim_device() ->
device_find_child(). As comment of device_find_child() says, 'NOTE:
you will need to drop the reference with put_device() after use.'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: a61f3b4f476e ("ASoC: wcd934x: add support to wcd9340/wcd9341 codec")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Link: https://patch.msgid.link/20250923065212.26660-1-make24@iscas.ac.cn
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c
index 1bb7e1dc7e6b..e92939068bf7 100644
--- a/sound/soc/codecs/wcd934x.c
+++ b/sound/soc/codecs/wcd934x.c
@@ -5831,6 +5831,13 @@ static const struct snd_soc_component_driver wcd934x_component_drv = {
.endianness = 1,
};
+static void wcd934x_put_device_action(void *data)
+{
+ struct device *dev = data;
+
+ put_device(dev);
+}
+
static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
{
struct device *dev = &wcd->sdev->dev;
@@ -5847,11 +5854,13 @@ static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
return dev_err_probe(dev, -EINVAL, "Unable to get SLIM Interface device\n");
slim_get_logical_addr(wcd->sidev);
- wcd->if_regmap = regmap_init_slimbus(wcd->sidev,
+ wcd->if_regmap = devm_regmap_init_slimbus(wcd->sidev,
&wcd934x_ifc_regmap_config);
- if (IS_ERR(wcd->if_regmap))
+ if (IS_ERR(wcd->if_regmap)) {
+ put_device(&wcd->sidev->dev);
return dev_err_probe(dev, PTR_ERR(wcd->if_regmap),
"Failed to allocate ifc register map\n");
+ }
of_property_read_u32(dev->parent->of_node, "qcom,dmic-sample-rate",
&wcd->dmic_sample_rate);
@@ -5893,6 +5902,10 @@ static int wcd934x_codec_probe(struct platform_device *pdev)
if (ret)
return ret;
+ ret = devm_add_action_or_reset(dev, wcd934x_put_device_action, &wcd->sidev->dev);
+ if (ret)
+ return ret;
+
/* set default rate 9P6MHz */
regmap_update_bits(wcd->regmap, WCD934X_CODEC_RPM_CLK_MCLK_CFG,
WCD934X_CODEC_RPM_CLK_MCLK_CFG_MCLK_MASK,
The patch below does not apply to the 6.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.17.y
git checkout FETCH_HEAD
git cherry-pick -x 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101319-anything-blob-1499@gregkh' --subject-prefix 'PATCH 6.17.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Fri, 26 Sep 2025 22:59:35 +0100
Subject: [PATCH] Squashfs: reject negative file sizes in squashfs_read_inode()
Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file
system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
[phillip(a)squashfs.org.uk: only need to check 64 bit quantity]
Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk
Fixes: 6545b246a2c8 ("Squashfs: inode operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+f754e01116421e9754b9(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index ddc65d006063..cceae3b78698 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -197,6 +197,10 @@ int squashfs_read_inode(struct inode *inode, long long ino)
goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 674b56aa57f9379854cb6798c3bbcef7e7b51ab7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101020-earmark-certainly-9bbd@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 674b56aa57f9379854cb6798c3bbcef7e7b51ab7 Mon Sep 17 00:00:00 2001
From: Nalivayko Sergey <Sergey.Nalivayko(a)kaspersky.com>
Date: Tue, 15 Jul 2025 18:48:15 +0300
Subject: [PATCH] net/9p: fix double req put in p9_fd_cancelled
Syzkaller reports a KASAN issue as below:
general protection fault, probably for non-canonical address 0xfbd59c0000000021: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0xdead000000000108-0xdead00000000010f]
CPU: 0 PID: 5083 Comm: syz-executor.2 Not tainted 6.1.134-syzkaller-00037-g855bd1d7d838 #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:__list_del include/linux/list.h:114 [inline]
RIP: 0010:__list_del_entry include/linux/list.h:137 [inline]
RIP: 0010:list_del include/linux/list.h:148 [inline]
RIP: 0010:p9_fd_cancelled+0xe9/0x200 net/9p/trans_fd.c:734
Call Trace:
<TASK>
p9_client_flush+0x351/0x440 net/9p/client.c:614
p9_client_rpc+0xb6b/0xc70 net/9p/client.c:734
p9_client_version net/9p/client.c:920 [inline]
p9_client_create+0xb51/0x1240 net/9p/client.c:1027
v9fs_session_init+0x1f0/0x18f0 fs/9p/v9fs.c:408
v9fs_mount+0xba/0xcb0 fs/9p/vfs_super.c:126
legacy_get_tree+0x108/0x220 fs/fs_context.c:632
vfs_get_tree+0x8e/0x300 fs/super.c:1573
do_new_mount fs/namespace.c:3056 [inline]
path_mount+0x6a6/0x1e90 fs/namespace.c:3386
do_mount fs/namespace.c:3399 [inline]
__do_sys_mount fs/namespace.c:3607 [inline]
__se_sys_mount fs/namespace.c:3584 [inline]
__x64_sys_mount+0x283/0x300 fs/namespace.c:3584
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
This happens because of a race condition between:
- The 9p client sending an invalid flush request and later cleaning it up;
- The 9p client in p9_read_work() canceled all pending requests.
Thread 1 Thread 2
...
p9_client_create()
...
p9_fd_create()
...
p9_conn_create()
...
// start Thread 2
INIT_WORK(&m->rq, p9_read_work);
p9_read_work()
...
p9_client_rpc()
...
...
p9_conn_cancel()
...
spin_lock(&m->req_lock);
...
p9_fd_cancelled()
...
...
spin_unlock(&m->req_lock);
// status rewrite
p9_client_cb(m->client, req, REQ_STATUS_ERROR)
// first remove
list_del(&req->req_list);
...
spin_lock(&m->req_lock)
...
// second remove
list_del(&req->req_list);
spin_unlock(&m->req_lock)
...
Commit 74d6a5d56629 ("9p/trans_fd: Fix concurrency del of req_list in
p9_fd_cancelled/p9_read_work") fixes a concurrency issue in the 9p filesystem
client where the req_list could be deleted simultaneously by both
p9_read_work and p9_fd_cancelled functions, but for the case where req->status
equals REQ_STATUS_RCVD.
Update the check for req->status in p9_fd_cancelled to skip processing not
just received requests, but anything that is not SENT, as whatever
changed the state from SENT also removed the request from its list.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: afd8d6541155 ("9P: Add cancelled() to the transport functions.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Nalivayko Sergey <Sergey.Nalivayko(a)kaspersky.com>
Message-ID: <20250715154815.3501030-1-Sergey.Nalivayko(a)kaspersky.com>
[updated the check from status == RECV || status == ERROR to status != SENT]
Signed-off-by: Dominique Martinet <asmadeus(a)codewreck.org>
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 339ec4e54778..8992d8bebbdd 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -726,10 +726,10 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
p9_debug(P9_DEBUG_TRANS, "client %p req %p\n", client, req);
spin_lock(&m->req_lock);
- /* Ignore cancelled request if message has been received
- * before lock.
- */
- if (req->status == REQ_STATUS_RCVD) {
+ /* Ignore cancelled request if status changed since the request was
+ * processed in p9_client_flush()
+ */
+ if (req->status != REQ_STATUS_SENT) {
spin_unlock(&m->req_lock);
return 0;
}
Drm_sched_job_add_dependency() consumes the fence reference both on
success and failure, so in the latter case the dma_fence_put() on the
error path (xarray failed to expand) is a double free.
Interestingly this bug appears to have been present ever since
ebd5f74255b9 ("drm/sched: Add dependency tracking"), since the code back
then looked like this:
drm_sched_job_add_implicit_dependencies():
...
for (i = 0; i < fence_count; i++) {
ret = drm_sched_job_add_dependency(job, fences[i]);
if (ret)
break;
}
for (; i < fence_count; i++)
dma_fence_put(fences[i]);
Which means for the failing 'i' the dma_fence_put was already a double
free. Possibly there were no users at that time, or the test cases were
insufficient to hit it.
The bug was then only noticed and fixed after
9c2ba265352a ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
landed, with its fixup of
4eaf02d6076c ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies").
At that point it was a slightly different flavour of a double free, which
963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
noticed and attempted to fix.
But it only moved the double free from happening inside the
drm_sched_job_add_dependency(), when releasing the reference not yet
obtained, to the caller, when releasing the reference already released by
the former in the failure case.
As such it is not easy to identify the right target for the fixes tag so
lets keep it simple and just continue the chain.
We also drop the misleading comment about additional reference, since it
is not additional but the only one from the point of view of dependency
tracking.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com>
Fixes: 963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Cc: Christian König <christian.koenig(a)amd.com>
Cc: Rob Clark <robdclark(a)chromium.org>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Danilo Krummrich <dakr(a)kernel.org>
Cc: Philipp Stanner <phasta(a)kernel.org>
Cc: "Christian König" <ckoenig.leichtzumerken(a)gmail.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v5.16+
---
drivers/gpu/drm/scheduler/sched_main.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 46119aacb809..aff34240f230 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -960,20 +960,16 @@ int drm_sched_job_add_resv_dependencies(struct drm_sched_job *job,
{
struct dma_resv_iter cursor;
struct dma_fence *fence;
- int ret;
+ int ret = 0;
dma_resv_assert_held(resv);
dma_resv_for_each_fence(&cursor, resv, usage, fence) {
- /* Make sure to grab an additional ref on the added fence */
- dma_fence_get(fence);
- ret = drm_sched_job_add_dependency(job, fence);
- if (ret) {
- dma_fence_put(fence);
- return ret;
- }
+ ret = drm_sched_job_add_dependency(job, dma_fence_get(fence));
+ if (ret)
+ break;
}
- return 0;
+ return ret;
}
EXPORT_SYMBOL(drm_sched_job_add_resv_dependencies);
--
2.48.0
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 674b56aa57f9379854cb6798c3bbcef7e7b51ab7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101019-zeppelin-polymer-aadc@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 674b56aa57f9379854cb6798c3bbcef7e7b51ab7 Mon Sep 17 00:00:00 2001
From: Nalivayko Sergey <Sergey.Nalivayko(a)kaspersky.com>
Date: Tue, 15 Jul 2025 18:48:15 +0300
Subject: [PATCH] net/9p: fix double req put in p9_fd_cancelled
Syzkaller reports a KASAN issue as below:
general protection fault, probably for non-canonical address 0xfbd59c0000000021: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0xdead000000000108-0xdead00000000010f]
CPU: 0 PID: 5083 Comm: syz-executor.2 Not tainted 6.1.134-syzkaller-00037-g855bd1d7d838 #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:__list_del include/linux/list.h:114 [inline]
RIP: 0010:__list_del_entry include/linux/list.h:137 [inline]
RIP: 0010:list_del include/linux/list.h:148 [inline]
RIP: 0010:p9_fd_cancelled+0xe9/0x200 net/9p/trans_fd.c:734
Call Trace:
<TASK>
p9_client_flush+0x351/0x440 net/9p/client.c:614
p9_client_rpc+0xb6b/0xc70 net/9p/client.c:734
p9_client_version net/9p/client.c:920 [inline]
p9_client_create+0xb51/0x1240 net/9p/client.c:1027
v9fs_session_init+0x1f0/0x18f0 fs/9p/v9fs.c:408
v9fs_mount+0xba/0xcb0 fs/9p/vfs_super.c:126
legacy_get_tree+0x108/0x220 fs/fs_context.c:632
vfs_get_tree+0x8e/0x300 fs/super.c:1573
do_new_mount fs/namespace.c:3056 [inline]
path_mount+0x6a6/0x1e90 fs/namespace.c:3386
do_mount fs/namespace.c:3399 [inline]
__do_sys_mount fs/namespace.c:3607 [inline]
__se_sys_mount fs/namespace.c:3584 [inline]
__x64_sys_mount+0x283/0x300 fs/namespace.c:3584
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
This happens because of a race condition between:
- The 9p client sending an invalid flush request and later cleaning it up;
- The 9p client in p9_read_work() canceled all pending requests.
Thread 1 Thread 2
...
p9_client_create()
...
p9_fd_create()
...
p9_conn_create()
...
// start Thread 2
INIT_WORK(&m->rq, p9_read_work);
p9_read_work()
...
p9_client_rpc()
...
...
p9_conn_cancel()
...
spin_lock(&m->req_lock);
...
p9_fd_cancelled()
...
...
spin_unlock(&m->req_lock);
// status rewrite
p9_client_cb(m->client, req, REQ_STATUS_ERROR)
// first remove
list_del(&req->req_list);
...
spin_lock(&m->req_lock)
...
// second remove
list_del(&req->req_list);
spin_unlock(&m->req_lock)
...
Commit 74d6a5d56629 ("9p/trans_fd: Fix concurrency del of req_list in
p9_fd_cancelled/p9_read_work") fixes a concurrency issue in the 9p filesystem
client where the req_list could be deleted simultaneously by both
p9_read_work and p9_fd_cancelled functions, but for the case where req->status
equals REQ_STATUS_RCVD.
Update the check for req->status in p9_fd_cancelled to skip processing not
just received requests, but anything that is not SENT, as whatever
changed the state from SENT also removed the request from its list.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: afd8d6541155 ("9P: Add cancelled() to the transport functions.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Nalivayko Sergey <Sergey.Nalivayko(a)kaspersky.com>
Message-ID: <20250715154815.3501030-1-Sergey.Nalivayko(a)kaspersky.com>
[updated the check from status == RECV || status == ERROR to status != SENT]
Signed-off-by: Dominique Martinet <asmadeus(a)codewreck.org>
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 339ec4e54778..8992d8bebbdd 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -726,10 +726,10 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
p9_debug(P9_DEBUG_TRANS, "client %p req %p\n", client, req);
spin_lock(&m->req_lock);
- /* Ignore cancelled request if message has been received
- * before lock.
- */
- if (req->status == REQ_STATUS_RCVD) {
+ /* Ignore cancelled request if status changed since the request was
+ * processed in p9_client_flush()
+ */
+ if (req->status != REQ_STATUS_SENT) {
spin_unlock(&m->req_lock);
return 0;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x cde0fbff07eff7e4e0e85fa053fe19a24c86b1e0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101328-ferry-wrought-6a64@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cde0fbff07eff7e4e0e85fa053fe19a24c86b1e0 Mon Sep 17 00:00:00 2001
From: Pierre Gondois <pierre.gondois(a)arm.com>
Date: Fri, 14 Apr 2023 10:14:50 +0200
Subject: [PATCH] cacheinfo: Check cache properties are present in DT
If a Device Tree (DT) is used, the presence of cache properties is
assumed. Not finding any is not considered. For arm64 platforms,
cache information can be fetched from the clidr_el1 register.
Checking whether cache information is available in the DT
allows to switch to using clidr_el1.
init_of_cache_level()
\-of_count_cache_leaves()
will assume there a 2 cache leaves (L1 data/instruction caches), which
can be different from clidr_el1 information.
cache_setup_of_node() tries to read cache properties in the DT.
If there are none, this is considered a success. Knowing no
information was available would allow to switch to using clidr_el1.
Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
Reported-by: Alexandre Ghiti <alexghiti(a)rivosinc.com>
Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
Signed-off-by: Pierre Gondois <pierre.gondois(a)arm.com>
Reviewed-by: Conor Dooley <conor.dooley(a)microchip.com>
Link: https://lore.kernel.org/r/20230414081453.244787-3-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla(a)arm.com>
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index ba14c7872e4a..f16e5a82f0f3 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -78,6 +78,9 @@ bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
}
#ifdef CONFIG_OF
+
+static bool of_check_cache_nodes(struct device_node *np);
+
/* OF properties to query for a given cache type */
struct cache_type_info {
const char *size_prop;
@@ -205,6 +208,11 @@ static int cache_setup_of_node(unsigned int cpu)
return -ENOENT;
}
+ if (!of_check_cache_nodes(np)) {
+ of_node_put(np);
+ return -ENOENT;
+ }
+
prev = np;
while (index < cache_leaves(cpu)) {
@@ -229,6 +237,25 @@ static int cache_setup_of_node(unsigned int cpu)
return 0;
}
+static bool of_check_cache_nodes(struct device_node *np)
+{
+ struct device_node *next;
+
+ if (of_property_present(np, "cache-size") ||
+ of_property_present(np, "i-cache-size") ||
+ of_property_present(np, "d-cache-size") ||
+ of_property_present(np, "cache-unified"))
+ return true;
+
+ next = of_find_next_cache_node(np);
+ if (next) {
+ of_node_put(next);
+ return true;
+ }
+
+ return false;
+}
+
static int of_count_cache_leaves(struct device_node *np)
{
unsigned int leaves = 0;
@@ -260,6 +287,11 @@ int init_of_cache_level(unsigned int cpu)
struct device_node *prev = NULL;
unsigned int levels = 0, leaves, level;
+ if (!of_check_cache_nodes(np)) {
+ of_node_put(np);
+ return -ENOENT;
+ }
+
leaves = of_count_cache_leaves(np);
if (leaves > 0)
levels = 1;
Hi Sasha,
Please do NOT backport commit dd83609b8898 alone to stable. This patch
causes a regression in fallocate(PUNCH_HOLE) operations where pages are
not freed immediately, as reported by Mark Brown.
The fix for this regression is already in linux-next as commit
91a830422707 ("hugetlbfs: check for shareable lock before calling
huge_pmd_unshare()").
Please backport both commits together to avoid introducing the
regression in stable kernels:
- dd83609b88986f4add37c0871c3434310652ebd5 ("hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list")
- 91a830422707a62629fc4fbf8cdc3c8acf56ca64 ("hugetlbfs: check for shareable lock before calling huge_pmd_unshare()")
Thanks,
Deepanshu Kartikey
On Sun, Oct 12, 2025 at 3:57 PM Sasha Levin <sashal(a)kernel.org> wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> gpio: TODO: remove the task for converting to the new line setters
>
> to the 6.17-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> gpio-todo-remove-the-task-for-converting-to-the-new-.patch
> and it can be found in the queue-6.17 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
As per commit message: this is neither a fix nor even a new feature,
this is just a change in the TODO file. Please drop it, this has no
place in stable branches.
Bart
A race condition during gadget teardown can lead to a use-after-free
in usb_gadget_state_work(), as reported by KASAN:
BUG: KASAN: invalid-access in sysfs_notify+0_x_2c/0_x_d0
Workqueue: events usb_gadget_state_work
The fundamental race occurs because a concurrent event (e.g., an
interrupt) can call usb_gadget_set_state() and schedule gadget->work
at any time during the cleanup process in usb_del_gadget().
Commit 399a45e5237c ("usb: gadget: core: flush gadget workqueue after
device removal") attempted to fix this by moving flush_work() to after
device_del(). However, this does not fully solve the race, as a new
work item can still be scheduled *after* flush_work() completes but
before the gadget's memory is freed, leading to the same use-after-free.
This patch fixes the race condition robustly by introducing a 'teardown'
flag and a 'state_lock' spinlock to the usb_gadget struct. The flag is
set during cleanup in usb_del_gadget() *before* calling flush_work() to
prevent any new work from being scheduled once cleanup has commenced.
The scheduling site, usb_gadget_set_state(), now checks this flag under
the lock before queueing the work, thus safely closing the race window.
Fixes: 5702f75375aa9 ("usb: gadget: udc-core: move sysfs_notify() to a workqueue")
Signed-off-by: Jimmy Hu <hhhuuu(a)google.com>
Cc: stable(a)vger.kernel.org
---
drivers/usb/gadget/udc/core.c | 18 +++++++++++++++++-
include/linux/usb/gadget.h | 6 ++++++
2 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index d709e24c1fd4..c4268b76d747 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -1123,8 +1123,13 @@ static void usb_gadget_state_work(struct work_struct *work)
void usb_gadget_set_state(struct usb_gadget *gadget,
enum usb_device_state state)
{
+ unsigned long flags;
+
+ spin_lock_irqsave(&gadget->state_lock, flags);
gadget->state = state;
- schedule_work(&gadget->work);
+ if (!gadget->teardown)
+ schedule_work(&gadget->work);
+ spin_unlock_irqrestore(&gadget->state_lock, flags);
}
EXPORT_SYMBOL_GPL(usb_gadget_set_state);
@@ -1357,6 +1362,9 @@ static void usb_udc_nop_release(struct device *dev)
void usb_initialize_gadget(struct device *parent, struct usb_gadget *gadget,
void (*release)(struct device *dev))
{
+ /* For race-free teardown */
+ spin_lock_init(&gadget->state_lock);
+ gadget->teardown = false;
INIT_WORK(&gadget->work, usb_gadget_state_work);
gadget->dev.parent = parent;
@@ -1531,6 +1539,7 @@ EXPORT_SYMBOL_GPL(usb_add_gadget_udc);
void usb_del_gadget(struct usb_gadget *gadget)
{
struct usb_udc *udc = gadget->udc;
+ unsigned long flags;
if (!udc)
return;
@@ -1544,6 +1553,13 @@ void usb_del_gadget(struct usb_gadget *gadget)
kobject_uevent(&udc->dev.kobj, KOBJ_REMOVE);
sysfs_remove_link(&udc->dev.kobj, "gadget");
device_del(&gadget->dev);
+ /*
+ * Set the teardown flag before flushing the work to prevent new work
+ * from being scheduled while we are cleaning up.
+ */
+ spin_lock_irqsave(&gadget->state_lock, flags);
+ gadget->teardown = true;
+ spin_unlock_irqrestore(&gadget->state_lock, flags);
flush_work(&gadget->work);
ida_free(&gadget_id_numbers, gadget->id_number);
cancel_work_sync(&udc->vbus_work);
diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
index 0f28c5512fcb..8302aeaea82e 100644
--- a/include/linux/usb/gadget.h
+++ b/include/linux/usb/gadget.h
@@ -351,6 +351,9 @@ struct usb_gadget_ops {
* can handle. The UDC must support this and all slower speeds and lower
* number of lanes.
* @state: the state we are now (attached, suspended, configured, etc)
+ * @state_lock: Spinlock protecting the `state` and `teardown` members.
+ * @teardown: True if the device is undergoing teardown, used to prevent
+ * new work from being scheduled during cleanup.
* @name: Identifies the controller hardware type. Used in diagnostics
* and sometimes configuration.
* @dev: Driver model state for this abstract device.
@@ -426,6 +429,9 @@ struct usb_gadget {
enum usb_ssp_rate max_ssp_rate;
enum usb_device_state state;
+ /* For race-free teardown and state management */
+ spinlock_t state_lock;
+ bool teardown;
const char *name;
struct device dev;
unsigned isoch_delay;
--
2.51.0.618.g983fd99d29-goog
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 85afa9ea122dd9d4a2ead104a951d318975dcd25
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101342-copartner-greedless-b2d8@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85afa9ea122dd9d4a2ead104a951d318975dcd25 Mon Sep 17 00:00:00 2001
From: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Date: Tue, 16 Sep 2025 11:57:56 +0900
Subject: [PATCH] PCI: endpoint: pci-epf-test: Add NULL check for DMA channels
before release
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The fields dma_chan_tx and dma_chan_rx of the struct pci_epf_test can be
NULL even after EPF initialization. Then it is prudent to check that
they have non-NULL values before releasing the channels. Add the checks
in pci_epf_test_clean_dma_chan().
Without the checks, NULL pointer dereferences happen and they can lead
to a kernel panic in some cases:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
Call trace:
dma_release_channel+0x2c/0x120 (P)
pci_epf_test_epc_deinit+0x94/0xc0 [pci_epf_test]
pci_epc_deinit_notify+0x74/0xc0
tegra_pcie_ep_pex_rst_irq+0x250/0x5d8
irq_thread_fn+0x34/0xb8
irq_thread+0x18c/0x2e8
kthread+0x14c/0x210
ret_from_fork+0x10/0x20
Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Fixes: 5ebf3fc59bd2 ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
[mani: trimmed the stack trace]
Signed-off-by: Manivannan Sadhasivam <mani(a)kernel.org>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/20250916025756.34807-1-shinichiro.kawasaki@wdc.com
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 09e1b8b46b55..31617772ad51 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -301,15 +301,20 @@ static void pci_epf_test_clean_dma_chan(struct pci_epf_test *epf_test)
if (!epf_test->dma_supported)
return;
- dma_release_channel(epf_test->dma_chan_tx);
- if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
+ if (epf_test->dma_chan_tx) {
+ dma_release_channel(epf_test->dma_chan_tx);
+ if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
+ epf_test->dma_chan_tx = NULL;
+ epf_test->dma_chan_rx = NULL;
+ return;
+ }
epf_test->dma_chan_tx = NULL;
- epf_test->dma_chan_rx = NULL;
- return;
}
- dma_release_channel(epf_test->dma_chan_rx);
- epf_test->dma_chan_rx = NULL;
+ if (epf_test->dma_chan_rx) {
+ dma_release_channel(epf_test->dma_chan_rx);
+ epf_test->dma_chan_rx = NULL;
+ }
}
static void pci_epf_test_print_rate(struct pci_epf_test *epf_test,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 85afa9ea122dd9d4a2ead104a951d318975dcd25
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101341-hurler-salute-f4f9@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85afa9ea122dd9d4a2ead104a951d318975dcd25 Mon Sep 17 00:00:00 2001
From: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Date: Tue, 16 Sep 2025 11:57:56 +0900
Subject: [PATCH] PCI: endpoint: pci-epf-test: Add NULL check for DMA channels
before release
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The fields dma_chan_tx and dma_chan_rx of the struct pci_epf_test can be
NULL even after EPF initialization. Then it is prudent to check that
they have non-NULL values before releasing the channels. Add the checks
in pci_epf_test_clean_dma_chan().
Without the checks, NULL pointer dereferences happen and they can lead
to a kernel panic in some cases:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
Call trace:
dma_release_channel+0x2c/0x120 (P)
pci_epf_test_epc_deinit+0x94/0xc0 [pci_epf_test]
pci_epc_deinit_notify+0x74/0xc0
tegra_pcie_ep_pex_rst_irq+0x250/0x5d8
irq_thread_fn+0x34/0xb8
irq_thread+0x18c/0x2e8
kthread+0x14c/0x210
ret_from_fork+0x10/0x20
Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Fixes: 5ebf3fc59bd2 ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
[mani: trimmed the stack trace]
Signed-off-by: Manivannan Sadhasivam <mani(a)kernel.org>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/20250916025756.34807-1-shinichiro.kawasaki@wdc.com
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 09e1b8b46b55..31617772ad51 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -301,15 +301,20 @@ static void pci_epf_test_clean_dma_chan(struct pci_epf_test *epf_test)
if (!epf_test->dma_supported)
return;
- dma_release_channel(epf_test->dma_chan_tx);
- if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
+ if (epf_test->dma_chan_tx) {
+ dma_release_channel(epf_test->dma_chan_tx);
+ if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) {
+ epf_test->dma_chan_tx = NULL;
+ epf_test->dma_chan_rx = NULL;
+ return;
+ }
epf_test->dma_chan_tx = NULL;
- epf_test->dma_chan_rx = NULL;
- return;
}
- dma_release_channel(epf_test->dma_chan_rx);
- epf_test->dma_chan_rx = NULL;
+ if (epf_test->dma_chan_rx) {
+ dma_release_channel(epf_test->dma_chan_rx);
+ epf_test->dma_chan_rx = NULL;
+ }
}
static void pci_epf_test_print_rate(struct pci_epf_test *epf_test,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x da1ba64176e0138f2bfa96f9e43e8c3640d01e1e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101312-express-attractor-1757@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From da1ba64176e0138f2bfa96f9e43e8c3640d01e1e Mon Sep 17 00:00:00 2001
From: Ling Xu <quic_lxu5(a)quicinc.com>
Date: Fri, 12 Sep 2025 14:12:35 +0100
Subject: [PATCH] misc: fastrpc: fix possible map leak in fastrpc_put_args
copy_to_user() failure would cause an early return without cleaning up
the fdlist, which has been updated by the DSP. This could lead to map
leak. Fix this by redirecting to a cleanup path on failure, ensuring
that all mapped buffers are properly released before returning.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Cc: stable(a)kernel.org
Co-developed-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ling Xu <quic_lxu5(a)quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini(a)kernel.org>
Link: https://lore.kernel.org/r/20250912131236.303102-4-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
index 1815b1e0c607..d950a179bff8 100644
--- a/drivers/misc/fastrpc.c
+++ b/drivers/misc/fastrpc.c
@@ -1085,6 +1085,7 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
struct fastrpc_phy_page *pages;
u64 *fdlist;
int i, inbufs, outbufs, handles;
+ int ret = 0;
inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
outbufs = REMOTE_SCALARS_OUTBUFS(ctx->sc);
@@ -1100,14 +1101,17 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
u64 len = rpra[i].buf.len;
if (!kernel) {
- if (copy_to_user((void __user *)dst, src, len))
- return -EFAULT;
+ if (copy_to_user((void __user *)dst, src, len)) {
+ ret = -EFAULT;
+ goto cleanup_fdlist;
+ }
} else {
memcpy(dst, src, len);
}
}
}
+cleanup_fdlist:
/* Clean up fdlist which is updated by DSP */
for (i = 0; i < FASTRPC_MAX_FDLIST; i++) {
if (!fdlist[i])
@@ -1116,7 +1120,7 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
fastrpc_map_put(mmap);
}
- return 0;
+ return ret;
}
static int fastrpc_invoke_send(struct fastrpc_session_ctx *sctx,
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x da1ba64176e0138f2bfa96f9e43e8c3640d01e1e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101311-unrented-email-bbb3@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From da1ba64176e0138f2bfa96f9e43e8c3640d01e1e Mon Sep 17 00:00:00 2001
From: Ling Xu <quic_lxu5(a)quicinc.com>
Date: Fri, 12 Sep 2025 14:12:35 +0100
Subject: [PATCH] misc: fastrpc: fix possible map leak in fastrpc_put_args
copy_to_user() failure would cause an early return without cleaning up
the fdlist, which has been updated by the DSP. This could lead to map
leak. Fix this by redirecting to a cleanup path on failure, ensuring
that all mapped buffers are properly released before returning.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Cc: stable(a)kernel.org
Co-developed-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ling Xu <quic_lxu5(a)quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini(a)kernel.org>
Link: https://lore.kernel.org/r/20250912131236.303102-4-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
index 1815b1e0c607..d950a179bff8 100644
--- a/drivers/misc/fastrpc.c
+++ b/drivers/misc/fastrpc.c
@@ -1085,6 +1085,7 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
struct fastrpc_phy_page *pages;
u64 *fdlist;
int i, inbufs, outbufs, handles;
+ int ret = 0;
inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
outbufs = REMOTE_SCALARS_OUTBUFS(ctx->sc);
@@ -1100,14 +1101,17 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
u64 len = rpra[i].buf.len;
if (!kernel) {
- if (copy_to_user((void __user *)dst, src, len))
- return -EFAULT;
+ if (copy_to_user((void __user *)dst, src, len)) {
+ ret = -EFAULT;
+ goto cleanup_fdlist;
+ }
} else {
memcpy(dst, src, len);
}
}
}
+cleanup_fdlist:
/* Clean up fdlist which is updated by DSP */
for (i = 0; i < FASTRPC_MAX_FDLIST; i++) {
if (!fdlist[i])
@@ -1116,7 +1120,7 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
fastrpc_map_put(mmap);
}
- return 0;
+ return ret;
}
static int fastrpc_invoke_send(struct fastrpc_session_ctx *sctx,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x da1ba64176e0138f2bfa96f9e43e8c3640d01e1e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101311-user-gratified-0681@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From da1ba64176e0138f2bfa96f9e43e8c3640d01e1e Mon Sep 17 00:00:00 2001
From: Ling Xu <quic_lxu5(a)quicinc.com>
Date: Fri, 12 Sep 2025 14:12:35 +0100
Subject: [PATCH] misc: fastrpc: fix possible map leak in fastrpc_put_args
copy_to_user() failure would cause an early return without cleaning up
the fdlist, which has been updated by the DSP. This could lead to map
leak. Fix this by redirecting to a cleanup path on failure, ensuring
that all mapped buffers are properly released before returning.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Cc: stable(a)kernel.org
Co-developed-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ling Xu <quic_lxu5(a)quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini(a)kernel.org>
Link: https://lore.kernel.org/r/20250912131236.303102-4-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
index 1815b1e0c607..d950a179bff8 100644
--- a/drivers/misc/fastrpc.c
+++ b/drivers/misc/fastrpc.c
@@ -1085,6 +1085,7 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
struct fastrpc_phy_page *pages;
u64 *fdlist;
int i, inbufs, outbufs, handles;
+ int ret = 0;
inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
outbufs = REMOTE_SCALARS_OUTBUFS(ctx->sc);
@@ -1100,14 +1101,17 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
u64 len = rpra[i].buf.len;
if (!kernel) {
- if (copy_to_user((void __user *)dst, src, len))
- return -EFAULT;
+ if (copy_to_user((void __user *)dst, src, len)) {
+ ret = -EFAULT;
+ goto cleanup_fdlist;
+ }
} else {
memcpy(dst, src, len);
}
}
}
+cleanup_fdlist:
/* Clean up fdlist which is updated by DSP */
for (i = 0; i < FASTRPC_MAX_FDLIST; i++) {
if (!fdlist[i])
@@ -1116,7 +1120,7 @@ static int fastrpc_put_args(struct fastrpc_invoke_ctx *ctx,
fastrpc_map_put(mmap);
}
- return 0;
+ return ret;
}
static int fastrpc_invoke_send(struct fastrpc_session_ctx *sctx,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 9031626ade38b092b72638dfe0c6ffce8d8acd43
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101348-drapery-clean-43c1@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9031626ade38b092b72638dfe0c6ffce8d8acd43 Mon Sep 17 00:00:00 2001
From: Ling Xu <quic_lxu5(a)quicinc.com>
Date: Fri, 12 Sep 2025 14:12:34 +0100
Subject: [PATCH] misc: fastrpc: Fix fastrpc_map_lookup operation
Fastrpc driver creates maps for user allocated fd buffers. Before
creating a new map, the map list is checked for any already existing
maps using map fd. Checking with just map fd is not sufficient as the
user can pass offsetted buffer with less size when the map is created
and then a larger size the next time which could result in memory
issues. Check for dma_buf object also when looking up for the map.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Cc: stable(a)kernel.org
Co-developed-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ling Xu <quic_lxu5(a)quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini(a)kernel.org>
Link: https://lore.kernel.org/r/20250912131236.303102-3-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
index 52571916acd4..1815b1e0c607 100644
--- a/drivers/misc/fastrpc.c
+++ b/drivers/misc/fastrpc.c
@@ -367,11 +367,16 @@ static int fastrpc_map_lookup(struct fastrpc_user *fl, int fd,
{
struct fastrpc_session_ctx *sess = fl->sctx;
struct fastrpc_map *map = NULL;
+ struct dma_buf *buf;
int ret = -ENOENT;
+ buf = dma_buf_get(fd);
+ if (IS_ERR(buf))
+ return PTR_ERR(buf);
+
spin_lock(&fl->lock);
list_for_each_entry(map, &fl->maps, node) {
- if (map->fd != fd)
+ if (map->fd != fd || map->buf != buf)
continue;
if (take_ref) {
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 9031626ade38b092b72638dfe0c6ffce8d8acd43
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101347-concept-litigate-de3c@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9031626ade38b092b72638dfe0c6ffce8d8acd43 Mon Sep 17 00:00:00 2001
From: Ling Xu <quic_lxu5(a)quicinc.com>
Date: Fri, 12 Sep 2025 14:12:34 +0100
Subject: [PATCH] misc: fastrpc: Fix fastrpc_map_lookup operation
Fastrpc driver creates maps for user allocated fd buffers. Before
creating a new map, the map list is checked for any already existing
maps using map fd. Checking with just map fd is not sufficient as the
user can pass offsetted buffer with less size when the map is created
and then a larger size the next time which could result in memory
issues. Check for dma_buf object also when looking up for the map.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Cc: stable(a)kernel.org
Co-developed-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ling Xu <quic_lxu5(a)quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini(a)kernel.org>
Link: https://lore.kernel.org/r/20250912131236.303102-3-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
index 52571916acd4..1815b1e0c607 100644
--- a/drivers/misc/fastrpc.c
+++ b/drivers/misc/fastrpc.c
@@ -367,11 +367,16 @@ static int fastrpc_map_lookup(struct fastrpc_user *fl, int fd,
{
struct fastrpc_session_ctx *sess = fl->sctx;
struct fastrpc_map *map = NULL;
+ struct dma_buf *buf;
int ret = -ENOENT;
+ buf = dma_buf_get(fd);
+ if (IS_ERR(buf))
+ return PTR_ERR(buf);
+
spin_lock(&fl->lock);
list_for_each_entry(map, &fl->maps, node) {
- if (map->fd != fd)
+ if (map->fd != fd || map->buf != buf)
continue;
if (take_ref) {
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 9031626ade38b092b72638dfe0c6ffce8d8acd43
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101347-racism-president-87cb@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9031626ade38b092b72638dfe0c6ffce8d8acd43 Mon Sep 17 00:00:00 2001
From: Ling Xu <quic_lxu5(a)quicinc.com>
Date: Fri, 12 Sep 2025 14:12:34 +0100
Subject: [PATCH] misc: fastrpc: Fix fastrpc_map_lookup operation
Fastrpc driver creates maps for user allocated fd buffers. Before
creating a new map, the map list is checked for any already existing
maps using map fd. Checking with just map fd is not sufficient as the
user can pass offsetted buffer with less size when the map is created
and then a larger size the next time which could result in memory
issues. Check for dma_buf object also when looking up for the map.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Cc: stable(a)kernel.org
Co-developed-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
Signed-off-by: Ling Xu <quic_lxu5(a)quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini(a)kernel.org>
Link: https://lore.kernel.org/r/20250912131236.303102-3-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
index 52571916acd4..1815b1e0c607 100644
--- a/drivers/misc/fastrpc.c
+++ b/drivers/misc/fastrpc.c
@@ -367,11 +367,16 @@ static int fastrpc_map_lookup(struct fastrpc_user *fl, int fd,
{
struct fastrpc_session_ctx *sess = fl->sctx;
struct fastrpc_map *map = NULL;
+ struct dma_buf *buf;
int ret = -ENOENT;
+ buf = dma_buf_get(fd);
+ if (IS_ERR(buf))
+ return PTR_ERR(buf);
+
spin_lock(&fl->lock);
list_for_each_entry(map, &fl->maps, node) {
- if (map->fd != fd)
+ if (map->fd != fd || map->buf != buf)
continue;
if (take_ref) {
Hi,
I would like to request backporting 134121bfd99a ("ipvs: Defer ip_vs_ftp
unregister during netns cleanup") to all LTS kernels.
This fixes a UAF vulnerability in IPVS that was introduced since v2.6.39, and
the patch applies cleanly to the LTS kernels.
thanks,
Slavin Liu
Hi,
I’d like to let you know that we have a carefully verified database of 501,564 attendees and 750 exhibitors from IAA MOBILITY 2025.
The file includes: Contact name, Job Title, Business Name, Physical Address, Phone Numbers, Official Email address and many more…
You’ll receive the list within 48 hours, fully opt-in and ready to use.
Q4 Special discount: Get 20% off for a limited time.
If you’d like more information, just reply with “Send me the cost” and I’ll share the details.
Best regards,
Ella Brown
Sr. Marketing Manager
P.S. Not interested? Simply reply with “Unfollow” to opt out.
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 4d6fc29f36341d7795db1d1819b4c15fe9be7b23
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101301-starring-gravel-336e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4d6fc29f36341d7795db1d1819b4c15fe9be7b23 Mon Sep 17 00:00:00 2001
From: Donet Tom <donettom(a)linux.ibm.com>
Date: Wed, 24 Sep 2025 00:16:59 +0530
Subject: [PATCH] mm/ksm: fix incorrect KSM counter handling in mm_struct
during fork
Patch series "mm/ksm: Fix incorrect accounting of KSM counters during
fork", v3.
The first patch in this series fixes the incorrect accounting of KSM
counters such as ksm_merging_pages, ksm_rmap_items, and the global
ksm_zero_pages during fork.
The following patch add a selftest to verify the ksm_merging_pages counter
was updated correctly during fork.
Test Results
============
Without the first patch
-----------------------
# [RUN] test_fork_ksm_merging_page_count
not ok 10 ksm_merging_page in child: 32
With the first patch
--------------------
# [RUN] test_fork_ksm_merging_page_count
ok 10 ksm_merging_pages is not inherited after fork
This patch (of 2):
Currently, the KSM-related counters in `mm_struct`, such as
`ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are inherited
by the child process during fork. This results in inconsistent
accounting.
When a process uses KSM, identical pages are merged and an rmap item is
created for each merged page. The `ksm_merging_pages` and
`ksm_rmap_items` counters are updated accordingly. However, after a fork,
these counters are copied to the child while the corresponding rmap items
are not. As a result, when the child later triggers an unmerge, there are
no rmap items present in the child, so the counters remain stale, leading
to incorrect accounting.
A similar issue exists with `ksm_zero_pages`, which maintains both a
global counter and a per-process counter. During fork, the per-process
counter is inherited by the child, but the global counter is not
incremented. Since the child also references zero pages, the global
counter should be updated as well. Otherwise, during zero-page unmerge,
both the global and per-process counters are decremented, causing the
global counter to become inconsistent.
To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0 during
fork, and the global ksm_zero_pages counter is updated with the
per-process ksm_zero_pages value inherited by the child. This ensures
that KSM statistics remain accurate and reflect the activity of each
process correctly.
Link: https://lkml.kernel.org/r/cover.1758648700.git.donettom@linux.ibm.com
Link: https://lkml.kernel.org/r/7b9870eb67ccc0d79593940d9dbd4a0b39b5d396.17586487…
Fixes: 7609385337a4 ("ksm: count ksm merging pages for each process")
Fixes: cb4df4cae4f2 ("ksm: count allocated ksm rmap_items for each process")
Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM")
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Aboorva Devarajan <aboorvad(a)linux.ibm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Donet Tom <donettom(a)linux.ibm.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list(a)gmail.com>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: xu xin <xu.xin16(a)zte.com.cn>
Cc: <stable(a)vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 22e67ca7cba3..067538fc4d58 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -56,8 +56,14 @@ static inline long mm_ksm_zero_pages(struct mm_struct *mm)
static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
/* Adding mm to ksm is best effort on fork. */
- if (mm_flags_test(MMF_VM_MERGEABLE, oldmm))
+ if (mm_flags_test(MMF_VM_MERGEABLE, oldmm)) {
+ long nr_ksm_zero_pages = atomic_long_read(&mm->ksm_zero_pages);
+
+ mm->ksm_merging_pages = 0;
+ mm->ksm_rmap_items = 0;
+ atomic_long_add(nr_ksm_zero_pages, &ksm_zero_pages);
__ksm_enter(mm);
+ }
}
static inline int ksm_execve(struct mm_struct *mm)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 54b91e54b113d4f15ab023a44f508251db6e22e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101336-mortality-parched-3105@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54b91e54b113d4f15ab023a44f508251db6e22e7 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Sat, 11 Oct 2025 11:20:32 -0400
Subject: [PATCH] tracing: Stop fortify-string from warning in
tracing_mark_raw_write()
The way tracing_mark_raw_write() records its data is that it has the
following structure:
struct {
struct trace_entry;
int id;
char buf[];
};
But memcpy(&entry->id, buf, size) triggers the following warning when the
size is greater than the id:
------------[ cut here ]------------
memcpy: detected field-spanning write (size 6) of single field "&entry->id" at kernel/trace/trace.c:7458 (size 4)
WARNING: CPU: 7 PID: 995 at kernel/trace/trace.c:7458 write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Modules linked in:
CPU: 7 UID: 0 PID: 995 Comm: bash Not tainted 6.17.0-test-00007-g60b82183e78a-dirty #211 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
RIP: 0010:write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Code: 04 00 75 a7 b9 04 00 00 00 48 89 de 48 89 04 24 48 c7 c2 e0 b1 d1 b2 48 c7 c7 40 b2 d1 b2 c6 05 2d 88 6a 04 01 e8 f7 e8 bd ff <0f> 0b 48 8b 04 24 e9 76 ff ff ff 49 8d 7c 24 04 49 8d 5c 24 08 48
RSP: 0018:ffff888104c3fc78 EFLAGS: 00010292
RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffffffff6b363b4 RDI: 0000000000000001
RBP: ffff888100058a00 R08: ffffffffb041d459 R09: ffffed1020987f40
R10: 0000000000000007 R11: 0000000000000001 R12: ffff888100bb9010
R13: 0000000000000000 R14: 00000000000003e3 R15: ffff888134800000
FS: 00007fa61d286740(0000) GS:ffff888286cad000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560d28d509f1 CR3: 00000001047a4006 CR4: 0000000000172ef0
Call Trace:
<TASK>
tracing_mark_raw_write+0x1fe/0x290
? __pfx_tracing_mark_raw_write+0x10/0x10
? security_file_permission+0x50/0xf0
? rw_verify_area+0x6f/0x4b0
vfs_write+0x1d8/0xdd0
? __pfx_vfs_write+0x10/0x10
? __pfx_css_rstat_updated+0x10/0x10
? count_memcg_events+0xd9/0x410
? fdget_pos+0x53/0x5e0
ksys_write+0x182/0x200
? __pfx_ksys_write+0x10/0x10
? do_user_addr_fault+0x4af/0xa30
do_syscall_64+0x63/0x350
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fa61d318687
Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
RSP: 002b:00007ffd87fe0120 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fa61d286740 RCX: 00007fa61d318687
RDX: 0000000000000006 RSI: 0000560d28d509f0 RDI: 0000000000000001
RBP: 0000560d28d509f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000006
R13: 00007fa61d4715c0 R14: 00007fa61d46ee80 R15: 0000000000000000
</TASK>
---[ end trace 0000000000000000 ]---
This is because fortify string sees that the size of entry->id is only 4
bytes, but it is writing more than that. But this is OK as the
dynamic_array is allocated to handle that copy.
The size allocated on the ring buffer was actually a bit too big:
size = sizeof(*entry) + cnt;
But cnt includes the 'id' and the buffer data, so adding cnt to the size
of *entry actually allocates too much on the ring buffer.
Change the allocation to:
size = struct_size(entry, buf, cnt - sizeof(entry->id));
and the memcpy() to unsafe_memcpy() with an added justification.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Link: https://lore.kernel.org/20251011112032.77be18e4@gandalf.local.home
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index bbb89206a891..eb256378e65b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7441,7 +7441,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
ssize_t written;
size_t size;
- size = sizeof(*entry) + cnt;
+ /* cnt includes both the entry->id and the data behind it. */
+ size = struct_size(entry, buf, cnt - sizeof(entry->id));
buffer = tr->array_buffer.buffer;
@@ -7455,7 +7456,10 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
return -EBADF;
entry = ring_buffer_event_data(event);
- memcpy(&entry->id, buf, cnt);
+ unsafe_memcpy(&entry->id, buf, cnt,
+ "id and content already reserved on ring buffer"
+ "'buf' includes the 'id' and the data."
+ "'entry' was allocated with cnt from 'id'.");
written = cnt;
__buffer_unlock_commit(buffer, event);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 54b91e54b113d4f15ab023a44f508251db6e22e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101333-thieving-clanking-79c8@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54b91e54b113d4f15ab023a44f508251db6e22e7 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Sat, 11 Oct 2025 11:20:32 -0400
Subject: [PATCH] tracing: Stop fortify-string from warning in
tracing_mark_raw_write()
The way tracing_mark_raw_write() records its data is that it has the
following structure:
struct {
struct trace_entry;
int id;
char buf[];
};
But memcpy(&entry->id, buf, size) triggers the following warning when the
size is greater than the id:
------------[ cut here ]------------
memcpy: detected field-spanning write (size 6) of single field "&entry->id" at kernel/trace/trace.c:7458 (size 4)
WARNING: CPU: 7 PID: 995 at kernel/trace/trace.c:7458 write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Modules linked in:
CPU: 7 UID: 0 PID: 995 Comm: bash Not tainted 6.17.0-test-00007-g60b82183e78a-dirty #211 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
RIP: 0010:write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Code: 04 00 75 a7 b9 04 00 00 00 48 89 de 48 89 04 24 48 c7 c2 e0 b1 d1 b2 48 c7 c7 40 b2 d1 b2 c6 05 2d 88 6a 04 01 e8 f7 e8 bd ff <0f> 0b 48 8b 04 24 e9 76 ff ff ff 49 8d 7c 24 04 49 8d 5c 24 08 48
RSP: 0018:ffff888104c3fc78 EFLAGS: 00010292
RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffffffff6b363b4 RDI: 0000000000000001
RBP: ffff888100058a00 R08: ffffffffb041d459 R09: ffffed1020987f40
R10: 0000000000000007 R11: 0000000000000001 R12: ffff888100bb9010
R13: 0000000000000000 R14: 00000000000003e3 R15: ffff888134800000
FS: 00007fa61d286740(0000) GS:ffff888286cad000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560d28d509f1 CR3: 00000001047a4006 CR4: 0000000000172ef0
Call Trace:
<TASK>
tracing_mark_raw_write+0x1fe/0x290
? __pfx_tracing_mark_raw_write+0x10/0x10
? security_file_permission+0x50/0xf0
? rw_verify_area+0x6f/0x4b0
vfs_write+0x1d8/0xdd0
? __pfx_vfs_write+0x10/0x10
? __pfx_css_rstat_updated+0x10/0x10
? count_memcg_events+0xd9/0x410
? fdget_pos+0x53/0x5e0
ksys_write+0x182/0x200
? __pfx_ksys_write+0x10/0x10
? do_user_addr_fault+0x4af/0xa30
do_syscall_64+0x63/0x350
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fa61d318687
Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
RSP: 002b:00007ffd87fe0120 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fa61d286740 RCX: 00007fa61d318687
RDX: 0000000000000006 RSI: 0000560d28d509f0 RDI: 0000000000000001
RBP: 0000560d28d509f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000006
R13: 00007fa61d4715c0 R14: 00007fa61d46ee80 R15: 0000000000000000
</TASK>
---[ end trace 0000000000000000 ]---
This is because fortify string sees that the size of entry->id is only 4
bytes, but it is writing more than that. But this is OK as the
dynamic_array is allocated to handle that copy.
The size allocated on the ring buffer was actually a bit too big:
size = sizeof(*entry) + cnt;
But cnt includes the 'id' and the buffer data, so adding cnt to the size
of *entry actually allocates too much on the ring buffer.
Change the allocation to:
size = struct_size(entry, buf, cnt - sizeof(entry->id));
and the memcpy() to unsafe_memcpy() with an added justification.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Link: https://lore.kernel.org/20251011112032.77be18e4@gandalf.local.home
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index bbb89206a891..eb256378e65b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7441,7 +7441,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
ssize_t written;
size_t size;
- size = sizeof(*entry) + cnt;
+ /* cnt includes both the entry->id and the data behind it. */
+ size = struct_size(entry, buf, cnt - sizeof(entry->id));
buffer = tr->array_buffer.buffer;
@@ -7455,7 +7456,10 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
return -EBADF;
entry = ring_buffer_event_data(event);
- memcpy(&entry->id, buf, cnt);
+ unsafe_memcpy(&entry->id, buf, cnt,
+ "id and content already reserved on ring buffer"
+ "'buf' includes the 'id' and the data."
+ "'entry' was allocated with cnt from 'id'.");
written = cnt;
__buffer_unlock_commit(buffer, event);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 54b91e54b113d4f15ab023a44f508251db6e22e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101329-uncorrupt-ruse-3d0e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54b91e54b113d4f15ab023a44f508251db6e22e7 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Sat, 11 Oct 2025 11:20:32 -0400
Subject: [PATCH] tracing: Stop fortify-string from warning in
tracing_mark_raw_write()
The way tracing_mark_raw_write() records its data is that it has the
following structure:
struct {
struct trace_entry;
int id;
char buf[];
};
But memcpy(&entry->id, buf, size) triggers the following warning when the
size is greater than the id:
------------[ cut here ]------------
memcpy: detected field-spanning write (size 6) of single field "&entry->id" at kernel/trace/trace.c:7458 (size 4)
WARNING: CPU: 7 PID: 995 at kernel/trace/trace.c:7458 write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Modules linked in:
CPU: 7 UID: 0 PID: 995 Comm: bash Not tainted 6.17.0-test-00007-g60b82183e78a-dirty #211 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
RIP: 0010:write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Code: 04 00 75 a7 b9 04 00 00 00 48 89 de 48 89 04 24 48 c7 c2 e0 b1 d1 b2 48 c7 c7 40 b2 d1 b2 c6 05 2d 88 6a 04 01 e8 f7 e8 bd ff <0f> 0b 48 8b 04 24 e9 76 ff ff ff 49 8d 7c 24 04 49 8d 5c 24 08 48
RSP: 0018:ffff888104c3fc78 EFLAGS: 00010292
RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffffffff6b363b4 RDI: 0000000000000001
RBP: ffff888100058a00 R08: ffffffffb041d459 R09: ffffed1020987f40
R10: 0000000000000007 R11: 0000000000000001 R12: ffff888100bb9010
R13: 0000000000000000 R14: 00000000000003e3 R15: ffff888134800000
FS: 00007fa61d286740(0000) GS:ffff888286cad000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560d28d509f1 CR3: 00000001047a4006 CR4: 0000000000172ef0
Call Trace:
<TASK>
tracing_mark_raw_write+0x1fe/0x290
? __pfx_tracing_mark_raw_write+0x10/0x10
? security_file_permission+0x50/0xf0
? rw_verify_area+0x6f/0x4b0
vfs_write+0x1d8/0xdd0
? __pfx_vfs_write+0x10/0x10
? __pfx_css_rstat_updated+0x10/0x10
? count_memcg_events+0xd9/0x410
? fdget_pos+0x53/0x5e0
ksys_write+0x182/0x200
? __pfx_ksys_write+0x10/0x10
? do_user_addr_fault+0x4af/0xa30
do_syscall_64+0x63/0x350
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fa61d318687
Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
RSP: 002b:00007ffd87fe0120 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fa61d286740 RCX: 00007fa61d318687
RDX: 0000000000000006 RSI: 0000560d28d509f0 RDI: 0000000000000001
RBP: 0000560d28d509f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000006
R13: 00007fa61d4715c0 R14: 00007fa61d46ee80 R15: 0000000000000000
</TASK>
---[ end trace 0000000000000000 ]---
This is because fortify string sees that the size of entry->id is only 4
bytes, but it is writing more than that. But this is OK as the
dynamic_array is allocated to handle that copy.
The size allocated on the ring buffer was actually a bit too big:
size = sizeof(*entry) + cnt;
But cnt includes the 'id' and the buffer data, so adding cnt to the size
of *entry actually allocates too much on the ring buffer.
Change the allocation to:
size = struct_size(entry, buf, cnt - sizeof(entry->id));
and the memcpy() to unsafe_memcpy() with an added justification.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Link: https://lore.kernel.org/20251011112032.77be18e4@gandalf.local.home
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index bbb89206a891..eb256378e65b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7441,7 +7441,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
ssize_t written;
size_t size;
- size = sizeof(*entry) + cnt;
+ /* cnt includes both the entry->id and the data behind it. */
+ size = struct_size(entry, buf, cnt - sizeof(entry->id));
buffer = tr->array_buffer.buffer;
@@ -7455,7 +7456,10 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
return -EBADF;
entry = ring_buffer_event_data(event);
- memcpy(&entry->id, buf, cnt);
+ unsafe_memcpy(&entry->id, buf, cnt,
+ "id and content already reserved on ring buffer"
+ "'buf' includes the 'id' and the data."
+ "'entry' was allocated with cnt from 'id'.");
written = cnt;
__buffer_unlock_commit(buffer, event);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 54b91e54b113d4f15ab023a44f508251db6e22e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101326-whoever-tutu-7ce1@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54b91e54b113d4f15ab023a44f508251db6e22e7 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Sat, 11 Oct 2025 11:20:32 -0400
Subject: [PATCH] tracing: Stop fortify-string from warning in
tracing_mark_raw_write()
The way tracing_mark_raw_write() records its data is that it has the
following structure:
struct {
struct trace_entry;
int id;
char buf[];
};
But memcpy(&entry->id, buf, size) triggers the following warning when the
size is greater than the id:
------------[ cut here ]------------
memcpy: detected field-spanning write (size 6) of single field "&entry->id" at kernel/trace/trace.c:7458 (size 4)
WARNING: CPU: 7 PID: 995 at kernel/trace/trace.c:7458 write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Modules linked in:
CPU: 7 UID: 0 PID: 995 Comm: bash Not tainted 6.17.0-test-00007-g60b82183e78a-dirty #211 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
RIP: 0010:write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Code: 04 00 75 a7 b9 04 00 00 00 48 89 de 48 89 04 24 48 c7 c2 e0 b1 d1 b2 48 c7 c7 40 b2 d1 b2 c6 05 2d 88 6a 04 01 e8 f7 e8 bd ff <0f> 0b 48 8b 04 24 e9 76 ff ff ff 49 8d 7c 24 04 49 8d 5c 24 08 48
RSP: 0018:ffff888104c3fc78 EFLAGS: 00010292
RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffffffff6b363b4 RDI: 0000000000000001
RBP: ffff888100058a00 R08: ffffffffb041d459 R09: ffffed1020987f40
R10: 0000000000000007 R11: 0000000000000001 R12: ffff888100bb9010
R13: 0000000000000000 R14: 00000000000003e3 R15: ffff888134800000
FS: 00007fa61d286740(0000) GS:ffff888286cad000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560d28d509f1 CR3: 00000001047a4006 CR4: 0000000000172ef0
Call Trace:
<TASK>
tracing_mark_raw_write+0x1fe/0x290
? __pfx_tracing_mark_raw_write+0x10/0x10
? security_file_permission+0x50/0xf0
? rw_verify_area+0x6f/0x4b0
vfs_write+0x1d8/0xdd0
? __pfx_vfs_write+0x10/0x10
? __pfx_css_rstat_updated+0x10/0x10
? count_memcg_events+0xd9/0x410
? fdget_pos+0x53/0x5e0
ksys_write+0x182/0x200
? __pfx_ksys_write+0x10/0x10
? do_user_addr_fault+0x4af/0xa30
do_syscall_64+0x63/0x350
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fa61d318687
Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
RSP: 002b:00007ffd87fe0120 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fa61d286740 RCX: 00007fa61d318687
RDX: 0000000000000006 RSI: 0000560d28d509f0 RDI: 0000000000000001
RBP: 0000560d28d509f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000006
R13: 00007fa61d4715c0 R14: 00007fa61d46ee80 R15: 0000000000000000
</TASK>
---[ end trace 0000000000000000 ]---
This is because fortify string sees that the size of entry->id is only 4
bytes, but it is writing more than that. But this is OK as the
dynamic_array is allocated to handle that copy.
The size allocated on the ring buffer was actually a bit too big:
size = sizeof(*entry) + cnt;
But cnt includes the 'id' and the buffer data, so adding cnt to the size
of *entry actually allocates too much on the ring buffer.
Change the allocation to:
size = struct_size(entry, buf, cnt - sizeof(entry->id));
and the memcpy() to unsafe_memcpy() with an added justification.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Link: https://lore.kernel.org/20251011112032.77be18e4@gandalf.local.home
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index bbb89206a891..eb256378e65b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7441,7 +7441,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
ssize_t written;
size_t size;
- size = sizeof(*entry) + cnt;
+ /* cnt includes both the entry->id and the data behind it. */
+ size = struct_size(entry, buf, cnt - sizeof(entry->id));
buffer = tr->array_buffer.buffer;
@@ -7455,7 +7456,10 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
return -EBADF;
entry = ring_buffer_event_data(event);
- memcpy(&entry->id, buf, cnt);
+ unsafe_memcpy(&entry->id, buf, cnt,
+ "id and content already reserved on ring buffer"
+ "'buf' includes the 'id' and the data."
+ "'entry' was allocated with cnt from 'id'.");
written = cnt;
__buffer_unlock_commit(buffer, event);
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 54b91e54b113d4f15ab023a44f508251db6e22e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101323-remindful-perky-2e2a@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54b91e54b113d4f15ab023a44f508251db6e22e7 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Sat, 11 Oct 2025 11:20:32 -0400
Subject: [PATCH] tracing: Stop fortify-string from warning in
tracing_mark_raw_write()
The way tracing_mark_raw_write() records its data is that it has the
following structure:
struct {
struct trace_entry;
int id;
char buf[];
};
But memcpy(&entry->id, buf, size) triggers the following warning when the
size is greater than the id:
------------[ cut here ]------------
memcpy: detected field-spanning write (size 6) of single field "&entry->id" at kernel/trace/trace.c:7458 (size 4)
WARNING: CPU: 7 PID: 995 at kernel/trace/trace.c:7458 write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Modules linked in:
CPU: 7 UID: 0 PID: 995 Comm: bash Not tainted 6.17.0-test-00007-g60b82183e78a-dirty #211 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
RIP: 0010:write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Code: 04 00 75 a7 b9 04 00 00 00 48 89 de 48 89 04 24 48 c7 c2 e0 b1 d1 b2 48 c7 c7 40 b2 d1 b2 c6 05 2d 88 6a 04 01 e8 f7 e8 bd ff <0f> 0b 48 8b 04 24 e9 76 ff ff ff 49 8d 7c 24 04 49 8d 5c 24 08 48
RSP: 0018:ffff888104c3fc78 EFLAGS: 00010292
RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffffffff6b363b4 RDI: 0000000000000001
RBP: ffff888100058a00 R08: ffffffffb041d459 R09: ffffed1020987f40
R10: 0000000000000007 R11: 0000000000000001 R12: ffff888100bb9010
R13: 0000000000000000 R14: 00000000000003e3 R15: ffff888134800000
FS: 00007fa61d286740(0000) GS:ffff888286cad000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560d28d509f1 CR3: 00000001047a4006 CR4: 0000000000172ef0
Call Trace:
<TASK>
tracing_mark_raw_write+0x1fe/0x290
? __pfx_tracing_mark_raw_write+0x10/0x10
? security_file_permission+0x50/0xf0
? rw_verify_area+0x6f/0x4b0
vfs_write+0x1d8/0xdd0
? __pfx_vfs_write+0x10/0x10
? __pfx_css_rstat_updated+0x10/0x10
? count_memcg_events+0xd9/0x410
? fdget_pos+0x53/0x5e0
ksys_write+0x182/0x200
? __pfx_ksys_write+0x10/0x10
? do_user_addr_fault+0x4af/0xa30
do_syscall_64+0x63/0x350
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fa61d318687
Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
RSP: 002b:00007ffd87fe0120 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fa61d286740 RCX: 00007fa61d318687
RDX: 0000000000000006 RSI: 0000560d28d509f0 RDI: 0000000000000001
RBP: 0000560d28d509f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000006
R13: 00007fa61d4715c0 R14: 00007fa61d46ee80 R15: 0000000000000000
</TASK>
---[ end trace 0000000000000000 ]---
This is because fortify string sees that the size of entry->id is only 4
bytes, but it is writing more than that. But this is OK as the
dynamic_array is allocated to handle that copy.
The size allocated on the ring buffer was actually a bit too big:
size = sizeof(*entry) + cnt;
But cnt includes the 'id' and the buffer data, so adding cnt to the size
of *entry actually allocates too much on the ring buffer.
Change the allocation to:
size = struct_size(entry, buf, cnt - sizeof(entry->id));
and the memcpy() to unsafe_memcpy() with an added justification.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Link: https://lore.kernel.org/20251011112032.77be18e4@gandalf.local.home
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index bbb89206a891..eb256378e65b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7441,7 +7441,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
ssize_t written;
size_t size;
- size = sizeof(*entry) + cnt;
+ /* cnt includes both the entry->id and the data behind it. */
+ size = struct_size(entry, buf, cnt - sizeof(entry->id));
buffer = tr->array_buffer.buffer;
@@ -7455,7 +7456,10 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
return -EBADF;
entry = ring_buffer_event_data(event);
- memcpy(&entry->id, buf, cnt);
+ unsafe_memcpy(&entry->id, buf, cnt,
+ "id and content already reserved on ring buffer"
+ "'buf' includes the 'id' and the data."
+ "'entry' was allocated with cnt from 'id'.");
written = cnt;
__buffer_unlock_commit(buffer, event);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x bda745ee8fbb63330d8f2f2ea4157229a5df959e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101314-sediment-vaporizer-416f@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bda745ee8fbb63330d8f2f2ea4157229a5df959e Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 10 Oct 2025 23:51:42 -0400
Subject: [PATCH] tracing: Fix tracing_mark_raw_write() to use buf and not ubuf
The fix to use a per CPU buffer to read user space tested only the writes
to trace_marker. But it appears that the selftests are missing tests to
the trace_maker_raw file. The trace_maker_raw file is used by applications
that writes data structures and not strings into the file, and the tools
read the raw ring buffer to process the structures it writes.
The fix that reads the per CPU buffers passes the new per CPU buffer to
the trace_marker file writes, but the update to the trace_marker_raw write
read the data from user space into the per CPU buffer, but then still used
then passed the user space address to the function that records the data.
Pass in the per CPU buffer and not the user space address.
TODO: Add a test to better test trace_marker_raw.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Link: https://lore.kernel.org/20251011035243.386098147@kernel.org
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 0fd582651293..bbb89206a891 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7497,12 +7497,12 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
if (tr == &global_trace) {
guard(rcu)();
list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
+ written = write_raw_marker_to_buffer(tr, buf, cnt);
if (written < 0)
break;
}
} else {
- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
+ written = write_raw_marker_to_buffer(tr, buf, cnt);
}
return written;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x bda745ee8fbb63330d8f2f2ea4157229a5df959e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101312-duplex-shimmer-a161@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bda745ee8fbb63330d8f2f2ea4157229a5df959e Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 10 Oct 2025 23:51:42 -0400
Subject: [PATCH] tracing: Fix tracing_mark_raw_write() to use buf and not ubuf
The fix to use a per CPU buffer to read user space tested only the writes
to trace_marker. But it appears that the selftests are missing tests to
the trace_maker_raw file. The trace_maker_raw file is used by applications
that writes data structures and not strings into the file, and the tools
read the raw ring buffer to process the structures it writes.
The fix that reads the per CPU buffers passes the new per CPU buffer to
the trace_marker file writes, but the update to the trace_marker_raw write
read the data from user space into the per CPU buffer, but then still used
then passed the user space address to the function that records the data.
Pass in the per CPU buffer and not the user space address.
TODO: Add a test to better test trace_marker_raw.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Link: https://lore.kernel.org/20251011035243.386098147@kernel.org
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 0fd582651293..bbb89206a891 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7497,12 +7497,12 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
if (tr == &global_trace) {
guard(rcu)();
list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
+ written = write_raw_marker_to_buffer(tr, buf, cnt);
if (written < 0)
break;
}
} else {
- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
+ written = write_raw_marker_to_buffer(tr, buf, cnt);
}
return written;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x bda745ee8fbb63330d8f2f2ea4157229a5df959e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101311-walnut-drivable-cfbc@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bda745ee8fbb63330d8f2f2ea4157229a5df959e Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 10 Oct 2025 23:51:42 -0400
Subject: [PATCH] tracing: Fix tracing_mark_raw_write() to use buf and not ubuf
The fix to use a per CPU buffer to read user space tested only the writes
to trace_marker. But it appears that the selftests are missing tests to
the trace_maker_raw file. The trace_maker_raw file is used by applications
that writes data structures and not strings into the file, and the tools
read the raw ring buffer to process the structures it writes.
The fix that reads the per CPU buffers passes the new per CPU buffer to
the trace_marker file writes, but the update to the trace_marker_raw write
read the data from user space into the per CPU buffer, but then still used
then passed the user space address to the function that records the data.
Pass in the per CPU buffer and not the user space address.
TODO: Add a test to better test trace_marker_raw.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Link: https://lore.kernel.org/20251011035243.386098147@kernel.org
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 0fd582651293..bbb89206a891 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7497,12 +7497,12 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
if (tr == &global_trace) {
guard(rcu)();
list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
+ written = write_raw_marker_to_buffer(tr, buf, cnt);
if (written < 0)
break;
}
} else {
- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
+ written = write_raw_marker_to_buffer(tr, buf, cnt);
}
return written;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x bda745ee8fbb63330d8f2f2ea4157229a5df959e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101310-shrapnel-prune-60b1@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bda745ee8fbb63330d8f2f2ea4157229a5df959e Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 10 Oct 2025 23:51:42 -0400
Subject: [PATCH] tracing: Fix tracing_mark_raw_write() to use buf and not ubuf
The fix to use a per CPU buffer to read user space tested only the writes
to trace_marker. But it appears that the selftests are missing tests to
the trace_maker_raw file. The trace_maker_raw file is used by applications
that writes data structures and not strings into the file, and the tools
read the raw ring buffer to process the structures it writes.
The fix that reads the per CPU buffers passes the new per CPU buffer to
the trace_marker file writes, but the update to the trace_marker_raw write
read the data from user space into the per CPU buffer, but then still used
then passed the user space address to the function that records the data.
Pass in the per CPU buffer and not the user space address.
TODO: Add a test to better test trace_marker_raw.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Link: https://lore.kernel.org/20251011035243.386098147@kernel.org
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 0fd582651293..bbb89206a891 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7497,12 +7497,12 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
if (tr == &global_trace) {
guard(rcu)();
list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
+ written = write_raw_marker_to_buffer(tr, buf, cnt);
if (written < 0)
break;
}
} else {
- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
+ written = write_raw_marker_to_buffer(tr, buf, cnt);
}
return written;
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x bda745ee8fbb63330d8f2f2ea4157229a5df959e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101308-chip-shredding-7707@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bda745ee8fbb63330d8f2f2ea4157229a5df959e Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 10 Oct 2025 23:51:42 -0400
Subject: [PATCH] tracing: Fix tracing_mark_raw_write() to use buf and not ubuf
The fix to use a per CPU buffer to read user space tested only the writes
to trace_marker. But it appears that the selftests are missing tests to
the trace_maker_raw file. The trace_maker_raw file is used by applications
that writes data structures and not strings into the file, and the tools
read the raw ring buffer to process the structures it writes.
The fix that reads the per CPU buffers passes the new per CPU buffer to
the trace_marker file writes, but the update to the trace_marker_raw write
read the data from user space into the per CPU buffer, but then still used
then passed the user space address to the function that records the data.
Pass in the per CPU buffer and not the user space address.
TODO: Add a test to better test trace_marker_raw.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Link: https://lore.kernel.org/20251011035243.386098147@kernel.org
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 0fd582651293..bbb89206a891 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7497,12 +7497,12 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
if (tr == &global_trace) {
guard(rcu)();
list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
+ written = write_raw_marker_to_buffer(tr, buf, cnt);
if (written < 0)
break;
}
} else {
- written = write_raw_marker_to_buffer(tr, ubuf, cnt);
+ written = write_raw_marker_to_buffer(tr, buf, cnt);
}
return written;
Use check_add_overflow() to guard against a potential integer overflow
when adding the binary blob lengths in asymmetric_key_generate_id() and
return -EOVERFLOW accordingly. This prevents a possible buffer overflow
when copying data from potentially malicious X.509 fields that can be
arbitrarily large, such as ASN.1 INTEGER serial numbers, issuer names,
etc.
Also use struct_size() to calculate the number of bytes to allocate for
the new asymmetric key id.
Cc: stable(a)vger.kernel.org
Fixes: 7901c1a8effb ("KEYS: Implement binary asymmetric key ID handling")
Signed-off-by: Thorsten Blum <thorsten.blum(a)linux.dev>
---
Changes in v2:
- Use check_add_overflow() and error out as suggested by Lukas
- Update patch description
- Add Fixes: tag and @stable for backporting
- Link to v1: https://lore.kernel.org/lkml/20251007185220.234611-2-thorsten.blum@linux.de…
---
crypto/asymmetric_keys/asymmetric_type.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/crypto/asymmetric_keys/asymmetric_type.c b/crypto/asymmetric_keys/asymmetric_type.c
index ba2d9d1ea235..bd96f799757d 100644
--- a/crypto/asymmetric_keys/asymmetric_type.c
+++ b/crypto/asymmetric_keys/asymmetric_type.c
@@ -11,6 +11,7 @@
#include <crypto/public_key.h>
#include <linux/seq_file.h>
#include <linux/module.h>
+#include <linux/overflow.h>
#include <linux/slab.h>
#include <linux/ctype.h>
#include <keys/system_keyring.h>
@@ -141,12 +142,14 @@ struct asymmetric_key_id *asymmetric_key_generate_id(const void *val_1,
size_t len_2)
{
struct asymmetric_key_id *kid;
+ size_t len;
- kid = kmalloc(sizeof(struct asymmetric_key_id) + len_1 + len_2,
- GFP_KERNEL);
+ if (check_add_overflow(len_1, len_2, &len))
+ return ERR_PTR(-EOVERFLOW);
+ kid = kmalloc(struct_size(kid, data, len), GFP_KERNEL);
if (!kid)
return ERR_PTR(-ENOMEM);
- kid->len = len_1 + len_2;
+ kid->len = len;
memcpy(kid->data, val_1, len_1);
memcpy(kid->data + len_1, val_2, len_2);
return kid;
--
2.51.0
Hi,
We’re providing access to the verified visitor list for MEDICA 2025, connecting you with professionals and organizations across the global healthcare and medical equipment sector.
Event Details:
Location: Düsseldorf, Germany
Visitors: 80,562
Exhibitors: 5,675
Data Includes:
Name, Job Title, Company, Email, Phone, Website, Address, LinkedIn Profile, and more.
To receive pricing or more details, reply with “Pricing Info.”
Best regards,
Sarah Morgan
Sr. Marketing Manager
If you do not wish to receive further messages, reply “Not Interested.”
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 64cf7d058a005c5c31eb8a0b741f35dc12915d18
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101305-harsh-moisture-93ae@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64cf7d058a005c5c31eb8a0b741f35dc12915d18 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Wed, 8 Oct 2025 12:45:10 -0400
Subject: [PATCH] tracing: Have trace_marker use per-cpu data to read user
space
It was reported that using __copy_from_user_inatomic() can actually
schedule. Which is bad when preemption is disabled. Even though there's
logic to check in_atomic() is set, but this is a nop when the kernel is
configured with PREEMPT_NONE. This is due to page faulting and the code
could schedule with preemption disabled.
Link: https://lore.kernel.org/all/20250819105152.2766363-1-luogengkun@huaweicloud…
The solution was to change the __copy_from_user_inatomic() to
copy_from_user_nofault(). But then it was reported that this caused a
regression in Android. There's several applications writing into
trace_marker() in Android, but now instead of showing the expected data,
it is showing:
tracing_mark_write: <faulted>
After reverting the conversion to copy_from_user_nofault(), Android was
able to get the data again.
Writes to the trace_marker is a way to efficiently and quickly enter data
into the Linux tracing buffer. It takes no locks and was designed to be as
non-intrusive as possible. This means it cannot allocate memory, and must
use pre-allocated data.
A method that is actively being worked on to have faultable system call
tracepoints read user space data is to allocate per CPU buffers, and use
them in the callback. The method uses a technique similar to seqcount.
That is something like this:
preempt_disable();
cpu = smp_processor_id();
buffer = this_cpu_ptr(&pre_allocated_cpu_buffers, cpu);
do {
cnt = nr_context_switches_cpu(cpu);
migrate_disable();
preempt_enable();
ret = copy_from_user(buffer, ptr, size);
preempt_disable();
migrate_enable();
} while (!ret && cnt != nr_context_switches_cpu(cpu));
if (!ret)
ring_buffer_write(buffer);
preempt_enable();
It's a little more involved than that, but the above is the basic logic.
The idea is to acquire the current CPU buffer, disable migration, and then
enable preemption. At this moment, it can safely use copy_from_user().
After reading the data from user space, it disables preemption again. It
then checks to see if there was any new scheduling on this CPU. If there
was, it must assume that the buffer was corrupted by another task. If
there wasn't, then the buffer is still valid as only tasks in preemptable
context can write to this buffer and only those that are running on the
CPU.
By using this method, where trace_marker open allocates the per CPU
buffers, trace_marker writes can access user space and even fault it in,
without having to allocate or take any locks of its own.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Luo Gengkun <luogengkun(a)huaweicloud.com>
Cc: Wattson CI <wattson-external(a)google.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Link: https://lore.kernel.org/20251008124510.6dba541a@gandalf.local.home
Fixes: 3d62ab32df065 ("tracing: Fix tracing_marker may trigger page fault during preempt_disable")
Reported-by: Runping Lai <runpinglai(a)google.com>
Tested-by: Runping Lai <runpinglai(a)google.com>
Closes: https://lore.kernel.org/linux-trace-kernel/20251007003417.3470979-2-runping…
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b3c94fbaf002..0fd582651293 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4791,12 +4791,6 @@ int tracing_single_release_file_tr(struct inode *inode, struct file *filp)
return single_release(inode, filp);
}
-static int tracing_mark_open(struct inode *inode, struct file *filp)
-{
- stream_open(inode, filp);
- return tracing_open_generic_tr(inode, filp);
-}
-
static int tracing_release(struct inode *inode, struct file *file)
{
struct trace_array *tr = inode->i_private;
@@ -7163,7 +7157,7 @@ tracing_free_buffer_release(struct inode *inode, struct file *filp)
#define TRACE_MARKER_MAX_SIZE 4096
-static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user *ubuf,
+static ssize_t write_marker_to_buffer(struct trace_array *tr, const char *buf,
size_t cnt, unsigned long ip)
{
struct ring_buffer_event *event;
@@ -7173,20 +7167,11 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
int meta_size;
ssize_t written;
size_t size;
- int len;
-
-/* Used in tracing_mark_raw_write() as well */
-#define FAULTED_STR "<faulted>"
-#define FAULTED_SIZE (sizeof(FAULTED_STR) - 1) /* '\0' is already accounted for */
meta_size = sizeof(*entry) + 2; /* add '\0' and possible '\n' */
again:
size = cnt + meta_size;
- /* If less than "<faulted>", then make sure we can still add that */
- if (cnt < FAULTED_SIZE)
- size += FAULTED_SIZE - cnt;
-
buffer = tr->array_buffer.buffer;
event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
tracing_gen_ctx());
@@ -7196,9 +7181,6 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
* make it smaller and try again.
*/
if (size > ring_buffer_max_event_size(buffer)) {
- /* cnt < FAULTED size should never be bigger than max */
- if (WARN_ON_ONCE(cnt < FAULTED_SIZE))
- return -EBADF;
cnt = ring_buffer_max_event_size(buffer) - meta_size;
/* The above should only happen once */
if (WARN_ON_ONCE(cnt + meta_size == size))
@@ -7212,14 +7194,8 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
entry = ring_buffer_event_data(event);
entry->ip = ip;
-
- len = copy_from_user_nofault(&entry->buf, ubuf, cnt);
- if (len) {
- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
- cnt = FAULTED_SIZE;
- written = -EFAULT;
- } else
- written = cnt;
+ memcpy(&entry->buf, buf, cnt);
+ written = cnt;
if (tr->trace_marker_file && !list_empty(&tr->trace_marker_file->triggers)) {
/* do not add \n before testing triggers, but add \0 */
@@ -7243,6 +7219,169 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
return written;
}
+struct trace_user_buf {
+ char *buf;
+};
+
+struct trace_user_buf_info {
+ struct trace_user_buf __percpu *tbuf;
+ int ref;
+};
+
+
+static DEFINE_MUTEX(trace_user_buffer_mutex);
+static struct trace_user_buf_info *trace_user_buffer;
+
+static void trace_user_fault_buffer_free(struct trace_user_buf_info *tinfo)
+{
+ char *buf;
+ int cpu;
+
+ for_each_possible_cpu(cpu) {
+ buf = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+ kfree(buf);
+ }
+ free_percpu(tinfo->tbuf);
+ kfree(tinfo);
+}
+
+static int trace_user_fault_buffer_enable(void)
+{
+ struct trace_user_buf_info *tinfo;
+ char *buf;
+ int cpu;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ if (trace_user_buffer) {
+ trace_user_buffer->ref++;
+ return 0;
+ }
+
+ tinfo = kmalloc(sizeof(*tinfo), GFP_KERNEL);
+ if (!tinfo)
+ return -ENOMEM;
+
+ tinfo->tbuf = alloc_percpu(struct trace_user_buf);
+ if (!tinfo->tbuf) {
+ kfree(tinfo);
+ return -ENOMEM;
+ }
+
+ tinfo->ref = 1;
+
+ /* Clear each buffer in case of error */
+ for_each_possible_cpu(cpu) {
+ per_cpu_ptr(tinfo->tbuf, cpu)->buf = NULL;
+ }
+
+ for_each_possible_cpu(cpu) {
+ buf = kmalloc_node(TRACE_MARKER_MAX_SIZE, GFP_KERNEL,
+ cpu_to_node(cpu));
+ if (!buf) {
+ trace_user_fault_buffer_free(tinfo);
+ return -ENOMEM;
+ }
+ per_cpu_ptr(tinfo->tbuf, cpu)->buf = buf;
+ }
+
+ trace_user_buffer = tinfo;
+
+ return 0;
+}
+
+static void trace_user_fault_buffer_disable(void)
+{
+ struct trace_user_buf_info *tinfo;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ tinfo = trace_user_buffer;
+
+ if (WARN_ON_ONCE(!tinfo))
+ return;
+
+ if (--tinfo->ref)
+ return;
+
+ trace_user_fault_buffer_free(tinfo);
+ trace_user_buffer = NULL;
+}
+
+/* Must be called with preemption disabled */
+static char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
+ const char __user *ptr, size_t size,
+ size_t *read_size)
+{
+ int cpu = smp_processor_id();
+ char *buffer = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+ unsigned int cnt;
+ int trys = 0;
+ int ret;
+
+ if (size > TRACE_MARKER_MAX_SIZE)
+ size = TRACE_MARKER_MAX_SIZE;
+ *read_size = 0;
+
+ /*
+ * This acts similar to a seqcount. The per CPU context switches are
+ * recorded, migration is disabled and preemption is enabled. The
+ * read of the user space memory is copied into the per CPU buffer.
+ * Preemption is disabled again, and if the per CPU context switches count
+ * is still the same, it means the buffer has not been corrupted.
+ * If the count is different, it is assumed the buffer is corrupted
+ * and reading must be tried again.
+ */
+
+ do {
+ /*
+ * If for some reason, copy_from_user() always causes a context
+ * switch, this would then cause an infinite loop.
+ * If this task is preempted by another user space task, it
+ * will cause this task to try again. But just in case something
+ * changes where the copying from user space causes another task
+ * to run, prevent this from going into an infinite loop.
+ * 100 tries should be plenty.
+ */
+ if (WARN_ONCE(trys++ > 100, "Error: Too many tries to read user space"))
+ return NULL;
+
+ /* Read the current CPU context switch counter */
+ cnt = nr_context_switches_cpu(cpu);
+
+ /*
+ * Preemption is going to be enabled, but this task must
+ * remain on this CPU.
+ */
+ migrate_disable();
+
+ /*
+ * Now preemption is being enabed and another task can come in
+ * and use the same buffer and corrupt our data.
+ */
+ preempt_enable_notrace();
+
+ ret = __copy_from_user(buffer, ptr, size);
+
+ preempt_disable_notrace();
+ migrate_enable();
+
+ /* if it faulted, no need to test if the buffer was corrupted */
+ if (ret)
+ return NULL;
+
+ /*
+ * Preemption is disabled again, now check the per CPU context
+ * switch counter. If it doesn't match, then another user space
+ * process may have schedule in and corrupted our buffer. In that
+ * case the copying must be retried.
+ */
+ } while (nr_context_switches_cpu(cpu) != cnt);
+
+ *read_size = size;
+ return buffer;
+}
+
static ssize_t
tracing_mark_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *fpos)
@@ -7250,6 +7389,8 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
unsigned long ip;
+ size_t size;
+ char *buf;
if (tracing_disabled)
return -EINVAL;
@@ -7263,6 +7404,16 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (cnt > TRACE_MARKER_MAX_SIZE)
cnt = TRACE_MARKER_MAX_SIZE;
+ /* Must have preemption disabled while having access to the buffer */
+ guard(preempt_notrace)();
+
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ if (!buf)
+ return -EFAULT;
+
+ if (cnt > size)
+ cnt = size;
+
/* The selftests expect this function to be the IP address */
ip = _THIS_IP_;
@@ -7270,32 +7421,27 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (tr == &global_trace) {
guard(rcu)();
list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
+ written = write_marker_to_buffer(tr, buf, cnt, ip);
if (written < 0)
break;
}
} else {
- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
+ written = write_marker_to_buffer(tr, buf, cnt, ip);
}
return written;
}
static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
- const char __user *ubuf, size_t cnt)
+ const char *buf, size_t cnt)
{
struct ring_buffer_event *event;
struct trace_buffer *buffer;
struct raw_data_entry *entry;
ssize_t written;
- int size;
- int len;
-
-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
+ size_t size;
size = sizeof(*entry) + cnt;
- if (cnt < FAULT_SIZE_ID)
- size += FAULT_SIZE_ID - cnt;
buffer = tr->array_buffer.buffer;
@@ -7309,14 +7455,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
return -EBADF;
entry = ring_buffer_event_data(event);
-
- len = copy_from_user_nofault(&entry->id, ubuf, cnt);
- if (len) {
- entry->id = -1;
- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
- written = -EFAULT;
- } else
- written = cnt;
+ memcpy(&entry->id, buf, cnt);
+ written = cnt;
__buffer_unlock_commit(buffer, event);
@@ -7329,8 +7469,8 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
{
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
-
-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
+ size_t size;
+ char *buf;
if (tracing_disabled)
return -EINVAL;
@@ -7342,6 +7482,17 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
if (cnt < sizeof(unsigned int))
return -EINVAL;
+ /* Must have preemption disabled while having access to the buffer */
+ guard(preempt_notrace)();
+
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ if (!buf)
+ return -EFAULT;
+
+ /* raw write is all or nothing */
+ if (cnt > size)
+ return -EINVAL;
+
/* The global trace_marker_raw can go to multiple instances */
if (tr == &global_trace) {
guard(rcu)();
@@ -7357,6 +7508,27 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
return written;
}
+static int tracing_mark_open(struct inode *inode, struct file *filp)
+{
+ int ret;
+
+ ret = trace_user_fault_buffer_enable();
+ if (ret < 0)
+ return ret;
+
+ stream_open(inode, filp);
+ ret = tracing_open_generic_tr(inode, filp);
+ if (ret < 0)
+ trace_user_fault_buffer_disable();
+ return ret;
+}
+
+static int tracing_mark_release(struct inode *inode, struct file *file)
+{
+ trace_user_fault_buffer_disable();
+ return tracing_release_generic_tr(inode, file);
+}
+
static int tracing_clock_show(struct seq_file *m, void *v)
{
struct trace_array *tr = m->private;
@@ -7764,13 +7936,13 @@ static const struct file_operations tracing_free_buffer_fops = {
static const struct file_operations tracing_mark_fops = {
.open = tracing_mark_open,
.write = tracing_mark_write,
- .release = tracing_release_generic_tr,
+ .release = tracing_mark_release,
};
static const struct file_operations tracing_mark_raw_fops = {
.open = tracing_mark_open,
.write = tracing_mark_raw_write,
- .release = tracing_release_generic_tr,
+ .release = tracing_mark_release,
};
static const struct file_operations trace_clock_fops = {
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 64cf7d058a005c5c31eb8a0b741f35dc12915d18
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101303-unsterile-observant-971f@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64cf7d058a005c5c31eb8a0b741f35dc12915d18 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Wed, 8 Oct 2025 12:45:10 -0400
Subject: [PATCH] tracing: Have trace_marker use per-cpu data to read user
space
It was reported that using __copy_from_user_inatomic() can actually
schedule. Which is bad when preemption is disabled. Even though there's
logic to check in_atomic() is set, but this is a nop when the kernel is
configured with PREEMPT_NONE. This is due to page faulting and the code
could schedule with preemption disabled.
Link: https://lore.kernel.org/all/20250819105152.2766363-1-luogengkun@huaweicloud…
The solution was to change the __copy_from_user_inatomic() to
copy_from_user_nofault(). But then it was reported that this caused a
regression in Android. There's several applications writing into
trace_marker() in Android, but now instead of showing the expected data,
it is showing:
tracing_mark_write: <faulted>
After reverting the conversion to copy_from_user_nofault(), Android was
able to get the data again.
Writes to the trace_marker is a way to efficiently and quickly enter data
into the Linux tracing buffer. It takes no locks and was designed to be as
non-intrusive as possible. This means it cannot allocate memory, and must
use pre-allocated data.
A method that is actively being worked on to have faultable system call
tracepoints read user space data is to allocate per CPU buffers, and use
them in the callback. The method uses a technique similar to seqcount.
That is something like this:
preempt_disable();
cpu = smp_processor_id();
buffer = this_cpu_ptr(&pre_allocated_cpu_buffers, cpu);
do {
cnt = nr_context_switches_cpu(cpu);
migrate_disable();
preempt_enable();
ret = copy_from_user(buffer, ptr, size);
preempt_disable();
migrate_enable();
} while (!ret && cnt != nr_context_switches_cpu(cpu));
if (!ret)
ring_buffer_write(buffer);
preempt_enable();
It's a little more involved than that, but the above is the basic logic.
The idea is to acquire the current CPU buffer, disable migration, and then
enable preemption. At this moment, it can safely use copy_from_user().
After reading the data from user space, it disables preemption again. It
then checks to see if there was any new scheduling on this CPU. If there
was, it must assume that the buffer was corrupted by another task. If
there wasn't, then the buffer is still valid as only tasks in preemptable
context can write to this buffer and only those that are running on the
CPU.
By using this method, where trace_marker open allocates the per CPU
buffers, trace_marker writes can access user space and even fault it in,
without having to allocate or take any locks of its own.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Luo Gengkun <luogengkun(a)huaweicloud.com>
Cc: Wattson CI <wattson-external(a)google.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Link: https://lore.kernel.org/20251008124510.6dba541a@gandalf.local.home
Fixes: 3d62ab32df065 ("tracing: Fix tracing_marker may trigger page fault during preempt_disable")
Reported-by: Runping Lai <runpinglai(a)google.com>
Tested-by: Runping Lai <runpinglai(a)google.com>
Closes: https://lore.kernel.org/linux-trace-kernel/20251007003417.3470979-2-runping…
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b3c94fbaf002..0fd582651293 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4791,12 +4791,6 @@ int tracing_single_release_file_tr(struct inode *inode, struct file *filp)
return single_release(inode, filp);
}
-static int tracing_mark_open(struct inode *inode, struct file *filp)
-{
- stream_open(inode, filp);
- return tracing_open_generic_tr(inode, filp);
-}
-
static int tracing_release(struct inode *inode, struct file *file)
{
struct trace_array *tr = inode->i_private;
@@ -7163,7 +7157,7 @@ tracing_free_buffer_release(struct inode *inode, struct file *filp)
#define TRACE_MARKER_MAX_SIZE 4096
-static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user *ubuf,
+static ssize_t write_marker_to_buffer(struct trace_array *tr, const char *buf,
size_t cnt, unsigned long ip)
{
struct ring_buffer_event *event;
@@ -7173,20 +7167,11 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
int meta_size;
ssize_t written;
size_t size;
- int len;
-
-/* Used in tracing_mark_raw_write() as well */
-#define FAULTED_STR "<faulted>"
-#define FAULTED_SIZE (sizeof(FAULTED_STR) - 1) /* '\0' is already accounted for */
meta_size = sizeof(*entry) + 2; /* add '\0' and possible '\n' */
again:
size = cnt + meta_size;
- /* If less than "<faulted>", then make sure we can still add that */
- if (cnt < FAULTED_SIZE)
- size += FAULTED_SIZE - cnt;
-
buffer = tr->array_buffer.buffer;
event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
tracing_gen_ctx());
@@ -7196,9 +7181,6 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
* make it smaller and try again.
*/
if (size > ring_buffer_max_event_size(buffer)) {
- /* cnt < FAULTED size should never be bigger than max */
- if (WARN_ON_ONCE(cnt < FAULTED_SIZE))
- return -EBADF;
cnt = ring_buffer_max_event_size(buffer) - meta_size;
/* The above should only happen once */
if (WARN_ON_ONCE(cnt + meta_size == size))
@@ -7212,14 +7194,8 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
entry = ring_buffer_event_data(event);
entry->ip = ip;
-
- len = copy_from_user_nofault(&entry->buf, ubuf, cnt);
- if (len) {
- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
- cnt = FAULTED_SIZE;
- written = -EFAULT;
- } else
- written = cnt;
+ memcpy(&entry->buf, buf, cnt);
+ written = cnt;
if (tr->trace_marker_file && !list_empty(&tr->trace_marker_file->triggers)) {
/* do not add \n before testing triggers, but add \0 */
@@ -7243,6 +7219,169 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
return written;
}
+struct trace_user_buf {
+ char *buf;
+};
+
+struct trace_user_buf_info {
+ struct trace_user_buf __percpu *tbuf;
+ int ref;
+};
+
+
+static DEFINE_MUTEX(trace_user_buffer_mutex);
+static struct trace_user_buf_info *trace_user_buffer;
+
+static void trace_user_fault_buffer_free(struct trace_user_buf_info *tinfo)
+{
+ char *buf;
+ int cpu;
+
+ for_each_possible_cpu(cpu) {
+ buf = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+ kfree(buf);
+ }
+ free_percpu(tinfo->tbuf);
+ kfree(tinfo);
+}
+
+static int trace_user_fault_buffer_enable(void)
+{
+ struct trace_user_buf_info *tinfo;
+ char *buf;
+ int cpu;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ if (trace_user_buffer) {
+ trace_user_buffer->ref++;
+ return 0;
+ }
+
+ tinfo = kmalloc(sizeof(*tinfo), GFP_KERNEL);
+ if (!tinfo)
+ return -ENOMEM;
+
+ tinfo->tbuf = alloc_percpu(struct trace_user_buf);
+ if (!tinfo->tbuf) {
+ kfree(tinfo);
+ return -ENOMEM;
+ }
+
+ tinfo->ref = 1;
+
+ /* Clear each buffer in case of error */
+ for_each_possible_cpu(cpu) {
+ per_cpu_ptr(tinfo->tbuf, cpu)->buf = NULL;
+ }
+
+ for_each_possible_cpu(cpu) {
+ buf = kmalloc_node(TRACE_MARKER_MAX_SIZE, GFP_KERNEL,
+ cpu_to_node(cpu));
+ if (!buf) {
+ trace_user_fault_buffer_free(tinfo);
+ return -ENOMEM;
+ }
+ per_cpu_ptr(tinfo->tbuf, cpu)->buf = buf;
+ }
+
+ trace_user_buffer = tinfo;
+
+ return 0;
+}
+
+static void trace_user_fault_buffer_disable(void)
+{
+ struct trace_user_buf_info *tinfo;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ tinfo = trace_user_buffer;
+
+ if (WARN_ON_ONCE(!tinfo))
+ return;
+
+ if (--tinfo->ref)
+ return;
+
+ trace_user_fault_buffer_free(tinfo);
+ trace_user_buffer = NULL;
+}
+
+/* Must be called with preemption disabled */
+static char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
+ const char __user *ptr, size_t size,
+ size_t *read_size)
+{
+ int cpu = smp_processor_id();
+ char *buffer = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+ unsigned int cnt;
+ int trys = 0;
+ int ret;
+
+ if (size > TRACE_MARKER_MAX_SIZE)
+ size = TRACE_MARKER_MAX_SIZE;
+ *read_size = 0;
+
+ /*
+ * This acts similar to a seqcount. The per CPU context switches are
+ * recorded, migration is disabled and preemption is enabled. The
+ * read of the user space memory is copied into the per CPU buffer.
+ * Preemption is disabled again, and if the per CPU context switches count
+ * is still the same, it means the buffer has not been corrupted.
+ * If the count is different, it is assumed the buffer is corrupted
+ * and reading must be tried again.
+ */
+
+ do {
+ /*
+ * If for some reason, copy_from_user() always causes a context
+ * switch, this would then cause an infinite loop.
+ * If this task is preempted by another user space task, it
+ * will cause this task to try again. But just in case something
+ * changes where the copying from user space causes another task
+ * to run, prevent this from going into an infinite loop.
+ * 100 tries should be plenty.
+ */
+ if (WARN_ONCE(trys++ > 100, "Error: Too many tries to read user space"))
+ return NULL;
+
+ /* Read the current CPU context switch counter */
+ cnt = nr_context_switches_cpu(cpu);
+
+ /*
+ * Preemption is going to be enabled, but this task must
+ * remain on this CPU.
+ */
+ migrate_disable();
+
+ /*
+ * Now preemption is being enabed and another task can come in
+ * and use the same buffer and corrupt our data.
+ */
+ preempt_enable_notrace();
+
+ ret = __copy_from_user(buffer, ptr, size);
+
+ preempt_disable_notrace();
+ migrate_enable();
+
+ /* if it faulted, no need to test if the buffer was corrupted */
+ if (ret)
+ return NULL;
+
+ /*
+ * Preemption is disabled again, now check the per CPU context
+ * switch counter. If it doesn't match, then another user space
+ * process may have schedule in and corrupted our buffer. In that
+ * case the copying must be retried.
+ */
+ } while (nr_context_switches_cpu(cpu) != cnt);
+
+ *read_size = size;
+ return buffer;
+}
+
static ssize_t
tracing_mark_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *fpos)
@@ -7250,6 +7389,8 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
unsigned long ip;
+ size_t size;
+ char *buf;
if (tracing_disabled)
return -EINVAL;
@@ -7263,6 +7404,16 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (cnt > TRACE_MARKER_MAX_SIZE)
cnt = TRACE_MARKER_MAX_SIZE;
+ /* Must have preemption disabled while having access to the buffer */
+ guard(preempt_notrace)();
+
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ if (!buf)
+ return -EFAULT;
+
+ if (cnt > size)
+ cnt = size;
+
/* The selftests expect this function to be the IP address */
ip = _THIS_IP_;
@@ -7270,32 +7421,27 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (tr == &global_trace) {
guard(rcu)();
list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
+ written = write_marker_to_buffer(tr, buf, cnt, ip);
if (written < 0)
break;
}
} else {
- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
+ written = write_marker_to_buffer(tr, buf, cnt, ip);
}
return written;
}
static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
- const char __user *ubuf, size_t cnt)
+ const char *buf, size_t cnt)
{
struct ring_buffer_event *event;
struct trace_buffer *buffer;
struct raw_data_entry *entry;
ssize_t written;
- int size;
- int len;
-
-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
+ size_t size;
size = sizeof(*entry) + cnt;
- if (cnt < FAULT_SIZE_ID)
- size += FAULT_SIZE_ID - cnt;
buffer = tr->array_buffer.buffer;
@@ -7309,14 +7455,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
return -EBADF;
entry = ring_buffer_event_data(event);
-
- len = copy_from_user_nofault(&entry->id, ubuf, cnt);
- if (len) {
- entry->id = -1;
- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
- written = -EFAULT;
- } else
- written = cnt;
+ memcpy(&entry->id, buf, cnt);
+ written = cnt;
__buffer_unlock_commit(buffer, event);
@@ -7329,8 +7469,8 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
{
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
-
-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
+ size_t size;
+ char *buf;
if (tracing_disabled)
return -EINVAL;
@@ -7342,6 +7482,17 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
if (cnt < sizeof(unsigned int))
return -EINVAL;
+ /* Must have preemption disabled while having access to the buffer */
+ guard(preempt_notrace)();
+
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ if (!buf)
+ return -EFAULT;
+
+ /* raw write is all or nothing */
+ if (cnt > size)
+ return -EINVAL;
+
/* The global trace_marker_raw can go to multiple instances */
if (tr == &global_trace) {
guard(rcu)();
@@ -7357,6 +7508,27 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
return written;
}
+static int tracing_mark_open(struct inode *inode, struct file *filp)
+{
+ int ret;
+
+ ret = trace_user_fault_buffer_enable();
+ if (ret < 0)
+ return ret;
+
+ stream_open(inode, filp);
+ ret = tracing_open_generic_tr(inode, filp);
+ if (ret < 0)
+ trace_user_fault_buffer_disable();
+ return ret;
+}
+
+static int tracing_mark_release(struct inode *inode, struct file *file)
+{
+ trace_user_fault_buffer_disable();
+ return tracing_release_generic_tr(inode, file);
+}
+
static int tracing_clock_show(struct seq_file *m, void *v)
{
struct trace_array *tr = m->private;
@@ -7764,13 +7936,13 @@ static const struct file_operations tracing_free_buffer_fops = {
static const struct file_operations tracing_mark_fops = {
.open = tracing_mark_open,
.write = tracing_mark_write,
- .release = tracing_release_generic_tr,
+ .release = tracing_mark_release,
};
static const struct file_operations tracing_mark_raw_fops = {
.open = tracing_mark_open,
.write = tracing_mark_raw_write,
- .release = tracing_release_generic_tr,
+ .release = tracing_mark_release,
};
static const struct file_operations trace_clock_fops = {
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 64cf7d058a005c5c31eb8a0b741f35dc12915d18
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101300-unburned-stage-ee62@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64cf7d058a005c5c31eb8a0b741f35dc12915d18 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Wed, 8 Oct 2025 12:45:10 -0400
Subject: [PATCH] tracing: Have trace_marker use per-cpu data to read user
space
It was reported that using __copy_from_user_inatomic() can actually
schedule. Which is bad when preemption is disabled. Even though there's
logic to check in_atomic() is set, but this is a nop when the kernel is
configured with PREEMPT_NONE. This is due to page faulting and the code
could schedule with preemption disabled.
Link: https://lore.kernel.org/all/20250819105152.2766363-1-luogengkun@huaweicloud…
The solution was to change the __copy_from_user_inatomic() to
copy_from_user_nofault(). But then it was reported that this caused a
regression in Android. There's several applications writing into
trace_marker() in Android, but now instead of showing the expected data,
it is showing:
tracing_mark_write: <faulted>
After reverting the conversion to copy_from_user_nofault(), Android was
able to get the data again.
Writes to the trace_marker is a way to efficiently and quickly enter data
into the Linux tracing buffer. It takes no locks and was designed to be as
non-intrusive as possible. This means it cannot allocate memory, and must
use pre-allocated data.
A method that is actively being worked on to have faultable system call
tracepoints read user space data is to allocate per CPU buffers, and use
them in the callback. The method uses a technique similar to seqcount.
That is something like this:
preempt_disable();
cpu = smp_processor_id();
buffer = this_cpu_ptr(&pre_allocated_cpu_buffers, cpu);
do {
cnt = nr_context_switches_cpu(cpu);
migrate_disable();
preempt_enable();
ret = copy_from_user(buffer, ptr, size);
preempt_disable();
migrate_enable();
} while (!ret && cnt != nr_context_switches_cpu(cpu));
if (!ret)
ring_buffer_write(buffer);
preempt_enable();
It's a little more involved than that, but the above is the basic logic.
The idea is to acquire the current CPU buffer, disable migration, and then
enable preemption. At this moment, it can safely use copy_from_user().
After reading the data from user space, it disables preemption again. It
then checks to see if there was any new scheduling on this CPU. If there
was, it must assume that the buffer was corrupted by another task. If
there wasn't, then the buffer is still valid as only tasks in preemptable
context can write to this buffer and only those that are running on the
CPU.
By using this method, where trace_marker open allocates the per CPU
buffers, trace_marker writes can access user space and even fault it in,
without having to allocate or take any locks of its own.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Luo Gengkun <luogengkun(a)huaweicloud.com>
Cc: Wattson CI <wattson-external(a)google.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Link: https://lore.kernel.org/20251008124510.6dba541a@gandalf.local.home
Fixes: 3d62ab32df065 ("tracing: Fix tracing_marker may trigger page fault during preempt_disable")
Reported-by: Runping Lai <runpinglai(a)google.com>
Tested-by: Runping Lai <runpinglai(a)google.com>
Closes: https://lore.kernel.org/linux-trace-kernel/20251007003417.3470979-2-runping…
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b3c94fbaf002..0fd582651293 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4791,12 +4791,6 @@ int tracing_single_release_file_tr(struct inode *inode, struct file *filp)
return single_release(inode, filp);
}
-static int tracing_mark_open(struct inode *inode, struct file *filp)
-{
- stream_open(inode, filp);
- return tracing_open_generic_tr(inode, filp);
-}
-
static int tracing_release(struct inode *inode, struct file *file)
{
struct trace_array *tr = inode->i_private;
@@ -7163,7 +7157,7 @@ tracing_free_buffer_release(struct inode *inode, struct file *filp)
#define TRACE_MARKER_MAX_SIZE 4096
-static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user *ubuf,
+static ssize_t write_marker_to_buffer(struct trace_array *tr, const char *buf,
size_t cnt, unsigned long ip)
{
struct ring_buffer_event *event;
@@ -7173,20 +7167,11 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
int meta_size;
ssize_t written;
size_t size;
- int len;
-
-/* Used in tracing_mark_raw_write() as well */
-#define FAULTED_STR "<faulted>"
-#define FAULTED_SIZE (sizeof(FAULTED_STR) - 1) /* '\0' is already accounted for */
meta_size = sizeof(*entry) + 2; /* add '\0' and possible '\n' */
again:
size = cnt + meta_size;
- /* If less than "<faulted>", then make sure we can still add that */
- if (cnt < FAULTED_SIZE)
- size += FAULTED_SIZE - cnt;
-
buffer = tr->array_buffer.buffer;
event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
tracing_gen_ctx());
@@ -7196,9 +7181,6 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
* make it smaller and try again.
*/
if (size > ring_buffer_max_event_size(buffer)) {
- /* cnt < FAULTED size should never be bigger than max */
- if (WARN_ON_ONCE(cnt < FAULTED_SIZE))
- return -EBADF;
cnt = ring_buffer_max_event_size(buffer) - meta_size;
/* The above should only happen once */
if (WARN_ON_ONCE(cnt + meta_size == size))
@@ -7212,14 +7194,8 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
entry = ring_buffer_event_data(event);
entry->ip = ip;
-
- len = copy_from_user_nofault(&entry->buf, ubuf, cnt);
- if (len) {
- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
- cnt = FAULTED_SIZE;
- written = -EFAULT;
- } else
- written = cnt;
+ memcpy(&entry->buf, buf, cnt);
+ written = cnt;
if (tr->trace_marker_file && !list_empty(&tr->trace_marker_file->triggers)) {
/* do not add \n before testing triggers, but add \0 */
@@ -7243,6 +7219,169 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
return written;
}
+struct trace_user_buf {
+ char *buf;
+};
+
+struct trace_user_buf_info {
+ struct trace_user_buf __percpu *tbuf;
+ int ref;
+};
+
+
+static DEFINE_MUTEX(trace_user_buffer_mutex);
+static struct trace_user_buf_info *trace_user_buffer;
+
+static void trace_user_fault_buffer_free(struct trace_user_buf_info *tinfo)
+{
+ char *buf;
+ int cpu;
+
+ for_each_possible_cpu(cpu) {
+ buf = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+ kfree(buf);
+ }
+ free_percpu(tinfo->tbuf);
+ kfree(tinfo);
+}
+
+static int trace_user_fault_buffer_enable(void)
+{
+ struct trace_user_buf_info *tinfo;
+ char *buf;
+ int cpu;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ if (trace_user_buffer) {
+ trace_user_buffer->ref++;
+ return 0;
+ }
+
+ tinfo = kmalloc(sizeof(*tinfo), GFP_KERNEL);
+ if (!tinfo)
+ return -ENOMEM;
+
+ tinfo->tbuf = alloc_percpu(struct trace_user_buf);
+ if (!tinfo->tbuf) {
+ kfree(tinfo);
+ return -ENOMEM;
+ }
+
+ tinfo->ref = 1;
+
+ /* Clear each buffer in case of error */
+ for_each_possible_cpu(cpu) {
+ per_cpu_ptr(tinfo->tbuf, cpu)->buf = NULL;
+ }
+
+ for_each_possible_cpu(cpu) {
+ buf = kmalloc_node(TRACE_MARKER_MAX_SIZE, GFP_KERNEL,
+ cpu_to_node(cpu));
+ if (!buf) {
+ trace_user_fault_buffer_free(tinfo);
+ return -ENOMEM;
+ }
+ per_cpu_ptr(tinfo->tbuf, cpu)->buf = buf;
+ }
+
+ trace_user_buffer = tinfo;
+
+ return 0;
+}
+
+static void trace_user_fault_buffer_disable(void)
+{
+ struct trace_user_buf_info *tinfo;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ tinfo = trace_user_buffer;
+
+ if (WARN_ON_ONCE(!tinfo))
+ return;
+
+ if (--tinfo->ref)
+ return;
+
+ trace_user_fault_buffer_free(tinfo);
+ trace_user_buffer = NULL;
+}
+
+/* Must be called with preemption disabled */
+static char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
+ const char __user *ptr, size_t size,
+ size_t *read_size)
+{
+ int cpu = smp_processor_id();
+ char *buffer = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+ unsigned int cnt;
+ int trys = 0;
+ int ret;
+
+ if (size > TRACE_MARKER_MAX_SIZE)
+ size = TRACE_MARKER_MAX_SIZE;
+ *read_size = 0;
+
+ /*
+ * This acts similar to a seqcount. The per CPU context switches are
+ * recorded, migration is disabled and preemption is enabled. The
+ * read of the user space memory is copied into the per CPU buffer.
+ * Preemption is disabled again, and if the per CPU context switches count
+ * is still the same, it means the buffer has not been corrupted.
+ * If the count is different, it is assumed the buffer is corrupted
+ * and reading must be tried again.
+ */
+
+ do {
+ /*
+ * If for some reason, copy_from_user() always causes a context
+ * switch, this would then cause an infinite loop.
+ * If this task is preempted by another user space task, it
+ * will cause this task to try again. But just in case something
+ * changes where the copying from user space causes another task
+ * to run, prevent this from going into an infinite loop.
+ * 100 tries should be plenty.
+ */
+ if (WARN_ONCE(trys++ > 100, "Error: Too many tries to read user space"))
+ return NULL;
+
+ /* Read the current CPU context switch counter */
+ cnt = nr_context_switches_cpu(cpu);
+
+ /*
+ * Preemption is going to be enabled, but this task must
+ * remain on this CPU.
+ */
+ migrate_disable();
+
+ /*
+ * Now preemption is being enabed and another task can come in
+ * and use the same buffer and corrupt our data.
+ */
+ preempt_enable_notrace();
+
+ ret = __copy_from_user(buffer, ptr, size);
+
+ preempt_disable_notrace();
+ migrate_enable();
+
+ /* if it faulted, no need to test if the buffer was corrupted */
+ if (ret)
+ return NULL;
+
+ /*
+ * Preemption is disabled again, now check the per CPU context
+ * switch counter. If it doesn't match, then another user space
+ * process may have schedule in and corrupted our buffer. In that
+ * case the copying must be retried.
+ */
+ } while (nr_context_switches_cpu(cpu) != cnt);
+
+ *read_size = size;
+ return buffer;
+}
+
static ssize_t
tracing_mark_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *fpos)
@@ -7250,6 +7389,8 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
unsigned long ip;
+ size_t size;
+ char *buf;
if (tracing_disabled)
return -EINVAL;
@@ -7263,6 +7404,16 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (cnt > TRACE_MARKER_MAX_SIZE)
cnt = TRACE_MARKER_MAX_SIZE;
+ /* Must have preemption disabled while having access to the buffer */
+ guard(preempt_notrace)();
+
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ if (!buf)
+ return -EFAULT;
+
+ if (cnt > size)
+ cnt = size;
+
/* The selftests expect this function to be the IP address */
ip = _THIS_IP_;
@@ -7270,32 +7421,27 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (tr == &global_trace) {
guard(rcu)();
list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
+ written = write_marker_to_buffer(tr, buf, cnt, ip);
if (written < 0)
break;
}
} else {
- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
+ written = write_marker_to_buffer(tr, buf, cnt, ip);
}
return written;
}
static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
- const char __user *ubuf, size_t cnt)
+ const char *buf, size_t cnt)
{
struct ring_buffer_event *event;
struct trace_buffer *buffer;
struct raw_data_entry *entry;
ssize_t written;
- int size;
- int len;
-
-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
+ size_t size;
size = sizeof(*entry) + cnt;
- if (cnt < FAULT_SIZE_ID)
- size += FAULT_SIZE_ID - cnt;
buffer = tr->array_buffer.buffer;
@@ -7309,14 +7455,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
return -EBADF;
entry = ring_buffer_event_data(event);
-
- len = copy_from_user_nofault(&entry->id, ubuf, cnt);
- if (len) {
- entry->id = -1;
- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
- written = -EFAULT;
- } else
- written = cnt;
+ memcpy(&entry->id, buf, cnt);
+ written = cnt;
__buffer_unlock_commit(buffer, event);
@@ -7329,8 +7469,8 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
{
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
-
-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
+ size_t size;
+ char *buf;
if (tracing_disabled)
return -EINVAL;
@@ -7342,6 +7482,17 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
if (cnt < sizeof(unsigned int))
return -EINVAL;
+ /* Must have preemption disabled while having access to the buffer */
+ guard(preempt_notrace)();
+
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ if (!buf)
+ return -EFAULT;
+
+ /* raw write is all or nothing */
+ if (cnt > size)
+ return -EINVAL;
+
/* The global trace_marker_raw can go to multiple instances */
if (tr == &global_trace) {
guard(rcu)();
@@ -7357,6 +7508,27 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
return written;
}
+static int tracing_mark_open(struct inode *inode, struct file *filp)
+{
+ int ret;
+
+ ret = trace_user_fault_buffer_enable();
+ if (ret < 0)
+ return ret;
+
+ stream_open(inode, filp);
+ ret = tracing_open_generic_tr(inode, filp);
+ if (ret < 0)
+ trace_user_fault_buffer_disable();
+ return ret;
+}
+
+static int tracing_mark_release(struct inode *inode, struct file *file)
+{
+ trace_user_fault_buffer_disable();
+ return tracing_release_generic_tr(inode, file);
+}
+
static int tracing_clock_show(struct seq_file *m, void *v)
{
struct trace_array *tr = m->private;
@@ -7764,13 +7936,13 @@ static const struct file_operations tracing_free_buffer_fops = {
static const struct file_operations tracing_mark_fops = {
.open = tracing_mark_open,
.write = tracing_mark_write,
- .release = tracing_release_generic_tr,
+ .release = tracing_mark_release,
};
static const struct file_operations tracing_mark_raw_fops = {
.open = tracing_mark_open,
.write = tracing_mark_raw_write,
- .release = tracing_release_generic_tr,
+ .release = tracing_mark_release,
};
static const struct file_operations trace_clock_fops = {
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 64cf7d058a005c5c31eb8a0b741f35dc12915d18
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101357-tremble-silenced-1049@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64cf7d058a005c5c31eb8a0b741f35dc12915d18 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Wed, 8 Oct 2025 12:45:10 -0400
Subject: [PATCH] tracing: Have trace_marker use per-cpu data to read user
space
It was reported that using __copy_from_user_inatomic() can actually
schedule. Which is bad when preemption is disabled. Even though there's
logic to check in_atomic() is set, but this is a nop when the kernel is
configured with PREEMPT_NONE. This is due to page faulting and the code
could schedule with preemption disabled.
Link: https://lore.kernel.org/all/20250819105152.2766363-1-luogengkun@huaweicloud…
The solution was to change the __copy_from_user_inatomic() to
copy_from_user_nofault(). But then it was reported that this caused a
regression in Android. There's several applications writing into
trace_marker() in Android, but now instead of showing the expected data,
it is showing:
tracing_mark_write: <faulted>
After reverting the conversion to copy_from_user_nofault(), Android was
able to get the data again.
Writes to the trace_marker is a way to efficiently and quickly enter data
into the Linux tracing buffer. It takes no locks and was designed to be as
non-intrusive as possible. This means it cannot allocate memory, and must
use pre-allocated data.
A method that is actively being worked on to have faultable system call
tracepoints read user space data is to allocate per CPU buffers, and use
them in the callback. The method uses a technique similar to seqcount.
That is something like this:
preempt_disable();
cpu = smp_processor_id();
buffer = this_cpu_ptr(&pre_allocated_cpu_buffers, cpu);
do {
cnt = nr_context_switches_cpu(cpu);
migrate_disable();
preempt_enable();
ret = copy_from_user(buffer, ptr, size);
preempt_disable();
migrate_enable();
} while (!ret && cnt != nr_context_switches_cpu(cpu));
if (!ret)
ring_buffer_write(buffer);
preempt_enable();
It's a little more involved than that, but the above is the basic logic.
The idea is to acquire the current CPU buffer, disable migration, and then
enable preemption. At this moment, it can safely use copy_from_user().
After reading the data from user space, it disables preemption again. It
then checks to see if there was any new scheduling on this CPU. If there
was, it must assume that the buffer was corrupted by another task. If
there wasn't, then the buffer is still valid as only tasks in preemptable
context can write to this buffer and only those that are running on the
CPU.
By using this method, where trace_marker open allocates the per CPU
buffers, trace_marker writes can access user space and even fault it in,
without having to allocate or take any locks of its own.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Luo Gengkun <luogengkun(a)huaweicloud.com>
Cc: Wattson CI <wattson-external(a)google.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Link: https://lore.kernel.org/20251008124510.6dba541a@gandalf.local.home
Fixes: 3d62ab32df065 ("tracing: Fix tracing_marker may trigger page fault during preempt_disable")
Reported-by: Runping Lai <runpinglai(a)google.com>
Tested-by: Runping Lai <runpinglai(a)google.com>
Closes: https://lore.kernel.org/linux-trace-kernel/20251007003417.3470979-2-runping…
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b3c94fbaf002..0fd582651293 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4791,12 +4791,6 @@ int tracing_single_release_file_tr(struct inode *inode, struct file *filp)
return single_release(inode, filp);
}
-static int tracing_mark_open(struct inode *inode, struct file *filp)
-{
- stream_open(inode, filp);
- return tracing_open_generic_tr(inode, filp);
-}
-
static int tracing_release(struct inode *inode, struct file *file)
{
struct trace_array *tr = inode->i_private;
@@ -7163,7 +7157,7 @@ tracing_free_buffer_release(struct inode *inode, struct file *filp)
#define TRACE_MARKER_MAX_SIZE 4096
-static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user *ubuf,
+static ssize_t write_marker_to_buffer(struct trace_array *tr, const char *buf,
size_t cnt, unsigned long ip)
{
struct ring_buffer_event *event;
@@ -7173,20 +7167,11 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
int meta_size;
ssize_t written;
size_t size;
- int len;
-
-/* Used in tracing_mark_raw_write() as well */
-#define FAULTED_STR "<faulted>"
-#define FAULTED_SIZE (sizeof(FAULTED_STR) - 1) /* '\0' is already accounted for */
meta_size = sizeof(*entry) + 2; /* add '\0' and possible '\n' */
again:
size = cnt + meta_size;
- /* If less than "<faulted>", then make sure we can still add that */
- if (cnt < FAULTED_SIZE)
- size += FAULTED_SIZE - cnt;
-
buffer = tr->array_buffer.buffer;
event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
tracing_gen_ctx());
@@ -7196,9 +7181,6 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
* make it smaller and try again.
*/
if (size > ring_buffer_max_event_size(buffer)) {
- /* cnt < FAULTED size should never be bigger than max */
- if (WARN_ON_ONCE(cnt < FAULTED_SIZE))
- return -EBADF;
cnt = ring_buffer_max_event_size(buffer) - meta_size;
/* The above should only happen once */
if (WARN_ON_ONCE(cnt + meta_size == size))
@@ -7212,14 +7194,8 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
entry = ring_buffer_event_data(event);
entry->ip = ip;
-
- len = copy_from_user_nofault(&entry->buf, ubuf, cnt);
- if (len) {
- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
- cnt = FAULTED_SIZE;
- written = -EFAULT;
- } else
- written = cnt;
+ memcpy(&entry->buf, buf, cnt);
+ written = cnt;
if (tr->trace_marker_file && !list_empty(&tr->trace_marker_file->triggers)) {
/* do not add \n before testing triggers, but add \0 */
@@ -7243,6 +7219,169 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user
return written;
}
+struct trace_user_buf {
+ char *buf;
+};
+
+struct trace_user_buf_info {
+ struct trace_user_buf __percpu *tbuf;
+ int ref;
+};
+
+
+static DEFINE_MUTEX(trace_user_buffer_mutex);
+static struct trace_user_buf_info *trace_user_buffer;
+
+static void trace_user_fault_buffer_free(struct trace_user_buf_info *tinfo)
+{
+ char *buf;
+ int cpu;
+
+ for_each_possible_cpu(cpu) {
+ buf = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+ kfree(buf);
+ }
+ free_percpu(tinfo->tbuf);
+ kfree(tinfo);
+}
+
+static int trace_user_fault_buffer_enable(void)
+{
+ struct trace_user_buf_info *tinfo;
+ char *buf;
+ int cpu;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ if (trace_user_buffer) {
+ trace_user_buffer->ref++;
+ return 0;
+ }
+
+ tinfo = kmalloc(sizeof(*tinfo), GFP_KERNEL);
+ if (!tinfo)
+ return -ENOMEM;
+
+ tinfo->tbuf = alloc_percpu(struct trace_user_buf);
+ if (!tinfo->tbuf) {
+ kfree(tinfo);
+ return -ENOMEM;
+ }
+
+ tinfo->ref = 1;
+
+ /* Clear each buffer in case of error */
+ for_each_possible_cpu(cpu) {
+ per_cpu_ptr(tinfo->tbuf, cpu)->buf = NULL;
+ }
+
+ for_each_possible_cpu(cpu) {
+ buf = kmalloc_node(TRACE_MARKER_MAX_SIZE, GFP_KERNEL,
+ cpu_to_node(cpu));
+ if (!buf) {
+ trace_user_fault_buffer_free(tinfo);
+ return -ENOMEM;
+ }
+ per_cpu_ptr(tinfo->tbuf, cpu)->buf = buf;
+ }
+
+ trace_user_buffer = tinfo;
+
+ return 0;
+}
+
+static void trace_user_fault_buffer_disable(void)
+{
+ struct trace_user_buf_info *tinfo;
+
+ guard(mutex)(&trace_user_buffer_mutex);
+
+ tinfo = trace_user_buffer;
+
+ if (WARN_ON_ONCE(!tinfo))
+ return;
+
+ if (--tinfo->ref)
+ return;
+
+ trace_user_fault_buffer_free(tinfo);
+ trace_user_buffer = NULL;
+}
+
+/* Must be called with preemption disabled */
+static char *trace_user_fault_read(struct trace_user_buf_info *tinfo,
+ const char __user *ptr, size_t size,
+ size_t *read_size)
+{
+ int cpu = smp_processor_id();
+ char *buffer = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+ unsigned int cnt;
+ int trys = 0;
+ int ret;
+
+ if (size > TRACE_MARKER_MAX_SIZE)
+ size = TRACE_MARKER_MAX_SIZE;
+ *read_size = 0;
+
+ /*
+ * This acts similar to a seqcount. The per CPU context switches are
+ * recorded, migration is disabled and preemption is enabled. The
+ * read of the user space memory is copied into the per CPU buffer.
+ * Preemption is disabled again, and if the per CPU context switches count
+ * is still the same, it means the buffer has not been corrupted.
+ * If the count is different, it is assumed the buffer is corrupted
+ * and reading must be tried again.
+ */
+
+ do {
+ /*
+ * If for some reason, copy_from_user() always causes a context
+ * switch, this would then cause an infinite loop.
+ * If this task is preempted by another user space task, it
+ * will cause this task to try again. But just in case something
+ * changes where the copying from user space causes another task
+ * to run, prevent this from going into an infinite loop.
+ * 100 tries should be plenty.
+ */
+ if (WARN_ONCE(trys++ > 100, "Error: Too many tries to read user space"))
+ return NULL;
+
+ /* Read the current CPU context switch counter */
+ cnt = nr_context_switches_cpu(cpu);
+
+ /*
+ * Preemption is going to be enabled, but this task must
+ * remain on this CPU.
+ */
+ migrate_disable();
+
+ /*
+ * Now preemption is being enabed and another task can come in
+ * and use the same buffer and corrupt our data.
+ */
+ preempt_enable_notrace();
+
+ ret = __copy_from_user(buffer, ptr, size);
+
+ preempt_disable_notrace();
+ migrate_enable();
+
+ /* if it faulted, no need to test if the buffer was corrupted */
+ if (ret)
+ return NULL;
+
+ /*
+ * Preemption is disabled again, now check the per CPU context
+ * switch counter. If it doesn't match, then another user space
+ * process may have schedule in and corrupted our buffer. In that
+ * case the copying must be retried.
+ */
+ } while (nr_context_switches_cpu(cpu) != cnt);
+
+ *read_size = size;
+ return buffer;
+}
+
static ssize_t
tracing_mark_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *fpos)
@@ -7250,6 +7389,8 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
unsigned long ip;
+ size_t size;
+ char *buf;
if (tracing_disabled)
return -EINVAL;
@@ -7263,6 +7404,16 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (cnt > TRACE_MARKER_MAX_SIZE)
cnt = TRACE_MARKER_MAX_SIZE;
+ /* Must have preemption disabled while having access to the buffer */
+ guard(preempt_notrace)();
+
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ if (!buf)
+ return -EFAULT;
+
+ if (cnt > size)
+ cnt = size;
+
/* The selftests expect this function to be the IP address */
ip = _THIS_IP_;
@@ -7270,32 +7421,27 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (tr == &global_trace) {
guard(rcu)();
list_for_each_entry_rcu(tr, &marker_copies, marker_list) {
- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
+ written = write_marker_to_buffer(tr, buf, cnt, ip);
if (written < 0)
break;
}
} else {
- written = write_marker_to_buffer(tr, ubuf, cnt, ip);
+ written = write_marker_to_buffer(tr, buf, cnt, ip);
}
return written;
}
static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
- const char __user *ubuf, size_t cnt)
+ const char *buf, size_t cnt)
{
struct ring_buffer_event *event;
struct trace_buffer *buffer;
struct raw_data_entry *entry;
ssize_t written;
- int size;
- int len;
-
-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
+ size_t size;
size = sizeof(*entry) + cnt;
- if (cnt < FAULT_SIZE_ID)
- size += FAULT_SIZE_ID - cnt;
buffer = tr->array_buffer.buffer;
@@ -7309,14 +7455,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr,
return -EBADF;
entry = ring_buffer_event_data(event);
-
- len = copy_from_user_nofault(&entry->id, ubuf, cnt);
- if (len) {
- entry->id = -1;
- memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
- written = -EFAULT;
- } else
- written = cnt;
+ memcpy(&entry->id, buf, cnt);
+ written = cnt;
__buffer_unlock_commit(buffer, event);
@@ -7329,8 +7469,8 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
{
struct trace_array *tr = filp->private_data;
ssize_t written = -ENODEV;
-
-#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int))
+ size_t size;
+ char *buf;
if (tracing_disabled)
return -EINVAL;
@@ -7342,6 +7482,17 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
if (cnt < sizeof(unsigned int))
return -EINVAL;
+ /* Must have preemption disabled while having access to the buffer */
+ guard(preempt_notrace)();
+
+ buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size);
+ if (!buf)
+ return -EFAULT;
+
+ /* raw write is all or nothing */
+ if (cnt > size)
+ return -EINVAL;
+
/* The global trace_marker_raw can go to multiple instances */
if (tr == &global_trace) {
guard(rcu)();
@@ -7357,6 +7508,27 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
return written;
}
+static int tracing_mark_open(struct inode *inode, struct file *filp)
+{
+ int ret;
+
+ ret = trace_user_fault_buffer_enable();
+ if (ret < 0)
+ return ret;
+
+ stream_open(inode, filp);
+ ret = tracing_open_generic_tr(inode, filp);
+ if (ret < 0)
+ trace_user_fault_buffer_disable();
+ return ret;
+}
+
+static int tracing_mark_release(struct inode *inode, struct file *file)
+{
+ trace_user_fault_buffer_disable();
+ return tracing_release_generic_tr(inode, file);
+}
+
static int tracing_clock_show(struct seq_file *m, void *v)
{
struct trace_array *tr = m->private;
@@ -7764,13 +7936,13 @@ static const struct file_operations tracing_free_buffer_fops = {
static const struct file_operations tracing_mark_fops = {
.open = tracing_mark_open,
.write = tracing_mark_write,
- .release = tracing_release_generic_tr,
+ .release = tracing_mark_release,
};
static const struct file_operations tracing_mark_raw_fops = {
.open = tracing_mark_open,
.write = tracing_mark_raw_write,
- .release = tracing_release_generic_tr,
+ .release = tracing_mark_release,
};
static const struct file_operations trace_clock_fops = {
pidfd_pid() may return an ERR_PTR() when the file does not refer to a
valid pidfs file. Currently pidfd_info() calls pid_in_current_pidns()
directly on the returned value, which risks dereferencing an ERR_PTR.
Fix it by explicitly checking IS_ERR(pid) and returning PTR_ERR(pid)
before further use.
Fixes: 7477d7dce48a ("pidfs: allow to retrieve exit information")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhen Ni <zhen.ni(a)easystack.cn>
---
fs/pidfs.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/pidfs.c b/fs/pidfs.c
index 0ef5b47d796a..16670648bb09 100644
--- a/fs/pidfs.c
+++ b/fs/pidfs.c
@@ -314,6 +314,9 @@ static long pidfd_info(struct file *file, unsigned int cmd, unsigned long arg)
if (copy_from_user(&mask, &uinfo->mask, sizeof(mask)))
return -EFAULT;
+ if (IS_ERR(pid))
+ return PTR_ERR(pid);
+
/*
* Restrict information retrieval to tasks within the caller's pid
* namespace hierarchy.
--
2.20.1
Hi Sasha,
Please do NOT backport commit dd83609b8898 alone to stable. This patch
causes a regression in fallocate(PUNCH_HOLE) operations where pages are
not freed immediately, as reported by Mark Brown.
The fix for this regression is already in linux-next as commit
91a830422707 ("hugetlbfs: check for shareable lock before calling
huge_pmd_unshare()").
Please backport both commits together to avoid introducing the
regression in stable kernels:
- dd83609b88986f4add37c0871c3434310652ebd5 ("hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list")
- 91a830422707a62629fc4fbf8cdc3c8acf56ca64 ("hugetlbfs: check for shareable lock before calling huge_pmd_unshare()")
Thanks,
Deepanshu Kartikey
The function lan78xx_write_raw_eeprom failed to properly propagate EEPROM
write timeout errors (-ETIMEDOUT). In the timeout fallthrough path, it first
attempted to restore the pin configuration for LED outputs and then
returned only the status of that restore operation, discarding the
original timeout error saved in ret.
As a result, callers could mistakenly treat EEPROM write operation as
successful even though the EEPROM write had actually timed out with no
or partial data write.
To fix this, handle errors in restoring the LED pin configuration separately.
If the restore succeeds, return any prior EEPROM write timeout error saved
in ret to the caller.
Suggested-by: Oleksij Rempel <o.rempel(a)pengutronix.de>
Fixes: 8b1b2ca83b20 ("net: usb: lan78xx: Improve error handling in EEPROM and OTP operations")
cc: stable(a)vger.kernel.org
Signed-off-by: Bhanu Seshu Kumar Valluri <bhanuseshukumar(a)gmail.com>
---
Note:
The patch is compiled and tested using EVB-LAN7800LC.
The patch was suggested by Oleksij Rempel while reviewing a fix to a bug
found by syzbot earlier.
The review mail chain where this fix was suggested is given below.
https://lore.kernel.org/all/aNzojoXK-m1Tn6Lc@pengutronix.de/
ChangeLog:
v1->v2:
Added cc:stable tag as asked during v1 review.
V1 Link : https://lore.kernel.org/all/20251004040722.82882-1-bhanuseshukumar@gmail.co…
drivers/net/usb/lan78xx.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index d75502ebbc0d..5ccbe6ae2ebe 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -1174,10 +1174,13 @@ static int lan78xx_write_raw_eeprom(struct lan78xx_net *dev, u32 offset,
}
write_raw_eeprom_done:
- if (dev->chipid == ID_REV_CHIP_ID_7800_)
- return lan78xx_write_reg(dev, HW_CFG, saved);
-
- return 0;
+ if (dev->chipid == ID_REV_CHIP_ID_7800_) {
+ int rc = lan78xx_write_reg(dev, HW_CFG, saved);
+ /* If USB fails, there is nothing to do */
+ if (rc < 0)
+ return rc;
+ }
+ return ret;
}
static int lan78xx_read_raw_otp(struct lan78xx_net *dev, u32 offset,
--
2.34.1
On October 12, 2025 7:20:16 AM PDT, Sasha Levin <sashal(a)kernel.org> wrote:
>This is a note to let you know that I've just added the patch titled
>
> x86/vdso: Fix output operand size of RDPID
>
>to the 6.16-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
>The filename of the patch is:
> x86-vdso-fix-output-operand-size-of-rdpid.patch
>and it can be found in the queue-6.16 subdirectory.
>
>If you, or anyone else, feels it should not be added to the stable tree,
>please let <stable(a)vger.kernel.org> know about it.
>
>
>
>commit 9e09c5e5e76f1bb0480722f36d5a266d2faaf00d
>Author: Uros Bizjak <ubizjak(a)gmail.com>
>Date: Mon Jun 16 11:52:57 2025 +0200
>
> x86/vdso: Fix output operand size of RDPID
>
> [ Upstream commit ac9c408ed19d535289ca59200dd6a44a6a2d6036 ]
>
> RDPID instruction outputs to a word-sized register (64-bit on x86_64 and
> 32-bit on x86_32). Use an unsigned long variable to store the correct size.
>
> LSL outputs to 32-bit register, use %k operand prefix to always print the
> 32-bit name of the register.
>
> Use RDPID insn mnemonic while at it as the minimum binutils version of
> 2.30 supports it.
>
> [ bp: Merge two patches touching the same function into a single one. ]
>
> Fixes: ffebbaedc861 ("x86/vdso: Introduce helper functions for CPU and node number")
> Signed-off-by: Uros Bizjak <ubizjak(a)gmail.com>
> Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
> Link: https://lore.kernel.org/20250616095315.230620-1-ubizjak@gmail.com
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
>diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
>index 77d8f49b92bdd..f59ae7186940a 100644
>--- a/arch/x86/include/asm/segment.h
>+++ b/arch/x86/include/asm/segment.h
>@@ -244,7 +244,7 @@ static inline unsigned long vdso_encode_cpunode(int cpu, unsigned long node)
>
> static inline void vdso_read_cpunode(unsigned *cpu, unsigned *node)
> {
>- unsigned int p;
>+ unsigned long p;
>
> /*
> * Load CPU and node number from the GDT. LSL is faster than RDTSCP
>@@ -254,10 +254,10 @@ static inline void vdso_read_cpunode(unsigned *cpu, unsigned *node)
> *
> * If RDPID is available, use it.
> */
>- alternative_io ("lsl %[seg],%[p]",
>- ".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
>+ alternative_io ("lsl %[seg],%k[p]",
>+ "rdpid %[p]",
> X86_FEATURE_RDPID,
>- [p] "=a" (p), [seg] "r" (__CPUNODE_SEG));
>+ [p] "=r" (p), [seg] "r" (__CPUNODE_SEG));
>
> if (cpu)
> *cpu = (p & VDSO_CPUNODE_MASK);
What the actual hell?!
Doesn't *anyone* know that x86 zero-extends a 32-bit value to 64 bits?
All this code does is put a completely unnecessary REX prefix on RDPID.
Greetings future partner.
My name is Mr. David Wright. I write to request your cooperation in my desire to find a foreign partner who will assist me in the relocation and investment of funds. I am offering you 50:50 of the total sum after the transfer.The entire plan of this transaction will be forwarded to you as soon as I receive your positive response. I assure you that there is no risk attached to you in this transaction.
I await your positive response.
May God bless you.
Kind regards,
Mr. David Wright
Tel:+447418603273.
Whatsapp:+447886787422
This series fixes an issue with DMABUF support in the IIO subsystem where
the wrong DMA device could be used for buffer mapping operations. This
becomes critical on systems like Xilinx/AMD ZynqMP Ultrascale where memory
can be mapped above the 32-bit address range.
Problem:
--------
The current IIO DMABUF implementation assumes it can use the parent device
of the IIO device for DMA operations. However, this device may not have
the appropriate DMA mask configuration for accessing high memory addresses.
On systems where memory is mapped above 32-bits, this leads to the use of
bounce buffers through swiotlb, significantly impacting performance.
Solution:
---------
This series introduces a new .get_dma_dev() callback in the buffer access
functions that allows buffer implementations to specify the correct DMA
device that should be used for DMABUF operations. The DMA buffer
infrastructure implements this callback to return the device that actually
owns the DMA channel, ensuring proper memory mapping without bounce buffers.
Changes:
--------
1. Add .get_dma_dev() callback to iio_buffer_access_funcs and update core
DMABUF code to use it when available
2. Implement the callback in the DMA buffer infrastructure
3. Wire up the callback in the dmaengine buffer implementation
This ensures that DMABUF operations use the device with the correct DMA
configuration, eliminating unnecessary bounce buffer usage and improving
performance on high-memory systems.
(AI generated cover. I would not be this formal but I guess is not
that bad :))
---
Changes in v3:
- Patch 1
* Add a new iio_buffer_get_dma_dev() helper to get the DMA dev.
- Link to v2: https://lore.kernel.org/r/20251006-fix-iio-dmabuf-get-dma-device-v2-0-d960b…
---
Nuno Sá (3):
iio: buffer: support getting dma channel from the buffer
iio: buffer-dma: support getting the DMA channel
iio: buffer-dmaengine: enable .get_dma_dev()
drivers/iio/buffer/industrialio-buffer-dma.c | 6 ++++++
drivers/iio/buffer/industrialio-buffer-dmaengine.c | 2 ++
drivers/iio/industrialio-buffer.c | 21 ++++++++++++++++-----
include/linux/iio/buffer-dma.h | 1 +
include/linux/iio/buffer_impl.h | 2 ++
5 files changed, 27 insertions(+), 5 deletions(-)
---
base-commit: b9700f87939f0f477e5c00db817f54ab8a97702b
change-id: 20250930-fix-iio-dmabuf-get-dma-device-339ac70543db
--
Thanks!
- Nuno Sá
This series fixes an issue with DMABUF support in the IIO subsystem where
the wrong DMA device could be used for buffer mapping operations. This
becomes critical on systems like Xilinx/AMD ZynqMP Ultrascale where memory
can be mapped above the 32-bit address range.
Problem:
--------
The current IIO DMABUF implementation assumes it can use the parent device
of the IIO device for DMA operations. However, this device may not have
the appropriate DMA mask configuration for accessing high memory addresses.
On systems where memory is mapped above 32-bits, this leads to the use of
bounce buffers through swiotlb, significantly impacting performance.
Solution:
---------
This series introduces a new .get_dma_dev() callback in the buffer access
functions that allows buffer implementations to specify the correct DMA
device that should be used for DMABUF operations. The DMA buffer
infrastructure implements this callback to return the device that actually
owns the DMA channel, ensuring proper memory mapping without bounce buffers.
Changes:
--------
1. Add .get_dma_dev() callback to iio_buffer_access_funcs and update core
DMABUF code to use it when available
2. Implement the callback in the DMA buffer infrastructure
3. Wire up the callback in the dmaengine buffer implementation
This ensures that DMABUF operations use the device with the correct DMA
configuration, eliminating unnecessary bounce buffer usage and improving
performance on high-memory systems.
(AI generated cover. I would not be this formal but I guess is not
that bad :))
---
Changes in v2:
- Dropped Fixes tags on the first two patches and Cc stable them instead
(as prerequisites for the third patch).
- Link to v1: https://lore.kernel.org/r/20251002-fix-iio-dmabuf-get-dma-device-v1-0-c1c99…
---
Nuno Sá (3):
iio: buffer: support getting dma channel from the buffer
iio: buffer-dma: support getting the DMA channel
iio: buffer-dmaengine: enable .get_dma_dev()
drivers/iio/buffer/industrialio-buffer-dma.c | 6 +++++
drivers/iio/buffer/industrialio-buffer-dmaengine.c | 2 ++
drivers/iio/industrialio-buffer.c | 28 +++++++++++++++++-----
include/linux/iio/buffer-dma.h | 1 +
include/linux/iio/buffer_impl.h | 2 ++
5 files changed, 33 insertions(+), 6 deletions(-)
---
base-commit: b9700f87939f0f477e5c00db817f54ab8a97702b
change-id: 20250930-fix-iio-dmabuf-get-dma-device-339ac70543db
--
Thanks!
- Nuno Sá
From: Celeste Liu <uwu(a)coelacanthus.name>
The gs_usb driver supports USB devices with more than 1 CAN channel.
In old kernel before 3.15, it uses net_device->dev_id to distinguish
different channel in userspace, which was done in commit
acff76fa45b4 ("can: gs_usb: gs_make_candev(): set netdev->dev_id").
But since 3.15, the correct way is populating net_device->dev_port.
And according to documentation, if network device support multiple
interface, lack of net_device->dev_port SHALL be treated as a bug.
Fixes: acff76fa45b4 ("can: gs_usb: gs_make_candev(): set netdev->dev_id")
Cc: stable(a)vger.kernel.org
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
Link: https://patch.msgid.link/20250930-gs-usb-populate-net_device-dev_port-v1-1-…
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/usb/gs_usb.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index 9fb4cbbd6d6d..69b8d6da651b 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -1245,6 +1245,7 @@ static struct gs_can *gs_make_candev(unsigned int channel,
netdev->flags |= IFF_ECHO; /* we support full roundtrip echo */
netdev->dev_id = channel;
+ netdev->dev_port = channel;
/* dev setup */
strcpy(dev->bt_const.name, KBUILD_MODNAME);
--
2.51.0
From: Celeste Liu <uwu(a)coelacanthus.name>
This issue was found by Runcheng Lu when develop HSCanT USB to CAN FD
converter[1]. The original developers may have only 3 interfaces
device to test so they write 3 here and wait for future change.
During the HSCanT development, we actually used 4 interfaces, so the
limitation of 3 is not enough now. But just increase one is not
future-proofed. Since the channel index type in gs_host_frame is u8,
just make canch[] become a flexible array with a u8 index, so it
naturally constraint by U8_MAX and avoid statically allocate 256
pointer for every gs_usb device.
[1]: https://github.com/cherry-embedded/HSCanT-hardware
Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices")
Reported-by: Runcheng Lu <runcheng.lu(a)hpmicro.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Vincent Mailhol <mailhol(a)kernel.org>
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
Link: https://patch.msgid.link/20250930-gs-usb-max-if-v5-1-863330bf6666@coelacant…
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/usb/gs_usb.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index c9482d6e947b..9fb4cbbd6d6d 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -289,11 +289,6 @@ struct gs_host_frame {
#define GS_MAX_RX_URBS 30
#define GS_NAPI_WEIGHT 32
-/* Maximum number of interfaces the driver supports per device.
- * Current hardware only supports 3 interfaces. The future may vary.
- */
-#define GS_MAX_INTF 3
-
struct gs_tx_context {
struct gs_can *dev;
unsigned int echo_id;
@@ -324,7 +319,6 @@ struct gs_can {
/* usb interface struct */
struct gs_usb {
- struct gs_can *canch[GS_MAX_INTF];
struct usb_anchor rx_submitted;
struct usb_device *udev;
@@ -336,9 +330,11 @@ struct gs_usb {
unsigned int hf_size_rx;
u8 active_channels;
+ u8 channel_cnt;
unsigned int pipe_in;
unsigned int pipe_out;
+ struct gs_can *canch[] __counted_by(channel_cnt);
};
/* 'allocate' a tx context.
@@ -599,7 +595,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
}
/* device reports out of range channel id */
- if (hf->channel >= GS_MAX_INTF)
+ if (hf->channel >= parent->channel_cnt)
goto device_detach;
dev = parent->canch[hf->channel];
@@ -699,7 +695,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
/* USB failure take down all interfaces */
if (rc == -ENODEV) {
device_detach:
- for (rc = 0; rc < GS_MAX_INTF; rc++) {
+ for (rc = 0; rc < parent->channel_cnt; rc++) {
if (parent->canch[rc])
netif_device_detach(parent->canch[rc]->netdev);
}
@@ -1460,17 +1456,19 @@ static int gs_usb_probe(struct usb_interface *intf,
icount = dconf.icount + 1;
dev_info(&intf->dev, "Configuring for %u interfaces\n", icount);
- if (icount > GS_MAX_INTF) {
+ if (icount > type_max(parent->channel_cnt)) {
dev_err(&intf->dev,
"Driver cannot handle more that %u CAN interfaces\n",
- GS_MAX_INTF);
+ type_max(parent->channel_cnt));
return -EINVAL;
}
- parent = kzalloc(sizeof(*parent), GFP_KERNEL);
+ parent = kzalloc(struct_size(parent, canch, icount), GFP_KERNEL);
if (!parent)
return -ENOMEM;
+ parent->channel_cnt = icount;
+
init_usb_anchor(&parent->rx_submitted);
usb_set_intfdata(intf, parent);
@@ -1531,7 +1529,7 @@ static void gs_usb_disconnect(struct usb_interface *intf)
return;
}
- for (i = 0; i < GS_MAX_INTF; i++)
+ for (i = 0; i < parent->channel_cnt; i++)
if (parent->canch[i])
gs_destroy_candev(parent->canch[i]);
base-commit: 2c95a756e0cfc19af6d0b32b0c6cf3bada334998
--
2.51.0
This is the start of the stable review cycle for the 6.17.2 release.
There are 26 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 12 Oct 2025 13:13:18 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.17.2-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.17.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.17.2-rc1
Ankit Khushwaha <ankitkhushwaha.linux(a)gmail.com>
ring buffer: Propagate __rb_map_vma return value to caller
Chao Yu <chao(a)kernel.org>
f2fs: fix to do sanity check on node footer for non inode dnode
Sean Christopherson <seanjc(a)google.com>
KVM: x86: Don't (re)check L1 intercepts when completing userspace I/O
Nalivayko Sergey <Sergey.Nalivayko(a)kaspersky.com>
net/9p: fix double req put in p9_fd_cancelled
Herbert Xu <herbert(a)gondor.apana.org.au>
crypto: rng - Ensure set_ent is always present
Herbert Xu <herbert(a)gondor.apana.org.au>
crypto: zstd - Fix compression bug caused by truncation
Herbert Xu <herbert(a)gondor.apana.org.au>
Revert "crypto: testmgr - desupport SHA-1 for FIPS 140"
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
driver core/PM: Set power.no_callbacks along with power.no_pm
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
driver core: faux: Set power.no_pm for faux devices
Ovidiu Panait <ovidiu.panait.oss(a)gmail.com>
staging: axis-fifo: flush RX FIFO on read errors
Ovidiu Panait <ovidiu.panait.oss(a)gmail.com>
staging: axis-fifo: fix TX handling on copy_from_user() failure
Ovidiu Panait <ovidiu.panait.oss(a)gmail.com>
staging: axis-fifo: fix maximum TX packet length check
Raphael Gallais-Pou <raphael.gallais-pou(a)foss.st.com>
serial: stm32: allow selecting console when the driver is module
Carlos Llamas <cmllamas(a)google.com>
binder: fix double-free in dbitmap
Max Kellermann <max.kellermann(a)ionos.com>
drivers/misc/amd-sbi/Kconfig: select REGMAP_I2C
Michael Walle <mwalle(a)kernel.org>
nvmem: layouts: fix automatic module loading
Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
serial: qcom-geni: Fix blocked task
Rahul Rameshbabu <sergeantsagara(a)protonmail.com>
rust: pci: fix incorrect platform reference in PCI driver unbind doc comment
Rahul Rameshbabu <sergeantsagara(a)protonmail.com>
rust: pci: fix incorrect platform reference in PCI driver probe doc comment
Miguel Ojeda <ojeda(a)kernel.org>
rust: block: fix `srctree/` links
Miguel Ojeda <ojeda(a)kernel.org>
rust: drm: fix `srctree/` links
Bitterblue Smith <rtl8821cerfe2(a)gmail.com>
wifi: rtl8xxxu: Don't claim USB ID 07b8:8188
Bitterblue Smith <rtl8821cerfe2(a)gmail.com>
wifi: rtlwifi: rtl8192cu: Don't claim USB ID 07b8:8188
Zenm Chen <zenmchen(a)gmail.com>
Bluetooth: btusb: Add USB ID 2001:332a for D-Link AX9U rev. A1
Xiaowei Li <xiaowei.li(a)simcom.com>
USB: serial: option: add SIMCom 8230C compositions
Mario Limonciello <mario.limonciello(a)amd.com>
drm/amdgpu: Enable MES lr_compute_wa by default
-------------
Diffstat:
Makefile | 4 +-
arch/x86/kvm/emulate.c | 9 +-
arch/x86/kvm/kvm_emulate.h | 3 +-
arch/x86/kvm/x86.c | 15 +-
crypto/rng.c | 8 +
crypto/testmgr.c | 5 +
crypto/zstd.c | 2 +-
drivers/android/dbitmap.h | 1 +
drivers/base/faux.c | 1 +
drivers/bluetooth/btusb.c | 2 +
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 6 +
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 5 +
drivers/gpu/drm/amd/include/mes_v11_api_def.h | 3 +-
drivers/gpu/drm/amd/include/mes_v12_api_def.h | 3 +-
drivers/misc/amd-sbi/Kconfig | 1 +
drivers/net/wireless/realtek/rtl8xxxu/core.c | 2 -
.../net/wireless/realtek/rtlwifi/rtl8192cu/sw.c | 1 -
drivers/nvmem/layouts.c | 13 ++
drivers/staging/axis-fifo/axis-fifo.c | 68 ++++----
drivers/tty/serial/Kconfig | 2 +-
drivers/tty/serial/qcom_geni_serial.c | 176 ++-------------------
drivers/usb/serial/option.c | 6 +
fs/f2fs/f2fs.h | 4 +-
fs/f2fs/gc.c | 4 +-
fs/f2fs/node.c | 58 ++++---
fs/f2fs/node.h | 1 +
fs/f2fs/recovery.c | 2 +-
include/linux/device.h | 3 +
kernel/trace/ring_buffer.c | 2 +-
net/9p/trans_fd.c | 8 +-
rust/kernel/block/mq/gen_disk.rs | 2 +-
rust/kernel/drm/device.rs | 2 +-
rust/kernel/drm/driver.rs | 2 +-
rust/kernel/drm/file.rs | 2 +-
rust/kernel/drm/gem/mod.rs | 2 +-
rust/kernel/drm/ioctl.rs | 2 +-
rust/kernel/pci.rs | 6 +-
37 files changed, 179 insertions(+), 257 deletions(-)