The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 94a17f2dc90bc7eae36c0f478515d4bd1c23e877
Gitweb: https://git.kernel.org/tip/94a17f2dc90bc7eae36c0f478515d4bd1c23e877
Author: Dave Hansen <dave.hansen(a)linux.intel.com>
AuthorDate: Tue, 10 Jun 2025 15:24:20 -07:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Tue, 17 Jun 2025 15:36:57 -07:00
x86/mm: Disable INVLPGB when PTI is enabled
PTI uses separate ASIDs (aka. PCIDs) for kernel and user address
spaces. When the kernel needs to flush the user address space, it
just sets a bit in a bitmap and then flushes the entire PCID on
the next switch to userspace.
This bitmap is a single 'unsigned long' which is plenty for all 6
dynamic ASIDs. But, unfortunately, the INVLPGB support brings along a
bunch more user ASIDs, as many as ~2k more. The bitmap can't address
that many.
Fortunately, the bitmap is only needed for PTI and all the CPUs
with INVLPGB are AMD CPUs that aren't vulnerable to Meltdown and
don't need PTI. The only way someone can run into an issue in
practice is by booting with pti=on on a newer AMD CPU.
Disable INVLPGB if PTI is enabled. Avoid overrunning the small
bitmap.
Note: this will be fixed up properly by making the bitmap bigger.
For now, just avoid the mostly theoretical bug.
Fixes: 4afeb0ed1753 ("x86/mm: Enable broadcast TLB invalidation for multi-threaded processes")
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Rik van Riel <riel(a)surriel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250610222420.E8CBF472%40davehans-spike.ostc.i…
---
arch/x86/mm/pti.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 1902998..c0c40b6 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -98,6 +98,11 @@ void __init pti_check_boottime_disable(void)
return;
setup_force_cpu_cap(X86_FEATURE_PTI);
+
+ if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
+ pr_debug("PTI enabled, disabling INVLPGB\n");
+ setup_clear_cpu_cap(X86_FEATURE_INVLPGB);
+ }
}
static int __init pti_parse_cmdline(char *arg)
Hello,
New build issue found on stable-rc/linux-6.12.y:
---
‘lvts_debugfs_exit’ defined but not used [-Werror=unused-function] in
drivers/thermal/mediatek/lvts_thermal.o
(drivers/thermal/mediatek/lvts_thermal.c)
[logspec:kbuild,kbuild.compiler.error]
---
- dashboard: https://d.kernelci.org/i/maestro:fb8aae5340da55b6254442f0858147bf5f0b39dc
- giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- commit HEAD: 519e0647630e07972733e99a0dc82065a65736ea
Log excerpt:
=====================================================
drivers/thermal/mediatek/lvts_thermal.c:262:13: error:
‘lvts_debugfs_exit’ defined but not used [-Werror=unused-function]
262 | static void lvts_debugfs_exit(struct lvts_domain *lvts_td) { }
| ^~~~~~~~~~~~~~~~~
CC [M] drivers/watchdog/softdog.o
cc1: all warnings being treated as errors
=====================================================
# Builds where the incident occurred:
## cros://chromeos-6.6/arm64/chromiumos-mediatek.flavour.config+lab-setup+arm64-chromebook+CONFIG_MODULE_COMPRESS=n+CONFIG_MODULE_COMPRESS_NONE=y
on (arm64):
- compiler: gcc-12
- dashboard: https://d.kernelci.org/build/maestro:685194ac5c2cf25042b9c1a8
#kernelci issue maestro:fb8aae5340da55b6254442f0858147bf5f0b39dc
Reported-by: kernelci.org bot <bot(a)kernelci.org>
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
Hi Greg & Sasha !
I ran into some trouble in my nightly CI systems that test v6.6.y and
v6.1.y. Using "make binrpm-pkg" followed by "rpm -iv ..." results in the
test systems being unbootable because the vmlinuz file is never copied
to /boot. The test systems are imaged with Fedora 39.
I found a related Fedora bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2239008
It appears there is a missing fix in LTS kernels. I bisected the kernel
fix to:
358de8b4f201 ("kbuild: rpm-pkg: simplify installkernel %post")
which includes a "Cc: stable" tag but does not appear in
origin/linux-6.6.y, origin/linux-6.1.y, or origin/5.15.y (I did not look
further back than that).
Would it be appropriate to apply 358de8b4f201 to LTS kernels?
--
Chuck Lever
tianshuo han reported a remotely-triggerable crash if the client sends a
kernel RPC server a specially crafted packet. If decoding the RPC reply
fails in such a way that SVC_GARBAGE is returned without setting the
rq_accept_statp pointer, then that pointer can be dereferenced and a
value stored there.
If it's the first time the thread has processed an RPC, then that
pointer will be set to NULL and the kernel will crash. In other cases,
it could create a memory scribble.
The server sunrpc code treats a SVC_GARBAGE return from svc_authenticate
or pg_authenticate as if it should send a GARBAGE_ARGS reply. RFC 5531
says that if authentication fails that the RPC should be rejected
instead with a status of AUTH_ERR.
Handle a SVC_GARBAGE return as an AUTH_ERROR, with a reason of
AUTH_BADCRED instead of returning GARBAGE_ARGS in that case. This
sidesteps the whole problem of touching the rpc_accept_statp pointer in
this situation and avoids the crash.
Cc: stable(a)vger.kernel.org # v6.9+
Fixes: 29cd2927fb91 ("SUNRPC: Fix encoding of accepted but unsuccessful RPC replies")
Reported-by: tianshuo han <hantianshuo233(a)gmail.com>
Signed-off-by: Jeff Layton <jlayton(a)kernel.org>
---
This should be more correct. Unfortunately, I don't know of any
testcases for low-level RPC error handling. That seems like something
that would be nice to do with pynfs or similar though.
---
Changes in v2:
- Fix endianness of rq_accept_statp assignment
- Better describe the way the crash happens and how this fixes it
- point Fixes: tag at correct patch
- add Cc: stable tag
---
net/sunrpc/svc.c | 11 ++---------
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 939b6239df8ab6229ce34836d77d3a6b983fbbb7..99050ab1435148ac5d52b697ab1a771b9e948143 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1375,7 +1375,8 @@ svc_process_common(struct svc_rqst *rqstp)
case SVC_OK:
break;
case SVC_GARBAGE:
- goto err_garbage_args;
+ rqstp->rq_auth_stat = rpc_autherr_badcred;
+ goto err_bad_auth;
case SVC_SYSERR:
goto err_system_err;
case SVC_DENIED:
@@ -1516,14 +1517,6 @@ svc_process_common(struct svc_rqst *rqstp)
*rqstp->rq_accept_statp = rpc_proc_unavail;
goto sendit;
-err_garbage_args:
- svc_printk(rqstp, "failed to decode RPC header\n");
-
- if (serv->sv_stats)
- serv->sv_stats->rpcbadfmt++;
- *rqstp->rq_accept_statp = rpc_garbage_args;
- goto sendit;
-
err_system_err:
if (serv->sv_stats)
serv->sv_stats->rpcbadfmt++;
---
base-commit: 9afe652958c3ee88f24df1e4a97f298afce89407
change-id: 20250617-rpc-6-16-cc7a23e9c961
Best regards,
--
Jeff Layton <jlayton(a)kernel.org>
Hello,
New build issue found on stable-rc/linux-5.4.y:
---
clang: error: assembler command failed with exit code 1 (use -v to
see invocation) in drivers/firmware/qcom_scm-32.o
(scripts/Makefile.build:262) [logspec:kbuild,kbuild.compiler.error]
---
- dashboard: https://d.kernelci.org/i/maestro:e1ce6e2cb61e68ec7bf14991570487d713f77e0a
- giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- commit HEAD: e2f5a2e75b315706dd2d1d50a4313e5785eb189d
Log excerpt:
=====================================================
CC drivers/firmware/qcom_scm-32.o
CC lib/idr.o
CC drivers/gpu/host1x/debug.o
CC drivers/clk/rockchip/clk-rk3328.o
CC drivers/clk/rockchip/clk-rk3368.o
CC lib/ioremap.o
CC drivers/gpu/drm/drm_probe_helper.o
CC drivers/clk/rockchip/clk-rk3399.o
/tmp/qcom_scm-32-2d4d72.s: Assembler messages:
/tmp/qcom_scm-32-2d4d72.s:56: Error: selected processor does not
support `smc #0' in ARM mode
/tmp/qcom_scm-32-2d4d72.s:69: Error: selected processor does not
support `smc #0' in ARM mode
/tmp/qcom_scm-32-2d4d72.s:173: Error: selected processor does not
support `smc #0' in ARM mode
/tmp/qcom_scm-32-2d4d72.s:394: Error: selected processor does not
support `smc #0' in ARM mode
/tmp/qcom_scm-32-2d4d72.s:545: Error: selected processor does not
support `smc #0' in ARM mode
/tmp/qcom_scm-32-2d4d72.s:930: Error: selected processor does not
support `smc #0' in ARM mode
/tmp/qcom_scm-32-2d4d72.s:1070: Error: selected processor does not
support `smc #0' in ARM mode
/tmp/qcom_scm-32-2d4d72.s:1117: Error: selected processor does not
support `smc #0' in ARM mode
clang: error: assembler command failed with exit code 1 (use -v to see
invocation)
=====================================================
# Builds where the incident occurred:
## defconfig+allmodconfig+CONFIG_FRAME_WARN=2048 on (arm):
- compiler: clang-17
- dashboard: https://d.kernelci.org/build/maestro:685191885c2cf25042b9bb39
## multi_v7_defconfig on (arm):
- compiler: clang-17
- dashboard: https://d.kernelci.org/build/maestro:685191845c2cf25042b9bb35
#kernelci issue maestro:e1ce6e2cb61e68ec7bf14991570487d713f77e0a
Reported-by: kernelci.org bot <bot(a)kernelci.org>
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
Hello,
New build issue found on stable-rc/linux-6.1.y:
---
stack frame size (2488) exceeds limit (2048) in
'dml31_ModeSupportAndSystemConfigurationFull'
[-Werror,-Wframe-larger-than] in
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.o
(drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c)
[logspec:kbuild,kbuild.compiler.error]
---
- dashboard: https://d.kernelci.org/i/maestro:69fb66ef80a96ff4750a9dacf73be24a7cbe888e
- giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- commit HEAD: 2c86adab41e98d103953bf8c447202c9147150ab
Log excerpt:
=====================================================
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3795:6:
error: stack frame size (2488) exceeds limit (2048) in
'dml31_ModeSupportAndSystemConfigurationFull'
[-Werror,-Wframe-larger-than]
3795 | void dml31_ModeSupportAndSystemConfigurationFull(struct
display_mode_lib *mode_lib)
| ^
CC drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn303/dcn303_fpu.o
1 error generated.
CC drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/dcn314_fpu.o
=====================================================
# Builds where the incident occurred:
## x86_64_defconfig+kselftest+x86-board on (x86_64):
- compiler: clang-17
- dashboard: https://d.kernelci.org/build/maestro:685193725c2cf25042b9bcc9
#kernelci issue maestro:69fb66ef80a96ff4750a9dacf73be24a7cbe888e
Reported-by: kernelci.org bot <bot(a)kernelci.org>
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
Hello,
New build issue found on stable-rc/linux-5.4.y:
---
in drivers/firmware/qcom_scm-32.o (scripts/Makefile.build:262)
[logspec:kbuild,kbuild.compiler]
---
- dashboard: https://d.kernelci.org/i/maestro:04c1ce2921a16b59c7329a6026c59ea7942ef691
- giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- commit HEAD: e2f5a2e75b315706dd2d1d50a4313e5785eb189d
Log excerpt:
=====================================================
CC drivers/firmware/qcom_scm-32.o
CC drivers/firmware/trusted_foundations.o
CC drivers/clk/qcom/clk-regmap.o
CC drivers/gpio/gpio-pl061.o
CC kernel/resource.o
CC kernel/sysctl.o
/tmp/ccsKkK07.s: Assembler messages:
/tmp/ccsKkK07.s:45: Error: selected processor does not support `smc
#0' in ARM mode
/tmp/ccsKkK07.s:94: Error: selected processor does not support `smc
#0' in ARM mode
/tmp/ccsKkK07.s:160: Error: selected processor does not support `smc
#0' in ARM mode
/tmp/ccsKkK07.s:296: Error: selected processor does not support `smc
#0' in ARM mode
=====================================================
# Builds where the incident occurred:
## multi_v7_defconfig on (arm):
- compiler: gcc-12
- dashboard: https://d.kernelci.org/build/maestro:685191a35c2cf25042b9bb4f
## multi_v7_defconfig+kselftest on (arm):
- compiler: gcc-12
- dashboard: https://d.kernelci.org/build/maestro:685191ab5c2cf25042b9bb56
#kernelci issue maestro:04c1ce2921a16b59c7329a6026c59ea7942ef691
Reported-by: kernelci.org bot <bot(a)kernelci.org>
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
Hello,
New build issue found on stable-rc/linux-5.15.y:
---
integer literal is too large to be represented in type 'long',
interpreting as 'unsigned long' per C89; this literal will have type
'long long' in C99 onwards [-Werror,-Wc99-compat] in
drivers/gpu/drm/meson/meson_vclk.o
(drivers/gpu/drm/meson/meson_vclk.c)
[logspec:kbuild,kbuild.compiler.error]
---
- dashboard: https://d.kernelci.org/i/maestro:ae3b0334acd91200d6ced325a381bafac2d46493
- giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- commit HEAD: d99cd6f3a570e2f93e8f966b8ca772ef3da54fe2
Log excerpt:
=====================================================
drivers/gpu/drm/meson/meson_vclk.c:399:15: error: integer literal is
too large to be represented in type 'long', interpreting as 'unsigned
long' per C89; this literal will have type 'long long' in C99 onwards
[-Werror,-Wc99-compat]
399 | .pll_freq = 2970000000,
| ^
drivers/gpu/drm/meson/meson_vclk.c:411:15: error: integer literal is
too large to be represented in type 'long', interpreting as 'unsigned
long' per C89; this literal will have type 'long long' in C99 onwards
[-Werror,-Wc99-compat]
411 | .pll_freq = 2970000000,
| ^
drivers/gpu/drm/meson/meson_vclk.c:423:15: error: integer literal is
too large to be represented in type 'long', interpreting as 'unsigned
long' per C89; this literal will have type 'long long' in C99 onwards
[-Werror,-Wc99-compat]
423 | .pll_freq = 2970000000,
| ^
drivers/gpu/drm/meson/meson_vclk.c:436:15: error: integer literal is
too large to be represented in type 'long', interpreting as 'unsigned
long' per C89; this literal will have type 'long long' in C99 onwards
[-Werror,-Wc99-compat]
436 | .phy_freq = 2970000000,
| ^
drivers/gpu/drm/meson/meson_vclk.c:460:15: error: integer literal is
too large to be represented in type 'long', interpreting as 'unsigned
long' per C89; this literal will have type 'long long' in C99 onwards
[-Werror,-Wc99-compat]
460 | .phy_freq = 2970000000,
| ^
CC [M] net/dccp/feat.o
drivers/gpu/drm/meson/meson_vclk.c:850:8: error: integer literal is
too large to be represented in type 'long', interpreting as 'unsigned
long' per C89; this literal will have type 'long long' in C99 onwards
[-Werror,-Wc99-compat]
850 | case 2970000000:
| ^
drivers/gpu/drm/meson/meson_vclk.c:868:8: error: integer literal is
too large to be represented in type 'long', interpreting as 'unsigned
long' per C89; this literal will have type 'long long' in C99 onwards
[-Werror,-Wc99-compat]
868 | case 2970000000:
| ^
drivers/gpu/drm/meson/meson_vclk.c:885:8: error: integer literal is
too large to be represented in type 'long', interpreting as 'unsigned
long' per C89; this literal will have type 'long long' in C99 onwards
[-Werror,-Wc99-compat]
885 | case 2970000000:
| ^
8 errors generated.
=====================================================
# Builds where the incident occurred:
## defconfig+allmodconfig+CONFIG_FRAME_WARN=2048 on (arm):
- compiler: clang-17
- dashboard: https://d.kernelci.org/build/maestro:685192b15c2cf25042b9bc2e
#kernelci issue maestro:ae3b0334acd91200d6ced325a381bafac2d46493
Reported-by: kernelci.org bot <bot(a)kernelci.org>
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
Since in v6.8-rc1, the of_node symlink under tty devices is
missing. This breaks any udev rules relying on this information.
Link the of_node information in the serial controller device with the
parent defined in the device tree. This will also apply to the serial
device which takes the serial controller as a parent device.
Fixes: b286f4e87e32 ("serial: core: Move tty and serdev to be children of serial core port device")
Cc: stable(a)vger.kernel.org
Signed-off-by: Aidan Stewart <astewart(a)tektelic.com>
---
v1 -> v2:
- v1: https://lore.kernel.org/linux-serial/20250616162154.9057-1-astewart@tekteli…
- Remove IS_ENABLED(CONFIG_OF) check.
---
drivers/tty/serial/serial_base_bus.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/tty/serial/serial_base_bus.c b/drivers/tty/serial/serial_base_bus.c
index 5d1677f1b651..cb3b127b06b6 100644
--- a/drivers/tty/serial/serial_base_bus.c
+++ b/drivers/tty/serial/serial_base_bus.c
@@ -72,6 +72,7 @@ static int serial_base_device_init(struct uart_port *port,
dev->parent = parent_dev;
dev->bus = &serial_base_bus_type;
dev->release = release;
+ device_set_of_node_from_dev(dev, parent_dev);
if (!serial_base_initialized) {
dev_dbg(port->dev, "uart_add_one_port() called before arch_initcall()?\n");
--
2.49.0
The longest length of a symbol (KSYM_NAME_LEN) was increased to 512 in
the reference [1]. Because in Rust symbols can become quite long due to
namespacing introduced by modules, types, traits, generics, etc.
This patch series presents two commits that implement a test to verify
that a symbol with KSYM_NAME_LEN of 512 can be read.
The first commit: To check that symbol length was valid, the commit
implements a kunit test that verifies that a symbol of 512 length can
be read.
The second commit: There was a warning when building with clang because
there was a definition of unlikely from compiler.h in tools/include/linux,
which conflicted with the one in the instruction decoder selftest.
[1] https://lore.kernel.org/lkml/20220802015052.10452-6-ojeda@kernel.org/
---
Nathan Chancellor (1):
x86/tools: Drop duplicate unlikely() definition in insn_decoder_test.c
Sergio González Collado (1):
Kunit to check the longest symbol length
arch/x86/tools/insn_decoder_test.c | 5 +-
lib/Kconfig.debug | 9 ++++
lib/Makefile | 2 +
lib/longest_symbol_kunit.c | 82 ++++++++++++++++++++++++++++++
4 files changed, 95 insertions(+), 3 deletions(-)
create mode 100644 lib/longest_symbol_kunit.c
base-commit: ba9210b8c96355a16b78e1b890dce78f284d6f31
--
2.39.2
Since in v6.8-rc1, the of_node symlink under tty devices is
missing. This breaks any udev rules relying on this information.
Link the of_node information in the serial controller device with the
parent defined in the device tree. This will also apply to the serial
device which takes the serial controller as a parent device.
Fixes: b286f4e87e32 ("serial: core: Move tty and serdev to be children of serial core port device")
Cc: stable(a)vger.kernel.org
Signed-off-by: Aidan Stewart <astewart(a)tektelic.com>
---
drivers/tty/serial/serial_base_bus.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/tty/serial/serial_base_bus.c b/drivers/tty/serial/serial_base_bus.c
index 5d1677f1b651..0e4bf7a3e775 100644
--- a/drivers/tty/serial/serial_base_bus.c
+++ b/drivers/tty/serial/serial_base_bus.c
@@ -73,6 +73,10 @@ static int serial_base_device_init(struct uart_port *port,
dev->bus = &serial_base_bus_type;
dev->release = release;
+ if (IS_ENABLED(CONFIG_OF)) {
+ device_set_of_node_from_dev(dev, parent_dev);
+ }
+
if (!serial_base_initialized) {
dev_dbg(port->dev, "uart_add_one_port() called before arch_initcall()?\n");
return -EPROBE_DEFER;
--
2.49.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x f4ecdc352646f7d23f348e5c544dbe3212c94fc8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025061708-cameo-caring-38a0@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f4ecdc352646f7d23f348e5c544dbe3212c94fc8 Mon Sep 17 00:00:00 2001
From: Pawel Laszczak <pawell(a)cadence.com>
Date: Tue, 13 May 2025 05:30:09 +0000
Subject: [PATCH] usb: cdnsp: Fix issue with detecting command completion event
In some cases, there is a small-time gap in which CMD_RING_BUSY can be
cleared by controller but adding command completion event to event ring
will be delayed. As the result driver will return error code.
This behavior has been detected on usbtest driver (test 9) with
configuration including ep1in/ep1out bulk and ep2in/ep2out isoc
endpoint.
Probably this gap occurred because controller was busy with adding some
other events to event ring.
The CMD_RING_BUSY is cleared to '0' when the Command Descriptor has been
executed and not when command completion event has been added to event
ring.
To fix this issue for this test the small delay is sufficient less than
10us) but to make sure the problem doesn't happen again in the future
the patch introduces 10 retries to check with delay about 20us before
returning error code.
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Pawel Laszczak <pawell(a)cadence.com>
Acked-by: Peter Chen <peter.chen(a)kernel.org>
Link: https://lore.kernel.org/r/PH7PR07MB9538AA45362ACCF1B94EE9B7DD96A@PH7PR07MB9…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/cdns3/cdnsp-gadget.c b/drivers/usb/cdns3/cdnsp-gadget.c
index cd1e00daf43f..55f95f41b3b4 100644
--- a/drivers/usb/cdns3/cdnsp-gadget.c
+++ b/drivers/usb/cdns3/cdnsp-gadget.c
@@ -548,6 +548,7 @@ int cdnsp_wait_for_cmd_compl(struct cdnsp_device *pdev)
dma_addr_t cmd_deq_dma;
union cdnsp_trb *event;
u32 cycle_state;
+ u32 retry = 10;
int ret, val;
u64 cmd_dma;
u32 flags;
@@ -579,8 +580,23 @@ int cdnsp_wait_for_cmd_compl(struct cdnsp_device *pdev)
flags = le32_to_cpu(event->event_cmd.flags);
/* Check the owner of the TRB. */
- if ((flags & TRB_CYCLE) != cycle_state)
+ if ((flags & TRB_CYCLE) != cycle_state) {
+ /*
+ * Give some extra time to get chance controller
+ * to finish command before returning error code.
+ * Checking CMD_RING_BUSY is not sufficient because
+ * this bit is cleared to '0' when the Command
+ * Descriptor has been executed by controller
+ * and not when command completion event has
+ * be added to event ring.
+ */
+ if (retry--) {
+ udelay(20);
+ continue;
+ }
+
return -EINVAL;
+ }
cmd_dma = le64_to_cpu(event->event_cmd.cmd_trb);
This series introduces a new metadata format for UVC cameras and adds a
couple of improvements to the UVC metadata handling.
The new metadata format can be enabled in runtime with quirks.
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Changes in v7:
- Add patch: Introduce dev->meta_formats
- Link to v6: https://lore.kernel.org/r/20250604-uvc-meta-v6-0-7141d48c322c@chromium.org
Changes in v6 (Thanks Laurent):
- Fix typo in metafmt-uvc.rst
- Improve metafmt-uvc-msxu-1-5.rst
- uvc_meta_v4l2_try_format() block MSXU format unless the quirk is
active
- Refactor uvc_enable_msxu
- Document uvc_meta_detect_msxu
- Rebase series
- Add R-b
- Link to v5: https://lore.kernel.org/r/20250404-uvc-meta-v5-0-f79974fc2d20@chromium.org
Changes in v5:
- Fix codestyle and kerneldoc warnings reported by media-ci
- Link to v4: https://lore.kernel.org/r/20250403-uvc-meta-v4-0-877aa6475975@chromium.org
Changes in v4:
- Rename format to V4L2_META_FMT_UVC_MSXU_1_5 (Thanks Mauro)
- Flag the new format with a quirk.
- Autodetect MSXU devices.
- Link to v3: https://lore.kernel.org/linux-media/20250313-uvc-metadata-v3-0-c467af869c60…
Changes in v3:
- Fix doc syntax errors.
- Link to v2: https://lore.kernel.org/r/20250306-uvc-metadata-v2-0-7e939857cad5@chromium.…
Changes in v2:
- Add metadata invalid fix
- Move doc note to a separate patch
- Introduce V4L2_META_FMT_UVC_CUSTOM (thanks HdG!).
- Link to v1: https://lore.kernel.org/r/20250226-uvc-metadata-v1-1-6cd6fe5ec2cb@chromium.…
---
Ricardo Ribalda (5):
media: uvcvideo: Do not mark valid metadata as invalid
media: Documentation: Add note about UVCH length field
media: uvcvideo: Introduce dev->meta_formats
media: uvcvideo: Introduce V4L2_META_FMT_UVC_MSXU_1_5
media: uvcvideo: Auto-set UVC_QUIRK_MSXU_META
.../userspace-api/media/v4l/meta-formats.rst | 1 +
.../media/v4l/metafmt-uvc-msxu-1-5.rst | 23 ++++
.../userspace-api/media/v4l/metafmt-uvc.rst | 4 +-
MAINTAINERS | 1 +
drivers/media/usb/uvc/uvc_driver.c | 7 ++
drivers/media/usb/uvc/uvc_metadata.c | 133 +++++++++++++++++++--
drivers/media/usb/uvc/uvc_video.c | 12 +-
drivers/media/usb/uvc/uvcvideo.h | 3 +
drivers/media/v4l2-core/v4l2-ioctl.c | 1 +
include/linux/usb/uvc.h | 3 +
include/uapi/linux/videodev2.h | 1 +
11 files changed, 175 insertions(+), 14 deletions(-)
---
base-commit: c3021d6a80ff05034dfee494115ec71f1954e311
change-id: 20250403-uvc-meta-e556773d12ae
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
Hi Greg and crew,
Can you revert:
commit 746e7d285dcb96caa1845fbbb62b14bf4010cdfb
Author: Jens Axboe <axboe(a)kernel.dk>
Date: Wed May 7 08:07:09 2025 -0600
io_uring: ensure deferred completions are posted for multishot
in 6.6-stable? There's some missing dependencies that makes this not
work right, I'll bring it back in a series instead.
--
Jens Axboe
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 23be716b1c4f3f3a6c00ee38d51a57ef7db9ef7d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025061709-overboard-duplicate-5035@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 23be716b1c4f3f3a6c00ee38d51a57ef7db9ef7d Mon Sep 17 00:00:00 2001
From: Dave Chinner <dchinner(a)redhat.com>
Date: Thu, 1 May 2025 09:27:24 +1000
Subject: [PATCH] xfs: don't assume perags are initialised when trimming AGs
When running fstrim immediately after mounting a V4 filesystem,
the fstrim fails to trim all the free space in the filesystem. It
only trims the first extent in the by-size free space tree in each
AG and then returns. If a second fstrim is then run, it runs
correctly and the entire free space in the filesystem is iterated
and discarded correctly.
The problem lies in the setup of the trim cursor - it assumes that
pag->pagf_longest is valid without either reading the AGF first or
checking if xfs_perag_initialised_agf(pag) is true or not.
As a result, when a filesystem is mounted without reading the AGF
(e.g. a clean mount on a v4 filesystem) and the first operation is a
fstrim call, pag->pagf_longest is zero and so the free extent search
starts at the wrong end of the by-size btree and exits after
discarding the first record in the tree.
Fix this by deferring the initialisation of tcur->count to after
we have locked the AGF and guaranteed that the perag is properly
initialised. We trigger this on tcur->count == 0 after locking the
AGF, as this will only occur on the first call to
xfs_trim_gather_extents() for each AG. If we need to iterate,
tcur->count will be set to the length of the record we need to
restart at, so we can use this to ensure we only sample a valid
pag->pagf_longest value for the iteration.
Signed-off-by: Dave Chinner <dchinner(a)redhat.com>
Reviewed-by: Bill O'Donnell <bodonnel(a)redhat.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Fixes: 89cfa899608f ("xfs: reduce AGF hold times during fstrim operations")
Cc: <stable(a)vger.kernel.org> # v6.6
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
index c1a306268ae4..94d0873bcd62 100644
--- a/fs/xfs/xfs_discard.c
+++ b/fs/xfs/xfs_discard.c
@@ -167,6 +167,14 @@ xfs_discard_extents(
return error;
}
+/*
+ * Care must be taken setting up the trim cursor as the perags may not have been
+ * initialised when the cursor is initialised. e.g. a clean mount which hasn't
+ * read in AGFs and the first operation run on the mounted fs is a trim. This
+ * can result in perag fields that aren't initialised until
+ * xfs_trim_gather_extents() calls xfs_alloc_read_agf() to lock down the AG for
+ * the free space search.
+ */
struct xfs_trim_cur {
xfs_agblock_t start;
xfs_extlen_t count;
@@ -204,6 +212,14 @@ xfs_trim_gather_extents(
if (error)
goto out_trans_cancel;
+ /*
+ * First time through tcur->count will not have been initialised as
+ * pag->pagf_longest is not guaranteed to be valid before we read
+ * the AGF buffer above.
+ */
+ if (!tcur->count)
+ tcur->count = pag->pagf_longest;
+
if (tcur->by_bno) {
/* sub-AG discard request always starts at tcur->start */
cur = xfs_bnobt_init_cursor(mp, tp, agbp, pag);
@@ -350,7 +366,6 @@ xfs_trim_perag_extents(
{
struct xfs_trim_cur tcur = {
.start = start,
- .count = pag->pagf_longest,
.end = end,
.minlen = minlen,
};
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 0736299d090f5c6a1032678705c4bc0a9511a3db
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025061709-nacho-bronchial-18a8@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0736299d090f5c6a1032678705c4bc0a9511a3db Mon Sep 17 00:00:00 2001
From: Amit Sunil Dhamne <amitsd(a)google.com>
Date: Fri, 2 May 2025 16:57:03 -0700
Subject: [PATCH] usb: typec: tcpm/tcpci_maxim: Fix bounds check in
process_rx()
Register read of TCPC_RX_BYTE_CNT returns the total size consisting of:
PD message (pending read) size + 1 Byte for Frame Type (SOP*)
This is validated against the max PD message (`struct pd_message`) size
without accounting for the extra byte for the frame type. Note that the
struct pd_message does not contain a field for the frame_type. This
results in false negatives when the "PD message (pending read)" is equal
to the max PD message size.
Fixes: 6f413b559f86 ("usb: typec: tcpci_maxim: Chip level TCPC driver")
Signed-off-by: Amit Sunil Dhamne <amitsd(a)google.com>
Signed-off-by: Badhri Jagan Sridharan <badhri(a)google.com>
Reviewed-by: Kyle Tso <kyletso(a)google.com>
Cc: stable <stable(a)kernel.org>
Link: https://lore.kernel.org/stable/20250502-b4-new-fix-pd-rx-count-v1-1-e5711ed…
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250502-b4-new-fix-pd-rx-count-v1-1-e5711ed09b3d…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/tcpm/tcpci_maxim_core.c b/drivers/usb/typec/tcpm/tcpci_maxim_core.c
index 29a4aa89d1a1..b5a5ed40faea 100644
--- a/drivers/usb/typec/tcpm/tcpci_maxim_core.c
+++ b/drivers/usb/typec/tcpm/tcpci_maxim_core.c
@@ -166,7 +166,8 @@ static void process_rx(struct max_tcpci_chip *chip, u16 status)
return;
}
- if (count > sizeof(struct pd_message) || count + 1 > TCPC_RECEIVE_BUFFER_LEN) {
+ if (count > sizeof(struct pd_message) + 1 ||
+ count + 1 > TCPC_RECEIVE_BUFFER_LEN) {
dev_err(chip->dev, "Invalid TCPC_RX_BYTE_CNT %d\n", count);
return;
}
Currently the 'pispbe_schedule()' function does two things:
1) Tries to assemble a job by inspecting all the video node queues
to make sure all the required buffers are available
2) Submit the job to the hardware
The pispbe_schedule() function is called at:
- video device start_streaming() time
- video device qbuf() time
- irq handler
As assembling a job requires inspecting all queues, it is a rather
time consuming operation which is better not run in IRQ context.
To avoid executing the time consuming job creation in interrupt
context, split the job creation and job scheduling in two distinct
operations. When a well-formed job is created, append it to the
newly introduced 'pispbe->job_queue' where it will be dequeued from
by the scheduling routine.
At start_streaming() and qbuf() time immediately try to schedule a job
if one has been created as the irq handler routine is only called when
a job has completed, and we can't solely rely on it for scheduling new
jobs.
Signed-off-by: Jacopo Mondi <jacopo.mondi(a)ideasonboard.com>
---
Changes in v8:
- Use automatic release of *job in pispbe_prepare_job()
- Use temporary list to release jobs without holding the main driver
lock
- Collect tags
- Rebased on rpi-6.6.y: https://github.com/raspberrypi/linux/pull/6905
- Link to v7: https://lore.kernel.org/r/20250606-pispbe-mainline-split-jobs-handling-v6-v…
Changes in v7:
- Rebased on media-committers/next
- Fix lockdep warning by using the proper spinlock_irq() primitive in
pispbe_prepare_job() which can race with the IRQ handler
- Link to v6: https://lore.kernel.org/r/20240930-pispbe-mainline-split-jobs-handling-v6-v…
v5->v6:
- Make the driver depend on PM
- Simplify the probe() routine by using pm_runtime_
- Remove suspend call from remove()
v4->v5:
- Use appropriate locking constructs:
- spin_lock_irq() for pispbe_prepare_job() called from non irq context
- spin_lock_irqsave() for pispbe_schedule() called from irq context
- Remove hw_lock from ready_queue accesses in stop_streaming and
start_streaming
- Fix trivial indentation mistake in 4/4
v3->v4:
- Expand commit message in 2/4 to explain why removing validation in schedule()
is safe
- Drop ready_lock spinlock
- Use non _irqsave version of safe_guard(spinlock
- Support !CONFIG_PM in 4/4 by calling the enable/disable routines directly
and adjust pm_runtime usage as suggested by Laurent
v2->v3:
- Mark pispbe_runtime_resume() as __maybe_unused
- Add fixes tags where appropriate
v1->v2:
- Add two patches to address Laurent's comments separately
- use scoped_guard() when possible
- Add patch to fix runtime_pm imbalance
---
Jacopo Mondi (4):
media: pisp_be: Drop reference to non-existing function
media: pisp_be: Remove config validation from schedule()
media: pisp_be: Split jobs creation and scheduling
media: pisp_be: Fix pm_runtime underrun in probe
drivers/media/platform/raspberrypi/pisp_be/Kconfig | 1 +
.../media/platform/raspberrypi/pisp_be/pisp_be.c | 196 ++++++++++-----------
2 files changed, 98 insertions(+), 99 deletions(-)
---
base-commit: ce5cac69b2edac3e3246fee03e8f4c2a1075238b
change-id: 20240930-pispbe-mainline-split-jobs-handling-v6-15dc16e11e3a
Best regards,
--
Jacopo Mondi <jacopo.mondi(a)ideasonboard.com>
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x f4ecdc352646f7d23f348e5c544dbe3212c94fc8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025061707-putt-mutable-5fb5@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f4ecdc352646f7d23f348e5c544dbe3212c94fc8 Mon Sep 17 00:00:00 2001
From: Pawel Laszczak <pawell(a)cadence.com>
Date: Tue, 13 May 2025 05:30:09 +0000
Subject: [PATCH] usb: cdnsp: Fix issue with detecting command completion event
In some cases, there is a small-time gap in which CMD_RING_BUSY can be
cleared by controller but adding command completion event to event ring
will be delayed. As the result driver will return error code.
This behavior has been detected on usbtest driver (test 9) with
configuration including ep1in/ep1out bulk and ep2in/ep2out isoc
endpoint.
Probably this gap occurred because controller was busy with adding some
other events to event ring.
The CMD_RING_BUSY is cleared to '0' when the Command Descriptor has been
executed and not when command completion event has been added to event
ring.
To fix this issue for this test the small delay is sufficient less than
10us) but to make sure the problem doesn't happen again in the future
the patch introduces 10 retries to check with delay about 20us before
returning error code.
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Pawel Laszczak <pawell(a)cadence.com>
Acked-by: Peter Chen <peter.chen(a)kernel.org>
Link: https://lore.kernel.org/r/PH7PR07MB9538AA45362ACCF1B94EE9B7DD96A@PH7PR07MB9…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/cdns3/cdnsp-gadget.c b/drivers/usb/cdns3/cdnsp-gadget.c
index cd1e00daf43f..55f95f41b3b4 100644
--- a/drivers/usb/cdns3/cdnsp-gadget.c
+++ b/drivers/usb/cdns3/cdnsp-gadget.c
@@ -548,6 +548,7 @@ int cdnsp_wait_for_cmd_compl(struct cdnsp_device *pdev)
dma_addr_t cmd_deq_dma;
union cdnsp_trb *event;
u32 cycle_state;
+ u32 retry = 10;
int ret, val;
u64 cmd_dma;
u32 flags;
@@ -579,8 +580,23 @@ int cdnsp_wait_for_cmd_compl(struct cdnsp_device *pdev)
flags = le32_to_cpu(event->event_cmd.flags);
/* Check the owner of the TRB. */
- if ((flags & TRB_CYCLE) != cycle_state)
+ if ((flags & TRB_CYCLE) != cycle_state) {
+ /*
+ * Give some extra time to get chance controller
+ * to finish command before returning error code.
+ * Checking CMD_RING_BUSY is not sufficient because
+ * this bit is cleared to '0' when the Command
+ * Descriptor has been executed by controller
+ * and not when command completion event has
+ * be added to event ring.
+ */
+ if (retry--) {
+ udelay(20);
+ continue;
+ }
+
return -EINVAL;
+ }
cmd_dma = le64_to_cpu(event->event_cmd.cmd_trb);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 1bd6406fb5f36c2bb1e96e27d4c3e9f4d09edde4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025061733-scarring-crevice-7648@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1bd6406fb5f36c2bb1e96e27d4c3e9f4d09edde4 Mon Sep 17 00:00:00 2001
From: Wupeng Ma <mawupeng1(a)huawei.com>
Date: Sat, 10 May 2025 11:30:40 +0800
Subject: [PATCH] VMCI: fix race between vmci_host_setup_notify and
vmci_ctx_unset_notify
During our test, it is found that a warning can be trigger in try_grab_folio
as follow:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1678 at mm/gup.c:147 try_grab_folio+0x106/0x130
Modules linked in:
CPU: 0 UID: 0 PID: 1678 Comm: syz.3.31 Not tainted 6.15.0-rc5 #163 PREEMPT(undef)
RIP: 0010:try_grab_folio+0x106/0x130
Call Trace:
<TASK>
follow_huge_pmd+0x240/0x8e0
follow_pmd_mask.constprop.0.isra.0+0x40b/0x5c0
follow_pud_mask.constprop.0.isra.0+0x14a/0x170
follow_page_mask+0x1c2/0x1f0
__get_user_pages+0x176/0x950
__gup_longterm_locked+0x15b/0x1060
? gup_fast+0x120/0x1f0
gup_fast_fallback+0x17e/0x230
get_user_pages_fast+0x5f/0x80
vmci_host_unlocked_ioctl+0x21c/0xf80
RIP: 0033:0x54d2cd
---[ end trace 0000000000000000 ]---
Digging into the source, context->notify_page may init by get_user_pages_fast
and can be seen in vmci_ctx_unset_notify which will try to put_page. However
get_user_pages_fast is not finished here and lead to following
try_grab_folio warning. The race condition is shown as follow:
cpu0 cpu1
vmci_host_do_set_notify
vmci_host_setup_notify
get_user_pages_fast(uva, 1, FOLL_WRITE, &context->notify_page);
lockless_pages_from_mm
gup_pgd_range
gup_huge_pmd // update &context->notify_page
vmci_host_do_set_notify
vmci_ctx_unset_notify
notify_page = context->notify_page;
if (notify_page)
put_page(notify_page); // page is freed
__gup_longterm_locked
__get_user_pages
follow_trans_huge_pmd
try_grab_folio // warn here
To slove this, use local variable page to make notify_page can be seen
after finish get_user_pages_fast.
Fixes: a1d88436d53a ("VMCI: Fix two UVA mapping bugs")
Cc: stable <stable(a)kernel.org>
Closes: https://lore.kernel.org/all/e91da589-ad57-3969-d979-879bbd10dddd@huawei.com/
Signed-off-by: Wupeng Ma <mawupeng1(a)huawei.com>
Link: https://lore.kernel.org/r/20250510033040.901582-1-mawupeng1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/vmw_vmci/vmci_host.c b/drivers/misc/vmw_vmci/vmci_host.c
index abe79f6fd2a7..b64944367ac5 100644
--- a/drivers/misc/vmw_vmci/vmci_host.c
+++ b/drivers/misc/vmw_vmci/vmci_host.c
@@ -227,6 +227,7 @@ static int drv_cp_harray_to_user(void __user *user_buf_uva,
static int vmci_host_setup_notify(struct vmci_ctx *context,
unsigned long uva)
{
+ struct page *page;
int retval;
if (context->notify_page) {
@@ -243,13 +244,11 @@ static int vmci_host_setup_notify(struct vmci_ctx *context,
/*
* Lock physical page backing a given user VA.
*/
- retval = get_user_pages_fast(uva, 1, FOLL_WRITE, &context->notify_page);
- if (retval != 1) {
- context->notify_page = NULL;
+ retval = get_user_pages_fast(uva, 1, FOLL_WRITE, &page);
+ if (retval != 1)
return VMCI_ERROR_GENERIC;
- }
- if (context->notify_page == NULL)
- return VMCI_ERROR_UNAVAILABLE;
+
+ context->notify_page = page;
/*
* Map the locked page and set up notify pointer.
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 1bd6406fb5f36c2bb1e96e27d4c3e9f4d09edde4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025061722-shaded-throwback-5dda@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1bd6406fb5f36c2bb1e96e27d4c3e9f4d09edde4 Mon Sep 17 00:00:00 2001
From: Wupeng Ma <mawupeng1(a)huawei.com>
Date: Sat, 10 May 2025 11:30:40 +0800
Subject: [PATCH] VMCI: fix race between vmci_host_setup_notify and
vmci_ctx_unset_notify
During our test, it is found that a warning can be trigger in try_grab_folio
as follow:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1678 at mm/gup.c:147 try_grab_folio+0x106/0x130
Modules linked in:
CPU: 0 UID: 0 PID: 1678 Comm: syz.3.31 Not tainted 6.15.0-rc5 #163 PREEMPT(undef)
RIP: 0010:try_grab_folio+0x106/0x130
Call Trace:
<TASK>
follow_huge_pmd+0x240/0x8e0
follow_pmd_mask.constprop.0.isra.0+0x40b/0x5c0
follow_pud_mask.constprop.0.isra.0+0x14a/0x170
follow_page_mask+0x1c2/0x1f0
__get_user_pages+0x176/0x950
__gup_longterm_locked+0x15b/0x1060
? gup_fast+0x120/0x1f0
gup_fast_fallback+0x17e/0x230
get_user_pages_fast+0x5f/0x80
vmci_host_unlocked_ioctl+0x21c/0xf80
RIP: 0033:0x54d2cd
---[ end trace 0000000000000000 ]---
Digging into the source, context->notify_page may init by get_user_pages_fast
and can be seen in vmci_ctx_unset_notify which will try to put_page. However
get_user_pages_fast is not finished here and lead to following
try_grab_folio warning. The race condition is shown as follow:
cpu0 cpu1
vmci_host_do_set_notify
vmci_host_setup_notify
get_user_pages_fast(uva, 1, FOLL_WRITE, &context->notify_page);
lockless_pages_from_mm
gup_pgd_range
gup_huge_pmd // update &context->notify_page
vmci_host_do_set_notify
vmci_ctx_unset_notify
notify_page = context->notify_page;
if (notify_page)
put_page(notify_page); // page is freed
__gup_longterm_locked
__get_user_pages
follow_trans_huge_pmd
try_grab_folio // warn here
To slove this, use local variable page to make notify_page can be seen
after finish get_user_pages_fast.
Fixes: a1d88436d53a ("VMCI: Fix two UVA mapping bugs")
Cc: stable <stable(a)kernel.org>
Closes: https://lore.kernel.org/all/e91da589-ad57-3969-d979-879bbd10dddd@huawei.com/
Signed-off-by: Wupeng Ma <mawupeng1(a)huawei.com>
Link: https://lore.kernel.org/r/20250510033040.901582-1-mawupeng1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/vmw_vmci/vmci_host.c b/drivers/misc/vmw_vmci/vmci_host.c
index abe79f6fd2a7..b64944367ac5 100644
--- a/drivers/misc/vmw_vmci/vmci_host.c
+++ b/drivers/misc/vmw_vmci/vmci_host.c
@@ -227,6 +227,7 @@ static int drv_cp_harray_to_user(void __user *user_buf_uva,
static int vmci_host_setup_notify(struct vmci_ctx *context,
unsigned long uva)
{
+ struct page *page;
int retval;
if (context->notify_page) {
@@ -243,13 +244,11 @@ static int vmci_host_setup_notify(struct vmci_ctx *context,
/*
* Lock physical page backing a given user VA.
*/
- retval = get_user_pages_fast(uva, 1, FOLL_WRITE, &context->notify_page);
- if (retval != 1) {
- context->notify_page = NULL;
+ retval = get_user_pages_fast(uva, 1, FOLL_WRITE, &page);
+ if (retval != 1)
return VMCI_ERROR_GENERIC;
- }
- if (context->notify_page == NULL)
- return VMCI_ERROR_UNAVAILABLE;
+
+ context->notify_page = page;
/*
* Map the locked page and set up notify pointer.