There is no page fault without MMU. Compiling the rtapp/pagefault monitor
without CONFIG_MMU fails as page fault tracepoints' definitions are not
available.
Make rtapp/pagefault monitor depends on CONFIG_MMU.
Fixes: 9162620eb604 ("rv: Add rtapp_pagefault monitor")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202509260455.6Z9Vkty4-lkp@intel.com/
Cc: stable(a)vger.kernel.org
---
kernel/trace/rv/monitors/pagefault/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/trace/rv/monitors/pagefault/Kconfig b/kernel/trace/rv/monitors/pagefault/Kconfig
index 5e16625f1653..0e013f00c33b 100644
--- a/kernel/trace/rv/monitors/pagefault/Kconfig
+++ b/kernel/trace/rv/monitors/pagefault/Kconfig
@@ -5,6 +5,7 @@ config RV_MON_PAGEFAULT
select RV_LTL_MONITOR
depends on RV_MON_RTAPP
depends on X86 || RISCV
+ depends on MMU
default y
select LTL_MON_EVENTS_ID
bool "pagefault monitor"
--
2.51.0
Loading a large (~2.1G) files with kexec crashes the host with when
running:
# kexec --load kernel --initrd initrd_with_2G_or_more
UBSAN: signed-integer-overflow in ./include/crypto/sha256_base.h:64:19
34152083 * 64 cannot be represented in type 'int'
...
BUG: unable to handle page fault for address: ff9fffff83b624c0
sha256_update (lib/crypto/sha256.c:137)
crypto_sha256_update (crypto/sha256_generic.c:40)
kexec_calculate_store_digests (kernel/kexec_file.c:769)
__se_sys_kexec_file_load (kernel/kexec_file.c:397 kernel/kexec_file.c:332)
...
(Line numbers based on commit da274362a7bd9 ("Linux 6.12.49")
This is not happening upstream (v6.16+), given that `block` type was
upgraded from "int" to "size_t" in commit 74a43a2cf5e8 ("crypto:
lib/sha256 - Move partial block handling out")
Upgrade the block type similar to the commit above, avoiding hitting the
overflow.
This patch is only suitable for the stable tree, and before 6.16, which
got commit 74a43a2cf5e8 ("crypto: lib/sha256 - Move partial block
handling out")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Fixes: 11b8d5ef9138 ("crypto: sha256 - implement base layer for SHA-256") # not after v6.16
Reported-by: Michael van der Westhuizen <rmikey(a)meta.com>
Reported-by: Tobias Fleig <tfleig(a)meta.com>
---
include/crypto/sha256_base.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/crypto/sha256_base.h b/include/crypto/sha256_base.h
index e0418818d63c8..fa63af10102b2 100644
--- a/include/crypto/sha256_base.h
+++ b/include/crypto/sha256_base.h
@@ -44,7 +44,7 @@ static inline int lib_sha256_base_do_update(struct sha256_state *sctx,
sctx->count += len;
if (unlikely((partial + len) >= SHA256_BLOCK_SIZE)) {
- int blocks;
+ size_t blocks;
if (partial) {
int p = SHA256_BLOCK_SIZE - partial;
---
base-commit: da274362a7bd9ab3a6e46d15945029145ebce672
change-id: 20251001-stable_crash-f2151baf043b
Best regards,
--
Breno Leitao <leitao(a)debian.org>
Hi Stable,
Please provide a quote for your products:
Include:
1.Pricing (per unit)
2.Delivery cost & timeline
3.Quote expiry date
Deadline: September
Thanks!
Kamal Prasad
Albinayah Trading
The patch titled
Subject: mm/ksm: fix flag-dropping behavior in ksm_madvise
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-ksm-fix-flag-dropping-behavior-in-ksm_madvise.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Jakub Acs <acsjakub(a)amazon.de>
Subject: mm/ksm: fix flag-dropping behavior in ksm_madvise
Date: Wed, 1 Oct 2025 09:03:52 +0000
syzkaller discovered the following crash: (kernel BUG)
[ 44.607039] ------------[ cut here ]------------
[ 44.607422] kernel BUG at mm/userfaultfd.c:2067!
[ 44.608148] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.608814] CPU: 1 UID: 0 PID: 2475 Comm: reproducer Not tainted 6.16.0-rc6 #1 PREEMPT(none)
[ 44.609635] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.610695] RIP: 0010:userfaultfd_release_all+0x3a8/0x460
<snip other registers, drop unreliable trace>
[ 44.617726] Call Trace:
[ 44.617926] <TASK>
[ 44.619284] userfaultfd_release+0xef/0x1b0
[ 44.620976] __fput+0x3f9/0xb60
[ 44.621240] fput_close_sync+0x110/0x210
[ 44.622222] __x64_sys_close+0x8f/0x120
[ 44.622530] do_syscall_64+0x5b/0x2f0
[ 44.622840] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 44.623244] RIP: 0033:0x7f365bb3f227
Kernel panics because it detects UFFD inconsistency during
userfaultfd_release_all(). Specifically, a VMA which has a valid pointer
to vma->vm_userfaultfd_ctx, but no UFFD flags in vma->vm_flags.
The inconsistency is caused in ksm_madvise(): when user calls madvise()
with MADV_UNMEARGEABLE on a VMA that is registered for UFFD in MINOR mode,
it accidentally clears all flags stored in the upper 32 bits of
vma->vm_flags.
Assuming x86_64 kernel build, unsigned long is 64-bit and unsigned int and
int are 32-bit wide. This setup causes the following mishap during the &=
~VM_MERGEABLE assignment.
VM_MERGEABLE is a 32-bit constant of type unsigned int, 0x8000'0000.
After ~ is applied, it becomes 0x7fff'ffff unsigned int, which is then
promoted to unsigned long before the & operation. This promotion fills
upper 32 bits with leading 0s, as we're doing unsigned conversion (and
even for a signed conversion, this wouldn't help as the leading bit is 0).
& operation thus ends up AND-ing vm_flags with 0x0000'0000'7fff'ffff
instead of intended 0xffff'ffff'7fff'ffff and hence accidentally clears
the upper 32-bits of its value.
Fix it by changing `VM_MERGEABLE` constant to unsigned long, using the
BIT() macro.
Note: other VM_* flags are not affected: This only happens to the
VM_MERGEABLE flag, as the other VM_* flags are all constants of type int
and after ~ operation, they end up with leading 1 and are thus converted
to unsigned long with leading 1s.
Note 2:
After commit 31defc3b01d9 ("userfaultfd: remove (VM_)BUG_ON()s"), this is
no longer a kernel BUG, but a WARNING at the same place:
[ 45.595973] WARNING: CPU: 1 PID: 2474 at mm/userfaultfd.c:2067
but the root-cause (flag-drop) remains the same.
Link: https://lkml.kernel.org/r/20251001090353.57523-2-acsjakub@amazon.de
Fixes: 7677f7fd8be7 ("userfaultfd: add minor fault registration mode")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: SeongJae Park <sj(a)kernel.org>
Cc: Xu Xin <xu.xin16(a)zte.com.cn>
Cc: Chengming Zhou <chengming.zhou(a)linux.dev>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/mm.h~mm-ksm-fix-flag-dropping-behavior-in-ksm_madvise
+++ a/include/linux/mm.h
@@ -296,7 +296,7 @@ extern unsigned int kobjsize(const void
#define VM_MIXEDMAP 0x10000000 /* Can contain "struct page" and pure PFN pages */
#define VM_HUGEPAGE 0x20000000 /* MADV_HUGEPAGE marked this vma */
#define VM_NOHUGEPAGE 0x40000000 /* MADV_NOHUGEPAGE marked this vma */
-#define VM_MERGEABLE 0x80000000 /* KSM may merge identical pages */
+#define VM_MERGEABLE BIT(31) /* KSM may merge identical pages */
#ifdef CONFIG_ARCH_USES_HIGH_VMA_FLAGS
#define VM_HIGH_ARCH_BIT_0 32 /* bit only usable on 64-bit architectures */
_
Patches currently in -mm which might be from acsjakub(a)amazon.de are
mm-ksm-fix-flag-dropping-behavior-in-ksm_madvise.patch
The core scheduling is for smt enabled cpus. It is not returns
failure and gives plenty of error messages and not clearly points
to the smt issue if the smt is disabled. It just mention
"not a core sched system" and many other messages. For example:
Not a core sched system
tid=210574, / tgid=210574 / pgid=210574: ffffffffffffffff
Not a core sched system
tid=210575, / tgid=210575 / pgid=210574: ffffffffffffffff
Not a core sched system
tid=210577, / tgid=210575 / pgid=210574: ffffffffffffffff
(similar things many other times)
In this patch, the test will first read /sys/devices/system/cpu/smt/active,
if the file cannot be opened or its value is 0, the test is skipped with
an explanatory message. This helps developers understand why it is skipped
and avoids unnecessary attention when running the full selftest suite.
Signed-off-by: Yifei Liu <yifei.l.liu(a)oracle.com>
---
tools/testing/selftests/sched/cs_prctl_test.c | 23 ++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/sched/cs_prctl_test.c b/tools/testing/selftests/sched/cs_prctl_test.c
index 52d97fae4dbd..7ce8088cde6a 100644
--- a/tools/testing/selftests/sched/cs_prctl_test.c
+++ b/tools/testing/selftests/sched/cs_prctl_test.c
@@ -32,6 +32,8 @@
#include <stdlib.h>
#include <string.h>
+#include "../kselftest.h"
+
#if __GLIBC_PREREQ(2, 30) == 0
#include <sys/syscall.h>
static pid_t gettid(void)
@@ -109,6 +111,22 @@ static void handle_usage(int rc, char *msg)
exit(rc);
}
+int check_smt(void)
+{
+ int c = 0;
+ FILE *file;
+
+ file = fopen("/sys/devices/system/cpu/smt/active", "r");
+ if (!file)
+ return 0;
+ c = fgetc(file) - 0x30;
+ fclose(file);
+ if (c == 0 || c == 1)
+ return c;
+ //if fgetc returns EOF or -1 for correupted files, return 0.
+ return 0;
+}
+
static unsigned long get_cs_cookie(int pid)
{
unsigned long long cookie;
@@ -271,7 +289,10 @@ int main(int argc, char *argv[])
delay = -1;
srand(time(NULL));
-
+ if (!check_smt()) {
+ ksft_test_result_skip("smt not enabled\n");
+ return 1;
+ }
/* put into separate process group */
if (setpgid(0, 0) != 0)
handle_error("process group");
--
2.50.1
Greetings:
Sending via plain text email -- apologies if you receive this twice.
If this isn't the process for reporting a regression in a LTS kernel per https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html, I'm happy to follow another process.
Kernel 6.1.149 introduced a regression, at least on our ARM Cortex A57-based platforms, via commit 8f4dc4e54eed4bebb18390305eb1f721c00457e1 in arch/arm64/kernel/fpsimd.c where booting KVM VMs eventually leads to a spinlock recursion BUG and crash of the box.
Reverting that commit via the below reverts to the old (working) behavior:
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 837d1937300a57..bc42163a7fd1f0 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1851,10 +1851,10 @@ void fpsimd_save_and_flush_cpu_state(void)
if (!system_supports_fpsimd())
return;
WARN_ON(preemptible());
- get_cpu_fpsimd_context();
+ __get_cpu_fpsimd_context();
fpsimd_save();
fpsimd_flush_cpu_state();
- put_cpu_fpsimd_context();
+ __put_cpu_fpsimd_context();
}
#ifdef CONFIG_KERNEL_MODE_NEON
It's not entirely clear to me if this is specific to our firmware, specific to ARM Cortex A57, or more systemic as we lack sufficiently differentiated hardware to know. I've tested on the latest 6.1 kernel in addition to the one in the log below and have also tested a number of firmware versions available for these boxes.
Steps to reproduce:
Boot VM in qemu-system-aarch64 with "-accel kvm" and "-cpu host" flags set -- no other arguments seem to matter
Generate CPU load in VM
Kernel log:
[sjc1] root@si-compute-kvm-e0fff70016b4:/# [ 805.905413] BUG: spinlock recursion on CPU#7, CPU 3/KVM/57616
[ 805.905452] lock: 0xffff3045ef850240, .magic: dead4ead, .owner: CPU 3/KVM/57616, .owner_cpu: 7
[ 805.905477] CPU: 7 PID: 57616 Comm: CPU 3/KVM Tainted: G O 6.1.152 #1
[ 805.905495] Hardware name: SoftIron SoftIron Platform Mainboard/SoftIron Platform Mainboard, BIOS 1.31 May 11 2023
[ 805.905516] Call trace:
[ 805.905524] dump_backtrace+0xe4/0x110
[ 805.905538] show_stack+0x20/0x30
[ 805.905548] dump_stack_lvl+0x6c/0x88
[ 805.905561] dump_stack+0x18/0x34
[ 805.905571] spin_dump+0x98/0xac
[ 805.905583] do_raw_spin_lock+0x70/0x128
[ 805.905596] _raw_spin_lock+0x18/0x28
[ 805.905607] raw_spin_rq_lock_nested+0x18/0x28
[ 805.905620] update_blocked_averages+0x70/0x550
[ 805.905634] run_rebalance_domains+0x50/0x70
[ 805.905645] handle_softirqs+0x198/0x328
[ 805.905659] __do_softirq+0x1c/0x28
[ 805.905669] ____do_softirq+0x18/0x28
[ 805.905680] call_on_irq_stack+0x30/0x48
[ 805.905691] do_softirq_own_stack+0x24/0x30
[ 805.905703] do_softirq+0x74/0x90
[ 805.905714] __local_bh_enable_ip+0x64/0x80
[ 805.905727] fpsimd_save_and_flush_cpu_state+0x5c/0x68
[ 805.905740] kvm_arch_vcpu_put_fp+0x4c/0x88
[ 805.905752] kvm_arch_vcpu_put+0x28/0x88
[ 805.905764] kvm_sched_out+0x38/0x58
[ 805.905774] __schedule+0x55c/0x6c8
[ 805.905786] schedule+0x60/0xa8
[ 805.905796] kvm_vcpu_block+0x5c/0x90
[ 805.905807] kvm_vcpu_halt+0x440/0x468
[ 805.905818] kvm_vcpu_wfi+0x3c/0x70
[ 805.905828] kvm_handle_wfx+0x18c/0x1f0
[ 805.905840] handle_exit+0xb8/0x148
[ 805.905851] kvm_arch_vcpu_ioctl_run+0x6c4/0x7b0
[ 805.905863] kvm_vcpu_ioctl+0x1d0/0x8b8
[ 805.905874] __arm64_sys_ioctl+0x9c/0xe0
[ 805.905886] invoke_syscall+0x78/0x108
[ 805.905899] el0_svc_common.constprop.3+0xb4/0xf8
[ 805.905912] do_el0_svc+0x78/0x88
[ 805.905922] el0_svc+0x48/0x78
[ 805.905932] el0t_64_sync_handler+0x40/0xc0
[ 805.905943] el0t_64_sync+0x18c/0x190
[ 806.048300] hrtimer: interrupt took 2976 ns
[ 826.924613] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
SoC 0 became not ready
SoC 0 became ready
Thanks,
--
Kenneth Van Alstyne, Jr.
This is the start of the stable review cycle for the 6.6.109 release.
There are 91 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 02 Oct 2025 14:37:59 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.109-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.6.109-rc1
David Laight <David.Laight(a)ACULAB.COM>
minmax.h: remove some #defines that are only expanded once
David Laight <David.Laight(a)ACULAB.COM>
minmax.h: simplify the variants of clamp()
David Laight <David.Laight(a)ACULAB.COM>
minmax.h: move all the clamp() definitions after the min/max() ones
David Laight <David.Laight(a)ACULAB.COM>
minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
David Laight <David.Laight(a)ACULAB.COM>
minmax.h: reduce the #define expansion of min(), max() and clamp()
David Laight <David.Laight(a)ACULAB.COM>
minmax.h: update some comments
David Laight <David.Laight(a)ACULAB.COM>
minmax.h: add whitespace around operators and after commas
Linus Torvalds <torvalds(a)linux-foundation.org>
minmax: fix up min3() and max3() too
Linus Torvalds <torvalds(a)linux-foundation.org>
minmax: improve macro expansion and type checking
Linus Torvalds <torvalds(a)linux-foundation.org>
minmax: don't use max() in situations that want a C constant expression
Linus Torvalds <torvalds(a)linux-foundation.org>
minmax: simplify min()/max()/clamp() implementation
Linus Torvalds <torvalds(a)linux-foundation.org>
minmax: make generic MIN() and MAX() macros available everywhere
Lukasz Czapnik <lukasz.czapnik(a)intel.com>
i40e: add validation for ring_len param
Justin Bronder <jsbronder(a)cold-front.org>
i40e: increase max descriptors for XL710
Nirmoy Das <nirmoyd(a)nvidia.com>
drm/ast: Use msleep instead of mdelay for edid read
Hans de Goede <hansg(a)kernel.org>
gpiolib: Extend software-node support to support secondary software-nodes
Jan Kara <jack(a)suse.cz>
loop: Avoid updating block size under exclusive owner
David Hildenbrand <david(a)redhat.com>
mm/migrate_device: don't add folio to be freed to LRU in migrate_device_finalize()
Kefeng Wang <wangkefeng.wang(a)huawei.com>
mm: migrate_device: use more folio in migrate_device_finalize()
Florian Fainelli <florian.fainelli(a)broadcom.com>
ARM: bcm: Select ARM_GIC_V3 for ARCH_BRCMSTB
Nathan Chancellor <nathan(a)kernel.org>
s390/cpum_cf: Fix uninitialized warning after backport of ce971233242b
Thomas Zimmermann <tzimmermann(a)suse.de>
fbcon: Fix OOB access in font allocation
Samasth Norway Ananda <samasth.norway.ananda(a)oracle.com>
fbcon: fix integer overflow in fbcon_do_set_font
Jinjiang Tu <tujinjiang(a)huawei.com>
mm/hugetlb: fix folio is still mapped when deleted
Eric Biggers <ebiggers(a)kernel.org>
kmsan: fix out-of-bounds access to shadow memory
Zhen Ni <zhen.ni(a)easystack.cn>
afs: Fix potential null pointer dereference in afs_put_server
Nobuhiro Iwamatsu <iwamatsu(a)nigauri.org>
ARM: dts: socfpga: sodia: Fix mdio bus probe and PHY address
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
tracing: dynevent: Add a missing lockdown check on dynevent
Eric Biggers <ebiggers(a)kernel.org>
crypto: af_alg - Fix incorrect boolean values in af_alg_ctx
Lukasz Czapnik <lukasz.czapnik(a)intel.com>
i40e: improve VF MAC filters accounting
Lukasz Czapnik <lukasz.czapnik(a)intel.com>
i40e: add mask to apply valid bits for itr_idx
Lukasz Czapnik <lukasz.czapnik(a)intel.com>
i40e: add max boundary check for VF filters
Lukasz Czapnik <lukasz.czapnik(a)intel.com>
i40e: fix validation of VF state in get resources
Lukasz Czapnik <lukasz.czapnik(a)intel.com>
i40e: fix input validation logic for action_meta
Lukasz Czapnik <lukasz.czapnik(a)intel.com>
i40e: fix idx validation in config queues msg
Lukasz Czapnik <lukasz.czapnik(a)intel.com>
i40e: fix idx validation in i40e_validate_queue_map
Amit Chaudhari <amitchaudhari(a)mac.com>
HID: asus: add support for missing PX series fn keys
Sang-Heon Jeon <ekffu200098(a)gmail.com>
smb: client: fix wrong index reference in smb2_compound_op()
Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
futex: Prevent use-after-free during requeue-PI
Zabelin Nikita <n.zabelin(a)mt-integration.ru>
drm/gma500: Fix null dereference in hdmi teardown
Dan Carpenter <dan.carpenter(a)linaro.org>
octeontx2-pf: Fix potential use after free in otx2_tc_add_flow()
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: dsa: lantiq_gswip: suppress -EINVAL errors for bridge FDB entries added to the CPU port
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: dsa: lantiq_gswip: move gswip_add_single_port_br() call to port_setup()
Martin Schiller <ms(a)dev.tdt.de>
net: dsa: lantiq_gswip: do also enable or disable cpu port
Ido Schimmel <idosch(a)nvidia.com>
selftests: fib_nexthops: Fix creation of non-FDB nexthops
Ido Schimmel <idosch(a)nvidia.com>
nexthop: Forbid FDB status change while nexthop is in a group
Jason Baron <jbaron(a)akamai.com>
net: allow alloc_skb_with_frags() to use MAX_SKB_FRAGS
Alok Tiwari <alok.a.tiwari(a)oracle.com>
bnxt_en: correct offset handling for IPv6 destination address
Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
vhost: Take a reference on the task in struct vhost_task.
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Fix UAF in hci_acl_create_conn_sync
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_sync: Fix hci_resume_advertising_sync
Petr Malat <oss(a)malat.biz>
ethernet: rvu-af: Remove slash from the driver name
Stéphane Grosjean <stephane.grosjean(a)hms-networks.com>
can: peak_usb: fix shift-out-of-bounds issue
Vincent Mailhol <mailhol(a)kernel.org>
can: mcba_usb: populate ndo_change_mtu() to prevent buffer overflow
Vincent Mailhol <mailhol(a)kernel.org>
can: sun4i_can: populate ndo_change_mtu() to prevent buffer overflow
Vincent Mailhol <mailhol(a)kernel.org>
can: hi311x: populate ndo_change_mtu() to prevent buffer overflow
Vincent Mailhol <mailhol(a)kernel.org>
can: etas_es58x: populate ndo_change_mtu() to prevent buffer overflow
Sabrina Dubroca <sd(a)queasysnail.net>
xfrm: xfrm_alloc_spi shouldn't use 0 as SPI
Leon Hwang <leon.hwang(a)linux.dev>
bpf: Reject bpf_timer for PREEMPT_RT
Geert Uytterhoeven <geert+renesas(a)glider.be>
can: rcar_can: rcar_can_resume(): fix s2ram with PSCI
James Guan <guan_yufei(a)163.com>
wifi: virt_wifi: Fix page fault on connect
Stefan Metzmacher <metze(a)samba.org>
smb: server: don't use delayed_work for post_recv_credits_work
Christian Loehle <christian.loehle(a)arm.com>
cpufreq: Initialize cpufreq-based invariance before subsys
Jihed Chaibi <jihed.chaibi.dev(a)gmail.com>
ARM: dts: kirkwood: Fix sound DAI cells for OpenRD clients
Peng Fan <peng.fan(a)nxp.com>
arm64: dts: imx8mp: Correct thermal sensor index
Hugh Dickins <hughd(a)google.com>
mm: folio_may_be_lru_cached() unless folio_test_large()
Hugh Dickins <hughd(a)google.com>
mm/gup: local lru_add_drain() to avoid lru_add_drain_all()
Hugh Dickins <hughd(a)google.com>
mm/gup: check ref_count instead of lru before migration
Shivank Garg <shivankg(a)amd.com>
mm: add folio_expected_ref_count() for reference count calculation
David Hildenbrand <david(a)redhat.com>
mm/gup: revert "mm: gup: fix infinite loop within __get_longterm_locked"
Or Har-Toov <ohartoov(a)nvidia.com>
IB/mlx5: Fix obj_type mismatch for SRQ event subscriptions
qaqland <anguoli(a)uniontech.com>
ALSA: usb-audio: Add mute TLV for playback volumes on more devices
Cryolitia PukNgae <cryolitia(a)uniontech.com>
ALSA: usb-audio: move mixer_quirks' min_mute into common quirk
noble.yang <noble.yang(a)comtrue-inc.com>
ALSA: usb-audio: Add DSD support for Comtrue USB Audio device
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
i2c: designware: Add quirk for Intel Xe
Benoît Monin <benoit.monin(a)bootlin.com>
mmc: sdhci-cadence: add Mobileye eyeQ support
Jiayi Li <lijiayi(a)kylinos.cn>
usb: core: Add 0x prefix to quirks debug output
Takashi Iwai <tiwai(a)suse.de>
ALSA: usb-audio: Fix build with CONFIG_INPUT=n
Chen Ni <nichen(a)iscas.ac.cn>
ALSA: usb-audio: Convert comma to semicolon
Kerem Karabay <kekrby(a)gmail.com>
HID: multitouch: specify that Apple Touch Bar is direct
Kerem Karabay <kekrby(a)gmail.com>
HID: multitouch: take cls->maxcontacts into account for Apple Touch Bar even without a HID_DG_CONTACTMAX field
Kerem Karabay <kekrby(a)gmail.com>
HID: multitouch: support getting the tip state from HID_DG_TOUCH fields in Apple Touch Bar
Kerem Karabay <kekrby(a)gmail.com>
HID: multitouch: Get the contact ID from HID_DG_TRANSDUCER_INDEX fields in case of Apple Touch Bar
Cristian Ciocaltea <cristian.ciocaltea(a)collabora.com>
ALSA: usb-audio: Add mixer quirk for Sony DualSense PS5
Cristian Ciocaltea <cristian.ciocaltea(a)collabora.com>
ALSA: usb-audio: Remove unneeded wmb() in mixer_quirks
Cristian Ciocaltea <cristian.ciocaltea(a)collabora.com>
ALSA: usb-audio: Simplify NULL comparison in mixer_quirks
Cristian Ciocaltea <cristian.ciocaltea(a)collabora.com>
ALSA: usb-audio: Avoid multiple assignments in mixer_quirks
Cristian Ciocaltea <cristian.ciocaltea(a)collabora.com>
ALSA: usb-audio: Drop unnecessary parentheses in mixer_quirks
Cristian Ciocaltea <cristian.ciocaltea(a)collabora.com>
ALSA: usb-audio: Fix block comments in mixer_quirks
Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
firewire: core: fix overlooked update of subsystem ABI version
Alok Tiwari <alok.a.tiwari(a)oracle.com>
scsi: ufs: mcq: Fix memory allocation checks for SQE and CQE
-------------
Diffstat:
Makefile | 4 +-
.../dts/intel/socfpga/socfpga_cyclone5_sodia.dts | 6 +-
.../boot/dts/marvell/kirkwood-openrd-client.dts | 2 +-
arch/arm/mach-bcm/Kconfig | 1 +
arch/arm64/boot/dts/freescale/imx8mp.dtsi | 4 +-
arch/s390/kernel/perf_cpum_cf.c | 4 +-
arch/um/drivers/mconsole_user.c | 2 +
drivers/block/loop.c | 40 ++-
drivers/cpufreq/cpufreq.c | 20 +-
drivers/edac/skx_common.h | 1 -
drivers/firewire/core-cdev.c | 2 +-
drivers/gpio/gpiolib.c | 19 +-
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 +
.../gpu/drm/amd/display/modules/hdcp/hdcp_ddc.c | 2 +
drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 14 +-
.../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +
.../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 3 +
.../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 3 +
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 +-
drivers/gpu/drm/ast/ast_dp.c | 2 +-
drivers/gpu/drm/gma500/oaktrail_hdmi.c | 2 +-
drivers/gpu/drm/radeon/evergreen_cs.c | 2 +
drivers/hid/hid-asus.c | 3 +
drivers/hid/hid-multitouch.c | 45 +++-
drivers/hwmon/adt7475.c | 24 +-
drivers/i2c/busses/i2c-designware-platdrv.c | 7 +-
drivers/infiniband/hw/mlx5/devx.c | 1 +
drivers/input/touchscreen/cyttsp4_core.c | 2 +-
drivers/irqchip/irq-sun6i-r.c | 2 +-
drivers/media/dvb-frontends/stv0367_priv.h | 3 +
drivers/mmc/host/sdhci-cadence.c | 11 +
drivers/net/can/rcar/rcar_can.c | 8 +-
drivers/net/can/spi/hi311x.c | 1 +
drivers/net/can/sun4i_can.c | 1 +
drivers/net/can/usb/etas_es58x/es58x_core.c | 3 +-
drivers/net/can/usb/etas_es58x/es58x_devlink.c | 2 +-
drivers/net/can/usb/mcba_usb.c | 1 +
drivers/net/can/usb/peak_usb/pcan_usb_core.c | 2 +-
drivers/net/dsa/lantiq_gswip.c | 41 +--
drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 2 +-
drivers/net/ethernet/intel/i40e/i40e.h | 4 +-
drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 25 +-
drivers/net/ethernet/intel/i40e/i40e_main.c | 26 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 110 ++++----
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h | 3 +-
drivers/net/ethernet/marvell/octeontx2/af/cgx.c | 3 +-
.../net/ethernet/marvell/octeontx2/nic/otx2_tc.c | 2 +-
drivers/net/fjes/fjes_main.c | 4 +-
drivers/net/wireless/virtual/virt_wifi.c | 4 +-
drivers/nfc/pn544/i2c.c | 2 -
drivers/platform/x86/sony-laptop.c | 1 -
drivers/scsi/isci/init.c | 6 +-
.../pci/hive_isp_css_include/math_support.h | 5 -
drivers/ufs/core/ufs-mcq.c | 4 +-
drivers/usb/core/quirks.c | 2 +-
drivers/video/fbdev/core/fbcon.c | 13 +-
fs/afs/server.c | 3 +-
fs/btrfs/tree-checker.c | 2 +-
fs/hugetlbfs/inode.c | 10 +-
fs/smb/client/smb2inode.c | 2 +-
fs/smb/server/transport_rdma.c | 18 +-
include/crypto/if_alg.h | 2 +-
include/linux/compiler.h | 9 +
include/linux/minmax.h | 234 +++++++++-------
include/linux/mm.h | 55 ++++
include/linux/swap.h | 10 +
include/net/bluetooth/hci_core.h | 21 ++
kernel/bpf/verifier.c | 4 +
kernel/futex/requeue.c | 6 +-
kernel/trace/preemptirq_delay_test.c | 2 -
kernel/trace/trace_dynevent.c | 4 +
kernel/vhost_task.c | 3 +-
lib/btree.c | 1 -
lib/decompress_unlzma.c | 2 +
lib/vsprintf.c | 2 +-
mm/gup.c | 28 +-
mm/kmsan/core.c | 10 +-
mm/kmsan/kmsan_test.c | 16 ++
mm/migrate_device.c | 42 ++-
mm/mlock.c | 6 +-
mm/swap.c | 4 +-
mm/zsmalloc.c | 2 -
net/bluetooth/hci_event.c | 26 +-
net/bluetooth/hci_sync.c | 7 +
net/core/skbuff.c | 2 +-
net/ipv4/nexthop.c | 7 +
net/xfrm/xfrm_state.c | 3 +
sound/usb/mixer_quirks.c | 295 +++++++++++++++++++--
sound/usb/quirks.c | 24 +-
sound/usb/usbaudio.h | 4 +
tools/testing/selftests/mm/mremap_test.c | 2 +
tools/testing/selftests/net/fib_nexthops.sh | 12 +-
tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +
93 files changed, 1031 insertions(+), 363 deletions(-)
From: xu xin <xu.xin16(a)zte.com.cn>
This series aim to fix exec/fork inheritance and introduce ksm-utils tools
including ksm-set and ksm-get, you can see the detail in PATCH 1.
Problem
=======
In some extreme scenarios, however, this inheritance of MMF_VM_MERGE_ANY during
exec/fork can fail. For example, when the scanning frequency of ksmd is tuned
extremely high, a process carrying MMF_VM_MERGE_ANY may still fail to pass it to
the newly exec'd process. This happens because ksm_execve() is executed too early
in the do_execve flow (prematurely adding the new mm_struct to the ksm_mm_slot list).
As a result, before do_execve completes, ksmd may have already performed a scan and
found that this new mm_struct has no VM_MERGEABLE VMAs, thus clearing its
MMF_VM_MERGE_ANY flag. Consequently, when the new program executes, the flag
MMF_VM_MERGE_ANY inheritance fails!
Reproduce
========
Prepare ksm-utils in the prerequisite PATCH, and simply do as follows
echo 1 > /sys/kernel/mm/ksm/run;
echo 2000 > /sys/kernel/mm/ksm/pages_to_scan;
echo 0 > /sys/kernel/mm/ksm/sleep_millisecs;
ksm-set -s on [NEW_PROGRAM_BIN] &
ksm-get -a -e
you can see like this:
Pid Comm Merging_pages Ksm_zero_pages Ksm_profit Ksm_mergeable Ksm_merge_any
206 NEW_PROGRAM_BIN 7680 0 30965760 yes no
Note:
If the first time don't reproduce the issue, pkill NEW_PROGRAM_BIN and try run it
again. Usually, we can reproduce it in 5 times.
Root reason
===========
The commit d7597f59d1d33 ("mm: add new api to enable ksm per process") clear the
flag MMF_VM_MERGE_ANY when ksmd found no VM_MERGEABLE VMAs.
xu xin (2):
tools: add ksm-utils tools
mm/ksm: fix exec/fork inheritance support for prctl
mm/ksm.c | 8 +-
tools/mm/Makefile | 12 +-
tools/mm/ksm-utils/Makefile | 10 +
tools/mm/ksm-utils/ksm-get.c | 397 +++++++++++++++++++++++++++++++++++
tools/mm/ksm-utils/ksm-set.c | 144 +++++++++++++
5 files changed, 567 insertions(+), 4 deletions(-)
create mode 100644 tools/mm/ksm-utils/Makefile
create mode 100644 tools/mm/ksm-utils/ksm-get.c
create mode 100644 tools/mm/ksm-utils/ksm-set.c
--
2.25.1
In of_unittest_pci_node_verify(), when the add parameter is false,
device_find_any_child() obtains a reference to a child device. This
function implicitly calls get_device() to increment the device's
reference count before returning the pointer. However, the caller
fails to properly release this reference by calling put_device(),
leading to a device reference count leak. Add put_device() in the else
branch immediately after child_dev is no longer needed.
As the comment of device_find_any_child states: "NOTE: you will need
to drop the reference with put_device() after use".
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: 26409dd04589 ("of: unittest: Add pci_dt_testdrv pci driver")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v2:
- modified the put_device() location as suggestions.
---
drivers/of/unittest.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index e3503ec20f6c..388e9ec2cccf 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -4300,6 +4300,7 @@ static int of_unittest_pci_node_verify(struct pci_dev *pdev, bool add)
unittest(!np, "Child device tree node is not removed\n");
child_dev = device_find_any_child(&pdev->dev);
unittest(!child_dev, "Child device is not removed\n");
+ put_device(child_dev);
}
failed:
--
2.17.1
From: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
[ Upstream commit 73861970938ad1323eb02bbbc87f6fbd1e5bacca ]
The inode mode loaded from corrupted disk can be invalid. Do like what
commit 0a9e74051313 ("isofs: Verify inode mode when loading from disk")
does.
Reported-by: syzbot <syzbot+895c23f6917da440ed0d(a)syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d
Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Link: https://lore.kernel.org/ec982681-84b8-4624-94fa-8af15b77cbd2@I-love.SAKURA.…
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
## Backport Analysis: minixfs Inode Mode Validation
**RECOMMENDATION: YES**
This commit **MUST be backported** to stable kernel trees. This is a
critical security and stability fix.
---
### Evidence-Based Analysis
#### 1. **Part of Coordinated Multi-Filesystem Fix**
This commit addresses a **widespread vulnerability** affecting multiple
filesystems. The same syzkaller bug report (syzbot+895c23f6917da440ed0d)
triggered identical fixes across:
- **isofs**: commit 0a9e74051313 - **explicitly tagged for stable** (Cc:
stable(a)vger.kernel.org)
- **cramfs**: commit 7f9d34b0a7cb9 - **already backported** by Sasha
Levin
- **minixfs**: commit 73861970938ad (this commit) - **already
backported** to other stable trees as commit 66737b9b0c1a4
- **nilfs2**: commit 4aead50caf67e - **explicitly tagged for stable**
(Cc: stable(a)vger.kernel.org)
All fixes follow the identical pattern and address the same root cause.
#### 2. **Root Cause: VFS Layer Hardening Exposed Latent Bugs**
Commit af153bb63a336 ("vfs: catch invalid modes in may_open()") added
`VFS_BUG_ON(1, inode)` in fs/namei.c:3418 to catch invalid inode modes.
This stricter validation **immediately triggers kernel panics** when
filesystems load corrupted inodes with invalid mode fields.
**Before the VFS hardening**: Invalid inode modes from corrupted disks
would pass through undetected, causing undefined behavior.
**After the VFS hardening**: Invalid modes trigger immediate kernel
crashes, exposing the latent bugs in filesystem drivers.
#### 3. **Code Change Analysis (fs/minix/inode.c:481-497)**
**Before** (vulnerable code):
```c
} else if (S_ISLNK(inode->i_mode)) {
inode->i_op = &minix_symlink_inode_operations;
inode_nohighmem(inode);
inode->i_mapping->a_ops = &minix_aops;
} else
init_special_inode(inode, inode->i_mode, rdev); // Accepts ANY
invalid mode
```
**After** (fixed code):
```c
} else if (S_ISLNK(inode->i_mode)) {
inode->i_op = &minix_symlink_inode_operations;
inode_nohighmem(inode);
inode->i_mapping->a_ops = &minix_aops;
} else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) ||
S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
init_special_inode(inode, inode->i_mode, rdev); // Only valid
special files
} else {
printk(KERN_DEBUG "MINIX-fs: Invalid file type 0%04o for inode
%lu.\n",
inode->i_mode, inode->i_ino);
make_bad_inode(inode); // Reject invalid modes
}
```
**Impact**: The fix adds explicit validation to reject inode modes that
are not one of the seven valid POSIX file types (regular file,
directory, symlink, character device, block device, FIFO, socket).
Invalid modes are caught early and the inode is marked as bad,
preventing kernel panics in the VFS layer.
#### 4. **Security Impact: DoS Vulnerability (CVSS ~6.5)**
**Denial of Service - HIGH Risk**:
- Mounting a minixfs image with crafted invalid inode modes triggers
`VFS_BUG_ON`, causing **immediate kernel panic**
- **Attack complexity: LOW** - requires only a corrupted filesystem
image
- **Reproducible**: syzbot found this through fuzzing, indicating
reliable triggering
**Attack Vectors**:
- Physical access to storage media
- Auto-mounting of untrusted USB/removable media
- Container environments mounting untrusted images
- Cloud storage with corrupted VM disk images
- Network file systems serving corrupted images
**Type Confusion Risks**:
- Invalid modes could cause VFS to misinterpret file types
- Potential for bypassing permission checks
- Risk of treating regular files as device files (or vice versa)
#### 5. **Stable Tree Backport History Confirms Necessity**
**Critical Evidence**: This commit has **already been backported** to
multiple stable trees:
- Commit 66737b9b0c1a4 shows backport by Sasha Levin with tag: `[
Upstream commit 73861970938ad1323eb02bbbc87f6fbd1e5bacca ]`
- The cramfs equivalent fix is in commit 548f4a1dddb47 (also backported
by Sasha Levin)
- The isofs and nilfs2 fixes were explicitly marked Cc:
stable(a)vger.kernel.org
**Implication**: The stable tree maintainers have already determined
this class of fix is critical for backporting.
#### 6. **Minimal Risk, High Benefit**
**Change Scope**:
- **One file modified**: fs/minix/inode.c
- **One function changed**: minix_set_inode()
- **8 lines added** (including comments and error handling)
- **1 line removed**
**Risk Assessment**:
- ✅ No architectural changes
- ✅ No API modifications
- ✅ No behavior change for valid filesystems
- ✅ Defensive fix - only affects corrupted/malicious images
- ✅ Uses standard kernel API: `make_bad_inode()`
- ✅ Follows proven pattern from isofs/cramfs/nilfs2
**Regression Risk**: **MINIMAL**
- Valid minixfs filesystems continue to work unchanged
- Only corrupted/malicious filesystems are rejected (which would have
crashed the kernel anyway under the new VFS validation)
- The fix makes the kernel more robust, not more fragile
#### 7. **Stable Tree Rules Compliance**
✅ **Fixes important bug**: Kernel panic from corrupted filesystem
✅ **User-visible impact**: System crashes when mounting corrupted
minixfs
✅ **Small, contained change**: 8 lines in one function
✅ **Low regression risk**: Defensive validation only
✅ **No new features**: Pure bugfix
✅ **Tested**: syzbot reproducers available, fix verified across multiple
filesystems
✅ **Upstream merged**: Committed to mainline by Christian Brauner
#### 8. **Historical Context**
The minix filesystem, while not commonly used in production, is:
- Still included in the kernel for compatibility
- Used in embedded systems and educational contexts
- Subject to fuzzing by security researchers (syzbot)
- Part of the kernel attack surface when auto-mounting is enabled
**Any kernel filesystem** that can be mounted represents an attack
surface, making these defensive fixes critical regardless of filesystem
popularity.
---
### Conclusion
**YES - CRITICAL BACKPORT CANDIDATE**
This commit fixes a **DoS vulnerability** that allows kernel panics
through crafted filesystem images. It is:
1. Part of a coordinated fix across multiple filesystems
2. Already backported to other stable trees
3. Minimal risk with high security benefit
4. Follows stable tree rules perfectly
5. Addresses a syzbot-reported bug with reproducers
6. Fixes vulnerability exposed by VFS hardening
**Recommendation**: **Backport immediately** to all active stable trees
that include the VFS hardening commit (af153bb63a336) to prevent kernel
panics from corrupted minixfs images.
fs/minix/inode.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index df9d11479caf1..32db676127a9e 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -492,8 +492,14 @@ void minix_set_inode(struct inode *inode, dev_t rdev)
inode->i_op = &minix_symlink_inode_operations;
inode_nohighmem(inode);
inode->i_mapping->a_ops = &minix_aops;
- } else
+ } else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) ||
+ S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
init_special_inode(inode, inode->i_mode, rdev);
+ } else {
+ printk(KERN_DEBUG "MINIX-fs: Invalid file type 0%04o for inode %lu.\n",
+ inode->i_mode, inode->i_ino);
+ make_bad_inode(inode);
+ }
}
/*
--
2.51.0
Prevent USB runtime PM (autosuspend) for AX88772* in bind.
usbnet enables runtime PM (autosuspend) by default, so disabling it via
the usb_driver flag is ineffective. On AX88772B, autosuspend shows no
measurable power saving with current driver (no link partner, admin
up/down). The ~0.453 W -> ~0.248 W drop on v6.1 comes from phylib powering
the PHY off on admin-down, not from USB autosuspend.
The real hazard is that with runtime PM enabled, ndo_open() (under RTNL)
may synchronously trigger autoresume (usb_autopm_get_interface()) into
asix_resume() while the USB PM lock is held. Resume paths then invoke
phylink/phylib and MDIO, which also expect RTNL, leading to possible
deadlocks or PM lock vs MDIO wake issues.
To avoid this, keep the device runtime-PM active by taking a usage
reference in ax88772_bind() and dropping it in unbind(). A non-zero PM
usage count blocks runtime suspend regardless of userspace policy
(.../power/control - pm_runtime_allow/forbid), making this approach
robust against sysfs overrides.
System sleep/resume is unchanged.
Fixes: 4a2c7217cd5a ("net: usb: asix: ax88772: manage PHY PM from MAC")
Reported-by: Hubert Wiśniewski <hubert.wisniewski.25632(a)gmail.com>
Closes: https://lore.kernel.org/all/DCGHG5UJT9G3.2K1GHFZ3H87T0@gmail.com
Tested-by: Hubert Wiśniewski <hubert.wisniewski.25632(a)gmail.com>
Reported-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Closes: https://lore.kernel.org/all/b5ea8296-f981-445d-a09a-2f389d7f6fdd@samsung.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Oleksij Rempel <o.rempel(a)pengutronix.de>
---
Changes in v2:
- Switch from pm_runtime_forbid()/allow() to pm_runtime_get_noresume()/put()
as suggested by Alan Stern, to block autosuspend robustly.
- Reword commit message to clarify the actual deadlock condition
(autoresume under RTNL) as pointed out by Oliver Neukum.
- Keep explanation in commit message, shorten in-code comment.
Link to the measurement results:
https://lore.kernel.org/all/aMkPMa650kfKfmF4@pengutronix.de/
---
drivers/net/usb/asix_devices.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index 792ddda1ad49..5c939446515b 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -625,6 +625,21 @@ static void ax88772_suspend(struct usbnet *dev)
asix_read_medium_status(dev, 1));
}
+/* Notes on PM callbacks and locking context:
+ *
+ * - asix_suspend()/asix_resume() are invoked for both runtime PM and
+ * system-wide suspend/resume. For struct usb_driver the ->resume()
+ * callback does not receive pm_message_t, so the resume type cannot
+ * be distinguished here.
+ *
+ * - The MAC driver must hold RTNL when calling phylink interfaces such as
+ * phylink_suspend()/resume(). Those calls will also perform MDIO I/O.
+ *
+ * - Taking RTNL and doing MDIO from a runtime-PM resume callback (while
+ * the USB PM lock is held) is fragile. Since autosuspend brings no
+ * measurable power saving for this device with current driver version, it is
+ * disabled below.
+ */
static int asix_suspend(struct usb_interface *intf, pm_message_t message)
{
struct usbnet *dev = usb_get_intfdata(intf);
@@ -919,6 +934,13 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
if (ret)
goto initphy_err;
+ /* Keep this interface runtime-PM active by taking a usage ref.
+ * Prevents runtime suspend while bound and avoids resume paths
+ * that could deadlock (autoresume under RTNL while USB PM lock
+ * is held, phylink/MDIO wants RTNL).
+ */
+ pm_runtime_get_noresume(&intf->dev);
+
return 0;
initphy_err:
@@ -948,6 +970,8 @@ static void ax88772_unbind(struct usbnet *dev, struct usb_interface *intf)
phylink_destroy(priv->phylink);
ax88772_mdio_unregister(priv);
asix_rx_fixup_common_free(dev->driver_priv);
+ /* Drop the PM usage ref taken in bind() */
+ pm_runtime_put(&intf->dev);
}
static void ax88178_unbind(struct usbnet *dev, struct usb_interface *intf)
@@ -1600,6 +1624,10 @@ static struct usb_driver asix_driver = {
.resume = asix_resume,
.reset_resume = asix_resume,
.disconnect = usbnet_disconnect,
+ /* usbnet will force supports_autosuspend=1; we explicitly forbid RPM
+ * per-interface in bind to keep autosuspend disabled for this driver
+ * by using pm_runtime_forbid().
+ */
.supports_autosuspend = 1,
.disable_hub_initiated_lpm = 1,
};
--
2.47.3
Calling intotify_show_fdinfo() on fd watching an overlayfs inode, while
the overlayfs is being unmounted, can lead to dereferencing NULL ptr.
This issue was found by syzkaller.
Race Condition Diagram:
Thread 1 Thread 2
-------- --------
generic_shutdown_super()
shrink_dcache_for_umount
sb->s_root = NULL
|
| vfs_read()
| inotify_fdinfo()
| * inode get from mark *
| show_mark_fhandle(m, inode)
| exportfs_encode_fid(inode, ..)
| ovl_encode_fh(inode, ..)
| ovl_check_encode_origin(inode)
| * deref i_sb->s_root *
|
|
v
fsnotify_sb_delete(sb)
Which then leads to:
[ 32.133461] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 32.134438] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
[ 32.135032] CPU: 1 UID: 0 PID: 4468 Comm: systemd-coredum Not tainted 6.17.0-rc6 #22 PREEMPT(none)
<snip registers, unreliable trace>
[ 32.143353] Call Trace:
[ 32.143732] ovl_encode_fh+0xd5/0x170
[ 32.144031] exportfs_encode_inode_fh+0x12f/0x300
[ 32.144425] show_mark_fhandle+0xbe/0x1f0
[ 32.145805] inotify_fdinfo+0x226/0x2d0
[ 32.146442] inotify_show_fdinfo+0x1c5/0x350
[ 32.147168] seq_show+0x530/0x6f0
[ 32.147449] seq_read_iter+0x503/0x12a0
[ 32.148419] seq_read+0x31f/0x410
[ 32.150714] vfs_read+0x1f0/0x9e0
[ 32.152297] ksys_read+0x125/0x240
IOW ovl_check_encode_origin derefs inode->i_sb->s_root, after it was set
to NULL in the unmount path.
Fix it by protecting calling exportfs_encode_fid() from
show_mark_fhandle() with s_umount lock.
This form of fix was suggested by Amir in [1].
[1]: https://lore.kernel.org/all/CAOQ4uxhbDwhb+2Brs1UdkoF0a3NSdBAOQPNfEHjahrgoKJ…
Fixes: c45beebfde34 ("ovl: support encoding fid from inode with no alias")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: Miklos Szeredi <miklos(a)szeredi.hu>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: linux-unionfs(a)vger.kernel.org
Cc: linux-fsdevel(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
This issue was already discussed in [1] with no consensus reached on the
fix.
This form was suggested as a band-aid fix, without explicity yes/no
reaction. Hence reviving the discussion around the band-aid.
fs/notify/fdinfo.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
index 1161eabf11ee..9cc7eb863643 100644
--- a/fs/notify/fdinfo.c
+++ b/fs/notify/fdinfo.c
@@ -17,6 +17,7 @@
#include "fanotify/fanotify.h"
#include "fdinfo.h"
#include "fsnotify.h"
+#include "../internal.h"
#if defined(CONFIG_PROC_FS)
@@ -46,7 +47,12 @@ static void show_mark_fhandle(struct seq_file *m, struct inode *inode)
size = f->handle_bytes >> 2;
+ if (!super_trylock_shared(inode->i_sb))
+ return;
+
ret = exportfs_encode_fid(inode, (struct fid *)f->f_handle, &size);
+ up_read(&inode->i_sb->s_umount);
+
if ((ret == FILEID_INVALID) || (ret < 0))
return;
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
Calling intotify_show_fdinfo() on fd watching an overlayfs inode, while
the overlayfs is being unmounted, can lead to dereferencing NULL ptr.
This issue was found by syzkaller.
Race Condition Diagram:
Thread 1 Thread 2
-------- --------
generic_shutdown_super()
shrink_dcache_for_umount
sb->s_root = NULL
|
| vfs_read()
| inotify_fdinfo()
| * inode get from mark *
| show_mark_fhandle(m, inode)
| exportfs_encode_fid(inode, ..)
| ovl_encode_fh(inode, ..)
| ovl_check_encode_origin(inode)
| * deref i_sb->s_root *
|
|
v
fsnotify_sb_delete(sb)
Which then leads to:
[ 32.133461] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 32.134438] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
[ 32.135032] CPU: 1 UID: 0 PID: 4468 Comm: systemd-coredum Not tainted 6.17.0-rc6 #22 PREEMPT(none)
<snip registers, unreliable trace>
[ 32.143353] Call Trace:
[ 32.143732] ovl_encode_fh+0xd5/0x170
[ 32.144031] exportfs_encode_inode_fh+0x12f/0x300
[ 32.144425] show_mark_fhandle+0xbe/0x1f0
[ 32.145805] inotify_fdinfo+0x226/0x2d0
[ 32.146442] inotify_show_fdinfo+0x1c5/0x350
[ 32.147168] seq_show+0x530/0x6f0
[ 32.147449] seq_read_iter+0x503/0x12a0
[ 32.148419] seq_read+0x31f/0x410
[ 32.150714] vfs_read+0x1f0/0x9e0
[ 32.152297] ksys_read+0x125/0x240
IOW ovl_check_encode_origin derefs inode->i_sb->s_root, after it was set
to NULL in the unmount path.
Minimize the window of opportunity by adding explicit check.
Fixes: c45beebfde34 ("ovl: support encoding fid from inode with no alias")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: Miklos Szeredi <miklos(a)szeredi.hu>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: linux-unionfs(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
I'm happy to take suggestions for a better fix - I looked at taking
s_umount for reading, but it wasn't clear to me for how long would the
fdinfo path need to hold it. Hence the most primitive suggestion in this
v1.
I'm also not sure if ENOENT or EBUSY is better?.. or even something else?
fs/overlayfs/export.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 83f80fdb1567..424c73188e06 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -195,6 +195,8 @@ static int ovl_check_encode_origin(struct inode *inode)
if (!ovl_inode_lower(inode))
return 0;
+ if (!inode->i_sb->s_root)
+ return -ENOENT;
/*
* Root is never indexed, so if there's an upper layer, encode upper for
* root.
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
syzkaller discovered the following crash: (kernel BUG)
[ 44.607039] ------------[ cut here ]------------
[ 44.607422] kernel BUG at mm/userfaultfd.c:2067!
[ 44.608148] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.608814] CPU: 1 UID: 0 PID: 2475 Comm: reproducer Not tainted 6.16.0-rc6 #1 PREEMPT(none)
[ 44.609635] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.610695] RIP: 0010:userfaultfd_release_all+0x3a8/0x460
<snip other registers, drop unreliable trace>
[ 44.617726] Call Trace:
[ 44.617926] <TASK>
[ 44.619284] userfaultfd_release+0xef/0x1b0
[ 44.620976] __fput+0x3f9/0xb60
[ 44.621240] fput_close_sync+0x110/0x210
[ 44.622222] __x64_sys_close+0x8f/0x120
[ 44.622530] do_syscall_64+0x5b/0x2f0
[ 44.622840] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 44.623244] RIP: 0033:0x7f365bb3f227
Kernel panics because it detects UFFD inconsistency during
userfaultfd_release_all(). Specifically, a VMA which has a valid pointer
to vma->vm_userfaultfd_ctx, but no UFFD flags in vma->vm_flags.
The inconsistency is caused in ksm_madvise(): when user calls madvise()
with MADV_UNMEARGEABLE on a VMA that is registered for UFFD in MINOR
mode, it accidentally clears all flags stored in the upper 32 bits of
vma->vm_flags.
Assuming x86_64 kernel build, unsigned long is 64-bit and unsigned int
and int are 32-bit wide. This setup causes the following mishap during
the &= ~VM_MERGEABLE assignment.
VM_MERGEABLE is a 32-bit constant of type unsigned int, 0x8000'0000.
After ~ is applied, it becomes 0x7fff'ffff unsigned int, which is then
promoted to unsigned long before the & operation. This promotion fills
upper 32 bits with leading 0s, as we're doing unsigned conversion (and
even for a signed conversion, this wouldn't help as the leading bit is
0). & operation thus ends up AND-ing vm_flags with 0x0000'0000'7fff'ffff
instead of intended 0xffff'ffff'7fff'ffff and hence accidentally clears
the upper 32-bits of its value.
Fix it by changing `VM_MERGEABLE` constant to unsigned long. Modify all
other VM_* flags constants for consistency.
Note: other VM_* flags are not affected:
This only happens to the VM_MERGEABLE flag, as the other VM_* flags are
all constants of type int and after ~ operation, they end up with
leading 1 and are thus converted to unsigned long with leading 1s.
Note 2:
After commit 31defc3b01d9 ("userfaultfd: remove (VM_)BUG_ON()s"), this is
no longer a kernel BUG, but a WARNING at the same place:
[ 45.595973] WARNING: CPU: 1 PID: 2474 at mm/userfaultfd.c:2067
but the root-cause (flag-drop) remains the same.
Fixes: 7677f7fd8be76 ("userfaultfd: add minor fault registration mode")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Xu Xin <xu.xin16(a)zte.com.cn>
Cc: Chengming Zhou <chengming.zhou(a)linux.dev>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: linux-mm(a)kvack.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
v1 -> v2:
- fix by adding ul to flag constants instead of explicit cast.
- drop Mike Kravetz <mike.kravetz(a)oracle.com> from cc, as the mail
returned
v1:
https://lore.kernel.org/all/20250930063921.62354-1-acsjakub@amazon.de/
include/linux/mm.h | 72 +++++++++++++++++++++++-----------------------
1 file changed, 36 insertions(+), 36 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1ae97a0b8ec7..26a5c0f78b36 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -246,57 +246,57 @@ extern unsigned int kobjsize(const void *objp);
* vm_flags in vm_area_struct, see mm_types.h.
* When changing, update also include/trace/events/mmflags.h
*/
-#define VM_NONE 0x00000000
+#define VM_NONE 0x00000000ul
-#define VM_READ 0x00000001 /* currently active flags */
-#define VM_WRITE 0x00000002
-#define VM_EXEC 0x00000004
-#define VM_SHARED 0x00000008
+#define VM_READ 0x00000001ul /* currently active flags */
+#define VM_WRITE 0x00000002ul
+#define VM_EXEC 0x00000004ul
+#define VM_SHARED 0x00000008ul
/* mprotect() hardcodes VM_MAYREAD >> 4 == VM_READ, and so for r/w/x bits. */
-#define VM_MAYREAD 0x00000010 /* limits for mprotect() etc */
-#define VM_MAYWRITE 0x00000020
-#define VM_MAYEXEC 0x00000040
-#define VM_MAYSHARE 0x00000080
+#define VM_MAYREAD 0x00000010ul /* limits for mprotect() etc */
+#define VM_MAYWRITE 0x00000020ul
+#define VM_MAYEXEC 0x00000040ul
+#define VM_MAYSHARE 0x00000080ul
-#define VM_GROWSDOWN 0x00000100 /* general info on the segment */
+#define VM_GROWSDOWN 0x00000100ul /* general info on the segment */
#ifdef CONFIG_MMU
-#define VM_UFFD_MISSING 0x00000200 /* missing pages tracking */
+#define VM_UFFD_MISSING 0x00000200ul /* missing pages tracking */
#else /* CONFIG_MMU */
-#define VM_MAYOVERLAY 0x00000200 /* nommu: R/O MAP_PRIVATE mapping that might overlay a file mapping */
-#define VM_UFFD_MISSING 0
+#define VM_MAYOVERLAY 0x00000200ul /* nommu: R/O MAP_PRIVATE mapping that might overlay a file mapping */
+#define VM_UFFD_MISSING 0ul
#endif /* CONFIG_MMU */
-#define VM_PFNMAP 0x00000400 /* Page-ranges managed without "struct page", just pure PFN */
-#define VM_UFFD_WP 0x00001000 /* wrprotect pages tracking */
+#define VM_PFNMAP 0x00000400ul /* Page-ranges managed without "struct page", just pure PFN */
+#define VM_UFFD_WP 0x00001000ul /* wrprotect pages tracking */
-#define VM_LOCKED 0x00002000
-#define VM_IO 0x00004000 /* Memory mapped I/O or similar */
+#define VM_LOCKED 0x00002000ul
+#define VM_IO 0x00004000ul /* Memory mapped I/O or similar */
/* Used by sys_madvise() */
-#define VM_SEQ_READ 0x00008000 /* App will access data sequentially */
-#define VM_RAND_READ 0x00010000 /* App will not benefit from clustered reads */
-
-#define VM_DONTCOPY 0x00020000 /* Do not copy this vma on fork */
-#define VM_DONTEXPAND 0x00040000 /* Cannot expand with mremap() */
-#define VM_LOCKONFAULT 0x00080000 /* Lock the pages covered when they are faulted in */
-#define VM_ACCOUNT 0x00100000 /* Is a VM accounted object */
-#define VM_NORESERVE 0x00200000 /* should the VM suppress accounting */
-#define VM_HUGETLB 0x00400000 /* Huge TLB Page VM */
-#define VM_SYNC 0x00800000 /* Synchronous page faults */
-#define VM_ARCH_1 0x01000000 /* Architecture-specific flag */
-#define VM_WIPEONFORK 0x02000000 /* Wipe VMA contents in child. */
-#define VM_DONTDUMP 0x04000000 /* Do not include in the core dump */
+#define VM_SEQ_READ 0x00008000ul /* App will access data sequentially */
+#define VM_RAND_READ 0x00010000ul /* App will not benefit from clustered reads */
+
+#define VM_DONTCOPY 0x00020000ul /* Do not copy this vma on fork */
+#define VM_DONTEXPAND 0x00040000ul /* Cannot expand with mremap() */
+#define VM_LOCKONFAULT 0x00080000ul /* Lock the pages covered when they are faulted in */
+#define VM_ACCOUNT 0x00100000ul /* Is a VM accounted object */
+#define VM_NORESERVE 0x00200000ul /* should the VM suppress accounting */
+#define VM_HUGETLB 0x00400000ul /* Huge TLB Page VM */
+#define VM_SYNC 0x00800000ul /* Synchronous page faults */
+#define VM_ARCH_1 0x01000000ul /* Architecture-specific flag */
+#define VM_WIPEONFORK 0x02000000ul /* Wipe VMA contents in child. */
+#define VM_DONTDUMP 0x04000000ul /* Do not include in the core dump */
#ifdef CONFIG_MEM_SOFT_DIRTY
-# define VM_SOFTDIRTY 0x08000000 /* Not soft dirty clean area */
+# define VM_SOFTDIRTY 0x08000000ul /* Not soft dirty clean area */
#else
-# define VM_SOFTDIRTY 0
+# define VM_SOFTDIRTY 0ul
#endif
-#define VM_MIXEDMAP 0x10000000 /* Can contain "struct page" and pure PFN pages */
-#define VM_HUGEPAGE 0x20000000 /* MADV_HUGEPAGE marked this vma */
-#define VM_NOHUGEPAGE 0x40000000 /* MADV_NOHUGEPAGE marked this vma */
-#define VM_MERGEABLE 0x80000000 /* KSM may merge identical pages */
+#define VM_MIXEDMAP 0x10000000ul /* Can contain "struct page" and pure PFN pages */
+#define VM_HUGEPAGE 0x20000000ul /* MADV_HUGEPAGE marked this vma */
+#define VM_NOHUGEPAGE 0x40000000ul /* MADV_NOHUGEPAGE marked this vma */
+#define VM_MERGEABLE 0x80000000ul /* KSM may merge identical pages */
#ifdef CONFIG_ARCH_USES_HIGH_VMA_FLAGS
#define VM_HIGH_ARCH_BIT_0 32 /* bit only usable on 64-bit architectures */
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
Purchase this ALCHIMERA ebook, and a portion of the proceeds will help
children and families in need in Gaza, Ukraine, and the DRC: Your gesture
is a seed of love that will bear eternal fruit.
https://a.co/d/aDuLWHN
🆘Buy this book, save a life today. Every word you read, every page you
turn, becomes an act of compassion.
💔 Children cry from hunger in Gaza, Ukraine, and eastern DRC. Your
purchase becomes a hot meal, a blanket, a breath of hope for those who have
nothing left.
This isn't just an ebook. It's a mission. A portion of the funds is sent to
disaster-stricken families suffering from serious hunger.
🙏 Share this message. Give this book as a gift. Increase the number of
lifesaving actions. Every share is an answered prayer. Every purchase is a
helping hand.
May God bless you abundantly for your generous heart. This book is in your
hands. Someone's life too.
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
syzbot reported use-after-free bugs when accessing extent headers in
ext4_ext_insert_extent() and ext4_ext_correct_indexes(). These occur
when the extent path structure becomes invalid during operations.
The crashes show two patterns:
1. In ext4_ext_map_blocks(), the extent header can be corrupted after
ext4_find_extent() returns, particularly during concurrent writes
to the same file.
2. In ext4_ext_correct_indexes(), accessing path[depth] causes a
use-after-free, indicating the path structure itself is corrupted.
This is partially exposed by commit 665575cff098 ("filemap: move
prefaulting out of hot write path") which changed timing windows in
the write path, making these races more likely to occur.
Fix this by adding validation checks:
- In ext4_ext_map_blocks(): validate the extent header after getting
the path from ext4_find_extent()
- In ext4_ext_correct_indexes(): validate the path pointer before
dereferencing and check extent header magic
While these checks are defensive and don't address the root cause of
path corruption, they prevent kernel crashes from invalid memory access.
A more comprehensive fix to path lifetime management may be needed in
the future.
Reported-by: syzbot+9db318d6167044609878(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9db318d6167044609878
Fixes: 665575cff098 ("filemap: move prefaulting out of hot write path")
Cc: stable(a)vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
---
fs/ext4/extents.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index ca5499e9412b..903578d5f68d 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -1708,7 +1708,9 @@ static int ext4_ext_correct_indexes(handle_t *handle, struct inode *inode,
struct ext4_extent *ex;
__le32 border;
int k, err = 0;
-
+ if (!path || depth < 0 || depth > EXT4_MAX_EXTENT_DEPTH) {
+ return -EFSCORRUPTED;
+ }
eh = path[depth].p_hdr;
ex = path[depth].p_ext;
@@ -4200,6 +4202,7 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
unsigned int allocated_clusters = 0;
struct ext4_allocation_request ar;
ext4_lblk_t cluster_offset;
+ struct ext4_extent_header *eh;
ext_debug(inode, "blocks %u/%u requested\n", map->m_lblk, map->m_len);
trace_ext4_ext_map_blocks_enter(inode, map->m_lblk, map->m_len, flags);
@@ -4212,7 +4215,12 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
}
depth = ext_depth(inode);
-
+ eh = path[depth].p_hdr;
+ if (!eh || le16_to_cpu(eh->eh_magic) != EXT4_EXT_MAGIC) {
+ EXT4_ERROR_INODE(inode, "invalid extent header after find_extent");
+ err = -EFSCORRUPTED;
+ goto out;
+ }
/*
* consistent leaf must not be empty;
* this situation is possible, though, _during_ tree modification;
--
2.43.0
The patch titled
Subject: mm/damon/vaddr: do not repeat pte_offset_map_lock() until success
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-vaddr-do-not-repeat-pte_offset_map_lock-until-success.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/vaddr: do not repeat pte_offset_map_lock() until success
Date: Mon, 29 Sep 2025 17:44:09 -0700
DAMON's virtual address space operation set implementation (vaddr) calls
pte_offset_map_lock() inside the page table walk callback function. This
is for reading and writing page table accessed bits. If
pte_offset_map_lock() fails, it retries by returning the page table walk
callback function with ACTION_AGAIN.
pte_offset_map_lock() can continuously fail if the target is a pmd
migration entry, though. Hence it could cause an infinite page table walk
if the migration cannot be done until the page table walk is finished.
This indeed caused a soft lockup when CPU hotplugging and DAMON were
running in parallel.
Avoid the infinite loop by simply not retrying the page table walk. DAMON
is promising only a best-effort accuracy, so missing access to such pages
is no problem.
Link: https://lkml.kernel.org/r/20250930004410.55228-1-sj@kernel.org
Fixes: 7780d04046a2 ("mm/pagewalkers: ACTION_AGAIN if pte_offset_map_lock() fails")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Reported-by: Xinyu Zheng <zhengxinyu6(a)huawei.com>
Closes: https://lore.kernel.org/20250918030029.2652607-1-zhengxinyu6@huawei.com
Acked-by: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org> [6.5+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/vaddr.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
--- a/mm/damon/vaddr.c~mm-damon-vaddr-do-not-repeat-pte_offset_map_lock-until-success
+++ a/mm/damon/vaddr.c
@@ -328,10 +328,8 @@ static int damon_mkold_pmd_entry(pmd_t *
}
pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
- if (!pte) {
- walk->action = ACTION_AGAIN;
+ if (!pte)
return 0;
- }
if (!pte_present(ptep_get(pte)))
goto out;
damon_ptep_mkold(pte, walk->vma, addr);
@@ -481,10 +479,8 @@ regular_page:
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
- if (!pte) {
- walk->action = ACTION_AGAIN;
+ if (!pte)
return 0;
- }
ptent = ptep_get(pte);
if (!pte_present(ptent))
goto out;
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-vaddr-do-not-repeat-pte_offset_map_lock-until-success.patch
The patch titled
Subject: mm/rmap: fix soft-dirty and uffd-wp bit loss when remapping zero-filled mTHP subpage to shared zeropage
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-rmap-fix-soft-dirty-and-uffd-wp-bit-loss-when-remapping-zero-filled-mthp-subpage-to-shared-zeropage.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Lance Yang <lance.yang(a)linux.dev>
Subject: mm/rmap: fix soft-dirty and uffd-wp bit loss when remapping zero-filled mTHP subpage to shared zeropage
Date: Tue, 30 Sep 2025 16:10:40 +0800
When splitting an mTHP and replacing a zero-filled subpage with the shared
zeropage, try_to_map_unused_to_zeropage() currently drops several
important PTE bits.
For userspace tools like CRIU, which rely on the soft-dirty mechanism for
incremental snapshots, losing the soft-dirty bit means modified pages are
missed, leading to inconsistent memory state after restore.
As pointed out by David, the more critical uffd-wp bit is also dropped.
This breaks the userfaultfd write-protection mechanism, causing writes to
be silently missed by monitoring applications, which can lead to data
corruption.
Preserve both the soft-dirty and uffd-wp bits from the old PTE when
creating the new zeropage mapping to ensure they are correctly tracked.
Link: https://lkml.kernel.org/r/20250930081040.80926-1-lance.yang@linux.dev
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: Dev Jain <dev.jain(a)arm.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Dev Jain <dev.jain(a)arm.com>
Acked-by: Zi Yan <ziy(a)nvidia.com>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Byungchul Park <byungchul(a)sk.com>
Cc: Gregory Price <gourry(a)gourry.net>
Cc: "Huang, Ying" <ying.huang(a)linux.alibaba.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Joshua Hahn <joshua.hahnjy(a)gmail.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Mariano Pache <npache(a)redhat.com>
Cc: Mathew Brost <matthew.brost(a)intel.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rakie Kim <rakie.kim(a)sk.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Usama Arif <usamaarif642(a)gmail.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/migrate.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
--- a/mm/migrate.c~mm-rmap-fix-soft-dirty-and-uffd-wp-bit-loss-when-remapping-zero-filled-mthp-subpage-to-shared-zeropage
+++ a/mm/migrate.c
@@ -297,8 +297,7 @@ bool isolate_folio_to_list(struct folio
}
static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
- struct folio *folio,
- unsigned long idx)
+ struct folio *folio, pte_t old_pte, unsigned long idx)
{
struct page *page = folio_page(folio, idx);
pte_t newpte;
@@ -307,7 +306,7 @@ static bool try_to_map_unused_to_zeropag
return false;
VM_BUG_ON_PAGE(!PageAnon(page), page);
VM_BUG_ON_PAGE(!PageLocked(page), page);
- VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
+ VM_BUG_ON_PAGE(pte_present(old_pte), page);
if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
mm_forbids_zeropage(pvmw->vma->vm_mm))
@@ -323,6 +322,12 @@ static bool try_to_map_unused_to_zeropag
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
pvmw->vma->vm_page_prot));
+
+ if (pte_swp_soft_dirty(old_pte))
+ newpte = pte_mksoft_dirty(newpte);
+ if (pte_swp_uffd_wp(old_pte))
+ newpte = pte_mkuffd_wp(newpte);
+
set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
@@ -365,13 +370,13 @@ static bool remove_migration_pte(struct
continue;
}
#endif
+ old_pte = ptep_get(pvmw.pte);
if (rmap_walk_arg->map_unused_to_zeropage &&
- try_to_map_unused_to_zeropage(&pvmw, folio, idx))
+ try_to_map_unused_to_zeropage(&pvmw, folio, old_pte, idx))
continue;
folio_get(folio);
pte = mk_pte(new, READ_ONCE(vma->vm_page_prot));
- old_pte = ptep_get(pvmw.pte);
entry = pte_to_swp_entry(old_pte);
if (!is_migration_entry_young(entry))
_
Patches currently in -mm which might be from lance.yang(a)linux.dev are
hung_task-fix-warnings-caused-by-unaligned-lock-pointers.patch
mm-thp-fix-mte-tag-mismatch-when-replacing-zero-filled-subpages.patch
mm-rmap-fix-soft-dirty-and-uffd-wp-bit-loss-when-remapping-zero-filled-mthp-subpage-to-shared-zeropage.patch
mm-clean-up-is_guard_pte_marker.patch
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
syzbot reported multiple use-after-free bugs when accessing extent headers
in various ext4 functions. These occur because extent headers can be freed
by concurrent operations while other threads still hold pointers to them.
The issue is triggered by racing threads performing concurrent writes to
the same file. After commit 665575cff098 ("filemap: move prefaulting out
of hot write path"), the write path no longer prefaults pages in the hot
path, creating a wider race window where:
1. Thread A calls ext4_find_extent() and gets a path with extent headers
2. Thread A's write attempt fails, entering the slow path
3. During the gap, Thread B modifies the extent tree, freeing nodes
4. Thread A continues using the now-freed extent headers, causing UAF
Fix this by validating the extent header in ext4_find_extent() before
returning the path. This ensures all callers receive a valid extent path,
fixing the race at a single point rather than adding checks throughout
the codebase.
This addresses crashes in ext4_ext_insert_extent(), ext4_ext_binsearch(),
and potentially other locations that use extent paths.
Reported-by: syzbot+9db318d6167044609878(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9db318d6167044609878
Fixes: 665575cff098 ("filemap: move prefaulting out of hot write path")
Cc: stable(a)vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
---
fs/ext4/extents.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index ca5499e9412b..04ceae5b0a34 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4200,6 +4200,7 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
unsigned int allocated_clusters = 0;
struct ext4_allocation_request ar;
ext4_lblk_t cluster_offset;
+ struct ext4_extent_header *eh;
ext_debug(inode, "blocks %u/%u requested\n", map->m_lblk, map->m_len);
trace_ext4_ext_map_blocks_enter(inode, map->m_lblk, map->m_len, flags);
@@ -4212,7 +4213,12 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
}
depth = ext_depth(inode);
-
+ eh = path[depth].p_hdr;
+ if (!eh || le16_to_cpu(eh->eh_magic) != EXT4_EXT_MAGIC) {
+ EXT4_ERROR_INODE(inode, "invalid extent header after find_extent");
+ err = -EFSCORRUPTED;
+ goto out;
+ }
/*
* consistent leaf must not be empty;
* this situation is possible, though, _during_ tree modification;
--
2.43.0
Hi ,
Hope you're doing well. I wanted to check if my previous email reached you.
Do you need any additional information regarding my previous email? If so, I can provide it for your review.
Regards
Brenda
Marketing Manager
Prospect Tech Connect.,
Please reply with REMOVE if you don't wish to receive further emails
-----Original Message-----
From: Brenda Wilson
Subject: Executive Assistants and HNWI Directory to Enhance Your Marketing and Networking
Hi ,
Our verified database enables accurate outreach to Executive Assistants and high-net-worth individuals.
Executive Assistants (by region):
USA : 50,000 contacts
Europe : 15,000 contacts
Canada : 2,000 contacts
Middle East : 2,500 contacts
HNWI & Senior Decision-Makers (by region, incl. EAs):
USA : 500,000 contacts
Europe : 50,000 contacts
Canada : 10,000 contacts
UAE : 7,500 contacts
Titles we cover: Business Owners, Founders, Entrepreneurs, C-Level Executives, VPs, and Executive Assistants.
Data fields: Name, Job Title, Company, URL, Email, Revenue and more.
This list helps reach gatekeepers and decision-makers who oversee charter service partnerships.
Happy to share prices if that helps.
Eager to receive your feedback.
Regards
Brenda
Marketing Manager
Prospect Tech Connect.,
Please reply with REMOVE if you don't wish to receive further emails
Hello,o
after upgrading to 6.12.49 my wlan adapter stops working. It is
detected:
kernel: mt76x2u 4-2:1.0: ASIC revision: 76120044
kernel: mt76x2u 4-2:1.0: ROM patch build: 20141115060606a
kernel: usb 3-4: reset high-speed USB device number 2 using xhci_hcd
kernel: mt76x2u 4-2:1.0: Firmware Version: 0.0.00
kernel: mt76x2u 4-2:1.0: Build: 1
kernel: mt76x2u 4-2:1.0: Build Time: 201507311614____
but does nor work. The following 2 messages probably are relevant:
kernel: mt76x2u 4-2:1.0: MAC RX failed to stop
kernel: mt76x2u 4-2:1.0: MAC RX failed to stop
later I see a lot of
kernel: mt76x2u 4-2:1.0: error: mt76x02u_mcu_wait_resp failed with -110
I bisected it down to commit
9b28ef1e4cc07cdb35da257aa4358d0127168b68
usb: xhci: remove option to change a default ring's TRB cycle bit
9b28ef1e4cc07cdb35da257aa4358d0127168b68 is the first bad commit
commit 9b28ef1e4cc07cdb35da257aa4358d0127168b68
Author: Niklas Neronin <niklas.neronin(a)linux.intel.com>
Date: Wed Sep 17 08:39:07 2025 -0400
usb: xhci: remove option to change a default ring's TRB cycle bit
[ Upstream commit e1b0fa863907a61e86acc19ce2d0633941907c8e ]
The TRB cycle bit indicates TRB ownership by the Host Controller
(HC) or
Host Controller Driver (HCD). New rings are initialized with
'cycle_state'
equal to one, and all its TRBs' cycle bits are set to zero. When
handling
ring expansion, set the source ring cycle bits to the same value as
the
destination ring.
Move the cycle bit setting from xhci_segment_alloc() to
xhci_link_rings(),
and remove the 'cycle_state' argument from
xhci_initialize_ring_info().
The xhci_segment_alloc() function uses kzalloc_node() to allocate
segments,
ensuring that all TRB cycle bits are initialized to zero.
Signed-off-by: Niklas Neronin <niklas.neronin(a)linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link:
https://lore.kernel.org/r/20241106101459.775897-12-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Stable-dep-of: a5c98e8b1398 ("xhci: dbc: Fix full DbC transfer ring
after several reconnects")
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Regards,
--
Wolfgang Walter
Studierendenwerk München Oberbayern
Anstalt des öffentlichen Rechts
From: Pierre Gondois <pierre.gondois(a)arm.com>
commit 5944ce092b97caed5d86d961e963b883b5c44ee2 upstream.
commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection
in the CPU hotplug path")
adds a call to detect_cache_attributes() to populate the cacheinfo
before updating the siblings mask. detect_cache_attributes() allocates
memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
kernels, on secondary CPUs, this triggers a:
'BUG: sleeping function called from invalid context' [1]
as the code is executed with preemption and interrupts disabled.
The primary CPU was previously storing the cache information using
the now removed (struct cpu_topology).llc_id:
commit 5b8dc787ce4a ("arch_topology: Drop LLC identifier stash from
the CPU topology")
allocate_cache_info() tries to build the cacheinfo from the primary
CPU prior secondary CPUs boot, if the DT/ACPI description
contains cache information.
If allocate_cache_info() fails, then fallback to the current state
for the cacheinfo allocation. [1] will be triggered in such case.
When unplugging a CPU, the cacheinfo memory cannot be freed. If it
was, then the memory would be allocated early by the re-plugged
CPU and would trigger [1].
Note that populate_cache_leaves() might be called multiple times
due to populate_leaves being moved up. This is required since
detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu)
being allocated but not populated.
[1]:
| BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
| in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
| preempt_count: 1, expected: 0
| RCU nest depth: 1, expected: 1
| 3 locks held by swapper/111/0:
| #0: (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
| #1: (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
| #2: (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
| irq event stamp: 0
| hardirqs last enabled at (0): 0x0
| hardirqs last disabled at (0): copy_process+0x5dc/0x1ab8
| softirqs last enabled at (0): copy_process+0x5dc/0x1ab8
| softirqs last disabled at (0): 0x0
| Preemption disabled at:
| migrate_enable+0x30/0x130
| CPU: 111 PID: 0 Comm: swapper/111 Tainted: G W 6.0.0-rc4-rt6-[...]
| Call trace:
| __kmalloc+0xbc/0x1e8
| detect_cache_attributes+0x2d4/0x5f0
| update_siblings_masks+0x30/0x368
| store_cpu_topology+0x78/0xb8
| secondary_start_kernel+0xd0/0x198
| __secondary_switched+0xb0/0xb4
Signed-off-by: Pierre Gondois <pierre.gondois(a)arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla(a)arm.com>
Acked-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Link: https://lore.kernel.org/r/20230104183033.755668-7-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla(a)arm.com>
Cc: <stable(a)vger.kernel.org> # 6.1.x: c3719bd:cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation
Cc: <stable(a)vger.kernel.org> # 6.1.x: 8844c3d:cacheinfo: Return error code in init_of_cache_level(
Cc: <stable(a)vger.kernel.org> # 6.1.x: de0df44:cacheinfo: Check 'cache-unified' property to count cache leaves
Cc: <stable(a)vger.kernel.org> # 6.1.x: fa4d566:ACPI: PPTT: Remove acpi_find_cache_levels()
Cc: <stable(a)vger.kernel.org> # 6.1.x: bd50036:ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info(
Cc: <stable(a)vger.kernel.org> # 6.1.x
Signed-off-by: Wen Yang <wen.yang(a)linux.dev>
---
arch/riscv/kernel/cacheinfo.c | 5 ---
drivers/base/arch_topology.c | 12 +++++-
drivers/base/cacheinfo.c | 71 ++++++++++++++++++++++++++---------
include/linux/cacheinfo.h | 1 +
4 files changed, 65 insertions(+), 24 deletions(-)
diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
index 440a3df5944c..3a13113f1b29 100644
--- a/arch/riscv/kernel/cacheinfo.c
+++ b/arch/riscv/kernel/cacheinfo.c
@@ -113,11 +113,6 @@ static void fill_cacheinfo(struct cacheinfo **this_leaf,
}
}
-int init_cache_level(unsigned int cpu)
-{
- return init_of_cache_level(cpu);
-}
-
int populate_cache_leaves(unsigned int cpu)
{
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index e7d6e6657ffa..b1c1dd38ab01 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -736,7 +736,7 @@ void update_siblings_masks(unsigned int cpuid)
ret = detect_cache_attributes(cpuid);
if (ret && ret != -ENOENT)
- pr_info("Early cacheinfo failed, ret = %d\n", ret);
+ pr_info("Early cacheinfo allocation failed, ret = %d\n", ret);
/* update core and thread sibling masks */
for_each_online_cpu(cpu) {
@@ -825,7 +825,7 @@ __weak int __init parse_acpi_topology(void)
#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
void __init init_cpu_topology(void)
{
- int ret;
+ int cpu, ret;
reset_cpu_topology();
ret = parse_acpi_topology();
@@ -840,6 +840,14 @@ void __init init_cpu_topology(void)
reset_cpu_topology();
return;
}
+
+ for_each_possible_cpu(cpu) {
+ ret = fetch_cache_info(cpu);
+ if (ret) {
+ pr_err("Early cacheinfo failed, ret = %d\n", ret);
+ break;
+ }
+ }
}
void store_cpu_topology(unsigned int cpuid)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index ab99b0f0d010..cd943d06d074 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -412,10 +412,6 @@ static void free_cache_attributes(unsigned int cpu)
return;
cache_shared_cpu_map_remove(cpu);
-
- kfree(per_cpu_cacheinfo(cpu));
- per_cpu_cacheinfo(cpu) = NULL;
- cache_leaves(cpu) = 0;
}
int __weak init_cache_level(unsigned int cpu)
@@ -428,29 +424,71 @@ int __weak populate_cache_leaves(unsigned int cpu)
return -ENOENT;
}
+static inline
+int allocate_cache_info(int cpu)
+{
+ per_cpu_cacheinfo(cpu) = kcalloc(cache_leaves(cpu),
+ sizeof(struct cacheinfo), GFP_ATOMIC);
+ if (!per_cpu_cacheinfo(cpu)) {
+ cache_leaves(cpu) = 0;
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+int fetch_cache_info(unsigned int cpu)
+{
+ struct cpu_cacheinfo *this_cpu_ci;
+ unsigned int levels, split_levels;
+ int ret;
+
+ if (acpi_disabled) {
+ ret = init_of_cache_level(cpu);
+ if (ret < 0)
+ return ret;
+ } else {
+ ret = acpi_get_cache_info(cpu, &levels, &split_levels);
+ if (ret < 0)
+ return ret;
+
+ this_cpu_ci = get_cpu_cacheinfo(cpu);
+ this_cpu_ci->num_levels = levels;
+ /*
+ * This assumes that:
+ * - there cannot be any split caches (data/instruction)
+ * above a unified cache
+ * - data/instruction caches come by pair
+ */
+ this_cpu_ci->num_leaves = levels + split_levels;
+ }
+ if (!cache_leaves(cpu))
+ return -ENOENT;
+
+ return allocate_cache_info(cpu);
+}
+
int detect_cache_attributes(unsigned int cpu)
{
int ret;
- /* Since early detection of the cacheinfo is allowed via this
- * function and this also gets called as CPU hotplug callbacks via
- * cacheinfo_cpu_online, the initialisation can be skipped and only
- * CPU maps can be updated as the CPU online status would be update
- * if called via cacheinfo_cpu_online path.
+ /* Since early initialization/allocation of the cacheinfo is allowed
+ * via fetch_cache_info() and this also gets called as CPU hotplug
+ * callbacks via cacheinfo_cpu_online, the init/alloc can be skipped
+ * as it will happen only once (the cacheinfo memory is never freed).
+ * Just populate the cacheinfo.
*/
if (per_cpu_cacheinfo(cpu))
- goto update_cpu_map;
+ goto populate_leaves;
if (init_cache_level(cpu) || !cache_leaves(cpu))
return -ENOENT;
- per_cpu_cacheinfo(cpu) = kcalloc(cache_leaves(cpu),
- sizeof(struct cacheinfo), GFP_ATOMIC);
- if (per_cpu_cacheinfo(cpu) == NULL) {
- cache_leaves(cpu) = 0;
- return -ENOMEM;
- }
+ ret = allocate_cache_info(cpu);
+ if (ret)
+ return ret;
+populate_leaves:
/*
* populate_cache_leaves() may completely setup the cache leaves and
* shared_cpu_map or it may leave it partially setup.
@@ -459,7 +497,6 @@ int detect_cache_attributes(unsigned int cpu)
if (ret)
goto free_ci;
-update_cpu_map:
/*
* For systems using DT for cache hierarchy, fw_token
* and shared_cpu_map will be set up here only if they are
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 00d8e7f9d1c6..dfef57077cd0 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -85,6 +85,7 @@ int populate_cache_leaves(unsigned int cpu);
int cache_setup_acpi(unsigned int cpu);
bool last_level_cache_is_valid(unsigned int cpu);
bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+int fetch_cache_info(unsigned int cpu);
int detect_cache_attributes(unsigned int cpu);
#ifndef CONFIG_ACPI_PPTT
/*
--
2.25.1
This issue was found by Runcheng Lu when develop HSCanT USB to CAN FD
converter[1]. The original developers may have only 3 interfaces device to
test so they write 3 here and wait for future change.
During the HSCanT development, we actually used 4 interfaces, so the
limitation of 3 is not enough now. But just increase one is not
future-proofed. Since the channel index type in gs_host_frame is u8, just
make canch[] become a flexible array with a u8 index, so it naturally
constraint by U8_MAX and avoid statically allocate 256 pointer for
every gs_usb device.
[1]: https://github.com/cherry-embedded/HSCanT-hardware
Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices")
Reported-by: Runcheng Lu <runcheng.lu(a)hpmicro.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Vincent Mailhol <mailhol(a)kernel.org>
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v5:
- Reword commit message to match the code better.
- Link to v4: https://lore.kernel.org/r/20250930-gs-usb-max-if-v4-1-8e163eb583da@coelacan…
Changes in v4:
- Remove redudant typeof().
- Fix type: inteface -> interface.
- Link to v3: https://lore.kernel.org/r/20250930-gs-usb-max-if-v3-1-21d97d7f1c34@coelacan…
Changes in v3:
- Cc stable should in patch instead of cover letter.
- Link to v2: https://lore.kernel.org/r/20250930-gs-usb-max-if-v2-1-2cf9a44e6861@coelacan…
Changes in v2:
- Use flexible array member instead of fixed array.
- Link to v1: https://lore.kernel.org/r/20250929-gs-usb-max-if-v1-1-e41b5c09133a@coelacan…
---
drivers/net/can/usb/gs_usb.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index c9482d6e947b0c7b033dc4f0c35f5b111e1bfd92..9fb4cbbd6d6dc88f433020eb0417ea53cd0c4d5f 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -289,11 +289,6 @@ struct gs_host_frame {
#define GS_MAX_RX_URBS 30
#define GS_NAPI_WEIGHT 32
-/* Maximum number of interfaces the driver supports per device.
- * Current hardware only supports 3 interfaces. The future may vary.
- */
-#define GS_MAX_INTF 3
-
struct gs_tx_context {
struct gs_can *dev;
unsigned int echo_id;
@@ -324,7 +319,6 @@ struct gs_can {
/* usb interface struct */
struct gs_usb {
- struct gs_can *canch[GS_MAX_INTF];
struct usb_anchor rx_submitted;
struct usb_device *udev;
@@ -336,9 +330,11 @@ struct gs_usb {
unsigned int hf_size_rx;
u8 active_channels;
+ u8 channel_cnt;
unsigned int pipe_in;
unsigned int pipe_out;
+ struct gs_can *canch[] __counted_by(channel_cnt);
};
/* 'allocate' a tx context.
@@ -599,7 +595,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
}
/* device reports out of range channel id */
- if (hf->channel >= GS_MAX_INTF)
+ if (hf->channel >= parent->channel_cnt)
goto device_detach;
dev = parent->canch[hf->channel];
@@ -699,7 +695,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
/* USB failure take down all interfaces */
if (rc == -ENODEV) {
device_detach:
- for (rc = 0; rc < GS_MAX_INTF; rc++) {
+ for (rc = 0; rc < parent->channel_cnt; rc++) {
if (parent->canch[rc])
netif_device_detach(parent->canch[rc]->netdev);
}
@@ -1460,17 +1456,19 @@ static int gs_usb_probe(struct usb_interface *intf,
icount = dconf.icount + 1;
dev_info(&intf->dev, "Configuring for %u interfaces\n", icount);
- if (icount > GS_MAX_INTF) {
+ if (icount > type_max(parent->channel_cnt)) {
dev_err(&intf->dev,
"Driver cannot handle more that %u CAN interfaces\n",
- GS_MAX_INTF);
+ type_max(parent->channel_cnt));
return -EINVAL;
}
- parent = kzalloc(sizeof(*parent), GFP_KERNEL);
+ parent = kzalloc(struct_size(parent, canch, icount), GFP_KERNEL);
if (!parent)
return -ENOMEM;
+ parent->channel_cnt = icount;
+
init_usb_anchor(&parent->rx_submitted);
usb_set_intfdata(intf, parent);
@@ -1531,7 +1529,7 @@ static void gs_usb_disconnect(struct usb_interface *intf)
return;
}
- for (i = 0; i < GS_MAX_INTF; i++)
+ for (i = 0; i < parent->channel_cnt; i++)
if (parent->canch[i])
gs_destroy_candev(parent->canch[i]);
---
base-commit: e5f0a698b34ed76002dc5cff3804a61c80233a7a
change-id: 20250929-gs-usb-max-if-a304c83243e5
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
The gs_usb driver supports USB devices with more than 1 CAN channel. In
old kernel before 3.15, it uses net_device->dev_id to distinguish
different channel in userspace, which was done in commit
acff76fa45b4 ("can: gs_usb: gs_make_candev(): set netdev->dev_id").
But since 3.15, the correct way is populating net_device->dev_port. And
according to documentation, if network device support multiple interface,
lack of net_device->dev_port SHALL be treated as a bug.
Fixes: acff76fa45b4 ("can: gs_usb: gs_make_candev(): set netdev->dev_id")
Cc: stable(a)vger.kernel.org
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
drivers/net/can/usb/gs_usb.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index c9482d6e947b0c7b033dc4f0c35f5b111e1bfd92..7ee68b47b569a142ffed3981edcaa9a1943ef0c2 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -1249,6 +1249,7 @@ static struct gs_can *gs_make_candev(unsigned int channel,
netdev->flags |= IFF_ECHO; /* we support full roundtrip echo */
netdev->dev_id = channel;
+ netdev->dev_port = channel;
/* dev setup */
strcpy(dev->bt_const.name, KBUILD_MODNAME);
---
base-commit: 30d4efb2f5a515a60fe6b0ca85362cbebea21e2f
change-id: 20250930-gs-usb-populate-net_device-dev_port-941f2d1c3889
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
This series backports 13 patches to update minmax.h in the 6.1.y branch,
aligning it with v6.17-rc7.
The ultimate goal is to synchronize all longterm branches so that they
include the full set of minmax.h changes (6.12.y was already aligned and
6.6.y is in progress).
The key motivation is to bring in commit d03eba99f5bf ("minmax: allow
min()/max()/clamp() if the arguments have the same signedness"), which
is missing in older kernels.
In mainline, this change enables min()/max()/clamp() to accept mixed
argument types, provided both have the same signedness. Without it,
backported patches that use these forms may trigger compiler warnings,
which escalate to build failures when -Werror is enabled.
Changes between v1 and v2:
- v1 included 19 patches:
https://lore.kernel.org/stable/20250924202320.32333-1-farbere@amazon.com/
- First 6 were pushed to the stable-tree.
- 7th cauded amd driver's build to fail.
- This change fixes it.
- Modified files:
drivers/gpu/drm/amd/amdgpu/amdgpu.h
drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
David Laight (7):
minmax.h: add whitespace around operators and after commas
minmax.h: update some comments
minmax.h: reduce the #define expansion of min(), max() and clamp()
minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
minmax.h: move all the clamp() definitions after the min/max() ones
minmax.h: simplify the variants of clamp()
minmax.h: remove some #defines that are only expanded once
Linus Torvalds (6):
minmax: make generic MIN() and MAX() macros available everywhere
minmax: add a few more MIN_T/MAX_T users
minmax: simplify min()/max()/clamp() implementation
minmax: don't use max() in situations that want a C constant
expression
minmax: improve macro expansion and type checking
minmax: fix up min3() and max3() too
arch/um/drivers/mconsole_user.c | 2 +
arch/x86/mm/pgtable.c | 2 +-
drivers/edac/sb_edac.c | 4 +-
drivers/edac/skx_common.h | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 +
.../drm/amd/display/modules/hdcp/hdcp_ddc.c | 2 +
.../drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 14 +-
.../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +
.../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 3 +
.../drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 3 +
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 +-
drivers/gpu/drm/drm_color_mgmt.c | 2 +-
drivers/gpu/drm/radeon/evergreen_cs.c | 2 +
drivers/hwmon/adt7475.c | 24 +-
drivers/input/touchscreen/cyttsp4_core.c | 2 +-
drivers/irqchip/irq-sun6i-r.c | 2 +-
drivers/md/dm-integrity.c | 2 +-
drivers/media/dvb-frontends/stv0367_priv.h | 3 +
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
drivers/net/fjes/fjes_main.c | 4 +-
drivers/nfc/pn544/i2c.c | 2 -
drivers/platform/x86/sony-laptop.c | 1 -
drivers/scsi/isci/init.c | 6 +-
.../pci/hive_isp_css_include/math_support.h | 5 -
fs/btrfs/tree-checker.c | 2 +-
include/linux/compiler.h | 9 +
include/linux/minmax.h | 220 ++++++++++--------
kernel/trace/preemptirq_delay_test.c | 2 -
lib/btree.c | 1 -
lib/decompress_unlzma.c | 2 +
lib/vsprintf.c | 2 +-
mm/zsmalloc.c | 1 -
net/ipv4/proc.c | 2 +-
net/ipv6/proc.c | 2 +-
tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +
tools/testing/selftests/vm/mremap_test.c | 2 +
36 files changed, 199 insertions(+), 142 deletions(-)
--
2.47.3
This series backports 15 patches to update minmax.h in the 6.6.y branch,
aligning it with v6.17-rc7.
The ultimate goal is to synchronize all longterm branches so that they
include the full set of minmax.h changes.
The key motivation is to bring in commit d03eba99f5bf ("minmax: allow
min()/max()/clamp() if the arguments have the same signedness"), which
is missing in older kernels.
In mainline, this change enables min()/max()/clamp() to accept mixed
argument types, provided both have the same signedness. Without it,
backported patches that use these forms may trigger compiler warnings,
which escalate to build failures when -Werror is enabled.
Changes between v1 and v2:
- v1 included 15 patches:
https://lore.kernel.org/stable/20250922103241.16213-1-farbere@amazon.com/T/…
- First 3 were pushed to the stable-tree.
- 4th cauded amd driver's build to fail.
- This change fixes it.
- Modified files:
drivers/gpu/drm/amd/amdgpu/amdgpu.h
drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
David Laight (7):
minmax.h: add whitespace around operators and after commas
minmax.h: update some comments
minmax.h: reduce the #define expansion of min(), max() and clamp()
minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
minmax.h: move all the clamp() definitions after the min/max() ones
minmax.h: simplify the variants of clamp()
minmax.h: remove some #defines that are only expanded once
Linus Torvalds (5):
minmax: make generic MIN() and MAX() macros available everywhere
minmax: simplify min()/max()/clamp() implementation
minmax: don't use max() in situations that want a C constant
expression
minmax: improve macro expansion and type checking
minmax: fix up min3() and max3() too
arch/um/drivers/mconsole_user.c | 2 +
drivers/edac/skx_common.h | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 +
.../drm/amd/display/modules/hdcp/hdcp_ddc.c | 2 +
.../drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 14 +-
.../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +
.../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 3 +
.../drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 3 +
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 +-
drivers/gpu/drm/radeon/evergreen_cs.c | 2 +
drivers/hwmon/adt7475.c | 24 +-
drivers/input/touchscreen/cyttsp4_core.c | 2 +-
drivers/irqchip/irq-sun6i-r.c | 2 +-
drivers/media/dvb-frontends/stv0367_priv.h | 3 +
.../net/can/usb/etas_es58x/es58x_devlink.c | 2 +-
drivers/net/fjes/fjes_main.c | 4 +-
drivers/nfc/pn544/i2c.c | 2 -
drivers/platform/x86/sony-laptop.c | 1 -
drivers/scsi/isci/init.c | 6 +-
.../pci/hive_isp_css_include/math_support.h | 5 -
fs/btrfs/tree-checker.c | 2 +-
include/linux/compiler.h | 9 +
include/linux/minmax.h | 220 ++++++++++--------
kernel/trace/preemptirq_delay_test.c | 2 -
lib/btree.c | 1 -
lib/decompress_unlzma.c | 2 +
lib/vsprintf.c | 2 +-
mm/zsmalloc.c | 2 -
tools/testing/selftests/mm/mremap_test.c | 2 +
tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +
30 files changed, 192 insertions(+), 136 deletions(-)
--
2.47.3
syzkaller discovered the following crash: (kernel BUG)
[ 44.607039] ------------[ cut here ]------------
[ 44.607422] kernel BUG at mm/userfaultfd.c:2067!
[ 44.608148] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.608814] CPU: 1 UID: 0 PID: 2475 Comm: reproducer Not tainted 6.16.0-rc6 #1 PREEMPT(none)
[ 44.609635] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.610695] RIP: 0010:userfaultfd_release_all+0x3a8/0x460
<snip other registers, drop unreliable trace>
[ 44.617726] Call Trace:
[ 44.617926] <TASK>
[ 44.619284] userfaultfd_release+0xef/0x1b0
[ 44.620976] __fput+0x3f9/0xb60
[ 44.621240] fput_close_sync+0x110/0x210
[ 44.622222] __x64_sys_close+0x8f/0x120
[ 44.622530] do_syscall_64+0x5b/0x2f0
[ 44.622840] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 44.623244] RIP: 0033:0x7f365bb3f227
Kernel panics because it detects UFFD inconsistency during
userfaultfd_release_all(). Specifically, a VMA which has a valid pointer
to vma->vm_userfaultfd_ctx, but no UFFD flags in vma->vm_flags.
The inconsistency is caused in ksm_madvise(): when user calls madvise()
with MADV_UNMEARGEABLE on a VMA that is registered for UFFD in MINOR
mode, it accidentally clears all flags stored in the upper 32 bits of
vma->vm_flags.
Assuming x86_64 kernel build, unsigned long is 64-bit and unsigned int
and int are 32-bit wide. This setup causes the following mishap during
the &= ~VM_MERGEABLE assignment.
VM_MERGEABLE is a 32-bit constant of type unsigned int, 0x8000'0000.
After ~ is applied, it becomes 0x7fff'ffff unsigned int, which is then
promoted to unsigned long before the & operation. This promotion fills
upper 32 bits with leading 0s, as we're doing unsigned conversion (and
even for a signed conversion, this wouldn't help as the leading bit is
0). & operation thus ends up AND-ing vm_flags with 0x0000'0000'7fff'ffff
instead of intended 0xffff'ffff'7fff'ffff and hence accidentally clears
the upper 32-bits of its value.
Fix it by casting `VM_MERGEABLE` constant to unsigned long to preserve
the upper 32 bits, in case it's needed.
Note: other VM_* flags are not affected:
This only happens to the VM_MERGEABLE flag, as the other VM_* flags are
all constants of type int and after ~ operation, they end up with
leading 1 and are thus converted to unsigned long with leading 1s.
Note 2:
After commit 31defc3b01d9 ("userfaultfd: remove (VM_)BUG_ON()s"), this is
no longer a kernel BUG, but a WARNING at the same place:
[ 45.595973] WARNING: CPU: 1 PID: 2474 at mm/userfaultfd.c:2067
but the root-cause (flag-drop) remains the same.
Fixes: 7677f7fd8be76 ("userfaultfd: add minor fault registration mode")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Xu Xin <xu.xin16(a)zte.com.cn>
Cc: Chengming Zhou <chengming.zhou(a)linux.dev>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: linux-mm(a)kvack.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
I looked around the kernel and found one more flag that might be
causing similar issues: "IORESOURCE_BUSY" - as its inverted version is
bit-anded to unsigned long fields. However, it seems those fields don't
actually use any bits from upper 32-bits as flags (yet?).
I also considered changing the constant definition by adding ULL, but am
not sure where else that could blow up, plus it would likely call to
define all the related constants as ULL for consistency. If you'd prefer
that fix, let me know.
mm/ksm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/ksm.c b/mm/ksm.c
index 160787bb121c..c24137a1eeb7 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2871,7 +2871,7 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
return err;
}
- *vm_flags &= ~VM_MERGEABLE;
+ *vm_flags &= ~((unsigned long) VM_MERGEABLE);
break;
}
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
This issue was found by Runcheng Lu when develop HSCanT USB to CAN FD
converter[1]. The original developers may have only 3 interfaces device to
test so they write 3 here and wait for future change.
During the HSCanT development, we actually used 4 interfaces, so the
limitation of 3 is not enough now. But just increase one is not
future-proofed. Since the channel type in gs_host_frame is u8, just
increase interface number limit to max size of u8 safely.
[1]: https://github.com/cherry-embedded/HSCanT-hardware
Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices")
Reported-by: Runcheng Lu <runcheng.lu(a)hpmicro.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Vincent Mailhol <mailhol(a)kernel.org>
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v4:
- Remove redudant typeof().
- Fix type: inteface -> interface.
- Link to v3: https://lore.kernel.org/r/20250930-gs-usb-max-if-v3-1-21d97d7f1c34@coelacan…
Changes in v3:
- Cc stable should in patch instead of cover letter.
- Link to v2: https://lore.kernel.org/r/20250930-gs-usb-max-if-v2-1-2cf9a44e6861@coelacan…
Changes in v2:
- Use flexible array member instead of fixed array.
- Link to v1: https://lore.kernel.org/r/20250929-gs-usb-max-if-v1-1-e41b5c09133a@coelacan…
---
drivers/net/can/usb/gs_usb.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index c9482d6e947b0c7b033dc4f0c35f5b111e1bfd92..9fb4cbbd6d6dc88f433020eb0417ea53cd0c4d5f 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -289,11 +289,6 @@ struct gs_host_frame {
#define GS_MAX_RX_URBS 30
#define GS_NAPI_WEIGHT 32
-/* Maximum number of interfaces the driver supports per device.
- * Current hardware only supports 3 interfaces. The future may vary.
- */
-#define GS_MAX_INTF 3
-
struct gs_tx_context {
struct gs_can *dev;
unsigned int echo_id;
@@ -324,7 +319,6 @@ struct gs_can {
/* usb interface struct */
struct gs_usb {
- struct gs_can *canch[GS_MAX_INTF];
struct usb_anchor rx_submitted;
struct usb_device *udev;
@@ -336,9 +330,11 @@ struct gs_usb {
unsigned int hf_size_rx;
u8 active_channels;
+ u8 channel_cnt;
unsigned int pipe_in;
unsigned int pipe_out;
+ struct gs_can *canch[] __counted_by(channel_cnt);
};
/* 'allocate' a tx context.
@@ -599,7 +595,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
}
/* device reports out of range channel id */
- if (hf->channel >= GS_MAX_INTF)
+ if (hf->channel >= parent->channel_cnt)
goto device_detach;
dev = parent->canch[hf->channel];
@@ -699,7 +695,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
/* USB failure take down all interfaces */
if (rc == -ENODEV) {
device_detach:
- for (rc = 0; rc < GS_MAX_INTF; rc++) {
+ for (rc = 0; rc < parent->channel_cnt; rc++) {
if (parent->canch[rc])
netif_device_detach(parent->canch[rc]->netdev);
}
@@ -1460,17 +1456,19 @@ static int gs_usb_probe(struct usb_interface *intf,
icount = dconf.icount + 1;
dev_info(&intf->dev, "Configuring for %u interfaces\n", icount);
- if (icount > GS_MAX_INTF) {
+ if (icount > type_max(parent->channel_cnt)) {
dev_err(&intf->dev,
"Driver cannot handle more that %u CAN interfaces\n",
- GS_MAX_INTF);
+ type_max(parent->channel_cnt));
return -EINVAL;
}
- parent = kzalloc(sizeof(*parent), GFP_KERNEL);
+ parent = kzalloc(struct_size(parent, canch, icount), GFP_KERNEL);
if (!parent)
return -ENOMEM;
+ parent->channel_cnt = icount;
+
init_usb_anchor(&parent->rx_submitted);
usb_set_intfdata(intf, parent);
@@ -1531,7 +1529,7 @@ static void gs_usb_disconnect(struct usb_interface *intf)
return;
}
- for (i = 0; i < GS_MAX_INTF; i++)
+ for (i = 0; i < parent->channel_cnt; i++)
if (parent->canch[i])
gs_destroy_candev(parent->canch[i]);
---
base-commit: e5f0a698b34ed76002dc5cff3804a61c80233a7a
change-id: 20250929-gs-usb-max-if-a304c83243e5
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
From: Lance Yang <lance.yang(a)linux.dev>
When splitting an mTHP and replacing a zero-filled subpage with the shared
zeropage, try_to_map_unused_to_zeropage() currently drops several important
PTE bits.
For userspace tools like CRIU, which rely on the soft-dirty mechanism for
incremental snapshots, losing the soft-dirty bit means modified pages are
missed, leading to inconsistent memory state after restore.
As pointed out by David, the more critical uffd-wp bit is also dropped.
This breaks the userfaultfd write-protection mechanism, causing writes
to be silently missed by monitoring applications, which can lead to data
corruption.
Preserve both the soft-dirty and uffd-wp bits from the old PTE when
creating the new zeropage mapping to ensure they are correctly tracked.
Cc: <stable(a)vger.kernel.org>
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Suggested-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: Dev Jain <dev.jain(a)arm.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Dev Jain <dev.jain(a)arm.com>
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
---
v3 -> v4:
- Minor formatting tweak in try_to_map_unused_to_zeropage() function
signature (per David and Dev)
- Collect Reviewed-by from Dev - thanks!
- https://lore.kernel.org/linux-mm/20250930060557.85133-1-lance.yang@linux.de…
v2 -> v3:
- ptep_get() gets called only once per iteration (per Dev)
- https://lore.kernel.org/linux-mm/20250930043351.34927-1-lance.yang@linux.de…
v1 -> v2:
- Avoid calling ptep_get() multiple times (per Dev)
- Double-check the uffd-wp bit (per David)
- Collect Acked-by from David - thanks!
- https://lore.kernel.org/linux-mm/20250928044855.76359-1-lance.yang@linux.de…
mm/migrate.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index ce83c2c3c287..21a2a1bf89f7 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -296,8 +296,7 @@ bool isolate_folio_to_list(struct folio *folio, struct list_head *list)
}
static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
- struct folio *folio,
- unsigned long idx)
+ struct folio *folio, pte_t old_pte, unsigned long idx)
{
struct page *page = folio_page(folio, idx);
pte_t newpte;
@@ -306,7 +305,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
return false;
VM_BUG_ON_PAGE(!PageAnon(page), page);
VM_BUG_ON_PAGE(!PageLocked(page), page);
- VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
+ VM_BUG_ON_PAGE(pte_present(old_pte), page);
if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
mm_forbids_zeropage(pvmw->vma->vm_mm))
@@ -322,6 +321,12 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
pvmw->vma->vm_page_prot));
+
+ if (pte_swp_soft_dirty(old_pte))
+ newpte = pte_mksoft_dirty(newpte);
+ if (pte_swp_uffd_wp(old_pte))
+ newpte = pte_mkuffd_wp(newpte);
+
set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
@@ -344,7 +349,7 @@ static bool remove_migration_pte(struct folio *folio,
while (page_vma_mapped_walk(&pvmw)) {
rmap_t rmap_flags = RMAP_NONE;
- pte_t old_pte;
+ pte_t old_pte = ptep_get(pvmw.pte);
pte_t pte;
swp_entry_t entry;
struct page *new;
@@ -365,12 +370,11 @@ static bool remove_migration_pte(struct folio *folio,
}
#endif
if (rmap_walk_arg->map_unused_to_zeropage &&
- try_to_map_unused_to_zeropage(&pvmw, folio, idx))
+ try_to_map_unused_to_zeropage(&pvmw, folio, old_pte, idx))
continue;
folio_get(folio);
pte = mk_pte(new, READ_ONCE(vma->vm_page_prot));
- old_pte = ptep_get(pvmw.pte);
entry = pte_to_swp_entry(old_pte);
if (!is_migration_entry_young(entry))
--
2.49.0
From: Lance Yang <lance.yang(a)linux.dev>
When splitting an mTHP and replacing a zero-filled subpage with the shared
zeropage, try_to_map_unused_to_zeropage() currently drops several important
PTE bits.
For userspace tools like CRIU, which rely on the soft-dirty mechanism for
incremental snapshots, losing the soft-dirty bit means modified pages are
missed, leading to inconsistent memory state after restore.
As pointed out by David, the more critical uffd-wp bit is also dropped.
This breaks the userfaultfd write-protection mechanism, causing writes
to be silently missed by monitoring applications, which can lead to data
corruption.
Preserve both the soft-dirty and uffd-wp bits from the old PTE when
creating the new zeropage mapping to ensure they are correctly tracked.
Cc: <stable(a)vger.kernel.org>
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Suggested-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: Dev Jain <dev.jain(a)arm.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
---
v2 -> v3:
- ptep_get() gets called only once per iteration (per Dev)
- https://lore.kernel.org/linux-mm/20250930043351.34927-1-lance.yang@linux.de…
v1 -> v2:
- Avoid calling ptep_get() multiple times (per Dev)
- Double-check the uffd-wp bit (per David)
- Collect Acked-by from David - thanks!
- https://lore.kernel.org/linux-mm/20250928044855.76359-1-lance.yang@linux.de…
mm/migrate.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index ce83c2c3c287..bafd8cb3bebe 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -297,6 +297,7 @@ bool isolate_folio_to_list(struct folio *folio, struct list_head *list)
static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
struct folio *folio,
+ pte_t old_pte,
unsigned long idx)
{
struct page *page = folio_page(folio, idx);
@@ -306,7 +307,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
return false;
VM_BUG_ON_PAGE(!PageAnon(page), page);
VM_BUG_ON_PAGE(!PageLocked(page), page);
- VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
+ VM_BUG_ON_PAGE(pte_present(old_pte), page);
if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
mm_forbids_zeropage(pvmw->vma->vm_mm))
@@ -322,6 +323,12 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
pvmw->vma->vm_page_prot));
+
+ if (pte_swp_soft_dirty(old_pte))
+ newpte = pte_mksoft_dirty(newpte);
+ if (pte_swp_uffd_wp(old_pte))
+ newpte = pte_mkuffd_wp(newpte);
+
set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
@@ -344,7 +351,7 @@ static bool remove_migration_pte(struct folio *folio,
while (page_vma_mapped_walk(&pvmw)) {
rmap_t rmap_flags = RMAP_NONE;
- pte_t old_pte;
+ pte_t old_pte = ptep_get(pvmw.pte);
pte_t pte;
swp_entry_t entry;
struct page *new;
@@ -365,12 +372,11 @@ static bool remove_migration_pte(struct folio *folio,
}
#endif
if (rmap_walk_arg->map_unused_to_zeropage &&
- try_to_map_unused_to_zeropage(&pvmw, folio, idx))
+ try_to_map_unused_to_zeropage(&pvmw, folio, old_pte, idx))
continue;
folio_get(folio);
pte = mk_pte(new, READ_ONCE(vma->vm_page_prot));
- old_pte = ptep_get(pvmw.pte);
entry = pte_to_swp_entry(old_pte);
if (!is_migration_entry_young(entry))
--
2.49.0
From: Zheng Qixing <zhengqixing(a)huawei.com>
From: Jan Kara <jack(a)suse.cz>
[ Upstream commit 7e49538288e523427beedd26993d446afef1a6fb ]
Syzbot came up with a reproducer where a loop device block size is
changed underneath a mounted filesystem. This causes a mismatch between
the block device block size and the block size stored in the superblock
causing confusion in various places such as fs/buffer.c. The particular
issue triggered by syzbot was a warning in __getblk_slow() due to
requested buffer size not matching block device block size.
Fix the problem by getting exclusive hold of the loop device to change
its block size. This fails if somebody (such as filesystem) has already
an exclusive ownership of the block device and thus prevents modifying
the loop device under some exclusive owner which doesn't expect it.
Reported-by: syzbot+01ef7a8da81a975e1ccd(a)syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack(a)suse.cz>
Tested-by: syzbot+01ef7a8da81a975e1ccd(a)syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20250711163202.19623-2-jack@suse.cz
Signed-off-by: Zheng Qixing <zhengqixing(a)huawei.com>
---
drivers/block/loop.c | 40 +++++++++++++++++++++++++++++++---------
1 file changed, 31 insertions(+), 9 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 455e2a2b149f..6fe9180aafb3 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1472,19 +1472,36 @@ static int loop_set_dio(struct loop_device *lo, unsigned long arg)
return error;
}
-static int loop_set_block_size(struct loop_device *lo, unsigned long arg)
+static int loop_set_block_size(struct loop_device *lo, blk_mode_t mode,
+ struct block_device *bdev, unsigned long arg)
{
int err = 0;
- if (lo->lo_state != Lo_bound)
- return -ENXIO;
+ /*
+ * If we don't hold exclusive handle for the device, upgrade to it
+ * here to avoid changing device under exclusive owner.
+ */
+ if (!(mode & BLK_OPEN_EXCL)) {
+ err = bd_prepare_to_claim(bdev, loop_set_block_size, NULL);
+ if (err)
+ return err;
+ }
+
+ err = mutex_lock_killable(&lo->lo_mutex);
+ if (err)
+ goto abort_claim;
+
+ if (lo->lo_state != Lo_bound) {
+ err = -ENXIO;
+ goto unlock;
+ }
err = blk_validate_block_size(arg);
if (err)
- return err;
+ goto unlock;
if (lo->lo_queue->limits.logical_block_size == arg)
- return 0;
+ goto unlock;
sync_blockdev(lo->lo_device);
invalidate_bdev(lo->lo_device);
@@ -1496,6 +1513,11 @@ static int loop_set_block_size(struct loop_device *lo, unsigned long arg)
loop_update_dio(lo);
blk_mq_unfreeze_queue(lo->lo_queue);
+unlock:
+ mutex_unlock(&lo->lo_mutex);
+abort_claim:
+ if (!(mode & BLK_OPEN_EXCL))
+ bd_abort_claiming(bdev, loop_set_block_size);
return err;
}
@@ -1514,9 +1536,6 @@ static int lo_simple_ioctl(struct loop_device *lo, unsigned int cmd,
case LOOP_SET_DIRECT_IO:
err = loop_set_dio(lo, arg);
break;
- case LOOP_SET_BLOCK_SIZE:
- err = loop_set_block_size(lo, arg);
- break;
default:
err = -EINVAL;
}
@@ -1571,9 +1590,12 @@ static int lo_ioctl(struct block_device *bdev, blk_mode_t mode,
break;
case LOOP_GET_STATUS64:
return loop_get_status64(lo, argp);
+ case LOOP_SET_BLOCK_SIZE:
+ if (!(mode & BLK_OPEN_WRITE) && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+ return loop_set_block_size(lo, mode, bdev, arg);
case LOOP_SET_CAPACITY:
case LOOP_SET_DIRECT_IO:
- case LOOP_SET_BLOCK_SIZE:
if (!(mode & BLK_OPEN_WRITE) && !capable(CAP_SYS_ADMIN))
return -EPERM;
fallthrough;
--
2.39.2
The driver trusts the RX descriptor length and uses it directly for
dev_alloc_skb(), memcpy_fromio(), and skb_put() without any bounds
checking. If the descriptor gets corrupted or otherwise contains an
invalid value, this can lead to an excessive allocation or reading
past the per-buffer limit programmed by the driver.
Validate 'len' read from the descriptor and drop the frame if it is
zero or greater than HDLC_MAX_MRU. The driver programs BFLL to
HDLC_MAX_MRU for RX buffers, so this is the correct upper bound.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
drivers/net/wan/hd64572.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/wan/hd64572.c b/drivers/net/wan/hd64572.c
index 534369ffe5de..6327204e3c02 100644
--- a/drivers/net/wan/hd64572.c
+++ b/drivers/net/wan/hd64572.c
@@ -199,6 +199,12 @@ static inline void sca_rx(card_t *card, port_t *port, pkt_desc __iomem *desc,
u32 buff;
len = readw(&desc->len);
+
+ if (unlikely(!len || len > HDLC_MAX_MRU)) {
+ dev->stats.rx_length_errors++;
+ return;
+ }
+
skb = dev_alloc_skb(len);
if (!skb) {
dev->stats.rx_dropped++;
--
2.43.0
Another day, another syzkaller bug. KVM erroneously allows userspace to
pend vCPU events for a vCPU that hasn't been initialized yet, leading to
KVM interpreting a bunch of uninitialized garbage for routing /
injecting the exception.
In one case the injection code and the hyp disagree on whether the vCPU
has a 32bit EL1 and put the vCPU into an illegal mode for AArch64,
tripping the BUG() in exception_target_el() during the next injection:
kernel BUG at arch/arm64/kvm/inject_fault.c:40!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
CPU: 3 UID: 0 PID: 318 Comm: repro Not tainted 6.17.0-rc4-00104-g10fd0285305d #6 PREEMPT
Hardware name: linux,dummy-virt (DT)
pstate: 21402009 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : exception_target_el+0x88/0x8c
lr : pend_serror_exception+0x18/0x13c
sp : ffff800082f03a10
x29: ffff800082f03a10 x28: ffff0000cb132280 x27: 0000000000000000
x26: 0000000000000000 x25: ffff0000c2a99c20 x24: 0000000000000000
x23: 0000000000008000 x22: 0000000000000002 x21: 0000000000000004
x20: 0000000000008000 x19: ffff0000c2a99c20 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 00000000200000c0
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
x8 : ffff800082f03af8 x7 : 0000000000000000 x6 : 0000000000000000
x5 : ffff800080f621f0 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 000000000040009b x1 : 0000000000000003 x0 : ffff0000c2a99c20
Call trace:
exception_target_el+0x88/0x8c (P)
kvm_inject_serror_esr+0x40/0x3b4
__kvm_arm_vcpu_set_events+0xf0/0x100
kvm_arch_vcpu_ioctl+0x180/0x9d4
kvm_vcpu_ioctl+0x60c/0x9f4
__arm64_sys_ioctl+0xac/0x104
invoke_syscall+0x48/0x110
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x34/0xf0
el0t_64_sync_handler+0xa0/0xe4
el0t_64_sync+0x198/0x19c
Code: f946bc01 b4fffe61 9101e020 17fffff2 (d4210000)
Reject the ioctls outright as no sane VMM would call these before
KVM_ARM_VCPU_INIT anyway. Even if it did the exception would've been
thrown away by the eventual reset of the vCPU's state.
Cc: stable(a)vger.kernel.org # 6.17
Fixes: b7b27facc7b5 ("arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS")
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
---
While the blamed commit is indeed broken, only 6.17+ kernels actually
hit the BUG() due to commit efa1368ba9f4 ("KVM: arm64: Commit exceptions
from KVM_SET_VCPU_EVENTS immediately).
arch/arm64/kvm/arm.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a59b4046617c..c44357d26ee8 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1795,6 +1795,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
case KVM_GET_VCPU_EVENTS: {
struct kvm_vcpu_events events;
+ if (!kvm_vcpu_initialized(vcpu))
+ return -ENOEXEC;
+
if (kvm_arm_vcpu_get_events(vcpu, &events))
return -EINVAL;
@@ -1806,6 +1809,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
case KVM_SET_VCPU_EVENTS: {
struct kvm_vcpu_events events;
+ if (!kvm_vcpu_initialized(vcpu))
+ return -ENOEXEC;
+
if (copy_from_user(&events, argp, sizeof(events)))
return -EFAULT;
base-commit: 10fd0285305d0b48e8a3bf15d4f17fc4f3d68cb6
--
2.39.5
Hello,
After upgrading to 6.16.9 this morning, my laptop can't boot. I cannot
get any logs because the kernel seems to freeze very early, even before
I'm asked for the full disk encryption passphrase.
This is a regression from 6.16.8 to 6.16.9.
I did a git bisect in the stable/linux and this is the commit causing
the issue for me:
97207a4fed5348ff5c5e71a7300db9b638640879 is the first bad commit
commit 97207a4fed5348ff5c5e71a7300db9b638640879 (HEAD)
Author: Daniele Ceraolo Spurio <daniele.ceraolospurio(a)intel.com>
Date: Wed Jun 25 13:54:06 2025 -0700
drm/xe/guc: Enable extended CAT error reporting
[ Upstream commit a7ffcea8631af91479cab10aa7fbfd0722f01d9a ]
https://lore.kernel.org/all/20250625205405.1653212-3-daniele.ceraolospurio@…
How to reproduce:
1. Upgrade to 6.16.9
2. Enable the Xe driver by passing i915.force_probe=!7d55
xe.force_probe=7d55
3. Reboot
Best regards,
Iyán
--
Iyán Méndez Veiga
GPG Key: 204C 461F BA8C 81D1 0327 E647 422E 3694 311E 5AC1
In ips_init_phase1(), most early-exit error paths use ips_abort_init(),
which properly releases ioremap_ptr. However, the path where kzalloc()
fails does not go through ips_abort_init() and therefore skips unmapping
ioremap_ptr, leading to a potential resource leak.
Add the missing iounmap() call in this specific failure path.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhen Ni <zhen.ni(a)easystack.cn>
---
drivers/scsi/ips.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
index 94adb6ac02a4..a34167ec3038 100644
--- a/drivers/scsi/ips.c
+++ b/drivers/scsi/ips.c
@@ -6877,6 +6877,8 @@ ips_init_phase1(struct pci_dev *pci_dev, int *indexPtr)
if (ha == NULL) {
IPS_PRINTK(KERN_WARNING, pci_dev,
"Unable to allocate temporary ha struct\n");
+ if (ioremap_ptr)
+ iounmap(ioremap_ptr);
return -1;
}
--
2.20.1
The error return from bio_split() is not checked before
being passed to bio_chain(), leading to a kernel panic
from an invalid pointer dereference.
Add a check with IS_ERR() to handle the allocation failure
and prevent the crash.
This patch fixes a bug in the pktcdvd driver, which was removed
from the mainline kernel but still exists in stable branches.
Fixes: 4b83e99ee7092 ("Revert "pktcdvd: remove driver."")
Cc: stable(a)vger.kernel.org
Signed-off-by: Haotian Zhang <vulab(a)iscas.ac.cn>
---
drivers/block/pktcdvd.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 65b96c083b3c..c0999c3d167a 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -2466,6 +2466,8 @@ static void pkt_submit_bio(struct bio *bio)
split = bio_split(bio, last_zone -
bio->bi_iter.bi_sector,
GFP_NOIO, &pkt_bio_set);
+ if (IS_ERR(split))
+ goto end_io;
bio_chain(split, bio);
} else {
split = bio;
--
2.25.1
This issue was found by Runcheng Lu when develop HSCanT USB to CAN FD
converter[1]. The original developers may have only 3 intefaces device to
test so they write 3 here and wait for future change.
During the HSCanT development, we actually used 4 interfaces, so the
limitation of 3 is not enough now. But just increase one is not
future-proofed. Since the channel type in gs_host_frame is u8, just
increase interface number limit to max size of u8 safely.
[1]: https://github.com/cherry-embedded/HSCanT-hardware
Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices")
Reported-by: Runcheng Lu <runcheng.lu(a)hpmicro.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v3:
- Cc stable should in patch instead of cover letter.
- Link to v2: https://lore.kernel.org/r/20250930-gs-usb-max-if-v2-1-2cf9a44e6861@coelacan…
Changes in v2:
- Use flexible array member instead of fixed array.
- Link to v1: https://lore.kernel.org/r/20250929-gs-usb-max-if-v1-1-e41b5c09133a@coelacan…
---
drivers/net/can/usb/gs_usb.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index c9482d6e947b0c7b033dc4f0c35f5b111e1bfd92..69b068c8fa8fbab42337e2b0a3d0860ac678c792 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -289,11 +289,6 @@ struct gs_host_frame {
#define GS_MAX_RX_URBS 30
#define GS_NAPI_WEIGHT 32
-/* Maximum number of interfaces the driver supports per device.
- * Current hardware only supports 3 interfaces. The future may vary.
- */
-#define GS_MAX_INTF 3
-
struct gs_tx_context {
struct gs_can *dev;
unsigned int echo_id;
@@ -324,7 +319,6 @@ struct gs_can {
/* usb interface struct */
struct gs_usb {
- struct gs_can *canch[GS_MAX_INTF];
struct usb_anchor rx_submitted;
struct usb_device *udev;
@@ -336,9 +330,11 @@ struct gs_usb {
unsigned int hf_size_rx;
u8 active_channels;
+ u8 channel_cnt;
unsigned int pipe_in;
unsigned int pipe_out;
+ struct gs_can *canch[] __counted_by(channel_cnt);
};
/* 'allocate' a tx context.
@@ -599,7 +595,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
}
/* device reports out of range channel id */
- if (hf->channel >= GS_MAX_INTF)
+ if (hf->channel >= parent->channel_cnt)
goto device_detach;
dev = parent->canch[hf->channel];
@@ -699,7 +695,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
/* USB failure take down all interfaces */
if (rc == -ENODEV) {
device_detach:
- for (rc = 0; rc < GS_MAX_INTF; rc++) {
+ for (rc = 0; rc < parent->channel_cnt; rc++) {
if (parent->canch[rc])
netif_device_detach(parent->canch[rc]->netdev);
}
@@ -1460,17 +1456,19 @@ static int gs_usb_probe(struct usb_interface *intf,
icount = dconf.icount + 1;
dev_info(&intf->dev, "Configuring for %u interfaces\n", icount);
- if (icount > GS_MAX_INTF) {
+ if (icount > type_max(typeof(parent->channel_cnt))) {
dev_err(&intf->dev,
"Driver cannot handle more that %u CAN interfaces\n",
- GS_MAX_INTF);
+ type_max(typeof(parent->channel_cnt)));
return -EINVAL;
}
- parent = kzalloc(sizeof(*parent), GFP_KERNEL);
+ parent = kzalloc(struct_size(parent, canch, icount), GFP_KERNEL);
if (!parent)
return -ENOMEM;
+ parent->channel_cnt = icount;
+
init_usb_anchor(&parent->rx_submitted);
usb_set_intfdata(intf, parent);
@@ -1531,7 +1529,7 @@ static void gs_usb_disconnect(struct usb_interface *intf)
return;
}
- for (i = 0; i < GS_MAX_INTF; i++)
+ for (i = 0; i < parent->channel_cnt; i++)
if (parent->canch[i])
gs_destroy_candev(parent->canch[i]);
---
base-commit: e5f0a698b34ed76002dc5cff3804a61c80233a7a
change-id: 20250929-gs-usb-max-if-a304c83243e5
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
From: Lance Yang <lance.yang(a)linux.dev>
When splitting an mTHP and replacing a zero-filled subpage with the shared
zeropage, try_to_map_unused_to_zeropage() currently drops several important
PTE bits.
For userspace tools like CRIU, which rely on the soft-dirty mechanism for
incremental snapshots, losing the soft-dirty bit means modified pages are
missed, leading to inconsistent memory state after restore.
As pointed out by David, the more critical uffd-wp bit is also dropped.
This breaks the userfaultfd write-protection mechanism, causing writes
to be silently missed by monitoring applications, which can lead to data
corruption.
Preserve both the soft-dirty and uffd-wp bits from the old PTE when
creating the new zeropage mapping to ensure they are correctly tracked.
Cc: <stable(a)vger.kernel.org>
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Suggested-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: Dev Jain <dev.jain(a)arm.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
---
v1 -> v2:
- Avoid calling ptep_get() multiple times (per Dev)
- Double-check the uffd-wp bit (per David)
- Collect Acked-by from David - thanks!
- https://lore.kernel.org/linux-mm/20250928044855.76359-1-lance.yang@linux.de…
mm/migrate.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index ce83c2c3c287..50aa91d9ab4e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -300,13 +300,14 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
unsigned long idx)
{
struct page *page = folio_page(folio, idx);
+ pte_t oldpte = ptep_get(pvmw->pte);
pte_t newpte;
if (PageCompound(page))
return false;
VM_BUG_ON_PAGE(!PageAnon(page), page);
VM_BUG_ON_PAGE(!PageLocked(page), page);
- VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
+ VM_BUG_ON_PAGE(pte_present(oldpte), page);
if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
mm_forbids_zeropage(pvmw->vma->vm_mm))
@@ -322,6 +323,12 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
pvmw->vma->vm_page_prot));
+
+ if (pte_swp_soft_dirty(oldpte))
+ newpte = pte_mksoft_dirty(newpte);
+ if (pte_swp_uffd_wp(oldpte))
+ newpte = pte_mkuffd_wp(newpte);
+
set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
--
2.49.0
This issue was found by Runcheng Lu when develop HSCanT USB to CAN FD
converter[1]. The original developers may have only 3 intefaces device to
test so they write 3 here and wait for future change.
During the HSCanT development, we actually used 4 interfaces, so the
limitation of 3 is not enough now. But just increase one is not
future-proofed. Since the channel type in gs_host_frame is u8, just
increase interface number limit to max size of u8 safely.
[1]: https://github.com/cherry-embedded/HSCanT-hardware
Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices")
Reported-by: Runcheng Lu <runcheng.lu(a)hpmicro.com>
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v2:
- Use flexible array member instead of fixed array.
- Link to v1: https://lore.kernel.org/r/20250929-gs-usb-max-if-v1-1-e41b5c09133a@coelacan…
---
drivers/net/can/usb/gs_usb.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index c9482d6e947b0c7b033dc4f0c35f5b111e1bfd92..69b068c8fa8fbab42337e2b0a3d0860ac678c792 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -289,11 +289,6 @@ struct gs_host_frame {
#define GS_MAX_RX_URBS 30
#define GS_NAPI_WEIGHT 32
-/* Maximum number of interfaces the driver supports per device.
- * Current hardware only supports 3 interfaces. The future may vary.
- */
-#define GS_MAX_INTF 3
-
struct gs_tx_context {
struct gs_can *dev;
unsigned int echo_id;
@@ -324,7 +319,6 @@ struct gs_can {
/* usb interface struct */
struct gs_usb {
- struct gs_can *canch[GS_MAX_INTF];
struct usb_anchor rx_submitted;
struct usb_device *udev;
@@ -336,9 +330,11 @@ struct gs_usb {
unsigned int hf_size_rx;
u8 active_channels;
+ u8 channel_cnt;
unsigned int pipe_in;
unsigned int pipe_out;
+ struct gs_can *canch[] __counted_by(channel_cnt);
};
/* 'allocate' a tx context.
@@ -599,7 +595,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
}
/* device reports out of range channel id */
- if (hf->channel >= GS_MAX_INTF)
+ if (hf->channel >= parent->channel_cnt)
goto device_detach;
dev = parent->canch[hf->channel];
@@ -699,7 +695,7 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
/* USB failure take down all interfaces */
if (rc == -ENODEV) {
device_detach:
- for (rc = 0; rc < GS_MAX_INTF; rc++) {
+ for (rc = 0; rc < parent->channel_cnt; rc++) {
if (parent->canch[rc])
netif_device_detach(parent->canch[rc]->netdev);
}
@@ -1460,17 +1456,19 @@ static int gs_usb_probe(struct usb_interface *intf,
icount = dconf.icount + 1;
dev_info(&intf->dev, "Configuring for %u interfaces\n", icount);
- if (icount > GS_MAX_INTF) {
+ if (icount > type_max(typeof(parent->channel_cnt))) {
dev_err(&intf->dev,
"Driver cannot handle more that %u CAN interfaces\n",
- GS_MAX_INTF);
+ type_max(typeof(parent->channel_cnt)));
return -EINVAL;
}
- parent = kzalloc(sizeof(*parent), GFP_KERNEL);
+ parent = kzalloc(struct_size(parent, canch, icount), GFP_KERNEL);
if (!parent)
return -ENOMEM;
+ parent->channel_cnt = icount;
+
init_usb_anchor(&parent->rx_submitted);
usb_set_intfdata(intf, parent);
@@ -1531,7 +1529,7 @@ static void gs_usb_disconnect(struct usb_interface *intf)
return;
}
- for (i = 0; i < GS_MAX_INTF; i++)
+ for (i = 0; i < parent->channel_cnt; i++)
if (parent->canch[i])
gs_destroy_candev(parent->canch[i]);
---
base-commit: e5f0a698b34ed76002dc5cff3804a61c80233a7a
change-id: 20250929-gs-usb-max-if-a304c83243e5
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
From: Lizhi Xu <lizhi.xu(a)windriver.com>
[ Upstream commit 66d938e89e940e512f4c3deac938ecef399c13f9 ]
The filio lock has been released here, so there is no need to jump to
error_folio_unlock to release it again.
Reported-by: syzbot+b73c7d94a151e2ee1e9b(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b73c7d94a151e2ee1e9b
Signed-off-by: Lizhi Xu <lizhi.xu(a)windriver.com>
Acked-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Paulo Alcantara (Red Hat) <pc(a)manguebit.org>
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my comprehensive investigation, here is my analysis:
## Backport Decision: **YES**
### Detailed Analysis
#### Bug Description
This commit fixes a **critical double-unlock bug** in the netfs (Network
Filesystem Library) buffered write path. The bug was introduced in
commit 8f52de0077ba3b (v6.12-rc1) during a performance optimization
refactoring.
**The specific bug**: In the `flush_content` error path at
fs/netfs/buffered_write.c:346, the code unlocks and releases a folio,
then on line 350, if `filemap_write_and_wait_range()` fails, it jumps to
`error_folio_unlock` which attempts to unlock the **already unlocked**
folio again (line 407).
```c
flush_content:
folio_unlock(folio); // First unlock - line 346
folio_put(folio);
ret = filemap_write_and_wait_range(...);
if (ret < 0)
goto error_folio_unlock; // BUG: jumps to unlock again!
```
**The fix**: Changes line 350 from `goto error_folio_unlock` to `goto
out`, correctly bypassing the duplicate unlock.
#### Severity Assessment: **HIGH**
1. **Impact**:
- With `CONFIG_DEBUG_VM=y`: Immediate kernel panic via
`VM_BUG_ON_FOLIO()` at mm/filemap.c:1498
- With `CONFIG_DEBUG_VM=n`: Silent memory corruption, undefined
behavior, potential use-after-free
- Affects **all network filesystems**: 9p, AFS, Ceph, NFS, SMB/CIFS
2. **Syzbot Evidence**:
- Bug ID: syzbot+b73c7d94a151e2ee1e9b(a)syzkaller.appspotmail.com
- Title: "kernel BUG in netfs_perform_write"
- **17 crash instances** recorded
- Reproducers available (both C and syz formats)
- Affected multiple kernel versions (5.4, 5.10, 5.15, 6.1, 6.12)
3. **Triggering Conditions** (Moderate likelihood):
- Network filesystem write operation
- Incompatible write scenario (netfs_group mismatch or streaming
write conflict)
- I/O error from `filemap_write_and_wait_range()` (network failure,
memory pressure, etc.)
#### Backport Criteria Evaluation
✅ **Fixes important bug affecting users**: Yes - causes kernel panics
and potential memory corruption for all network filesystem users
✅ **Small and contained fix**: Yes - **single line change**, minimal
code modification
✅ **No architectural changes**: Yes - simple error path correction
✅ **Minimal regression risk**: Yes - obviously correct fix, well-
reviewed (Acked-by David Howells, Reviewed-by Paulo Alcantara)
✅ **Confined to subsystem**: Yes - only touches netfs buffered write
error path
✅ **Well-tested**: Yes - syzbot has reproducers, 17 crash instances
documented
#### Affected Stable Trees
**Bug introduced**: v6.12-rc1 (commit 8f52de0077ba3b)
**Bug fixed**: v6.17 (commit 66d938e89e940)
**Vulnerable stable kernels**: 6.12.x, 6.13.x, 6.14.x, 6.15.x, 6.16.x
#### Missing Metadata (Should be added)
The commit is **missing critical stable backport tags**:
- No `Fixes: 8f52de0077ba3b ("netfs: Reduce number of conditional
branches in netfs_perform_write()")`
- No `Cc: stable(a)vger.kernel.org`
This appears to be an oversight, as the fix clearly qualifies for stable
backporting.
### Conclusion
**Strong YES for backporting**. This is a textbook stable tree
candidate:
- Fixes a serious kernel panic/memory corruption bug
- One-line change with zero regression risk
- Affects production users of network filesystems
- Well-tested with reproducers
- Reviewed and acked by subsystem maintainers
The fix should be backported to **all stable kernels containing commit
8f52de0077ba3b** (6.12+).
fs/netfs/buffered_write.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index f27ea5099a681..09394ac2c180d 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -347,7 +347,7 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
folio_put(folio);
ret = filemap_write_and_wait_range(mapping, fpos, fpos + flen - 1);
if (ret < 0)
- goto error_folio_unlock;
+ goto out;
continue;
copied:
--
2.51.0
From: Lance Yang <lance.yang(a)linux.dev>
When splitting an mTHP and replacing a zero-filled subpage with the shared
zeropage, try_to_map_unused_to_zeropage() currently drops the soft-dirty
bit.
For userspace tools like CRIU, which rely on the soft-dirty mechanism for
incremental snapshots, losing this bit means modified pages are missed,
leading to inconsistent memory state after restore.
Preserve the soft-dirty bit from the old PTE when creating the zeropage
mapping to ensure modified pages are correctly tracked.
Cc: <stable(a)vger.kernel.org>
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
---
mm/migrate.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/migrate.c b/mm/migrate.c
index ce83c2c3c287..bf364ba07a3f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -322,6 +322,10 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
pvmw->vma->vm_page_prot));
+
+ if (pte_swp_soft_dirty(ptep_get(pvmw->pte)))
+ newpte = pte_mksoft_dirty(newpte);
+
set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
--
2.49.0
Greetings,
I hope this email finds you well.
I am Mrs. Diana Owen contacting you from Reality Trading Group Ltd India.
.
As a new growing company, we are looking for reliable company of your product as seen in your website for urgent purchase order basis.
Could you please give me more details and specification of your products.
We also need your products catalog, MOQ, Production Lead Time and Unit price?
Best Regards,
Mrs. Diana Owen
Purchase Manager
Company: Reality Trading Group Ltd
DAMON's virtual address space operation set implementation (vaddr) calls
pte_offset_map_lock() inside the page table walk callback function.
This is for reading and writing page table accessed bits. If
pte_offset_map_lock() fails, it retries by returning the page table walk
callback function with ACTION_AGAIN.
pte_offset_map_lock() can continuously fail if the target is a pmd
migration entry, though. Hence it could cause an infinite page table
walk if the migration cannot be done until the page table walk is
finished. This indeed caused a soft lockup when CPU hotplugging and
DAMON were running in parallel.
Avoid the infinite loop by simply not retrying the page table walk.
DAMON is promising only a best-effort accuracy, so missing access to
such pages is no problem.
Reported-by: Xinyu Zheng <zhengxinyu6(a)huawei.com>
Closes: https://lore.kernel.org/20250918030029.2652607-1-zhengxinyu6@huawei.com
Fixes: 7780d04046a2 ("mm/pagewalkers: ACTION_AGAIN if pte_offset_map_lock() fails")
Cc: <stable(a)vger.kernel.org> # 6.5.x
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/vaddr.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 8c048f9b129e..7e834467b2d8 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -328,10 +328,8 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
}
pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
- if (!pte) {
- walk->action = ACTION_AGAIN;
+ if (!pte)
return 0;
- }
if (!pte_present(ptep_get(pte)))
goto out;
damon_ptep_mkold(pte, walk->vma, addr);
@@ -481,10 +479,8 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
- if (!pte) {
- walk->action = ACTION_AGAIN;
+ if (!pte)
return 0;
- }
ptent = ptep_get(pte);
if (!pte_present(ptent))
goto out;
base-commit: 3169a901e935bc1f2d2eec0171abcf524b7747e4
--
2.39.5
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092932-output-egotism-4918@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7 Mon Sep 17 00:00:00 2001
From: Jinjiang Tu <tujinjiang(a)huawei.com>
Date: Fri, 12 Sep 2025 15:41:39 +0800
Subject: [PATCH] mm/hugetlb: fix folio is still mapped when deleted
Migration may be raced with fallocating hole. remove_inode_single_folio
will unmap the folio if the folio is still mapped. However, it's called
without folio lock. If the folio is migrated and the mapped pte has been
converted to migration entry, folio_mapped() returns false, and won't
unmap it. Due to extra refcount held by remove_inode_single_folio,
migration fails, restores migration entry to normal pte, and the folio is
mapped again. As a result, we triggered BUG in filemap_unaccount_folio.
The log is as follows:
BUG: Bad page cache in process hugetlb pfn:156c00
page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00
head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0
aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file"
flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff)
page_type: f4(hugetlb)
page dumped because: still mapped when deleted
CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Call Trace:
<TASK>
dump_stack_lvl+0x4f/0x70
filemap_unaccount_folio+0xc4/0x1c0
__filemap_remove_folio+0x38/0x1c0
filemap_remove_folio+0x41/0xd0
remove_inode_hugepages+0x142/0x250
hugetlbfs_fallocate+0x471/0x5a0
vfs_fallocate+0x149/0x380
Hold folio lock before checking if the folio is mapped to avold race with
migration.
Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com
Fixes: 4aae8d1c051e ("mm/hugetlbfs: unmap pages if page fault raced with hole punch")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 09d4baef29cf..be4be99304bc 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -517,14 +517,16 @@ static bool remove_inode_single_folio(struct hstate *h, struct inode *inode,
/*
* If folio is mapped, it was faulted in after being
- * unmapped in caller. Unmap (again) while holding
- * the fault mutex. The mutex will prevent faults
- * until we finish removing the folio.
+ * unmapped in caller or hugetlb_vmdelete_list() skips
+ * unmapping it due to fail to grab lock. Unmap (again)
+ * while holding the fault mutex. The mutex will prevent
+ * faults until we finish removing the folio. Hold folio
+ * lock to guarantee no concurrent migration.
*/
+ folio_lock(folio);
if (unlikely(folio_mapped(folio)))
hugetlb_unmap_file_folio(h, mapping, folio, index);
- folio_lock(folio);
/*
* We must remove the folio from page cache before removing
* the region/ reserve map (hugetlb_unreserve_pages). In
Hello there,
I hope this message finds you well. My name is Ronald Evergreen.
I am a legal representative based in London, I am contacting you
with a confidential business proposal that presents a unique
opportunity for partnership that will be of great benefits to
both sides financially. For more information reply back and I
will give you detailed information.
For more information please contact my personal email:
evergreenronald86(a)gamil.com.
Ronald Evergreen
The patch titled
Subject: mm: hugetlb: avoid soft lockup when mprotect to large memory area
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-hugetlb-avoid-soft-lockup-when-mprotect-to-large-memory-area.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Yang Shi <yang(a)os.amperecomputing.com>
Subject: mm: hugetlb: avoid soft lockup when mprotect to large memory area
Date: Mon, 29 Sep 2025 13:24:02 -0700
When calling mprotect() to a large hugetlb memory area in our customer's
workload (~300GB hugetlb memory), soft lockup was observed:
watchdog: BUG: soft lockup - CPU#98 stuck for 23s! [t2_new_sysv:126916]
CPU: 98 PID: 126916 Comm: t2_new_sysv Kdump: loaded Not tainted 6.17-rc7
Hardware name: GIGACOMPUTING R2A3-T40-AAV1/Jefferson CIO, BIOS 5.4.4.1 07/15/2025
pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc��: mte_clear_page_tags+0x14/0x24
lr��: mte_sync_tags+0x1c0/0x240
sp��: ffff80003150bb80
x29: ffff80003150bb80 x28: ffff00739e9705a8 x27: 0000ffd2d6a00000
x26: 0000ff8e4bc00000 x25: 00e80046cde00f45 x24: 0000000000022458
x23: 0000000000000000 x22: 0000000000000004 x21: 000000011b380000
x20: ffff000000000000 x19: 000000011b379f40 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc875e0aa5e2c
x8��: 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
x5��: fffffc01ce7a5c00 x4 : 00000000046cde00 x3 : fffffc0000000000
x2��: 0000000000000004 x1 : 0000000000000040 x0 : ffff0046cde7c000
Call trace:
����mte_clear_page_tags+0x14/0x24
����set_huge_pte_at+0x25c/0x280
����hugetlb_change_protection+0x220/0x430
����change_protection+0x5c/0x8c
����mprotect_fixup+0x10c/0x294
����do_mprotect_pkey.constprop.0+0x2e0/0x3d4
����__arm64_sys_mprotect+0x24/0x44
����invoke_syscall+0x50/0x160
����el0_svc_common+0x48/0x144
����do_el0_svc+0x30/0xe0
����el0_svc+0x30/0xf0
����el0t_64_sync_handler+0xc4/0x148
����el0t_64_sync+0x1a4/0x1a8
Soft lockup is not triggered with THP or base page because there is
cond_resched() called for each PMD size.
Although the soft lockup was triggered by MTE, it should be not MTE
specific. The other processing which takes long time in the loop may
trigger soft lockup too.
So add cond_resched() for hugetlb to avoid soft lockup.
Link: https://lkml.kernel.org/r/20250929202402.1663290-1-yang@os.amperecomputing.…
Fixes: 8f860591ffb2 ("[PATCH] Enable mprotect on huge pages")
Signed-off-by: Yang Shi <yang(a)os.amperecomputing.com>
Tested-by: Carl Worth <carl(a)os.amperecomputing.com>
Reviewed-by: Christoph Lameter (Ampere) <cl(a)gentwo.org>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/hugetlb.c~mm-hugetlb-avoid-soft-lockup-when-mprotect-to-large-memory-area
+++ a/mm/hugetlb.c
@@ -7203,6 +7203,8 @@ long hugetlb_change_protection(struct vm
psize);
}
spin_unlock(ptl);
+
+ cond_resched();
}
/*
* Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare
_
Patches currently in -mm which might be from yang(a)os.amperecomputing.com are
mm-hugetlb-avoid-soft-lockup-when-mprotect-to-large-memory-area.patch
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092931-vastness-jawed-6945@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7 Mon Sep 17 00:00:00 2001
From: Jinjiang Tu <tujinjiang(a)huawei.com>
Date: Fri, 12 Sep 2025 15:41:39 +0800
Subject: [PATCH] mm/hugetlb: fix folio is still mapped when deleted
Migration may be raced with fallocating hole. remove_inode_single_folio
will unmap the folio if the folio is still mapped. However, it's called
without folio lock. If the folio is migrated and the mapped pte has been
converted to migration entry, folio_mapped() returns false, and won't
unmap it. Due to extra refcount held by remove_inode_single_folio,
migration fails, restores migration entry to normal pte, and the folio is
mapped again. As a result, we triggered BUG in filemap_unaccount_folio.
The log is as follows:
BUG: Bad page cache in process hugetlb pfn:156c00
page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00
head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0
aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file"
flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff)
page_type: f4(hugetlb)
page dumped because: still mapped when deleted
CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Call Trace:
<TASK>
dump_stack_lvl+0x4f/0x70
filemap_unaccount_folio+0xc4/0x1c0
__filemap_remove_folio+0x38/0x1c0
filemap_remove_folio+0x41/0xd0
remove_inode_hugepages+0x142/0x250
hugetlbfs_fallocate+0x471/0x5a0
vfs_fallocate+0x149/0x380
Hold folio lock before checking if the folio is mapped to avold race with
migration.
Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com
Fixes: 4aae8d1c051e ("mm/hugetlbfs: unmap pages if page fault raced with hole punch")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 09d4baef29cf..be4be99304bc 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -517,14 +517,16 @@ static bool remove_inode_single_folio(struct hstate *h, struct inode *inode,
/*
* If folio is mapped, it was faulted in after being
- * unmapped in caller. Unmap (again) while holding
- * the fault mutex. The mutex will prevent faults
- * until we finish removing the folio.
+ * unmapped in caller or hugetlb_vmdelete_list() skips
+ * unmapping it due to fail to grab lock. Unmap (again)
+ * while holding the fault mutex. The mutex will prevent
+ * faults until we finish removing the folio. Hold folio
+ * lock to guarantee no concurrent migration.
*/
+ folio_lock(folio);
if (unlikely(folio_mapped(folio)))
hugetlb_unmap_file_folio(h, mapping, folio, index);
- folio_lock(folio);
/*
* We must remove the folio from page cache before removing
* the region/ reserve map (hugetlb_unreserve_pages). In
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092930-header-irritable-c8ed@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7 Mon Sep 17 00:00:00 2001
From: Jinjiang Tu <tujinjiang(a)huawei.com>
Date: Fri, 12 Sep 2025 15:41:39 +0800
Subject: [PATCH] mm/hugetlb: fix folio is still mapped when deleted
Migration may be raced with fallocating hole. remove_inode_single_folio
will unmap the folio if the folio is still mapped. However, it's called
without folio lock. If the folio is migrated and the mapped pte has been
converted to migration entry, folio_mapped() returns false, and won't
unmap it. Due to extra refcount held by remove_inode_single_folio,
migration fails, restores migration entry to normal pte, and the folio is
mapped again. As a result, we triggered BUG in filemap_unaccount_folio.
The log is as follows:
BUG: Bad page cache in process hugetlb pfn:156c00
page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00
head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0
aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file"
flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff)
page_type: f4(hugetlb)
page dumped because: still mapped when deleted
CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Call Trace:
<TASK>
dump_stack_lvl+0x4f/0x70
filemap_unaccount_folio+0xc4/0x1c0
__filemap_remove_folio+0x38/0x1c0
filemap_remove_folio+0x41/0xd0
remove_inode_hugepages+0x142/0x250
hugetlbfs_fallocate+0x471/0x5a0
vfs_fallocate+0x149/0x380
Hold folio lock before checking if the folio is mapped to avold race with
migration.
Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com
Fixes: 4aae8d1c051e ("mm/hugetlbfs: unmap pages if page fault raced with hole punch")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 09d4baef29cf..be4be99304bc 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -517,14 +517,16 @@ static bool remove_inode_single_folio(struct hstate *h, struct inode *inode,
/*
* If folio is mapped, it was faulted in after being
- * unmapped in caller. Unmap (again) while holding
- * the fault mutex. The mutex will prevent faults
- * until we finish removing the folio.
+ * unmapped in caller or hugetlb_vmdelete_list() skips
+ * unmapping it due to fail to grab lock. Unmap (again)
+ * while holding the fault mutex. The mutex will prevent
+ * faults until we finish removing the folio. Hold folio
+ * lock to guarantee no concurrent migration.
*/
+ folio_lock(folio);
if (unlikely(folio_mapped(folio)))
hugetlb_unmap_file_folio(h, mapping, folio, index);
- folio_lock(folio);
/*
* We must remove the folio from page cache before removing
* the region/ reserve map (hugetlb_unreserve_pages). In
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 85e1ff61060a765d91ee62dc5606d4d547d9d105
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092954-dance-oat-7fae@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85e1ff61060a765d91ee62dc5606d4d547d9d105 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Thu, 11 Sep 2025 12:58:58 -0700
Subject: [PATCH] kmsan: fix out-of-bounds access to shadow memory
Running sha224_kunit on a KMSAN-enabled kernel results in a crash in
kmsan_internal_set_shadow_origin():
BUG: unable to handle page fault for address: ffffbc3840291000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1810067 P4D 1810067 PUD 192d067 PMD 3c17067 PTE 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 81 Comm: kunit_try_catch Tainted: G N 6.17.0-rc3 #10 PREEMPT(voluntary)
Tainted: [N]=TEST
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmsan_internal_set_shadow_origin+0x91/0x100
[...]
Call Trace:
<TASK>
__msan_memset+0xee/0x1a0
sha224_final+0x9e/0x350
test_hash_buffer_overruns+0x46f/0x5f0
? kmsan_get_shadow_origin_ptr+0x46/0xa0
? __pfx_test_hash_buffer_overruns+0x10/0x10
kunit_try_run_case+0x198/0xa00
This occurs when memset() is called on a buffer that is not 4-byte aligned
and extends to the end of a guard page, i.e. the next page is unmapped.
The bug is that the loop at the end of kmsan_internal_set_shadow_origin()
accesses the wrong shadow memory bytes when the address is not 4-byte
aligned. Since each 4 bytes are associated with an origin, it rounds the
address and size so that it can access all the origins that contain the
buffer. However, when it checks the corresponding shadow bytes for a
particular origin, it incorrectly uses the original unrounded shadow
address. This results in reads from shadow memory beyond the end of the
buffer's shadow memory, which crashes when that memory is not mapped.
To fix this, correctly align the shadow address before accessing the 4
shadow bytes corresponding to each origin.
Link: https://lkml.kernel.org/r/20250911195858.394235-1-ebiggers@kernel.org
Fixes: 2ef3cec44c60 ("kmsan: do not wipe out origin when doing partial unpoisoning")
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
Tested-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Alexander Potapenko <glider(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/kmsan/core.c b/mm/kmsan/core.c
index 1ea711786c52..8bca7fece47f 100644
--- a/mm/kmsan/core.c
+++ b/mm/kmsan/core.c
@@ -195,7 +195,8 @@ void kmsan_internal_set_shadow_origin(void *addr, size_t size, int b,
u32 origin, bool checked)
{
u64 address = (u64)addr;
- u32 *shadow_start, *origin_start;
+ void *shadow_start;
+ u32 *aligned_shadow, *origin_start;
size_t pad = 0;
KMSAN_WARN_ON(!kmsan_metadata_is_contiguous(addr, size));
@@ -214,9 +215,12 @@ void kmsan_internal_set_shadow_origin(void *addr, size_t size, int b,
}
__memset(shadow_start, b, size);
- if (!IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ if (IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ aligned_shadow = shadow_start;
+ } else {
pad = address % KMSAN_ORIGIN_SIZE;
address -= pad;
+ aligned_shadow = shadow_start - pad;
size += pad;
}
size = ALIGN(size, KMSAN_ORIGIN_SIZE);
@@ -230,7 +234,7 @@ void kmsan_internal_set_shadow_origin(void *addr, size_t size, int b,
* corresponding shadow slot is zero.
*/
for (int i = 0; i < size / KMSAN_ORIGIN_SIZE; i++) {
- if (origin || !shadow_start[i])
+ if (origin || !aligned_shadow[i])
origin_start[i] = origin;
}
}
diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index c6c5b2bbede0..902ec48b1e3e 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -556,6 +556,21 @@ DEFINE_TEST_MEMSETXX(16)
DEFINE_TEST_MEMSETXX(32)
DEFINE_TEST_MEMSETXX(64)
+/* Test case: ensure that KMSAN does not access shadow memory out of bounds. */
+static void test_memset_on_guarded_buffer(struct kunit *test)
+{
+ void *buf = vmalloc(PAGE_SIZE);
+
+ kunit_info(test,
+ "memset() on ends of guarded buffer should not crash\n");
+
+ for (size_t size = 0; size <= 128; size++) {
+ memset(buf, 0xff, size);
+ memset(buf + PAGE_SIZE - size, 0xff, size);
+ }
+ vfree(buf);
+}
+
static noinline void fibonacci(int *array, int size, int start)
{
if (start < 2 || (start == size))
@@ -677,6 +692,7 @@ static struct kunit_case kmsan_test_cases[] = {
KUNIT_CASE(test_memset16),
KUNIT_CASE(test_memset32),
KUNIT_CASE(test_memset64),
+ KUNIT_CASE(test_memset_on_guarded_buffer),
KUNIT_CASE(test_long_origin_chain),
KUNIT_CASE(test_stackdepot_roundtrip),
KUNIT_CASE(test_unpoison_memory),
Hi ,
Our verified database enables accurate outreach to Executive Assistants and high-net-worth individuals.
Executive Assistants (by region):
USA : 50,000 contacts
Europe : 15,000 contacts
Canada : 2,000 contacts
Middle East : 2,500 contacts
HNWI & Senior Decision-Makers (by region, incl. EAs):
USA : 500,000 contacts
Europe : 50,000 contacts
Canada : 10,000 contacts
UAE : 7,500 contacts
Titles we cover: Business Owners, Founders, Entrepreneurs, C-Level Executives, VPs, and Executive Assistants.
Data fields: Name, Job Title, Company, URL, Email, Revenue and more.
This list helps reach gatekeepers and decision-makers who oversee charter service partnerships.
Happy to share prices if that helps.
Eager to receive your feedback.
Regards
Brenda
Marketing Manager
Prospect Tech Connect.,
Please reply with REMOVE if you don't wish to receive further emails
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x c6ccc4dde17676dfe617b9a37bd9ba19a8fc87ee
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092939-unsafe-alfalfa-105c@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c6ccc4dde17676dfe617b9a37bd9ba19a8fc87ee Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Sat, 20 Sep 2025 22:09:55 +0200
Subject: [PATCH] gpiolib: Extend software-node support to support secondary
software-nodes
When a software-node gets added to a device which already has another
fwnode as primary node it will become the secondary fwnode for that
device.
Currently if a software-node with GPIO properties ends up as the secondary
fwnode then gpiod_find_by_fwnode() will fail to find the GPIOs.
Add a new gpiod_fwnode_lookup() helper which falls back to calling
gpiod_find_by_fwnode() with the secondary fwnode if the GPIO was not
found in the primary fwnode.
Fixes: e7f9ff5dc90c ("gpiolib: add support for software nodes")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Link: https://lore.kernel.org/r/20250920200955.20403-1-hansg@kernel.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 0d2b470a252e..74d54513730a 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -4604,6 +4604,23 @@ static struct gpio_desc *gpiod_find_by_fwnode(struct fwnode_handle *fwnode,
return desc;
}
+static struct gpio_desc *gpiod_fwnode_lookup(struct fwnode_handle *fwnode,
+ struct device *consumer,
+ const char *con_id,
+ unsigned int idx,
+ enum gpiod_flags *flags,
+ unsigned long *lookupflags)
+{
+ struct gpio_desc *desc;
+
+ desc = gpiod_find_by_fwnode(fwnode, consumer, con_id, idx, flags, lookupflags);
+ if (gpiod_not_found(desc) && !IS_ERR_OR_NULL(fwnode))
+ desc = gpiod_find_by_fwnode(fwnode->secondary, consumer, con_id,
+ idx, flags, lookupflags);
+
+ return desc;
+}
+
struct gpio_desc *gpiod_find_and_request(struct device *consumer,
struct fwnode_handle *fwnode,
const char *con_id,
@@ -4622,8 +4639,8 @@ struct gpio_desc *gpiod_find_and_request(struct device *consumer,
int ret = 0;
scoped_guard(srcu, &gpio_devices_srcu) {
- desc = gpiod_find_by_fwnode(fwnode, consumer, con_id, idx,
- &flags, &lookupflags);
+ desc = gpiod_fwnode_lookup(fwnode, consumer, con_id, idx,
+ &flags, &lookupflags);
if (gpiod_not_found(desc) && platform_lookup_allowed) {
/*
* Either we are not using DT or ACPI, or their lookup
This series backports 19 patches to update minmax.h in the 6.1.y branch,
aligning it with v6.17-rc7.
The ultimate goal is to synchronize all longterm branches so that they
include the full set of minmax.h changes.
Previous work to update 6.12.48:
https://lore.kernel.org/stable/20250922103123.14538-1-farbere@amazon.com/T/…
and 6.6.107:
https://lore.kernel.org/stable/20250922103241.16213-1-farbere@amazon.com/T/…
The key motivation is to bring in commit d03eba99f5bf ("minmax: allow
min()/max()/clamp() if the arguments have the same signedness"), which
is missing in older kernels.
In mainline, this change enables min()/max()/clamp() to accept mixed
argument types, provided both have the same signedness. Without it,
backported patches that use these forms may trigger compiler warnings,
which escalate to build failures when -Werror is enabled.
Andy Shevchenko (1):
minmax: deduplicate __unconst_integer_typeof()
David Laight (8):
minmax: fix indentation of __cmp_once() and __clamp_once()
minmax.h: add whitespace around operators and after commas
minmax.h: update some comments
minmax.h: reduce the #define expansion of min(), max() and clamp()
minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
minmax.h: move all the clamp() definitions after the min/max() ones
minmax.h: simplify the variants of clamp()
minmax.h: remove some #defines that are only expanded once
Herve Codina (1):
minmax: Introduce {min,max}_array()
Linus Torvalds (8):
minmax: avoid overly complicated constant expressions in VM code
minmax: simplify and clarify min_t()/max_t() implementation
minmax: make generic MIN() and MAX() macros available everywhere
minmax: add a few more MIN_T/MAX_T users
minmax: simplify min()/max()/clamp() implementation
minmax: don't use max() in situations that want a C constant
expression
minmax: improve macro expansion and type checking
minmax: fix up min3() and max3() too
Matthew Wilcox (Oracle) (1):
minmax: add in_range() macro
arch/arm/mm/pageattr.c | 6 +-
arch/um/drivers/mconsole_user.c | 2 +
arch/x86/mm/pgtable.c | 2 +-
drivers/edac/sb_edac.c | 4 +-
drivers/edac/skx_common.h | 1 -
.../drm/amd/display/modules/hdcp/hdcp_ddc.c | 2 +
.../drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 14 +-
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 +-
.../drm/arm/display/include/malidp_utils.h | 2 +-
.../display/komeda/komeda_pipeline_state.c | 24 +-
drivers/gpu/drm/drm_color_mgmt.c | 2 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 6 -
drivers/gpu/drm/radeon/evergreen_cs.c | 2 +
drivers/hwmon/adt7475.c | 24 +-
drivers/input/touchscreen/cyttsp4_core.c | 2 +-
drivers/irqchip/irq-sun6i-r.c | 2 +-
drivers/md/dm-integrity.c | 2 +-
drivers/media/dvb-frontends/stv0367_priv.h | 3 +
.../net/ethernet/chelsio/cxgb3/cxgb3_main.c | 18 +-
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
drivers/net/fjes/fjes_main.c | 4 +-
drivers/nfc/pn544/i2c.c | 2 -
drivers/platform/x86/sony-laptop.c | 1 -
drivers/scsi/isci/init.c | 6 +-
.../pci/hive_isp_css_include/math_support.h | 5 -
drivers/virt/acrn/ioreq.c | 4 +-
fs/btrfs/misc.h | 2 -
fs/btrfs/tree-checker.c | 2 +-
fs/ext2/balloc.c | 2 -
fs/ext4/ext4.h | 2 -
fs/ufs/util.h | 6 -
include/linux/compiler.h | 9 +
include/linux/minmax.h | 264 +++++++++++++-----
include/linux/pageblock-flags.h | 2 +-
kernel/trace/preemptirq_delay_test.c | 2 -
lib/btree.c | 1 -
lib/decompress_unlzma.c | 2 +
lib/logic_pio.c | 3 -
lib/vsprintf.c | 2 +-
mm/zsmalloc.c | 1 -
net/ipv4/proc.c | 2 +-
net/ipv6/proc.c | 2 +-
net/netfilter/nf_nat_core.c | 6 +-
net/tipc/core.h | 2 +-
net/tipc/link.c | 10 +-
.../selftests/bpf/progs/get_branch_snapshot.c | 4 +-
tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +
tools/testing/selftests/vm/mremap_test.c | 2 +
48 files changed, 290 insertions(+), 184 deletions(-)
--
2.47.3
Good day,
Are you looking to acquire the verified attendees database for International Association of Amusement Parks and Attractions - IAAPA Expo 2025?
Who Attends - Director of Operations, Operations Manager, General Manager, Park Manager, Attraction Manager, Ride Engineer, Entertainment Manager, Event Coordinator, Marketing Manager, Sales Manager, Business Development Manager, Guest Services Manager, Food and Beverage Manager, Retail Manager, Safety Manager, Maintenance Manager, Facility Manager, Purchasing Manager, Consultant, Vendor Relations Manager
List Contains - Business name, URL, Contact number, Job title, Industry Type and etc.…
Reply with “Send Cost Details” for more information.
Thanks,
Norma Richard | Event Coordinator
If you don’t want to receive further email revert back as “ Take Out”
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 4e034bf045b12852a24d5d33f2451850818ba0c1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092933-conform-unclaimed-d167@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4e034bf045b12852a24d5d33f2451850818ba0c1 Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg(a)ziepe.ca>
Date: Tue, 16 Sep 2025 20:47:10 -0300
Subject: [PATCH] iommufd: Fix race during abort for file descriptors
fput() doesn't actually call file_operations release() synchronously, it
puts the file on a work queue and it will be released eventually.
This is normally fine, except for iommufd the file and the iommufd_object
are tied to gether. The file has the object as it's private_data and holds
a users refcount, while the object is expected to remain alive as long as
the file is.
When the allocation of a new object aborts before installing the file it
will fput() the file and then go on to immediately kfree() the obj. This
causes a UAF once the workqueue completes the fput() and tries to
decrement the users refcount.
Fix this by putting the core code in charge of the file lifetime, and call
__fput_sync() during abort to ensure that release() is called before
kfree. __fput_sync() is a bit too tricky to open code in all the object
implementations. Instead the objects tell the core code where the file
pointer is and the core will take care of the life cycle.
If the object is successfully allocated then the file will hold a users
refcount and the iommufd_object cannot be destroyed.
It is worth noting that close(); ioctl(IOMMU_DESTROY); doesn't have an
issue because close() is already using a synchronous version of fput().
The UAF looks like this:
BUG: KASAN: slab-use-after-free in iommufd_eventq_fops_release+0x45/0xc0 drivers/iommu/iommufd/eventq.c:376
Write of size 4 at addr ffff888059c97804 by task syz.0.46/6164
CPU: 0 UID: 0 PID: 6164 Comm: syz.0.46 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xcd/0x630 mm/kasan/report.c:482
kasan_report+0xe0/0x110 mm/kasan/report.c:595
check_region_inline mm/kasan/generic.c:183 [inline]
kasan_check_range+0x100/0x1b0 mm/kasan/generic.c:189
instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:400 [inline]
__refcount_dec include/linux/refcount.h:455 [inline]
refcount_dec include/linux/refcount.h:476 [inline]
iommufd_eventq_fops_release+0x45/0xc0 drivers/iommu/iommufd/eventq.c:376
__fput+0x402/0xb70 fs/file_table.c:468
task_work_run+0x14d/0x240 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop+0xeb/0x110 kernel/entry/common.c:43
exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
do_syscall_64+0x41c/0x4c0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Link: https://patch.msgid.link/r/1-v1-02cd136829df+31-iommufd_syz_fput_jgg@nvidia…
Cc: stable(a)vger.kernel.org
Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object")
Reviewed-by: Nicolin Chen <nicolinc(a)nvidia.com>
Reviewed-by: Nirmoy Das <nirmoyd(a)nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian(a)intel.com>
Tested-by: Nicolin Chen <nicolinc(a)nvidia.com>
Reported-by: syzbot+80620e2d0d0a33b09f93(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/68c8583d.050a0220.2ff435.03a2.GAE@google.com
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
diff --git a/drivers/iommu/iommufd/eventq.c b/drivers/iommu/iommufd/eventq.c
index fc4de63b0bce..e23d9ee4fe38 100644
--- a/drivers/iommu/iommufd/eventq.c
+++ b/drivers/iommu/iommufd/eventq.c
@@ -393,12 +393,12 @@ static int iommufd_eventq_init(struct iommufd_eventq *eventq, char *name,
const struct file_operations *fops)
{
struct file *filep;
- int fdno;
spin_lock_init(&eventq->lock);
INIT_LIST_HEAD(&eventq->deliver);
init_waitqueue_head(&eventq->wait_queue);
+ /* The filep is fput() by the core code during failure */
filep = anon_inode_getfile(name, fops, eventq, O_RDWR);
if (IS_ERR(filep))
return PTR_ERR(filep);
@@ -408,10 +408,7 @@ static int iommufd_eventq_init(struct iommufd_eventq *eventq, char *name,
eventq->filep = filep;
refcount_inc(&eventq->obj.users);
- fdno = get_unused_fd_flags(O_CLOEXEC);
- if (fdno < 0)
- fput(filep);
- return fdno;
+ return get_unused_fd_flags(O_CLOEXEC);
}
static const struct file_operations iommufd_fault_fops =
@@ -452,7 +449,6 @@ int iommufd_fault_alloc(struct iommufd_ucmd *ucmd)
return 0;
out_put_fdno:
put_unused_fd(fdno);
- fput(fault->common.filep);
return rc;
}
@@ -536,7 +532,6 @@ int iommufd_veventq_alloc(struct iommufd_ucmd *ucmd)
out_put_fdno:
put_unused_fd(fdno);
- fput(veventq->common.filep);
out_abort:
iommufd_object_abort_and_destroy(ucmd->ictx, &veventq->common.obj);
out_unlock_veventqs:
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index a9d4decc8ba1..88be2e157245 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -23,6 +23,7 @@
#include "iommufd_test.h"
struct iommufd_object_ops {
+ size_t file_offset;
void (*pre_destroy)(struct iommufd_object *obj);
void (*destroy)(struct iommufd_object *obj);
void (*abort)(struct iommufd_object *obj);
@@ -131,10 +132,30 @@ void iommufd_object_abort(struct iommufd_ctx *ictx, struct iommufd_object *obj)
void iommufd_object_abort_and_destroy(struct iommufd_ctx *ictx,
struct iommufd_object *obj)
{
- if (iommufd_object_ops[obj->type].abort)
- iommufd_object_ops[obj->type].abort(obj);
+ const struct iommufd_object_ops *ops = &iommufd_object_ops[obj->type];
+
+ if (ops->file_offset) {
+ struct file **filep = ((void *)obj) + ops->file_offset;
+
+ /*
+ * A file should hold a users refcount while the file is open
+ * and put it back in its release. The file should hold a
+ * pointer to obj in their private data. Normal fput() is
+ * deferred to a workqueue and can get out of order with the
+ * following kfree(obj). Using the sync version ensures the
+ * release happens immediately. During abort we require the file
+ * refcount is one at this point - meaning the object alloc
+ * function cannot do anything to allow another thread to take a
+ * refcount prior to a guaranteed success.
+ */
+ if (*filep)
+ __fput_sync(*filep);
+ }
+
+ if (ops->abort)
+ ops->abort(obj);
else
- iommufd_object_ops[obj->type].destroy(obj);
+ ops->destroy(obj);
iommufd_object_abort(ictx, obj);
}
@@ -659,6 +680,12 @@ void iommufd_ctx_put(struct iommufd_ctx *ictx)
}
EXPORT_SYMBOL_NS_GPL(iommufd_ctx_put, "IOMMUFD");
+#define IOMMUFD_FILE_OFFSET(_struct, _filep, _obj) \
+ .file_offset = (offsetof(_struct, _filep) + \
+ BUILD_BUG_ON_ZERO(!__same_type( \
+ struct file *, ((_struct *)NULL)->_filep)) + \
+ BUILD_BUG_ON_ZERO(offsetof(_struct, _obj)))
+
static const struct iommufd_object_ops iommufd_object_ops[] = {
[IOMMUFD_OBJ_ACCESS] = {
.destroy = iommufd_access_destroy_object,
@@ -669,6 +696,7 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
},
[IOMMUFD_OBJ_FAULT] = {
.destroy = iommufd_fault_destroy,
+ IOMMUFD_FILE_OFFSET(struct iommufd_fault, common.filep, common.obj),
},
[IOMMUFD_OBJ_HW_QUEUE] = {
.destroy = iommufd_hw_queue_destroy,
@@ -691,6 +719,7 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
[IOMMUFD_OBJ_VEVENTQ] = {
.destroy = iommufd_veventq_destroy,
.abort = iommufd_veventq_abort,
+ IOMMUFD_FILE_OFFSET(struct iommufd_veventq, common.filep, common.obj),
},
[IOMMUFD_OBJ_VIOMMU] = {
.destroy = iommufd_viommu_destroy,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x eac04428abe9f9cb203ffae4600791ea1d24eb18
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092927-captivate-suspense-fdf7@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From eac04428abe9f9cb203ffae4600791ea1d24eb18 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:17 +0200
Subject: [PATCH] i40e: add mask to apply valid bits for itr_idx
The ITR index (itr_idx) is only 2 bits wide. When constructing the
register value for QINT_RQCTL, all fields are ORed together. Without
masking, higher bits from itr_idx may overwrite adjacent fields in the
register.
Apply I40E_QINT_RQCTL_ITR_INDX_MASK to ensure only the intended bits are
set.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index f29941c00342..f9b2197f0942 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -448,7 +448,7 @@ static void i40e_config_irq_link_list(struct i40e_vf *vf, u16 vsi_id,
(qtype << I40E_QINT_RQCTL_NEXTQ_TYPE_SHIFT) |
(pf_queue_id << I40E_QINT_RQCTL_NEXTQ_INDX_SHIFT) |
BIT(I40E_QINT_RQCTL_CAUSE_ENA_SHIFT) |
- (itr_idx << I40E_QINT_RQCTL_ITR_INDX_SHIFT);
+ FIELD_PREP(I40E_QINT_RQCTL_ITR_INDX_MASK, itr_idx);
wr32(hw, reg_idx, reg);
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x c7c31f8dc54aa3c9b2c994b5f1ff7e740a654e97
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092921-consensus-mystified-6396@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c7c31f8dc54aa3c9b2c994b5f1ff7e740a654e97 Mon Sep 17 00:00:00 2001
From: Nirmoy Das <nirmoyd(a)nvidia.com>
Date: Wed, 17 Sep 2025 12:43:46 -0700
Subject: [PATCH] drm/ast: Use msleep instead of mdelay for edid read
The busy-waiting in `mdelay()` can cause CPU stalls and kernel timeouts
during boot.
Signed-off-by: Nirmoy Das <nirmoyd(a)nvidia.com>
Reviewed-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Tested-by: Carol L Soto csoto(a)nvidia.com<mailto:csoto@nvidia.com>
Fixes: 594e9c04b586 ("drm/ast: Create the driver for ASPEED proprietory Display-Port")
Cc: KuoHsiang Chou <kuohsiang_chou(a)aspeedtech.com>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Dave Airlie <airlied(a)redhat.com>
Cc: Jocelyn Falempe <jfalempe(a)redhat.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v5.19+
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Link: https://lore.kernel.org/r/20250917194346.2905522-1-nirmoyd@nvidia.com
diff --git a/drivers/gpu/drm/ast/ast_dp.c b/drivers/gpu/drm/ast/ast_dp.c
index 19c04687b0fe..8e650a02c528 100644
--- a/drivers/gpu/drm/ast/ast_dp.c
+++ b/drivers/gpu/drm/ast/ast_dp.c
@@ -134,7 +134,7 @@ static int ast_astdp_read_edid_block(void *data, u8 *buf, unsigned int block, si
* 3. The Delays are often longer a lot when system resume from S3/S4.
*/
if (j)
- mdelay(j + 1);
+ msleep(j + 1);
/* Wait for EDID offset to show up in mirror register */
vgacrd7 = ast_get_index_reg(ast, AST_IO_VGACRI, 0xd7);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x c7c31f8dc54aa3c9b2c994b5f1ff7e740a654e97
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092920-backspin-glade-7b6c@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c7c31f8dc54aa3c9b2c994b5f1ff7e740a654e97 Mon Sep 17 00:00:00 2001
From: Nirmoy Das <nirmoyd(a)nvidia.com>
Date: Wed, 17 Sep 2025 12:43:46 -0700
Subject: [PATCH] drm/ast: Use msleep instead of mdelay for edid read
The busy-waiting in `mdelay()` can cause CPU stalls and kernel timeouts
during boot.
Signed-off-by: Nirmoy Das <nirmoyd(a)nvidia.com>
Reviewed-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Tested-by: Carol L Soto csoto(a)nvidia.com<mailto:csoto@nvidia.com>
Fixes: 594e9c04b586 ("drm/ast: Create the driver for ASPEED proprietory Display-Port")
Cc: KuoHsiang Chou <kuohsiang_chou(a)aspeedtech.com>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Dave Airlie <airlied(a)redhat.com>
Cc: Jocelyn Falempe <jfalempe(a)redhat.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v5.19+
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Link: https://lore.kernel.org/r/20250917194346.2905522-1-nirmoyd@nvidia.com
diff --git a/drivers/gpu/drm/ast/ast_dp.c b/drivers/gpu/drm/ast/ast_dp.c
index 19c04687b0fe..8e650a02c528 100644
--- a/drivers/gpu/drm/ast/ast_dp.c
+++ b/drivers/gpu/drm/ast/ast_dp.c
@@ -134,7 +134,7 @@ static int ast_astdp_read_edid_block(void *data, u8 *buf, unsigned int block, si
* 3. The Delays are often longer a lot when system resume from S3/S4.
*/
if (j)
- mdelay(j + 1);
+ msleep(j + 1);
/* Wait for EDID offset to show up in mirror register */
vgacrd7 = ast_get_index_reg(ast, AST_IO_VGACRI, 0xd7);
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x 4e034bf045b12852a24d5d33f2451850818ba0c1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092931-uneven-never-0b1f@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4e034bf045b12852a24d5d33f2451850818ba0c1 Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg(a)ziepe.ca>
Date: Tue, 16 Sep 2025 20:47:10 -0300
Subject: [PATCH] iommufd: Fix race during abort for file descriptors
fput() doesn't actually call file_operations release() synchronously, it
puts the file on a work queue and it will be released eventually.
This is normally fine, except for iommufd the file and the iommufd_object
are tied to gether. The file has the object as it's private_data and holds
a users refcount, while the object is expected to remain alive as long as
the file is.
When the allocation of a new object aborts before installing the file it
will fput() the file and then go on to immediately kfree() the obj. This
causes a UAF once the workqueue completes the fput() and tries to
decrement the users refcount.
Fix this by putting the core code in charge of the file lifetime, and call
__fput_sync() during abort to ensure that release() is called before
kfree. __fput_sync() is a bit too tricky to open code in all the object
implementations. Instead the objects tell the core code where the file
pointer is and the core will take care of the life cycle.
If the object is successfully allocated then the file will hold a users
refcount and the iommufd_object cannot be destroyed.
It is worth noting that close(); ioctl(IOMMU_DESTROY); doesn't have an
issue because close() is already using a synchronous version of fput().
The UAF looks like this:
BUG: KASAN: slab-use-after-free in iommufd_eventq_fops_release+0x45/0xc0 drivers/iommu/iommufd/eventq.c:376
Write of size 4 at addr ffff888059c97804 by task syz.0.46/6164
CPU: 0 UID: 0 PID: 6164 Comm: syz.0.46 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xcd/0x630 mm/kasan/report.c:482
kasan_report+0xe0/0x110 mm/kasan/report.c:595
check_region_inline mm/kasan/generic.c:183 [inline]
kasan_check_range+0x100/0x1b0 mm/kasan/generic.c:189
instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:400 [inline]
__refcount_dec include/linux/refcount.h:455 [inline]
refcount_dec include/linux/refcount.h:476 [inline]
iommufd_eventq_fops_release+0x45/0xc0 drivers/iommu/iommufd/eventq.c:376
__fput+0x402/0xb70 fs/file_table.c:468
task_work_run+0x14d/0x240 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop+0xeb/0x110 kernel/entry/common.c:43
exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
do_syscall_64+0x41c/0x4c0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Link: https://patch.msgid.link/r/1-v1-02cd136829df+31-iommufd_syz_fput_jgg@nvidia…
Cc: stable(a)vger.kernel.org
Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object")
Reviewed-by: Nicolin Chen <nicolinc(a)nvidia.com>
Reviewed-by: Nirmoy Das <nirmoyd(a)nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian(a)intel.com>
Tested-by: Nicolin Chen <nicolinc(a)nvidia.com>
Reported-by: syzbot+80620e2d0d0a33b09f93(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/68c8583d.050a0220.2ff435.03a2.GAE@google.com
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
diff --git a/drivers/iommu/iommufd/eventq.c b/drivers/iommu/iommufd/eventq.c
index fc4de63b0bce..e23d9ee4fe38 100644
--- a/drivers/iommu/iommufd/eventq.c
+++ b/drivers/iommu/iommufd/eventq.c
@@ -393,12 +393,12 @@ static int iommufd_eventq_init(struct iommufd_eventq *eventq, char *name,
const struct file_operations *fops)
{
struct file *filep;
- int fdno;
spin_lock_init(&eventq->lock);
INIT_LIST_HEAD(&eventq->deliver);
init_waitqueue_head(&eventq->wait_queue);
+ /* The filep is fput() by the core code during failure */
filep = anon_inode_getfile(name, fops, eventq, O_RDWR);
if (IS_ERR(filep))
return PTR_ERR(filep);
@@ -408,10 +408,7 @@ static int iommufd_eventq_init(struct iommufd_eventq *eventq, char *name,
eventq->filep = filep;
refcount_inc(&eventq->obj.users);
- fdno = get_unused_fd_flags(O_CLOEXEC);
- if (fdno < 0)
- fput(filep);
- return fdno;
+ return get_unused_fd_flags(O_CLOEXEC);
}
static const struct file_operations iommufd_fault_fops =
@@ -452,7 +449,6 @@ int iommufd_fault_alloc(struct iommufd_ucmd *ucmd)
return 0;
out_put_fdno:
put_unused_fd(fdno);
- fput(fault->common.filep);
return rc;
}
@@ -536,7 +532,6 @@ int iommufd_veventq_alloc(struct iommufd_ucmd *ucmd)
out_put_fdno:
put_unused_fd(fdno);
- fput(veventq->common.filep);
out_abort:
iommufd_object_abort_and_destroy(ucmd->ictx, &veventq->common.obj);
out_unlock_veventqs:
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index a9d4decc8ba1..88be2e157245 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -23,6 +23,7 @@
#include "iommufd_test.h"
struct iommufd_object_ops {
+ size_t file_offset;
void (*pre_destroy)(struct iommufd_object *obj);
void (*destroy)(struct iommufd_object *obj);
void (*abort)(struct iommufd_object *obj);
@@ -131,10 +132,30 @@ void iommufd_object_abort(struct iommufd_ctx *ictx, struct iommufd_object *obj)
void iommufd_object_abort_and_destroy(struct iommufd_ctx *ictx,
struct iommufd_object *obj)
{
- if (iommufd_object_ops[obj->type].abort)
- iommufd_object_ops[obj->type].abort(obj);
+ const struct iommufd_object_ops *ops = &iommufd_object_ops[obj->type];
+
+ if (ops->file_offset) {
+ struct file **filep = ((void *)obj) + ops->file_offset;
+
+ /*
+ * A file should hold a users refcount while the file is open
+ * and put it back in its release. The file should hold a
+ * pointer to obj in their private data. Normal fput() is
+ * deferred to a workqueue and can get out of order with the
+ * following kfree(obj). Using the sync version ensures the
+ * release happens immediately. During abort we require the file
+ * refcount is one at this point - meaning the object alloc
+ * function cannot do anything to allow another thread to take a
+ * refcount prior to a guaranteed success.
+ */
+ if (*filep)
+ __fput_sync(*filep);
+ }
+
+ if (ops->abort)
+ ops->abort(obj);
else
- iommufd_object_ops[obj->type].destroy(obj);
+ ops->destroy(obj);
iommufd_object_abort(ictx, obj);
}
@@ -659,6 +680,12 @@ void iommufd_ctx_put(struct iommufd_ctx *ictx)
}
EXPORT_SYMBOL_NS_GPL(iommufd_ctx_put, "IOMMUFD");
+#define IOMMUFD_FILE_OFFSET(_struct, _filep, _obj) \
+ .file_offset = (offsetof(_struct, _filep) + \
+ BUILD_BUG_ON_ZERO(!__same_type( \
+ struct file *, ((_struct *)NULL)->_filep)) + \
+ BUILD_BUG_ON_ZERO(offsetof(_struct, _obj)))
+
static const struct iommufd_object_ops iommufd_object_ops[] = {
[IOMMUFD_OBJ_ACCESS] = {
.destroy = iommufd_access_destroy_object,
@@ -669,6 +696,7 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
},
[IOMMUFD_OBJ_FAULT] = {
.destroy = iommufd_fault_destroy,
+ IOMMUFD_FILE_OFFSET(struct iommufd_fault, common.filep, common.obj),
},
[IOMMUFD_OBJ_HW_QUEUE] = {
.destroy = iommufd_hw_queue_destroy,
@@ -691,6 +719,7 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
[IOMMUFD_OBJ_VEVENTQ] = {
.destroy = iommufd_veventq_destroy,
.abort = iommufd_veventq_abort,
+ IOMMUFD_FILE_OFFSET(struct iommufd_veventq, common.filep, common.obj),
},
[IOMMUFD_OBJ_VIOMMU] = {
.destroy = iommufd_viommu_destroy,
In of_unittest_pci_node_verify(), when the add parameter is false,
device_find_any_child() obtains a reference to a child device. This
function implicitly calls get_device() to increment the device's
reference count before returning the pointer. However, the caller
fails to properly release this reference by calling put_device(),
leading to a device reference count leak.
As the comment of device_find_any_child states: "NOTE: you will need
to drop the reference with put_device() after use".
Cc: stable(a)vger.kernel.org
Fixes: 26409dd04589 ("of: unittest: Add pci_dt_testdrv pci driver")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/of/unittest.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index e3503ec20f6c..d225e73781fe 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -4271,7 +4271,7 @@ static struct platform_driver unittest_pci_driver = {
static int of_unittest_pci_node_verify(struct pci_dev *pdev, bool add)
{
struct device_node *pnp, *np = NULL;
- struct device *child_dev;
+ struct device *child_dev = NULL;
char *path = NULL;
const __be32 *reg;
int rc = 0;
@@ -4306,6 +4306,8 @@ static int of_unittest_pci_node_verify(struct pci_dev *pdev, bool add)
kfree(path);
if (np)
of_node_put(np);
+ if (child_dev)
+ put_device(child_dev);
return rc;
}
--
2.17.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 877b7e6ffc23766448236e8732254534c518ba42
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092913-monotype-pouch-f2e0@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 877b7e6ffc23766448236e8732254534c518ba42 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:15 +0200
Subject: [PATCH] i40e: fix validation of VF state in get resources
VF state I40E_VF_STATE_ACTIVE is not the only state in which
VF is actually active so it should not be used to determine
if a VF is allowed to obtain resources.
Use I40E_VF_STATE_RESOURCES_LOADED that is set only in
i40e_vc_get_vf_resources_msg() and cleared during reset.
Fixes: 61125b8be85d ("i40e: Fix failed opcode appearing if handling messages from VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index c85715f75435..5ef3dc43a8a0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1464,6 +1464,7 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
* functions that may still be running at this point.
*/
clear_bit(I40E_VF_STATE_INIT, &vf->vf_states);
+ clear_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states);
/* In the case of a VFLR, the HW has already reset the VF and we
* just need to clean up, so don't hit the VFRTRIG register.
@@ -2130,7 +2131,10 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
size_t len = 0;
int ret;
- if (!i40e_sync_vf_state(vf, I40E_VF_STATE_INIT)) {
+ i40e_sync_vf_state(vf, I40E_VF_STATE_INIT);
+
+ if (!test_bit(I40E_VF_STATE_INIT, &vf->vf_states) ||
+ test_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states)) {
aq_ret = -EINVAL;
goto err;
}
@@ -2233,6 +2237,7 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
vf->default_lan_addr.addr);
}
set_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
+ set_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states);
err:
/* send the response back to the VF */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
index 5cf74f16f433..f558b45725c8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
@@ -41,7 +41,8 @@ enum i40e_vf_states {
I40E_VF_STATE_MC_PROMISC,
I40E_VF_STATE_UC_PROMISC,
I40E_VF_STATE_PRE_ENABLE,
- I40E_VF_STATE_RESETTING
+ I40E_VF_STATE_RESETTING,
+ I40E_VF_STATE_RESOURCES_LOADED,
};
/* VF capabilities */
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x f1ad24c5abe1eaef69158bac1405a74b3c365115
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092937-trash-dainty-32e4@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f1ad24c5abe1eaef69158bac1405a74b3c365115 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:13 +0200
Subject: [PATCH] i40e: fix idx validation in config queues msg
Ensure idx is within range of active/initialized TCs when iterating over
vf->ch[idx] in i40e_vc_config_queues_msg().
Fixes: c27eac48160d ("i40e: Enable ADq and create queue channel/s on VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Kamakshi Nellore <nellorex.kamakshi(a)intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 1c4f86221255..b6db4d78c02d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2395,7 +2395,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg)
}
if (vf->adq_enabled) {
- if (idx >= ARRAY_SIZE(vf->ch)) {
+ if (idx >= vf->num_tc) {
aq_ret = -ENODEV;
goto error_param;
}
@@ -2416,7 +2416,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg)
* to its appropriate VSIs based on TC mapping
*/
if (vf->adq_enabled) {
- if (idx >= ARRAY_SIZE(vf->ch)) {
+ if (idx >= vf->num_tc) {
aq_ret = -ENODEV;
goto error_param;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 877b7e6ffc23766448236e8732254534c518ba42
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092912-bullfight-urgency-ae9f@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 877b7e6ffc23766448236e8732254534c518ba42 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:15 +0200
Subject: [PATCH] i40e: fix validation of VF state in get resources
VF state I40E_VF_STATE_ACTIVE is not the only state in which
VF is actually active so it should not be used to determine
if a VF is allowed to obtain resources.
Use I40E_VF_STATE_RESOURCES_LOADED that is set only in
i40e_vc_get_vf_resources_msg() and cleared during reset.
Fixes: 61125b8be85d ("i40e: Fix failed opcode appearing if handling messages from VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index c85715f75435..5ef3dc43a8a0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1464,6 +1464,7 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
* functions that may still be running at this point.
*/
clear_bit(I40E_VF_STATE_INIT, &vf->vf_states);
+ clear_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states);
/* In the case of a VFLR, the HW has already reset the VF and we
* just need to clean up, so don't hit the VFRTRIG register.
@@ -2130,7 +2131,10 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
size_t len = 0;
int ret;
- if (!i40e_sync_vf_state(vf, I40E_VF_STATE_INIT)) {
+ i40e_sync_vf_state(vf, I40E_VF_STATE_INIT);
+
+ if (!test_bit(I40E_VF_STATE_INIT, &vf->vf_states) ||
+ test_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states)) {
aq_ret = -EINVAL;
goto err;
}
@@ -2233,6 +2237,7 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
vf->default_lan_addr.addr);
}
set_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
+ set_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states);
err:
/* send the response back to the VF */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
index 5cf74f16f433..f558b45725c8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
@@ -41,7 +41,8 @@ enum i40e_vf_states {
I40E_VF_STATE_MC_PROMISC,
I40E_VF_STATE_UC_PROMISC,
I40E_VF_STATE_PRE_ENABLE,
- I40E_VF_STATE_RESETTING
+ I40E_VF_STATE_RESETTING,
+ I40E_VF_STATE_RESOURCES_LOADED,
};
/* VF capabilities */
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x f1ad24c5abe1eaef69158bac1405a74b3c365115
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092937-unrushed-recharger-4598@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f1ad24c5abe1eaef69158bac1405a74b3c365115 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:13 +0200
Subject: [PATCH] i40e: fix idx validation in config queues msg
Ensure idx is within range of active/initialized TCs when iterating over
vf->ch[idx] in i40e_vc_config_queues_msg().
Fixes: c27eac48160d ("i40e: Enable ADq and create queue channel/s on VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Kamakshi Nellore <nellorex.kamakshi(a)intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 1c4f86221255..b6db4d78c02d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2395,7 +2395,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg)
}
if (vf->adq_enabled) {
- if (idx >= ARRAY_SIZE(vf->ch)) {
+ if (idx >= vf->num_tc) {
aq_ret = -ENODEV;
goto error_param;
}
@@ -2416,7 +2416,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg)
* to its appropriate VSIs based on TC mapping
*/
if (vf->adq_enabled) {
- if (idx >= ARRAY_SIZE(vf->ch)) {
+ if (idx >= vf->num_tc) {
aq_ret = -ENODEV;
goto error_param;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 877b7e6ffc23766448236e8732254534c518ba42
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092911-washed-tubby-340f@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 877b7e6ffc23766448236e8732254534c518ba42 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:15 +0200
Subject: [PATCH] i40e: fix validation of VF state in get resources
VF state I40E_VF_STATE_ACTIVE is not the only state in which
VF is actually active so it should not be used to determine
if a VF is allowed to obtain resources.
Use I40E_VF_STATE_RESOURCES_LOADED that is set only in
i40e_vc_get_vf_resources_msg() and cleared during reset.
Fixes: 61125b8be85d ("i40e: Fix failed opcode appearing if handling messages from VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index c85715f75435..5ef3dc43a8a0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1464,6 +1464,7 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
* functions that may still be running at this point.
*/
clear_bit(I40E_VF_STATE_INIT, &vf->vf_states);
+ clear_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states);
/* In the case of a VFLR, the HW has already reset the VF and we
* just need to clean up, so don't hit the VFRTRIG register.
@@ -2130,7 +2131,10 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
size_t len = 0;
int ret;
- if (!i40e_sync_vf_state(vf, I40E_VF_STATE_INIT)) {
+ i40e_sync_vf_state(vf, I40E_VF_STATE_INIT);
+
+ if (!test_bit(I40E_VF_STATE_INIT, &vf->vf_states) ||
+ test_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states)) {
aq_ret = -EINVAL;
goto err;
}
@@ -2233,6 +2237,7 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
vf->default_lan_addr.addr);
}
set_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
+ set_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states);
err:
/* send the response back to the VF */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
index 5cf74f16f433..f558b45725c8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
@@ -41,7 +41,8 @@ enum i40e_vf_states {
I40E_VF_STATE_MC_PROMISC,
I40E_VF_STATE_UC_PROMISC,
I40E_VF_STATE_PRE_ENABLE,
- I40E_VF_STATE_RESETTING
+ I40E_VF_STATE_RESETTING,
+ I40E_VF_STATE_RESOURCES_LOADED,
};
/* VF capabilities */
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 55d225670def06b01af2e7a5e0446fbe946289e8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092936-anvil-pummel-9e58@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 55d225670def06b01af2e7a5e0446fbe946289e8 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:11 +0200
Subject: [PATCH] i40e: add validation for ring_len param
The `ring_len` parameter provided by the virtual function (VF)
is assigned directly to the hardware memory context (HMC) without
any validation.
To address this, introduce an upper boundary check for both Tx and Rx
queue lengths. The maximum number of descriptors supported by the
hardware is 8k-32.
Additionally, enforce alignment constraints: Tx rings must be a multiple
of 8, and Rx rings must be a multiple of 32.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 9b8efdeafbcf..cb37b2ac56f1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -653,6 +653,13 @@ static int i40e_config_vsi_tx_queue(struct i40e_vf *vf, u16 vsi_id,
/* only set the required fields */
tx_ctx.base = info->dma_ring_addr / 128;
+
+ /* ring_len has to be multiple of 8 */
+ if (!IS_ALIGNED(info->ring_len, 8) ||
+ info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) {
+ ret = -EINVAL;
+ goto error_context;
+ }
tx_ctx.qlen = info->ring_len;
tx_ctx.rdylist = le16_to_cpu(vsi->info.qs_handle[0]);
tx_ctx.rdylist_act = 0;
@@ -716,6 +723,13 @@ static int i40e_config_vsi_rx_queue(struct i40e_vf *vf, u16 vsi_id,
/* only set the required fields */
rx_ctx.base = info->dma_ring_addr / 128;
+
+ /* ring_len has to be multiple of 32 */
+ if (!IS_ALIGNED(info->ring_len, 32) ||
+ info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) {
+ ret = -EINVAL;
+ goto error_param;
+ }
rx_ctx.qlen = info->ring_len;
if (info->splithdr_enabled) {
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x f1ad24c5abe1eaef69158bac1405a74b3c365115
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092936-outflank-unwoven-3758@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f1ad24c5abe1eaef69158bac1405a74b3c365115 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:13 +0200
Subject: [PATCH] i40e: fix idx validation in config queues msg
Ensure idx is within range of active/initialized TCs when iterating over
vf->ch[idx] in i40e_vc_config_queues_msg().
Fixes: c27eac48160d ("i40e: Enable ADq and create queue channel/s on VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Kamakshi Nellore <nellorex.kamakshi(a)intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 1c4f86221255..b6db4d78c02d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2395,7 +2395,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg)
}
if (vf->adq_enabled) {
- if (idx >= ARRAY_SIZE(vf->ch)) {
+ if (idx >= vf->num_tc) {
aq_ret = -ENODEV;
goto error_param;
}
@@ -2416,7 +2416,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg)
* to its appropriate VSIs based on TC mapping
*/
if (vf->adq_enabled) {
- if (idx >= ARRAY_SIZE(vf->ch)) {
+ if (idx >= vf->num_tc) {
aq_ret = -ENODEV;
goto error_param;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 55d225670def06b01af2e7a5e0446fbe946289e8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092935-atonable-underdog-9664@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 55d225670def06b01af2e7a5e0446fbe946289e8 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:11 +0200
Subject: [PATCH] i40e: add validation for ring_len param
The `ring_len` parameter provided by the virtual function (VF)
is assigned directly to the hardware memory context (HMC) without
any validation.
To address this, introduce an upper boundary check for both Tx and Rx
queue lengths. The maximum number of descriptors supported by the
hardware is 8k-32.
Additionally, enforce alignment constraints: Tx rings must be a multiple
of 8, and Rx rings must be a multiple of 32.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 9b8efdeafbcf..cb37b2ac56f1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -653,6 +653,13 @@ static int i40e_config_vsi_tx_queue(struct i40e_vf *vf, u16 vsi_id,
/* only set the required fields */
tx_ctx.base = info->dma_ring_addr / 128;
+
+ /* ring_len has to be multiple of 8 */
+ if (!IS_ALIGNED(info->ring_len, 8) ||
+ info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) {
+ ret = -EINVAL;
+ goto error_context;
+ }
tx_ctx.qlen = info->ring_len;
tx_ctx.rdylist = le16_to_cpu(vsi->info.qs_handle[0]);
tx_ctx.rdylist_act = 0;
@@ -716,6 +723,13 @@ static int i40e_config_vsi_rx_queue(struct i40e_vf *vf, u16 vsi_id,
/* only set the required fields */
rx_ctx.base = info->dma_ring_addr / 128;
+
+ /* ring_len has to be multiple of 32 */
+ if (!IS_ALIGNED(info->ring_len, 32) ||
+ info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) {
+ ret = -EINVAL;
+ goto error_param;
+ }
rx_ctx.qlen = info->ring_len;
if (info->splithdr_enabled) {
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 877b7e6ffc23766448236e8732254534c518ba42
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092910-busboy-umbrella-d6c5@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 877b7e6ffc23766448236e8732254534c518ba42 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:15 +0200
Subject: [PATCH] i40e: fix validation of VF state in get resources
VF state I40E_VF_STATE_ACTIVE is not the only state in which
VF is actually active so it should not be used to determine
if a VF is allowed to obtain resources.
Use I40E_VF_STATE_RESOURCES_LOADED that is set only in
i40e_vc_get_vf_resources_msg() and cleared during reset.
Fixes: 61125b8be85d ("i40e: Fix failed opcode appearing if handling messages from VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index c85715f75435..5ef3dc43a8a0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1464,6 +1464,7 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
* functions that may still be running at this point.
*/
clear_bit(I40E_VF_STATE_INIT, &vf->vf_states);
+ clear_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states);
/* In the case of a VFLR, the HW has already reset the VF and we
* just need to clean up, so don't hit the VFRTRIG register.
@@ -2130,7 +2131,10 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
size_t len = 0;
int ret;
- if (!i40e_sync_vf_state(vf, I40E_VF_STATE_INIT)) {
+ i40e_sync_vf_state(vf, I40E_VF_STATE_INIT);
+
+ if (!test_bit(I40E_VF_STATE_INIT, &vf->vf_states) ||
+ test_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states)) {
aq_ret = -EINVAL;
goto err;
}
@@ -2233,6 +2237,7 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
vf->default_lan_addr.addr);
}
set_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
+ set_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states);
err:
/* send the response back to the VF */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
index 5cf74f16f433..f558b45725c8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
@@ -41,7 +41,8 @@ enum i40e_vf_states {
I40E_VF_STATE_MC_PROMISC,
I40E_VF_STATE_UC_PROMISC,
I40E_VF_STATE_PRE_ENABLE,
- I40E_VF_STATE_RESETTING
+ I40E_VF_STATE_RESETTING,
+ I40E_VF_STATE_RESOURCES_LOADED,
};
/* VF capabilities */
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x f1ad24c5abe1eaef69158bac1405a74b3c365115
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092935-lethargy-falcon-de3b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f1ad24c5abe1eaef69158bac1405a74b3c365115 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:13 +0200
Subject: [PATCH] i40e: fix idx validation in config queues msg
Ensure idx is within range of active/initialized TCs when iterating over
vf->ch[idx] in i40e_vc_config_queues_msg().
Fixes: c27eac48160d ("i40e: Enable ADq and create queue channel/s on VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Kamakshi Nellore <nellorex.kamakshi(a)intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 1c4f86221255..b6db4d78c02d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2395,7 +2395,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg)
}
if (vf->adq_enabled) {
- if (idx >= ARRAY_SIZE(vf->ch)) {
+ if (idx >= vf->num_tc) {
aq_ret = -ENODEV;
goto error_param;
}
@@ -2416,7 +2416,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg)
* to its appropriate VSIs based on TC mapping
*/
if (vf->adq_enabled) {
- if (idx >= ARRAY_SIZE(vf->ch)) {
+ if (idx >= vf->num_tc) {
aq_ret = -ENODEV;
goto error_param;
}
There are some AA deadlock issues in kmemleak, similar to the situation
reported by Breno [1]. The deadlock path is as follows:
mem_pool_alloc()
-> raw_spin_lock_irqsave(&kmemleak_lock, flags);
-> pr_warn()
-> netconsole subsystem
-> netpoll
-> __alloc_skb
-> __create_object
-> raw_spin_lock_irqsave(&kmemleak_lock, flags);
To solve this problem, switch to printk_safe mode before printing warning
message, this will redirect all printk()-s to a special per-CPU buffer,
which will be flushed later from a safe context (irq work), and this
deadlock problem can be avoided. The proper API to use should be
printk_deferred_enter()/printk_deferred_exit() [2]. Another way is to
place the warn print after kmemleak is released.
[1]
https://lore.kernel.org/all/20250731-kmemleak_lock-v1-1-728fd470198f@debian…
[2]
https://lore.kernel.org/all/5ca375cd-4a20-4807-b897-68b289626550@redhat.com/
====================
Signed-off-by: Gu Bowen <gubowen5(a)huawei.com>
---
mm/kmemleak.c | 27 ++++++++++++++++++++-------
1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 84265983f239..1ac56ceb29b6 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -437,9 +437,15 @@ static struct kmemleak_object *__lookup_object(unsigned long ptr, int alias,
else if (untagged_objp == untagged_ptr || alias)
return object;
else {
+ /*
+ * Printk deferring due to the kmemleak_lock held.
+ * This is done to avoid deadlock.
+ */
+ printk_deferred_enter();
kmemleak_warn("Found object by alias at 0x%08lx\n",
ptr);
dump_object_info(object);
+ printk_deferred_exit();
break;
}
}
@@ -736,6 +742,11 @@ static int __link_object(struct kmemleak_object *object, unsigned long ptr,
else if (untagged_objp + parent->size <= untagged_ptr)
link = &parent->rb_node.rb_right;
else {
+ /*
+ * Printk deferring due to the kmemleak_lock held.
+ * This is done to avoid deadlock.
+ */
+ printk_deferred_enter();
kmemleak_stop("Cannot insert 0x%lx into the object search tree (overlaps existing)\n",
ptr);
/*
@@ -743,6 +754,7 @@ static int __link_object(struct kmemleak_object *object, unsigned long ptr,
* be freed while the kmemleak_lock is held.
*/
dump_object_info(parent);
+ printk_deferred_exit();
return -EEXIST;
}
}
@@ -856,13 +868,8 @@ static void delete_object_part(unsigned long ptr, size_t size,
raw_spin_lock_irqsave(&kmemleak_lock, flags);
object = __find_and_remove_object(ptr, 1, objflags);
- if (!object) {
-#ifdef DEBUG
- kmemleak_warn("Partially freeing unknown object at 0x%08lx (size %zu)\n",
- ptr, size);
-#endif
+ if (!object)
goto unlock;
- }
/*
* Create one or two objects that may result from the memory block
@@ -882,8 +889,14 @@ static void delete_object_part(unsigned long ptr, size_t size,
unlock:
raw_spin_unlock_irqrestore(&kmemleak_lock, flags);
- if (object)
+ if (object) {
__delete_object(object);
+ } else {
+#ifdef DEBUG
+ kmemleak_warn("Partially freeing unknown object at 0x%08lx (size %zu)\n",
+ ptr, size);
+#endif
+ }
out:
if (object_l)
--
2.43.0
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 55d225670def06b01af2e7a5e0446fbe946289e8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092935-hatred-salutary-8769@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 55d225670def06b01af2e7a5e0446fbe946289e8 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:11 +0200
Subject: [PATCH] i40e: add validation for ring_len param
The `ring_len` parameter provided by the virtual function (VF)
is assigned directly to the hardware memory context (HMC) without
any validation.
To address this, introduce an upper boundary check for both Tx and Rx
queue lengths. The maximum number of descriptors supported by the
hardware is 8k-32.
Additionally, enforce alignment constraints: Tx rings must be a multiple
of 8, and Rx rings must be a multiple of 32.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 9b8efdeafbcf..cb37b2ac56f1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -653,6 +653,13 @@ static int i40e_config_vsi_tx_queue(struct i40e_vf *vf, u16 vsi_id,
/* only set the required fields */
tx_ctx.base = info->dma_ring_addr / 128;
+
+ /* ring_len has to be multiple of 8 */
+ if (!IS_ALIGNED(info->ring_len, 8) ||
+ info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) {
+ ret = -EINVAL;
+ goto error_context;
+ }
tx_ctx.qlen = info->ring_len;
tx_ctx.rdylist = le16_to_cpu(vsi->info.qs_handle[0]);
tx_ctx.rdylist_act = 0;
@@ -716,6 +723,13 @@ static int i40e_config_vsi_rx_queue(struct i40e_vf *vf, u16 vsi_id,
/* only set the required fields */
rx_ctx.base = info->dma_ring_addr / 128;
+
+ /* ring_len has to be multiple of 32 */
+ if (!IS_ALIGNED(info->ring_len, 32) ||
+ info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) {
+ ret = -EINVAL;
+ goto error_param;
+ }
rx_ctx.qlen = info->ring_len;
if (info->splithdr_enabled) {
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x 55ed11b181c43d81ce03b50209e4e7c4a14ba099
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092952-wooing-result-72e9@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 55ed11b181c43d81ce03b50209e4e7c4a14ba099 Mon Sep 17 00:00:00 2001
From: Andrea Righi <arighi(a)nvidia.com>
Date: Sat, 20 Sep 2025 15:26:21 +0200
Subject: [PATCH] sched_ext: idle: Handle migration-disabled tasks in BPF code
When scx_bpf_select_cpu_dfl()/and() kfuncs are invoked outside of
ops.select_cpu() we can't rely on @p->migration_disabled to determine if
migration is disabled for the task @p.
In fact, migration is always disabled for the current task while running
BPF code: __bpf_prog_enter() disables migration and __bpf_prog_exit()
re-enables it.
To handle this, when @p->migration_disabled == 1, check whether @p is
the current task. If so, migration was not disabled before entering the
callback, otherwise migration was disabled.
This ensures correct idle CPU selection in all cases. The behavior of
ops.select_cpu() remains unchanged, because this callback is never
invoked for the current task and migration-disabled tasks are always
excluded.
Example: without this change scx_bpf_select_cpu_and() called from
ops.enqueue() always returns -EBUSY; with this change applied, it
correctly returns idle CPUs.
Fixes: 06efc9fe0b8de ("sched_ext: idle: Handle migration-disabled tasks in idle selection")
Cc: stable(a)vger.kernel.org # v6.16+
Signed-off-by: Andrea Righi <arighi(a)nvidia.com>
Acked-by: Changwoo Min <changwoo(a)igalia.com>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
index 7174e1c1a392..537c6992bb63 100644
--- a/kernel/sched/ext_idle.c
+++ b/kernel/sched/ext_idle.c
@@ -856,6 +856,32 @@ static bool check_builtin_idle_enabled(void)
return false;
}
+/*
+ * Determine whether @p is a migration-disabled task in the context of BPF
+ * code.
+ *
+ * We can't simply check whether @p->migration_disabled is set in a
+ * sched_ext callback, because migration is always disabled for the current
+ * task while running BPF code.
+ *
+ * The prolog (__bpf_prog_enter) and epilog (__bpf_prog_exit) respectively
+ * disable and re-enable migration. For this reason, the current task
+ * inside a sched_ext callback is always a migration-disabled task.
+ *
+ * Therefore, when @p->migration_disabled == 1, check whether @p is the
+ * current task or not: if it is, then migration was not disabled before
+ * entering the callback, otherwise migration was disabled.
+ *
+ * Returns true if @p is migration-disabled, false otherwise.
+ */
+static bool is_bpf_migration_disabled(const struct task_struct *p)
+{
+ if (p->migration_disabled == 1)
+ return p != current;
+ else
+ return p->migration_disabled;
+}
+
static s32 select_cpu_from_kfunc(struct task_struct *p, s32 prev_cpu, u64 wake_flags,
const struct cpumask *allowed, u64 flags)
{
@@ -898,7 +924,7 @@ static s32 select_cpu_from_kfunc(struct task_struct *p, s32 prev_cpu, u64 wake_f
* selection optimizations and simply check whether the previously
* used CPU is idle and within the allowed cpumask.
*/
- if (p->nr_cpus_allowed == 1 || is_migration_disabled(p)) {
+ if (p->nr_cpus_allowed == 1 || is_bpf_migration_disabled(p)) {
if (cpumask_test_cpu(prev_cpu, allowed ?: p->cpus_ptr) &&
scx_idle_test_and_clear_cpu(prev_cpu))
cpu = prev_cpu;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 55d225670def06b01af2e7a5e0446fbe946289e8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092934-spinning-happening-f92a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 55d225670def06b01af2e7a5e0446fbe946289e8 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:11 +0200
Subject: [PATCH] i40e: add validation for ring_len param
The `ring_len` parameter provided by the virtual function (VF)
is assigned directly to the hardware memory context (HMC) without
any validation.
To address this, introduce an upper boundary check for both Tx and Rx
queue lengths. The maximum number of descriptors supported by the
hardware is 8k-32.
Additionally, enforce alignment constraints: Tx rings must be a multiple
of 8, and Rx rings must be a multiple of 32.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 9b8efdeafbcf..cb37b2ac56f1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -653,6 +653,13 @@ static int i40e_config_vsi_tx_queue(struct i40e_vf *vf, u16 vsi_id,
/* only set the required fields */
tx_ctx.base = info->dma_ring_addr / 128;
+
+ /* ring_len has to be multiple of 8 */
+ if (!IS_ALIGNED(info->ring_len, 8) ||
+ info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) {
+ ret = -EINVAL;
+ goto error_context;
+ }
tx_ctx.qlen = info->ring_len;
tx_ctx.rdylist = le16_to_cpu(vsi->info.qs_handle[0]);
tx_ctx.rdylist_act = 0;
@@ -716,6 +723,13 @@ static int i40e_config_vsi_rx_queue(struct i40e_vf *vf, u16 vsi_id,
/* only set the required fields */
rx_ctx.base = info->dma_ring_addr / 128;
+
+ /* ring_len has to be multiple of 32 */
+ if (!IS_ALIGNED(info->ring_len, 32) ||
+ info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) {
+ ret = -EINVAL;
+ goto error_param;
+ }
rx_ctx.qlen = info->ring_len;
if (info->splithdr_enabled) {
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 55d225670def06b01af2e7a5e0446fbe946289e8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092933-confess-manhole-6d70@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 55d225670def06b01af2e7a5e0446fbe946289e8 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:11 +0200
Subject: [PATCH] i40e: add validation for ring_len param
The `ring_len` parameter provided by the virtual function (VF)
is assigned directly to the hardware memory context (HMC) without
any validation.
To address this, introduce an upper boundary check for both Tx and Rx
queue lengths. The maximum number of descriptors supported by the
hardware is 8k-32.
Additionally, enforce alignment constraints: Tx rings must be a multiple
of 8, and Rx rings must be a multiple of 32.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 9b8efdeafbcf..cb37b2ac56f1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -653,6 +653,13 @@ static int i40e_config_vsi_tx_queue(struct i40e_vf *vf, u16 vsi_id,
/* only set the required fields */
tx_ctx.base = info->dma_ring_addr / 128;
+
+ /* ring_len has to be multiple of 8 */
+ if (!IS_ALIGNED(info->ring_len, 8) ||
+ info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) {
+ ret = -EINVAL;
+ goto error_context;
+ }
tx_ctx.qlen = info->ring_len;
tx_ctx.rdylist = le16_to_cpu(vsi->info.qs_handle[0]);
tx_ctx.rdylist_act = 0;
@@ -716,6 +723,13 @@ static int i40e_config_vsi_rx_queue(struct i40e_vf *vf, u16 vsi_id,
/* only set the required fields */
rx_ctx.base = info->dma_ring_addr / 128;
+
+ /* ring_len has to be multiple of 32 */
+ if (!IS_ALIGNED(info->ring_len, 32) ||
+ info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) {
+ ret = -EINVAL;
+ goto error_param;
+ }
rx_ctx.qlen = info->ring_len;
if (info->splithdr_enabled) {
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 456c32e3c4316654f95f9d49c12cbecfb77d5660
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092944-overact-prowler-60f7@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 456c32e3c4316654f95f9d49c12cbecfb77d5660 Mon Sep 17 00:00:00 2001
From: "Masami Hiramatsu (Google)" <mhiramat(a)kernel.org>
Date: Fri, 19 Sep 2025 10:15:56 +0900
Subject: [PATCH] tracing: dynevent: Add a missing lockdown check on dynevent
Since dynamic_events interface on tracefs is compatible with
kprobe_events and uprobe_events, it should also check the lockdown
status and reject if it is set.
Link: https://lore.kernel.org/all/175824455687.45175.3734166065458520748.stgit@de…
Fixes: 17911ff38aa5 ("tracing: Add locked_down checks to the open calls of files created for tracefs")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Cc: stable(a)vger.kernel.org
diff --git a/kernel/trace/trace_dynevent.c b/kernel/trace/trace_dynevent.c
index 5d64a18cacac..d06854bd32b3 100644
--- a/kernel/trace/trace_dynevent.c
+++ b/kernel/trace/trace_dynevent.c
@@ -230,6 +230,10 @@ static int dyn_event_open(struct inode *inode, struct file *file)
{
int ret;
+ ret = security_locked_down(LOCKDOWN_TRACEFS);
+ if (ret)
+ return ret;
+
ret = tracing_check_open_get_tr(NULL);
if (ret)
return ret;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x b99dd77076bd3fddac6f7f1cbfa081c38fde17f5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092903-gambling-usable-65ed@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b99dd77076bd3fddac6f7f1cbfa081c38fde17f5 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:18 +0200
Subject: [PATCH] i40e: improve VF MAC filters accounting
When adding new VM MAC, driver checks only *active* filters in
vsi->mac_filter_hash. Each MAC, even in non-active state is using resources.
To determine number of MACs VM uses, count VSI filters in *any* state.
Add i40e_count_all_filters() to simply count all filters, and rename
i40e_count_filters() to i40e_count_active_filters() to avoid ambiguity.
Fixes: cfb1d572c986 ("i40e: Add ensurance of MacVlan resources for every trusted VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 49aa4497efce..801a57a925da 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -1278,7 +1278,8 @@ struct i40e_mac_filter *i40e_add_mac_filter(struct i40e_vsi *vsi,
const u8 *macaddr);
int i40e_del_mac_filter(struct i40e_vsi *vsi, const u8 *macaddr);
bool i40e_is_vsi_in_vlan(struct i40e_vsi *vsi);
-int i40e_count_filters(struct i40e_vsi *vsi);
+int i40e_count_all_filters(struct i40e_vsi *vsi);
+int i40e_count_active_filters(struct i40e_vsi *vsi);
struct i40e_mac_filter *i40e_find_mac(struct i40e_vsi *vsi, const u8 *macaddr);
void i40e_vlan_stripping_enable(struct i40e_vsi *vsi);
static inline bool i40e_is_sw_dcb(struct i40e_pf *pf)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index b14019d44b58..529d5501baac 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -1243,12 +1243,30 @@ void i40e_update_stats(struct i40e_vsi *vsi)
}
/**
- * i40e_count_filters - counts VSI mac filters
+ * i40e_count_all_filters - counts VSI MAC filters
* @vsi: the VSI to be searched
*
- * Returns count of mac filters
- **/
-int i40e_count_filters(struct i40e_vsi *vsi)
+ * Return: count of MAC filters in any state.
+ */
+int i40e_count_all_filters(struct i40e_vsi *vsi)
+{
+ struct i40e_mac_filter *f;
+ struct hlist_node *h;
+ int bkt, cnt = 0;
+
+ hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist)
+ cnt++;
+
+ return cnt;
+}
+
+/**
+ * i40e_count_active_filters - counts VSI MAC filters
+ * @vsi: the VSI to be searched
+ *
+ * Return: count of active MAC filters.
+ */
+int i40e_count_active_filters(struct i40e_vsi *vsi)
{
struct i40e_mac_filter *f;
struct hlist_node *h;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index f9b2197f0942..081a4526a2f0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2862,24 +2862,6 @@ static int i40e_vc_get_stats_msg(struct i40e_vf *vf, u8 *msg)
(u8 *)&stats, sizeof(stats));
}
-/**
- * i40e_can_vf_change_mac
- * @vf: pointer to the VF info
- *
- * Return true if the VF is allowed to change its MAC filters, false otherwise
- */
-static bool i40e_can_vf_change_mac(struct i40e_vf *vf)
-{
- /* If the VF MAC address has been set administratively (via the
- * ndo_set_vf_mac command), then deny permission to the VF to
- * add/delete unicast MAC addresses, unless the VF is trusted
- */
- if (vf->pf_set_mac && !vf->trusted)
- return false;
-
- return true;
-}
-
#define I40E_MAX_MACVLAN_PER_HW 3072
#define I40E_MAX_MACVLAN_PER_PF(num_ports) (I40E_MAX_MACVLAN_PER_HW / \
(num_ports))
@@ -2918,8 +2900,10 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf,
struct i40e_pf *pf = vf->pf;
struct i40e_vsi *vsi = pf->vsi[vf->lan_vsi_idx];
struct i40e_hw *hw = &pf->hw;
- int mac2add_cnt = 0;
- int i;
+ int i, mac_add_max, mac_add_cnt = 0;
+ bool vf_trusted;
+
+ vf_trusted = test_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
for (i = 0; i < al->num_elements; i++) {
struct i40e_mac_filter *f;
@@ -2939,9 +2923,8 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf,
* The VF may request to set the MAC address filter already
* assigned to it so do not return an error in that case.
*/
- if (!i40e_can_vf_change_mac(vf) &&
- !is_multicast_ether_addr(addr) &&
- !ether_addr_equal(addr, vf->default_lan_addr.addr)) {
+ if (!vf_trusted && !is_multicast_ether_addr(addr) &&
+ vf->pf_set_mac && !ether_addr_equal(addr, vf->default_lan_addr.addr)) {
dev_err(&pf->pdev->dev,
"VF attempting to override administratively set MAC address, bring down and up the VF interface to resume normal operation\n");
return -EPERM;
@@ -2950,29 +2933,33 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf,
/*count filters that really will be added*/
f = i40e_find_mac(vsi, addr);
if (!f)
- ++mac2add_cnt;
+ ++mac_add_cnt;
}
/* If this VF is not privileged, then we can't add more than a limited
- * number of addresses. Check to make sure that the additions do not
- * push us over the limit.
- */
- if (!test_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps)) {
- if ((i40e_count_filters(vsi) + mac2add_cnt) >
- I40E_VC_MAX_MAC_ADDR_PER_VF) {
- dev_err(&pf->pdev->dev,
- "Cannot add more MAC addresses, VF is not trusted, switch the VF to trusted to add more functionality\n");
- return -EPERM;
- }
- /* If this VF is trusted, it can use more resources than untrusted.
+ * number of addresses.
+ *
+ * If this VF is trusted, it can use more resources than untrusted.
* However to ensure that every trusted VF has appropriate number of
* resources, divide whole pool of resources per port and then across
* all VFs.
*/
- } else {
- if ((i40e_count_filters(vsi) + mac2add_cnt) >
- I40E_VC_MAX_MACVLAN_PER_TRUSTED_VF(pf->num_alloc_vfs,
- hw->num_ports)) {
+ if (!vf_trusted)
+ mac_add_max = I40E_VC_MAX_MAC_ADDR_PER_VF;
+ else
+ mac_add_max = I40E_VC_MAX_MACVLAN_PER_TRUSTED_VF(pf->num_alloc_vfs, hw->num_ports);
+
+ /* VF can replace all its filters in one step, in this case mac_add_max
+ * will be added as active and another mac_add_max will be in
+ * a to-be-removed state. Account for that.
+ */
+ if ((i40e_count_active_filters(vsi) + mac_add_cnt) > mac_add_max ||
+ (i40e_count_all_filters(vsi) + mac_add_cnt) > 2 * mac_add_max) {
+ if (!vf_trusted) {
+ dev_err(&pf->pdev->dev,
+ "Cannot add more MAC addresses, VF is not trusted, switch the VF to trusted to add more functionality\n");
+ return -EPERM;
+ } else {
dev_err(&pf->pdev->dev,
"Cannot add more MAC addresses, trusted VF exhausted it's resources\n");
return -EPERM;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x b99dd77076bd3fddac6f7f1cbfa081c38fde17f5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092902-unranked-reboot-ae0d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b99dd77076bd3fddac6f7f1cbfa081c38fde17f5 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:18 +0200
Subject: [PATCH] i40e: improve VF MAC filters accounting
When adding new VM MAC, driver checks only *active* filters in
vsi->mac_filter_hash. Each MAC, even in non-active state is using resources.
To determine number of MACs VM uses, count VSI filters in *any* state.
Add i40e_count_all_filters() to simply count all filters, and rename
i40e_count_filters() to i40e_count_active_filters() to avoid ambiguity.
Fixes: cfb1d572c986 ("i40e: Add ensurance of MacVlan resources for every trusted VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 49aa4497efce..801a57a925da 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -1278,7 +1278,8 @@ struct i40e_mac_filter *i40e_add_mac_filter(struct i40e_vsi *vsi,
const u8 *macaddr);
int i40e_del_mac_filter(struct i40e_vsi *vsi, const u8 *macaddr);
bool i40e_is_vsi_in_vlan(struct i40e_vsi *vsi);
-int i40e_count_filters(struct i40e_vsi *vsi);
+int i40e_count_all_filters(struct i40e_vsi *vsi);
+int i40e_count_active_filters(struct i40e_vsi *vsi);
struct i40e_mac_filter *i40e_find_mac(struct i40e_vsi *vsi, const u8 *macaddr);
void i40e_vlan_stripping_enable(struct i40e_vsi *vsi);
static inline bool i40e_is_sw_dcb(struct i40e_pf *pf)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index b14019d44b58..529d5501baac 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -1243,12 +1243,30 @@ void i40e_update_stats(struct i40e_vsi *vsi)
}
/**
- * i40e_count_filters - counts VSI mac filters
+ * i40e_count_all_filters - counts VSI MAC filters
* @vsi: the VSI to be searched
*
- * Returns count of mac filters
- **/
-int i40e_count_filters(struct i40e_vsi *vsi)
+ * Return: count of MAC filters in any state.
+ */
+int i40e_count_all_filters(struct i40e_vsi *vsi)
+{
+ struct i40e_mac_filter *f;
+ struct hlist_node *h;
+ int bkt, cnt = 0;
+
+ hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist)
+ cnt++;
+
+ return cnt;
+}
+
+/**
+ * i40e_count_active_filters - counts VSI MAC filters
+ * @vsi: the VSI to be searched
+ *
+ * Return: count of active MAC filters.
+ */
+int i40e_count_active_filters(struct i40e_vsi *vsi)
{
struct i40e_mac_filter *f;
struct hlist_node *h;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index f9b2197f0942..081a4526a2f0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2862,24 +2862,6 @@ static int i40e_vc_get_stats_msg(struct i40e_vf *vf, u8 *msg)
(u8 *)&stats, sizeof(stats));
}
-/**
- * i40e_can_vf_change_mac
- * @vf: pointer to the VF info
- *
- * Return true if the VF is allowed to change its MAC filters, false otherwise
- */
-static bool i40e_can_vf_change_mac(struct i40e_vf *vf)
-{
- /* If the VF MAC address has been set administratively (via the
- * ndo_set_vf_mac command), then deny permission to the VF to
- * add/delete unicast MAC addresses, unless the VF is trusted
- */
- if (vf->pf_set_mac && !vf->trusted)
- return false;
-
- return true;
-}
-
#define I40E_MAX_MACVLAN_PER_HW 3072
#define I40E_MAX_MACVLAN_PER_PF(num_ports) (I40E_MAX_MACVLAN_PER_HW / \
(num_ports))
@@ -2918,8 +2900,10 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf,
struct i40e_pf *pf = vf->pf;
struct i40e_vsi *vsi = pf->vsi[vf->lan_vsi_idx];
struct i40e_hw *hw = &pf->hw;
- int mac2add_cnt = 0;
- int i;
+ int i, mac_add_max, mac_add_cnt = 0;
+ bool vf_trusted;
+
+ vf_trusted = test_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
for (i = 0; i < al->num_elements; i++) {
struct i40e_mac_filter *f;
@@ -2939,9 +2923,8 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf,
* The VF may request to set the MAC address filter already
* assigned to it so do not return an error in that case.
*/
- if (!i40e_can_vf_change_mac(vf) &&
- !is_multicast_ether_addr(addr) &&
- !ether_addr_equal(addr, vf->default_lan_addr.addr)) {
+ if (!vf_trusted && !is_multicast_ether_addr(addr) &&
+ vf->pf_set_mac && !ether_addr_equal(addr, vf->default_lan_addr.addr)) {
dev_err(&pf->pdev->dev,
"VF attempting to override administratively set MAC address, bring down and up the VF interface to resume normal operation\n");
return -EPERM;
@@ -2950,29 +2933,33 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf,
/*count filters that really will be added*/
f = i40e_find_mac(vsi, addr);
if (!f)
- ++mac2add_cnt;
+ ++mac_add_cnt;
}
/* If this VF is not privileged, then we can't add more than a limited
- * number of addresses. Check to make sure that the additions do not
- * push us over the limit.
- */
- if (!test_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps)) {
- if ((i40e_count_filters(vsi) + mac2add_cnt) >
- I40E_VC_MAX_MAC_ADDR_PER_VF) {
- dev_err(&pf->pdev->dev,
- "Cannot add more MAC addresses, VF is not trusted, switch the VF to trusted to add more functionality\n");
- return -EPERM;
- }
- /* If this VF is trusted, it can use more resources than untrusted.
+ * number of addresses.
+ *
+ * If this VF is trusted, it can use more resources than untrusted.
* However to ensure that every trusted VF has appropriate number of
* resources, divide whole pool of resources per port and then across
* all VFs.
*/
- } else {
- if ((i40e_count_filters(vsi) + mac2add_cnt) >
- I40E_VC_MAX_MACVLAN_PER_TRUSTED_VF(pf->num_alloc_vfs,
- hw->num_ports)) {
+ if (!vf_trusted)
+ mac_add_max = I40E_VC_MAX_MAC_ADDR_PER_VF;
+ else
+ mac_add_max = I40E_VC_MAX_MACVLAN_PER_TRUSTED_VF(pf->num_alloc_vfs, hw->num_ports);
+
+ /* VF can replace all its filters in one step, in this case mac_add_max
+ * will be added as active and another mac_add_max will be in
+ * a to-be-removed state. Account for that.
+ */
+ if ((i40e_count_active_filters(vsi) + mac_add_cnt) > mac_add_max ||
+ (i40e_count_all_filters(vsi) + mac_add_cnt) > 2 * mac_add_max) {
+ if (!vf_trusted) {
+ dev_err(&pf->pdev->dev,
+ "Cannot add more MAC addresses, VF is not trusted, switch the VF to trusted to add more functionality\n");
+ return -EPERM;
+ } else {
dev_err(&pf->pdev->dev,
"Cannot add more MAC addresses, trusted VF exhausted it's resources\n");
return -EPERM;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x b99dd77076bd3fddac6f7f1cbfa081c38fde17f5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092901-doorpost-cure-351c@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b99dd77076bd3fddac6f7f1cbfa081c38fde17f5 Mon Sep 17 00:00:00 2001
From: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Date: Wed, 13 Aug 2025 12:45:18 +0200
Subject: [PATCH] i40e: improve VF MAC filters accounting
When adding new VM MAC, driver checks only *active* filters in
vsi->mac_filter_hash. Each MAC, even in non-active state is using resources.
To determine number of MACs VM uses, count VSI filters in *any* state.
Add i40e_count_all_filters() to simply count all filters, and rename
i40e_count_filters() to i40e_count_active_filters() to avoid ambiguity.
Fixes: cfb1d572c986 ("i40e: Add ensurance of MacVlan resources for every trusted VF")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 49aa4497efce..801a57a925da 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -1278,7 +1278,8 @@ struct i40e_mac_filter *i40e_add_mac_filter(struct i40e_vsi *vsi,
const u8 *macaddr);
int i40e_del_mac_filter(struct i40e_vsi *vsi, const u8 *macaddr);
bool i40e_is_vsi_in_vlan(struct i40e_vsi *vsi);
-int i40e_count_filters(struct i40e_vsi *vsi);
+int i40e_count_all_filters(struct i40e_vsi *vsi);
+int i40e_count_active_filters(struct i40e_vsi *vsi);
struct i40e_mac_filter *i40e_find_mac(struct i40e_vsi *vsi, const u8 *macaddr);
void i40e_vlan_stripping_enable(struct i40e_vsi *vsi);
static inline bool i40e_is_sw_dcb(struct i40e_pf *pf)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index b14019d44b58..529d5501baac 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -1243,12 +1243,30 @@ void i40e_update_stats(struct i40e_vsi *vsi)
}
/**
- * i40e_count_filters - counts VSI mac filters
+ * i40e_count_all_filters - counts VSI MAC filters
* @vsi: the VSI to be searched
*
- * Returns count of mac filters
- **/
-int i40e_count_filters(struct i40e_vsi *vsi)
+ * Return: count of MAC filters in any state.
+ */
+int i40e_count_all_filters(struct i40e_vsi *vsi)
+{
+ struct i40e_mac_filter *f;
+ struct hlist_node *h;
+ int bkt, cnt = 0;
+
+ hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist)
+ cnt++;
+
+ return cnt;
+}
+
+/**
+ * i40e_count_active_filters - counts VSI MAC filters
+ * @vsi: the VSI to be searched
+ *
+ * Return: count of active MAC filters.
+ */
+int i40e_count_active_filters(struct i40e_vsi *vsi)
{
struct i40e_mac_filter *f;
struct hlist_node *h;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index f9b2197f0942..081a4526a2f0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2862,24 +2862,6 @@ static int i40e_vc_get_stats_msg(struct i40e_vf *vf, u8 *msg)
(u8 *)&stats, sizeof(stats));
}
-/**
- * i40e_can_vf_change_mac
- * @vf: pointer to the VF info
- *
- * Return true if the VF is allowed to change its MAC filters, false otherwise
- */
-static bool i40e_can_vf_change_mac(struct i40e_vf *vf)
-{
- /* If the VF MAC address has been set administratively (via the
- * ndo_set_vf_mac command), then deny permission to the VF to
- * add/delete unicast MAC addresses, unless the VF is trusted
- */
- if (vf->pf_set_mac && !vf->trusted)
- return false;
-
- return true;
-}
-
#define I40E_MAX_MACVLAN_PER_HW 3072
#define I40E_MAX_MACVLAN_PER_PF(num_ports) (I40E_MAX_MACVLAN_PER_HW / \
(num_ports))
@@ -2918,8 +2900,10 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf,
struct i40e_pf *pf = vf->pf;
struct i40e_vsi *vsi = pf->vsi[vf->lan_vsi_idx];
struct i40e_hw *hw = &pf->hw;
- int mac2add_cnt = 0;
- int i;
+ int i, mac_add_max, mac_add_cnt = 0;
+ bool vf_trusted;
+
+ vf_trusted = test_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
for (i = 0; i < al->num_elements; i++) {
struct i40e_mac_filter *f;
@@ -2939,9 +2923,8 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf,
* The VF may request to set the MAC address filter already
* assigned to it so do not return an error in that case.
*/
- if (!i40e_can_vf_change_mac(vf) &&
- !is_multicast_ether_addr(addr) &&
- !ether_addr_equal(addr, vf->default_lan_addr.addr)) {
+ if (!vf_trusted && !is_multicast_ether_addr(addr) &&
+ vf->pf_set_mac && !ether_addr_equal(addr, vf->default_lan_addr.addr)) {
dev_err(&pf->pdev->dev,
"VF attempting to override administratively set MAC address, bring down and up the VF interface to resume normal operation\n");
return -EPERM;
@@ -2950,29 +2933,33 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf,
/*count filters that really will be added*/
f = i40e_find_mac(vsi, addr);
if (!f)
- ++mac2add_cnt;
+ ++mac_add_cnt;
}
/* If this VF is not privileged, then we can't add more than a limited
- * number of addresses. Check to make sure that the additions do not
- * push us over the limit.
- */
- if (!test_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps)) {
- if ((i40e_count_filters(vsi) + mac2add_cnt) >
- I40E_VC_MAX_MAC_ADDR_PER_VF) {
- dev_err(&pf->pdev->dev,
- "Cannot add more MAC addresses, VF is not trusted, switch the VF to trusted to add more functionality\n");
- return -EPERM;
- }
- /* If this VF is trusted, it can use more resources than untrusted.
+ * number of addresses.
+ *
+ * If this VF is trusted, it can use more resources than untrusted.
* However to ensure that every trusted VF has appropriate number of
* resources, divide whole pool of resources per port and then across
* all VFs.
*/
- } else {
- if ((i40e_count_filters(vsi) + mac2add_cnt) >
- I40E_VC_MAX_MACVLAN_PER_TRUSTED_VF(pf->num_alloc_vfs,
- hw->num_ports)) {
+ if (!vf_trusted)
+ mac_add_max = I40E_VC_MAX_MAC_ADDR_PER_VF;
+ else
+ mac_add_max = I40E_VC_MAX_MACVLAN_PER_TRUSTED_VF(pf->num_alloc_vfs, hw->num_ports);
+
+ /* VF can replace all its filters in one step, in this case mac_add_max
+ * will be added as active and another mac_add_max will be in
+ * a to-be-removed state. Account for that.
+ */
+ if ((i40e_count_active_filters(vsi) + mac_add_cnt) > mac_add_max ||
+ (i40e_count_all_filters(vsi) + mac_add_cnt) > 2 * mac_add_max) {
+ if (!vf_trusted) {
+ dev_err(&pf->pdev->dev,
+ "Cannot add more MAC addresses, VF is not trusted, switch the VF to trusted to add more functionality\n");
+ return -EPERM;
+ } else {
dev_err(&pf->pdev->dev,
"Cannot add more MAC addresses, trusted VF exhausted it's resources\n");
return -EPERM;
Hi,
This series drops the QPIC interconnect and BCM nodes for the SDX75 SoC. The
reason is that this QPIC BCM resource is already defined as a RPMh clock in
clk-rpmh driver as like other SDX SoCs. So it is wrong to describe the same
resource in two different providers.
Also, without this series, the NAND driver fails to probe on SDX75 as the
interconnect sync state disables the QPIC nodes as there were no clients voting
for this ICC resource. However, the NAND driver had already voted for this BCM
resource through the clk-rpmh driver. Since both votes come from Linux, RPMh was
unable to distinguish between these two and ends up disabling the resource
during sync state.
Cc: linux-arm-msm(a)vger.kernel.org
Cc: linux-pm(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Cc: devicetree(a)vger.kernel.org
Cc: Manivannan Sadhasivam <mani(a)kernel.org>
Cc: Raviteja Laggyshetty <quic_rlaggysh(a)quicinc.com>
Cc: Lakshmi Sowjanya D <quic_laksd(a)quicinc.com>
To: Georgi Djakov <djakov(a)kernel.org>
To: Konrad Dybcio <konradybcio(a)kernel.org>
To: Rob Herring <robh(a)kernel.org>
To: Krzysztof Kozlowski <krzk+dt(a)kernel.org>
To: Conor Dooley <conor+dt(a)kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)oss.qualcomm.com>
Changes in v2:
- Taken over the series from Raviteja
- Reordered the patches to avoid breaking build
- Improved the patch descriptions and kept the values for other defines
unchanged
---
Raviteja Laggyshetty (2):
interconnect: qcom: sdx75: Drop QPIC interconnect and BCM nodes
dt-bindings: interconnect: qcom: Drop QPIC_CORE IDs
drivers/interconnect/qcom/sdx75.c | 26 --------------------------
drivers/interconnect/qcom/sdx75.h | 2 --
include/dt-bindings/interconnect/qcom,sdx75.h | 2 --
3 files changed, 30 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250926-sdx75-icc-67c20b3de84a
Best regards,
--
Manivannan Sadhasivam <manivannan.sadhasivam(a)oss.qualcomm.com>
Struct ff_effect_compat is embedded twice inside
uinput_ff_upload_compat, contains internal padding. In particular, there
is a hole after struct ff_replay to satisfy alignment requirements for
the following union member. Without clearing the structure,
copy_to_user() may leak stack data to userspace.
Initialize ff_up_compat to zero before filling valid fields.
Fixes: 2d56f3a32c0e ("Input: refactor evdev 32bit compat to be shareable with uinput")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhen Ni <zhen.ni(a)easystack.cn>
---
drivers/input/misc/uinput.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/input/misc/uinput.c b/drivers/input/misc/uinput.c
index 2c51ea9d01d7..13336a2fd49c 100644
--- a/drivers/input/misc/uinput.c
+++ b/drivers/input/misc/uinput.c
@@ -775,6 +775,7 @@ static int uinput_ff_upload_to_user(char __user *buffer,
if (in_compat_syscall()) {
struct uinput_ff_upload_compat ff_up_compat;
+ memset(&ff_up_compat, 0, sizeof(ff_up_compat));
ff_up_compat.request_id = ff_up->request_id;
ff_up_compat.retval = ff_up->retval;
/*
--
2.20.1
Hi Zhang, hi Jiri,
In Debian Staffan Melin reported that after an update containing the
commit 1a8953f4f774 ("HID: Add IGNORE quirk for SMARTLINKTECHNOLOGY"),
the input device with same idVendor and idProduct, the Jieli
Technology USB Composite Device, does not get recognized anymore.
The full Debian report is at: https://bugs.debian.org/1114557
The issue is not specific to the 6.12.y series and confirmed in 6.16.3
as well.
Staffan Melin did bisect the kernels between 6.12.38 (which was still
working) and 6.1.41 (which was not), confirming by bisection that the
offending commit is
1a8953f4f774 ("HID: Add IGNORE quirk for SMARTLINKTECHNOLOGY")
#regzbot introduced: 1a8953f4f774
#regzbot monitor: https://bugs.debian.org/1114557
So it looks that the quirk applied is unfortunately affecting
negatively as well Staffan Melin case.
Can you have a look?
Regards,
Salvatore
The quilt patch titled
Subject: mm: swap: check for stable address space before operating on the VMA
has been removed from the -mm tree. Its filename was
mm-swap-check-for-stable-address-space-before-operating-on-the-vma.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Charan Teja Kalla <charan.kalla(a)oss.qualcomm.com>
Subject: mm: swap: check for stable address space before operating on the VMA
Date: Wed, 24 Sep 2025 23:41:38 +0530
It is possible to hit a zero entry while traversing the vmas in unuse_mm()
called from swapoff path and accessing it causes the OOPS:
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000446--> Loading the memory from offset 0x40 on the
XA_ZERO_ENTRY as address.
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
The issue is manifested from the below race between the fork() on a
process and swapoff:
fork(dup_mmap()) swapoff(unuse_mm)
--------------- -----------------
1) Identical mtree is built using
__mt_dup().
2) copy_pte_range()-->
copy_nonpresent_pte():
The dst mm is added into the
mmlist to be visible to the
swapoff operation.
3) Fatal signal is sent to the parent
process(which is the current during the
fork) thus skip the duplication of the
vmas and mark the vma range with
XA_ZERO_ENTRY as a marker for this process
that helps during exit_mmap().
4) swapoff is tried on the
'mm' added to the 'mmlist' as
part of the 2.
5) unuse_mm(), that iterates
through the vma's of this 'mm'
will hit the non-NULL zero entry
and operating on this zero entry
as a vma is resulting into the
oops.
The proper fix would be around not exposing this partially-valid tree to
others when droping the mmap lock, which is being solved with [1]. A
simpler solution would be checking for MMF_UNSTABLE, as it is set if
mm_struct is not fully initialized in dup_mmap().
Thanks to Liam/Lorenzo/David for all the suggestions in fixing this
issue.
Link: https://lkml.kernel.org/r/20250924181138.1762750-1-charan.kalla@oss.qualcom…
Link: https://lore.kernel.org/all/20250815191031.3769540-1-Liam.Howlett@oracle.co… [1]
Fixes: d24062914837 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()")
Signed-off-by: Charan Teja Kalla <charan.kalla(a)oss.qualcomm.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Kairui Song <kasong(a)tencent.com>
Cc: Kemeng Shi <shikemeng(a)huaweicloud.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Nhat Pham <nphamcs(a)gmail.com>
Cc: Peng Zhang <zhangpeng.00(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swapfile.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/swapfile.c~mm-swap-check-for-stable-address-space-before-operating-on-the-vma
+++ a/mm/swapfile.c
@@ -2389,6 +2389,8 @@ static int unuse_mm(struct mm_struct *mm
VMA_ITERATOR(vmi, mm, 0);
mmap_read_lock(mm);
+ if (check_stable_address_space(mm))
+ goto unlock;
for_each_vma(vmi, vma) {
if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
ret = unuse_vma(vma, type);
@@ -2398,6 +2400,7 @@ static int unuse_mm(struct mm_struct *mm
cond_resched();
}
+unlock:
mmap_read_unlock(mm);
return ret;
}
_
Patches currently in -mm which might be from charan.kalla(a)oss.qualcomm.com are
The quilt patch titled
Subject: mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
has been removed from the -mm tree. Its filename was
mm-ksm-fix-incorrect-ksm-counter-handling-in-mm_struct-during-fork.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Donet Tom <donettom(a)linux.ibm.com>
Subject: mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
Date: Wed, 24 Sep 2025 00:16:59 +0530
Patch series "mm/ksm: Fix incorrect accounting of KSM counters during
fork", v3.
The first patch in this series fixes the incorrect accounting of KSM
counters such as ksm_merging_pages, ksm_rmap_items, and the global
ksm_zero_pages during fork.
The following patch add a selftest to verify the ksm_merging_pages counter
was updated correctly during fork.
Test Results
============
Without the first patch
-----------------------
# [RUN] test_fork_ksm_merging_page_count
not ok 10 ksm_merging_page in child: 32
With the first patch
--------------------
# [RUN] test_fork_ksm_merging_page_count
ok 10 ksm_merging_pages is not inherited after fork
This patch (of 2):
Currently, the KSM-related counters in `mm_struct`, such as
`ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are inherited
by the child process during fork. This results in inconsistent
accounting.
When a process uses KSM, identical pages are merged and an rmap item is
created for each merged page. The `ksm_merging_pages` and
`ksm_rmap_items` counters are updated accordingly. However, after a fork,
these counters are copied to the child while the corresponding rmap items
are not. As a result, when the child later triggers an unmerge, there are
no rmap items present in the child, so the counters remain stale, leading
to incorrect accounting.
A similar issue exists with `ksm_zero_pages`, which maintains both a
global counter and a per-process counter. During fork, the per-process
counter is inherited by the child, but the global counter is not
incremented. Since the child also references zero pages, the global
counter should be updated as well. Otherwise, during zero-page unmerge,
both the global and per-process counters are decremented, causing the
global counter to become inconsistent.
To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0 during
fork, and the global ksm_zero_pages counter is updated with the
per-process ksm_zero_pages value inherited by the child. This ensures
that KSM statistics remain accurate and reflect the activity of each
process correctly.
Link: https://lkml.kernel.org/r/cover.1758648700.git.donettom@linux.ibm.com
Link: https://lkml.kernel.org/r/7b9870eb67ccc0d79593940d9dbd4a0b39b5d396.17586487…
Fixes: 7609385337a4 ("ksm: count ksm merging pages for each process")
Fixes: cb4df4cae4f2 ("ksm: count allocated ksm rmap_items for each process")
Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM")
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Aboorva Devarajan <aboorvad(a)linux.ibm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Donet Tom <donettom(a)linux.ibm.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list(a)gmail.com>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: xu xin <xu.xin16(a)zte.com.cn>
Cc: <stable(a)vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/ksm.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/include/linux/ksm.h~mm-ksm-fix-incorrect-ksm-counter-handling-in-mm_struct-during-fork
+++ a/include/linux/ksm.h
@@ -56,8 +56,14 @@ static inline long mm_ksm_zero_pages(str
static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
/* Adding mm to ksm is best effort on fork. */
- if (mm_flags_test(MMF_VM_MERGEABLE, oldmm))
+ if (mm_flags_test(MMF_VM_MERGEABLE, oldmm)) {
+ long nr_ksm_zero_pages = atomic_long_read(&mm->ksm_zero_pages);
+
+ mm->ksm_merging_pages = 0;
+ mm->ksm_rmap_items = 0;
+ atomic_long_add(nr_ksm_zero_pages, &ksm_zero_pages);
__ksm_enter(mm);
+ }
}
static inline int ksm_execve(struct mm_struct *mm)
_
Patches currently in -mm which might be from donettom(a)linux.ibm.com are
The quilt patch titled
Subject: Squashfs: reject negative file sizes in squashfs_read_inode()
has been removed from the -mm tree. Its filename was
squashfs-reject-negative-file-sizes-in-squashfs_read_inode.patch
This patch was dropped because it was merged into the mm-nonmm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: Squashfs: reject negative file sizes in squashfs_read_inode()
Date: Fri, 26 Sep 2025 22:59:35 +0100
Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file
system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
[phillip(a)squashfs.org.uk: only need to check 64 bit quantity]
Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk
Fixes: 6545b246a2c8 ("Squashfs: inode operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+f754e01116421e9754b9(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/inode.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/fs/squashfs/inode.c~squashfs-reject-negative-file-sizes-in-squashfs_read_inode
+++ a/fs/squashfs/inode.c
@@ -197,6 +197,10 @@ int squashfs_read_inode(struct inode *in
goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
The quilt patch titled
Subject: lib/genalloc: fix device leak in of_gen_pool_get()
has been removed from the -mm tree. Its filename was
lib-genalloc-fix-device-leak-in-of_gen_pool_get.patch
This patch was dropped because it was merged into the mm-nonmm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Johan Hovold <johan(a)kernel.org>
Subject: lib/genalloc: fix device leak in of_gen_pool_get()
Date: Wed, 24 Sep 2025 10:02:07 +0200
Make sure to drop the reference taken when looking up the genpool platform
device in of_gen_pool_get() before returning the pool.
Note that holding a reference to a device does typically not prevent its
devres managed resources from being released so there is no point in
keeping the reference.
Link: https://lkml.kernel.org/r/20250924080207.18006-1-johan@kernel.org
Fixes: 9375db07adea ("genalloc: add devres support, allow to find a managed pool by device")
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Cc: Philipp Zabel <p.zabel(a)pengutronix.de>
Cc: Vladimir Zapolskiy <vz(a)mleia.com>
Cc: <stable(a)vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/genalloc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/lib/genalloc.c~lib-genalloc-fix-device-leak-in-of_gen_pool_get
+++ a/lib/genalloc.c
@@ -899,8 +899,11 @@ struct gen_pool *of_gen_pool_get(struct
if (!name)
name = of_node_full_name(np_pool);
}
- if (pdev)
+ if (pdev) {
pool = gen_pool_get(&pdev->dev, name);
+ put_device(&pdev->dev);
+ }
+
of_node_put(np_pool);
return pool;
_
Patches currently in -mm which might be from johan(a)kernel.org are
The quilt patch titled
Subject: Squashfs: fix uninit-value in squashfs_get_parent
has been removed from the -mm tree. Its filename was
squashfs-fix-uninit-value-in-squashfs_get_parent.patch
This patch was dropped because it was merged into the mm-nonmm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: Squashfs: fix uninit-value in squashfs_get_parent
Date: Fri, 19 Sep 2025 00:33:08 +0100
Syzkaller reports a "KMSAN: uninit-value in squashfs_get_parent" bug.
This is caused by open_by_handle_at() being called with a file handle
containing an invalid parent inode number. In particular the inode number
is that of a symbolic link, rather than a directory.
Squashfs_get_parent() gets called with that symbolic link inode, and
accesses the parent member field.
unsigned int parent_ino = squashfs_i(inode)->parent;
Because non-directory inodes in Squashfs do not have a parent value, this
is uninitialised, and this causes an uninitialised value access.
The fix is to initialise parent with the invalid inode 0, which will cause
an EINVAL error to be returned.
Regular inodes used to share the parent field with the block_list_start
field. This is removed in this commit to enable the parent field to
contain the invalid inode number 0.
Link: https://lkml.kernel.org/r/20250918233308.293861-1-phillip@squashfs.org.uk
Fixes: 122601408d20 ("Squashfs: export operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+157bdef5cf596ad0da2c(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68cc2431.050a0220.139b6.0001.GAE@google.com/
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/inode.c | 7 +++++++
fs/squashfs/squashfs_fs_i.h | 2 +-
2 files changed, 8 insertions(+), 1 deletion(-)
--- a/fs/squashfs/inode.c~squashfs-fix-uninit-value-in-squashfs_get_parent
+++ a/fs/squashfs/inode.c
@@ -169,6 +169,7 @@ int squashfs_read_inode(struct inode *in
squashfs_i(inode)->start = le32_to_cpu(sqsh_ino->start_block);
squashfs_i(inode)->block_list_start = block;
squashfs_i(inode)->offset = offset;
+ squashfs_i(inode)->parent = 0;
inode->i_data.a_ops = &squashfs_aops;
TRACE("File inode %x:%x, start_block %llx, block_list_start "
@@ -216,6 +217,7 @@ int squashfs_read_inode(struct inode *in
squashfs_i(inode)->start = le64_to_cpu(sqsh_ino->start_block);
squashfs_i(inode)->block_list_start = block;
squashfs_i(inode)->offset = offset;
+ squashfs_i(inode)->parent = 0;
inode->i_data.a_ops = &squashfs_aops;
TRACE("File inode %x:%x, start_block %llx, block_list_start "
@@ -296,6 +298,7 @@ int squashfs_read_inode(struct inode *in
inode->i_mode |= S_IFLNK;
squashfs_i(inode)->start = block;
squashfs_i(inode)->offset = offset;
+ squashfs_i(inode)->parent = 0;
if (type == SQUASHFS_LSYMLINK_TYPE) {
__le32 xattr;
@@ -333,6 +336,7 @@ int squashfs_read_inode(struct inode *in
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
rdev = le32_to_cpu(sqsh_ino->rdev);
init_special_inode(inode, inode->i_mode, new_decode_dev(rdev));
+ squashfs_i(inode)->parent = 0;
TRACE("Device inode %x:%x, rdev %x\n",
SQUASHFS_INODE_BLK(ino), offset, rdev);
@@ -357,6 +361,7 @@ int squashfs_read_inode(struct inode *in
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
rdev = le32_to_cpu(sqsh_ino->rdev);
init_special_inode(inode, inode->i_mode, new_decode_dev(rdev));
+ squashfs_i(inode)->parent = 0;
TRACE("Device inode %x:%x, rdev %x\n",
SQUASHFS_INODE_BLK(ino), offset, rdev);
@@ -377,6 +382,7 @@ int squashfs_read_inode(struct inode *in
inode->i_mode |= S_IFSOCK;
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
init_special_inode(inode, inode->i_mode, 0);
+ squashfs_i(inode)->parent = 0;
break;
}
case SQUASHFS_LFIFO_TYPE:
@@ -396,6 +402,7 @@ int squashfs_read_inode(struct inode *in
inode->i_op = &squashfs_inode_ops;
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
init_special_inode(inode, inode->i_mode, 0);
+ squashfs_i(inode)->parent = 0;
break;
}
default:
--- a/fs/squashfs/squashfs_fs_i.h~squashfs-fix-uninit-value-in-squashfs_get_parent
+++ a/fs/squashfs/squashfs_fs_i.h
@@ -16,6 +16,7 @@ struct squashfs_inode_info {
u64 xattr;
unsigned int xattr_size;
int xattr_count;
+ int parent;
union {
struct {
u64 fragment_block;
@@ -27,7 +28,6 @@ struct squashfs_inode_info {
u64 dir_idx_start;
int dir_idx_offset;
int dir_idx_cnt;
- int parent;
};
};
struct inode vfs_inode;
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
The quilt patch titled
Subject: kho: only fill kimage if KHO is finalized
has been removed from the -mm tree. Its filename was
kho-only-fill-kimage-if-kho-is-finalized.patch
This patch was dropped because it was merged into the mm-nonmm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Pratyush Yadav <pratyush(a)kernel.org>
Subject: kho: only fill kimage if KHO is finalized
Date: Thu, 18 Sep 2025 19:06:15 +0200
kho_fill_kimage() only checks for KHO being enabled before filling in the
FDT to the image. KHO being enabled does not mean that the kernel has
data to hand over. That happens when KHO is finalized.
When a kexec is done with KHO enabled but not finalized, the FDT page is
allocated but not initialized. FDT initialization happens after finalize.
This means the KHO segment is filled in but the FDT contains garbage
data.
This leads to the below error messages in the next kernel:
[ 0.000000] KHO: setup: handover FDT (0x10116b000) is invalid: -9
[ 0.000000] KHO: disabling KHO revival: -22
There is no problem in practice, and the next kernel boots and works fine.
But this still leads to misleading error messages and garbage being
handed over.
Only fill in KHO segment when KHO is finalized. When KHO is not enabled,
the debugfs interface is not created and there is no way to finalize it
anyway. So the check for kho_enable is not needed, and kho_out.finalize
alone is enough.
Link: https://lkml.kernel.org/r/20250918170617.91413-1-pratyush@kernel.org
Fixes: 3bdecc3c93f9 ("kexec: add KHO support to kexec file loads")
Signed-off-by: Pratyush Yadav <pratyush(a)kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt(a)kernel.org>
Cc: Alexander Graf <graf(a)amazon.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Changyuan Lyu <changyuanl(a)google.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/kexec_handover.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/kexec_handover.c~kho-only-fill-kimage-if-kho-is-finalized
+++ a/kernel/kexec_handover.c
@@ -1253,7 +1253,7 @@ int kho_fill_kimage(struct kimage *image
int err = 0;
struct kexec_buf scratch;
- if (!kho_enable)
+ if (!kho_out.finalized)
return 0;
image->kho.fdt = page_to_phys(kho_out.ser.fdt);
_
Patches currently in -mm which might be from pratyush(a)kernel.org are
kho-add-support-for-preserving-vmalloc-allocations-fix.patch
Good Sunday morning,
We hope 2025 has been strong heading into Fall. We continue to look forward to opportunities in structuring funding approval to offer you this year.
With hopes of inflation rates continuing to come down, combining with a growing market, Watermark's rates have been and will be as competitive as ever for:
* Credit Lines - Working Capital
* All Property
* All Equipment
* Equity Raises - Buyouts
We are available to review your fundraising needs around the world on behalf of you and any clients that you may have ready for us.
Thank you!
Jim Solesky, Managing Partner
Office: (917) 267-9470
1 Bryant Park | Bank of America Tower
New York, NY, United States 10036
[image5]
Dear Sir :
I hope you’re doing well!
My name is Su, and I represent METAL PRODUCTS, a leading manufacturer of premium metal hardware. We specialize in a wide range of products, including:
D rings , O rings ,clasps ,clips , Plastic hardware ,fastening,rivets , grommets,eyelets,buckles,screws , harnesses ,wires ,bolts ,nuts ,clips ,bits ,spurs ,screws,anchors,stainless steel hardware , zamak harnesses,aluminium hardware,brass hardware,steel harnesses ,iron metal hardware,metal & Harnesses,belt and buckles,straps ,leather Harnesses,tags ,belts,straps ,clips ,rings bars ,hooks , safety buckles,brass buckles, brass rings ,belts,buckles ,Split Ring ,straps ,sliders,snaps ,Bars Rings, Fasteners, magnent buttons,craft hardware ,bindings ,knobs,steps ,straps , bits ,eyelets ,tools , metal gear , metal hardware , metal products, metal Components,S hooks ,bars ,snaps ,webbings , buckles, HOOK,brass buckles, clips , hooks ,straps , accessories ,steel wires, wire rings , wire buckles ,brass gears ,brass products ,iron metal hardware ,Fasteners,safety buckles,brass buckles, brass rings ,belts,buckles ,Ring ,Squares ,straps ,sliders,Bars Rings, Buckles, Cord locks, Hooks & Rings,Tabs, Locks and Slides,magnent buttons,carabiners,trigger ,buttons ,snaps ,fasteners, webbings , webbing buckles, buttons, as required for our global clients .Hook Loop ,textile accessories,safety buckles,sliders,snaps ,leather hardware ,bag hardware ,shoe hardware ,cord hardware ,hockey hardware,Rings ,chains,suspenders,custom designs and more !
We are manufactory, we are the source, our price is very competitive ,you will get the best price , We have profuse designs with series quality grade, and expressly.
Our factory always produce customer designs and drawing , if you have any products looking please let me know
we could surely make for you
Sincerely hope could work with you !
Best regards
Su
From: "Masami Hiramatsu (Google)" <mhiramat(a)kernel.org>
function_graph_enter_regs() prevents itself from recursion by
ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
which is called at the exit, does not prevent such recursion.
Therefore, while it can prevent recursive calls from
fgraph_ops::entryfunc(), it is not able to prevent recursive calls
to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
This can lead an unexpected recursion bug reported by Menglong.
is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
-> kprobe_multi_link_exit_handler -> is_endbr.
To fix this issue, acquire ftrace_test_recursion_trylock() in the
__ftrace_return_to_handler() after unwind the shadow stack to mark
this section must prevent recursive call of fgraph inside user-defined
fgraph_ops::retfunc().
This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
fprobe on function-graph tracer"), because before that fgraph was
only used from the function graph tracer. Fprobe allowed user to run
any callbacks from fgraph after that commit.
Reported-by: Menglong Dong <menglong8.dong(a)gmail.com>
Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.c…
Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
Cc: stable(a)vger.kernel.org
Cc: Peter Zijlstra <peterz(a)infradead.org>
Link: https://lore.kernel.org/175852292275.307379.9040117316112640553.stgit@devno…
Signed-off-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Acked-by: Jiri Olsa <jolsa(a)kernel.org>
Tested-by: Menglong Dong <menglong8.dong(a)gmail.com>
Acked-by: Menglong Dong <menglong8.dong(a)gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/fgraph.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 1e3b32b1e82c..484ad7a18463 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
unsigned long bitmap;
unsigned long ret;
int offset;
+ int bit;
int i;
ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
@@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
if (fregs)
ftrace_regs_set_instruction_pointer(fregs, ret);
+ bit = ftrace_test_recursion_trylock(trace.func, ret);
+ /*
+ * This can fail because ftrace_test_recursion_trylock() allows one nest
+ * call. If we are already in a nested call, then we don't probe this and
+ * just return the original return address.
+ */
+ if (unlikely(bit < 0))
+ goto out;
+
#ifdef CONFIG_FUNCTION_GRAPH_RETVAL
trace.retval = ftrace_regs_get_return_value(fregs);
#endif
@@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
}
}
+ ftrace_test_recursion_unlock(bit);
+out:
/*
* The ftrace_graph_return() may still access the current
* ret_stack structure, we need to make sure the update of
--
2.50.1
In exynos_get_pmu_regmap_by_phandle(), driver_find_device_by_of_node()
utilizes driver_find_device_by_fwnode() which internally calls
driver_find_device() to locate the matching device.
driver_find_device() increments the reference count of the found
device by calling get_device(), but exynos_get_pmu_regmap_by_phandle()
fails to call put_device() to decrement the reference count before
returning. This results in a reference count leak of the device each
time exynos_get_pmu_regmap_by_phandle() is executed, which may prevent
the device from being properly released and cause a memory leak.
Since Exynos-PMU is a core system device that is not unloaded at
runtime, and regmap is created during device probing, releasing the
temporary device reference does not affect the validity of regmap.
From the perspective of code standards and maintainability, reference
count leakage is a genuine code defect that should be fixed. Even if
the leakage does not immediately cause issues in certain scenarios,
known leakage points should not be left unaddressed.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: 0b7c6075022c ("soc: samsung: exynos-pmu: Add regmap support for SoCs that protect PMU regs")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v2:
- modified the typo of the variable in the patch. Sorry for the typo;
- added more detailed description.
---
drivers/soc/samsung/exynos-pmu.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/soc/samsung/exynos-pmu.c b/drivers/soc/samsung/exynos-pmu.c
index a77288f49d24..b80cc30c1100 100644
--- a/drivers/soc/samsung/exynos-pmu.c
+++ b/drivers/soc/samsung/exynos-pmu.c
@@ -302,6 +302,7 @@ struct regmap *exynos_get_pmu_regmap_by_phandle(struct device_node *np,
{
struct device_node *pmu_np;
struct device *dev;
+ struct regmap *regmap;
if (propname)
pmu_np = of_parse_phandle(np, propname, 0);
@@ -325,7 +326,10 @@ struct regmap *exynos_get_pmu_regmap_by_phandle(struct device_node *np,
if (!dev)
return ERR_PTR(-EPROBE_DEFER);
- return syscon_node_to_regmap(pmu_np);
+ regmap = syscon_node_to_regmap(pmu_np);
+ put_device(dev);
+
+ return regmap;
}
EXPORT_SYMBOL_GPL(exynos_get_pmu_regmap_by_phandle);
--
2.17.1
The previous commit 0718a78f6a9f ("ALSA: usb-audio: Kill timer properly at
removal") patched a UAF issue caused by the error timer.
However, because the error timer kill added in this patch occurs after the
endpoint delete, a race condition to UAF still occurs, albeit rarely.
Additionally, since kill-cleanup for urb is also missing, freed memory can
be accessed in interrupt context related to urb, which can cause UAF.
Therefore, to prevent this, error timer and urb must be killed before
freeing the heap memory.
Cc: <stable(a)vger.kernel.org>
Reported-by: syzbot+f02665daa2abeef4a947(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f02665daa2abeef4a947
Fixes: 0718a78f6a9f ("ALSA: usb-audio: Kill timer properly at removal")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
sound/usb/midi.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/sound/usb/midi.c b/sound/usb/midi.c
index acb3bf92857c..97e7e7662b12 100644
--- a/sound/usb/midi.c
+++ b/sound/usb/midi.c
@@ -1522,15 +1522,14 @@ static void snd_usbmidi_free(struct snd_usb_midi *umidi)
{
int i;
+ if (!umidi->disconnected)
+ snd_usbmidi_disconnect(&umidi->list);
+
for (i = 0; i < MIDI_MAX_ENDPOINTS; ++i) {
struct snd_usb_midi_endpoint *ep = &umidi->endpoints[i];
- if (ep->out)
- snd_usbmidi_out_endpoint_delete(ep->out);
- if (ep->in)
- snd_usbmidi_in_endpoint_delete(ep->in);
+ kfree(ep->out);
}
mutex_destroy(&umidi->mutex);
- timer_shutdown_sync(&umidi->error_timer);
kfree(umidi);
}
--
In i2c_amd_probe(), amd_mp2_find_device() utilizes
driver_find_next_device() which internally calls driver_find_device()
to locate the matching device. driver_find_device() increments the
reference count of the found device by calling get_device(), but
amd_mp2_find_device() fails to call put_device() to decrement the
reference count before returning. This results in a reference count
leak of the PCI device each time i2c_amd_probe() is executed, which
may prevent the device from being properly released and cause a memory
leak.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: 529766e0a011 ("i2c: Add drivers for the AMD PCIe MP2 I2C controller")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/i2c/busses/i2c-amd-mp2-pci.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-amd-mp2-pci.c b/drivers/i2c/busses/i2c-amd-mp2-pci.c
index ef7370d3dbea..1f01d3f121ba 100644
--- a/drivers/i2c/busses/i2c-amd-mp2-pci.c
+++ b/drivers/i2c/busses/i2c-amd-mp2-pci.c
@@ -464,7 +464,9 @@ struct amd_mp2_dev *amd_mp2_find_device(void)
return NULL;
pci_dev = to_pci_dev(dev);
- return (struct amd_mp2_dev *)pci_get_drvdata(pci_dev);
+ mp2_dev = (struct amd_mp2_dev *)pci_get_drvdata(pci_dev);
+ put_device(dev);
+ return mp2_dev;
}
EXPORT_SYMBOL_GPL(amd_mp2_find_device);
--
2.17.1
In exynos_get_pmu_regmap_by_phandle(), driver_find_device_by_of_node()
utilizes driver_find_device_by_fwnode() which internally calls
driver_find_device() to locate the matching device.
driver_find_device() increments the reference count of the found
device by calling get_device(), but exynos_get_pmu_regmap_by_phandle()
fails to call put_device() to decrement the reference count before
returning. This results in a reference count leak of the device each
time exynos_get_pmu_regmap_by_phandle() is executed, which may prevent
the device from being properly released and cause a memory leak.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: 0b7c6075022c ("soc: samsung: exynos-pmu: Add regmap support for SoCs that protect PMU regs")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/soc/samsung/exynos-pmu.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/soc/samsung/exynos-pmu.c b/drivers/soc/samsung/exynos-pmu.c
index a77288f49d24..ed903a2dd416 100644
--- a/drivers/soc/samsung/exynos-pmu.c
+++ b/drivers/soc/samsung/exynos-pmu.c
@@ -302,6 +302,7 @@ struct regmap *exynos_get_pmu_regmap_by_phandle(struct device_node *np,
{
struct device_node *pmu_np;
struct device *dev;
+ struct regmap *regmap;
if (propname)
pmu_np = of_parse_phandle(np, propname, 0);
@@ -325,7 +326,10 @@ struct regmap *exynos_get_pmu_regmap_by_phandle(struct device_node *np,
if (!dev)
return ERR_PTR(-EPROBE_DEFER);
- return syscon_node_to_regmap(pmu_np);
+ regmap = syscon_node_to_regmap(pmu_np);
+ put_device(regmap);
+
+ return regmap;
}
EXPORT_SYMBOL_GPL(exynos_get_pmu_regmap_by_phandle);
--
2.17.1
driver_find_device() calls get_device() to increment the reference
count once a matching device is found, but there is no put_device() to
balance the reference count. To avoid reference count leakage, add
put_device() to decrease the reference count.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: a31500fe7055 ("drm/tegra: dc: Restore coupling of display controllers")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/gpu/drm/tegra/dc.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 59d5c1ba145a..6c84bd69b11f 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -3148,6 +3148,7 @@ static int tegra_dc_couple(struct tegra_dc *dc)
dc->client.parent = &parent->client;
dev_dbg(dc->dev, "coupled to %s\n", dev_name(companion));
+ put_device(companion);
}
return 0;
--
2.17.1
The previous commit 0718a78f6a9f ("ALSA: usb-audio: Kill timer properly at
removal") patched a UAF issue caused by the error timer.
However, because the error timer kill added in this patch occurs after the
endpoint delete, a race condition to UAF still occurs, albeit rarely.
Therefore, to prevent this, the error timer must be killed before freeing
the heap memory.
Cc: <stable(a)vger.kernel.org>
Fixes: 0718a78f6a9f ("ALSA: usb-audio: Kill timer properly at removal")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
sound/usb/midi.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/sound/usb/midi.c b/sound/usb/midi.c
index acb3bf92857c..8d15f1caa92b 100644
--- a/sound/usb/midi.c
+++ b/sound/usb/midi.c
@@ -1522,6 +1522,8 @@ static void snd_usbmidi_free(struct snd_usb_midi *umidi)
{
int i;
+ timer_shutdown_sync(&umidi->error_timer);
+
for (i = 0; i < MIDI_MAX_ENDPOINTS; ++i) {
struct snd_usb_midi_endpoint *ep = &umidi->endpoints[i];
if (ep->out)
@@ -1530,7 +1532,6 @@ static void snd_usbmidi_free(struct snd_usb_midi *umidi)
snd_usbmidi_in_endpoint_delete(ep->in);
}
mutex_destroy(&umidi->mutex);
- timer_shutdown_sync(&umidi->error_timer);
kfree(umidi);
}
--
Greetings!!
We are a 24+ yr old high tech Web Development firm with presence of over
18+ yrs in Mauritius; partners of RV TechAdvisora Ltd and headquartered in
India
We have catered to over 7000 trendy websites; are a team of 30 people.
Our Services: Domain Registrations, Webhosting, Google Workspace, Mobile
Responsive Website Designing, Wordpress Websites, Mobile Apps, Web Apps,
E-commerce websites, Google Ads, SEO, Catalogue design & affiliated services
Find below some of the ready packages we offer for making an easy selection
for the kind of Mobile Responsive HTML Website:
5 page Responsive Website @ MUR 13,499/-
10 page Responsive Website @ MUR 19,999/-
15 page Responsive Website @ MUR 25,499/-
20 page Responsive Website @ MUR 31,499/-
25 page Responsive Website @ MUR 36,499/-
30 page Responsive Website @ MUR 40,999/-
Additional page beyond 30 pages @ MUR 1199 per page
Find below some of the ready packages we offer for making an easy selection
for the kind of Mobile Responsive Wordpress Website; With Wordpress CMS,
website content can be easily maintained by your company with a backend
login.
5 page Responsive Website @ MUR 14,999/-
10 page Responsive Website @ MUR 21,999/-
15 page Responsive Website @ MUR 27,499/-
20 page Responsive Website @ MUR 32,999/-
25 page Responsive Website @ MUR 36,999/-
30 page Responsive Website @ MUR 40,999/-
Additional page beyond 30 pages @ MUR 1199 per page
Our brief website portfolio: http://www.mirackle.com/portfolio.html
Note: We are also looking for tie-ups with IT/Web design cos. who would
want to outsource work for high end Websites/Mobile APP requirements etc.
We have a team of highly skilled php coders who can cater to any complex
requirement.
India Whatsapp: +91 9323272846 / 9323551195; Mauritius WharsApp: +230 5758
5497; Email: business(a)mirackle.com ; Web: http://www.mirackle.com
Regards,
Amit Patel
driver_find_device() calls get_device() to increment the reference
count once a matching device is found. device_release_driver()
releases the driver, but it does not decrease the reference count that
was incremented by driver_find_device(). At the end of the loop, there
is no put_device() to balance the reference count. To avoid reference
count leakage, add put_device() to decrease the reference count.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: bfc653aa89cb ("perf: arm_cspmu: Separate Arm and vendor module")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/perf/arm_cspmu/arm_cspmu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index efa9b229e701..e0d4293f06f9 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -1365,8 +1365,10 @@ void arm_cspmu_impl_unregister(const struct arm_cspmu_impl_match *impl_match)
/* Unbind the driver from all matching backend devices. */
while ((dev = driver_find_device(&arm_cspmu_driver.driver, NULL,
- match, arm_cspmu_match_device)))
+ match, arm_cspmu_match_device))) {
device_release_driver(dev);
+ put_device(dev);
+ }
mutex_lock(&arm_cspmu_lock);
--
2.17.1
Here are some patches for the MPTCP PM, including some refactoring that
I thought it would be best to send at the end of a cycle to avoid
conflicts between net and net-next that could last a few weeks.
The most interesting changes are in the first and last patch, the rest
are patches refactoring the code & tests to validate the modifications.
- Patches 1 & 2: When servers set the C-flag in their MP_CAPABLE to tell
clients not to create subflows to the initial address and port -- e.g.
a deployment behind a L4 load balancer like a typical CDN deployment
-- clients will not use their other endpoints when default settings
are used. That's because the in-kernel path-manager uses the 'subflow'
endpoints to create subflows only to the initial address and port. The
first patch fixes that (for >=v5.14), and the second one validates it.
- Patches 3-14: various patches refactoring the code around the
in-kernel PM (mainly): split too long functions, rename variables and
functions to avoid confusions, reduce structure size, and compare IDs
instead of IP addresses. Note that one patch modifies one internal
variable used in one BPF selftest.
- Patch 15: ability to control endpoints that are used in reaction to a
new address announced by the other peer. With that, endpoints can be
used only once.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Notes:
- Patches 1 & 2 are sent to net-next on purpose: to delay a bit the
backports, just in case. Plus we are at the end of a cycle, and not
to delay the other refactoring patches.
- Sorry, I wanted to send this series earlier on, but due to some
unrelated issues (and holiday), it got delayed. Most patches are
pure refactoring ones.
---
Matthieu Baerts (NGI0) (15):
mptcp: pm: in-kernel: usable client side with C-flag
selftests: mptcp: join: validate C-flag + def limit
mptcp: pm: in-kernel: refactor fill_local_addresses_vec
mptcp: pm: in-kernel: refactor fill_remote_addresses_vec
mptcp: pm: rename 'subflows' to 'extra_subflows'
mptcp: pm: in-kernel: rename 'subflows_max' to 'limit_extra_subflows'
mptcp: pm: in-kernel: rename 'add_addr_signal_max' to 'endp_signal_max'
mptcp: pm: in-kernel: rename 'add_addr_accept_max' to 'limit_add_addr_accepted'
mptcp: pm: in-kernel: rename 'local_addr_max' to 'endp_subflow_max'
mptcp: pm: in-kernel: rename 'local_addr_list' to 'endp_list'
mptcp: pm: in-kernel: rename 'addrs' to 'endpoints'
mptcp: pm: in-kernel: remove stale_loss_cnt
mptcp: pm: in-kernel: reduce pernet struct size
mptcp: pm: in-kernel: compare IDs instead of addresses
mptcp: pm: in-kernel: add laminar endpoints
include/uapi/linux/mptcp.h | 11 +-
net/mptcp/pm.c | 32 +-
net/mptcp/pm_kernel.c | 569 ++++++++++++++--------
net/mptcp/pm_userspace.c | 2 +-
net/mptcp/protocol.h | 21 +-
net/mptcp/sockopt.c | 22 +-
tools/testing/selftests/bpf/progs/mptcp_subflow.c | 2 +-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 11 +
8 files changed, 441 insertions(+), 229 deletions(-)
---
base-commit: a1f1f2422e098485b09e55a492de05cf97f9954d
change-id: 20250925-net-next-mptcp-c-flag-laminar-f8442e4d4bd9
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
The patch titled
Subject: Squashfs: reject negative file sizes in squashfs_read_inode()
has been added to the -mm mm-nonmm-unstable branch. Its filename is
squashfs-reject-negative-file-sizes-in-squashfs_read_inode.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-nonmm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: Squashfs: reject negative file sizes in squashfs_read_inode()
Date: Fri, 26 Sep 2025 22:59:35 +0100
Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file
system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk
Fixes: 6545b246a2c8 ("Squashfs: inode operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+f754e01116421e9754b9(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/inode.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
--- a/fs/squashfs/inode.c~squashfs-reject-negative-file-sizes-in-squashfs_read_inode
+++ a/fs/squashfs/inode.c
@@ -145,6 +145,10 @@ int squashfs_read_inode(struct inode *in
goto failed_read;
inode->i_size = le32_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
@@ -197,6 +201,10 @@ int squashfs_read_inode(struct inode *in
goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
@@ -249,8 +257,12 @@ int squashfs_read_inode(struct inode *in
if (err < 0)
goto failed_read;
- set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
inode->i_size = le16_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
+ set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
inode->i_op = &squashfs_dir_inode_ops;
inode->i_fop = &squashfs_dir_ops;
inode->i_mode |= S_IFDIR;
@@ -273,9 +285,13 @@ int squashfs_read_inode(struct inode *in
if (err < 0)
goto failed_read;
+ inode->i_size = le32_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
xattr_id = le32_to_cpu(sqsh_ino->xattr);
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
- inode->i_size = le32_to_cpu(sqsh_ino->file_size);
inode->i_op = &squashfs_dir_inode_ops;
inode->i_fop = &squashfs_dir_ops;
inode->i_mode |= S_IFDIR;
@@ -302,7 +318,7 @@ int squashfs_read_inode(struct inode *in
goto failed_read;
inode->i_size = le32_to_cpu(sqsh_ino->symlink_size);
- if (inode->i_size > PAGE_SIZE) {
+ if (inode->i_size < 0 || inode->i_size > PAGE_SIZE) {
ERROR("Corrupted symlink\n");
return -EINVAL;
}
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
squashfs-fix-uninit-value-in-squashfs_get_parent.patch
squashfs-add-additional-inode-sanity-checking.patch
squashfs-add-seek_data-seek_hole-support.patch
squashfs-reject-negative-file-sizes-in-squashfs_read_inode.patch
This patch series enables a future version of tune2fs to be able to
modify certain parts of the ext4 superblock without to write to the
block device.
The first patch fixes a potential buffer overrun caused by a
maliciously moified superblock. The second patch adds support for
32-bit uid and gid's which can have access to the reserved blocks pool.
The last patch adds the ioctl's which will be used by tune2fs.
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
---
Changes in v2:
- fix bugs that were detected using sparse
- remove tune (unsafe) ability to clear certain compat faatures
- add the ability to set the encoding and encoding flags for case folding
- Link to v1: https://lore.kernel.org/r/20250908-tune2fs-v1-0-e3a6929f3355@mit.edu
---
Theodore Ts'o (3):
ext4: avoid potential buffer over-read in parse_apply_sb_mount_options()
ext4: add support for 32-bit default reserved uid and gid values
ext4: implemet new ioctls to set and get superblock parameters
fs/ext4/ext4.h | 16 +++-
fs/ext4/ioctl.c | 312 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
fs/ext4/super.c | 25 +++----
include/uapi/linux/ext4.h | 53 +++++++++++++
4 files changed, 382 insertions(+), 24 deletions(-)
---
base-commit: b320789d6883cc00ac78ce83bccbfe7ed58afcf0
change-id: 20250830-tune2fs-3376beb72403
Best regards,
--
Theodore Ts'o <tytso(a)mit.edu>
Make sure to drop the reference taken to the ocmem platform device when
looking up its driver data.
Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.
Also note that commit 0ff027027e05 ("soc: qcom: ocmem: Fix missing
put_device() call in of_get_ocmem") fixed the leak in a lookup error
path, but the reference is still leaking on success.
Fixes: 88c1e9404f1d ("soc: qcom: add OCMEM driver")
Cc: stable(a)vger.kernel.org # 5.5: 0ff027027e05
Cc: Brian Masney <bmasney(a)redhat.com>
Cc: Miaoqian Lin <linmq006(a)gmail.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/soc/qcom/ocmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/qcom/ocmem.c b/drivers/soc/qcom/ocmem.c
index 9c3bd37b6579..71130a2f62e9 100644
--- a/drivers/soc/qcom/ocmem.c
+++ b/drivers/soc/qcom/ocmem.c
@@ -202,9 +202,9 @@ struct ocmem *of_get_ocmem(struct device *dev)
}
ocmem = platform_get_drvdata(pdev);
+ put_device(&pdev->dev);
if (!ocmem) {
dev_err(dev, "Cannot get ocmem\n");
- put_device(&pdev->dev);
return ERR_PTR(-ENODEV);
}
return ocmem;
--
2.49.1
Hello,
This is Chenglong from Google Container Optimized OS. I'm reporting a
severe CPU hang regression that occurs after a high volume of file
creation and subsequent cgroup cleanup.
Through bisection, the issue appears to be caused by a chain reaction
between three commits related to writeback, unbound workqueues, and
CPU-hogging detection. The issue is greatly alleviated on the latest
mainline kernel but is not fully resolved, still occurring
intermittently (~1 in 10 runs).
How to reproduce
The kernel v6.1 is good. The hang is reliably triggered(over 80%
chance) on kernels v6.6 and 6.12 and intermittently on
mainline(6.17-rc7) with the following steps:
Environment: A machine with a fast SSD and a high core count (e.g.,
Google Cloud's N2-standard-128).
Workload: Concurrently generate a large number of files (e.g., 2
million) using multiple services managed by systemd-run. This creates
significant I/O and cgroup churn.
Trigger: After the file generation completes, terminate the
systemd-run services.
Result: Shortly after the services are killed, the system's CPU load
spikes, leading to a massive number of kworker/+inode_switch_wbs
threads and a system-wide hang/livelock where the machine becomes
unresponsive (20s - 300s).
Analysis and Problematic Commits
1. The initial commit: The process begins with a worker that can get
stuck busy-waiting on a spinlock.
Commit: ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Effect: This introduced the inode_switch_wbs_work_fn worker to clean
up cgroup writeback structures. Under our test load, this worker
appears to hit a highly contended wb->list_lock spinlock, causing it
to burn 100% CPU without sleeping.
2. The Kworker Explosion: A subsequent change misinterprets the
spinning worker from Stage 1, leading to a runaway feedback loop of
worker creation.
Commit: 616db8779b1e ("workqueue: Automatically mark CPU-hogging work
items CPU_INTENSIVE")
Effect: This logic sees the spinning worker, marks it as
CPU_INTENSIVE, and excludes it from concurrency management. To handle
the work backlog, it spawns a new kworker, which then also gets stuck
on the same lock, repeating the cycle. This directly causes the
kworker count to explode from <50 to 100-2000+.
3. The System-Wide Lockdown: The final piece allows this localized
worker explosion to saturate the entire system.
Commit: 8639ecebc9b1 ("workqueue: Implement non-strict affinity scope
for unbound workqueues")
Effect: This change introduced non-strict affinity as the default. It
allows the hundreds of kworkers created in Stage 2 to be spread by the
scheduler across all available CPU cores, turning the problem into a
system-wide hang.
Current Status and Mitigation
Mainline Status: On the latest mainline kernel, the hang is far less
frequent and the kworker counts are reduced back to normal (<50),
suggesting other changes have partially mitigated the issue. However,
the hang still occurs, and when it does, the kworker count still
explodes (e.g., 300+), indicating the underlying feedback loop
remains.
Workaround: A reliable mitigation is to revert to the old workqueue
behavior by setting affinity_strict to 1. This contains the kworker
proliferation to a single CPU pod, preventing the system-wide hang.
Questions
Given that the issue is not fully resolved, could you please provide
some guidance?
1. Is this a known issue, and are there patches in development that
might fully address the underlying spinlock contention or the kworker
feedback loop?
2. Is there a better long-term mitigation we can apply other than
forcing strict affinity?
Thank you for your time and help.
Best regards,
Chenglong
On Wed, Sep 24, 2025 at 05:24:15PM -0700, Chenglong Tang wrote:
> The kernel v6.1 is good. The hang is reliably triggered(over 80% chance) on
> kernels v6.6 and 6.12 and intermittently on mainline(6.17-rc7) with the
> following steps:
> -
>
> *Environment:* A machine with a fast SSD and a high core count (e.g.,
> Google Cloud's N2-standard-128).
> -
>
> *Workload:* Concurrently generate a large number of files (e.g., 2 million)
> using multiple services managed by systemd-run. This creates significant
> I/O and cgroup churn.
> -
>
> *Trigger:* After the file generation completes, terminate the systemd-run
> services.
> -
>
> *Result:* Shortly after the services are killed, the system's CPU load
> spikes, leading to a massive number of kworker/+inode_switch_wbs threads
> and a system-wide hang/livelock where the machine becomes unresponsive (20s
> - 300s).
Sounds like:
http://lkml.kernel.org/r/20250912103522.2935-1-jack@suse.cz
Can you see whether those patches resolve the problem?
Thanks.
--
tejun
Make sure to drop the reference taken to the pbs platform device when
looking up its driver data.
Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.
Fixes: 5b2dd77be1d8 ("soc: qcom: add QCOM PBS driver")
Cc: stable(a)vger.kernel.org # 6.9
Cc: Anjelique Melendez <quic_amelende(a)quicinc.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/soc/qcom/qcom-pbs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/soc/qcom/qcom-pbs.c b/drivers/soc/qcom/qcom-pbs.c
index 1cc5d045f9dd..06b4a596e275 100644
--- a/drivers/soc/qcom/qcom-pbs.c
+++ b/drivers/soc/qcom/qcom-pbs.c
@@ -173,6 +173,8 @@ struct pbs_dev *get_pbs_client_device(struct device *dev)
return ERR_PTR(-EINVAL);
}
+ platform_device_put(pdev);
+
return pbs;
}
EXPORT_SYMBOL_GPL(get_pbs_client_device);
--
2.49.1
Make sure to drop the reference taken to the mbox platform device when
looking up its driver data.
Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.
Fixes: 6e1457fcad3f ("soc: apple: mailbox: Add ASC/M3 mailbox driver")
Cc: stable(a)vger.kernel.org # 6.8
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/soc/apple/mailbox.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/soc/apple/mailbox.c b/drivers/soc/apple/mailbox.c
index 49a0955e82d6..1685da1da23d 100644
--- a/drivers/soc/apple/mailbox.c
+++ b/drivers/soc/apple/mailbox.c
@@ -299,11 +299,18 @@ struct apple_mbox *apple_mbox_get(struct device *dev, int index)
return ERR_PTR(-EPROBE_DEFER);
mbox = platform_get_drvdata(pdev);
- if (!mbox)
- return ERR_PTR(-EPROBE_DEFER);
+ if (!mbox) {
+ mbox = ERR_PTR(-EPROBE_DEFER);
+ goto out_put_pdev;
+ }
+
+ if (!device_link_add(dev, &pdev->dev, DL_FLAG_AUTOREMOVE_CONSUMER)) {
+ mbox = ERR_PTR(-ENODEV);
+ goto out_put_pdev;
+ }
- if (!device_link_add(dev, &pdev->dev, DL_FLAG_AUTOREMOVE_CONSUMER))
- return ERR_PTR(-ENODEV);
+out_put_pdev:
+ put_device(&pdev->dev);
return mbox;
}
--
2.49.1
The ACPI has ways to annotate the location of a USB device. Wire that
annotation to a v4l2 control.
To support all possible devices, add a way to annotate USB devices on DT
as well. The original binding discussion happened here:
https://lore.kernel.org/linux-devicetree/20241212-usb-orientation-v1-1-0b69…
The following patches are needed regardless if we finally add support
for orientation and rotation or not:
- media: uvcvideo: Always set default_value
- media: uvcvideo: Set a function for UVC_EXT_GPIO_UNIT
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Changes in v3:
- refactor dt bindings
- add media: uvcvideo: Use current_value for read-only controls
- get_(max|cur|def) = swentity_get_cur
- virtual_entity add codestyle
- Codestyle
- Fix xu get_info and get_len
- Drop ACPI_DEVICE_SWNODE_DEV_ROTATION
- Add missing select V4L2_FWNODE
- Link to v2: https://lore.kernel.org/r/20250605-uvc-orientation-v2-0-5710f9d030aa@chromi…
Changes in v2:
- Add support for rotation
- Rename fwnode to swentity
- Remove the patch to move the gpio file
- Remove patches already in media-committers
- Change priority of data origins
- Patch mipi-disco
- Link to v1: https://lore.kernel.org/r/20250403-uvc-orientation-v1-0-1a0cc595a62d@chromi…
---
Ricardo Ribalda (12):
media: uvcvideo: Always set default_value
media: uvcvideo: Set a function for UVC_EXT_GPIO_UNIT
media: v4l: fwnode: Support ACPI's _PLD for v4l2_fwnode_device_parse
ACPI: mipi-disco-img: Do not duplicate rotation info into swnodes
media: ipu-bridge: Use v4l2_fwnode_device_parse helper
media: ipu-bridge: Use v4l2_fwnode for unknown rotations
dt-bindings: media: Add usb-camera-module
media: uvcvideo: Add support for V4L2_CID_CAMERA_ORIENTATION
media: uvcvideo: Fill ctrl->info.selector earlier
media: uvcvideo: Add uvc_ctrl_query_entity helper
media: uvcvideo: Use current_value for read-only controls
media: uvcvideo: Add support for V4L2_CID_CAMERA_ROTATION
.../bindings/media/usb-camera-module.yaml | 46 +++++
MAINTAINERS | 1 +
drivers/acpi/mipi-disco-img.c | 15 --
drivers/media/pci/intel/Kconfig | 1 +
drivers/media/pci/intel/ipu-bridge.c | 58 +++---
drivers/media/usb/uvc/Kconfig | 1 +
drivers/media/usb/uvc/Makefile | 3 +-
drivers/media/usb/uvc/uvc_ctrl.c | 201 +++++++++++++++------
drivers/media/usb/uvc/uvc_driver.c | 22 ++-
drivers/media/usb/uvc/uvc_entity.c | 3 +-
drivers/media/usb/uvc/uvc_swentity.c | 107 +++++++++++
drivers/media/usb/uvc/uvcvideo.h | 22 +++
drivers/media/v4l2-core/v4l2-fwnode.c | 84 ++++++++-
include/acpi/acpi_bus.h | 1 -
include/linux/usb/uvc.h | 3 +
15 files changed, 441 insertions(+), 127 deletions(-)
---
base-commit: afb100a5ea7a13d7e6937dcd3b36b19dc6cc9328
change-id: 20250403-uvc-orientation-5f7f19da5adb
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
From: Conor Dooley <conor.dooley(a)microchip.com>
mpfs_gpio_direction_output() actually sets the line to input mode.
Use the correct register settings for output mode so that this function
actually works as intended.
This was a copy-paste mistake made when converting to regmap during the
driver submission process. It went unnoticed because my test for output
mode is toggling LEDs on an Icicle kit which functions with the
incorrect code. The internal reporter has yet to test the patch, but on
their system the incorrect setting may be the reason for failures to
drive the GPIO lines on the BeagleV-fire board.
CC: stable(a)vger.kernel.org
Fixes: a987b78f3615e ("gpio: mpfs: add polarfire soc gpio support")
Signed-off-by: Conor Dooley <conor.dooley(a)microchip.com>
---
CC: Conor Dooley <conor.dooley(a)microchip.com>
CC: Daire McNamara <daire.mcnamara(a)microchip.com>
CC: Cyril.Jean(a)microchip.com
CC: Linus Walleij <linus.walleij(a)linaro.org>
CC: Bartosz Golaszewski <brgl(a)bgdev.pl>
CC: linux-riscv(a)lists.infradead.org
CC: linux-gpio(a)vger.kernel.org
CC: linux-kernel(a)vger.kernel.org
---
drivers/gpio/gpio-mpfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-mpfs.c b/drivers/gpio/gpio-mpfs.c
index 82d557a7e5d8d..9468795b96348 100644
--- a/drivers/gpio/gpio-mpfs.c
+++ b/drivers/gpio/gpio-mpfs.c
@@ -69,7 +69,7 @@ static int mpfs_gpio_direction_output(struct gpio_chip *gc, unsigned int gpio_in
struct mpfs_gpio_chip *mpfs_gpio = gpiochip_get_data(gc);
regmap_update_bits(mpfs_gpio->regs, MPFS_GPIO_CTRL(gpio_index),
- MPFS_GPIO_DIR_MASK, MPFS_GPIO_EN_IN);
+ MPFS_GPIO_DIR_MASK, MPFS_GPIO_EN_OUT | MPFS_GPIO_EN_OUT_BUF);
regmap_update_bits(mpfs_gpio->regs, mpfs_gpio->offsets->outp, BIT(gpio_index),
value << gpio_index);
--
2.47.3
Hi,
Access a comprehensive, verified database of 501,853 IAA Mobility 2025 attendees and 750 exhibitors to grow your network and maximize business opportunities.
What you’ll get: Contact name, Job Title, Business Name, Physical Address, Phone Numbers, Official Email address and many more.
Delivered within 48 hours and 100% opt-in and GDPR compliant.
Special Discount: Enjoy 20% off for a limited time!
Would you prefer the entire attendee database or a customized segment tailored by geography, job role, or industry? I’ll forward the same to our Business Development Manager, who will provide you with more details about list acquisition.
Best regards,
Garnet Conwell
Sr. Marketing Manager
P.S. To opt out of future emails, simply reply “Unfollow.”
From: Max Kellermann <max.kellermann(a)ionos.com>
Commit 20d72b00ca81 ("netfs: Fix the request's work item to not
require a ref") modified netfs_alloc_request() to initialize the
reference counter to 2 instead of 1. The rationale was that the
requet's "work" would release the second reference after completion
(via netfs_{read,write}_collection_worker()). That works most of the
time if all goes well.
However, it leaks this additional reference if the request is released
before the I/O operation has been submitted: the error code path only
decrements the reference counter once and the work item will never be
queued because there will never be a completion.
This has caused outages of our whole server cluster today because
tasks were blocked in netfs_wait_for_outstanding_io(), leading to
deadlocks in Ceph (another bug that I will address soon in another
patch). This was caused by a netfs_pgpriv2_begin_copy_to_cache() call
which failed in fscache_begin_write_operation(). The leaked
netfs_io_request was never completed, leaving `netfs_inode.io_count`
with a positive value forever.
All of this is super-fragile code. Finding out which code paths will
lead to an eventual completion and which do not is hard to see:
- Some functions like netfs_create_write_req() allocate a request, but
will never submit any I/O.
- netfs_unbuffered_read_iter_locked() calls netfs_unbuffered_read()
and then netfs_put_request(); however, netfs_unbuffered_read() can
also fail early before submitting the I/O request, therefore another
netfs_put_request() call must be added there.
A rule of thumb is that functions that return a `netfs_io_request` do
not submit I/O, and all of their callers must be checked.
For my taste, the whole netfs code needs an overhaul to make reference
counting easier to understand and less fragile & obscure. But to fix
this bug here and now and produce a patch that is adequate for a
stable backport, I tried a minimal approach that quickly frees the
request object upon early failure.
I decided against adding a second netfs_put_request() each time
because that would cause code duplication which obscures the code
further. Instead, I added the function netfs_put_failed_request()
which frees such a failed request synchronously under the assumption
that the reference count is exactly 2 (as initially set by
netfs_alloc_request() and never touched), verified by a
WARN_ON_ONCE(). It then deinitializes the request object (without
going through the "cleanup_work" indirection) and frees the allocation
(with RCU protection to protect against concurrent access by
netfs_requests_seq_start()).
All code paths that fail early have been changed to call
netfs_put_failed_request() instead of netfs_put_request().
Additionally, I have added a netfs_put_request() call to
netfs_unbuffered_read() as explained above because the
netfs_put_failed_request() approach does not work there.
Fixes: 20d72b00ca81 ("netfs: Fix the request's work item to not require a ref")
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: Paulo Alcantara <pc(a)manguebit.org>
cc: netfs(a)lists.linux.dev,
cc: linux-fsdevel(a)vger.kernel.org,
cc: stable(a)vger.kernel.org
---
Changes
=======
ver #3)
- Log the refcount in the tracepoint in netfs_put_failed_request().
ver #2)
- Fix missing RCU handling in netfs_put_failed_request().
fs/netfs/buffered_read.c | 10 +++++-----
fs/netfs/direct_read.c | 7 ++++++-
fs/netfs/direct_write.c | 6 +++++-
fs/netfs/internal.h | 1 +
fs/netfs/objects.c | 28 +++++++++++++++++++++++++---
fs/netfs/read_pgpriv2.c | 2 +-
fs/netfs/read_single.c | 2 +-
fs/netfs/write_issue.c | 3 +--
8 files changed, 45 insertions(+), 14 deletions(-)
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 18b3dc74c70e..37ab6f28b5ad 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -369,7 +369,7 @@ void netfs_readahead(struct readahead_control *ractl)
return netfs_put_request(rreq, netfs_rreq_trace_put_return);
cleanup_free:
- return netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ return netfs_put_failed_request(rreq);
}
EXPORT_SYMBOL(netfs_readahead);
@@ -472,7 +472,7 @@ static int netfs_read_gaps(struct file *file, struct folio *folio)
return ret < 0 ? ret : 0;
discard:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
alloc_error:
folio_unlock(folio);
return ret;
@@ -532,7 +532,7 @@ int netfs_read_folio(struct file *file, struct folio *folio)
return ret < 0 ? ret : 0;
discard:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
alloc_error:
folio_unlock(folio);
return ret;
@@ -699,7 +699,7 @@ int netfs_write_begin(struct netfs_inode *ctx,
return 0;
error_put:
- netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(rreq);
error:
if (folio) {
folio_unlock(folio);
@@ -754,7 +754,7 @@ int netfs_prefetch_for_write(struct file *file, struct folio *folio,
return ret < 0 ? ret : 0;
error_put:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
error:
_leave(" = %d", ret);
return ret;
diff --git a/fs/netfs/direct_read.c b/fs/netfs/direct_read.c
index a05e13472baf..a498ee8d6674 100644
--- a/fs/netfs/direct_read.c
+++ b/fs/netfs/direct_read.c
@@ -131,6 +131,7 @@ static ssize_t netfs_unbuffered_read(struct netfs_io_request *rreq, bool sync)
if (rreq->len == 0) {
pr_err("Zero-sized read [R=%x]\n", rreq->debug_id);
+ netfs_put_request(rreq, netfs_rreq_trace_put_discard);
return -EIO;
}
@@ -205,7 +206,7 @@ ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_iter *i
if (user_backed_iter(iter)) {
ret = netfs_extract_user_iter(iter, rreq->len, &rreq->buffer.iter, 0);
if (ret < 0)
- goto out;
+ goto error_put;
rreq->direct_bv = (struct bio_vec *)rreq->buffer.iter.bvec;
rreq->direct_bv_count = ret;
rreq->direct_bv_unpin = iov_iter_extract_will_pin(iter);
@@ -238,6 +239,10 @@ ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_iter *i
if (ret > 0)
orig_count -= ret;
return ret;
+
+error_put:
+ netfs_put_failed_request(rreq);
+ return ret;
}
EXPORT_SYMBOL(netfs_unbuffered_read_iter_locked);
diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
index a16660ab7f83..a9d1c3b2c084 100644
--- a/fs/netfs/direct_write.c
+++ b/fs/netfs/direct_write.c
@@ -57,7 +57,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
n = netfs_extract_user_iter(iter, len, &wreq->buffer.iter, 0);
if (n < 0) {
ret = n;
- goto out;
+ goto error_put;
}
wreq->direct_bv = (struct bio_vec *)wreq->buffer.iter.bvec;
wreq->direct_bv_count = n;
@@ -101,6 +101,10 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
out:
netfs_put_request(wreq, netfs_rreq_trace_put_return);
return ret;
+
+error_put:
+ netfs_put_failed_request(wreq);
+ return ret;
}
EXPORT_SYMBOL(netfs_unbuffered_write_iter_locked);
diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index d4f16fefd965..4319611f5354 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -87,6 +87,7 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
void netfs_get_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace what);
void netfs_clear_subrequests(struct netfs_io_request *rreq);
void netfs_put_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace what);
+void netfs_put_failed_request(struct netfs_io_request *rreq);
struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq);
static inline void netfs_see_request(struct netfs_io_request *rreq,
diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index e8c99738b5bb..39d5e13f7248 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -116,10 +116,8 @@ static void netfs_free_request_rcu(struct rcu_head *rcu)
netfs_stat_d(&netfs_n_rh_rreq);
}
-static void netfs_free_request(struct work_struct *work)
+static void netfs_deinit_request(struct netfs_io_request *rreq)
{
- struct netfs_io_request *rreq =
- container_of(work, struct netfs_io_request, cleanup_work);
struct netfs_inode *ictx = netfs_inode(rreq->inode);
unsigned int i;
@@ -149,6 +147,14 @@ static void netfs_free_request(struct work_struct *work)
if (atomic_dec_and_test(&ictx->io_count))
wake_up_var(&ictx->io_count);
+}
+
+static void netfs_free_request(struct work_struct *work)
+{
+ struct netfs_io_request *rreq =
+ container_of(work, struct netfs_io_request, cleanup_work);
+
+ netfs_deinit_request(rreq);
call_rcu(&rreq->rcu, netfs_free_request_rcu);
}
@@ -167,6 +173,22 @@ void netfs_put_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace
}
}
+/*
+ * Free a request (synchronously) that was just allocated but has
+ * failed before it could be submitted.
+ */
+void netfs_put_failed_request(struct netfs_io_request *rreq)
+{
+ /* new requests have two references (see
+ * netfs_alloc_request(), and this function is only allowed on
+ * new request objects
+ */
+ WARN_ON_ONCE(refcount_read(&rreq->ref) != 2);
+
+ trace_netfs_rreq_ref(rreq->debug_id, 0, netfs_rreq_trace_put_failed);
+ netfs_free_request(&rreq->cleanup_work);
+}
+
/*
* Allocate and partially initialise an I/O request structure.
*/
diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index 8097bc069c1d..a1489aa29f78 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -118,7 +118,7 @@ static struct netfs_io_request *netfs_pgpriv2_begin_copy_to_cache(
return creq;
cancel_put:
- netfs_put_request(creq, netfs_rreq_trace_put_return);
+ netfs_put_failed_request(creq);
cancel:
rreq->copy_to_cache = ERR_PTR(-ENOBUFS);
clear_bit(NETFS_RREQ_FOLIO_COPY_TO_CACHE, &rreq->flags);
diff --git a/fs/netfs/read_single.c b/fs/netfs/read_single.c
index fa622a6cd56d..5c0dc4efc792 100644
--- a/fs/netfs/read_single.c
+++ b/fs/netfs/read_single.c
@@ -189,7 +189,7 @@ ssize_t netfs_read_single(struct inode *inode, struct file *file, struct iov_ite
return ret;
cleanup_free:
- netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(rreq);
return ret;
}
EXPORT_SYMBOL(netfs_read_single);
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index 0584cba1a043..dd8743bc8d7f 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -133,8 +133,7 @@ struct netfs_io_request *netfs_create_write_req(struct address_space *mapping,
return wreq;
nomem:
- wreq->error = -ENOMEM;
- netfs_put_request(wreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(wreq);
return ERR_PTR(-ENOMEM);
}
A chip freeze is observed on i.MX7D when PCIe RC kicks off the PM_PME
message and no any devices are connected on the port.
To workaroud such kind of issue, skip PME_Turn_Off message if there is
no endpoint connected.
Cc: stable(a)vger.kernel.org
Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume method")
Signed-off-by: Richard Zhu <hongxing.zhu(a)nxp.com>
Reviewed-by: Frank Li <Frank.Li(a)nxp.com>
---
drivers/pci/controller/dwc/pcie-designware-host.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 57a1ba08c427..b303a74b0fd7 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -1008,12 +1008,15 @@ int dw_pcie_suspend_noirq(struct dw_pcie *pci)
u32 val;
int ret;
- if (pci->pp.ops->pme_turn_off) {
- pci->pp.ops->pme_turn_off(&pci->pp);
- } else {
- ret = dw_pcie_pme_turn_off(pci);
- if (ret)
- return ret;
+ /* Skip PME_Turn_Off message if there is no endpoint connected */
+ if (dw_pcie_get_ltssm(pci) > DW_PCIE_LTSSM_DETECT_WAIT) {
+ if (pci->pp.ops->pme_turn_off) {
+ pci->pp.ops->pme_turn_off(&pci->pp);
+ } else {
+ ret = dw_pcie_pme_turn_off(pci);
+ if (ret)
+ return ret;
+ }
}
if (dwc_quirk(pci, QUIRK_NOL2POLL_IN_PM)) {
--
2.37.1
The quilt patch titled
Subject: mm/damon/sysfs: do not ignore callback's return value in damon_sysfs_damon_call()
has been removed from the -mm tree. Its filename was
mm-damon-sysfs-do-not-ignore-callbacks-return-value-in-damon_sysfs_damon_call.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Akinobu Mita <akinobu.mita(a)gmail.com>
Subject: mm/damon/sysfs: do not ignore callback's return value in damon_sysfs_damon_call()
Date: Sat, 20 Sep 2025 22:25:46 +0900
The callback return value is ignored in damon_sysfs_damon_call(), which
means that it is not possible to detect invalid user input when writing
commands such as 'commit' to
/sys/kernel/mm/damon/admin/kdamonds/<K>/state. Fix it.
Link: https://lkml.kernel.org/r/20250920132546.5822-1-akinobu.mita@gmail.com
Fixes: f64539dcdb87 ("mm/damon/sysfs: use damon_call() for update_schemes_stats")
Signed-off-by: Akinobu Mita <akinobu.mita(a)gmail.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.14+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/damon/sysfs.c~mm-damon-sysfs-do-not-ignore-callbacks-return-value-in-damon_sysfs_damon_call
+++ a/mm/damon/sysfs.c
@@ -1592,12 +1592,14 @@ static int damon_sysfs_damon_call(int (*
struct damon_sysfs_kdamond *kdamond)
{
struct damon_call_control call_control = {};
+ int err;
if (!kdamond->damon_ctx)
return -EINVAL;
call_control.fn = fn;
call_control.data = kdamond;
- return damon_call(kdamond->damon_ctx, &call_control);
+ err = damon_call(kdamond->damon_ctx, &call_control);
+ return err ? err : call_control.return_code;
}
struct damon_sysfs_schemes_walk_data {
_
Patches currently in -mm which might be from akinobu.mita(a)gmail.com are
The quilt patch titled
Subject: fs/proc/task_mmu: check p->vec_buf for NULL
has been removed from the -mm tree. Its filename was
fs-proc-task_mmu-check-p-vec_buf-for-null.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Jakub Acs <acsjakub(a)amazon.de>
Subject: fs/proc/task_mmu: check p->vec_buf for NULL
Date: Mon, 22 Sep 2025 08:22:05 +0000
When the PAGEMAP_SCAN ioctl is invoked with vec_len = 0 reaches
pagemap_scan_backout_range(), kernel panics with null-ptr-deref:
[ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none)
[ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80
<snip registers, unreliable trace>
[ 44.946828] Call Trace:
[ 44.947030] <TASK>
[ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0
[ 44.952593] walk_pmd_range.isra.0+0x302/0x910
[ 44.954069] walk_pud_range.isra.0+0x419/0x790
[ 44.954427] walk_p4d_range+0x41e/0x620
[ 44.954743] walk_pgd_range+0x31e/0x630
[ 44.955057] __walk_page_range+0x160/0x670
[ 44.956883] walk_page_range_mm+0x408/0x980
[ 44.958677] walk_page_range+0x66/0x90
[ 44.958984] do_pagemap_scan+0x28d/0x9c0
[ 44.961833] do_pagemap_cmd+0x59/0x80
[ 44.962484] __x64_sys_ioctl+0x18d/0x210
[ 44.962804] do_syscall_64+0x5b/0x290
[ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are
allocated and p->vec_buf remains set to NULL.
This breaks an assumption made later in pagemap_scan_backout_range(), that
page_region is always allocated for p->vec_buf_index.
Fix it by explicitly checking p->vec_buf for NULL before dereferencing.
Other sites that might run into same deref-issue are already (directly or
transitively) protected by checking p->vec_buf.
Note:
From PAGEMAP_SCAN man page, it seems vec_len = 0 is valid when no output
is requested and it's only the side effects caller is interested in,
hence it passes check in pagemap_scan_get_args().
This issue was found by syzkaller.
Link: https://lkml.kernel.org/r/20250922082206.6889-1-acsjakub@amazon.de
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Reviewed-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Penglei Jiang <superman.xpt(a)gmail.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Andrei Vagin <avagin(a)gmail.com>
Cc: "Micha�� Miros��aw" <mirq-linux(a)rere.qmqm.pl>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/proc/task_mmu.c~fs-proc-task_mmu-check-p-vec_buf-for-null
+++ a/fs/proc/task_mmu.c
@@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(s
{
struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
+ if (!p->vec_buf)
+ return;
+
if (cur_buf->start != addr)
cur_buf->end = addr;
else
_
Patches currently in -mm which might be from acsjakub(a)amazon.de are
The quilt patch titled
Subject: kmsan: fix out-of-bounds access to shadow memory
has been removed from the -mm tree. Its filename was
kmsan-fix-out-of-bounds-access-to-shadow-memory.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Eric Biggers <ebiggers(a)kernel.org>
Subject: kmsan: fix out-of-bounds access to shadow memory
Date: Thu, 11 Sep 2025 12:58:58 -0700
Running sha224_kunit on a KMSAN-enabled kernel results in a crash in
kmsan_internal_set_shadow_origin():
BUG: unable to handle page fault for address: ffffbc3840291000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1810067 P4D 1810067 PUD 192d067 PMD 3c17067 PTE 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 81 Comm: kunit_try_catch Tainted: G N 6.17.0-rc3 #10 PREEMPT(voluntary)
Tainted: [N]=TEST
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmsan_internal_set_shadow_origin+0x91/0x100
[...]
Call Trace:
<TASK>
__msan_memset+0xee/0x1a0
sha224_final+0x9e/0x350
test_hash_buffer_overruns+0x46f/0x5f0
? kmsan_get_shadow_origin_ptr+0x46/0xa0
? __pfx_test_hash_buffer_overruns+0x10/0x10
kunit_try_run_case+0x198/0xa00
This occurs when memset() is called on a buffer that is not 4-byte aligned
and extends to the end of a guard page, i.e. the next page is unmapped.
The bug is that the loop at the end of kmsan_internal_set_shadow_origin()
accesses the wrong shadow memory bytes when the address is not 4-byte
aligned. Since each 4 bytes are associated with an origin, it rounds the
address and size so that it can access all the origins that contain the
buffer. However, when it checks the corresponding shadow bytes for a
particular origin, it incorrectly uses the original unrounded shadow
address. This results in reads from shadow memory beyond the end of the
buffer's shadow memory, which crashes when that memory is not mapped.
To fix this, correctly align the shadow address before accessing the 4
shadow bytes corresponding to each origin.
Link: https://lkml.kernel.org/r/20250911195858.394235-1-ebiggers@kernel.org
Fixes: 2ef3cec44c60 ("kmsan: do not wipe out origin when doing partial unpoisoning")
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
Tested-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Alexander Potapenko <glider(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kmsan/core.c | 10 +++++++---
mm/kmsan/kmsan_test.c | 16 ++++++++++++++++
2 files changed, 23 insertions(+), 3 deletions(-)
--- a/mm/kmsan/core.c~kmsan-fix-out-of-bounds-access-to-shadow-memory
+++ a/mm/kmsan/core.c
@@ -195,7 +195,8 @@ void kmsan_internal_set_shadow_origin(vo
u32 origin, bool checked)
{
u64 address = (u64)addr;
- u32 *shadow_start, *origin_start;
+ void *shadow_start;
+ u32 *aligned_shadow, *origin_start;
size_t pad = 0;
KMSAN_WARN_ON(!kmsan_metadata_is_contiguous(addr, size));
@@ -214,9 +215,12 @@ void kmsan_internal_set_shadow_origin(vo
}
__memset(shadow_start, b, size);
- if (!IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ if (IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ aligned_shadow = shadow_start;
+ } else {
pad = address % KMSAN_ORIGIN_SIZE;
address -= pad;
+ aligned_shadow = shadow_start - pad;
size += pad;
}
size = ALIGN(size, KMSAN_ORIGIN_SIZE);
@@ -230,7 +234,7 @@ void kmsan_internal_set_shadow_origin(vo
* corresponding shadow slot is zero.
*/
for (int i = 0; i < size / KMSAN_ORIGIN_SIZE; i++) {
- if (origin || !shadow_start[i])
+ if (origin || !aligned_shadow[i])
origin_start[i] = origin;
}
}
--- a/mm/kmsan/kmsan_test.c~kmsan-fix-out-of-bounds-access-to-shadow-memory
+++ a/mm/kmsan/kmsan_test.c
@@ -556,6 +556,21 @@ DEFINE_TEST_MEMSETXX(16)
DEFINE_TEST_MEMSETXX(32)
DEFINE_TEST_MEMSETXX(64)
+/* Test case: ensure that KMSAN does not access shadow memory out of bounds. */
+static void test_memset_on_guarded_buffer(struct kunit *test)
+{
+ void *buf = vmalloc(PAGE_SIZE);
+
+ kunit_info(test,
+ "memset() on ends of guarded buffer should not crash\n");
+
+ for (size_t size = 0; size <= 128; size++) {
+ memset(buf, 0xff, size);
+ memset(buf + PAGE_SIZE - size, 0xff, size);
+ }
+ vfree(buf);
+}
+
static noinline void fibonacci(int *array, int size, int start)
{
if (start < 2 || (start == size))
@@ -677,6 +692,7 @@ static struct kunit_case kmsan_test_case
KUNIT_CASE(test_memset16),
KUNIT_CASE(test_memset32),
KUNIT_CASE(test_memset64),
+ KUNIT_CASE(test_memset_on_guarded_buffer),
KUNIT_CASE(test_long_origin_chain),
KUNIT_CASE(test_stackdepot_roundtrip),
KUNIT_CASE(test_unpoison_memory),
_
Patches currently in -mm which might be from ebiggers(a)kernel.org are
The quilt patch titled
Subject: mm/hugetlb: fix folio is still mapped when deleted
has been removed from the -mm tree. Its filename was
mm-hugetlb-fix-folio-is-still-mapped-when-deleted.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Jinjiang Tu <tujinjiang(a)huawei.com>
Subject: mm/hugetlb: fix folio is still mapped when deleted
Date: Fri, 12 Sep 2025 15:41:39 +0800
Migration may be raced with fallocating hole. remove_inode_single_folio
will unmap the folio if the folio is still mapped. However, it's called
without folio lock. If the folio is migrated and the mapped pte has been
converted to migration entry, folio_mapped() returns false, and won't
unmap it. Due to extra refcount held by remove_inode_single_folio,
migration fails, restores migration entry to normal pte, and the folio is
mapped again. As a result, we triggered BUG in filemap_unaccount_folio.
The log is as follows:
BUG: Bad page cache in process hugetlb pfn:156c00
page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00
head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0
aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file"
flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff)
page_type: f4(hugetlb)
page dumped because: still mapped when deleted
CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Call Trace:
<TASK>
dump_stack_lvl+0x4f/0x70
filemap_unaccount_folio+0xc4/0x1c0
__filemap_remove_folio+0x38/0x1c0
filemap_remove_folio+0x41/0xd0
remove_inode_hugepages+0x142/0x250
hugetlbfs_fallocate+0x471/0x5a0
vfs_fallocate+0x149/0x380
Hold folio lock before checking if the folio is mapped to avold race with
migration.
Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com
Fixes: 4aae8d1c051e ("mm/hugetlbfs: unmap pages if page fault raced with hole punch")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
--- a/fs/hugetlbfs/inode.c~mm-hugetlb-fix-folio-is-still-mapped-when-deleted
+++ a/fs/hugetlbfs/inode.c
@@ -517,14 +517,16 @@ static bool remove_inode_single_folio(st
/*
* If folio is mapped, it was faulted in after being
- * unmapped in caller. Unmap (again) while holding
- * the fault mutex. The mutex will prevent faults
- * until we finish removing the folio.
+ * unmapped in caller or hugetlb_vmdelete_list() skips
+ * unmapping it due to fail to grab lock. Unmap (again)
+ * while holding the fault mutex. The mutex will prevent
+ * faults until we finish removing the folio. Hold folio
+ * lock to guarantee no concurrent migration.
*/
+ folio_lock(folio);
if (unlikely(folio_mapped(folio)))
hugetlb_unmap_file_folio(h, mapping, folio, index);
- folio_lock(folio);
/*
* We must remove the folio from page cache before removing
* the region/ reserve map (hugetlb_unreserve_pages). In
_
Patches currently in -mm which might be from tujinjiang(a)huawei.com are
The quilt patch titled
Subject: hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
has been removed from the -mm tree. Its filename was
hugetlbfs-skip-vmas-without-shareable-locks-in-hugetlb_vmdelete_list.patch
This patch was dropped because it had testing failures
------------------------------------------------------
From: Deepanshu Kartikey <kartikey406(a)gmail.com>
Subject: hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
Date: Thu, 25 Sep 2025 20:19:32 +0530
hugetlb_vmdelete_list() uses trylock to acquire VMA locks during truncate
operations. As per the original design in commit 40549ba8f8e0 ("hugetlb:
use new vma_lock for pmd sharing synchronization"), if the trylock fails
or the VMA has no lock, it should skip that VMA. Any remaining mapped
pages are handled by remove_inode_hugepages() which is called after
hugetlb_vmdelete_list() and uses proper lock ordering to guarantee
unmapping success.
Currently, when hugetlb_vma_trylock_write() returns success (1) for VMAs
without shareable locks, the code proceeds to call unmap_hugepage_range().
This causes assertion failures in huge_pmd_unshare() ���
hugetlb_vma_assert_locked() because no lock is actually held:
WARNING: CPU: 1 PID: 6594 Comm: syz.0.28 Not tainted
Call Trace:
hugetlb_vma_assert_locked+0x1dd/0x250
huge_pmd_unshare+0x2c8/0x540
__unmap_hugepage_range+0x6e3/0x1aa0
unmap_hugepage_range+0x32e/0x410
hugetlb_vmdelete_list+0x189/0x1f0
Fix by explicitly skipping VMAs without shareable locks after trylock
succeeds, consistent with the original design where such VMAs are deferred
to remove_inode_hugepages() for proper handling.
Link: https://lkml.kernel.org/r/20250925144934.150299-1-kartikey406@gmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
Reported-by: syzbot+f26d7c75c26ec19790e7(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=f26d7c75c26ec19790e7
Fixes: 40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
Tested-by: syzbot+f26d7c75c26ec19790e7(a)syzkaller.appspotmail.com
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/hugetlbfs/inode.c~hugetlbfs-skip-vmas-without-shareable-locks-in-hugetlb_vmdelete_list
+++ a/fs/hugetlbfs/inode.c
@@ -487,7 +487,8 @@ hugetlb_vmdelete_list(struct rb_root_cac
if (!hugetlb_vma_trylock_write(vma))
continue;
-
+ if (!__vma_shareable_lock(vma))
+ continue;
v_start = vma_offset_start(vma, start);
v_end = vma_offset_end(vma, end);
_
Patches currently in -mm which might be from kartikey406(a)gmail.com are
The quilt patch titled
Subject: mm/memblock: correct totalram_pages accounting with KMSAN
has been removed from the -mm tree. Its filename was
mm-memblock-correct-totalram_pages-accounting-with-kmsan.patch
This patch was dropped because an updated version will be issued
------------------------------------------------------
From: Alexander Potapenko <glider(a)google.com>
Subject: mm/memblock: correct totalram_pages accounting with KMSAN
Date: Wed, 24 Sep 2025 12:03:01 +0200
When KMSAN is enabled, `kmsan_memblock_free_pages()` can hold back pages
for metadata instead of returning them to the early allocator. The
callers, however, would unconditionally increment `totalram_pages`,
assuming the pages were always freed. This resulted in an incorrect
calculation of the total available RAM, causing the kernel to believe it
had more memory than it actually did.
This patch refactors `memblock_free_pages()` to return the number of pages
it successfully frees. If KMSAN stashes the pages, the function now
returns 0; otherwise, it returns the number of pages in the block.
The callers in `memblock.c` have been updated to use this return value,
ensuring that `totalram_pages` is incremented only by the number of pages
actually returned to the allocator. This corrects the total RAM
accounting when KMSAN is active.
Link: https://lkml.kernel.org/r/20250924100301.1558645-1-glider@google.com
Fixes: 3c2065098260 ("init: kmsan: call KMSAN initialization routines")
Signed-off-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: Aleksandr Nogikh <nogikh(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: Markus Elfring <Markus.Elfring(a)web.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/internal.h | 4 ++--
mm/memblock.c | 21 +++++++++++----------
mm/mm_init.c | 9 +++++----
3 files changed, 18 insertions(+), 16 deletions(-)
--- a/mm/internal.h~mm-memblock-correct-totalram_pages-accounting-with-kmsan
+++ a/mm/internal.h
@@ -742,8 +742,8 @@ static inline void clear_zone_contiguous
extern int __isolate_free_page(struct page *page, unsigned int order);
extern void __putback_isolated_page(struct page *page, unsigned int order,
int mt);
-extern void memblock_free_pages(struct page *page, unsigned long pfn,
- unsigned int order);
+unsigned long memblock_free_pages(struct page *page, unsigned long pfn,
+ unsigned int order);
extern void __free_pages_core(struct page *page, unsigned int order,
enum meminit_context context);
--- a/mm/memblock.c~mm-memblock-correct-totalram_pages-accounting-with-kmsan
+++ a/mm/memblock.c
@@ -1826,6 +1826,7 @@ void *__init __memblock_alloc_or_panic(p
void __init memblock_free_late(phys_addr_t base, phys_addr_t size)
{
phys_addr_t cursor, end;
+ unsigned long freed_pages = 0;
end = base + size - 1;
memblock_dbg("%s: [%pa-%pa] %pS\n",
@@ -1834,10 +1835,9 @@ void __init memblock_free_late(phys_addr
cursor = PFN_UP(base);
end = PFN_DOWN(base + size);
- for (; cursor < end; cursor++) {
- memblock_free_pages(pfn_to_page(cursor), cursor, 0);
- totalram_pages_inc();
- }
+ for (; cursor < end; cursor++)
+ freed_pages += memblock_free_pages(pfn_to_page(cursor), cursor, 0);
+ totalram_pages_add(freed_pages);
}
/*
@@ -2259,9 +2259,11 @@ static void __init free_unused_memmap(vo
#endif
}
-static void __init __free_pages_memory(unsigned long start, unsigned long end)
+static unsigned long __init __free_pages_memory(unsigned long start,
+ unsigned long end)
{
int order;
+ unsigned long freed = 0;
while (start < end) {
/*
@@ -2279,14 +2281,15 @@ static void __init __free_pages_memory(u
while (start + (1UL << order) > end)
order--;
- memblock_free_pages(pfn_to_page(start), start, order);
+ freed += memblock_free_pages(pfn_to_page(start), start, order);
start += (1UL << order);
}
+ return freed;
}
static unsigned long __init __free_memory_core(phys_addr_t start,
- phys_addr_t end)
+ phys_addr_t end)
{
unsigned long start_pfn = PFN_UP(start);
unsigned long end_pfn = PFN_DOWN(end);
@@ -2297,9 +2300,7 @@ static unsigned long __init __free_memor
if (start_pfn >= end_pfn)
return 0;
- __free_pages_memory(start_pfn, end_pfn);
-
- return end_pfn - start_pfn;
+ return __free_pages_memory(start_pfn, end_pfn);
}
static void __init memmap_init_reserved_pages(void)
--- a/mm/mm_init.c~mm-memblock-correct-totalram_pages-accounting-with-kmsan
+++ a/mm/mm_init.c
@@ -2547,24 +2547,25 @@ void *__init alloc_large_system_hash(con
return table;
}
-void __init memblock_free_pages(struct page *page, unsigned long pfn,
- unsigned int order)
+unsigned long __init memblock_free_pages(struct page *page, unsigned long pfn,
+ unsigned int order)
{
if (IS_ENABLED(CONFIG_DEFERRED_STRUCT_PAGE_INIT)) {
int nid = early_pfn_to_nid(pfn);
if (!early_page_initialised(pfn, nid))
- return;
+ return 0;
}
if (!kmsan_memblock_free_pages(page, order)) {
/* KMSAN will take care of these pages. */
- return;
+ return 0;
}
/* pages were reserved and not allocated */
clear_page_tag_ref(page);
__free_pages_core(page, order, MEMINIT_EARLY);
+ return 1UL << order;
}
DEFINE_STATIC_KEY_MAYBE(CONFIG_INIT_ON_ALLOC_DEFAULT_ON, init_on_alloc);
_
Patches currently in -mm which might be from glider(a)google.com are
The patch titled
Subject: hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
has been added to the -mm mm-new branch. Its filename is
hugetlbfs-skip-vmas-without-shareable-locks-in-hugetlb_vmdelete_list.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Deepanshu Kartikey <kartikey406(a)gmail.com>
Subject: hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
Date: Thu, 25 Sep 2025 20:19:32 +0530
hugetlb_vmdelete_list() uses trylock to acquire VMA locks during truncate
operations. As per the original design in commit 40549ba8f8e0 ("hugetlb:
use new vma_lock for pmd sharing synchronization"), if the trylock fails
or the VMA has no lock, it should skip that VMA. Any remaining mapped
pages are handled by remove_inode_hugepages() which is called after
hugetlb_vmdelete_list() and uses proper lock ordering to guarantee
unmapping success.
Currently, when hugetlb_vma_trylock_write() returns success (1) for VMAs
without shareable locks, the code proceeds to call unmap_hugepage_range().
This causes assertion failures in huge_pmd_unshare() ���
hugetlb_vma_assert_locked() because no lock is actually held:
WARNING: CPU: 1 PID: 6594 Comm: syz.0.28 Not tainted
Call Trace:
hugetlb_vma_assert_locked+0x1dd/0x250
huge_pmd_unshare+0x2c8/0x540
__unmap_hugepage_range+0x6e3/0x1aa0
unmap_hugepage_range+0x32e/0x410
hugetlb_vmdelete_list+0x189/0x1f0
Fix by explicitly skipping VMAs without shareable locks after trylock
succeeds, consistent with the original design where such VMAs are deferred
to remove_inode_hugepages() for proper handling.
Link: https://lkml.kernel.org/r/20250925144934.150299-1-kartikey406@gmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
Reported-by: syzbot+f26d7c75c26ec19790e7(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=f26d7c75c26ec19790e7
Fixes: 40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
Tested-by: syzbot+f26d7c75c26ec19790e7(a)syzkaller.appspotmail.com
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/hugetlbfs/inode.c~hugetlbfs-skip-vmas-without-shareable-locks-in-hugetlb_vmdelete_list
+++ a/fs/hugetlbfs/inode.c
@@ -487,7 +487,8 @@ hugetlb_vmdelete_list(struct rb_root_cac
if (!hugetlb_vma_trylock_write(vma))
continue;
-
+ if (!__vma_shareable_lock(vma))
+ continue;
v_start = vma_offset_start(vma, start);
v_end = vma_offset_end(vma, end);
_
Patches currently in -mm which might be from kartikey406(a)gmail.com are
hugetlbfs-skip-vmas-without-shareable-locks-in-hugetlb_vmdelete_list.patch
When PAGEMAP_SCAN ioctl invoked with vec_len = 0 reaches
pagemap_scan_backout_range(), kernel panics with null-ptr-deref:
[ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none)
[ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80
<snip registers, unreliable trace>
[ 44.946828] Call Trace:
[ 44.947030] <TASK>
[ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0
[ 44.952593] walk_pmd_range.isra.0+0x302/0x910
[ 44.954069] walk_pud_range.isra.0+0x419/0x790
[ 44.954427] walk_p4d_range+0x41e/0x620
[ 44.954743] walk_pgd_range+0x31e/0x630
[ 44.955057] __walk_page_range+0x160/0x670
[ 44.956883] walk_page_range_mm+0x408/0x980
[ 44.958677] walk_page_range+0x66/0x90
[ 44.958984] do_pagemap_scan+0x28d/0x9c0
[ 44.961833] do_pagemap_cmd+0x59/0x80
[ 44.962484] __x64_sys_ioctl+0x18d/0x210
[ 44.962804] do_syscall_64+0x5b/0x290
[ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are
allocated and p->vec_buf remains set to NULL.
This breaks an assumption made later in pagemap_scan_backout_range(),
that page_region is always allocated for p->vec_buf_index.
Fix it by explicitly checking p->vec_buf for NULL before dereferencing.
Other sites that might run into same deref-issue are already (directly
or transitively) protected by checking p->vec_buf.
Note:
From PAGEMAP_SCAN man page, it seems vec_len = 0 is valid when no output
is requested and it's only the side effects caller is interested in,
hence it passes check in pagemap_scan_get_args().
This issue was found by syzkaller.
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Penglei Jiang <superman.xpt(a)gmail.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Andrei Vagin <avagin(a)gmail.com>
Cc: "Michał Mirosław" <mirq-linux(a)rere.qmqm.pl>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-fsdevel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
fs/proc/task_mmu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 29cca0e6d0ff..b26ae556b446 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(struct pagemap_scan_private *p,
{
struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
+ if (!p->vec_buf)
+ return;
+
if (cur_buf->start != addr)
cur_buf->end = addr;
else
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
devm_kcalloc() may fail. ndtest_probe() allocates three DMA address
arrays (dcr_dma, label_dma, dimm_dma) and later unconditionally uses
them in ndtest_nvdimm_init(), which can lead to a NULL pointer
dereference under low-memory conditions.
Check all three allocations and return -ENOMEM if any allocation fails,
jumping to the common error path. Do not emit an extra error message
since the allocator already warns on allocation failure.
Fixes: 9399ab61ad82 ("ndtest: Add dimms to the two buses")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
changelog:
v3:
- Add NULL checks for all three devm_kcalloc() calls and goto the common
error label on failure.
v2:
- Drop pr_err() on allocation failure; only NULL-check and return -ENOMEM.
- No other changes.
---
tools/testing/nvdimm/test/ndtest.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/nvdimm/test/ndtest.c b/tools/testing/nvdimm/test/ndtest.c
index 68a064ce598c..8e3b6be53839 100644
--- a/tools/testing/nvdimm/test/ndtest.c
+++ b/tools/testing/nvdimm/test/ndtest.c
@@ -850,11 +850,22 @@ static int ndtest_probe(struct platform_device *pdev)
p->dcr_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
sizeof(dma_addr_t), GFP_KERNEL);
+ if (!p->dcr_dma) {
+ rc = -ENOMEM;
+ goto err;
+ }
p->label_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
sizeof(dma_addr_t), GFP_KERNEL);
+ if (!p->label_dma) {
+ rc = -ENOMEM;
+ goto err;
+ }
p->dimm_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
sizeof(dma_addr_t), GFP_KERNEL);
-
+ if (!p->dimm_dma) {
+ rc = -ENOMEM;
+ goto err;
+ }
rc = ndtest_nvdimm_init(p);
if (rc)
goto err;
--
2.43.0
I have a couple more fixes I'm testing but the issues have
been with us for a long time, and they come from
code review not from the field IIUC so no rush I think.
The following changes since commit 76eeb9b8de9880ca38696b2fb56ac45ac0a25c6c:
Linux 6.17-rc5 (2025-09-07 14:22:57 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
for you to fetch changes up to cde7e7c3f8745a61458cea61aa28f37c3f5ae2b4:
MAINTAINERS, mailmap: Update address for Peter Hilber (2025-09-21 17:44:20 -0400)
----------------------------------------------------------------
virtio,vhost: last minute fixes
More small fixes. Most notably this fixes crashes and hangs in
vhost-net.
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
----------------------------------------------------------------
Alok Tiwari (1):
vhost-scsi: fix argument order in tport allocation error message
Alyssa Ross (1):
virtio_config: clarify output parameters
Ashwini Sahu (1):
uapi: vduse: fix typo in comment
Jason Wang (2):
vhost-net: unbreak busy polling
vhost-net: flush batched before enabling notifications
Michael S. Tsirkin (1):
Revert "vhost/net: Defer TX queue re-enable until after sendmsg"
Peter Hilber (1):
MAINTAINERS, mailmap: Update address for Peter Hilber
Sebastian Andrzej Siewior (1):
vhost: Take a reference on the task in struct vhost_task.
.mailmap | 1 +
MAINTAINERS | 2 +-
drivers/vhost/net.c | 40 +++++++++++++++++-----------------------
drivers/vhost/scsi.c | 2 +-
include/linux/virtio_config.h | 11 ++++++-----
include/uapi/linux/vduse.h | 2 +-
kernel/vhost_task.c | 3 ++-
7 files changed, 29 insertions(+), 32 deletions(-)
Hi Sasha,
As the commit message notice, this patch should not be applied to kernel
v6.4 or before. I would like you to exclude this from your queue for the
following versions:
* Patch "firewire: core: fix overlooked update of subsystem ABI version" has been added to the 5.4-stable tree
* Patch "firewire: core: fix overlooked update of subsystem ABI version" has been added to the 5.10-stable tree
* Patch "firewire: core: fix overlooked update of subsystem ABI version" has been added to the 5.15-stable tree
* Patch "firewire: core: fix overlooked update of subsystem ABI version" has been added to the 6.1-stable tree
Thankss
Takashi Sakamoto
On Thu, Sep 25, 2025 at 07:33:23AM -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> firewire: core: fix overlooked update of subsystem ABI version
>
> to the 6.1-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> firewire-core-fix-overlooked-update-of-subsystem-abi.patch
> and it can be found in the queue-6.1 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit bbb4ab1d7b1ad7fff7a85aa9144d0edc5a70bacc
> Author: Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
> Date: Sat Sep 20 11:51:48 2025 +0900
>
> firewire: core: fix overlooked update of subsystem ABI version
>
> [ Upstream commit 853a57ba263adfecf4430b936d6862bc475b4bb5 ]
>
> In kernel v6.5, several functions were added to the cdev layer. This
> required updating the default version of subsystem ABI up to 6, but
> this requirement was overlooked.
>
> This commit updates the version accordingly.
>
> Fixes: 6add87e9764d ("firewire: cdev: add new version of ABI to notify time stamp at request/response subaction of transaction#")
> Link: https://lore.kernel.org/r/20250920025148.163402-1-o-takashi@sakamocchi.jp
> Signed-off-by: Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c
> index 958aa4662ccb0..5cb0059f57e6b 100644
> --- a/drivers/firewire/core-cdev.c
> +++ b/drivers/firewire/core-cdev.c
> @@ -39,7 +39,7 @@
> /*
> * ABI version history is documented in linux/firewire-cdev.h.
> */
> -#define FW_CDEV_KERNEL_VERSION 5
> +#define FW_CDEV_KERNEL_VERSION 6
> #define FW_CDEV_VERSION_EVENT_REQUEST2 4
> #define FW_CDEV_VERSION_ALLOCATE_REGION_END 4
> #define FW_CDEV_VERSION_AUTO_FLUSH_ISO_OVERFLOW 5
The SMBus block read path trusts the device-provided count byte and
copies that many bytes from the master buffer:
buf[0] = readb(p3);
read_count = buf[0];
memcpy_fromio(&buf[1], p3 + 1, read_count);
Without validating 'read_count', a malicious or misbehaving device can
cause an out-of-bounds write to the caller's buffer and may also trigger
out-of-range MMIO reads beyond the controller's buffer window.
SMBus Block Read returns up to 32 data bytes as per the kernel
documentation, so clamp the length to [1, I2C_SMBUS_BLOCK_MAX], verify
the caller's buffer has at least 'read_count + 1' bytes available, and
defensively ensure it does not exceed the controller buffer. Also break
out of the chunking loop after a successful SMBus read.
Return -EPROTO for invalid counts and -EMSGSIZE when the provided buffer
is too small.
Fixes: 361693697249 ("i2c: microchip: pci1xxxx: Add driver for I2C host controller in multifunction endpoint of pci1xxxx switch")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
drivers/i2c/busses/i2c-mchp-pci1xxxx.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-mchp-pci1xxxx.c b/drivers/i2c/busses/i2c-mchp-pci1xxxx.c
index 5ef136c3ecb1..2307c8ec2dc7 100644
--- a/drivers/i2c/busses/i2c-mchp-pci1xxxx.c
+++ b/drivers/i2c/busses/i2c-mchp-pci1xxxx.c
@@ -880,7 +880,22 @@ static int pci1xxxx_i2c_read(struct pci1xxxx_i2c *i2c, u8 slaveaddr,
}
if (i2c->flags & I2C_FLAGS_SMB_BLK_READ) {
- buf[0] = readb(p3);
+ u8 cnt = readb(p3);
+
+ if (!cnt || cnt > I2C_SMBUS_BLOCK_MAX) {
+ retval = -EPROTO;
+ goto cleanup;
+ }
+ if (cnt > total_len - 1) {
+ retval = -EMSGSIZE;
+ goto cleanup;
+ }
+ if (cnt > (SMBUS_BUF_MAX_SIZE - 1)) {
+ retval = -EOVERFLOW;
+ goto cleanup;
+ }
+
+ buf[0] = cnt;
read_count = buf[0];
memcpy_fromio(&buf[1], p3 + 1, read_count);
} else {
--
2.43.0
The pmsr_lock spinlock used to be necessary to synchronize access to the
PMSR register, because that access could have been triggered from either
config space access in rcar_pcie_config_access() or an exception handler
rcar_pcie_aarch32_abort_handler().
The rcar_pcie_aarch32_abort_handler() case is no longer applicable since
commit 6e36203bc14c ("PCI: rcar: Use PCI_SET_ERROR_RESPONSE after read
which triggered an exception"), which performs more accurate, controlled
invocation of the exception, and a fixup.
This leaves rcar_pcie_config_access() as the only call site from which
rcar_pcie_wakeup() is called. The rcar_pcie_config_access() can only be
called from the controller struct pci_ops .read and .write callbacks,
and those are serialized in drivers/pci/access.c using raw spinlock
'pci_lock' . CONFIG_PCI_LOCKLESS_CONFIG is never set on this platform.
Since the 'pci_lock' is a raw spinlock , and the 'pmsr_lock' is not a
raw spinlock, this constellation triggers 'BUG: Invalid wait context'
with CONFIG_PROVE_RAW_LOCK_NESTING=y .
Remove the pmsr_lock to fix the locking.
Fixes: a115b1bd3af0 ("PCI: rcar: Add L1 link state fix into data abort hook")
Reported-by: Duy Nguyen <duy.nguyen.rh(a)renesas.com>
Reported-by: Thuan Nguyen <thuan.nguyen-hong(a)banvien.com.vn>
Cc: stable(a)vger.kernel.org
Signed-off-by: Marek Vasut <marek.vasut+renesas(a)mailbox.org>
---
=============================
[ BUG: Invalid wait context ]
6.17.0-rc4-next-20250905-00048-ga08e553145e7-dirty #1116 Not tainted
-----------------------------
swapper/0/1 is trying to lock:
ffffffd92cf69c30 (pmsr_lock){....}-{3:3}, at: rcar_pcie_config_access+0x48/0x260
other info that might help us debug this:
context-{5:5}
3 locks held by swapper/0/1:
#0: ffffff84c0f890f8 (&dev->mutex){....}-{4:4}, at: device_lock+0x14/0x1c
#1: ffffffd92cf675b0 (pci_rescan_remove_lock){+.+.}-{4:4}, at: pci_lock_rescan_remove+0x18/0x20
#2: ffffffd92cf674a0 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x54/0xd8
stack backtrace:
CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc4-next-20250905-00048-ga08e553145e7-dirty #1116 PREEMPT
Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
Call trace:
dump_backtrace+0x6c/0x7c (C)
show_stack+0x14/0x1c
dump_stack_lvl+0x68/0x8c
dump_stack+0x14/0x1c
__lock_acquire+0x3e8/0x1064
lock_acquire+0x17c/0x2ac
_raw_spin_lock_irqsave+0x54/0x70
rcar_pcie_config_access+0x48/0x260
rcar_pcie_read_conf+0x44/0xd8
pci_bus_read_config_dword+0x78/0xd8
pci_bus_generic_read_dev_vendor_id+0x30/0x138
pci_bus_read_dev_vendor_id+0x60/0x68
pci_scan_single_device+0x11c/0x1ec
pci_scan_slot+0x7c/0x170
pci_scan_child_bus_extend+0x5c/0x29c
pci_scan_child_bus+0x10/0x18
pci_scan_root_bus_bridge+0x90/0xc8
pci_host_probe+0x24/0xc4
rcar_pcie_probe+0x5e8/0x650
platform_probe+0x58/0x88
really_probe+0x190/0x350
__driver_probe_device+0x120/0x138
driver_probe_device+0x38/0xec
__driver_attach+0x158/0x168
bus_for_each_dev+0x7c/0xd0
driver_attach+0x20/0x28
bus_add_driver+0xe0/0x1d8
driver_register+0xac/0xe8
__platform_driver_register+0x1c/0x24
rcar_pcie_driver_init+0x18/0x20
do_one_initcall+0xd4/0x220
kernel_init_freeable+0x308/0x30c
kernel_init+0x20/0x11c
ret_from_fork+0x10/0x20
---
Cc: "Krzysztof Wilczyński" <kwilczynski(a)kernel.org>
Cc: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: Geert Uytterhoeven <geert+renesas(a)glider.be>
Cc: Lorenzo Pieralisi <lpieralisi(a)kernel.org>
Cc: Magnus Damm <magnus.damm(a)gmail.com>
Cc: Manivannan Sadhasivam <mani(a)kernel.org>
Cc: Marc Zyngier <maz(a)kernel.org>
Cc: Rob Herring <robh(a)kernel.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Cc: linux-pci(a)vger.kernel.org
Cc: linux-renesas-soc(a)vger.kernel.org
---
drivers/pci/controller/pcie-rcar-host.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)
diff --git a/drivers/pci/controller/pcie-rcar-host.c b/drivers/pci/controller/pcie-rcar-host.c
index 4780e0109e583..625a00f3b2230 100644
--- a/drivers/pci/controller/pcie-rcar-host.c
+++ b/drivers/pci/controller/pcie-rcar-host.c
@@ -52,20 +52,13 @@ struct rcar_pcie_host {
int (*phy_init_fn)(struct rcar_pcie_host *host);
};
-static DEFINE_SPINLOCK(pmsr_lock);
-
static int rcar_pcie_wakeup(struct device *pcie_dev, void __iomem *pcie_base)
{
- unsigned long flags;
u32 pmsr, val;
int ret = 0;
- spin_lock_irqsave(&pmsr_lock, flags);
-
- if (!pcie_base || pm_runtime_suspended(pcie_dev)) {
- ret = -EINVAL;
- goto unlock_exit;
- }
+ if (!pcie_base || pm_runtime_suspended(pcie_dev))
+ return -EINVAL;
pmsr = readl(pcie_base + PMSR);
@@ -87,8 +80,6 @@ static int rcar_pcie_wakeup(struct device *pcie_dev, void __iomem *pcie_base)
writel(L1FAEG | PMEL1RX, pcie_base + PMSR);
}
-unlock_exit:
- spin_unlock_irqrestore(&pmsr_lock, flags);
return ret;
}
--
2.51.0
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during probe_device().
Fixes: b17336c55d89 ("iommu/mediatek: add support for mtk iommu generation one HW")
Cc: stable(a)vger.kernel.org # 4.8
Cc: Honghui Zhang <honghui.zhang(a)mediatek.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/iommu/mtk_iommu_v1.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 10cc0b1197e8..de9153c0a82f 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -435,6 +435,8 @@ static int mtk_iommu_v1_create_mapping(struct device *dev,
return -EINVAL;
dev_iommu_priv_set(dev, platform_get_drvdata(m4updev));
+
+ put_device(&m4updev->dev);
}
ret = iommu_fwspec_add_ids(dev, args->args, 1);
--
2.49.1
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during of_xlate().
Fixes: 7b2d59611fef ("iommu/ipmmu-vmsa: Replace local utlb code with fwspec ids")
Cc: stable(a)vger.kernel.org # 4.14
Cc: Magnus Damm <damm+renesas(a)opensource.se>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/iommu/ipmmu-vmsa.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index ffa892f65714..02a2a55ffa0a 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -720,6 +720,8 @@ static int ipmmu_init_platform_device(struct device *dev,
dev_iommu_priv_set(dev, platform_get_drvdata(ipmmu_pdev));
+ put_device(&ipmmu_pdev->dev);
+
return 0;
}
--
2.49.1
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during of_xlate().
Fixes: aa759fd376fb ("iommu/exynos: Add callback for initializing devices from device tree")
Cc: stable(a)vger.kernel.org # 4.2
Cc: Marek Szyprowski <m.szyprowski(a)samsung.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/iommu/exynos-iommu.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index b6edd178fe25..ce9e935cb84c 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1446,17 +1446,14 @@ static int exynos_iommu_of_xlate(struct device *dev,
return -ENODEV;
data = platform_get_drvdata(sysmmu);
- if (!data) {
- put_device(&sysmmu->dev);
+ put_device(&sysmmu->dev);
+ if (!data)
return -ENODEV;
- }
if (!owner) {
owner = kzalloc(sizeof(*owner), GFP_KERNEL);
- if (!owner) {
- put_device(&sysmmu->dev);
+ if (!owner)
return -ENOMEM;
- }
INIT_LIST_HEAD(&owner->controllers);
mutex_init(&owner->rpm_lock);
--
2.49.1
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during of_xlate().
Note that commit e2eae09939a8 ("iommu/qcom: add missing put_device()
call in qcom_iommu_of_xlate()") fixed the leak in a couple of error
paths, but the reference is still leaking on success and late failures.
Fixes: 0ae349a0f33f ("iommu/qcom: Add qcom_iommu")
Cc: stable(a)vger.kernel.org # 4.14: e2eae09939a8
Cc: Rob Clark <robin.clark(a)oss.qualcomm.com>
Cc: Yu Kuai <yukuai3(a)huawei.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/iommu/arm/arm-smmu/qcom_iommu.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index c5be95e56031..9c1166a3af6c 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -565,14 +565,14 @@ static int qcom_iommu_of_xlate(struct device *dev,
qcom_iommu = platform_get_drvdata(iommu_pdev);
+ put_device(&iommu_pdev->dev);
+
/* make sure the asid specified in dt is valid, so we don't have
* to sanity check this elsewhere:
*/
if (WARN_ON(asid > qcom_iommu->max_asid) ||
- WARN_ON(qcom_iommu->ctxs[asid] == NULL)) {
- put_device(&iommu_pdev->dev);
+ WARN_ON(qcom_iommu->ctxs[asid] == NULL))
return -EINVAL;
- }
if (!dev_iommu_priv_get(dev)) {
dev_iommu_priv_set(dev, qcom_iommu);
@@ -581,10 +581,8 @@ static int qcom_iommu_of_xlate(struct device *dev,
* multiple different iommu devices. Multiple context
* banks are ok, but multiple devices are not:
*/
- if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev))) {
- put_device(&iommu_pdev->dev);
+ if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev)))
return -EINVAL;
- }
}
return iommu_fwspec_add_ids(dev, &asid, 1);
--
2.49.1
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during of_xlate().
Fixes: 46d1fb072e76 ("iommu/dart: Add DART iommu driver")
Cc: stable(a)vger.kernel.org # 5.15
Cc: Sven Peter <sven(a)kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/iommu/apple-dart.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index 190f28d76615..1aa7c10262a8 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -790,6 +790,8 @@ static int apple_dart_of_xlate(struct device *dev,
struct apple_dart *cfg_dart;
int i, sid;
+ put_device(&iommu_pdev->dev);
+
if (args->args_count != 1)
return -EINVAL;
sid = args->args[0];
--
2.49.1
From: Shawn Guo <shawnguo(a)kernel.org>
A regression is seen with 6.6 -> 6.12 kernel upgrade on platforms where
cpufreq-dt driver sets cpuinfo.transition_latency as CPUFREQ_ETERNAL (-1),
due to that platform's DT doesn't provide the optional property
'clock-latency-ns'. The dbs sampling_rate was 10000 us on 6.6 and
suddently becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these
platforms, because the default transition delay was dropped by the commits
below.
commit 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER")
commit a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us")
commit e13aa799c2a6 ("cpufreq: Change default transition delay to 2ms")
It slows down dbs governor's reacting to CPU loading change
dramatically. Also, as transition_delay_us is used by schedutil governor
as rate_limit_us, it shows a negative impact on device idle power
consumption, because the device gets slightly less time in the lowest OPP.
Fix the regressions by defining a default transition latency for
handling the case of CPUFREQ_ETERNAL.
Cc: stable(a)vger.kernel.org
Fixes: 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER")
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
---
Changes for v2:
- Follow Rafael's suggestion to define a default transition latency for
handling CPUFREQ_ETERNAL, and pave the way to get rid of
CPUFREQ_ETERNAL completely later.
v1: https://lkml.org/lkml/2025/9/10/294
drivers/cpufreq/cpufreq.c | 3 +++
include/linux/cpufreq.h | 2 ++
2 files changed, 5 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index fc7eace8b65b..c69d10f0e8ec 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -549,6 +549,9 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
if (policy->transition_delay_us)
return policy->transition_delay_us;
+ if (policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL)
+ policy->cpuinfo.transition_latency = CPUFREQ_DEFAULT_TANSITION_LATENCY_NS;
+
latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
if (latency)
/* Give a 50% breathing room between updates */
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 95f3807c8c55..935e9a660039 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -36,6 +36,8 @@
/* Print length for names. Extra 1 space for accommodating '\n' in prints */
#define CPUFREQ_NAME_PLEN (CPUFREQ_NAME_LEN + 1)
+#define CPUFREQ_DEFAULT_TANSITION_LATENCY_NS NSEC_PER_MSEC
+
struct cpufreq_governor;
enum cpufreq_table_sorting {
--
2.43.0
Commit 20d72b00ca81 ("netfs: Fix the request's work item to not
require a ref") modified netfs_alloc_request() to initialize the
reference counter to 2 instead of 1. The rationale was that the
requet's "work" would release the second reference after completion
(via netfs_{read,write}_collection_worker()). That works most of the
time if all goes well.
However, it leaks this additional reference if the request is released
before the I/O operation has been submitted: the error code path only
decrements the reference counter once and the work item will never be
queued because there will never be a completion.
This has caused outages of our whole server cluster today because
tasks were blocked in netfs_wait_for_outstanding_io(), leading to
deadlocks in Ceph (another bug that I will address soon in another
patch). This was caused by a netfs_pgpriv2_begin_copy_to_cache() call
which failed in fscache_begin_write_operation(). The leaked
netfs_io_request was never completed, leaving `netfs_inode.io_count`
with a positive value forever.
All of this is super-fragile code. Finding out which code paths will
lead to an eventual completion and which do not is hard to see:
- Some functions like netfs_create_write_req() allocate a request, but
will never submit any I/O.
- netfs_unbuffered_read_iter_locked() calls netfs_unbuffered_read()
and then netfs_put_request(); however, netfs_unbuffered_read() can
also fail early before submitting the I/O request, therefore another
netfs_put_request() call must be added there.
A rule of thumb is that functions that return a `netfs_io_request` do
not submit I/O, and all of their callers must be checked.
For my taste, the whole netfs code needs an overhaul to make reference
counting easier to understand and less fragile & obscure. But to fix
this bug here and now and produce a patch that is adequate for a
stable backport, I tried a minimal approach that quickly frees the
request object upon early failure.
I decided against adding a second netfs_put_request() each time
because that would cause code duplication which obscures the code
further. Instead, I added the function netfs_put_failed_request()
which frees such a failed request synchronously under the assumption
that the reference count is exactly 2 (as initially set by
netfs_alloc_request() and never touched), verified by a
WARN_ON_ONCE(). It then deinitializes the request object (without
going through the "cleanup_work" indirection) and frees the allocation
(with RCU protection to protect against concurrent access by
netfs_requests_seq_start()).
All code paths that fail early have been changed to call
netfs_put_failed_request() instead of netfs_put_request().
Additionally, I have added a netfs_put_request() call to
netfs_unbuffered_read() as explained above because the
netfs_put_failed_request() approach does not work there.
Fixes: 20d72b00ca81 ("netfs: Fix the request's work item to not require a ref")
Cc: stable(a)vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
---
v1->v2: free the request with call_rcu() because a proc reader might
be accessing it (suggested by David Howells)
---
fs/netfs/buffered_read.c | 10 +++++-----
fs/netfs/direct_read.c | 7 ++++++-
fs/netfs/direct_write.c | 6 +++++-
fs/netfs/internal.h | 1 +
fs/netfs/objects.c | 28 +++++++++++++++++++++++++---
fs/netfs/read_pgpriv2.c | 2 +-
fs/netfs/read_single.c | 2 +-
fs/netfs/write_issue.c | 3 +--
8 files changed, 45 insertions(+), 14 deletions(-)
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 18b3dc74c70e..37ab6f28b5ad 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -369,7 +369,7 @@ void netfs_readahead(struct readahead_control *ractl)
return netfs_put_request(rreq, netfs_rreq_trace_put_return);
cleanup_free:
- return netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ return netfs_put_failed_request(rreq);
}
EXPORT_SYMBOL(netfs_readahead);
@@ -472,7 +472,7 @@ static int netfs_read_gaps(struct file *file, struct folio *folio)
return ret < 0 ? ret : 0;
discard:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
alloc_error:
folio_unlock(folio);
return ret;
@@ -532,7 +532,7 @@ int netfs_read_folio(struct file *file, struct folio *folio)
return ret < 0 ? ret : 0;
discard:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
alloc_error:
folio_unlock(folio);
return ret;
@@ -699,7 +699,7 @@ int netfs_write_begin(struct netfs_inode *ctx,
return 0;
error_put:
- netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(rreq);
error:
if (folio) {
folio_unlock(folio);
@@ -754,7 +754,7 @@ int netfs_prefetch_for_write(struct file *file, struct folio *folio,
return ret < 0 ? ret : 0;
error_put:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
error:
_leave(" = %d", ret);
return ret;
diff --git a/fs/netfs/direct_read.c b/fs/netfs/direct_read.c
index a05e13472baf..a498ee8d6674 100644
--- a/fs/netfs/direct_read.c
+++ b/fs/netfs/direct_read.c
@@ -131,6 +131,7 @@ static ssize_t netfs_unbuffered_read(struct netfs_io_request *rreq, bool sync)
if (rreq->len == 0) {
pr_err("Zero-sized read [R=%x]\n", rreq->debug_id);
+ netfs_put_request(rreq, netfs_rreq_trace_put_discard);
return -EIO;
}
@@ -205,7 +206,7 @@ ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_iter *i
if (user_backed_iter(iter)) {
ret = netfs_extract_user_iter(iter, rreq->len, &rreq->buffer.iter, 0);
if (ret < 0)
- goto out;
+ goto error_put;
rreq->direct_bv = (struct bio_vec *)rreq->buffer.iter.bvec;
rreq->direct_bv_count = ret;
rreq->direct_bv_unpin = iov_iter_extract_will_pin(iter);
@@ -238,6 +239,10 @@ ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_iter *i
if (ret > 0)
orig_count -= ret;
return ret;
+
+error_put:
+ netfs_put_failed_request(rreq);
+ return ret;
}
EXPORT_SYMBOL(netfs_unbuffered_read_iter_locked);
diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
index a16660ab7f83..a9d1c3b2c084 100644
--- a/fs/netfs/direct_write.c
+++ b/fs/netfs/direct_write.c
@@ -57,7 +57,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
n = netfs_extract_user_iter(iter, len, &wreq->buffer.iter, 0);
if (n < 0) {
ret = n;
- goto out;
+ goto error_put;
}
wreq->direct_bv = (struct bio_vec *)wreq->buffer.iter.bvec;
wreq->direct_bv_count = n;
@@ -101,6 +101,10 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
out:
netfs_put_request(wreq, netfs_rreq_trace_put_return);
return ret;
+
+error_put:
+ netfs_put_failed_request(wreq);
+ return ret;
}
EXPORT_SYMBOL(netfs_unbuffered_write_iter_locked);
diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index d4f16fefd965..4319611f5354 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -87,6 +87,7 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
void netfs_get_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace what);
void netfs_clear_subrequests(struct netfs_io_request *rreq);
void netfs_put_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace what);
+void netfs_put_failed_request(struct netfs_io_request *rreq);
struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq);
static inline void netfs_see_request(struct netfs_io_request *rreq,
diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index e8c99738b5bb..39d5e13f7248 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -116,10 +116,8 @@ static void netfs_free_request_rcu(struct rcu_head *rcu)
netfs_stat_d(&netfs_n_rh_rreq);
}
-static void netfs_free_request(struct work_struct *work)
+static void netfs_deinit_request(struct netfs_io_request *rreq)
{
- struct netfs_io_request *rreq =
- container_of(work, struct netfs_io_request, cleanup_work);
struct netfs_inode *ictx = netfs_inode(rreq->inode);
unsigned int i;
@@ -149,6 +147,14 @@ static void netfs_free_request(struct work_struct *work)
if (atomic_dec_and_test(&ictx->io_count))
wake_up_var(&ictx->io_count);
+}
+
+static void netfs_free_request(struct work_struct *work)
+{
+ struct netfs_io_request *rreq =
+ container_of(work, struct netfs_io_request, cleanup_work);
+
+ netfs_deinit_request(rreq);
call_rcu(&rreq->rcu, netfs_free_request_rcu);
}
@@ -167,6 +173,22 @@ void netfs_put_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace
}
}
+/*
+ * Free a request (synchronously) that was just allocated but has
+ * failed before it could be submitted.
+ */
+void netfs_put_failed_request(struct netfs_io_request *rreq)
+{
+ /* new requests have two references (see
+ * netfs_alloc_request(), and this function is only allowed on
+ * new request objects
+ */
+ WARN_ON_ONCE(refcount_read(&rreq->ref) != 2);
+
+ trace_netfs_rreq_ref(rreq->debug_id, 0, netfs_rreq_trace_put_failed);
+ netfs_free_request(&rreq->cleanup_work);
+}
+
/*
* Allocate and partially initialise an I/O request structure.
*/
diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index 8097bc069c1d..a1489aa29f78 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -118,7 +118,7 @@ static struct netfs_io_request *netfs_pgpriv2_begin_copy_to_cache(
return creq;
cancel_put:
- netfs_put_request(creq, netfs_rreq_trace_put_return);
+ netfs_put_failed_request(creq);
cancel:
rreq->copy_to_cache = ERR_PTR(-ENOBUFS);
clear_bit(NETFS_RREQ_FOLIO_COPY_TO_CACHE, &rreq->flags);
diff --git a/fs/netfs/read_single.c b/fs/netfs/read_single.c
index fa622a6cd56d..5c0dc4efc792 100644
--- a/fs/netfs/read_single.c
+++ b/fs/netfs/read_single.c
@@ -189,7 +189,7 @@ ssize_t netfs_read_single(struct inode *inode, struct file *file, struct iov_ite
return ret;
cleanup_free:
- netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(rreq);
return ret;
}
EXPORT_SYMBOL(netfs_read_single);
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index 0584cba1a043..dd8743bc8d7f 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -133,8 +133,7 @@ struct netfs_io_request *netfs_create_write_req(struct address_space *mapping,
return wreq;
nomem:
- wreq->error = -ENOMEM;
- netfs_put_request(wreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(wreq);
return ERR_PTR(-ENOMEM);
}
--
2.47.3
Hi ,
What would it mean to you if your business was able to reduce Expenses by 20%
(Clients: Littelfuse, Corsair, BMB, Mercedes-Benz, Fantac)
We are a PCBA factory with an area of 6,000 square meters. We have been in this industry for 18 years and have an experienced team of engineers.
Help you reduce BOM Expenses Fast delivery (15 days for Demo) Competitive prices (10% lower than peers) Real factory processing fees are Fees Complete quality management system (ISO9001,ISO14001,ISO13485,IATF16949,UL)Given how well our pcba service suits your needs, I think we could do some Excellent work together.
Seven LeeChief Technology Officer
Business Department | Shenzhen STHL Technology Co,Ltd
+8618569002840 Seven(a)pcba-china.com
在2025-06-04,Seven <seven(a)ems-sthi.com> 写道:-----原始邮件-----
发件人: Seven <seven(a)ems-sthi.com>
发件时间: 2025年06月04日 周三
收件人: [Linux-stable-mirror <linux-stable-mirror(a)lists.linaro.org>]
主题: Re:Jordan recommend me get in touch
Hi,
Glad to know you and your company from Jordan.
I‘m Seven CTO of STHL We are a one-stop service provider for PCBA. We can help you with production from PCB to finished product assembly.
Why Partner With Us?
✅ One-Stop Expertise: From PCB fabrication, PCBA (SMT & Through-Hole), custom cable harnesses, , to final product assembly – we eliminate multi-vendor coordination risks.
✅ Cost Efficiency: 40%+ clients reduce logistics/QC costs through our integrated service model (ISO 9001:2015 certified).
✅ Speed-to-Market: Average 15% faster lead times achieved via in-house vertical integration.
Recent Success Case:
Helped a German IoT startup scale from prototype to 50K-unit/month production within 6 months through our:
PCB Design-for-Manufacturing (DFM) optimization Automated PCBA with 99.98% first-pass yield Mechanical housing CNC machining & IP67-rated assembly
Seven Marcus CTO
Shenzhen STHL Technology Co,Ltd
+8618569002840 Seven(a)pcba-china.com