Linux-stable-mirror July 2025

linux-stable-mirror@lists.linaro.org

532 participants
1246 discussions

[REGRESSION] thunderbolt: Fix a logic error in wake on connect

by Alyssa Ross

On Fri, Apr 11, 2025 at 10:14:44AM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello(a)amd.com>
>
> commit a5cfc9d65879c ("thunderbolt: Add wake on connect/disconnect
> on USB4 ports") introduced a sysfs file to control wake up policy
> for a given USB4 port that defaulted to disabled.
>
> However when testing commit 4bfeea6ec1c02 ("thunderbolt: Use wake
> on connect and disconnect over suspend") I found that it was working
> even without making changes to the power/wakeup file (which defaults
> to disabled). This is because of a logic error doing a bitwise or
> of the wake-on-connect flag with device_may_wakeup() which should
> have been a logical AND.
>
> Adjust the logic so that policy is only applied when wakeup is
> actually enabled.
>
> Fixes: a5cfc9d65879c ("thunderbolt: Add wake on connect/disconnect on USB4 ports")
> Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>

Hi! There have been a couple of reports of a Thunderbolt regression in
recent stable kernels, and one reporter has now bisected it to this
change:

• https://bugzilla.kernel.org/show_bug.cgi?id=220284
• https://github.com/NixOS/nixpkgs/issues/420730

Both reporters are CCed, and say it starts working after the module is
reloaded.

Link: https://lore.kernel.org/r/bug-220284-208809@https.bugzilla.kernel.org%2F/ (for regzbot)

> ---
>  drivers/thunderbolt/usb4.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/thunderbolt/usb4.c b/drivers/thunderbolt/usb4.c
> index e51d01671d8e7..3e96f1afd4268 100644
> --- a/drivers/thunderbolt/usb4.c
> +++ b/drivers/thunderbolt/usb4.c
> @@ -440,10 +440,10 @@ int usb4_switch_set_wake(struct tb_switch *sw, unsigned int flags)
>              bool configured = val & PORT_CS_19_PC;
>              usb4 = port->usb4;
>
> -            if (((flags & TB_WAKE_ON_CONNECT) |
> +            if (((flags & TB_WAKE_ON_CONNECT) &&
>                    device_may_wakeup(&usb4->dev)) && !configured)
>                  val |= PORT_CS_19_WOC;
> -            if (((flags & TB_WAKE_ON_DISCONNECT) |
> +            if (((flags & TB_WAKE_ON_DISCONNECT) &&
>                    device_may_wakeup(&usb4->dev)) && configured)
>                  val |= PORT_CS_19_WOD;
>              if ((flags & TB_WAKE_ON_USB4) && configured)
> --
> 2.43.0
>
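The difference between the two operators in the quoted hunk can be checked in isolation. The following is a minimal user-space sketch, not the driver code: the `DEMO_*` flag values and both function names are hypothetical, and `may_wakeup` merely stands in for the result of device_may_wakeup().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration; the kernel's definitions may differ. */
#define DEMO_WAKE_ON_CONNECT    (1u << 0)
#define DEMO_WAKE_ON_DISCONNECT (1u << 1)

static_assert(DEMO_WAKE_ON_CONNECT != DEMO_WAKE_ON_DISCONNECT,
              "the two demo flags must be distinct bits");

/* Pre-patch shape: bitwise OR, so either operand alone arms wake-on-connect. */
static bool demo_woc_bitwise_or(unsigned int flags, bool may_wakeup, bool configured)
{
    return ((flags & DEMO_WAKE_ON_CONNECT) | may_wakeup) && !configured;
}

/* Post-patch shape: logical AND, so both the flag and the wakeup policy are required. */
static bool demo_woc_logical_and(unsigned int flags, bool may_wakeup, bool configured)
{
    return ((flags & DEMO_WAKE_ON_CONNECT) && may_wakeup) && !configured;
}
```

With the flag set and wakeup disabled, the OR form still evaluates to true while the AND form does not, which matches the behaviour described in the quoted commit message.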

5 months, 2 weeks

Re: [PATCH 6.6 087/150] Input: atkbd - skip ATKBD_CMD_GETID in translated mode

by Wang Hai

On 1970/1/1 8:00, wrote:
> 6.6-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Hans de Goede <hdegoede(a)redhat.com>
>
> [ Upstream commit 936e4d49ecbc8c404790504386e1422b599dec39 ]
>
> There have been multiple reports of keyboard issues on recent laptop models
> which can be worked around by setting i8042.dumbkbd, with the downside
> being this breaks the capslock LED.
>
> It seems that these issues are caused by recent laptops getting confused by
> ATKBD_CMD_GETID. Rather than adding an endlessly growing list of quirks for
> this, just skip ATKBD_CMD_GETID altogether on laptops in translated mode.
>
> The main goal of sending ATKBD_CMD_GETID is to skip binding to ps/2
> mice/touchpads and those are never used in translated mode.
>
> Examples of laptop models which benefit from skipping ATKBD_CMD_GETID:
>
> * HP Laptop 15s-fq2xxx, HP laptop 15s-fq4xxx and HP Laptop 15-dy2xxx
>   models the kbd stops working for the first 2 - 5 minutes after boot
>   (waiting for EC watchdog reset)
>
> * On HP Spectre x360 13-aw2xxx atkbd fails to probe the keyboard
>
> * At least 9 different Lenovo models have issues with ATKBD_CMD_GETID, see:
>   https://github.com/yescallop/atkbd-nogetid
>
> This has been tested on:
>
> 1. A MSI B550M PRO-VDH WIFI desktop, where the i8042 controller is not
>    in translated mode when no keyboard is plugged in and with a ps/2 kbd
>    a AT Translated Set 2 keyboard /dev/input/event# node shows up
>
> 2. A Lenovo ThinkPad X1 Yoga gen 8 (always has a translated set 2 keyboard)
>
> Reported-by: Shang Ye <yesh25(a)mail2.sysu.edu.cn>
> Closes: https://lore.kernel.org/linux-input/886D6167733841AE+20231017135318.11142-1-yesh25(a)mail2.sysu.edu.cn
> Closes: https://github.com/yescallop/atkbd-nogetid
> Reported-by: gurevitch <mail(a)gurevit.ch>
> Closes: https://lore.kernel.org/linux-input/2iAJTwqZV6lQs26cTb38RNYqxvsink6SRmrZ5h0cBUSuf9NT0tZTsf9fEAbbto2maavHJEOP8GA1evlKa6xjKOsaskDhtJWxjcnrgPigzVo=(a)gurevit.ch
> Reported-by: Egor Ignatov <egori(a)altlinux.org>
> Closes: https://lore.kernel.org/all/20210609073333.8425-1-egori(a)altlinux.org
> Reported-by: Anton Zhilyaev <anton(a)cpp.in>
> Closes: https://lore.kernel.org/linux-input/20210201160336.16008-1-anton(a)cpp.in
> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2086156
> Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
> Link: https://lore.kernel.org/r/20231115174625.7462-1-hdegoede(a)redhat.com
> Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
> ---

Hi, Hans

I noticed there's a subsequent bugfix [1] for this patch, but it hasn't
been merged into the stable-6.6 branch. Based on the bugfix description,
the issue should exist there as well. Would you like this patch to be
merged into the stable-6.6 branch?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

5 months, 2 weeks

[PATCH v6.1] vhost-scsi: protect vq->log_used with vq->mutex

by Xinyu Zheng

From: Dongli Zhang <dongli.zhang(a)oracle.com>

[ Upstream commit f591cf9fce724e5075cc67488c43c6e39e8cbe27 ]

The vhost-scsi completion path may access vq->log_base when vq->log_used is
already set to false.

    vhost-thread                       QEMU-thread

vhost_scsi_complete_cmd_work()
-> vhost_add_used()
   -> vhost_add_used_n()
      if (unlikely(vq->log_used))
                                      QEMU disables vq->log_used
                                      via VHOST_SET_VRING_ADDR.
                                      mutex_lock(&vq->mutex);
                                      vq->log_used = false now!
                                      mutex_unlock(&vq->mutex);

                                      QEMU gfree(vq->log_base)
        log_used()
        -> log_write(vq->log_base)

Assuming the VMM is QEMU. The vq->log_base is from QEMU userspace and can be
reclaimed via gfree(). As a result, this causes invalid memory writes to
QEMU userspace.

The control queue path has the same issue.

Cc: stable(a)vger.kernel.org#6.1.x
Cc: gregkh(a)linuxfoundation.org
Signed-off-by: Dongli Zhang <dongli.zhang(a)oracle.com>
Acked-by: Jason Wang <jasowang(a)redhat.com>
Reviewed-by: Mike Christie <michael.christie(a)oracle.com>
Message-Id: <20250403063028.16045-2-dongli.zhang(a)oracle.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[ Conflicts in drivers/vhost/scsi.c because vhost_scsi_complete_cmd_work()
  has been refactored. ]
Signed-off-by: Xinyu Zheng <zhengxinyu6(a)huawei.com>
---
 drivers/vhost/scsi.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 3077cb9d58d6..87f2f56fd20a 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -568,8 +568,10 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
         ret = copy_to_iter(&v_rsp, sizeof(v_rsp), &iov_iter);
         if (likely(ret == sizeof(v_rsp))) {
             struct vhost_scsi_virtqueue *q;
-            vhost_add_used(cmd->tvc_vq, cmd->tvc_vq_desc, 0);
             q = container_of(cmd->tvc_vq, struct vhost_scsi_virtqueue, vq);
+            mutex_lock(&q->vq.mutex);
+            vhost_add_used(cmd->tvc_vq, cmd->tvc_vq_desc, 0);
+            mutex_unlock(&q->vq.mutex);
             vq = q - vs->vqs;
             __set_bit(vq, vs->compl_bitmap);
         } else
@@ -1173,8 +1175,11 @@ static void vhost_scsi_tmf_resp_work(struct vhost_work *work)
     else
         resp_code = VIRTIO_SCSI_S_FUNCTION_REJECTED;

+    mutex_lock(&tmf->svq->vq.mutex);
     vhost_scsi_send_tmf_resp(tmf->vhost, &tmf->svq->vq, tmf->in_iovs,
                  tmf->vq_desc, &tmf->resp_iov, resp_code);
+    mutex_unlock(&tmf->svq->vq.mutex);
+
     vhost_scsi_release_tmf_res(tmf);
 }
--
2.34.1
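The locking pattern the patch relies on can be sketched in user space. This is an illustration under stated assumptions, not the vhost code: `struct demo_vq` and both functions are hypothetical, and a pthread mutex stands in for vq->mutex. The point is that the flag check and the log write happen under the same lock that the control path takes before disabling logging, so the buffer can only be freed once no writer is inside the critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a virtqueue; not the kernel's struct vhost_virtqueue. */
struct demo_vq {
    pthread_mutex_t mutex;
    bool log_used;            /* true while dirty logging is enabled */
    unsigned char *log_base;  /* buffer owned by the VMM side */
    size_t log_len;
};

/* Completion side: test the flag and write the log entry under one lock hold. */
static bool demo_add_used(struct demo_vq *vq, size_t idx)
{
    bool logged = false;

    assert(vq != NULL);
    pthread_mutex_lock(&vq->mutex);
    if (vq->log_used && idx < vq->log_len) {
        vq->log_base[idx] = 1;
        logged = true;
    }
    pthread_mutex_unlock(&vq->mutex);
    return logged;
}

/* Control side: disable logging under the lock and hand the buffer back.
 * Only after this returns is it safe for the caller to free the buffer. */
static unsigned char *demo_disable_log(struct demo_vq *vq)
{
    unsigned char *old;

    assert(vq != NULL);
    pthread_mutex_lock(&vq->mutex);
    old = vq->log_base;
    vq->log_used = false;
    vq->log_base = NULL;
    vq->log_len = 0;
    pthread_mutex_unlock(&vq->mutex);
    return old;
}
```

Without the lock in demo_add_used(), a writer could pass the flag check, lose the CPU, and then write into a buffer that the control side has already released, which is the window shown in the two-column diagram of the commit message.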

5 months, 2 weeks

[PATCH v5.15] vhost-scsi: protect vq->log_used with vq->mutex

by Xinyu Zheng

From: Dongli Zhang <dongli.zhang(a)oracle.com>

[ Upstream commit f591cf9fce724e5075cc67488c43c6e39e8cbe27 ]

The vhost-scsi completion path may access vq->log_base when vq->log_used is
already set to false.

    vhost-thread                       QEMU-thread

vhost_scsi_complete_cmd_work()
-> vhost_add_used()
   -> vhost_add_used_n()
      if (unlikely(vq->log_used))
                                      QEMU disables vq->log_used
                                      via VHOST_SET_VRING_ADDR.
                                      mutex_lock(&vq->mutex);
                                      vq->log_used = false now!
                                      mutex_unlock(&vq->mutex);

                                      QEMU gfree(vq->log_base)
        log_used()
        -> log_write(vq->log_base)

Assuming the VMM is QEMU. The vq->log_base is from QEMU userspace and can be
reclaimed via gfree(). As a result, this causes invalid memory writes to
QEMU userspace.

The control queue path has the same issue.

Cc: stable(a)vger.kernel.org#5.15.x
Cc: gregkh(a)linuxfoundation.org
Signed-off-by: Dongli Zhang <dongli.zhang(a)oracle.com>
Acked-by: Jason Wang <jasowang(a)redhat.com>
Reviewed-by: Mike Christie <michael.christie(a)oracle.com>
Message-Id: <20250403063028.16045-2-dongli.zhang(a)oracle.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[ Conflicts in drivers/vhost/scsi.c because vhost_scsi_complete_cmd_work()
  has been refactored. ]
Signed-off-by: Xinyu Zheng <zhengxinyu6(a)huawei.com>
---
 drivers/vhost/scsi.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f9930887fdd2..2c19fa02141d 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -563,8 +563,10 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
         ret = copy_to_iter(&v_rsp, sizeof(v_rsp), &iov_iter);
         if (likely(ret == sizeof(v_rsp))) {
             struct vhost_scsi_virtqueue *q;
-            vhost_add_used(cmd->tvc_vq, cmd->tvc_vq_desc, 0);
             q = container_of(cmd->tvc_vq, struct vhost_scsi_virtqueue, vq);
+            mutex_lock(&q->vq.mutex);
+            vhost_add_used(cmd->tvc_vq, cmd->tvc_vq_desc, 0);
+            mutex_unlock(&q->vq.mutex);
             vq = q - vs->vqs;
             __set_bit(vq, signal);
         } else
@@ -1166,8 +1168,11 @@ static void vhost_scsi_tmf_resp_work(struct vhost_work *work)
     else
         resp_code = VIRTIO_SCSI_S_FUNCTION_REJECTED;

+    mutex_lock(&tmf->svq->vq.mutex);
     vhost_scsi_send_tmf_resp(tmf->vhost, &tmf->svq->vq, tmf->in_iovs,
                  tmf->vq_desc, &tmf->resp_iov, resp_code);
+    mutex_unlock(&tmf->svq->vq.mutex);
+
     vhost_scsi_release_tmf_res(tmf);
 }
--
2.34.1

5 months, 2 weeks

[PATCH v5.10 v2] vhost-scsi: protect vq->log_used with vq->mutex

by Xinyu Zheng

From: Dongli Zhang <dongli.zhang(a)oracle.com>

[ Upstream commit f591cf9fce724e5075cc67488c43c6e39e8cbe27 ]

The vhost-scsi completion path may access vq->log_base when vq->log_used is
already set to false.

    vhost-thread                       QEMU-thread

vhost_scsi_complete_cmd_work()
-> vhost_add_used()
   -> vhost_add_used_n()
      if (unlikely(vq->log_used))
                                      QEMU disables vq->log_used
                                      via VHOST_SET_VRING_ADDR.
                                      mutex_lock(&vq->mutex);
                                      vq->log_used = false now!
                                      mutex_unlock(&vq->mutex);

                                      QEMU gfree(vq->log_base)
        log_used()
        -> log_write(vq->log_base)

Assuming the VMM is QEMU. The vq->log_base is from QEMU userspace and can be
reclaimed via gfree(). As a result, this causes invalid memory writes to
QEMU userspace.

The control queue path has the same issue.

Cc: stable(a)vger.kernel.org#5.10.x
Cc: gregkh(a)linuxfoundation.org
Signed-off-by: Dongli Zhang <dongli.zhang(a)oracle.com>
Acked-by: Jason Wang <jasowang(a)redhat.com>
Reviewed-by: Mike Christie <michael.christie(a)oracle.com>
Message-Id: <20250403063028.16045-2-dongli.zhang(a)oracle.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[ Conflicts in drivers/vhost/scsi.c because vhost_scsi_complete_cmd_work()
  has been refactored. ]
Signed-off-by: Xinyu Zheng <zhengxinyu6(a)huawei.com>
---
V1 -> V2: Remove unnecessary CVE tag

 drivers/vhost/scsi.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index a23a65e7d828..fcde3752b4f1 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -579,8 +579,10 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
         ret = copy_to_iter(&v_rsp, sizeof(v_rsp), &iov_iter);
         if (likely(ret == sizeof(v_rsp))) {
             struct vhost_scsi_virtqueue *q;
-            vhost_add_used(cmd->tvc_vq, cmd->tvc_vq_desc, 0);
             q = container_of(cmd->tvc_vq, struct vhost_scsi_virtqueue, vq);
+            mutex_lock(&q->vq.mutex);
+            vhost_add_used(cmd->tvc_vq, cmd->tvc_vq_desc, 0);
+            mutex_unlock(&q->vq.mutex);
             vq = q - vs->vqs;
             __set_bit(vq, signal);
         } else
@@ -1193,8 +1195,11 @@ static void vhost_scsi_tmf_resp_work(struct vhost_work *work)
     else
         resp_code = VIRTIO_SCSI_S_FUNCTION_REJECTED;

+    mutex_lock(&tmf->svq->vq.mutex);
     vhost_scsi_send_tmf_resp(tmf->vhost, &tmf->svq->vq, tmf->in_iovs,
                  tmf->vq_desc, &tmf->resp_iov, resp_code);
+    mutex_unlock(&tmf->svq->vq.mutex);
+
     vhost_scsi_release_tmf_res(tmf);
 }
--
2.34.1

5 months, 2 weeks

Host panic in bpf verifier when loading bpf prog in 5.10 stable kernel

by Aaron Lu

Hello,

Wei reported that loading his bpf prog on a 5.10.200 kernel panics the
host; this did not happen on a 5.10.135 kernel. Testing on the latest
v5.10.238 still reproduces the panic.

[ 26.531718] BUG: kernel NULL pointer dereference, address: 0000000000000168
[ 26.538093] #PF: supervisor read access in kernel mode
[ 26.542727] #PF: error_code(0x0000) - not-present page
[ 26.548093] PGD 10f3e9067 P4D 10f332067 PUD 10f0c5067 PMD 0
[ 26.553211] Oops: 0000 [#1] SMP NOPTI
[ 26.556531] CPU: 2 PID: 541 Comm: main Not tainted 5.10.238-00267-g01e7e36b8606 #63
[ 26.563816] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 26.572357] RIP: 0010:__mark_chain_precision+0x24b/0x4d0
[ 26.576572] Code: 51 01 be 20 00 00 00 4c 89 ef 48 63 d2 e8 bd df 31 00 89 c1 83 f8 1f 7f 29 48 63 d1 48 89 d0 48 c1 e0 04 48 29 d0 48 8d 04 c3 <83> 38 01 75 c3 0f b6 74 24 06 80 78 74 00 c6 40 74 01 44 0f 44 f6
[ 26.589100] RSP: 0018:ffa0000000ff7b60 EFLAGS: 00010216
[ 26.592612] RAX: 0000000000000168 RBX: 0000000000000000 RCX: 0000000000000003
[ 26.597416] RDX: 0000000000000003 RSI: 0000000000000020 RDI: ffa0000000ff7b78
[ 26.601362] RBP: 0000000000000003 R08: ffa0000000ff7b70 R09: 0000000000000004
[ 26.604261] R10: 0000000000000007 R11: ffa0000000425000 R12: ff11000102ee2000
[ 26.607202] R13: ffa0000000ff7b78 R14: 0000000000000000 R15: ff1100010ee37140
[ 26.610327] FS: 00000000007a0630(0000) GS:ff1100081c400000(0000) knlGS:0000000000000000
[ 26.613678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 26.616105] CR2: 0000000000000168 CR3: 0000000115e72002 CR4: 0000000000371ee0
[ 26.619059] Call Trace:
[ 26.620118] adjust_reg_min_max_vals+0x133/0x340
[ 26.622048] ? krealloc+0x63/0xe0
[ 26.623435] do_check+0x38c/0xa80
[ 26.624859] do_check_common+0x15b/0x280
[ 26.626496] bpf_check+0xbe1/0xd30
[ 26.627939] ? srso_alias_return_thunk+0x5/0x7f
[ 26.629796] ? trace_hardirqs_on+0x1a/0xd0
[ 26.631503] ? srso_alias_return_thunk+0x5/0x7f
[ 26.633402] bpf_prog_load+0x422/0x8a0
[ 26.634987] ? srso_alias_return_thunk+0x5/0x7f
[ 26.636864] ? __handle_mm_fault+0x3cb/0x6d0
[ 26.638658] ? srso_alias_return_thunk+0x5/0x7f
[ 26.640543] ? lock_release+0xe3/0x110
[ 26.642114] __do_sys_bpf+0x485/0xdf0
[ 26.643624] do_syscall_64+0x33/0x40
[ 26.645110] entry_SYSCALL_64_after_hwframe+0x67/0xd1
[ 26.647190] RIP: 0033:0x409a6e
[ 26.648470] Code: 24 28 44 8b 44 24 2c e9 70 ff ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
[ 26.656154] RSP: 002b:000000c00199edc0 EFLAGS: 00000212 ORIG_RAX: 0000000000000141
[ 26.659451] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000409a6e
[ 26.662375] RDX: 0000000000000098 RSI: 000000c00199f290 RDI: 0000000000000005
[ 26.665267] RBP: 000000c00199ee00 R08: 0000000000000000 R09: 0000000000000000
[ 26.668204] R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
[ 26.671125] R13: 0000000000000080 R14: 000000c000002380 R15: 8080808080808080
[ 26.674085] Modules linked in:
[ 26.675363] CR2: 0000000000000168
[ 26.676772] ---[ end trace 3fc192ee4dabbf12 ]---
[ 26.678667] RIP: 0010:__mark_chain_precision+0x24b/0x4d0
[ 26.680926] Code: 51 01 be 20 00 00 00 4c 89 ef 48 63 d2 e8 bd df 31 00 89 c1 83 f8 1f 7f 29 48 63 d1 48 89 d0 48 c1 e0 04 48 29 d0 48 8d 04 c3 <83> 38 01 75 c3 0f b6 74 24 06 80 78 74 00 c6 40 74 01 44 0f 44 f6
[ 26.688665] RSP: 0018:ffa0000000ff7b60 EFLAGS: 00010216
[ 26.690828] RAX: 0000000000000168 RBX: 0000000000000000 RCX: 0000000000000003
[ 26.693777] RDX: 0000000000000003 RSI: 0000000000000020 RDI: ffa0000000ff7b78
[ 26.696680] RBP: 0000000000000003 R08: ffa0000000ff7b70 R09: 0000000000000004
[ 26.699651] R10: 0000000000000007 R11: ffa0000000425000 R12: ff11000102ee2000
[ 26.702561] R13: ffa0000000ff7b78 R14: 0000000000000000 R15: ff1100010ee37140
[ 26.705522] FS: 00000000007a0630(0000) GS:ff1100081c400000(0000) knlGS:0000000000000000
[ 26.708806] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 26.711179] CR2: 0000000000000168 CR3: 0000000115e72002 CR4: 0000000000371ee0
[ 26.714143] Kernel panic - not syncing: Fatal exception
[ 26.716893] Kernel Offset: disabled
[ 26.718911] Rebooting in 5 seconds..

I bisected the linux-5.10.y branch and found that the first bad commit is
commit 2474ec58b96d ("bpf: allow precision tracking for programs with
subprogs").

I noticed there is a commit in Linus' master branch that carries a Fixes
tag for this bisected commit:
commit 81335f90e8a8 ("bpf: unconditionally reset backtrack_state masks
on global func exit").

I tried to apply it to this 5.10.y branch, but since the bases are quite
different, a clean apply is not possible. I ended up with the following
diff:

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 40ac67a04ab75..71da33fb96552 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2118,11 +2118,9 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
             bitmap_from_u64(mask, reg_mask);
             for_each_set_bit(i, mask, 32) {
                 reg = &st->frame[0]->regs[i];
-                if (reg->type != SCALAR_VALUE) {
-                    reg_mask &= ~(1u << i);
-                    continue;
-                }
-                reg->precise = true;
+                reg_mask &= ~(1u << i);
+                if (reg->type == SCALAR_VALUE)
+                    reg->precise = true;
             }
             return 0;
         }

But it didn't make any difference.

Here are the steps to reproduce:
1. Clone this repo: https://github.com/bytedance/vArmor-ebpf and switch to
   the panic-analysis branch;
2. Run make build.

A binary named main should be built. I used the golang compiler downloaded
from https://go.dev/dl/go1.24.3.linux-amd64.tar.gz, but other golang
compilers may also work. Run main as root and it will panic the host (the
kernel needs CONFIG_BPF_LSM).

Full dmesg and config are attached. Feel free to let me know if you need
any additional info, thanks.

P.S. linux-5.15.y has the same situation.
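The behavioural change in the hand-ported diff can be shown on a toy register file. This is a user-space sketch only: the `demo_*` types and functions are hypothetical and model just the loop body from the diff, not the verifier. The pre-diff loop clears a bit only for non-scalar registers, while the ported loop clears every visited bit and marks scalars precise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced register model; not the kernel's struct bpf_reg_state. */
enum demo_reg_type { DEMO_NOT_INIT, DEMO_SCALAR_VALUE, DEMO_PTR };

struct demo_reg {
    enum demo_reg_type type;
    bool precise;
};

/* Loop shape before the diff: non-scalars are dropped from the mask,
 * scalars are marked precise but their bits stay set. */
static uint32_t demo_mark_old(struct demo_reg *regs, size_t nregs, uint32_t reg_mask)
{
    assert(nregs <= 32);
    for (size_t i = 0; i < nregs; i++) {
        if (!(reg_mask & (1u << i)))
            continue;
        if (regs[i].type != DEMO_SCALAR_VALUE) {
            reg_mask &= ~(1u << i);
            continue;
        }
        regs[i].precise = true;
    }
    return reg_mask;
}

/* Loop shape after the diff: every visited bit is cleared unconditionally. */
static uint32_t demo_mark_new(struct demo_reg *regs, size_t nregs, uint32_t reg_mask)
{
    assert(nregs <= 32);
    for (size_t i = 0; i < nregs; i++) {
        if (!(reg_mask & (1u << i)))
            continue;
        reg_mask &= ~(1u << i);
        if (regs[i].type == DEMO_SCALAR_VALUE)
            regs[i].precise = true;
    }
    return reg_mask;
}
```

Both variants mark the same registers precise; they differ only in what is left in the mask afterwards. Whether that leftover state is related to the reported NULL dereference is not established here, and the report states that the ported change made no difference.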

5 months, 2 weeks

[PATCH 1/1] iommu/sva: Invalidate KVA range on kernel TLB flush

by Lu Baolu

The vmalloc() and vfree() functions manage virtually contiguous, but not
necessarily physically contiguous, kernel memory regions. When vfree()
unmaps such a region, it tears down the associated kernel page table
entries and frees the physical pages.

In the IOMMU Shared Virtual Addressing (SVA) context, the IOMMU hardware
shares and walks the CPU's page tables. Architectures like x86 share
static kernel address mappings across all user page tables, allowing the
IOMMU to access the kernel portion of these tables.

Modern IOMMUs often cache page table entries to optimize walk performance,
even for intermediate page table levels. If kernel page table mappings are
changed (e.g., by vfree()), but the IOMMU's internal caches retain stale
entries, a use-after-free (UAF) condition arises. If these freed page
table pages are reallocated for a different purpose, potentially by an
attacker, the IOMMU could misinterpret the new data as valid page table
entries. This allows the IOMMU to walk into attacker-controlled memory,
leading to arbitrary physical memory DMA access or privilege escalation.

To mitigate this, introduce a new iommu interface to flush IOMMU caches
and fence pending page table walks when kernel page mappings are updated.
This interface should be invoked from architecture-specific code that
manages combined user and kernel page tables.

Fixes: 26b25a2b98e4 ("iommu: Bind process address spaces to devices")
Cc: stable(a)vger.kernel.org
Co-developed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
---
 arch/x86/mm/tlb.c         |  2 ++
 drivers/iommu/iommu-sva.c | 32 +++++++++++++++++++++++++++++++-
 include/linux/iommu.h     |  4 ++++
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 39f80111e6f1..a41499dfdc3f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -12,6 +12,7 @@
 #include <linux/task_work.h>
 #include <linux/mmu_notifier.h>
 #include <linux/mmu_context.h>
+#include <linux/iommu.h>

 #include <asm/tlbflush.h>
 #include <asm/mmu_context.h>
@@ -1540,6 +1541,7 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
         kernel_tlb_flush_range(info);

     put_flush_tlb_info();
+    iommu_sva_invalidate_kva_range(start, end);
 }

 /*
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 1a51cfd82808..154384eab8a3 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -10,6 +10,8 @@
 #include "iommu-priv.h"

 static DEFINE_MUTEX(iommu_sva_lock);
+static DEFINE_STATIC_KEY_FALSE(iommu_sva_present);
+static LIST_HEAD(iommu_sva_mms);
 static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
                            struct mm_struct *mm);

@@ -42,6 +44,7 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de
         return ERR_PTR(-ENOSPC);
     }
     iommu_mm->pasid = pasid;
+    iommu_mm->mm = mm;
     INIT_LIST_HEAD(&iommu_mm->sva_domains);
     /*
      * Make sure the write to mm->iommu_mm is not reordered in front of
@@ -132,8 +135,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
     if (ret)
         goto out_free_domain;
     domain->users = 1;
-    list_add(&domain->next, &mm->iommu_mm->sva_domains);

+    if (list_empty(&iommu_mm->sva_domains)) {
+        if (list_empty(&iommu_sva_mms))
+            static_branch_enable(&iommu_sva_present);
+        list_add(&iommu_mm->mm_list_elm, &iommu_sva_mms);
+    }
+    list_add(&domain->next, &iommu_mm->sva_domains);
 out:
     refcount_set(&handle->users, 1);
     mutex_unlock(&iommu_sva_lock);
@@ -175,6 +183,13 @@ void iommu_sva_unbind_device(struct iommu_sva *handle)
         list_del(&domain->next);
         iommu_domain_free(domain);
     }
+
+    if (list_empty(&iommu_mm->sva_domains)) {
+        list_del(&iommu_mm->mm_list_elm);
+        if (list_empty(&iommu_sva_mms))
+            static_branch_disable(&iommu_sva_present);
+    }
+
     mutex_unlock(&iommu_sva_lock);
     kfree(handle);
 }
@@ -312,3 +327,18 @@ static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,

     return domain;
 }
+
+void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end)
+{
+    struct iommu_mm_data *iommu_mm;
+
+    might_sleep();
+
+    if (!static_branch_unlikely(&iommu_sva_present))
+        return;
+
+    guard(mutex)(&iommu_sva_lock);
+    list_for_each_entry(iommu_mm, &iommu_sva_mms, mm_list_elm)
+        mmu_notifier_arch_invalidate_secondary_tlbs(iommu_mm->mm, start, end);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_invalidate_kva_range);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 156732807994..31330c12b8ee 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1090,7 +1090,9 @@ struct iommu_sva {

 struct iommu_mm_data {
     u32 pasid;
+    struct mm_struct *mm;
     struct list_head sva_domains;
+    struct list_head mm_list_elm;
 };

 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode);
@@ -1571,6 +1573,7 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
                     struct mm_struct *mm);
 void iommu_sva_unbind_device(struct iommu_sva *handle);
 u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end);
 #else
 static inline struct iommu_sva *
 iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
@@ -1595,6 +1598,7 @@ static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
 }

 static inline void mm_pasid_drop(struct mm_struct *mm) {}
+static inline void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end) {}
 #endif /* CONFIG_IOMMU_SVA */

 #ifdef CONFIG_IOMMU_IOPF
--
2.43.0
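The bookkeeping in the patch (a global list of SVA-bound address spaces plus a cheap "anything bound?" check on the flush path) can be sketched in user space. Everything named `demo_*` is hypothetical: a plain bool stands in for the static key, a singly linked list for the kernel list, and a counter for the secondary TLB invalidation. The sketch omits the mutex the patch takes, so it is single-threaded only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an address space with SVA bindings. */
struct demo_mm {
    unsigned int flushes;           /* how many invalidations this mm received */
    unsigned long last_start, last_end;
    struct demo_mm *next;
};

static struct demo_mm *demo_sva_mms;   /* list of bound address spaces */
static bool demo_sva_present;          /* fast-path flag, like the static key */

static void demo_sva_bind(struct demo_mm *mm)
{
    assert(mm != NULL);
    mm->next = demo_sva_mms;
    demo_sva_mms = mm;
    demo_sva_present = true;
}

static void demo_sva_unbind(struct demo_mm *mm)
{
    for (struct demo_mm **p = &demo_sva_mms; *p != NULL; p = &(*p)->next) {
        if (*p == mm) {
            *p = mm->next;
            mm->next = NULL;
            break;
        }
    }
    demo_sva_present = (demo_sva_mms != NULL);
}

/* Called from the (simulated) kernel TLB flush path; returns how many
 * address spaces were notified. */
static unsigned int demo_invalidate_kva_range(unsigned long start, unsigned long end)
{
    unsigned int notified = 0;

    if (!demo_sva_present)      /* nothing bound: skip the list walk */
        return 0;

    for (struct demo_mm *mm = demo_sva_mms; mm != NULL; mm = mm->next) {
        mm->flushes++;
        mm->last_start = start;
        mm->last_end = end;
        notified++;
    }
    return notified;
}
```

The early return is what keeps the common case cheap when no device uses SVA; the cost of walking the list is only paid while at least one address space is bound.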

5 months, 2 weeks

[PATCH net-next v2 1/2] net: ipv4: fix incorrect MTU in broadcast routes

by Oscar Maes

Currently, __mkroute_output overrules the MTU value configured for
broadcast routes.

This buggy behaviour can be reproduced with:

ip link set dev eth1 mtu 9000
ip route del broadcast 192.168.0.255 dev eth1 proto kernel scope link src 192.168.0.2
ip route add broadcast 192.168.0.255 dev eth1 proto kernel scope link src 192.168.0.2 mtu 1500

The maximum packet size should be 1500, but it is actually 8000:

ping -b 192.168.0.255 -s 8000

Fix __mkroute_output to allow MTU values to be configured for
broadcast routes (to support a mixed-MTU local-area-network).

Signed-off-by: Oscar Maes <oscmaes92(a)gmail.com>
---
 net/ipv4/route.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 64ba377cd..f639a2ae8 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2588,7 +2588,6 @@ static struct rtable *__mkroute_output(const struct fib_result *res,
     do_cache = true;
     if (type == RTN_BROADCAST) {
         flags |= RTCF_BROADCAST | RTCF_LOCAL;
-        fi = NULL;
     } else if (type == RTN_MULTICAST) {
         flags |= RTCF_MULTICAST | RTCF_LOCAL;
         if (!ip_check_mc_rcu(in_dev, fl4->daddr, fl4->saddr,
--
2.39.5
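One way to read the one-line change: once the route's configuration pointer is cleared, any per-route MTU metric is unreachable and the device MTU is used instead. The sketch below models only that fallback and rests on that assumption; `struct demo_route_info` and `demo_effective_mtu()` are hypothetical names, and the kernel's real MTU selection involves more inputs than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-route settings; mtu == 0 means no MTU metric configured. */
struct demo_route_info {
    unsigned int mtu;
};

/* Prefer the route's MTU metric when one is reachable, else the device MTU. */
static unsigned int demo_effective_mtu(const struct demo_route_info *fi,
                                       unsigned int dev_mtu)
{
    assert(dev_mtu > 0);
    if (fi != NULL && fi->mtu != 0)
        return fi->mtu;
    return dev_mtu;
}
```

With the values from the reproducer, a route metric of 1500 on a 9000-byte device yields 1500 when the route info is passed through, and 9000 when it has been replaced by NULL.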

5 months, 2 weeks

[PATCH AUTOSEL 5.4] rxrpc: Fix oops due to non-existence of prealloc backlog struct

by Sasha Levin

From: David Howells <dhowells(a)redhat.com>

[ Upstream commit 880a88f318cf1d2a0f4c0a7ff7b07e2062b434a4 ]

If an AF_RXRPC service socket is opened and bound, but no calls are
preallocated, then rxrpc_alloc_incoming_call() will oops because the
rxrpc_backlog struct doesn't get allocated until the first preallocation is
made.

Fix this by returning NULL from rxrpc_alloc_incoming_call() if there is no
backlog struct. This will cause the incoming call to be aborted.

Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Suggested-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Willy Tarreau <w(a)1wt.eu>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
Link: https://patch.msgid.link/20250708211506.2699012-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---

LLM Generated explanations, may be completely bogus:

**YES**

This commit should be backported to stable kernel trees for the following
reasons:

1. **Fixes a Critical Kernel Oops**: The commit addresses a NULL pointer
   dereference that causes a kernel crash when `rx->backlog` is NULL. At
   line 257 of the original code,
   `smp_load_acquire(&b->call_backlog_head)` would dereference a NULL
   pointer if no preallocation was done.

2. **Minimal and Safe Fix**: The fix is a simple defensive check:

   ```c
   + if (!b)
   +     return NULL;
   ```

   This is placed immediately after obtaining the backlog pointer and
   before any usage. The fix has zero risk of regression - if `b` is NULL,
   the code would have crashed anyway.

3. **Clear Reproducible Scenario**: The bug occurs in a specific but
   realistic scenario - when an AF_RXRPC service socket is opened and bound
   but no calls are preallocated (meaning `rxrpc_service_prealloc()` was
   never called to allocate the backlog structure).

4. **Follows Stable Kernel Rules**: This fix meets all criteria for stable
   backporting:
   - Fixes a real bug that users can hit
   - Small and contained change (2 lines)
   - Obviously correct with no side effects
   - Already tested and merged upstream

5. **Similar to Previously Backported Fixes**: Looking at Similar Commit #2
   which was marked YES, it also fixed an oops in the rxrpc
   preallocation/backlog system with minimal changes.

The commit prevents a kernel crash with a trivial NULL check, making it an
ideal candidate for stable backporting.

 net/rxrpc/call_accept.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 55fb3744552de..99f05057e4c90 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -281,6 +281,9 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(struct rxrpc_sock *rx,
     unsigned short call_tail, conn_tail, peer_tail;
     unsigned short call_count, conn_count;

+    if (!b)
+        return NULL;
+
     /* #calls >= #conns >= #peers must hold true. */
     call_head = smp_load_acquire(&b->call_backlog_head);
     call_tail = b->call_backlog_tail;
--
2.39.5
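The guard added by the patch follows a common pattern: a lazily allocated structure must be checked before its ring indices are read. Below is a small user-space sketch of that pattern; the `demo_*` names, the ring size, and the use of plain ints as "calls" are all hypothetical and only illustrate the ordering of the checks.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BACKLOG_MAX 32   /* hypothetical ring size, a power of two */

/* Hypothetical preallocation ring; allocated only on first preallocation. */
struct demo_backlog {
    unsigned short head, tail;
    int calls[DEMO_BACKLOG_MAX];
};

struct demo_sock {
    struct demo_backlog *backlog;   /* NULL until something is preallocated */
};

/* Take one preallocated entry, or return NULL so the caller can abort. */
static int *demo_take_call(struct demo_sock *rx)
{
    struct demo_backlog *b;
    int *call;

    assert(rx != NULL);
    b = rx->backlog;
    if (!b)                 /* nothing was ever preallocated */
        return NULL;

    if (((b->head - b->tail) & (DEMO_BACKLOG_MAX - 1)) == 0)
        return NULL;        /* ring exists but is empty */

    call = &b->calls[b->tail];
    b->tail = (b->tail + 1) & (DEMO_BACKLOG_MAX - 1);
    return call;
}
```

Both failure cases return NULL to the caller, so the "no backlog yet" case is handled the same way as an empty ring rather than being dereferenced.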

5 months, 2 weeks

[PATCH AUTOSEL 5.10] rxrpc: Fix oops due to non-existence of prealloc backlog struct

by Sasha Levin

From: David Howells <dhowells(a)redhat.com>

[ Upstream commit 880a88f318cf1d2a0f4c0a7ff7b07e2062b434a4 ]

If an AF_RXRPC service socket is opened and bound, but no calls are
preallocated, then rxrpc_alloc_incoming_call() will oops because the
rxrpc_backlog struct doesn't get allocated until the first preallocation is
made.

Fix this by returning NULL from rxrpc_alloc_incoming_call() if there is no
backlog struct. This will cause the incoming call to be aborted.

Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Suggested-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Willy Tarreau <w(a)1wt.eu>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
Link: https://patch.msgid.link/20250708211506.2699012-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---

LLM Generated explanations, may be completely bogus:

**YES**

This commit should be backported to stable kernel trees for the following
reasons:

1. **Fixes a Critical Kernel Oops**: The commit addresses a NULL pointer
   dereference that causes a kernel crash when `rx->backlog` is NULL. At
   line 257 of the original code,
   `smp_load_acquire(&b->call_backlog_head)` would dereference a NULL
   pointer if no preallocation was done.

2. **Minimal and Safe Fix**: The fix is a simple defensive check:

   ```c
   + if (!b)
   +     return NULL;
   ```

   This is placed immediately after obtaining the backlog pointer and
   before any usage. The fix has zero risk of regression - if `b` is NULL,
   the code would have crashed anyway.

3. **Clear Reproducible Scenario**: The bug occurs in a specific but
   realistic scenario - when an AF_RXRPC service socket is opened and bound
   but no calls are preallocated (meaning `rxrpc_service_prealloc()` was
   never called to allocate the backlog structure).

4. **Follows Stable Kernel Rules**: This fix meets all criteria for stable
   backporting:
   - Fixes a real bug that users can hit
   - Small and contained change (2 lines)
   - Obviously correct with no side effects
   - Already tested and merged upstream

5. **Similar to Previously Backported Fixes**: Looking at Similar Commit #2
   which was marked YES, it also fixed an oops in the rxrpc
   preallocation/backlog system with minimal changes.

The commit prevents a kernel crash with a trivial NULL check, making it an
ideal candidate for stable backporting.

 net/rxrpc/call_accept.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 2a14d69b171f3..b96af42a1b041 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -271,6 +271,9 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(struct rxrpc_sock *rx,
     unsigned short call_tail, conn_tail, peer_tail;
     unsigned short call_count, conn_count;

+    if (!b)
+        return NULL;
+
     /* #calls >= #conns >= #peers must hold true. */
     call_head = smp_load_acquire(&b->call_backlog_head);
     call_tail = b->call_backlog_tail;
--
2.39.5

5 months, 2 weeks
