October 2025 - Linux-stable-mirror

FAILED: patch "[PATCH] selftests: mptcp: connect: catch IO errors on listen side" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y git checkout FETCH_HEAD git cherry-pick -x 14e22b43df25dbd4301351b882486ea38892ae4f # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092158-molehill-radiation-11c3@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 14e22b43df25dbd4301351b882486ea38892ae4f Mon Sep 17 00:00:00 2001 From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org> Date: Fri, 12 Sep 2025 14:25:51 +0200 Subject: [PATCH] selftests: mptcp: connect: catch IO errors on listen side IO errors were correctly printed to stderr, and propagated up to the main loop for the server side, but the returned value was ignored. As a consequence, the program for the listener side was no longer exiting with an error code in case of IO issues. Because of that, some issues might not have been seen. But very likely, most issues either had an effect on the client side, or the file transfer was not the expected one, e.g. the connection got reset before the end. Still, it is better to fix this. The main consequence of this issue is the error that was reported by the selftests: the received and sent files were different, and the MIB counters were not printed. Also, when such errors happened during the 'disconnect' tests, the program tried to continue until the timeout. Now when an IO error is detected, the program exits directly with an error. Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests") Cc: stable(a)vger.kernel.org Reviewed-by: Mat Martineau <martineau(a)kernel.org> Reviewed-by: Geliang Tang <geliang(a)kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-2-d40e77cbbf… Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c index 4f07ac9fa207..1408698df099 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c @@ -1093,6 +1093,7 @@ int main_loop_s(int listensock) struct pollfd polls; socklen_t salen; int remotesock; + int err = 0; int fd = 0; again: @@ -1125,7 +1126,7 @@ int main_loop_s(int listensock) SOCK_TEST_TCPULP(remotesock, 0); memset(&winfo, 0, sizeof(winfo)); - copyfd_io(fd, remotesock, 1, true, &winfo); + err = copyfd_io(fd, remotesock, 1, true, &winfo); } else { perror("accept"); return 1; @@ -1134,10 +1135,10 @@ int main_loop_s(int listensock) if (cfg_input) close(fd); - if (--cfg_repeat > 0) + if (!err && --cfg_repeat > 0) goto again; - return 0; + return err; } static void init_rng(void)

2 weeks, 6 days

2
1
0 0

Unplayable framerates in game but specific kernel versions work, maybe amdgpu problem

by machion＠disroot.org

Hello kernel/driver developers, I hope, with my information it's possible to find a bug/problem in the kernel. Otherwise I am sorry, that I disturbed you. I only use LTS kernels, but I can narrow it down to a hand full of them, where it works. The PC: Manjaro Stable/Cinnamon/X11/AMD Ryzen 5 2600/Radeon HD 7790/8GB RAM I already asked the Manjaro community, but with no luck. The game: Hellpoint (GOG Linux latest version, Unity3D-Engine v2021), uses vulkan --- I came a long road of kernels. I had many versions of 5.4, 5.10, 5.15, 6.1 and 6.6 and and the game was always unplayable, because the frames where around 1fps (performance of PC is not the problem). I asked the mesa and cinnamon team for help in the past, but also with no luck. It never worked, till on 2025-03-29 when I installed 6.12.19 for the first time and it worked! But it only worked with 6.12.19, 6.12.20 and 6.12.21 When I updated to 6.12.25, it was back to unplayable. For testing I installed 6.14.4 with the same result. It doesn't work. I also compared file /proc/config.gz of both kernels (6.12.21 <> 6.14.4), but can't seem to see drastic changes to the graphical part. I presume it has something to do with amdgpu. If you need more information, I would be happy to help. Kind regards, Marion

2 weeks, 6 days

2
5
0 0

[PATCH net v2] r8152: add error handling in rtl8152_driver_init

by yicongsrfy＠163.com

From: Yi Cong <yicong(a)kylinos.cn> rtl8152_driver_init missing error handling. If cannot register rtl8152_driver, rtl8152_cfgselector_driver should be deregistered. Fixes: ec51fbd1b8a2 ("r8152: add USB device driver for config selection") Cc: stable(a)vger.kernel.org Signed-off-by: Yi Cong <yicong(a)kylinos.cn> Reviewed-by: Simon Horman <horms(a)kernel.org> v2: replacing return 0 with return ret and adding Cc stable --- drivers/net/usb/r8152.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 44cba7acfe7d..8a0c824e9eb4 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -10122,7 +10122,14 @@ static int __init rtl8152_driver_init(void) ret = usb_register_device_driver(&rtl8152_cfgselector_driver, THIS_MODULE); if (ret) return ret; - return usb_register(&rtl8152_driver); + + ret = usb_register(&rtl8152_driver); + if (ret) { + usb_deregister_device_driver(&rtl8152_cfgselector_driver); + return ret; + } + + return ret; } static void __exit rtl8152_driver_exit(void) -- 2.25.1

2 weeks, 6 days

1
0
0 0

[PATCH] dma: dw-axi-dmac: Release the clock resources

by Zhen Ni

In axi_dma_resume(), if clk_prepare_enable(chip->core_clk) fails, chip->cfgr_clk remains enabled and is not disabled. This could lead to resource leaks and inconsistent state during error handling. Ensure that cfgr_clk is properly disabled. Fixes: 1fe20f1b8454 ("dmaengine: Introduce DW AXI DMAC driver") Cc: stable(a)vger.kernel.org Signed-off-by: Zhen Ni <zhen.ni(a)easystack.cn> --- drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c b/drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c index b23536645ff7..ab70dbe54f46 100644 --- a/drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c +++ b/drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c @@ -1334,8 +1334,10 @@ static int axi_dma_resume(struct axi_dma_chip *chip) return ret; ret = clk_prepare_enable(chip->core_clk); - if (ret < 0) + if (ret < 0) { + clk_disable_unprepare(chip->cfgr_clk); return ret; + } axi_dma_enable(chip); axi_dma_irq_enable(chip); -- 2.20.1

2 weeks, 6 days

1
0
0 0

+ ocfs2-clear-extent-cache-after-moving-defragmenting-extents.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled Subject: ocfs2: clear extent cache after moving/defragmenting extents has been added to the -mm mm-hotfixes-unstable branch. Its filename is ocfs2-clear-extent-cache-after-moving-defragmenting-extents.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Deepanshu Kartikey <kartikey406(a)gmail.com> Subject: ocfs2: clear extent cache after moving/defragmenting extents Date: Thu, 9 Oct 2025 21:19:03 +0530 The extent map cache can become stale when extents are moved or defragmented, causing subsequent operations to see outdated extent flags. This triggers a BUG_ON in ocfs2_refcount_cal_cow_clusters(). The problem occurs when: 1. copy_file_range() creates a reflinked extent with OCFS2_EXT_REFCOUNTED 2. ioctl(FITRIM) triggers ocfs2_move_extents() 3. __ocfs2_move_extents_range() reads and caches the extent (flags=0x2) 4. ocfs2_move_extent()/ocfs2_defrag_extent() calls __ocfs2_move_extent() which clears OCFS2_EXT_REFCOUNTED flag on disk (flags=0x0) 5. The extent map cache is not invalidated after the move 6. Later write() operations read stale cached flags (0x2) but disk has updated flags (0x0), causing a mismatch 7. BUG_ON(!(rec->e_flags & OCFS2_EXT_REFCOUNTED)) triggers Fix by clearing the extent map cache after each extent move/defrag operation in __ocfs2_move_extents_range(). This ensures subsequent operations read fresh extent data from disk. Link: https://lore.kernel.org/all/20251009142917.517229-1-kartikey406@gmail.com/T/ Link: https://lkml.kernel.org/r/20251009154903.522339-1-kartikey406@gmail.com Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com> Reported-by: syzbot+6fdd8fa3380730a4b22c(a)syzkaller.appspotmail.com Tested-by: syzbot+6fdd8fa3380730a4b22c(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?id=2959889e1f6e216585ce522f7e8bc002b46ad9… Reviewed-by: Mark Fasheh <mark(a)fasheh.com> Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com> Cc: Joel Becker <jlbec(a)evilplan.org> Cc: Junxiao Bi <junxiao.bi(a)oracle.com> Cc: Changwei Ge <gechangwei(a)live.cn> Cc: Jun Piao <piaojun(a)huawei.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/ocfs2/move_extents.c | 5 +++++ 1 file changed, 5 insertions(+) --- a/fs/ocfs2/move_extents.c~ocfs2-clear-extent-cache-after-moving-defragmenting-extents +++ a/fs/ocfs2/move_extents.c @@ -867,6 +867,11 @@ static int __ocfs2_move_extents_range(st mlog_errno(ret); goto out; } + /* + * Invalidate extent cache after moving/defragging to prevent + * stale cached data with outdated extent flags. + */ + ocfs2_extent_map_trunc(inode, cpos); context->clusters_moved += alloc_size; next: _ Patches currently in -mm which might be from kartikey406(a)gmail.com are hugetlbfs-check-for-shareable-lock-before-calling-huge_pmd_unshare.patch ocfs2-clear-extent-cache-after-moving-defragmenting-extents.patch

2 weeks, 6 days

1
0
0 0

[PATCH v2 00/10] pmdomain: samsung: add supoort for Google GS101

by André Draszik

Hi, This series adds support for the power domains on Google GS101. It's fairly similar to SoCs already supported by this driver, except that register acces does not work via plain ioremap() / readl() / writel(). Instead, the regmap created by the PMU driver must be used (which uses Arm SMCC calls under the hood). The DT update to add the new required properties on gs101 will be posted separately. Signed-off-by: André Draszik <andre.draszik(a)linaro.org> --- Changes in v2: - Krzysztof: - move google,gs101-pmu binding into separate file - mark devm_kstrdup_const() patch as fix - use bool for need_early_sync_state - merge patches 8 and 10 from v1 series into one patch - collect tags - Link to v1: https://lore.kernel.org/r/20251006-gs101-pd-v1-0-f0cb0c01ea7b@linaro.org --- André Draszik (10): dt-bindings: power: samsung: add google,gs101-pd dt-bindings: soc: samsung: exynos-pmu: move gs101-pmu into separate binding dt-bindings: soc: samsung: gs101-pmu: allow power domains as children pmdomain: samsung: plug potential memleak during probe pmdomain: samsung: convert to using regmap pmdomain: samsung: convert to regmap_read_poll_timeout() pmdomain: samsung: don't hardcode offset for registers to 0 and 4 pmdomain: samsung: selectively handle enforced sync_state pmdomain: samsung: add support for google,gs101-pd pmdomain: samsung: use dev_err() instead of pr_err() .../devicetree/bindings/power/pd-samsung.yaml | 1 + .../bindings/soc/google/google,gs101-pmu.yaml | 107 +++++++++++++++++ .../bindings/soc/samsung/exynos-pmu.yaml | 20 ---- MAINTAINERS | 1 + drivers/pmdomain/samsung/exynos-pm-domains.c | 126 +++++++++++++++------ 5 files changed, 201 insertions(+), 54 deletions(-) --- base-commit: a5f97c90e75f09f24ece2dca34168722b140a798 change-id: 20251001-gs101-pd-d4dc97d70a84 Best regards, -- André Draszik <andre.draszik(a)linaro.org>

2 weeks, 6 days

2
2
0 0

+ dma-debug-dont-report-false-positives-with-dma_bounce_unaligned_kmalloc.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled Subject: dma-debug: don't report false positives with DMA_BOUNCE_UNALIGNED_KMALLOC has been added to the -mm mm-hotfixes-unstable branch. Its filename is dma-debug-dont-report-false-positives-with-dma_bounce_unaligned_kmalloc.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Marek Szyprowski <m.szyprowski(a)samsung.com> Subject: dma-debug: don't report false positives with DMA_BOUNCE_UNALIGNED_KMALLOC Date: Thu, 9 Oct 2025 16:15:08 +0200 Commit 370645f41e6e ("dma-mapping: force bouncing if the kmalloc() size is not cache-line-aligned") introduced DMA_BOUNCE_UNALIGNED_KMALLOC feature and lets architecture specific code to configure kmalloc slabs with sizes smaller than the value of dma_get_cache_alignment(). When that feature is enabled, the physical address of some small kmalloc()-ed buffers might be not aligned to the CPU cachelines, thus not really suitable for typical DMA. To properly handle that case a SWIOTLB buffer bouncing is used, so no CPU cache corruption occurs. When that happens, there is no point reporting a false-positive DMA-API warning that the buffer is not properly aligned, as this is not a client driver fault. Link: https://lkml.kernel.org/r/20251009141508.2342138-1-m.szyprowski@samsung.com Fixes: 370645f41e6e ("dma-mapping: force bouncing if the kmalloc() size is not cache-line-aligned") Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com> Cc: Catalin Marinas <catalin.marinas(a)arm.com> Cc: Christoph Hellwig <hch(a)lst.de> Cc: Inki Dae <m.szyprowski(a)samsung.com> Cc: Robin Murohy <robin.murphy(a)arm.com> Cc: "Isaac J. Manjarres" <isaacmanjarres(a)google.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- kernel/dma/debug.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) --- a/kernel/dma/debug.c~dma-debug-dont-report-false-positives-with-dma_bounce_unaligned_kmalloc +++ a/kernel/dma/debug.c @@ -23,6 +23,7 @@ #include <linux/ctype.h> #include <linux/list.h> #include <linux/slab.h> +#include <linux/swiotlb.h> #include <asm/sections.h> #include "debug.h" @@ -594,7 +595,9 @@ static void add_dma_entry(struct dma_deb if (rc == -ENOMEM) { pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n"); global_disable = true; - } else if (rc == -EEXIST && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) { + } else if (rc == -EEXIST && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) && + !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && + is_swiotlb_allocated())) { err_printk(entry->dev, entry, "cacheline tracking EEXIST, overlapping mappings aren't supported\n"); } _ Patches currently in -mm which might be from m.szyprowski(a)samsung.com are dma-debug-dont-report-false-positives-with-dma_bounce_unaligned_kmalloc.patch

2 weeks, 6 days

1
0
0 0

[PATCH] exfat: fix out-of-bounds in exfat_nls_to_ucs2()

by Jeongjun Park

After the loop that converts characters to ucs2 ends, the variable i may be greater than or equal to len. However, when checking whether the last byte of p_cstring is NULL, the variable i is used as is, resulting in an out-of-bounds read if i >= len. Therefore, to prevent this, we need to modify the function to check whether i is less than len, and if i is greater than or equal to len, to check p_cstring[len - 1] byte. Cc: <stable(a)vger.kernel.org> Reported-by: syzbot+98cc76a76de46b3714d4(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=98cc76a76de46b3714d4 Fixes: 370e812b3ec1 ("exfat: add nls operations") Signed-off-by: Jeongjun Park <aha310510(a)gmail.com> --- fs/exfat/nls.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/exfat/nls.c b/fs/exfat/nls.c index 8243d94ceaf4..a52f3494eb20 100644 --- a/fs/exfat/nls.c +++ b/fs/exfat/nls.c @@ -616,7 +616,7 @@ static int exfat_nls_to_ucs2(struct super_block *sb, unilen++; } - if (p_cstring[i] != '\0') + if (p_cstring[min(i, len - 1)] != '\0') lossy |= NLS_NAME_OVERLEN; *uniname = '\0'; --

2 weeks, 6 days

3
5
0 0

[for-linus][PATCH 5/5] tracing: Have trace_marker use per-cpu data to read user space

by Steven Rostedt

From: Steven Rostedt <rostedt(a)goodmis.org> It was reported that using __copy_from_user_inatomic() can actually schedule. Which is bad when preemption is disabled. Even though there's logic to check in_atomic() is set, but this is a nop when the kernel is configured with PREEMPT_NONE. This is due to page faulting and the code could schedule with preemption disabled. Link: https://lore.kernel.org/all/20250819105152.2766363-1-luogengkun@huaweicloud… The solution was to change the __copy_from_user_inatomic() to copy_from_user_nofault(). But then it was reported that this caused a regression in Android. There's several applications writing into trace_marker() in Android, but now instead of showing the expected data, it is showing: tracing_mark_write: <faulted> After reverting the conversion to copy_from_user_nofault(), Android was able to get the data again. Writes to the trace_marker is a way to efficiently and quickly enter data into the Linux tracing buffer. It takes no locks and was designed to be as non-intrusive as possible. This means it cannot allocate memory, and must use pre-allocated data. A method that is actively being worked on to have faultable system call tracepoints read user space data is to allocate per CPU buffers, and use them in the callback. The method uses a technique similar to seqcount. That is something like this: preempt_disable(); cpu = smp_processor_id(); buffer = this_cpu_ptr(&pre_allocated_cpu_buffers, cpu); do { cnt = nr_context_switches_cpu(cpu); migrate_disable(); preempt_enable(); ret = copy_from_user(buffer, ptr, size); preempt_disable(); migrate_enable(); } while (!ret && cnt != nr_context_switches_cpu(cpu)); if (!ret) ring_buffer_write(buffer); preempt_enable(); It's a little more involved than that, but the above is the basic logic. The idea is to acquire the current CPU buffer, disable migration, and then enable preemption. At this moment, it can safely use copy_from_user(). After reading the data from user space, it disables preemption again. It then checks to see if there was any new scheduling on this CPU. If there was, it must assume that the buffer was corrupted by another task. If there wasn't, then the buffer is still valid as only tasks in preemptable context can write to this buffer and only those that are running on the CPU. By using this method, where trace_marker open allocates the per CPU buffers, trace_marker writes can access user space and even fault it in, without having to allocate or take any locks of its own. Cc: stable(a)vger.kernel.org Cc: Masami Hiramatsu <mhiramat(a)kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com> Cc: Luo Gengkun <luogengkun(a)huaweicloud.com> Cc: Wattson CI <wattson-external(a)google.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Link: https://lore.kernel.org/20251008124510.6dba541a@gandalf.local.home Fixes: 3d62ab32df065 ("tracing: Fix tracing_marker may trigger page fault during preempt_disable") Reported-by: Runping Lai <runpinglai(a)google.com> Tested-by: Runping Lai <runpinglai(a)google.com> Closes: https://lore.kernel.org/linux-trace-kernel/20251007003417.3470979-2-runping… Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org> --- kernel/trace/trace.c | 268 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 220 insertions(+), 48 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index b3c94fbaf002..0fd582651293 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -4791,12 +4791,6 @@ int tracing_single_release_file_tr(struct inode *inode, struct file *filp) return single_release(inode, filp); } -static int tracing_mark_open(struct inode *inode, struct file *filp) -{ - stream_open(inode, filp); - return tracing_open_generic_tr(inode, filp); -} - static int tracing_release(struct inode *inode, struct file *file) { struct trace_array *tr = inode->i_private; @@ -7163,7 +7157,7 @@ tracing_free_buffer_release(struct inode *inode, struct file *filp) #define TRACE_MARKER_MAX_SIZE 4096 -static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user *ubuf, +static ssize_t write_marker_to_buffer(struct trace_array *tr, const char *buf, size_t cnt, unsigned long ip) { struct ring_buffer_event *event; @@ -7173,20 +7167,11 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user int meta_size; ssize_t written; size_t size; - int len; - -/* Used in tracing_mark_raw_write() as well */ -#define FAULTED_STR "<faulted>" -#define FAULTED_SIZE (sizeof(FAULTED_STR) - 1) /* '\0' is already accounted for */ meta_size = sizeof(*entry) + 2; /* add '\0' and possible '\n' */ again: size = cnt + meta_size; - /* If less than "<faulted>", then make sure we can still add that */ - if (cnt < FAULTED_SIZE) - size += FAULTED_SIZE - cnt; - buffer = tr->array_buffer.buffer; event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size, tracing_gen_ctx()); @@ -7196,9 +7181,6 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user * make it smaller and try again. */ if (size > ring_buffer_max_event_size(buffer)) { - /* cnt < FAULTED size should never be bigger than max */ - if (WARN_ON_ONCE(cnt < FAULTED_SIZE)) - return -EBADF; cnt = ring_buffer_max_event_size(buffer) - meta_size; /* The above should only happen once */ if (WARN_ON_ONCE(cnt + meta_size == size)) @@ -7212,14 +7194,8 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user entry = ring_buffer_event_data(event); entry->ip = ip; - - len = copy_from_user_nofault(&entry->buf, ubuf, cnt); - if (len) { - memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE); - cnt = FAULTED_SIZE; - written = -EFAULT; - } else - written = cnt; + memcpy(&entry->buf, buf, cnt); + written = cnt; if (tr->trace_marker_file && !list_empty(&tr->trace_marker_file->triggers)) { /* do not add \n before testing triggers, but add \0 */ @@ -7243,6 +7219,169 @@ static ssize_t write_marker_to_buffer(struct trace_array *tr, const char __user return written; } +struct trace_user_buf { + char *buf; +}; + +struct trace_user_buf_info { + struct trace_user_buf __percpu *tbuf; + int ref; +}; + + +static DEFINE_MUTEX(trace_user_buffer_mutex); +static struct trace_user_buf_info *trace_user_buffer; + +static void trace_user_fault_buffer_free(struct trace_user_buf_info *tinfo) +{ + char *buf; + int cpu; + + for_each_possible_cpu(cpu) { + buf = per_cpu_ptr(tinfo->tbuf, cpu)->buf; + kfree(buf); + } + free_percpu(tinfo->tbuf); + kfree(tinfo); +} + +static int trace_user_fault_buffer_enable(void) +{ + struct trace_user_buf_info *tinfo; + char *buf; + int cpu; + + guard(mutex)(&trace_user_buffer_mutex); + + if (trace_user_buffer) { + trace_user_buffer->ref++; + return 0; + } + + tinfo = kmalloc(sizeof(*tinfo), GFP_KERNEL); + if (!tinfo) + return -ENOMEM; + + tinfo->tbuf = alloc_percpu(struct trace_user_buf); + if (!tinfo->tbuf) { + kfree(tinfo); + return -ENOMEM; + } + + tinfo->ref = 1; + + /* Clear each buffer in case of error */ + for_each_possible_cpu(cpu) { + per_cpu_ptr(tinfo->tbuf, cpu)->buf = NULL; + } + + for_each_possible_cpu(cpu) { + buf = kmalloc_node(TRACE_MARKER_MAX_SIZE, GFP_KERNEL, + cpu_to_node(cpu)); + if (!buf) { + trace_user_fault_buffer_free(tinfo); + return -ENOMEM; + } + per_cpu_ptr(tinfo->tbuf, cpu)->buf = buf; + } + + trace_user_buffer = tinfo; + + return 0; +} + +static void trace_user_fault_buffer_disable(void) +{ + struct trace_user_buf_info *tinfo; + + guard(mutex)(&trace_user_buffer_mutex); + + tinfo = trace_user_buffer; + + if (WARN_ON_ONCE(!tinfo)) + return; + + if (--tinfo->ref) + return; + + trace_user_fault_buffer_free(tinfo); + trace_user_buffer = NULL; +} + +/* Must be called with preemption disabled */ +static char *trace_user_fault_read(struct trace_user_buf_info *tinfo, + const char __user *ptr, size_t size, + size_t *read_size) +{ + int cpu = smp_processor_id(); + char *buffer = per_cpu_ptr(tinfo->tbuf, cpu)->buf; + unsigned int cnt; + int trys = 0; + int ret; + + if (size > TRACE_MARKER_MAX_SIZE) + size = TRACE_MARKER_MAX_SIZE; + *read_size = 0; + + /* + * This acts similar to a seqcount. The per CPU context switches are + * recorded, migration is disabled and preemption is enabled. The + * read of the user space memory is copied into the per CPU buffer. + * Preemption is disabled again, and if the per CPU context switches count + * is still the same, it means the buffer has not been corrupted. + * If the count is different, it is assumed the buffer is corrupted + * and reading must be tried again. + */ + + do { + /* + * If for some reason, copy_from_user() always causes a context + * switch, this would then cause an infinite loop. + * If this task is preempted by another user space task, it + * will cause this task to try again. But just in case something + * changes where the copying from user space causes another task + * to run, prevent this from going into an infinite loop. + * 100 tries should be plenty. + */ + if (WARN_ONCE(trys++ > 100, "Error: Too many tries to read user space")) + return NULL; + + /* Read the current CPU context switch counter */ + cnt = nr_context_switches_cpu(cpu); + + /* + * Preemption is going to be enabled, but this task must + * remain on this CPU. + */ + migrate_disable(); + + /* + * Now preemption is being enabed and another task can come in + * and use the same buffer and corrupt our data. + */ + preempt_enable_notrace(); + + ret = __copy_from_user(buffer, ptr, size); + + preempt_disable_notrace(); + migrate_enable(); + + /* if it faulted, no need to test if the buffer was corrupted */ + if (ret) + return NULL; + + /* + * Preemption is disabled again, now check the per CPU context + * switch counter. If it doesn't match, then another user space + * process may have schedule in and corrupted our buffer. In that + * case the copying must be retried. + */ + } while (nr_context_switches_cpu(cpu) != cnt); + + *read_size = size; + return buffer; +} + static ssize_t tracing_mark_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *fpos) @@ -7250,6 +7389,8 @@ tracing_mark_write(struct file *filp, const char __user *ubuf, struct trace_array *tr = filp->private_data; ssize_t written = -ENODEV; unsigned long ip; + size_t size; + char *buf; if (tracing_disabled) return -EINVAL; @@ -7263,6 +7404,16 @@ tracing_mark_write(struct file *filp, const char __user *ubuf, if (cnt > TRACE_MARKER_MAX_SIZE) cnt = TRACE_MARKER_MAX_SIZE; + /* Must have preemption disabled while having access to the buffer */ + guard(preempt_notrace)(); + + buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size); + if (!buf) + return -EFAULT; + + if (cnt > size) + cnt = size; + /* The selftests expect this function to be the IP address */ ip = _THIS_IP_; @@ -7270,32 +7421,27 @@ tracing_mark_write(struct file *filp, const char __user *ubuf, if (tr == &global_trace) { guard(rcu)(); list_for_each_entry_rcu(tr, &marker_copies, marker_list) { - written = write_marker_to_buffer(tr, ubuf, cnt, ip); + written = write_marker_to_buffer(tr, buf, cnt, ip); if (written < 0) break; } } else { - written = write_marker_to_buffer(tr, ubuf, cnt, ip); + written = write_marker_to_buffer(tr, buf, cnt, ip); } return written; } static ssize_t write_raw_marker_to_buffer(struct trace_array *tr, - const char __user *ubuf, size_t cnt) + const char *buf, size_t cnt) { struct ring_buffer_event *event; struct trace_buffer *buffer; struct raw_data_entry *entry; ssize_t written; - int size; - int len; - -#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int)) + size_t size; size = sizeof(*entry) + cnt; - if (cnt < FAULT_SIZE_ID) - size += FAULT_SIZE_ID - cnt; buffer = tr->array_buffer.buffer; @@ -7309,14 +7455,8 @@ static ssize_t write_raw_marker_to_buffer(struct trace_array *tr, return -EBADF; entry = ring_buffer_event_data(event); - - len = copy_from_user_nofault(&entry->id, ubuf, cnt); - if (len) { - entry->id = -1; - memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE); - written = -EFAULT; - } else - written = cnt; + memcpy(&entry->id, buf, cnt); + written = cnt; __buffer_unlock_commit(buffer, event); @@ -7329,8 +7469,8 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf, { struct trace_array *tr = filp->private_data; ssize_t written = -ENODEV; - -#define FAULT_SIZE_ID (FAULTED_SIZE + sizeof(int)) + size_t size; + char *buf; if (tracing_disabled) return -EINVAL; @@ -7342,6 +7482,17 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf, if (cnt < sizeof(unsigned int)) return -EINVAL; + /* Must have preemption disabled while having access to the buffer */ + guard(preempt_notrace)(); + + buf = trace_user_fault_read(trace_user_buffer, ubuf, cnt, &size); + if (!buf) + return -EFAULT; + + /* raw write is all or nothing */ + if (cnt > size) + return -EINVAL; + /* The global trace_marker_raw can go to multiple instances */ if (tr == &global_trace) { guard(rcu)(); @@ -7357,6 +7508,27 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf, return written; } +static int tracing_mark_open(struct inode *inode, struct file *filp) +{ + int ret; + + ret = trace_user_fault_buffer_enable(); + if (ret < 0) + return ret; + + stream_open(inode, filp); + ret = tracing_open_generic_tr(inode, filp); + if (ret < 0) + trace_user_fault_buffer_disable(); + return ret; +} + +static int tracing_mark_release(struct inode *inode, struct file *file) +{ + trace_user_fault_buffer_disable(); + return tracing_release_generic_tr(inode, file); +} + static int tracing_clock_show(struct seq_file *m, void *v) { struct trace_array *tr = m->private; @@ -7764,13 +7936,13 @@ static const struct file_operations tracing_free_buffer_fops = { static const struct file_operations tracing_mark_fops = { .open = tracing_mark_open, .write = tracing_mark_write, - .release = tracing_release_generic_tr, + .release = tracing_mark_release, }; static const struct file_operations tracing_mark_raw_fops = { .open = tracing_mark_open, .write = tracing_mark_raw_write, - .release = tracing_release_generic_tr, + .release = tracing_mark_release, }; static const struct file_operations trace_clock_fops = { -- 2.51.0

3 weeks

1
0
0 0

[for-linus][PATCH 4/5] ring buffer: Propagate __rb_map_vma return value to caller

by Steven Rostedt

From: Ankit Khushwaha <ankitkhushwaha.linux(a)gmail.com> The return value from `__rb_map_vma()`, which rejects writable or executable mappings (VM_WRITE, VM_EXEC, or !VM_MAYSHARE), was being ignored. As a result the caller of `__rb_map_vma` always returned 0 even when the mapping had actually failed, allowing it to proceed with an invalid VMA. Cc: stable(a)vger.kernel.org Cc: Masami Hiramatsu <mhiramat(a)kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com> Link: https://lore.kernel.org/20251008172516.20697-1-ankitkhushwaha.linux@gmail.c… Fixes: 117c39200d9d7 ("ring-buffer: Introducing ring-buffer mapping functions") Reported-by: syzbot+ddc001b92c083dbf2b97(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?id=194151be8eaebd826005329b2e123aecae714b… Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux(a)gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org> --- kernel/trace/ring_buffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 43460949ad3f..1244d2c5c384 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -7273,7 +7273,7 @@ int ring_buffer_map(struct trace_buffer *buffer, int cpu, atomic_dec(&cpu_buffer->resize_disabled); } - return 0; + return err; } int ring_buffer_unmap(struct trace_buffer *buffer, int cpu) -- 2.51.0

3 weeks

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror October 2025