June 2024 - Linux-stable-mirror

[PATCH net v3] net: do not leave a dangling sk pointer, when socket creation fails

by Ignat Korchagin

It is possible to trigger a use-after-free by: * attaching an fentry probe to __sock_release() and the probe calling the bpf_get_socket_cookie() helper * running traceroute -I 1.1.1.1 on a freshly booted VM A KASAN enabled kernel will log something like below (decoded and stripped): ================================================================== BUG: KASAN: slab-use-after-free in __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29) Read of size 8 at addr ffff888007110dd8 by task traceroute/299 CPU: 2 PID: 299 Comm: traceroute Tainted: G E 6.10.0-rc2+ #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117 (discriminator 1)) print_report (mm/kasan/report.c:378 mm/kasan/report.c:488) ? __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29) kasan_report (mm/kasan/report.c:603) ? __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29) kasan_check_range (mm/kasan/generic.c:183 mm/kasan/generic.c:189) __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29) bpf_get_socket_ptr_cookie (./arch/x86/include/asm/preempt.h:94 ./include/linux/sock_diag.h:42 net/core/filter.c:5094 net/core/filter.c:5092) bpf_prog_875642cf11f1d139___sock_release+0x6e/0x8e bpf_trampoline_6442506592+0x47/0xaf __sock_release (net/socket.c:652) __sock_create (net/socket.c:1601) ... Allocated by task 299 on cpu 2 at 78.328492s: kasan_save_stack (mm/kasan/common.c:48) kasan_save_track (mm/kasan/common.c:68) __kasan_slab_alloc (mm/kasan/common.c:312 mm/kasan/common.c:338) kmem_cache_alloc_noprof (mm/slub.c:3941 mm/slub.c:4000 mm/slub.c:4007) sk_prot_alloc (net/core/sock.c:2075) sk_alloc (net/core/sock.c:2134) inet_create (net/ipv4/af_inet.c:327 net/ipv4/af_inet.c:252) __sock_create (net/socket.c:1572) __sys_socket (net/socket.c:1660 net/socket.c:1644 net/socket.c:1706) __x64_sys_socket (net/socket.c:1718) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Freed by task 299 on cpu 2 at 78.328502s: kasan_save_stack (mm/kasan/common.c:48) kasan_save_track (mm/kasan/common.c:68) kasan_save_free_info (mm/kasan/generic.c:582) poison_slab_object (mm/kasan/common.c:242) __kasan_slab_free (mm/kasan/common.c:256) kmem_cache_free (mm/slub.c:4437 mm/slub.c:4511) __sk_destruct (net/core/sock.c:2117 net/core/sock.c:2208) inet_create (net/ipv4/af_inet.c:397 net/ipv4/af_inet.c:252) __sock_create (net/socket.c:1572) __sys_socket (net/socket.c:1660 net/socket.c:1644 net/socket.c:1706) __x64_sys_socket (net/socket.c:1718) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Fix this by clearing the struct socket reference in sk_common_release() to cover all protocol families create functions, which may already attached the reference to the sk object with sock_init_data(). Fixes: c5dbb89fc2ac ("bpf: Expose bpf_get_socket_cookie to tracing programs") Suggested-by: Kuniyuki Iwashima <kuniyu(a)amazon.com> Signed-off-by: Ignat Korchagin <ignat(a)cloudflare.com> Cc: stable(a)vger.kernel.org Link: https://lore.kernel.org/netdev/20240613194047.36478-1-kuniyu@amazon.com/T/ --- Changes in v3: * re-added KASAN repro steps to the commit message (somehow stripped in v2) * stripped timestamps and thread id from the KASAN splat * removed comment from the code (commit message should be enough) Changes in v2: * moved the NULL-ing of the socket reference to sk_common_release() (as suggested by Kuniyuki Iwashima) * trimmed down the KASAN report in the commit message to show only relevant info net/core/sock.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/core/sock.c b/net/core/sock.c index 8629f9aecf91..100e975073ca 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -3742,6 +3742,9 @@ void sk_common_release(struct sock *sk) sk->sk_prot->unhash(sk); + if (sk->sk_socket) + sk->sk_socket->sk = NULL; + /* * In this point socket cannot receive new packets, but it is possible * that some packets are in flight because some CPU runs receiver and -- 2.39.2

1 year

5
6
0 0

[git:media_stage/master] media: uvcvideo: Fix integer overflow calculating timestamp

by Laurent Pinchart

This is an automatic generated email to let you know that the following patch were queued: Subject: media: uvcvideo: Fix integer overflow calculating timestamp Author: Ricardo Ribalda <ribalda(a)chromium.org> Date: Mon Jun 10 19:17:49 2024 +0000 The function uvc_video_clock_update() supports a single SOF overflow. Or in other words, the maximum difference between the first ant the last timestamp can be 4096 ticks or 4.096 seconds. This results in a maximum value for y2 of: 0x12FBECA00, that overflows 32bits. y2 = (u32)ktime_to_ns(ktime_sub(last->host_time, first->host_time)) + y1; Extend the size of y2 to u64 to support all its values. Without this patch: # yavta -s 1920x1080 -f YUYV -t 1/5 -c /dev/video0 Device /dev/v4l/by-id/usb-Shine-Optics_Integrated_Camera_0001-video-index0 opened. Device `Integrated Camera: Integrated C' on `usb-0000:00:14.0-6' (driver 'uvcvideo') supports video, capture, without mplanes. Video format set: YUYV (56595559) 1920x1080 (stride 3840) field none buffer size 4147200 Video format: YUYV (56595559) 1920x1080 (stride 3840) field none buffer size 4147200 Current frame rate: 1/5 Setting frame rate to: 1/5 Frame rate set: 1/5 8 buffers requested. length: 4147200 offset: 0 timestamp type/source: mono/SoE Buffer 0/0 mapped at address 0x7947ea94c000. length: 4147200 offset: 4149248 timestamp type/source: mono/SoE Buffer 1/0 mapped at address 0x7947ea557000. length: 4147200 offset: 8298496 timestamp type/source: mono/SoE Buffer 2/0 mapped at address 0x7947ea162000. length: 4147200 offset: 12447744 timestamp type/source: mono/SoE Buffer 3/0 mapped at address 0x7947e9d6d000. length: 4147200 offset: 16596992 timestamp type/source: mono/SoE Buffer 4/0 mapped at address 0x7947e9978000. length: 4147200 offset: 20746240 timestamp type/source: mono/SoE Buffer 5/0 mapped at address 0x7947e9583000. length: 4147200 offset: 24895488 timestamp type/source: mono/SoE Buffer 6/0 mapped at address 0x7947e918e000. length: 4147200 offset: 29044736 timestamp type/source: mono/SoE Buffer 7/0 mapped at address 0x7947e8d99000. 0 (0) [-] none 0 4147200 B 507.554210 508.874282 242.836 fps ts mono/SoE 1 (1) [-] none 2 4147200 B 508.886298 509.074289 0.751 fps ts mono/SoE 2 (2) [-] none 3 4147200 B 509.076362 509.274307 5.261 fps ts mono/SoE 3 (3) [-] none 4 4147200 B 509.276371 509.474336 5.000 fps ts mono/SoE 4 (4) [-] none 5 4147200 B 509.476394 509.674394 4.999 fps ts mono/SoE 5 (5) [-] none 6 4147200 B 509.676506 509.874345 4.997 fps ts mono/SoE 6 (6) [-] none 7 4147200 B 509.876430 510.074370 5.002 fps ts mono/SoE 7 (7) [-] none 8 4147200 B 510.076434 510.274365 5.000 fps ts mono/SoE 8 (0) [-] none 9 4147200 B 510.276421 510.474333 5.000 fps ts mono/SoE 9 (1) [-] none 10 4147200 B 510.476391 510.674429 5.001 fps ts mono/SoE 10 (2) [-] none 11 4147200 B 510.676434 510.874283 4.999 fps ts mono/SoE 11 (3) [-] none 12 4147200 B 510.886264 511.074349 4.766 fps ts mono/SoE 12 (4) [-] none 13 4147200 B 511.070577 511.274304 5.426 fps ts mono/SoE 13 (5) [-] none 14 4147200 B 511.286249 511.474301 4.637 fps ts mono/SoE 14 (6) [-] none 15 4147200 B 511.470542 511.674251 5.426 fps ts mono/SoE 15 (7) [-] none 16 4147200 B 511.672651 511.874337 4.948 fps ts mono/SoE 16 (0) [-] none 17 4147200 B 511.873988 512.074462 4.967 fps ts mono/SoE 17 (1) [-] none 18 4147200 B 512.075982 512.278296 4.951 fps ts mono/SoE 18 (2) [-] none 19 4147200 B 512.282631 512.482423 4.839 fps ts mono/SoE 19 (3) [-] none 20 4147200 B 518.986637 512.686333 0.149 fps ts mono/SoE 20 (4) [-] none 21 4147200 B 518.342709 512.886386 -1.553 fps ts mono/SoE 21 (5) [-] none 22 4147200 B 517.909812 513.090360 -2.310 fps ts mono/SoE 22 (6) [-] none 23 4147200 B 517.590775 513.294454 -3.134 fps ts mono/SoE 23 (7) [-] none 24 4147200 B 513.298465 513.494335 -0.233 fps ts mono/SoE 24 (0) [-] none 25 4147200 B 513.510273 513.698375 4.721 fps ts mono/SoE 25 (1) [-] none 26 4147200 B 513.698904 513.902327 5.301 fps ts mono/SoE 26 (2) [-] none 27 4147200 B 513.895971 514.102348 5.074 fps ts mono/SoE 27 (3) [-] none 28 4147200 B 514.099091 514.306337 4.923 fps ts mono/SoE 28 (4) [-] none 29 4147200 B 514.310348 514.510567 4.734 fps ts mono/SoE 29 (5) [-] none 30 4147200 B 514.509295 514.710367 5.026 fps ts mono/SoE 30 (6) [-] none 31 4147200 B 521.532513 514.914398 0.142 fps ts mono/SoE 31 (7) [-] none 32 4147200 B 520.885277 515.118385 -1.545 fps ts mono/SoE 32 (0) [-] none 33 4147200 B 520.411140 515.318336 -2.109 fps ts mono/SoE 33 (1) [-] none 34 4147200 B 515.325425 515.522278 -0.197 fps ts mono/SoE 34 (2) [-] none 35 4147200 B 515.538276 515.726423 4.698 fps ts mono/SoE 35 (3) [-] none 36 4147200 B 515.720767 515.930373 5.480 fps ts mono/SoE Cc: stable(a)vger.kernel.org Fixes: 66847ef013cc ("[media] uvcvideo: Add UVC timestamps support") Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org> Reviewed-by: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com> Link: https://lore.kernel.org/r/20240610-hwtimestamp-followup-v1-2-f9eaed7be7f0@c… Signed-off-by: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com> drivers/media/usb/uvc/uvc_video.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) --- diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c index aebcf9b25a16..4dfc1b86bdee 100644 --- a/drivers/media/usb/uvc/uvc_video.c +++ b/drivers/media/usb/uvc/uvc_video.c @@ -760,11 +760,11 @@ void uvc_video_clock_update(struct uvc_streaming *stream, unsigned long flags; u64 timestamp; u32 delta_stc; - u32 y1, y2; + u32 y1; u32 x1, x2; u32 mean; u32 sof; - u64 y; + u64 y, y2; if (!uvc_hw_timestamps_param) return; @@ -816,7 +816,7 @@ void uvc_video_clock_update(struct uvc_streaming *stream, sof = y; uvc_dbg(stream->dev, CLOCK, - "%s: PTS %u y %llu.%06llu SOF %u.%06llu (x1 %u x2 %u y1 %u y2 %u SOF offset %u)\n", + "%s: PTS %u y %llu.%06llu SOF %u.%06llu (x1 %u x2 %u y1 %u y2 %llu SOF offset %u)\n", stream->dev->name, buf->pts, y >> 16, div_u64((y & 0xffff) * 1000000, 65536), sof >> 16, div_u64(((u64)sof & 0xffff) * 1000000LLU, 65536), @@ -831,7 +831,7 @@ void uvc_video_clock_update(struct uvc_streaming *stream, goto done; y1 = NSEC_PER_SEC; - y2 = (u32)ktime_to_ns(ktime_sub(last->host_time, first->host_time)) + y1; + y2 = ktime_to_ns(ktime_sub(last->host_time, first->host_time)) + y1; /* * Interpolated and host SOF timestamps can wrap around at slightly @@ -852,7 +852,7 @@ void uvc_video_clock_update(struct uvc_streaming *stream, timestamp = ktime_to_ns(first->host_time) + y - y1; uvc_dbg(stream->dev, CLOCK, - "%s: SOF %u.%06llu y %llu ts %llu buf ts %llu (x1 %u/%u/%u x2 %u/%u/%u y1 %u y2 %u)\n", + "%s: SOF %u.%06llu y %llu ts %llu buf ts %llu (x1 %u/%u/%u x2 %u/%u/%u y1 %u y2 %llu)\n", stream->dev->name, sof >> 16, div_u64(((u64)sof & 0xffff) * 1000000LLU, 65536), y, timestamp, vbuf->vb2_buf.timestamp,

1 year

1
0
0 0

[PATCH 01/10] ASoC: fsl: fsl_qmc_audio: Check devm_kasprintf() returned value

by Herve Codina

devm_kasprintf() can return a NULL pointer on failure but this returned value is not checked. Fix this lack and check the returned value. Fixes: 075c7125b11c ("ASoC: fsl: Add support for QMC audio") Cc: stable(a)vger.kernel.org Signed-off-by: Herve Codina <herve.codina(a)bootlin.com> --- sound/soc/fsl/fsl_qmc_audio.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/soc/fsl/fsl_qmc_audio.c b/sound/soc/fsl/fsl_qmc_audio.c index bfaaa451735b..dd90ef16fa97 100644 --- a/sound/soc/fsl/fsl_qmc_audio.c +++ b/sound/soc/fsl/fsl_qmc_audio.c @@ -604,6 +604,8 @@ static int qmc_audio_dai_parse(struct qmc_audio *qmc_audio, struct device_node * qmc_dai->name = devm_kasprintf(qmc_audio->dev, GFP_KERNEL, "%s.%d", np->parent->name, qmc_dai->id); + if (!qmc_dai->name) + return -ENOMEM; qmc_dai->qmc_chan = devm_qmc_chan_get_byphandle(qmc_audio->dev, np, "fsl,qmc-chan"); -- 2.45.0

1 year

1
0
0 0

+ mm-page_alloc-separate-thp-pcp-into-movable-and-non-movable-categories.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled Subject: mm/page_alloc: Separate THP PCP into movable and non-movable categories has been added to the -mm mm-hotfixes-unstable branch. Its filename is mm-page_alloc-separate-thp-pcp-into-movable-and-non-movable-categories.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: yangge <yangge1116(a)126.com> Subject: mm/page_alloc: Separate THP PCP into movable and non-movable categories Date: Thu, 20 Jun 2024 08:59:50 +0800 Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations") no longer differentiates the migration type of pages in THP-sized PCP list, it's possible that non-movable allocation requests may get a CMA page from the list, in some cases, it's not acceptable. If a large number of CMA memory are configured in system (for example, the CMA memory accounts for 50% of the system memory), starting a virtual machine with device passthrough will get stuck. During starting the virtual machine, it will call pin_user_pages_remote(..., FOLL_LONGTERM, ...) to pin memory. Normally if a page is present and in CMA area, pin_user_pages_remote() will migrate the page from CMA area to non-CMA area because of FOLL_LONGTERM flag. But if non-movable allocation requests return CMA memory, migrate_longterm_unpinnable_pages() will migrate a CMA page to another CMA page, which will fail to pass the check in check_and_migrate_movable_pages() and cause migration endless. Call trace: pin_user_pages_remote --__gup_longterm_locked // endless loops in this function ----_get_user_pages_locked ----check_and_migrate_movable_pages ------migrate_longterm_unpinnable_pages --------alloc_migration_target This problem will also have a negative impact on CMA itself. For example, when CMA is borrowed by THP, and we need to reclaim it through cma_alloc() or dma_alloc_coherent(), we must move those pages out to ensure CMA's users can retrieve that contigous memory. Currently, CMA's memory is occupied by non-movable pages, meaning we can't relocate them. As a result, cma_alloc() is more likely to fail. To fix the problem above, we add one PCP list for THP, which will not introduce a new cacheline for struct per_cpu_pages. THP will have 2 PCP lists, one PCP list is used by MOVABLE allocation, and the other PCP list is used by UNMOVABLE allocation. MOVABLE allocation contains GPF_MOVABLE, and UNMOVABLE allocation contains GFP_UNMOVABLE and GFP_RECLAIMABLE. Link: https://lkml.kernel.org/r/1718845190-4456-1-git-send-email-yangge1116@126.c… Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations") Signed-off-by: yangge <yangge1116(a)126.com> Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com> Cc: Barry Song <21cnbao(a)gmail.com> Cc: Mel Gorman <mgorman(a)techsingularity.net> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- include/linux/mmzone.h | 9 ++++----- mm/page_alloc.c | 9 +++++++-- 2 files changed, 11 insertions(+), 7 deletions(-) --- a/include/linux/mmzone.h~mm-page_alloc-separate-thp-pcp-into-movable-and-non-movable-categories +++ a/include/linux/mmzone.h @@ -654,13 +654,12 @@ enum zone_watermarks { }; /* - * One per migratetype for each PAGE_ALLOC_COSTLY_ORDER. One additional list - * for THP which will usually be GFP_MOVABLE. Even if it is another type, - * it should not contribute to serious fragmentation causing THP allocation - * failures. + * One per migratetype for each PAGE_ALLOC_COSTLY_ORDER. Two additional lists + * are added for THP. One PCP list is used by GPF_MOVABLE, and the other PCP list + * is used by GFP_UNMOVABLE and GFP_RECLAIMABLE. */ #ifdef CONFIG_TRANSPARENT_HUGEPAGE -#define NR_PCP_THP 1 +#define NR_PCP_THP 2 #else #define NR_PCP_THP 0 #endif --- a/mm/page_alloc.c~mm-page_alloc-separate-thp-pcp-into-movable-and-non-movable-categories +++ a/mm/page_alloc.c @@ -504,10 +504,15 @@ out: static inline unsigned int order_to_pindex(int migratetype, int order) { + bool __maybe_unused movable; + #ifdef CONFIG_TRANSPARENT_HUGEPAGE if (order > PAGE_ALLOC_COSTLY_ORDER) { VM_BUG_ON(order != HPAGE_PMD_ORDER); - return NR_LOWORDER_PCP_LISTS; + + movable = migratetype == MIGRATE_MOVABLE; + + return NR_LOWORDER_PCP_LISTS + movable; } #else VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER); @@ -521,7 +526,7 @@ static inline int pindex_to_order(unsign int order = pindex / MIGRATE_PCPTYPES; #ifdef CONFIG_TRANSPARENT_HUGEPAGE - if (pindex == NR_LOWORDER_PCP_LISTS) + if (pindex >= NR_LOWORDER_PCP_LISTS) order = HPAGE_PMD_ORDER; #else VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER); _ Patches currently in -mm which might be from yangge1116(a)126.com are mm-page_alloc-separate-thp-pcp-into-movable-and-non-movable-categories.patch mm-page_alloc-add-one-pcp-list-for-thp.patch

1 year

1
0
0 0

Re: [PATCH v2] cifs: drop the incorrect assertion in cifs_swap_rw()

by Barry Song

On Wed, Jun 19, 2024 at 3:48 PM Steve French <smfrench(a)gmail.com> wrote: > > tentatively merged into cifs-2.6.git for-next pending testing and any additional review Steve, Thanks! I guess you missed an email from mm-commits. A couple of hours ago, this was pulled into mm-hotfixes-unstable, likely for the same purpose. Will this cause any conflicts when both changes hit linux-next? https://lore.kernel.org/mm-commits/20240618195943.EC07BC3277B@smtp.kernel.o… Will we just keep one? > > On Tue, Jun 18, 2024 at 3:56 AM Barry Song <21cnbao(a)gmail.com> wrote: >> >> From: Barry Song <v-songbaohua(a)oppo.com> >> >> Since commit 2282679fb20b ("mm: submit multipage write for SWP_FS_OPS >> swap-space"), we can plug multiple pages then unplug them all together. >> That means iov_iter_count(iter) could be way bigger than PAGE_SIZE, it >> actually equals the size of iov_iter_npages(iter, INT_MAX). >> >> Note this issue has nothing to do with large folios as we don't support >> THP_SWPOUT to non-block devices. >> >> Fixes: 2282679fb20b ("mm: submit multipage write for SWP_FS_OPS swap-space") >> Reported-by: Christoph Hellwig <hch(a)lst.de> >> Closes: https://lore.kernel.org/linux-mm/20240614100329.1203579-1-hch@lst.de/ >> Cc: NeilBrown <neilb(a)suse.de> >> Cc: Anna Schumaker <anna(a)kernel.org> >> Cc: Steve French <sfrench(a)samba.org> >> Cc: Trond Myklebust <trondmy(a)kernel.org> >> Cc: Chuanhua Han <hanchuanhua(a)oppo.com> >> Cc: Ryan Roberts <ryan.roberts(a)arm.com> >> Cc: Chris Li <chrisl(a)kernel.org> >> Cc: "Huang, Ying" <ying.huang(a)intel.com> >> Cc: Jeff Layton <jlayton(a)kernel.org> >> Cc: Paulo Alcantara <pc(a)manguebit.com> >> Cc: Ronnie Sahlberg <ronniesahlberg(a)gmail.com> >> Cc: Shyam Prasad N <sprasad(a)microsoft.com> >> Cc: Tom Talpey <tom(a)talpey.com> >> Cc: Bharath SM <bharathsm(a)microsoft.com> >> Cc: <stable(a)vger.kernel.org> >> Signed-off-by: Barry Song <v-songbaohua(a)oppo.com> >> --- >> -v2: >> * drop the assertion instead of fixing the assertion. >> per the comments of Willy, Christoph in nfs thread. >> >> fs/smb/client/file.c | 2 -- >> 1 file changed, 2 deletions(-) >> >> diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c >> index 9d5c2440abfc..1e269e0bc75b 100644 >> --- a/fs/smb/client/file.c >> +++ b/fs/smb/client/file.c >> @@ -3200,8 +3200,6 @@ static int cifs_swap_rw(struct kiocb *iocb, struct iov_iter *iter) >> { >> ssize_t ret; >> >> - WARN_ON_ONCE(iov_iter_count(iter) != PAGE_SIZE); >> - >> if (iov_iter_rw(iter) == READ) >> ret = netfs_unbuffered_read_iter_locked(iocb, iter); >> else >> -- >> 2.34.1 >> >> > > > -- > Thanks, > > Steve

1 year

2
1
0 0

[PATCH v2] nfs: drop the incorrect assertion in nfs_swap_rw()

by Barry Song

From: Christoph Hellwig <hch(a)lst.de> Since commit 2282679fb20b ("mm: submit multipage write for SWP_FS_OPS swap-space"), we can plug multiple pages then unplug them all together. That means iov_iter_count(iter) could be way bigger than PAGE_SIZE, it actually equals the size of iov_iter_npages(iter, INT_MAX). Note this issue has nothing to do with large folios as we don't support THP_SWPOUT to non-block devices. Fixes: 2282679fb20b ("mm: submit multipage write for SWP_FS_OPS swap-space") Reported-by: Christoph Hellwig <hch(a)lst.de> Closes: https://lore.kernel.org/linux-mm/20240617053201.GA16852@lst.de/ Cc: NeilBrown <neilb(a)suse.de> Cc: Anna Schumaker <anna(a)kernel.org> Cc: Steve French <sfrench(a)samba.org> Cc: Trond Myklebust <trondmy(a)kernel.org> Cc: Chuanhua Han <hanchuanhua(a)oppo.com> Cc: Ryan Roberts <ryan.roberts(a)arm.com> Cc: Chris Li <chrisl(a)kernel.org> Cc: "Huang, Ying" <ying.huang(a)intel.com> Cc: Jeff Layton <jlayton(a)kernel.org> Cc: <stable(a)vger.kernel.org> Cc: Matthew Wilcox <willy(a)infradead.org> Cc: Martin Wege <martin.l.wege(a)gmail.com> Signed-off-by: Christoph Hellwig <hch(a)lst.de> [Barry: figure out the cause and correct the commit message] Signed-off-by: Barry Song <v-songbaohua(a)oppo.com> --- fs/nfs/direct.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c index bb2f583eb28b..90079ca134dd 100644 --- a/fs/nfs/direct.c +++ b/fs/nfs/direct.c @@ -141,8 +141,6 @@ int nfs_swap_rw(struct kiocb *iocb, struct iov_iter *iter) { ssize_t ret; - VM_BUG_ON(iov_iter_count(iter) != PAGE_SIZE); - if (iov_iter_rw(iter) == READ) ret = nfs_file_direct_read(iocb, iter, true); else -- 2.34.1

1 year

4
4
0 0

[PATCH V2] mm/page_alloc: Separate THP PCP into movable and non-movable categories

by yangge1116＠126.com

From: yangge <yangge1116(a)126.com> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations") no longer differentiates the migration type of pages in THP-sized PCP list, it's possible that non-movable allocation requests may get a CMA page from the list, in some cases, it's not acceptable. If a large number of CMA memory are configured in system (for example, the CMA memory accounts for 50% of the system memory), starting a virtual machine with device passthrough will get stuck. During starting the virtual machine, it will call pin_user_pages_remote(..., FOLL_LONGTERM, ...) to pin memory. Normally if a page is present and in CMA area, pin_user_pages_remote() will migrate the page from CMA area to non-CMA area because of FOLL_LONGTERM flag. But if non-movable allocation requests return CMA memory, migrate_longterm_unpinnable_pages() will migrate a CMA page to another CMA page, which will fail to pass the check in check_and_migrate_movable_pages() and cause migration endless. Call trace: pin_user_pages_remote --__gup_longterm_locked // endless loops in this function ----_get_user_pages_locked ----check_and_migrate_movable_pages ------migrate_longterm_unpinnable_pages --------alloc_migration_target This problem will also have a negative impact on CMA itself. For example, when CMA is borrowed by THP, and we need to reclaim it through cma_alloc() or dma_alloc_coherent(), we must move those pages out to ensure CMA's users can retrieve that contigous memory. Currently, CMA's memory is occupied by non-movable pages, meaning we can't relocate them. As a result, cma_alloc() is more likely to fail. To fix the problem above, we add one PCP list for THP, which will not introduce a new cacheline for struct per_cpu_pages. THP will have 2 PCP lists, one PCP list is used by MOVABLE allocation, and the other PCP list is used by UNMOVABLE allocation. MOVABLE allocation contains GPF_MOVABLE, and UNMOVABLE allocation contains GFP_UNMOVABLE and GFP_RECLAIMABLE. Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations") Cc: <stable(a)vger.kernel.org> Signed-off-by: yangge <yangge1116(a)126.com> --- V2: - Change the commit title - Add Cc to stable include/linux/mmzone.h | 9 ++++----- mm/page_alloc.c | 9 +++++++-- 2 files changed, 11 insertions(+), 7 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index b7546dd..cb7f265 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -656,13 +656,12 @@ enum zone_watermarks { }; /* - * One per migratetype for each PAGE_ALLOC_COSTLY_ORDER. One additional list - * for THP which will usually be GFP_MOVABLE. Even if it is another type, - * it should not contribute to serious fragmentation causing THP allocation - * failures. + * One per migratetype for each PAGE_ALLOC_COSTLY_ORDER. Two additional lists + * are added for THP. One PCP list is used by GPF_MOVABLE, and the other PCP list + * is used by GFP_UNMOVABLE and GFP_RECLAIMABLE. */ #ifdef CONFIG_TRANSPARENT_HUGEPAGE -#define NR_PCP_THP 1 +#define NR_PCP_THP 2 #else #define NR_PCP_THP 0 #endif diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 8f416a0..0a837e6 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -504,10 +504,15 @@ static void bad_page(struct page *page, const char *reason) static inline unsigned int order_to_pindex(int migratetype, int order) { + bool __maybe_unused movable; + #ifdef CONFIG_TRANSPARENT_HUGEPAGE if (order > PAGE_ALLOC_COSTLY_ORDER) { VM_BUG_ON(order != HPAGE_PMD_ORDER); - return NR_LOWORDER_PCP_LISTS; + + movable = migratetype == MIGRATE_MOVABLE; + + return NR_LOWORDER_PCP_LISTS + movable; } #else VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER); @@ -521,7 +526,7 @@ static inline int pindex_to_order(unsigned int pindex) int order = pindex / MIGRATE_PCPTYPES; #ifdef CONFIG_TRANSPARENT_HUGEPAGE - if (pindex == NR_LOWORDER_PCP_LISTS) + if (pindex >= NR_LOWORDER_PCP_LISTS) order = HPAGE_PMD_ORDER; #else VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER); -- 2.7.4

1 year

1
0
0 0

[PATCH net] net/tcp_ao: Don't leak ao_info on error-path

by Dmitry Safonov via B4 Relay

From: Dmitry Safonov <0x7f454c46(a)gmail.com> It seems I introduced it together with TCP_AO_CMDF_AO_REQUIRED, on version 5 [1] of TCP-AO patches. Quite frustrative that having all these selftests that I've written, running kmemtest & kcov was always in todo. [1]: https://lore.kernel.org/netdev/20230215183335.800122-5-dima@arista.com/ Reported-by: Jakub Kicinski <kuba(a)kernel.org> Closes: https://lore.kernel.org/netdev/20240617072451.1403e1d2@kernel.org/ Fixes: 0aadc73995d0 ("net/tcp: Prevent TCP-MD5 with TCP-AO being set") Cc: stable(a)vger.kernel.org Signed-off-by: Dmitry Safonov <0x7f454c46(a)gmail.com> --- net/ipv4/tcp_ao.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 37c42b63ff99..09c0fa6756b7 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -1968,8 +1968,10 @@ static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family, first = true; } - if (cmd.ao_required && tcp_ao_required_verify(sk)) - return -EKEYREJECTED; + if (cmd.ao_required && tcp_ao_required_verify(sk)) { + err = -EKEYREJECTED; + goto out; + } /* For sockets in TCP_CLOSED it's possible set keys that aren't * matching the future peer (address/port/VRF/etc), --- base-commit: 92e5605a199efbaee59fb19e15d6cc2103a04ec2 change-id: 20240619-tcp-ao-required-leak-22b4a9b6d743 Best regards, -- Dmitry Safonov <0x7f454c46(a)gmail.com>

1 year

3
2
0 0

Re: [PATCH 6.9 000/281] 6.9.6-rc1 review

by Ronald Warsow

Hi Greg no regressions here on x86_64 (RKL, Intel 11th Gen. CPU) Thanks Tested-by: Ronald Warsow <rwarsow(a)gmx.de>

1 year

1
0
0 0

[PATCH] thermal: int340x: processor_thermal: Support shared interrupts

by Srinivas Pandruvada

On some systems the processor thermal device interrupt is shared with other PCI devices. In this case return IRQ_NONE from the interrupt handler when the interrupt is not for the processor thermal device. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com> Fixes: f0658708e863 ("thermal: int340x: processor_thermal: Use non MSI interrupts by default") Cc: <stable(a)vger.kernel.org> # v6.7+ --- This was only observed on a non production system. So not urgent. .../intel/int340x_thermal/processor_thermal_device_pci.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c index 14e34eabc419..4a1bfebb1b8e 100644 --- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c +++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c @@ -150,7 +150,7 @@ static irqreturn_t proc_thermal_irq_handler(int irq, void *devid) { struct proc_thermal_pci *pci_info = devid; struct proc_thermal_device *proc_priv; - int ret = IRQ_HANDLED; + int ret = IRQ_NONE; u32 status; proc_priv = pci_info->proc_priv; @@ -175,6 +175,7 @@ static irqreturn_t proc_thermal_irq_handler(int irq, void *devid) /* Disable enable interrupt flag */ proc_thermal_mmio_write(pci_info, PROC_THERMAL_MMIO_INT_ENABLE_0, 0); pkg_thermal_schedule_work(&pci_info->work); + ret = IRQ_HANDLED; } pci_write_config_byte(pci_info->pdev, 0xdc, 0x01); -- 2.44.0

1 year

2
1
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror June 2024