Linux-stable-mirror February 2024

linux-stable-mirror@lists.linaro.org

476 participants
995 discussions

[PATCH net 00/13] mptcp: misc. fixes for v6.8

by Matthieu Baerts (NGI0)

This series includes 4 types of fixes:

Patches 1 and 2 force the path-managers not to allocate a new address entry when dealing with the "special" ID 0, reserved for the address of the initial subflow. These patches can be backported up to v5.19 and v5.12 respectively.

Patches 3 to 6 fix the in-kernel path-manager so that it does not create duplicated subflows. Patch 6 is the main fix, but patches 3 to 5 are prerequisites of a sort: they fix some data races that could also lead to the creation of unexpected subflows. These patches can be backported up to v5.7, v5.10, v6.0, and v5.15 respectively.

Note that patch 3 modifies the existing ULP API. No better solution has been found for -net, and there is some similar prior art, see commit 0df48c26d841 ("tcp: add tcpi_bytes_acked to tcp_info"). Please also note that TLS ULP Diag likely has the same issue.

Patches 7 to 9 fix issues in the selftests when executing them on older kernels, e.g. when testing the latest version of these kselftests on the v5.15.148 kernel, as is done by LKFT when validating stable kernels. These patches only avoid printing expected errors to the console and marking some tests as "OK" while they have been skipped. Patches 7 and 8 can be backported up to v6.6.

Patches 10 to 13 make sure all MPTCP selftests subtests have a unique name. It is important to have a unique (sub)test name in TAP, because that's the test identifier. Some CI environments might drop tests with duplicated names. Patches 10 to 12 can be backported up to v6.6.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (2):
      mptcp: add needs_id for userspace appending addr
      mptcp: add needs_id for netlink appending addr

Matthieu Baerts (NGI0) (7):
      selftests: mptcp: pm nl: also list skipped tests
      selftests: mptcp: pm nl: avoid error msg on older kernels
      selftests: mptcp: diag: fix bash warnings on older kernels
      selftests: mptcp: simult flows: fix some subtest names
      selftests: mptcp: userspace_pm: unique subtest names
      selftests: mptcp: diag: unique 'in use' subtest names
      selftests: mptcp: diag: unique 'cestab' subtest names

Paolo Abeni (4):
      mptcp: fix lockless access in subflow ULP diag
      mptcp: fix data races on local_id
      mptcp: fix data races on remote_id
      mptcp: fix duplicate subflow creation

 include/net/tcp.h | 2 +-
 net/mptcp/diag.c | 8 ++-
 net/mptcp/pm_netlink.c | 69 ++++++++++++++---------
 net/mptcp/pm_userspace.c | 15 ++---
 net/mptcp/protocol.c | 2 +-
 net/mptcp/protocol.h | 15 ++++-
 net/mptcp/subflow.c | 15 ++---
 net/tls/tls_main.c | 2 +-
 tools/testing/selftests/net/mptcp/diag.sh | 41 ++++++++------
 tools/testing/selftests/net/mptcp/pm_netlink.sh | 8 ++-
 tools/testing/selftests/net/mptcp/simult_flows.sh | 3 +-
 tools/testing/selftests/net/mptcp/userspace_pm.sh | 4 +-
 12 files changed, 116 insertions(+), 68 deletions(-)
---
base-commit: c40c0d3a768c78a023a72fb2ceea00743e3a695d
change-id: 20240215-upstream-net-20240215-misc-fixes-03815ec14dc6

Best regards,
-- 
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
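
As an illustration of why patches 10 to 13 care about unique subtest names, the following is a minimal sketch in C of a duplicate-name check. It is not taken from the series: the selftests themselves are shell scripts, and the helper name count_duplicate_names() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the MPTCP selftests: counts how many
 * entries repeat a name that already appeared earlier in the list.
 * A TAP consumer that keys results by name could drop those entries.
 */
size_t count_duplicate_names(const char *const names[], size_t n)
{
	size_t dups = 0;

	assert(names != NULL || n == 0);
	for (size_t i = 1; i < n; i++) {
		for (size_t j = 0; j < i; j++) {
			if (strcmp(names[i], names[j]) == 0) {
				dups++;
				break;	/* count each repeated entry once */
			}
		}
	}
	return dups;
}
```

A result of zero means every name can serve as a test identifier; any other value means some results may be merged or dropped by a consumer that assumes uniqueness.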

1 year, 10 months

[GIT PULL] bcachefs stable updates for v6.7

by Kent Overstreet

Hi Greg, few stable updates for you -

Cheers,
Kent

The following changes since commit 0dd3ee31125508cd67f7e7172247f05b7fd1753a:

  Linux 6.7 (2024-01-07 12:18:38 -0800)

are available in the Git repository at:

  https://evilpiepirate.org/git/bcachefs.git tags/bcachefs-for-v6.7-stable-20240208

for you to fetch changes up to f1582f4774ac7c30c5460a8c7a6e5a82b9ce5a6a:

  bcachefs: time_stats: Check for last_event == 0 when updating freq stats (2024-02-08 15:33:11 -0500)

----------------------------------------------------------------
bcachefs updates for v6.7 stable:

locking fixes in subvolume create, destroy paths - Al, Su Yue, Guoyu Ou
fix race in thread_with_file - Mathias Krause
small rebalance fixes - Daniel, myself
workaround for building with old clang (can't take a pointer to memcmp)
build fix on parisc
minor time_stats fix

----------------------------------------------------------------
Al Viro (2):
      new helper: user_path_locked_at()
      bch2_ioctl_subvolume_destroy(): fix locking

Christoph Hellwig (1):
      bcachefs: fix incorrect usage of REQ_OP_FLUSH

Daniel Hill (1):
      bcachefs: rebalance should wakeup on shutdown if disabled

Guoyu Ou (1):
      bcachefs: unlock parent dir if entry is not found in subvolume deletion

Helge Deller (1):
      bcachefs: Fix build on parisc by avoiding __multi3()

Kent Overstreet (4):
      bcachefs: Don't pass memcmp() as a pointer
      bcachefs: Add missing bch2_moving_ctxt_flush_all()
      bcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount
      bcachefs: time_stats: Check for last_event == 0 when updating freq stats

Mathias Krause (1):
      bcachefs: install fd later to avoid race with close

Su Yue (2):
      bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit
      bcachefs: grab s_umount only if snapshotting

 fs/bcachefs/chardev.c | 3 +--
 fs/bcachefs/clock.c | 4 ++--
 fs/bcachefs/fs-io.c | 2 +-
 fs/bcachefs/fs-ioctl.c | 42 +++++++++++++++++++++--------------------
 fs/bcachefs/journal_io.c | 3 ++-
 fs/bcachefs/mean_and_variance.h | 2 +-
 fs/bcachefs/move.c | 2 +-
 fs/bcachefs/move.h | 1 +
 fs/bcachefs/rebalance.c | 13 +++++++++++--
 fs/bcachefs/replicas.c | 10 ++++++++--
 fs/bcachefs/snapshot.c | 2 +-
 fs/bcachefs/util.c | 5 +++--
 fs/namei.c | 16 +++++++++++++---
 include/linux/namei.h | 1 +
 14 files changed, 68 insertions(+), 38 deletions(-)

1 year, 10 months

[PATCH] accel/ivpu: Don't enable any tiles by default on VPU40xx

by Jacek Lawrynowicz

From: Andrzej Kacprowski <Andrzej.Kacprowski(a)intel.com>

There is no point in requesting 1 tile on VPU40xx as the FW will probably need more tiles to run workloads, so it will have to reconfigure PLL anyway. Don't enable any tiles and allow the FW to perform initial tile configuration.

This improves NPU boot stability as the tiles are always enabled only by the FW from the same initial state.

Fixes: 79cdc56c4a54 ("accel/ivpu: Add initial support for VPU 4")
Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski(a)intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz(a)linux.intel.com>
---
 drivers/accel/ivpu/ivpu_hw_40xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c b/drivers/accel/ivpu/ivpu_hw_40xx.c
index 1c995307c113..a1523d0b1ef3 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -24,7 +24,7 @@
 #define SKU_HW_ID_SHIFT 16u
 #define SKU_HW_ID_MASK 0xffff0000u
 
-#define PLL_CONFIG_DEFAULT 0x1
+#define PLL_CONFIG_DEFAULT 0x0
 #define PLL_CDYN_DEFAULT 0x80
 #define PLL_EPP_DEFAULT 0x80
 #define PLL_REF_CLK_FREQ (50 * 1000000)
-- 
2.43.0

1 year, 10 months

[PATCH v4] mm/swap: fix race when skipping swapcache

by Kairui Song

From: Kairui Song <kasong(a)tencent.com>

When skipping swapcache for SWP_SYNCHRONOUS_IO, if two or more threads swap in the same entry at the same time, they get different pages (A, B). Before one thread (T0) finishes the swapin and installs page (A) to the PTE, another thread (T1) could finish swapin of page (B), swap_free the entry, then swap out the possibly modified page reusing the same entry. It breaks the pte_same check in (T0) because the PTE value is unchanged, causing an ABA problem. Thread (T0) will install a stale page (A) into the PTE and cause data corruption.

One possible callstack is like this:

CPU0                                 CPU1
----                                 ----
do_swap_page()                       do_swap_page() with same entry
<direct swapin path>                 <direct swapin path>
<alloc page A>                       <alloc page B>
swap_read_folio() <- read to page A  swap_read_folio() <- read to page B
<slow on later locks or interrupt>   <finished swapin first>
...                                  set_pte_at()
                                     swap_free() <- entry is free
                                     <write to page B, now page A stale>
                                     <swap out page B to same swap entry>
pte_same() <- Check pass, PTE seems
              unchanged, but page A
              is stale!
swap_free() <- page B content lost!
set_pte_at() <- stale page A installed!

And besides, for ZRAM, swap_free() allows the swap device to discard the entry content, so even if page (B) is not modified, if swap_read_folio() on CPU0 happens later than swap_free() on CPU1, it may also cause data loss.

To fix this, reuse swapcache_prepare, which pins the swap entry using the cache flag and allows only one thread to swap it in; it also prevents any parallel code from putting the entry in the cache. Release the pin after the PT is unlocked. Racers just loop and wait since it's a rare and very short event. A schedule_timeout_uninterruptible(1) call is added to avoid repeated page faults wasting too much CPU, causing livelock or adding too much noise to perf statistics. A similar livelock issue was described in commit 029c4628b2eb ("mm: swap: get rid of livelock in swapin readahead")

Reproducer:

This race issue can be triggered easily using a well constructed reproducer and patched brd (with a delay in read path) [1]:

With latest 6.8 mainline, race caused data loss can be observed easily:
$ gcc -g -lpthread test-thread-swap-race.c && ./a.out
  Polulating 32MB of memory region...
  Keep swapping out...
  Starting round 0...
  Spawning 65536 workers...
  32746 workers spawned, wait for done...
  Round 0: Error on 0x5aa00, expected 32746, got 32743, 3 data loss!
  Round 0: Error on 0x395200, expected 32746, got 32743, 3 data loss!
  Round 0: Error on 0x3fd000, expected 32746, got 32737, 9 data loss!
  Round 0 Failed, 15 data loss!

This reproducer spawns multiple threads sharing the same memory region using a small swap device. Every two threads update mapped pages one by one in opposite directions, trying to create a race, with one dedicated thread that keeps swapping the data out using madvise.

The reproducer triggered the race about once every 5 minutes, so the race should be entirely possible in production.

After this patch, I ran the reproducer for over a few hundred rounds and no data loss was observed.

Performance overhead is minimal, microbenchmark swapin 10G from 32G zram:

Before:     10934698 us
After:      11157121 us
Cached:     13155355 us (Dropping SWP_SYNCHRONOUS_IO flag)

Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device")
Link: https://github.com/ryncsn/emm-test-project/tree/master/swap-stress-race [1]
Reported-by: "Huang, Ying" <ying.huang(a)intel.com>
Closes: https://lore.kernel.org/lkml/87bk92gqpx.fsf_-_@yhuang6-desk2.ccr.corp.intel…
Signed-off-by: Kairui Song <kasong(a)tencent.com>
Cc: stable(a)vger.kernel.org
---
V3: https://lore.kernel.org/all/20240216095105.14502-1-ryncsn@gmail.com/
Update from V3:
- Use schedule_timeout_uninterruptible(1) for now instead of schedule() to prevent the busy faulting task from holding the CPU and livelocking [Huang, Ying]

V2: https://lore.kernel.org/all/20240206182559.32264-1-ryncsn@gmail.com/
Update from V2:
- Add a schedule() if raced to prevent repeated page faults wasting CPU and adding noise to perf statistics.
- Use a bool to state the special case instead of reusing existing variables, fixing error handling [Minchan Kim].

V1: https://lore.kernel.org/all/20240205110959.4021-1-ryncsn@gmail.com/
Update from V1:
- Add some words on the ZRAM case: it will discard swap content on swap_free, so the race window is a bit different but the cure is the same. [Barry Song]
- Update comments to make them cleaner [Huang, Ying]
- Add a function placeholder to fix the CONFIG_SWAP=n build [SeongJae Park]
- Update the commit message and summary, refer to SWP_SYNCHRONOUS_IO instead of "direct swapin path" [Yu Zhao]
- Update commit message.
- Collect Review and Acks.

 include/linux/swap.h | 5 +++++
 mm/memory.c | 20 ++++++++++++++++++++
 mm/swap.h | 5 +++++
 mm/swapfile.c | 13 +++++++++++++
 4 files changed, 43 insertions(+)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4db00ddad261..8d28f6091a32 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -549,6 +549,11 @@ static inline int swap_duplicate(swp_entry_t swp)
 	return 0;
 }
 
+static inline int swapcache_prepare(swp_entry_t swp)
+{
+	return 0;
+}
+
 static inline void swap_free(swp_entry_t swp)
 {
 }
diff --git a/mm/memory.c b/mm/memory.c
index 7e1f4849463a..a99f5e7be9a5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3799,6 +3799,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	struct page *page;
 	struct swap_info_struct *si = NULL;
 	rmap_t rmap_flags = RMAP_NONE;
+	bool need_clear_cache = false;
 	bool exclusive = false;
 	swp_entry_t entry;
 	pte_t pte;
@@ -3867,6 +3868,20 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (!folio) {
 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
 		    __swap_count(entry) == 1) {
+			/*
+			 * Prevent parallel swapin from proceeding with
+			 * the cache flag. Otherwise, another thread may
+			 * finish swapin first, free the entry, and swapout
+			 * reusing the same entry. It's undetectable as
+			 * pte_same() returns true due to entry reuse.
+			 */
+			if (swapcache_prepare(entry)) {
+				/* Relax a bit to prevent rapid repeated page faults */
+				schedule_timeout_uninterruptible(1);
+				goto out;
+			}
+			need_clear_cache = true;
+
 			/* skip swapcache */
 			folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
 						vma, vmf->address, false);
@@ -4117,6 +4132,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (vmf->pte)
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 out:
+	/* Clear the swap cache pin for direct swapin after PTL unlock */
+	if (need_clear_cache)
+		swapcache_clear(si, entry);
 	if (si)
 		put_swap_device(si);
 	return ret;
@@ -4131,6 +4149,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		folio_unlock(swapcache);
 		folio_put(swapcache);
 	}
+	if (need_clear_cache)
+		swapcache_clear(si, entry);
 	if (si)
 		put_swap_device(si);
 	return ret;
diff --git a/mm/swap.h b/mm/swap.h
index 758c46ca671e..fc2f6ade7f80 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -41,6 +41,7 @@ void __delete_from_swap_cache(struct folio *folio,
 void delete_from_swap_cache(struct folio *folio);
 void clear_shadow_from_swap_cache(int type, unsigned long begin,
 				  unsigned long end);
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry);
 struct folio *swap_cache_get_folio(swp_entry_t entry,
 		struct vm_area_struct *vma, unsigned long addr);
 struct folio *filemap_get_incore_folio(struct address_space *mapping,
@@ -97,6 +98,10 @@ static inline int swap_writepage(struct page *p, struct writeback_control *wbc)
 	return 0;
 }
 
+static inline void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+}
+
 static inline struct folio *swap_cache_get_folio(swp_entry_t entry,
 		struct vm_area_struct *vma, unsigned long addr)
 {
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 556ff7347d5f..746aa9da5302 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3365,6 +3365,19 @@ int swapcache_prepare(swp_entry_t entry)
 	return __swap_duplicate(entry, SWAP_HAS_CACHE);
 }
 
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+	struct swap_cluster_info *ci;
+	unsigned long offset = swp_offset(entry);
+	unsigned char usage;
+
+	ci = lock_cluster_or_swap_info(si, offset);
+	usage = __swap_entry_free_locked(si, offset, SWAP_HAS_CACHE);
+	unlock_cluster_or_swap_info(si, ci);
+	if (!usage)
+		free_swap_slot(entry);
+}
+
 struct swap_info_struct *swp_swap_info(swp_entry_t entry)
 {
 	return swap_type_to_swap_info(swp_type(entry));
-- 
2.43.0
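
The ABA problem described in this commit message can be replayed in a small single-threaded model. This is only a sketch under simplifying assumptions, not kernel code: struct model, the PTE_* constants, pin() and replay_loses_data() are hypothetical names, a plain boolean stands in for the SWAP_HAS_CACHE pin taken by swapcache_prepare(), and the two threads are replayed in the fixed order of the callstack rather than run concurrently.

```c
#include <stdbool.h>

#define PTE_SWAP_ENTRY 7	/* arbitrary value standing in for the swap PTE */
#define PTE_PRESENT    1

/* Toy state: one PTE and one swap slot (all names are hypothetical). */
struct model {
	int pte;	/* current PTE value */
	int slot_data;	/* content stored in the swap slot */
	int latest;	/* most recent value the application wrote */
	bool pinned;	/* stands in for the SWAP_HAS_CACHE pin */
};

/* Models the pin: succeeds only for the first caller. */
static bool pin(struct model *m)
{
	if (m->pinned)
		return false;
	m->pinned = true;
	return true;
}

/* Replays the interleaving; returns true if T0 ends up with stale data. */
bool replay_loses_data(bool use_pin)
{
	struct model m = { PTE_SWAP_ENTRY, 1, 1, false };
	int page_a, page_b;

	/* T0 starts the direct swapin and reads page A, then is delayed. */
	if (use_pin)
		pin(&m);
	page_a = m.slot_data;

	/* T1 faults on the same entry; with the pin held it backs off. */
	if (!use_pin || pin(&m)) {
		page_b = m.slot_data + 1;	/* swap in, then modify page B */
		m.latest = page_b;
		m.slot_data = page_b;		/* swap out, reusing the entry */
		m.pte = PTE_SWAP_ENTRY;		/* PTE holds the same entry again */
	}

	/* T0 resumes: a pte_same() style check cannot see the reuse. */
	if (m.pte == PTE_SWAP_ENTRY)
		m.pte = PTE_PRESENT;		/* installs page A */

	return page_a != m.latest;
}
```

Without the pin the value check passes even though the slot content changed in between, which is the ABA pattern; with the pin the second thread never gets far enough to reuse the entry.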

1 year, 10 months

[RFC] efi: Add ACPI_MEMORY_NVS into the linear map

by Boqun Feng

Currently ACPI_MEMORY_NVS is omitted from the linear map, which causes trouble with the following firmware memory region setup:

	[..] efi: 0x0000dfd62000-0x0000dfd83fff [ACPI Reclaim|...]
	[..] efi: 0x0000dfd84000-0x0000dfd87fff [ACPI Mem NVS|...]

On ARM64 with 64k page size, the whole 0x0000dfd80000-0x0000dfd8ffff range will be omitted from the linear map due to 64k round-up. And a page fault happens when trying to access the ACPI_RECLAIM_MEMORY:

	[...] Unable to handle kernel paging request at virtual address ffff0000dfd80000

To fix this, add ACPI_MEMORY_NVS into the linear map.

Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Cc: stable(a)vger.kernel.org # 5.15+
---
We hit this in an ARM64 Hyper-V VM when using 64k page size. This issue could also be fixed if the EFI memory regions were all 64k aligned, but I don't find that this memory region setup is invalid per the UEFI spec, nor do I find that the spec disallows ACPI_MEMORY_NVS being mapped in the OS linear map. If there is any better way, or I'm reading the spec incorrectly, please let me know.

It's Cc'ed to stable since 5.15 because that's when Hyper-V ARM64 support was added, and Hyper-V is the only one that hits the problem so far.

 drivers/firmware/efi/efi-init.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/efi/efi-init.c b/drivers/firmware/efi/efi-init.c
index a00e07b853f2..9a1b9bc66d50 100644
--- a/drivers/firmware/efi/efi-init.c
+++ b/drivers/firmware/efi/efi-init.c
@@ -139,6 +139,7 @@ static __init int is_usable_memory(efi_memory_desc_t *md)
 	case EFI_LOADER_CODE:
 	case EFI_LOADER_DATA:
 	case EFI_ACPI_RECLAIM_MEMORY:
+	case EFI_ACPI_MEMORY_NVS:
 	case EFI_BOOT_SERVICES_CODE:
 	case EFI_BOOT_SERVICES_DATA:
 	case EFI_CONVENTIONAL_MEMORY:
@@ -202,8 +203,12 @@ static __init void reserve_regions(void)
 		if (!is_usable_memory(md))
 			memblock_mark_nomap(paddr, size);
 
-		/* keep ACPI reclaim memory intact for kexec etc. */
-		if (md->type == EFI_ACPI_RECLAIM_MEMORY)
+		/*
+		 * keep ACPI reclaim and NVS memory and intact for kexec
+		 * etc.
+		 */
+		if (md->type == EFI_ACPI_RECLAIM_MEMORY ||
+		    md->type == EFI_ACPI_MEMORY_NVS)
 			memblock_reserve(paddr, size);
 	}
 }
-- 
2.43.0
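
The rounding arithmetic behind this report can be checked with a short userspace sketch. It only assumes that an unmapped region is widened to page granularity, as the message describes; the helper names (round_down_to, round_up_to, nomap_swallows) are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round x down to a power-of-two alignment. */
static uint64_t round_down_to(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* Round x up to a power-of-two alignment. */
static uint64_t round_up_to(uint64_t x, uint64_t align)
{
	return round_down_to(x + align - 1, align);
}

/*
 * Hypothetical check: does the inclusive range [start, end] overlap the
 * page-granular expansion of the unmapped range [nomap_start, nomap_end]?
 */
bool nomap_swallows(uint64_t start, uint64_t end,
		    uint64_t nomap_start, uint64_t nomap_end,
		    uint64_t page_size)
{
	uint64_t lo = round_down_to(nomap_start, page_size);
	uint64_t hi = round_up_to(nomap_end + 1, page_size) - 1;

	return start <= hi && end >= lo;
}
```

With the two regions quoted above and a 64k page size, the NVS range expands to 0xdfd80000-0xdfd8ffff and overlaps the tail of the ACPI Reclaim region; with a 4k page size both regions are already page aligned and no overlap occurs.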

1 year, 10 months

[PATCH 6.1.y 1/2] smb: client: fix potential OOBs in smb2_parse_contexts()

by Guruswamy Basavaiah

From: Paulo Alcantara <pc(a)manguebit.com>

[ Upstream commit af1689a9b7701d9907dfc84d2a4b57c4bc907144 ]

Validate offsets and lengths before dereferencing create contexts in smb2_parse_contexts().

This fixes following oops when accessing invalid create contexts from server:

  BUG: unable to handle page fault for address: ffff8881178d8cc3
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 4a01067 P4D 4a01067 PUD 0
  Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 3 PID: 1736 Comm: mount.cifs Not tainted 6.7.0-rc4 #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  RIP: 0010:smb2_parse_contexts+0xa0/0x3a0 [cifs]
  Code: f8 10 75 13 48 b8 93 ad 25 50 9c b4 11 e7 49 39 06 0f 84 d2 00 00 00 8b 45 00 85 c0 74 61 41 29 c5 48 01 c5 41 83 fd 0f 76 55 <0f> b7 7d 04 0f b7 45 06 4c 8d 74 3d 00 66 83 f8 04 75 bc ba 04 00
  RSP: 0018:ffffc900007939e0 EFLAGS: 00010216
  RAX: ffffc90000793c78 RBX: ffff8880180cc000 RCX: ffffc90000793c90
  RDX: ffffc90000793cc0 RSI: ffff8880178d8cc0 RDI: ffff8880180cc000
  RBP: ffff8881178d8cbf R08: ffffc90000793c22 R09: 0000000000000000
  R10: ffff8880180cc000 R11: 0000000000000024 R12: 0000000000000000
  R13: 0000000000000020 R14: 0000000000000000 R15: ffffc90000793c22
  FS: 00007f873753cbc0(0000) GS:ffff88806bc00000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff8881178d8cc3 CR3: 00000000181ca000 CR4: 0000000000750ef0
  PKRU: 55555554
  Call Trace:
   <TASK>
   ? __die+0x23/0x70
   ? page_fault_oops+0x181/0x480
   ? search_module_extables+0x19/0x60
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? exc_page_fault+0x1b6/0x1c0
   ? asm_exc_page_fault+0x26/0x30
   ? smb2_parse_contexts+0xa0/0x3a0 [cifs]
   SMB2_open+0x38d/0x5f0 [cifs]
   ? smb2_is_path_accessible+0x138/0x260 [cifs]
   smb2_is_path_accessible+0x138/0x260 [cifs]
   cifs_is_path_remote+0x8d/0x230 [cifs]
   cifs_mount+0x7e/0x350 [cifs]
   cifs_smb3_do_mount+0x128/0x780 [cifs]
   smb3_get_tree+0xd9/0x290 [cifs]
   vfs_get_tree+0x2c/0x100
   ? capable+0x37/0x70
   path_mount+0x2d7/0xb80
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? _raw_spin_unlock_irqrestore+0x44/0x60
   __x64_sys_mount+0x11a/0x150
   do_syscall_64+0x47/0xf0
   entry_SYSCALL_64_after_hwframe+0x6f/0x77
  RIP: 0033:0x7f8737657b1e

Reported-by: Robert Morris <rtm(a)csail.mit.edu>
Cc: stable(a)vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc(a)manguebit.com>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
[Guru: Modified the patch to be applicable to the cached_dir.c file.]
Signed-off-by: Guruswamy Basavaiah <guruswamy.basavaiah(a)broadcom.com>
---
 fs/smb/client/cached_dir.c | 8 ++--
 fs/smb/client/smb2pdu.c | 93 +++++++++++++++++++++++---------------
 fs/smb/client/smb2proto.h | 12 +++--
 3 files changed, 68 insertions(+), 45 deletions(-)

diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index 5a132c1e6f6c..6f4d7aa70e5a 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -268,10 +268,12 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon,
 	if (o_rsp->OplockLevel != SMB2_OPLOCK_LEVEL_LEASE)
 		goto oshr_free;
 
-	smb2_parse_contexts(server, o_rsp,
+	rc = smb2_parse_contexts(server, rsp_iov,
 			    &oparms.fid->epoch,
-			    oparms.fid->lease_key, &oplock,
-			    NULL, NULL);
+			    oparms.fid->lease_key,
+			    &oplock, NULL, NULL);
+	if (rc)
+		goto oshr_free;
 	if (!(oplock & SMB2_LEASE_READ_CACHING_HE))
 		goto oshr_free;
 	qi_rsp = (struct smb2_query_info_rsp *)rsp_iov[1].iov_base;
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index e65f998ea4cf..d610862ac6a0 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -2145,17 +2145,18 @@ parse_posix_ctxt(struct create_context *cc, struct smb2_file_all_info *info,
 		 posix->nlink, posix->mode, posix->reparse_tag);
 }
 
-void
-smb2_parse_contexts(struct TCP_Server_Info *server,
-		    struct smb2_create_rsp *rsp,
-		    unsigned int *epoch, char *lease_key, __u8 *oplock,
-		    struct smb2_file_all_info *buf,
-		    struct create_posix_rsp *posix)
+int smb2_parse_contexts(struct TCP_Server_Info *server,
+			struct kvec *rsp_iov,
+			unsigned int *epoch,
+			char *lease_key, __u8 *oplock,
+			struct smb2_file_all_info *buf,
+			struct create_posix_rsp *posix)
 {
-	char *data_offset;
+	struct smb2_create_rsp *rsp = rsp_iov->iov_base;
 	struct create_context *cc;
-	unsigned int next;
-	unsigned int remaining;
+	size_t rem, off, len;
+	size_t doff, dlen;
+	size_t noff, nlen;
 	char *name;
 	static const char smb3_create_tag_posix[] = {
 		0x93, 0xAD, 0x25, 0x50, 0x9C,
@@ -2164,45 +2165,63 @@ smb2_parse_contexts(struct TCP_Server_Info *server,
 	};
 
 	*oplock = 0;
-	data_offset = (char *)rsp + le32_to_cpu(rsp->CreateContextsOffset);
-	remaining = le32_to_cpu(rsp->CreateContextsLength);
-	cc = (struct create_context *)data_offset;
+
+	off = le32_to_cpu(rsp->CreateContextsOffset);
+	rem = le32_to_cpu(rsp->CreateContextsLength);
+	if (check_add_overflow(off, rem, &len) || len > rsp_iov->iov_len)
+		return -EINVAL;
+	cc = (struct create_context *)((u8 *)rsp + off);
 
 	/* Initialize inode number to 0 in case no valid data in qfid context */
 	if (buf)
 		buf->IndexNumber = 0;
 
-	while (remaining >= sizeof(struct create_context)) {
-		name = le16_to_cpu(cc->NameOffset) + (char *)cc;
-		if (le16_to_cpu(cc->NameLength) == 4 &&
-		    strncmp(name, SMB2_CREATE_REQUEST_LEASE, 4) == 0)
-			*oplock = server->ops->parse_lease_buf(cc, epoch,
-							   lease_key);
-		else if (buf && (le16_to_cpu(cc->NameLength) == 4) &&
-		    strncmp(name, SMB2_CREATE_QUERY_ON_DISK_ID, 4) == 0)
-			parse_query_id_ctxt(cc, buf);
-		else if ((le16_to_cpu(cc->NameLength) == 16)) {
-			if (posix &&
-			    memcmp(name, smb3_create_tag_posix, 16) == 0)
+	while (rem >= sizeof(*cc)) {
+		doff = le16_to_cpu(cc->DataOffset);
+		dlen = le32_to_cpu(cc->DataLength);
+		if (check_add_overflow(doff, dlen, &len) || len > rem)
+			return -EINVAL;
+
+		noff = le16_to_cpu(cc->NameOffset);
+		nlen = le16_to_cpu(cc->NameLength);
+		if (noff + nlen >= doff)
+			return -EINVAL;
+
+		name = (char *)cc + noff;
+		switch (nlen) {
+		case 4:
+			if (!strncmp(name, SMB2_CREATE_REQUEST_LEASE, 4)) {
+				*oplock = server->ops->parse_lease_buf(cc, epoch,
+								       lease_key);
+			} else if (buf &&
+				   !strncmp(name, SMB2_CREATE_QUERY_ON_DISK_ID, 4)) {
+				parse_query_id_ctxt(cc, buf);
+			}
+			break;
+		case 16:
+			if (posix && !memcmp(name, smb3_create_tag_posix, 16))
 				parse_posix_ctxt(cc, buf, posix);
+			break;
+		default:
+			cifs_dbg(FYI, "%s: unhandled context (nlen=%zu dlen=%zu)\n",
+				 __func__, nlen, dlen);
+			if (IS_ENABLED(CONFIG_CIFS_DEBUG2))
+				cifs_dump_mem("context data: ", cc, dlen);
+			break;
 		}
-		/* else {
-			cifs_dbg(FYI, "Context not matched with len %d\n",
-				le16_to_cpu(cc->NameLength));
-			cifs_dump_mem("Cctxt name: ", name, 4);
-		} */
-
-		next = le32_to_cpu(cc->Next);
-		if (!next)
+
+		off = le32_to_cpu(cc->Next);
+		if (!off)
 			break;
-		remaining -= next;
-		cc = (struct create_context *)((char *)cc + next);
+		if (check_sub_overflow(rem, off, &rem))
+			return -EINVAL;
+		cc = (struct create_context *)((u8 *)cc + off);
 	}
 
 	if (rsp->OplockLevel != SMB2_OPLOCK_LEVEL_LEASE)
 		*oplock = rsp->OplockLevel;
 
-	return;
+	return 0;
 }
 
 static int
@@ -3082,8 +3101,8 @@ SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms, __le16 *path,
 	}
 
 
-	smb2_parse_contexts(server, rsp, &oparms->fid->epoch,
-			    oparms->fid->lease_key, oplock, buf, posix);
+	rc = smb2_parse_contexts(server, &rsp_iov, &oparms->fid->epoch,
+				 oparms->fid->lease_key, oplock, buf, posix);
 creat_exit:
 	SMB2_open_free(&rqst);
 	free_rsp_buf(resp_buftype, rsp);
diff --git a/fs/smb/client/smb2proto.h b/fs/smb/client/smb2proto.h
index be21b5d26f67..b325fde010ad 100644
--- a/fs/smb/client/smb2proto.h
+++ b/fs/smb/client/smb2proto.h
@@ -249,11 +249,13 @@ extern int smb3_validate_negotiate(const unsigned int, struct cifs_tcon *);
 
 extern enum securityEnum smb2_select_sectype(struct TCP_Server_Info *,
 					enum securityEnum);
-extern void smb2_parse_contexts(struct TCP_Server_Info *server,
-				struct smb2_create_rsp *rsp,
-				unsigned int *epoch, char *lease_key,
-				__u8 *oplock, struct smb2_file_all_info *buf,
-				struct create_posix_rsp *posix);
+int smb2_parse_contexts(struct TCP_Server_Info *server,
+			struct kvec *rsp_iov,
+			unsigned int *epoch,
+			char *lease_key, __u8 *oplock,
+			struct smb2_file_all_info *buf,
+			struct create_posix_rsp *posix);
+
 extern int smb3_encryption_required(const struct cifs_tcon *tcon);
 extern int smb2_validate_iov(unsigned int offset, unsigned int buffer_length,
 			     struct kvec *iov, unsigned int min_buf_size);
-- 
2.25.1
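
The core of the validation added here is an overflow-safe "does offset plus length stay inside the buffer" test. A minimal userspace sketch follows; it is not the kernel helper itself. It assumes GCC or Clang, whose __builtin_add_overflow() builtin plays the role that check_add_overflow() plays in the patch, and range_in_buffer() is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true only if the byte range
 * [off, off + len) lies entirely inside a buffer of buf_len bytes.
 * The addition is checked so that a huge off or len supplied by the
 * peer cannot wrap around and slip past the comparison.
 */
bool range_in_buffer(size_t off, size_t len, size_t buf_len)
{
	size_t end;

	if (__builtin_add_overflow(off, len, &end))
		return false;	/* off + len wrapped around */
	return end <= buf_len;
}
```

A plain `off + len <= buf_len` comparison would accept an offset close to SIZE_MAX once the sum wraps, which is why the addition itself has to be checked before the result is compared.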

1 year, 10 months

[PATCH 6.1.y] RDMA/irdma: Ensure iWarp QP queue memory is OS paged aligned

by Shiraz Saleem

From: Mike Marciniszyn <mike.marciniszyn(a)intel.com>

[ Upstream commit 0a5ec366de7e94192669ba08de6ed336607fd282 ]

The SQ is shared between kernel and user by storing the kernel page pointer and passing that to kmap_atomic().

This then requires that the queue memory is PAGE_SIZE aligned.

Fix by adding an iWarp specific alignment check.

The patch needed to be reworked because the separate routines present upstream are not there in older irdma drivers.

Fixes: e965ef0e7b2c ("RDMA/irdma: Split QP handler into irdma_reg_user_mr_type_qp")
Link: https://lore.kernel.org/r/20231129202143.1434-3-shiraz.saleem@intel.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem(a)intel.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
---
 drivers/infiniband/hw/irdma/verbs.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 447e1bcc82a3..3c437c8070b6 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -2845,6 +2845,13 @@ static struct ib_mr *irdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 len,
 
 	switch (req.reg_type) {
 	case IRDMA_MEMREG_TYPE_QP:
+		/* iWarp: Catch page not starting on OS page boundary */
+		if (!rdma_protocol_roce(&iwdev->ibdev, 1) &&
+		    ib_umem_offset(iwmr->region)) {
+			err = -EINVAL;
+			goto error;
+		}
+
 		total = req.sq_pages + req.rq_pages + shadow_pgcnt;
 		if (total > iwmr->page_cnt) {
 			err = -EINVAL;
-- 
1.8.3.1
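
The alignment condition this patch enforces boils down to "the offset within the page is zero". A minimal sketch of that test, assuming a power-of-two page size; starts_on_page_boundary() is a hypothetical name and only mirrors the idea behind the ib_umem_offset() check, not its implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if addr has no offset within its page.
 * page_size must be a power of two.
 */
bool starts_on_page_boundary(uint64_t addr, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (addr & (page_size - 1)) == 0;
}
```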

1 year, 10 months

[PATCH 5.10.y v2] Revert "arm64: Stash shadow stack pointer in the task struct on interrupt"

by Xiang Yang

This reverts commit 3f225f29c69c13ce1cbdb1d607a42efeef080056.

The shadow call stack for irq is now stored in the current task's thread info in irq_stack_entry. There is a possibility that we have some soft irqs pending at the end of hard irq, and when we process softirq with irqs enabled, irq_stack_entry will be entered again and overwrite the shadow call stack which is stored in the current task's thread info, leading to incorrect shadow call stack restoration for the first entry of the hard IRQ; the system then ends up with a panic.

task A                               |  task A
-------------------------------------+------------------------------------
el1_irq        //irq1 enter          |
  irq_handler  //save scs_sp1        |
    gic_handle_irq                   |
    irq_exit                         |
      __do_softirq                   |
                                     | el1_irq         //irq2 enter
                                     |   irq_handler   //save scs_sp2
                                     |                 //overwrite scs_sp1
                                     |   ...
                                     |   irq_stack_exit //restore scs_sp2
  irq_stack_exit //restore wrong     |
                 //scs_sp2           |

So revert this commit to fix it.

Fixes: 3f225f29c69c ("arm64: Stash shadow stack pointer in the task struct on interrupt")
Signed-off-by: Xiang Yang <xiangyang3(a)huawei.com>
---
 arch/arm64/kernel/entry.S | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index a94acea770c7..020a455824be 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -431,7 +431,9 @@ SYM_CODE_END(__swpan_exit_el0)
 
 	.macro	irq_stack_entry
 	mov	x19, sp			// preserve the original sp
-	scs_save tsk			// preserve the original shadow stack
+#ifdef CONFIG_SHADOW_CALL_STACK
+	mov	x24, scs_sp		// preserve the original shadow stack
+#endif
 
 	/*
 	 * Compare sp with the base of the task stack.
@@ -465,7 +467,9 @@ SYM_CODE_END(__swpan_exit_el0)
 	 */
 	.macro	irq_stack_exit
 	mov	sp, x19
-	scs_load_current
+#ifdef CONFIG_SHADOW_CALL_STACK
+	mov	scs_sp, x24
+#endif
 	.endm
 
 /* GPRs used by entry code */
-- 
2.34.1
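
The overwrite shown in the diagram can be reproduced with a toy model in C. This is a sketch under stated assumptions, not the arm64 entry code: the pointer values are arbitrary, outer_irq_restores() is a hypothetical name, and the per-entry variables stand in for a register such as x24 that is preserved separately for each interrupt entry.

```c
#include <stdbool.h>

/*
 * Toy model of a nested interrupt. Returns the shadow stack pointer that
 * is restored when the outer interrupt (irq1) exits.
 *   stash_in_task = true:  one shared per-task slot is used for saving
 *   stash_in_task = false: each entry keeps its own saved copy
 */
long outer_irq_restores(bool stash_in_task)
{
	const long task_scs = 100;	/* task's shadow stack pointer */
	const long irq_scs = 1000;	/* IRQ shadow stack used by softirq */
	long task_slot;			/* single per-task save slot */
	long saved_irq1, saved_irq2;	/* per-entry copies */
	long scs_sp = task_scs;

	/* irq1 entry: save, then run the handler on the IRQ shadow stack */
	task_slot = scs_sp;
	saved_irq1 = scs_sp;
	scs_sp = irq_scs;

	/* softirq runs with IRQs enabled; irq2 entry saves again */
	task_slot = scs_sp;		/* overwrites the value saved for irq1 */
	saved_irq2 = scs_sp;

	/* irq2 exit */
	scs_sp = stash_in_task ? task_slot : saved_irq2;

	/* irq1 exit */
	scs_sp = stash_in_task ? task_slot : saved_irq1;
	return scs_sp;
}
```

With the shared slot the outer exit gets back the IRQ shadow stack value instead of the task's own, which matches the "restore wrong scs_sp2" step above; with a per-entry copy the task's value survives the nesting.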

1 year, 10 months

[PATCH v3][5.10, 5.15, 6.1][0/1] hrtimer: Ignore slack time for RT tasks

by Felix Moessbauer

Changes since v2:
- added signed-off

Changes since v1:
- added upstream commit id to the commit message

This proposes backporting a fix from 6.3 to stable. It fixes a nasty bug in the timing behavior of periodic RT tasks w.r.t. timerslack_ns. While the documentation clearly states that the slack time is ignored for RT tasks, this is not the case for the hrtimer code. This patch fixes the issue and applies to all stable kernels.

Best regards,
Felix Moessbauer
Siemens AG

Davidlohr Bueso (1):
  hrtimer: Ignore slack time for RT tasks in schedule_hrtimeout_range()

 kernel/time/hrtimer.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

-- 
2.39.2
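
The documented behaviour this cover letter refers to can be stated as a one-line rule. The sketch below only illustrates that rule; it assumes the fix amounts to treating the slack as zero for RT tasks, and effective_slack_ns() and latest_expiry_ns() are hypothetical names, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: slack actually applied to a timer, in ns. */
uint64_t effective_slack_ns(bool is_rt_task, uint64_t timer_slack_ns)
{
	return is_rt_task ? 0 : timer_slack_ns;
}

/* Latest point at which the timer may fire, given its requested expiry. */
uint64_t latest_expiry_ns(uint64_t expires_ns, bool is_rt_task,
			  uint64_t timer_slack_ns)
{
	return expires_ns + effective_slack_ns(is_rt_task, timer_slack_ns);
}
```

For a periodic RT task this means the wake-up window collapses to the requested expiry itself, whereas a non-RT task may be woken anywhere up to the expiry plus its timerslack_ns value.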

1 year, 10 months
