October 2023 - Linux-stable-mirror

Re: [PATCH] usb: dwc3: Soft reset phy on probe for host

by Da Xue

Hi Thinh, I can confirm your patch fixed the issue on RK3399 when I was running on Linux 6.1.54. I'm not on the ML for this so I'm sorry if this email causes any issue as I'm not sure how to reply to a thread from a ML I am not on. Best, Da


[PATCH v4] ASoC: amd: yc: Fix non-functional mic on Lenovo 82YM

by Sven Frotscher

Like the Lenovo 82TL, 82V2, 82QF and 82UG, the 82YM (Yoga 7 14ARP8) requires an entry in the quirk list to enable the internal microphone. The latter two received similar fixes in commit 1263cc0f414d ("ASoC: amd: yc: Fix non-functional mic on Lenovo 82QF and 82UG").

Fixes: c008323fe361 ("ASoC: amd: yc: Fix a non-functional mic on Lenovo 82SJ")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sven Frotscher <sven.frotscher(a)gmail.com>
---
v3->v4 changes:
* re-add blank line between commit message title and body
---
v2->v3 changes:
* add message title of referenced commit to commit message
* make whitespace consistent with surrounding code
* use a patch-friendly e-mail client
---
v1->v2 changes:
* add Fixes and Cc tags to commit message
* remove redundant LKML link from commit message
* fix mangled diff
---
 sound/soc/amd/yc/acp6x-mach.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c
index 94e9eb8e73f2..15a864dcd7bd 100644
--- a/sound/soc/amd/yc/acp6x-mach.c
+++ b/sound/soc/amd/yc/acp6x-mach.c
@@ -241,6 +241,13 @@ static const struct dmi_system_id yc_acp_quirk_table[] = {
 			DMI_MATCH(DMI_PRODUCT_NAME, "82V2"),
 		}
 	},
+	{
+		.driver_data = &acp6x_card,
+		.matches = {
+			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "82YM"),
+		}
+	},
 	{
 		.driver_data = &acp6x_card,
 		.matches = {
-- 
2.42.0
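As a rough illustration of how a quirk table like the one patched above might be consulted, here is a minimal user-space sketch. It is not the driver's code: every `sketch_` name is hypothetical, the table holds only the product names mentioned in this message, and plain exact string comparison is an assumption standing in for the kernel's DMI matching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one quirk entry (vendor plus product name). */
struct sketch_quirk {
	const char *vendor;
	const char *product;
};

/* Only the models named in the commit message; the real table is longer. */
static const struct sketch_quirk sketch_quirks[] = {
	{ "LENOVO", "82TL" },
	{ "LENOVO", "82V2" },
	{ "LENOVO", "82QF" },
	{ "LENOVO", "82UG" },
	{ "LENOVO", "82YM" },
};

/* Return 1 if the machine appears in the sketch table, 0 otherwise. */
int sketch_needs_mic_quirk(const char *vendor, const char *product)
{
	size_t i;

	assert(vendor != NULL && product != NULL);
	for (i = 0; i < sizeof(sketch_quirks) / sizeof(sketch_quirks[0]); i++) {
		/* Assumption: exact comparison; real DMI matching may differ. */
		if (strcmp(sketch_quirks[i].vendor, vendor) == 0 &&
		    strcmp(sketch_quirks[i].product, product) == 0)
			return 1;
	}
	return 0;
}
```

In this model, a machine reporting vendor "LENOVO" and product "82YM" is only recognised once its entry has been added to the table, which is the whole effect of the patch.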


[merged] i915-limit-the-length-of-an-sg-list-to-the-requested-length.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: i915: limit the length of an sg list to the requested length
has been removed from the -mm tree.  Its filename was
     i915-limit-the-length-of-an-sg-list-to-the-requested-length.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: i915: limit the length of an sg list to the requested length
Date: Tue, 19 Sep 2023 20:48:55 +0100

The folio conversion changed the behaviour of shmem_sg_alloc_table() to put the entire length of the last folio into the sg list, even if the sg list should have been shorter. gen8_ggtt_insert_entries() relied on the list being the right length and would overrun the end of the page tables. Other functions may also have been affected.

Clamp the length of the last entry in the sg list to be the expected length.

Link: https://lkml.kernel.org/r/20230919194855.347582-1-willy@infradead.org
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/9256
Fixes: 0b62af28f249 ("i915: convert shmem_sg_free_table() to use a folio_batch")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Reported-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Closes: https://lore.kernel.org/lkml/6287208.lOV4Wx5bFT@natalenko.name/
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Reviewed-by: Andrzej Hajda <andrzej.hajda(a)intel.com>
Cc: Jani Nikula <jani.nikula(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> [6.5.x]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c~i915-limit-the-length-of-an-sg-list-to-the-requested-length
+++ a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -100,6 +100,7 @@ int shmem_sg_alloc_table(struct drm_i915
 	st->nents = 0;
 	for (i = 0; i < page_count; i++) {
 		struct folio *folio;
+		unsigned long nr_pages;
 		const unsigned int shrink[] = {
 			I915_SHRINK_BOUND | I915_SHRINK_UNBOUND,
 			0,
@@ -150,6 +151,8 @@ int shmem_sg_alloc_table(struct drm_i915
 			}
 		} while (1);
 
+		nr_pages = min_t(unsigned long,
+				folio_nr_pages(folio), page_count - i);
 		if (!i ||
 		    sg->length >= max_segment ||
 		    folio_pfn(folio) != next_pfn) {
@@ -157,13 +160,13 @@ int shmem_sg_alloc_table(struct drm_i915
 				sg = sg_next(sg);
 
 			st->nents++;
-			sg_set_folio(sg, folio, folio_size(folio), 0);
+			sg_set_folio(sg, folio, nr_pages * PAGE_SIZE, 0);
 		} else {
 			/* XXX: could overflow? */
-			sg->length += folio_size(folio);
+			sg->length += nr_pages * PAGE_SIZE;
 		}
-		next_pfn = folio_pfn(folio) + folio_nr_pages(folio);
-		i += folio_nr_pages(folio) - 1;
+		next_pfn = folio_pfn(folio) + nr_pages;
+		i += nr_pages - 1;
 
 		/* Check that the i965g/gm workaround works. */
 		GEM_BUG_ON(gfp & __GFP_DMA32 && next_pfn >= 0x00100000UL);
_

Patches currently in -mm which might be from willy(a)infradead.org are

mm-convert-dax-lock-unlock-page-to-lock-unlock-folio.patch
buffer-pass-gfp-flags-to-folio_alloc_buffers.patch
buffer-hoist-gfp-flags-from-grow_dev_page-to-__getblk_gfp.patch
buffer-hoist-gfp-flags-from-grow_dev_page-to-__getblk_gfp-fix.patch
ext4-use-bdev_getblk-to-avoid-memory-reclaim-in-readahead-path.patch
buffer-use-bdev_getblk-to-avoid-memory-reclaim-in-readahead-path.patch
buffer-convert-getblk_unmovable-and-__getblk-to-use-bdev_getblk.patch
buffer-convert-sb_getblk-to-call-__getblk.patch
ext4-call-bdev_getblk-from-sb_getblk_gfp.patch
buffer-remove-__getblk_gfp.patch
hugetlb-use-a-folio-in-free_hpage_workfn.patch
hugetlb-remove-a-few-calls-to-page_folio.patch
hugetlb-convert-remove_pool_huge_page-to-remove_pool_hugetlb_folio.patch
mm-make-lock_folio_maybe_drop_mmap-vma-lock-aware.patch
mm-call-wp_page_copy-under-the-vma-lock.patch
mm-handle-shared-faults-under-the-vma-lock.patch
mm-handle-cow-faults-under-the-vma-lock.patch
mm-handle-read-faults-under-the-vma-lock.patch
mm-handle-write-faults-to-ro-pages-under-the-vma-lock.patch
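The clamping described in the i915 message above can be modelled in user space. The following is only a sketch under stated assumptions: the `sketch_` names and the fixed 4 KiB page size are hypothetical, folios are reduced to a plain array of page counts, and the function merely totals the bytes that would be placed in the sg list, with and without the clamp.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/* Smaller of two values, standing in for the kernel's min_t(). */
unsigned long sketch_min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * folio_pages[f] is the size of folio f in pages; page_count is the number
 * of pages the caller actually asked for. Returns the total number of bytes
 * that would end up in the list. With clamp == 0 the whole last folio is
 * added (the behaviour the patch fixes); with clamp != 0 the last entry is
 * limited to the pages still outstanding.
 */
unsigned long sketch_sg_total_bytes(const unsigned long *folio_pages,
				    size_t nr_folios,
				    unsigned long page_count, int clamp)
{
	unsigned long i = 0, total = 0;
	size_t f;

	assert(folio_pages != NULL);
	for (f = 0; f < nr_folios && i < page_count; f++) {
		unsigned long nr_pages = folio_pages[f];

		if (clamp)
			nr_pages = sketch_min_ul(nr_pages, page_count - i);
		total += nr_pages * SKETCH_PAGE_SIZE;
		i += nr_pages;
	}
	return total;
}
```

With three 4-page folios and a 10-page request, the unclamped total covers 12 pages while the clamped total covers exactly 10, which is the overrun the commit message describes.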


[PATCH v2] can: sja1000: Always restart the Tx queue after an overrun

by Miquel Raynal

Upstream commit 717c6ec241b5 ("can: sja1000: Prevent overrun stalls with a soft reset on Renesas SoCs") fixes an issue with Renesas' own SJA1000 CAN controller reception: the Rx buffer is only 5 messages long, so when the bus is loaded (e.g. a message every 50us), overrun may easily happen. Upon an overrun situation, due to a possible internal crosstalk situation, the controller enters a frozen state which only can be unlocked with a soft reset (experimentally). The solution was to offload a call to sja1000_start() in a threaded handler. This needs to happen in process context as this operation requires to sleep. sja1000_start() basically enters "reset mode", performs a proper software reset and returns back into "normal mode".

Since this fix was introduced, we no longer observe any stalls in reception. However it was sporadically observed that the transmit path would now freeze. Further investigation blamed the fix mentioned above, and especially the reset operation. Reproducing the reset in a loop helped identifying what could possibly go wrong. The sja1000 is a single Tx queue device, which leverages the netdev helpers to process one Tx message at a time. The logic is: the queue is stopped, the message sent to the transceiver, once properly transmitted the controller sets a status bit which triggers an interrupt, in the interrupt handler the transmission status is checked and the queue woken up. Unfortunately, if an overrun happens, we might perform the soft reset precisely between the transmission of the buffer to the transceiver and the advent of the transmission status bit. We would then stop the transmission operation without re-enabling the queue, leading to all further transmissions to be ignored.

The reset interrupt can only happen while the device is "open", and after a reset we anyway want to resume normal operations, no matter if a packet to transmit got dropped in the process, so we shall wake up the queue. Restarting the device and waking-up the queue is exactly what sja1000_set_mode(CAN_MODE_START) does. In order to be consistent about the queue state, we must acquire a lock both in the reset handler and in the transmit path to ensure serialization of both operations. As the reset handler might still be called after the transmission of a frame to the transceiver but before it actually gets transmitted, we must ensure we don't leak the skb, so we free it (the behavior is consistent, no matter if there was an skb on the stack or not).

Fixes: 717c6ec241b5 ("can: sja1000: Prevent overrun stalls with a soft reset on Renesas SoCs")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---

Changes in v2:
* As Marc suggested, use netif_tx_{,un}lock() instead of our own spin_lock.

 drivers/net/can/sja1000/sja1000.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/net/can/sja1000/sja1000.c b/drivers/net/can/sja1000/sja1000.c
index ae47fc72aa96..91e3fb3eed20 100644
--- a/drivers/net/can/sja1000/sja1000.c
+++ b/drivers/net/can/sja1000/sja1000.c
@@ -297,6 +297,7 @@ static netdev_tx_t sja1000_start_xmit(struct sk_buff *skb,
 	if (can_dropped_invalid_skb(dev, skb))
 		return NETDEV_TX_OK;
 
+	netif_tx_lock(dev);
 	netif_stop_queue(dev);
 
 	fi = dlc = cf->can_dlc;
@@ -335,6 +336,8 @@ static netdev_tx_t sja1000_start_xmit(struct sk_buff *skb,
 
 	sja1000_write_cmdreg(priv, cmd_reg_val);
 
+	netif_tx_unlock(dev);
+
 	return NETDEV_TX_OK;
 }
 
@@ -396,7 +399,13 @@ static irqreturn_t sja1000_reset_interrupt(int irq, void *dev_id)
 	struct net_device *dev = (struct net_device *)dev_id;
 
 	netdev_dbg(dev, "performing a soft reset upon overrun\n");
-	sja1000_start(dev);
+
+	netif_tx_lock(dev);
+
+	can_free_echo_skb(dev, 0);
+	sja1000_set_mode(dev, CAN_MODE_START);
+
+	netif_tx_unlock(dev);
 
 	return IRQ_HANDLED;
 }
-- 
2.34.1


patch "usb: hub: Guard against accesses to uninitialized BOS descriptors" added to usb-linus

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled

    usb: hub: Guard against accesses to uninitialized BOS descriptors

to my usb git tree which can be found at
    git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.

The patch will show up in the next release of the linux-next tree (usually sometime within the next 24 hours during the week.)

The patch will hopefully also be merged in Linus's tree for the next -rc kernel release.

If you have any questions about this process, please let me know.

From f74a7afc224acd5e922c7a2e52244d891bbe44ee Mon Sep 17 00:00:00 2001
From: Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
Date: Wed, 30 Aug 2023 12:04:18 +0200
Subject: usb: hub: Guard against accesses to uninitialized BOS descriptors
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Many functions in drivers/usb/core/hub.c and drivers/usb/core/hub.h access fields inside udev->bos without checking if it was allocated and initialized. If usb_get_bos_descriptor() fails for whatever reason, udev->bos will be NULL and those accesses will result in a crash:

BUG: kernel NULL pointer dereference, address: 0000000000000018
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 5 PID: 17818 Comm: kworker/5:1 Tainted: G W 5.15.108-18910-gab0e1cb584e1 #1 <HASH:1f9e 1>
Hardware name: Google Kindred/Kindred, BIOS Google_Kindred.12672.413.0 02/03/2021
Workqueue: usb_hub_wq hub_event
RIP: 0010:hub_port_reset+0x193/0x788
Code: 89 f7 e8 20 f7 15 00 48 8b 43 08 80 b8 96 03 00 00 03 75 36 0f b7 88 92 03 00 00 81 f9 10 03 00 00 72 27 48 8b 80 a8 03 00 00 <48> 83 78 18 00 74 19 48 89 df 48 8b 75 b0 ba 02 00 00 00 4c 89 e9
RSP: 0018:ffffab740c53fcf8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffa1bc5f678000 RCX: 0000000000000310
RDX: fffffffffffffdff RSI: 0000000000000286 RDI: ffffa1be9655b840
RBP: ffffab740c53fd70 R08: 00001b7d5edaa20c R09: ffffffffb005e060
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: ffffab740c53fd3e R14: 0000000000000032 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffa1be96540000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000018 CR3: 000000022e80c005 CR4: 00000000003706e0
Call Trace:
hub_event+0x73f/0x156e
? hub_activate+0x5b7/0x68f
process_one_work+0x1a2/0x487
worker_thread+0x11a/0x288
kthread+0x13a/0x152
? process_one_work+0x487/0x487
? kthread_associate_blkcg+0x70/0x70
ret_from_fork+0x1f/0x30

Fall back to a default behavior if the BOS descriptor isn't accessible and skip all the functionalities that depend on it: LPM support checks, Super Speed capability checks, U1/U2 states setup.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20230830100418.1952143-1-ricardo.canuelo@collabor…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
 drivers/usb/core/hub.c | 25 ++++++++++++++++++++++---
 drivers/usb/core/hub.h |  2 +-
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 3c54b218301c..0ff47eeffb49 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -151,6 +151,10 @@ int usb_device_supports_lpm(struct usb_device *udev)
 	if (udev->quirks & USB_QUIRK_NO_LPM)
 		return 0;
 
+	/* Skip if the device BOS descriptor couldn't be read */
+	if (!udev->bos)
+		return 0;
+
 	/* USB 2.1 (and greater) devices indicate LPM support through
 	 * their USB 2.0 Extended Capabilities BOS descriptor.
 	 */
@@ -327,6 +331,10 @@ static void usb_set_lpm_parameters(struct usb_device *udev)
 	if (!udev->lpm_capable || udev->speed < USB_SPEED_SUPER)
 		return;
 
+	/* Skip if the device BOS descriptor couldn't be read */
+	if (!udev->bos)
+		return;
+
 	hub = usb_hub_to_struct_hub(udev->parent);
 	/* It doesn't take time to transition the roothub into U0, since it
 	 * doesn't have an upstream link.
@@ -2704,13 +2712,17 @@ int usb_authorize_device(struct usb_device *usb_dev)
 static enum usb_ssp_rate get_port_ssp_rate(struct usb_device *hdev,
 					   u32 ext_portstatus)
 {
-	struct usb_ssp_cap_descriptor *ssp_cap = hdev->bos->ssp_cap;
+	struct usb_ssp_cap_descriptor *ssp_cap;
 	u32 attr;
 	u8 speed_id;
 	u8 ssac;
 	u8 lanes;
 	int i;
 
+	if (!hdev->bos)
+		goto out;
+
+	ssp_cap = hdev->bos->ssp_cap;
 	if (!ssp_cap)
 		goto out;
 
@@ -4215,8 +4227,15 @@ static void usb_enable_link_state(struct usb_hcd *hcd, struct usb_device *udev,
 		enum usb3_link_state state)
 {
 	int timeout;
-	__u8 u1_mel = udev->bos->ss_cap->bU1devExitLat;
-	__le16 u2_mel = udev->bos->ss_cap->bU2DevExitLat;
+	__u8 u1_mel;
+	__le16 u2_mel;
+
+	/* Skip if the device BOS descriptor couldn't be read */
+	if (!udev->bos)
+		return;
+
+	u1_mel = udev->bos->ss_cap->bU1devExitLat;
+	u2_mel = udev->bos->ss_cap->bU2DevExitLat;
 
 	/* If the device says it doesn't have *any* exit latency to come out of
 	 * U1 or U2, it's probably lying.  Assume it doesn't implement that link
diff --git a/drivers/usb/core/hub.h b/drivers/usb/core/hub.h
index 37897afd1b64..d44dd7f6623e 100644
--- a/drivers/usb/core/hub.h
+++ b/drivers/usb/core/hub.h
@@ -153,7 +153,7 @@ static inline int hub_is_superspeedplus(struct usb_device *hdev)
 {
 	return (hdev->descriptor.bDeviceProtocol == USB_HUB_PR_SS &&
 		le16_to_cpu(hdev->descriptor.bcdUSB) >= 0x0310 &&
-		hdev->bos && hdev->bos->ssp_cap);
+		hdev->bos && hdev->bos->ssp_cap);
 }
 
 static inline unsigned hub_power_on_good_delay(struct usb_hub *hub)
-- 
2.42.0
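The guard pattern used throughout the USB hub patch above (check the outer pointer before following it to an inner field) can be shown with a small user-space model. This is a sketch only: the `sketch_` structures are hypothetical and much simpler than the kernel's `struct usb_device`, keeping just the pointer chain and the 0x0310 version threshold that appear in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct sketch_ssp_cap {
	int unused;
};

struct sketch_bos {
	struct sketch_ssp_cap *ssp_cap;	/* may be NULL */
};

struct sketch_udev {
	unsigned short bcd_usb;		/* e.g. 0x0310 for USB 3.1 */
	struct sketch_bos *bos;		/* NULL if the descriptor read failed */
};

/*
 * Mirrors the shape of the fixed check: the BOS pointer is tested before
 * it is dereferenced, so a missing descriptor yields "no" instead of a
 * NULL pointer dereference.
 */
int sketch_is_superspeedplus(const struct sketch_udev *dev)
{
	assert(dev != NULL);
	return dev->bcd_usb >= 0x0310 && dev->bos && dev->bos->ssp_cap;
}
```

Because `&&` short-circuits left to right, `dev->bos->ssp_cap` is never evaluated when `dev->bos` is NULL, which is what the one-line change to hub.h relies on.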


[PATCH v3 0/7] selftests/resctrl: Fixes to failing tests

by Ilpo Järvinen

Fix four issues with resctrl selftests.

The signal handling fix became necessary after the mount/umount fixes and the uninitialized member bug was discovered during the review.

The other two came up when I ran resctrl selftests across the server fleet in our lab to validate the upcoming CAT test rewrite (the rewrite is not part of this series).

These are developed and should apply cleanly at least on top of the benchmark cleanup series (might apply cleanly also w/o the benchmark series, I didn't test).

v3:
- Add fix to uninitialized sa_flags
- Handle ksft_exit_fail_msg() in per test functions
- Make signal handler register fails to also exit
- Improve changelogs

v2:
- Include patch to move _GNU_SOURCE to Makefile to allow normal #include placement
- Rework the signal register/unregister into patch to use helpers
- Fixed incorrect function parameter description
- Use return !!res to avoid confusing implicit boolean conversion
- Improve MBA/MBM success bound patch's changelog
- Tweak Cc: stable dependencies (make it a chain).

Ilpo Järvinen (7):
  selftests/resctrl: Fix uninitialized .sa_flags
  selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
  selftests/resctrl: Remove duplicate feature check from CMT test
  selftests/resctrl: Move _GNU_SOURCE define into Makefile
  selftests/resctrl: Refactor feature check to use resource and feature name
  selftests/resctrl: Fix feature checks
  selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests

 tools/testing/selftests/resctrl/Makefile      |  2 +-
 tools/testing/selftests/resctrl/cat_test.c    |  8 --
 tools/testing/selftests/resctrl/cmt_test.c    |  3 -
 tools/testing/selftests/resctrl/mba_test.c    |  2 +-
 tools/testing/selftests/resctrl/mbm_test.c    |  2 +-
 tools/testing/selftests/resctrl/resctrl.h     |  7 +-
 .../testing/selftests/resctrl/resctrl_tests.c | 82 ++++++++++++-------
 tools/testing/selftests/resctrl/resctrl_val.c | 24 +++---
 tools/testing/selftests/resctrl/resctrlfs.c   | 69 ++++++----------
 9 files changed, 96 insertions(+), 103 deletions(-)

-- 
2.30.2


[PATCH v3 1/3] mmap: Fix vma_iterator in error path of vma_merge()

by Liam R. Howlett

During the error path, the vma iterator may not be correctly positioned or set to the correct range. Undo the vma_prev() call by resetting to the passed in address. Re-walking to the same range will fix the range to the area previously passed in.

Users would notice increased cycles as vma_merge() would be called an extra time with vma == prev, and thus would fail to merge and return.

Link: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejt…
Closes: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejt…
Fixes: 18b098af2890 ("vma_merge: set vma iterator to correct position.")
Cc: stable(a)vger.kernel.org
Cc: Jann Horn <jannh(a)google.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
---
 mm/mmap.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index b56a7f0c9f85..acb7dea49e23 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -975,7 +975,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 
 	/* Error in anon_vma clone. */
 	if (err)
-		return NULL;
+		goto anon_vma_fail;
 
 	if (vma_start < vma->vm_start || vma_end > vma->vm_end)
 		vma_expanded = true;
@@ -988,7 +988,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	}
 
 	if (vma_iter_prealloc(vmi, vma))
-		return NULL;
+		goto prealloc_fail;
 
 	init_multi_vma_prep(&vp, vma, adjust, remove, remove2);
 	VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
@@ -1016,6 +1016,12 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	vma_complete(&vp, vmi, mm);
 	khugepaged_enter_vma(res, vm_flags);
 	return res;
+
+prealloc_fail:
+anon_vma_fail:
+	vma_iter_set(vmi, addr);
+	vma_iter_load(vmi);
+	return NULL;
 }
 
 /*
-- 
2.40.1
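The shape of the fix above (several failure points funnelled through shared labels that restore the iterator before returning) can be sketched outside the kernel. Everything here is a hypothetical model: `sketch_iter` is just a position, `prev_addr` stands for wherever the earlier step-back left the iterator, and the two `fail_` flags simulate the anon_vma clone and preallocation failures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical iterator: only tracks the address it currently points at. */
struct sketch_iter {
	unsigned long pos;
};

/*
 * Model of a merge attempt. The iterator is first moved back to prev_addr
 * (standing in for vma_prev()). On either simulated failure the shared
 * error labels reset it to the caller's addr, so the caller sees the
 * iterator where it originally was. Returns 0 on success, -1 on failure.
 */
int sketch_merge(struct sketch_iter *it, unsigned long prev_addr,
		 unsigned long addr, int fail_clone, int fail_prealloc)
{
	assert(it != NULL);
	it->pos = prev_addr;

	if (fail_clone)
		goto anon_vma_fail;

	if (fail_prealloc)
		goto prealloc_fail;

	return 0;	/* success: iterator stays on the merged range */

prealloc_fail:
anon_vma_fail:
	it->pos = addr;	/* undo the step back, as the patch does */
	return -1;
}
```

Before the fix, the equivalent of the two failure branches returned directly, leaving the iterator at the stepped-back position.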


[PATCH 0/2] [4.19, 4.14] watchdog: iTCO: Backport of handle_boot_enabled=0 fix

by Jan Kiszka

This suggests a commit (and a follow-up fix for it) from 5.16+ for stable because it fixes the usage of watchdog.handle_boot_enabled=0 for iTCO, closing a monitoring gap in OTA update scenarios.

These patches are applicable to and have been tested with 4.19, 4.14 stable heads. The second patch required rebasing, the first one applied as-is from upstream.

Jan

Cc: Malin Jonsson <malin.jonsson(a)ericsson.com>

Mika Westerberg (2):
  watchdog: iTCO_wdt: No need to stop the timer in probe
  watchdog: iTCO_wdt: Set NO_REBOOT if the watchdog is not already running

 drivers/watchdog/iTCO_wdt.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

-- 
2.35.3


[PATCH net v5 1/3] net: replace calls to sock->ops->connect() with kernel_connect()

by Jordan Rife

commit 0bdf399342c5 ("net: Avoid address overwrite in kernel_connect") ensured that kernel_connect() will not overwrite the address parameter in cases where BPF connect hooks perform an address rewrite. This change replaces direct calls to sock->ops->connect() in net with kernel_connect() to make these calls safe.

Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect")
Cc: stable(a)vger.kernel.org
Reviewed-by: Willem de Bruijn <willemb(a)google.com>
Signed-off-by: Jordan Rife <jrife(a)google.com>
---
v4->v5: Remove non-net changes.
v3->v4: Remove precondition check for addrlen.
v2->v3: Add "Fixes" tag. Check for positivity in addrlen sanity check.
v1->v2: Split up original patch into patch series. Insulate calls with kernel_connect() instead of pushing address copy deeper into sock->ops->connect().

 net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
 net/rds/tcp_connect.c           | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index da5af28ff57b5..6e4ed1e11a3b7 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1505,8 +1505,8 @@ static int make_send_sock(struct netns_ipvs *ipvs, int id,
 	}
 
 	get_mcast_sockaddr(&mcast_addr, &salen, &ipvs->mcfg, id);
-	result = sock->ops->connect(sock, (struct sockaddr *) &mcast_addr,
-				    salen, 0);
+	result = kernel_connect(sock, (struct sockaddr *)&mcast_addr,
+				salen, 0);
 	if (result < 0) {
 		pr_err("Error connecting to the multicast addr\n");
 		goto error;
diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c
index f0c477c5d1db4..d788c6d28986f 100644
--- a/net/rds/tcp_connect.c
+++ b/net/rds/tcp_connect.c
@@ -173,7 +173,7 @@ int rds_tcp_conn_path_connect(struct rds_conn_path *cp)
 	 * own the socket
 	 */
 	rds_tcp_set_callbacks(sock, cp);
-	ret = sock->ops->connect(sock, addr, addrlen, O_NONBLOCK);
+	ret = kernel_connect(sock, addr, addrlen, O_NONBLOCK);
 
 	rdsdebug("connect to address %pI6c returned %d\n", &conn->c_faddr, ret);
 	if (ret == -EINPROGRESS)
-- 
2.42.0.515.g380fc7ccd1-goog
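The reasoning in the kernel_connect() message above (a hook may rewrite the address it is handed, so the wrapper should hand it a private copy) can be illustrated with a small user-space model. This is a sketch under assumptions, not the kernel implementation: `sketch_addr`, the function-pointer type and the rewriting callback are all hypothetical stand-ins for `struct sockaddr`, `sock->ops->connect()` and a BPF connect hook.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified address type. */
struct sketch_addr {
	unsigned short port;
	unsigned char ip[4];
};

/* Stand-in for the low-level connect operation; it may modify *addr. */
typedef int (*sketch_connect_fn)(struct sketch_addr *addr);

/* Simulates a hook that rewrites the destination port in place. */
int sketch_rewriting_connect(struct sketch_addr *addr)
{
	addr->port = 9999;
	return 0;
}

/*
 * Wrapper in the spirit of the commit message: the operation receives a
 * local copy, so any rewrite stays invisible to the caller's address.
 */
int sketch_safe_connect(sketch_connect_fn op, const struct sketch_addr *addr)
{
	struct sketch_addr copy;

	assert(op != NULL && addr != NULL);
	memcpy(&copy, addr, sizeof(copy));
	return op(&copy);
}
```

Calling `sketch_rewriting_connect()` directly changes the caller's port, while going through `sketch_safe_connect()` leaves it untouched; that difference is why the patch swaps the direct calls for the wrapper.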


[Kernel 6.5] Important read()/write() performance regression

by Florent DELAHAYE

Hello guys,

During the last few months, I felt a performance regression when using read() and write() on my high-speed NVMe SSD (about 7GB/s).

To get more precise information about it, I quickly developed a benchmark tool that basically runs read() or write() in a loop to simulate a sequential file read or write. The tool also measures the real time consumed by the loop. Finally, the tool can call open() with or without O_DIRECT.

I ran the tests on EXT4 and Exfat with the following settings (buffer values have been set for best result):
- Write settings: buffer 400mb * 100
- Read settings: buffer 200mb
- Drop caches before non-direct read/write test

With this hardware:
- CPU AMD Ryzen 7600X
- RAM DDR5 5200 32GB
- SSD Kingston Fury Renegade 4TB with 4K LBA

Here are some results I got with the latest upstream kernels (default config):

+------------------+----------+------------------+------------------+------------------+------------------+------------------+
| ~42GB            | O_DIRECT | Linux 6.2.0      | Linux 6.3.0      | Linux 6.4.0      | Linux 6.5.0      | Linux 6.5.5      |
+------------------+----------+------------------+------------------+------------------+------------------+------------------+
| Ext4 (sector 4k) |          |                  |                  |                  |                  |                  |
| Read             | no       | 7.2s (5800MB/s)  | 7.1s (5890MB/s)  | 8.3s (5050MB/s)  | 13.2s (3180MB/s) | 13.2s (3180MB/s) |
| Write            | no       | 12.0s (3500MB/s) | 12.6s (3340MB/s) | 12.2s (3440MB/s) | 28.9s (1450MB/s) | 28.9s (1450MB/s) |
| Read             | yes      | 6.0s (7000MB/s)  | 6.0s (7020MB/s)  | 5.9s (7170MB/s)  | 5.9s (7100MB/s)  | 5.9s (7100MB/s)  |
| Write            | yes      | 6.7s (6220MB/s)  | 6.7s (6290MB/s)  | 6.9s (6080MB/s)  | 6.9s (6080MB/s)  | 6.9s (6970MB/s)  |
| Exfat (sector ?) |          |                  |                  |                  |                  |                  |
| Read             | no       | 7.3s (5770MB/s)  | 7.2s (5830MB/s)  | 9s (4620MB/s)    | 13.3s (3150MB/s) | 13.2s (3180MB/s) |
| Write            | no       | 8.3s (5040MB/s)  | 8.9s (4750MB/s)  | 8.3s (5040MB/s)  | 18.3s (2290MB/s) | 18.5s (2260MB/s) |
| Read             | yes      | 6.2s (6760MB/s)  | 6.1s (6870MB/s)  | 6.0s (6980MB/s)  | 6.5s (6440MB/s)  | 6.6s (6320MB/s)  |
| Write            | yes      | 16.1s (2610MB/s) | 16.0s (2620MB/s) | 18.7s (2240MB/s) | 34.1s (1230MB/s) | 34.5s (1220MB/s) |
+------------------+----------+------------------+------------------+------------------+------------------+------------------+

Please note that I rounded some values for readability. Small variations can be considered within the margin of error.

Ext4 results: cached read/write times have increased by almost 100% from 6.2.0 to 6.5.0, with a first increase in 6.4.0. Direct access times have stayed similar though.

Exfat results: performance decreased too, this time both with and without direct access.

I realize there are thousands of commits in between, plus the issue can come from multiple kernel parts such as the page cache, the file system implementation (especially for Exfat), the IO engine, a driver, etc. The results also showed that more than one specific version is impacted. Anyway, in the end, performance has decreased significantly.

If you want to verify my benchmark tool source code, please ask.

PS: sending again as only text body is accepted

Regards
Florent DELAHAYE
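The benchmark source was not posted, so the following is only a guess at the general shape of the read side: a loop calling read() with a fixed buffer while timing it with a monotonic clock. The function name, its parameters and its return convention are all hypothetical, and O_DIRECT (which needs aligned buffers) and cache dropping are deliberately left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/*
 * Sequentially read the file at `path` using buf_size-byte read() calls.
 * Returns the number of bytes read, or -1 on error, and stores the
 * wall-clock duration of the loop (in seconds) in *seconds.
 */
long long sketch_timed_read(const char *path, size_t buf_size,
			    double *seconds)
{
	struct timespec t0, t1;
	long long total = 0;
	ssize_t n;
	char *buf;
	int fd;

	assert(path != NULL && seconds != NULL && buf_size > 0);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	buf = malloc(buf_size);
	if (!buf) {
		close(fd);
		return -1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	while ((n = read(fd, buf, buf_size)) > 0)
		total += n;
	clock_gettime(CLOCK_MONOTONIC, &t1);

	free(buf);
	close(fd);
	if (n < 0)
		return -1;

	*seconds = (double)(t1.tv_sec - t0.tv_sec) +
		   (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	return total;
}
```

Dividing the returned byte count by the measured seconds gives a throughput figure comparable in kind to the MB/s column of the table, though a faithful reproduction would also need the author's buffer sizes, cache handling and O_DIRECT path.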

