July 2023 - Linux-stable-mirror

scsi_bus_resume+0x0/0x90 returns -5 when resuming from S3 sleep

by TW

I have been having issues with the 6.x series of kernels resuming from suspend with one of my drives. As far as I can tell, the kernel has trouble with the cache on the drive when coming out of S3 sleep. I tried a few different distros (Manjaro, OpenMandriva Rome, EndeavourOS), and all of them give the same error message. The 5.15 kernel, however, appears to work fine. These are the errors I have been getting, and I assume they are what has been holding up the resume from suspend:

Jul 20 04:13:41 rageworks kernel: ata10.00: device reported invalid CHS sector 0
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: [sdc] Sense Key : Illegal Request [current]
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: [sdc] Add. Sense: Unaligned write command
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: PM: dpm_run_callback(): scsi_bus_resume+0x0/0x90 returns -5
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: PM: failed to resume async: error -5

The full suspend log:

Jul 20 04:12:50 rageworks systemd-logind[869]: The system will suspend now!
Jul 20 04:12:50 rageworks ModemManager[902]: <info> [sleep-monitor-systemd] system is about to suspend
Jul 20 04:12:50 rageworks NetworkManager[894]: <info> [1689847970.4923] manager: sleep: sleep requested (sleeping: no enabled: yes)
Jul 20 04:12:50 rageworks NetworkManager[894]: <info> [1689847970.4924] manager: NetworkManager state is now ASLEEP
Jul 20 04:12:50 rageworks NetworkManager[894]: <info> [1689847970.4926] device (enp9s0): state change: activated -> deactivating (reason 'sleeping', sys-iface-state: 'managed')
Jul 20 04:12:50 rageworks dbus-daemon[866]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.6' (uid=0 pid=894 comm="/usr/bin/NetworkManager --no-daemon")
Jul 20 04:12:50 rageworks systemd[1]: Starting Network Manager Script Dispatcher Service...
Jul 20 04:12:50 rageworks dbus-daemon[866]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jul 20 04:12:50 rageworks systemd[1]: Started Network Manager Script Dispatcher Service.
Jul 20 04:12:50 rageworks NetworkManager[894]: <info> [1689847970.6544] device (enp9s0): state change: deactivating -> disconnected (reason 'sleeping', sys-iface-state: 'managed')
Jul 20 04:12:50 rageworks avahi-daemon[864]: Withdrawing address record for fe80::881:c55d:1583:20f5 on enp9s0.
Jul 20 04:12:50 rageworks avahi-daemon[864]: Leaving mDNS multicast group on interface enp9s0.IPv6 with address fe80::881:c55d:1583:20f5.
Jul 20 04:12:50 rageworks avahi-daemon[864]: Interface enp9s0.IPv6 no longer relevant for mDNS.
Jul 20 04:12:50 rageworks NetworkManager[894]: <info> [1689847970.6726] dhcp4 (enp9s0): canceled DHCP transaction
Jul 20 04:12:50 rageworks NetworkManager[894]: <info> [1689847970.6726] dhcp4 (enp9s0): activation: beginning transaction (timeout in 45 seconds)
Jul 20 04:12:50 rageworks NetworkManager[894]: <info> [1689847970.6727] dhcp4 (enp9s0): state changed no lease
Jul 20 04:12:50 rageworks avahi-daemon[864]: Withdrawing address record for 192.168.1.3 on enp9s0.
Jul 20 04:12:50 rageworks avahi-daemon[864]: Leaving mDNS multicast group on interface enp9s0.IPv4 with address 192.168.1.3.
Jul 20 04:12:50 rageworks avahi-daemon[864]: Interface enp9s0.IPv4 no longer relevant for mDNS.
Jul 20 04:12:50 rageworks NetworkManager[894]: <info> [1689847970.7576] device (enp9s0): state change: disconnected -> unmanaged (reason 'sleeping', sys-iface-state: 'managed')
Jul 20 04:12:50 rageworks kernel: r8169 0000:09:00.0 enp9s0: Link is Down
Jul 20 04:12:50 rageworks systemd[1]: Reached target Sleep.
Jul 20 04:12:50 rageworks systemd[1]: Starting NVIDIA system suspend actions...
Jul 20 04:12:50 rageworks suspend[2051]: nvidia-suspend.service
Jul 20 04:12:50 rageworks logger[2051]: <13>Jul 20 04:12:50 suspend: nvidia-suspend.service
Jul 20 04:12:51 rageworks systemd[1]: nvidia-suspend.service: Deactivated successfully.
Jul 20 04:12:51 rageworks systemd[1]: Finished NVIDIA system suspend actions.
Jul 20 04:12:51 rageworks systemd[1]: Starting System Suspend...
Jul 20 04:12:51 rageworks systemd-sleep[2059]: Entering sleep state 'suspend'...
Jul 20 04:12:51 rageworks kernel: PM: suspend entry (deep)
Jul 20 04:13:41 rageworks kernel: Filesystems sync: 0.284 seconds
Jul 20 04:13:41 rageworks kernel: Freezing user space processes
Jul 20 04:13:41 rageworks kernel: Freezing user space processes completed (elapsed 0.001 seconds)
Jul 20 04:13:41 rageworks kernel: OOM killer disabled.
Jul 20 04:13:41 rageworks kernel: Freezing remaining freezable tasks
Jul 20 04:13:41 rageworks kernel: Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
Jul 20 04:13:41 rageworks kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Jul 20 04:13:41 rageworks kernel: serial 00:05: disabled
Jul 20 04:13:41 rageworks kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
Jul 20 04:13:41 rageworks kernel: sd 1:0:0:0: [sdb] Synchronizing SCSI cache
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: [sdc] Synchronizing SCSI cache
Jul 20 04:13:41 rageworks kernel: sd 1:0:0:0: [sdb] Stopping disk
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: [sdc] Stopping disk
Jul 20 04:13:41 rageworks kernel: sd 0:0:0:0: [sda] Stopping disk
Jul 20 04:13:41 rageworks kernel: ACPI: PM: Preparing to enter system sleep state S3
Jul 20 04:13:41 rageworks kernel: ACPI: PM: Saving platform NVS memory
Jul 20 04:13:41 rageworks kernel: Disabling non-boot CPUs ...
Jul 20 04:13:41 rageworks kernel: smpboot: CPU 1 is now offline
Jul 20 04:13:41 rageworks kernel: smpboot: CPU 2 is now offline
Jul 20 04:13:41 rageworks kernel: smpboot: CPU 3 is now offline
Jul 20 04:13:41 rageworks kernel: ACPI: PM: Low-level resume complete
Jul 20 04:13:41 rageworks kernel: ACPI: PM: Restoring platform NVS memory
Jul 20 04:13:41 rageworks kernel: Enabling non-boot CPUs ...
Jul 20 04:13:41 rageworks kernel: x86: Booting SMP configuration:
Jul 20 04:13:41 rageworks kernel: smpboot: Booting Node 0 Processor 1 APIC 0x1
Jul 20 04:13:41 rageworks kernel: microcode: CPU1: patch_level=0x08101016
Jul 20 04:13:41 rageworks kernel: CPU1 is up
Jul 20 04:13:41 rageworks kernel: smpboot: Booting Node 0 Processor 2 APIC 0x2
Jul 20 04:13:41 rageworks kernel: microcode: CPU2: patch_level=0x08101016
Jul 20 04:13:41 rageworks kernel: CPU2 is up
Jul 20 04:13:41 rageworks kernel: smpboot: Booting Node 0 Processor 3 APIC 0x3
Jul 20 04:13:41 rageworks kernel: microcode: CPU3: patch_level=0x08101016
Jul 20 04:13:41 rageworks kernel: CPU3 is up
Jul 20 04:13:41 rageworks kernel: ACPI: PM: Waking up from system sleep state S3
Jul 20 04:13:41 rageworks kernel: xhci_hcd 0000:02:00.0: xHC error in resume, USBSTS 0x401, Reinit
Jul 20 04:13:41 rageworks kernel: usb usb1: root hub lost power or was reset
Jul 20 04:13:41 rageworks kernel: usb usb2: root hub lost power or was reset
Jul 20 04:13:41 rageworks kernel: serial 00:05: activated
Jul 20 04:13:41 rageworks kernel: sd 0:0:0:0: [sda] Starting disk
Jul 20 04:13:41 rageworks kernel: sd 1:0:0:0: [sdb] Starting disk
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: [sdc] Starting disk
Jul 20 04:13:41 rageworks kernel: ata5: SATA link down (SStatus 0 SControl 330)
Jul 20 04:13:41 rageworks kernel: ata9: SATA link down (SStatus 0 SControl 300)
Jul 20 04:13:41 rageworks kernel: ata6: SATA link down (SStatus 0 SControl 330)
Jul 20 04:13:41 rageworks kernel: usb 1-7: reset full-speed USB device number 2 using xhci_hcd
Jul 20 04:13:41 rageworks kernel: usb 1-10: reset full-speed USB device number 4 using xhci_hcd
Jul 20 04:13:41 rageworks kernel: usb 1-8: reset full-speed USB device number 3 using xhci_hcd
Jul 20 04:13:41 rageworks kernel: ata2: found unknown device (class 0)
Jul 20 04:13:41 rageworks kernel: ata1: found unknown device (class 0)
Jul 20 04:13:41 rageworks kernel: ata10: found unknown device (class 0)
Jul 20 04:13:41 rageworks kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 20 04:13:41 rageworks kernel: ata2.00: configured for UDMA/133
Jul 20 04:13:41 rageworks kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 20 04:13:41 rageworks kernel: ata1.00: configured for UDMA/133
Jul 20 04:13:41 rageworks kernel: ata10: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 20 04:13:41 rageworks kernel: ata10.00: configured for UDMA/133
Jul 20 04:13:41 rageworks kernel: ata10: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 20 04:13:41 rageworks kernel: ata10.00: configured for UDMA/133
Jul 20 04:13:41 rageworks kernel: ata10.00: device reported invalid CHS sector 0
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: [sdc] Sense Key : Illegal Request [current]
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: [sdc] Add. Sense: Unaligned write command
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: PM: dpm_run_callback(): scsi_bus_resume+0x0/0x90 returns -5
Jul 20 04:13:41 rageworks kernel: sd 9:0:0:0: PM: failed to resume async: error -5
Jul 20 04:13:41 rageworks kernel: OOM killer enabled.
Jul 20 04:13:41 rageworks kernel: Restarting tasks ... done.
Jul 20 04:13:41 rageworks kernel: random: crng reseeded on system resumption
Jul 20 04:13:41 rageworks kernel: PM: suspend exit
Jul 20 04:13:41 rageworks rtkit-daemon[1349]: The canary thread is apparently starving. Taking action.
Jul 20 04:13:41 rageworks systemd-sleep[2059]: System returned from sleep state.
Jul 20 04:13:41 rageworks rtkit-daemon[1349]: Demoting known real-time threads.
Jul 20 04:13:41 rageworks systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
Jul 20 04:13:41 rageworks rtkit-daemon[1349]: Successfully demoted thread 1597 of process 1342.
Jul 20 04:13:41 rageworks rtkit-daemon[1349]: Successfully demoted thread 1342 of process 1342.
Jul 20 04:13:41 rageworks rtkit-daemon[1349]: Demoted 2 threads.
Jul 20 04:13:42 rageworks systemd[1]: systemd-suspend.service: Deactivated successfully.
Jul 20 04:13:42 rageworks systemd[1]: Finished System Suspend.
Jul 20 04:13:42 rageworks systemd[1]: Stopped target Sleep.
Jul 20 04:13:42 rageworks systemd[1]: Reached target Suspend.
Jul 20 04:13:42 rageworks systemd[1]: Stopped target Suspend.
Jul 20 04:13:42 rageworks systemd-logind[869]: Operation 'sleep' finished.

Thanks
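
A note for readers decoding the log: the -5 that scsi_bus_resume returns is a negated errno value, and on Linux errno 5 is EIO (I/O error). A minimal userspace sketch in C of that decoding is below. describe_kernel_error() is a hypothetical helper name chosen for illustration; it is not a kernel function and it says nothing about why the drive rejected the command.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): kernel callbacks report
 * failure as a negated errno value, so the sign is flipped before
 * the message is looked up with strerror().
 */
static const char *describe_kernel_error(int ret)
{
	if (ret >= 0)
		return "success";
	return strerror(-ret);
}
```

Calling describe_kernel_error(-5) returns the C library's message for EIO, which matches the "error -5" in the last PM line of the short log.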


[PATCH] virtio-vdpa: Fix cpumask memory leak in virtio_vdpa_find_vqs()

by Dragos Tatulea

From: Gal Pressman <gal(a)nvidia.com>

Free the cpumask allocated by create_affinity_masks() before returning from the function.

Fixes: 3dad56823b53 ("virtio-vdpa: Support interrupt affinity spreading mechanism")
Signed-off-by: Gal Pressman <gal(a)nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea(a)nvidia.com>
---
 drivers/virtio/virtio_vdpa.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 989e2d7184ce..961161da5900 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -393,11 +393,13 @@ static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
 	cb.callback = virtio_vdpa_config_cb;
 	cb.private = vd_dev;
 	ops->set_config_cb(vdpa, &cb);
+	kfree(masks);
 	return 0;
 err_setup_vq:
 	virtio_vdpa_del_vqs(vdev);
+	kfree(masks);
 	return err;
 }
--
2.41.0


Asus Laptop won't resume by timer (timerfd_settime)

by Alexey Kuznetsov

Hello!

I'm using an Asus Vivobook laptop, which fails to resume by a timer set with the timerfd_settime function call. It simply ignores the timer and keeps sleeping until I press any key. When a key is pressed, the notebook wakes up and treats the key resume event as a timer event.

I tested this using systemd and its HibernateDelaySec option, which wakes the system from suspend by timer so that it can switch from suspend to hibernation. During suspend the notebook simply does nothing when the timer fires; when I press any key it wakes and goes into hibernation (treating the key-press wake event as a timer event). Systemd has checks that should prevent hibernating if the system was woken by a key press, but those checks do not fail.

I tested the same suspend / hibernate software on a desktop and everything works fine.

This is the systemd code responsible for the suspend / timer / hibernate logic:

tfd = timerfd_create(CLOCK_BOOTTIME_ALARM, TFD_NONBLOCK|TFD_CLOEXEC);
timerfd_settime(tfd, 0, &ts, NULL)
execute(sleep_config, SLEEP_HYBRID_SLEEP, NULL)
fd_wait_for_event(tfd, POLLIN, 0)
woken_by_timer = FLAGS_SET(r, POLLIN)
check_wakeup_type()

Basically, these are the POSIX calls responsible for setting the timer alarm and reading the timer status.

I've tested on the recent Debian kernel, Linux 6.1.0-10-amd64, and on the stable release from kernel.org, Linux 6.4.7, with the same behavior. It is most likely a hardware/EFI or kernel issue.

Full logs:
https://linux-hardware.org/?probe=d1a4b2769a
https://bugzilla.kernel.org/show_bug.cgi?id=217728

--
AK
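
For anyone who wants to poke at the timer half of this without systemd, here is a small standalone sketch in C of the timerfd sequence quoted above. It is not systemd's code: create_armed_timer() and timer_expired() are hypothetical names, the delay is in milliseconds for convenience, and the clock is a parameter because arming CLOCK_BOOTTIME_ALARM requires the CAP_WAKE_ALARM capability. The sketch only arms the timer and polls it; it does not suspend the machine.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a one-shot timerfd on the given clock and
 * arm it delay_ms milliseconds from now. delay_ms must be positive,
 * because an all-zero it_value disarms the timer. Returns the fd or -1.
 */
static int create_armed_timer(clockid_t clock, long delay_ms)
{
	struct itimerspec ts = {
		.it_value = {
			.tv_sec = delay_ms / 1000,
			.tv_nsec = (delay_ms % 1000) * 1000000L,
		},
	};
	int tfd = timerfd_create(clock, TFD_NONBLOCK | TFD_CLOEXEC);

	if (tfd < 0)
		return -1;
	if (timerfd_settime(tfd, 0, &ts, NULL) < 0) {
		close(tfd);
		return -1;
	}
	return tfd;
}

/*
 * Hypothetical helper: wait up to timeout_ms for the timer to fire.
 * Returns 1 if the fd became readable (timer expired), 0 on timeout,
 * and -1 if poll() itself failed.
 */
static int timer_expired(int tfd, int timeout_ms)
{
	struct pollfd pfd = { .fd = tfd, .events = POLLIN };
	int r = poll(&pfd, 1, timeout_ms);

	if (r < 0)
		return -1;
	return r > 0 && (pfd.revents & POLLIN);
}
```

Passing CLOCK_BOOTTIME_ALARM as the clock (as root, or with CAP_WAKE_ALARM) gives the same kind of timer as in the quoted snippet; CLOCK_MONOTONIC is enough to check the arm-and-poll logic on a running system.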


[PATCH v3 1/3] ARM: dts: imx6sx: Remove LDB endpoint

by Fabio Estevam

From: Fabio Estevam <festevam(a)denx.de>

Remove the LDB endpoint description from the common imx6sx.dtsi as it causes a regression for boards that have the LCDIF connected directly to a parallel display.

Let the LDB endpoint be described in the board devicetree file instead.

Cc: stable(a)vger.kernel.org
Fixes: b74edf626c4f ("ARM: dts: imx6sx: Add LDB support")
Signed-off-by: Fabio Estevam <festevam(a)denx.de>
---
Changes since v2:
- Rebased against 6.5-rc1.

 arch/arm/boot/dts/nxp/imx/imx6sx.dtsi | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/arm/boot/dts/nxp/imx/imx6sx.dtsi b/arch/arm/boot/dts/nxp/imx/imx6sx.dtsi
index 3a4308666552..41c900929758 100644
--- a/arch/arm/boot/dts/nxp/imx/imx6sx.dtsi
+++ b/arch/arm/boot/dts/nxp/imx/imx6sx.dtsi
@@ -863,7 +863,6 @@ port@0 {
 			reg = <0>;
 			ldb_from_lcdif1: endpoint {
-				remote-endpoint = <&lcdif1_to_ldb>;
 			};
 		};
@@ -1309,11 +1308,8 @@ lcdif1: lcdif@2220000 {
 		power-domains = <&pd_disp>;
 		status = "disabled";
-		ports {
-			port {
-				lcdif1_to_ldb: endpoint {
-					remote-endpoint = <&ldb_from_lcdif1>;
-				};
+		port {
+			lcdif1_to_ldb: endpoint {
 			};
 		};
 	};
--
2.34.1


[PATCH] rust: allocator: Prevents mis-aligned allocation

by Boqun Feng

Currently the KernelAllocator simply passes the size of the type Layout to krealloc(), and in theory the alignment requirement from the type Layout may be larger than the guarantee provided by SLAB, which means the allocated object is mis-aligned.

Fix this by adjusting the allocation size to the next power of two, for which SLAB always guarantees a size-aligned allocation. And because Rust guarantees that the original size must be a multiple of the alignment and the alignment must be a power of two, the alignment requirement is satisfied.

Suggested-by: Vlastimil Babka <vbabka(a)suse.cz>
Co-developed-by: Andreas Hindborg (Samsung) <nmi(a)metaspace.dk>
Signed-off-by: Andreas Hindborg (Samsung) <nmi(a)metaspace.dk>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Cc: stable(a)vger.kernel.org # v6.1+
---
Some more explanation:

* Layout is a data structure describing a particular memory layout; conceptually it has two fields: align and size.
* align is guaranteed to be a power of two.
* size can be smaller than align (only when the Layout is created via Layout::from_size_align())
* After pad_to_align(), the size is guaranteed to be a multiple of align

For more information, please see:
https://doc.rust-lang.org/stable/std/alloc/struct.Layout.html

 rust/bindings/bindings_helper.h |  1 +
 rust/kernel/allocator.rs        | 17 ++++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 3e601ce2548d..6619ce95dd37 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -15,3 +15,4 @@
 /* `bindgen` gets confused at certain things. */
 const gfp_t BINDINGS_GFP_KERNEL = GFP_KERNEL;
 const gfp_t BINDINGS___GFP_ZERO = __GFP_ZERO;
+const size_t BINDINGS_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;
diff --git a/rust/kernel/allocator.rs b/rust/kernel/allocator.rs
index 397a3dd57a9b..66575cf87ce2 100644
--- a/rust/kernel/allocator.rs
+++ b/rust/kernel/allocator.rs
@@ -11,9 +11,24 @@
 unsafe impl GlobalAlloc for KernelAllocator {
     unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
+        // Customized layouts from `Layout::from_size_align()` can have size < align, so pads first.
+        let layout = layout.pad_to_align();
+
+        let mut size = layout.size();
+
+        if layout.align() > bindings::BINDINGS_ARCH_SLAB_MINALIGN {
+            // The alignment requirement exceeds the slab guarantee, then tries to enlarges the size
+            // to use the "power-of-two" size/alignment guarantee (see comments in kmalloc() for
+            // more information).
+            //
+            // Note that `layout.size()` (after padding) is guaranteed to be muliples of
+            // `layout.align()`, so `next_power_of_two` gives enough alignment guarantee.
+            size = size.next_power_of_two();
+        }
+
         // `krealloc()` is used instead of `kmalloc()` because the latter is
         // an inline function and cannot be bound to as a result.
-        unsafe { bindings::krealloc(ptr::null(), layout.size(), bindings::GFP_KERNEL) as *mut u8 }
+        unsafe { bindings::krealloc(ptr::null(), size, bindings::GFP_KERNEL) as *mut u8 }
     }
     unsafe fn dealloc(&self, ptr: *mut u8, _layout: Layout) {
--
2.39.2
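
The size arithmetic described in the commit message can be checked outside the kernel. The following is a sketch in C, not the patch's Rust code: the function names are made up for illustration, min_align is a stand-in parameter for ARCH_SLAB_MINALIGN, and align is assumed to be a nonzero power of two, as Layout guarantees. After padding, size is a multiple of align, so rounding it up to a power of two yields a value that is still a multiple of align.

```c
#include <stddef.h>

/* Round size up to a multiple of align (align must be a power of two). */
static size_t pad_to_align(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/* Smallest power of two that is >= n; returns 1 for n == 0. */
static size_t next_power_of_two(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Illustrative only: mirrors the steps in the patch. min_align is a
 * hypothetical stand-in for ARCH_SLAB_MINALIGN.
 */
static size_t adjusted_alloc_size(size_t size, size_t align, size_t min_align)
{
	size_t padded = pad_to_align(size, align);

	if (align > min_align)
		padded = next_power_of_two(padded);
	return padded;
}
```

For example, with a minimum alignment of 8, a request of 48 bytes aligned to 16 becomes 64 bytes, while the same 48 bytes aligned to 8 stays at 48 because the slab minimum already covers it.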


[PATCH 0/2] mm/damon/core: fix unitialized memory error from

by SeongJae Park

damos_new_filter() is returning a damos_filter struct without initializing its ->list field. And the users of the function use the struct without initializing the field. As a result, an uninitialized memory access error is possible. Actually, a kernel NULL pointer dereference BUG can be triggered using the DAMON user-space tool, like below.

# damo start --damos_action stat --damos_filter anon matching
# damo tune --damos_action stat --damos_filter anon matching --damos_filter anon nomatching
# dmesg
[...]
[ 36.908136] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 36.910483] #PF: supervisor write access in kernel mode
[ 36.912238] #PF: error_code(0x0002) - not-present page
[ 36.913415] PGD 0 P4D 0
[ 36.913978] Oops: 0002 [#1] PREEMPT SMP PTI
[ 36.914878] CPU: 32 PID: 1335 Comm: kdamond.0 Not tainted 6.5.0-rc3-mm-unstable-damon+ #1
[ 36.916621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 36.919051] RIP: 0010:damos_destroy_filter (include/linux/list.h:114 include/linux/list.h:137 include/linux/list.h:148 mm/damon/core.c:345 mm/damon/core.c:355)
[...]
[ 36.938247] Call Trace:
[ 36.938721] <TASK>
[...]
[ 36.950064] ? damos_destroy_filter (include/linux/list.h:114 include/linux/list.h:137 include/linux/list.h:148 mm/damon/core.c:345 mm/damon/core.c:355)
[ 36.950883] ? damon_sysfs_set_scheme_filters.isra.0 (mm/damon/sysfs-schemes.c:1573)
[ 36.952019] damon_sysfs_set_schemes (mm/damon/sysfs-schemes.c:1674 mm/damon/sysfs-schemes.c:1686)
[ 36.952875] damon_sysfs_apply_inputs (mm/damon/sysfs.c:1312 mm/damon/sysfs.c:1298)
[ 36.953757] ? damon_pa_check_accesses (mm/damon/paddr.c:168 mm/damon/paddr.c:179)
[ 36.954648] damon_sysfs_cmd_request_callback (mm/damon/sysfs.c:1329 mm/damon/sysfs.c:1359)
[...]

The first patch of this patchset fixes the bug by initializing the field in damos_new_filter(). The second patch adds a unit test for the problem.

Note that the second patch Cc's stable@ without a Fixes: tag, since it would be better for both patches to be ingested together to avoid any future regression.

SeongJae Park (2):
  mm/damon/core: initialize damo_filter->list from damos_new_filter()
  mm/damon/core-test: add a test for damos_new_filter()

 mm/damon/core-test.h | 13 +++++++++++++
 mm/damon/core.c      |  1 +
 2 files changed, 14 insertions(+)

--
2.25.1
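
To illustrate why the missing initialization matters, here is a userspace sketch in C of a circular doubly linked list in the style of the kernel's list_head. It is a simplified reimplementation with hypothetical names (list_node_*), not the kernel's list.h. Unlinking a node writes through its next and prev pointers, so a node whose pointers were never set up cannot be unlinked safely; a NULL next pointer would turn the first store into a write near address zero, which would be consistent with the oops above, though that reading is an assumption on my part.

```c
/* Minimal circular doubly linked list node (illustrative names). */
struct list_node {
	struct list_node *next, *prev;
};

/* Point the node at itself; this is the step the fix adds. */
static void list_node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* Insert n right after head. */
static void list_node_add(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/*
 * Unlink n. Both stores dereference n->next and n->prev, so they are
 * only valid if the node was initialized or added to a list before.
 */
static void list_node_del(struct list_node *n)
{
	n->next->prev = n->prev;
	n->prev->next = n->next;
	list_node_init(n);
}

/* True if the node is not linked to any other node. */
static int list_node_unlinked(const struct list_node *n)
{
	return n->next == n;
}
```

With list_node_init() called at creation time, deleting a node that was never added to a list is a harmless no-op, which is the property a destroy path needs when it unlinks unconditionally.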


[PATCH v2 1/5] rcutorture: Fix stuttering races and other issues

by Joel Fernandes (Google)

The stuttering code isn't functioning as expected. Ideally, it should pause the torture threads for a designated period before resuming. Yet, it fails to halt the test for the correct duration. Additionally, a race condition exists, potentially causing the stuttering code to pause for an extended period if the 'spt' variable is non-zero due to the stutter orchestration thread's inadequate CPU time.

Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels. This happens as the stuttering code may run within a softirq due to RCU callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds, thus obstructing RCU's progress. This situation triggers a warning message in the logs:

[ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9

This warning suggests that an RCU torture object, although invisible to RCU readers, couldn't make it past the pipe array and be freed -- a strong indication that there weren't enough grace periods during the stutter interval.

To address these issues, this patch sets the "stutter end" time to an absolute point in the future set by the main stutter thread. This is then used for waiting in stutter_wait(). While the stutter thread still defines this absolute time, the waiters' waiting logic doesn't rely on the stutter thread receiving sufficient CPU time to halt the stuttering as the halting is now self-controlled.

Cc: stable(a)vger.kernel.org
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
---
 kernel/torture.c | 45 ++++++++++++---------------------------------
 1 file changed, 12 insertions(+), 33 deletions(-)

diff --git a/kernel/torture.c b/kernel/torture.c
index 6ba62e5993e7..fd353f98162f 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -720,7 +720,7 @@ static void torture_shutdown_cleanup(void)
  * suddenly applied to or removed from the system.
  */
 static struct task_struct *stutter_task;
-static int stutter_pause_test;
+static ktime_t stutter_till_abs_time;
 static int stutter;
 static int stutter_gap;
@@ -730,30 +730,16 @@ static int stutter_gap;
  */
 bool stutter_wait(const char *title)
 {
-	unsigned int i = 0;
 	bool ret = false;
-	int spt;
+	ktime_t till_ns;
 	cond_resched_tasks_rcu_qs();
-	spt = READ_ONCE(stutter_pause_test);
-	for (; spt; spt = READ_ONCE(stutter_pause_test)) {
-		if (!ret && !rt_task(current)) {
-			sched_set_normal(current, MAX_NICE);
-			ret = true;
-		}
-		if (spt == 1) {
-			torture_hrtimeout_jiffies(1, NULL);
-		} else if (spt == 2) {
-			while (READ_ONCE(stutter_pause_test)) {
-				if (!(i++ & 0xffff))
-					torture_hrtimeout_us(10, 0, NULL);
-				cond_resched();
-			}
-		} else {
-			torture_hrtimeout_jiffies(round_jiffies_relative(HZ), NULL);
-		}
-		torture_shutdown_absorb(title);
+	till_ns = READ_ONCE(stutter_till_abs_time);
+	if (till_ns && ktime_before(ktime_get(), till_ns)) {
+		torture_hrtimeout_ns(till_ns, 0, HRTIMER_MODE_ABS, NULL);
+		ret = true;
 	}
+	torture_shutdown_absorb(title);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(stutter_wait);
@@ -764,23 +750,16 @@ EXPORT_SYMBOL_GPL(stutter_wait);
  */
 static int torture_stutter(void *arg)
 {
-	DEFINE_TORTURE_RANDOM(rand);
-	int wtime;
+	ktime_t till_ns;
 	VERBOSE_TOROUT_STRING("torture_stutter task started");
 	do {
 		if (!torture_must_stop() && stutter > 1) {
-			wtime = stutter;
-			if (stutter > 2) {
-				WRITE_ONCE(stutter_pause_test, 1);
-				wtime = stutter - 3;
-				torture_hrtimeout_jiffies(wtime, &rand);
-				wtime = 2;
-			}
-			WRITE_ONCE(stutter_pause_test, 2);
-			torture_hrtimeout_jiffies(wtime, NULL);
+			till_ns = ktime_add_ns(ktime_get(),
+					       jiffies_to_nsecs(stutter));
+			WRITE_ONCE(stutter_till_abs_time, till_ns);
+			torture_hrtimeout_jiffies(stutter - 1, NULL);
 		}
-		WRITE_ONCE(stutter_pause_test, 0);
 		if (!torture_must_stop())
 			torture_hrtimeout_jiffies(stutter_gap, NULL);
 		torture_shutdown_absorb("torture_stutter");
--
2.41.0.487.g6d72f3e995-goog
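
The core idea of the patch, waiting until an absolute deadline instead of polling a flag that another thread must clear, can be sketched in userspace C. This is an analogy under stated assumptions, not the kernel code: it uses clock_gettime() and clock_nanosleep() with TIMER_ABSTIME in place of ktime and hrtimers, and the function names (deadline_after_ms, wait_until, deadline_passed) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Compute an absolute CLOCK_MONOTONIC deadline ms milliseconds from now. */
static struct timespec deadline_after_ms(long ms)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	ts.tv_sec += ms / 1000;
	ts.tv_nsec += (ms % 1000) * 1000000L;
	if (ts.tv_nsec >= 1000000000L) {
		ts.tv_sec += 1;
		ts.tv_nsec -= 1000000000L;
	}
	return ts;
}

/*
 * Sleep until the absolute deadline. Because the deadline is absolute,
 * restarting after a signal does not lengthen the total wait, and the
 * waiter does not depend on any other thread running to end the pause.
 * Returns 0 on success or an errno value.
 */
static int wait_until(const struct timespec *deadline)
{
	int err;

	do {
		err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
				      deadline, NULL);
	} while (err == EINTR);
	return err;
}

/* True once the monotonic clock has reached the deadline. */
static int deadline_passed(const struct timespec *deadline)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return now.tv_sec > deadline->tv_sec ||
	       (now.tv_sec == deadline->tv_sec &&
		now.tv_nsec >= deadline->tv_nsec);
}
```

One thread would publish the deadline and any number of waiters would call wait_until() on it; a waiter that arrives after the deadline returns immediately, which corresponds to the ktime_before() check in the new stutter_wait().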


[PATCH] hwmon: (aquacomputer_d5next) Add selective 200ms delay after sending ctrl report

by Aleksa Savic

Add a 200ms delay after sending a ctrl report to Quadro, Octo, D5 Next and Aquaero to give them enough time to process the request and save the data to memory. Otherwise, under heavier userspace loads where multiple sysfs entries are usually set in quick succession, a new ctrl report could be requested from the device while it's still processing the previous one and fail with -EPIPE.

Reported by a user on Github [1] and tested by both of us.

[1] https://github.com/aleksamagicka/aquacomputer_d5next-hwmon/issues/82

Cc: stable(a)vger.kernel.org
Signed-off-by: Aleksa Savic <savicaleksa83(a)gmail.com>
---
 drivers/hwmon/aquacomputer_d5next.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/hwmon/aquacomputer_d5next.c b/drivers/hwmon/aquacomputer_d5next.c
index a997dbcb563f..9cb55d51185a 100644
--- a/drivers/hwmon/aquacomputer_d5next.c
+++ b/drivers/hwmon/aquacomputer_d5next.c
@@ -652,6 +652,31 @@ static int aqc_send_ctrl_data(struct aqc_data *priv)
 	ret = hid_hw_raw_request(priv->hdev, priv->secondary_ctrl_report_id,
 				 priv->secondary_ctrl_report, priv->secondary_ctrl_report_size,
 				 HID_FEATURE_REPORT, HID_REQ_SET_REPORT);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Wait 200ms before returning to make sure that the device actually processed both reports
+	 * and saved ctrl data to memory. Otherwise, an aqc_get_ctrl_data() call made shortly after
+	 * may fail with -EPIPE because the device is still busy and can't provide data. This can
+	 * happen when userspace tools, such as fancontrol or liquidctl, write to sysfs entries in
+	 * quick succession.
+	 *
+	 * 200ms was found to be the sweet spot between fixing the issue and not significantly
+	 * prolonging the call. Quadro, Octo, D5 Next and Aquaero are currently known to be
+	 * affected.
+	 */
+	switch (priv->kind) {
+	case quadro:
+	case octo:
+	case d5next:
+	case aquaero:
+		msleep(200);
+		break;
+	default:
+		break;
+	}
+
 	return ret;
 }
--
2.41.0


[PATCH RESEND v4 1/1] test_firmware: fix some memory leaks and racing conditions

by Mirsad Todorovac

Some functions were called both from locked and unlocked context, so the lock was dropped prematurely, introducing a race condition when deadlock was avoided.

Having two locks wouldn't assure a race-proof mutual exclusion.

__test_dev_config_update_bool(), __test_dev_config_update_u8() and __test_dev_config_update_size_t() unlocked versions of the functions were introduced to be called from the locked contexts as a workaround without releasing the main driver's lock and causing a race condition.

This should guarantee mutual exclusion and prevent any race conditions.

Locked versions simply allow for mutual exclusion and call the unlocked counterparts, to avoid duplication of code.

trigger_batched_requests_store() and trigger_batched_requests_async_store() now return -EBUSY if called with test_fw_config->reqs already allocated, so the memory leak is prevented.

The same functions now keep track of the allocated buf for firmware in req->fw_buf as release_firmware() will not deallocate this storage for us.

Additionally, in __test_release_all_firmware(), req->fw_buf is released before calling release_firmware(req->fw), foreach test_fw_config->reqs[i], i = 0 .. test_fw_config->num_requests-1

Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Tianfei zhang <tianfei.zhang(a)intel.com>
Cc: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Cc: Zhengchao Shao <shaozhengchao(a)huawei.com>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: linux-kernel(a)vger.kernel.org
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Scott Branden <sbranden(a)broadcom.com>
Cc: Luis R. Rodriguez <mcgrof(a)kernel.org>
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
---
v3 -> v4
- fix additional memory leaks of the allocated firmware buffers
- fix noticed racing conditions in conformance with the existing code
- make it a single patch

 lib/test_firmware.c | 81 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 63 insertions(+), 18 deletions(-)

diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 05ed84c2fc4c..1d7d480b8eeb 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -45,6 +45,7 @@ struct test_batched_req {
 	bool sent;
 	const struct firmware *fw;
 	const char *name;
+	const char *fw_buf;
 	struct completion completion;
 	struct task_struct *task;
 	struct device *dev;
@@ -175,8 +176,14 @@ static void __test_release_all_firmware(void)
 	for (i = 0; i < test_fw_config->num_requests; i++) {
 		req = &test_fw_config->reqs[i];
-		if (req->fw)
+		if (req->fw) {
+			if (req->fw_buf) {
+				kfree_const(req->fw_buf);
+				req->fw_buf = NULL;
+			}
 			release_firmware(req->fw);
+			req->fw = NULL;
+		}
 	}
 	vfree(test_fw_config->reqs);
@@ -353,16 +360,26 @@ static ssize_t config_test_show_str(char *dst,
 	return len;
 }
-static int test_dev_config_update_bool(const char *buf, size_t size,
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
 				       bool *cfg)
 {
 	int ret;
-	mutex_lock(&test_fw_mutex);
 	if (kstrtobool(buf, cfg) < 0)
 		ret = -EINVAL;
 	else
 		ret = size;
+
+	return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+				       bool *cfg)
+{
+	int ret;
+
+	mutex_lock(&test_fw_mutex);
+	ret = __test_dev_config_update_bool(buf, size, cfg);
 	mutex_unlock(&test_fw_mutex);
 	return ret;
@@ -373,7 +390,8 @@ static ssize_t test_dev_config_show_bool(char *buf, bool val)
 	return snprintf(buf, PAGE_SIZE, "%d\n", val);
 }
-static int test_dev_config_update_size_t(const char *buf,
+static int __test_dev_config_update_size_t(
+					 const char *buf,
 					 size_t size,
 					 size_t *cfg)
 {
@@ -384,9 +402,7 @@ static int test_dev_config_update_size_t(const char *buf,
 	if (ret)
 		return ret;
-	mutex_lock(&test_fw_mutex);
 	*(size_t *)cfg = new;
-	mutex_unlock(&test_fw_mutex);
 	/* Always return full write size even if we didn't consume all */
 	return size;
@@ -402,7 +418,7 @@ static ssize_t test_dev_config_show_int(char *buf, int val)
 	return snprintf(buf, PAGE_SIZE, "%d\n", val);
 }
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
 {
 	u8 val;
 	int ret;
@@ -411,14 +427,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
 	if (ret)
 		return ret;
-	mutex_lock(&test_fw_mutex);
 	*(u8 *)cfg = val;
-	mutex_unlock(&test_fw_mutex);
 	/* Always return full write size even if we didn't consume all */
 	return size;
 }
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+	int ret;
+
+	mutex_lock(&test_fw_mutex);
+	ret = __test_dev_config_update_u8(buf, size, cfg);
+	mutex_unlock(&test_fw_mutex);
+
+	return ret;
+}
+
 static ssize_t test_dev_config_show_u8(char *buf, u8 val)
 {
 	return snprintf(buf, PAGE_SIZE, "%u\n", val);
@@ -471,10 +496,10 @@ static ssize_t config_num_requests_store(struct device *dev,
 		mutex_unlock(&test_fw_mutex);
 		goto out;
 	}
-	mutex_unlock(&test_fw_mutex);
-	rc = test_dev_config_update_u8(buf, count,
-				       &test_fw_config->num_requests);
+	rc = __test_dev_config_update_u8(buf, count,
+					 &test_fw_config->num_requests);
+	mutex_unlock(&test_fw_mutex);
 out:
 	return rc;
@@ -518,10 +543,10 @@ static ssize_t config_buf_size_store(struct device *dev,
 		mutex_unlock(&test_fw_mutex);
 		goto out;
 	}
-	mutex_unlock(&test_fw_mutex);
-	rc = test_dev_config_update_size_t(buf, count,
-					   &test_fw_config->buf_size);
+	rc = __test_dev_config_update_size_t(buf, count,
+					     &test_fw_config->buf_size);
+	mutex_unlock(&test_fw_mutex);
 out:
 	return rc;
@@ -548,10 +573,10 @@ static ssize_t config_file_offset_store(struct device *dev,
 		mutex_unlock(&test_fw_mutex);
 		goto out;
 	}
-	mutex_unlock(&test_fw_mutex);
-	rc = test_dev_config_update_size_t(buf, count,
-					   &test_fw_config->file_offset);
+	rc = __test_dev_config_update_size_t(buf, count,
+					     &test_fw_config->file_offset);
+	mutex_unlock(&test_fw_mutex);
 out:
 	return rc;
@@ -652,6 +677,8 @@ static ssize_t trigger_request_store(struct device *dev,
 	mutex_lock(&test_fw_mutex);
 	release_firmware(test_firmware);
+	if (test_fw_config->reqs)
+		__test_release_all_firmware();
 	test_firmware = NULL;
 	rc = request_firmware(&test_firmware, name, dev);
 	if (rc) {
@@ -752,6 +779,8 @@ static ssize_t trigger_async_request_store(struct device *dev,
 	mutex_lock(&test_fw_mutex);
 	release_firmware(test_firmware);
 	test_firmware = NULL;
+	if (test_fw_config->reqs)
+		__test_release_all_firmware();
 	rc = request_firmware_nowait(THIS_MODULE, 1, name, dev, GFP_KERNEL,
 				     NULL, trigger_async_request_cb);
 	if (rc) {
@@ -794,6 +823,8 @@ static ssize_t trigger_custom_fallback_store(struct device *dev,
 	mutex_lock(&test_fw_mutex);
 	release_firmware(test_firmware);
+	if (test_fw_config->reqs)
+		__test_release_all_firmware();
 	test_firmware = NULL;
 	rc = request_firmware_nowait(THIS_MODULE, FW_ACTION_NOUEVENT, name,
 				     dev, GFP_KERNEL, NULL,
@@ -856,6 +887,8 @@ static int test_fw_run_batch_request(void *data)
 						 test_fw_config->buf_size);
 		if (!req->fw)
 			kfree(test_buf);
+		else
+			req->fw_buf = test_buf;
 	} else {
 		req->rc = test_fw_config->req_firmware(&req->fw,
 						       req->name,
@@ -895,6 +928,11 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
 	mutex_lock(&test_fw_mutex);
+	if (test_fw_config->reqs) {
+		rc = -EBUSY;
+		goto out_bail;
+	}
+
 	test_fw_config->reqs =
 		vzalloc(array3_size(sizeof(struct test_batched_req),
 				    test_fw_config->num_requests, 2));
@@ -911,6 +949,7 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
 		req->fw = NULL;
 		req->idx = i;
 		req->name = test_fw_config->name;
+		req->fw_buf = NULL;
 		req->dev = dev;
 		init_completion(&req->completion);
 		req->task = kthread_run(test_fw_run_batch_request, req,
@@ -993,6 +1032,11 @@ ssize_t trigger_batched_requests_async_store(struct device *dev,
 	mutex_lock(&test_fw_mutex);
+	if (test_fw_config->reqs) {
+		rc = -EBUSY;
+		goto out_bail;
+	}
+
 	test_fw_config->reqs =
 		vzalloc(array3_size(sizeof(struct test_batched_req),
 				    test_fw_config->num_requests, 2));
@@ -1010,6 +1054,7 @@ ssize_t trigger_batched_requests_async_store(struct device *dev,
 	for (i = 0; i < test_fw_config->num_requests; i++) {
 		req = &test_fw_config->reqs[i];
 		req->name = test_fw_config->name;
+		req->fw_buf = NULL;
 		req->fw = NULL;
 		req->idx = i;
 		init_completion(&req->completion);
--
2.30.2
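
The locking pattern the commit message describes (an unlocked helper that expects the lock to be held, a thin locked wrapper around it, and store functions that keep the check and the update inside one critical section) can be sketched in userspace C with a pthread mutex. This is an illustration under assumptions, not the driver code: struct fw_config, its busy flag and all function names are hypothetical stand-ins.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the driver's configuration state. */
struct fw_config {
	pthread_mutex_t lock;
	int busy;			/* models "requests already allocated" */
	unsigned char num_requests;
};

/* Unlocked helper: the caller must already hold cfg->lock. */
static int config_update_u8_nolock(struct fw_config *cfg, unsigned char val)
{
	cfg->num_requests = val;
	return 0;
}

/* Locked wrapper for callers that do not hold the lock. */
static int config_update_u8(struct fw_config *cfg, unsigned char val)
{
	int ret;

	pthread_mutex_lock(&cfg->lock);
	ret = config_update_u8_nolock(cfg, val);
	pthread_mutex_unlock(&cfg->lock);
	return ret;
}

/*
 * The busy check and the update happen under a single lock hold, so no
 * other thread can change the state between the two steps. Dropping the
 * lock after the check and retaking it in the update would reopen that
 * window.
 */
static int config_num_requests_store(struct fw_config *cfg, unsigned char val)
{
	int ret;

	pthread_mutex_lock(&cfg->lock);
	if (cfg->busy)
		ret = -EBUSY;
	else
		ret = config_update_u8_nolock(cfg, val);
	pthread_mutex_unlock(&cfg->lock);
	return ret;
}
```

Calling the locked wrapper from inside config_num_requests_store() would deadlock on a non-recursive mutex, which is why the helper that assumes the lock is held exists as a separate function.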


[PATCH net 0/6] There are some bugfixes for the HNS3 ethernet driver

by Jijie Shao

There are some bugfixes for the HNS3 ethernet driver

Jian Shen (1):
  net: hns3: restore user pause configure when disable autoneg

Jie Wang (2):
  net: hns3: refactor hclge_mac_link_status_wait for interface reuse
  net: hns3: add wait until mac link down

Peiyang Wang (1):
  net: hns3: fix wrong print link down up

Yonglong Liu (2):
  net: hns3: fix side effects passed to min_t()
  net: hns3: fix deadlock issue when externel_lb and reset are executed together

 .../net/ethernet/hisilicon/hns3/hns3_enet.c   | 17 ++++++++--
 .../hisilicon/hns3/hns3pf/hclge_main.c        | 32 ++++++++++++++-----
 .../ethernet/hisilicon/hns3/hns3pf/hclge_tm.c |  2 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_tm.h |  1 +
 4 files changed, 41 insertions(+), 11 deletions(-)

--
2.30.0

