This is the start of the stable review cycle for the 6.18.3 release. There are 430 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 31 Dec 2025 16:06:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.18.3-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.18.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.18.3-rc1
Cheng Ding cding@ddn.com fuse: missing copy_finish in fuse-over-io-uring argument copies
Joanne Koong joannelkoong@gmail.com fuse: fix readahead reclaim deadlock
Joanne Koong joannelkoong@gmail.com fuse: fix io-uring list corruption for terminated non-committed requests
Johan Hovold johan@kernel.org iommu/mediatek: fix use-after-free on probe deferral
Damien Le Moal dlemoal@kernel.org block: freeze queue when updating zone resources
Nicolas Ferre nicolas.ferre@microchip.com ARM: dts: microchip: sama7g5: fix uart fifo size to 32
Nicolas Ferre nicolas.ferre@microchip.com ARM: dts: microchip: sama7d65: fix uart fifo size to 32
Nicolas Ferre nicolas.ferre@microchip.com ARM: dts: microchip: sama5d2: fix spi flexcom fifo size to 32
Gui-Dong Han hanguidong02@gmail.com hwmon: (w83l786ng) Convert macros to functions to avoid TOCTOU
Gui-Dong Han hanguidong02@gmail.com hwmon: (w83791d) Convert macros to functions to avoid TOCTOU
Johan Hovold johan@kernel.org hwmon: (max6697) fix regmap leak on probe failure
Gui-Dong Han hanguidong02@gmail.com hwmon: (max16065) Use local variable to avoid TOCTOU
Joanne Koong joannelkoong@gmail.com io_uring/rsrc: fix lost entries after cloned range
Raviteja Laggyshetty quic_rlaggysh@quicinc.com interconnect: qcom: sdx75: Drop QPIC interconnect and BCM nodes
Ma Ke make24@iscas.ac.cn i2c: amd-mp2: fix reference leak in MP2 PCI device
Eric Biggers ebiggers@kernel.org lib/crypto: riscv: Depend on RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
Bartosz Golaszewski bartosz.golaszewski@linaro.org platform/x86: intel: chtwc_int33fe: don't dereference swnode args
Biju Das biju.das.jz@bp.renesas.com pwm: rzg2l-gpt: Allow checking period_tick cache value only if sibling channel is enabled
Srinivas Kandagatla srinivas.kandagatla@oss.qualcomm.com rpmsg: glink: fix rpmsg device leak
Tomas Glozar tglozar@redhat.com rtla/timerlat_bpf: Stop tracing on user latency
Johan Hovold johan@kernel.org soc: amlogic: canvas: fix device leak on lookup
Johan Hovold johan@kernel.org soc: apple: mailbox: fix device leak on lookup
Johan Hovold johan@kernel.org soc: qcom: ocmem: fix device leak on lookup
Johan Hovold johan@kernel.org soc: qcom: pbs: fix device leak on lookup
Johan Hovold johan@kernel.org soc: samsung: exynos-pmu: fix device leak on regmap lookup
Steven Rostedt rostedt@goodmis.org tracing: Fix fixed array of synthetic event
Raghavendra Rao Ananta rananta@google.com vfio: Fix ksize arg while copying user struct in vfio_df_ioctl_bind_iommufd()
Miaoqian Lin linmq006@gmail.com virtio: vdpa: Fix reference count leak in octep_sriov_enable()
Damien Le Moal dlemoal@kernel.org zloop: make the write pointer of full zones invalid
Damien Le Moal dlemoal@kernel.org zloop: fail zone append operations that are targeting full zones
Johan Hovold johan@kernel.org amba: tegra-ahb: Fix device leak on SMMU enable
Eric Biggers ebiggers@kernel.org crypto: arm64/ghash - Fix incorrect output from ghash-neon
Guangshuo Li lgs201920130244@gmail.com crypto: caam - Add check for kcalloc() in test_len()
Shivani Agarwal shivani.agarwal@broadcom.com crypto: af_alg - zero initialize memory allocated via sock_kmalloc
Krzysztof Kozlowski krzk@kernel.org dt-bindings: PCI: qcom,pcie-sm8550: Add missing required power-domains and resets
Krzysztof Kozlowski krzk@kernel.org dt-bindings: PCI: qcom,pcie-sm8450: Add missing required power-domains and resets
Krzysztof Kozlowski krzk@kernel.org dt-bindings: PCI: qcom,pcie-sm8350: Add missing required power-domains and resets
Krzysztof Kozlowski krzk@kernel.org dt-bindings: PCI: qcom,pcie-sm8250: Add missing required power-domains and resets
Krzysztof Kozlowski krzk@kernel.org dt-bindings: PCI: qcom,pcie-sm8150: Add missing required power-domains and resets
Krzysztof Kozlowski krzk@kernel.org dt-bindings: PCI: qcom,pcie-sc8280xp: Add missing required power-domains and resets
Krzysztof Kozlowski krzk@kernel.org dt-bindings: PCI: qcom,pcie-sc7280: Add missing required power-domains and resets
Jani Nikula jani.nikula@intel.com drm/displayid: pass iter to drm_find_displayid_extension()
Ray Wu ray.wu@amd.com drm/amd/display: Fix scratch registers offsets for DCN351
Ray Wu ray.wu@amd.com drm/amd/display: Fix scratch registers offsets for DCN35
Alex Deucher alexander.deucher@amd.com drm/amd/display: Use GFP_ATOMIC in dc_create_plane_state()
Mario Limonciello mario.limonciello@amd.com Revert "drm/amd/display: Fix pbn to kbps Conversion"
Xi Ruoyao xry111@xry111.site gpio: loongson: Switch 2K2000/3000 GPIO to BYTE_CTRL_MODE
Askar Safin safinaskar@gmail.com gpiolib: acpi: Add quirk for Dell Precision 7780
Wentao Guan guanwentao@uniontech.com gpio: regmap: Fix memleak in error path in gpio_regmap_register()
Sven Schnelle svens@linux.ibm.com s390/ipl: Clear SBP flag when bootprog is set
Shakeel Butt shakeel.butt@linux.dev cgroup: rstat: use LOCK CMPXCHG in css_rstat_updated
Filipe Manana fdmanana@suse.com btrfs: don't log conflicting inode if it's a dir moved in the current transaction
Nysal Jan K.A. nysal@linux.ibm.com powerpc/kexec: Enable SMT before waking offline CPUs
Joshua Rogers linux@joshua.hu SUNRPC: svcauth_gss: avoid NULL deref on zero length gss_token in gss_read_proxy_verf
Joshua Rogers linux@joshua.hu svcrdma: use rc_pageoff for memcpy byte offset
Joshua Rogers linux@joshua.hu svcrdma: return 0 on success from svc_rdma_copy_inline_range
Andy Shevchenko andriy.shevchenko@linux.intel.com nfsd: Mark variable __maybe_unused to avoid W=1 build break
Joshua Rogers linux@joshua.hu svcrdma: bound check rq_pages index in inline path
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: SOF: ipc4-topology: Convert FLOAT to S32 during blob selection
Chuck Lever chuck.lever@oracle.com NFSD: Clear TIME_DELEG in the suppattr_exclcreat bitmap
Antheas Kapenekakis lkml@antheas.dev ALSA: hda/realtek: Add Asus quirk for TAS amplifiers
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: SOF: ipc4-topology: Prefer 32-bit DMIC blobs for 8-bit formats as well
Chuck Lever chuck.lever@oracle.com NFSD: NFSv4 file creation neglects setting ACL
Chuck Lever chuck.lever@oracle.com NFSD: Clear SECLABEL in the suppattr_exclcreat bitmap
caoping caoping@cmss.chinamobile.com net/handshake: restore destructor on submit failure
Amir Goldstein amir73il@gmail.com fsnotify: do not generate ACCESS/MODIFY events on child for special files
Thorsten Blum thorsten.blum@linux.dev net: phy: marvell-88q2xxx: Fix clamped value in mv88q2xxx_hwmon_write
René Rebe rene@exactco.de r8169: fix RTL8117 Wake-on-Lan in DASH mode
Charles Mirabile cmirabil@redhat.com lib/crypto: riscv: Add poly1305-core.S to .gitignore
Mark Brown broonie@kernel.org arm64/gcs: Flush the GCS locking state on exec
Rafael J. Wysocki rafael.j.wysocki@intel.com PM: runtime: Do not clear needs_force_resume with enabled runtime PM
Steven Rostedt rostedt@goodmis.org tracing: Do not register unsupported perf events
Christoph Hellwig hch@lst.de xfs: validate that zoned RT devices are zone aligned
Darrick J. Wong djwong@kernel.org xfs: fix a UAF problem in xattr repair
Christoph Hellwig hch@lst.de xfs: fix the zoned RT growfs check for zone alignment
Darrick J. Wong djwong@kernel.org xfs: fix stupid compiler warning
Haoxiang Li lihaoxiang@isrc.iscas.ac.cn xfs: fix a memory leak in xfs_buf_item_init()
Gavin Shan gshan@redhat.com KVM: selftests: Add missing "break" in rseq_test's param parsing
Sean Christopherson seanjc@google.com KVM: x86: Apply runtime updates to current CPUID during KVM_SET_CPUID{,2}
Sean Christopherson seanjc@google.com KVM: nSVM: Clear exit_code_hi in VMCB when synthesizing nested VM-Exits
Sean Christopherson seanjc@google.com KVM: nSVM: Set exit_code_hi to -1 when synthesizing SVM_EXIT_ERR (failed VMRUN)
Dongli Zhang dongli.zhang@oracle.com KVM: nVMX: Immediately refresh APICv controls as needed on nested VM-Exit
Sean Christopherson seanjc@google.com KVM: TDX: Explicitly set user-return MSRs that *may* be clobbered by the TDX-Module
Jim Mattson jmattson@google.com KVM: SVM: Mark VMCB_PERM_MAP as dirty on nested VMRUN
Wanpeng Li wanpengli@tencent.com KVM: Fix last_boosted_vcpu index assignment bug
Yosry Ahmed yosry.ahmed@linux.dev KVM: nSVM: Propagate SVM_EXIT_CR0_SEL_WRITE correctly for LMSW emulation
Sean Christopherson seanjc@google.com KVM: selftests: Forcefully override ARCH from x86_64 to x86
Jim Mattson jmattson@google.com KVM: SVM: Mark VMCB_NPT as dirty on nested VMRUN
Yosry Ahmed yosry.ahmed@linux.dev KVM: nSVM: Avoid incorrect injection of SVM_EXIT_CR0_SEL_WRITE
fuqiang wang fuqiang.wng@gmail.com KVM: x86: Fix VM hard lockup after prolonged inactivity with periodic HV timer
fuqiang wang fuqiang.wng@gmail.com KVM: x86: Explicitly set new periodic hrtimer expiration in apic_timer_fn()
Sean Christopherson seanjc@google.com KVM: x86: WARN if hrtimer callback for periodic APIC timer fires with period=0
Finn Thain fthain@linux-m68k.org powerpc: Add reloc_offset() to font bitmap pointer used for bootx_printf()
Ilya Dryomov idryomov@gmail.com libceph: make decode_pool() more resilient against corrupted osdmaps
Vivian Wang wangruikang@iscas.ac.cn lib/crypto: riscv/chacha: Avoid s0/fp register
Dongsheng Yang dongsheng.yang@linux.dev dm-pcache: advance slot index before writing slot
Helge Deller deller@gmx.de parisc: Do not reprogram affinitiy on ASP chip
Zhichi Lin zhichi.lin@vivo.com scs: fix a wrong parameter in __scs_magic
Wangao Wang wangao.wang@oss.qualcomm.com media: iris: Add sanity check for stop streaming
Tzung-Bi Shih tzungbi@kernel.org platform/chrome: cros_ec_ishtp: Fix UAF after unbinding driver
Maxim Levitsky mlevitsk@redhat.com KVM: x86: Don't clear async #PF queue when CR0.PG is disabled (e.g. on #SMI)
Prithvi Tambewagh activprithvi@gmail.com ocfs2: fix kernel BUG in ocfs2_find_victim_chain
Jeongjun Park aha310510@gmail.com media: vidtv: initialize local pointers upon transfer of memory ownership
Deepanshu Kartikey kartikey406@gmail.com mm/slub: reset KASAN tag in defer_free() before accessing freed memory
Sean Christopherson seanjc@google.com KVM: Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot
Claudiu Beznea claudiu.beznea.uj@bp.renesas.com pinctrl: renesas: rzg2l: Fix ISEL restore on resume
Alison Schofield alison.schofield@intel.com tools/testing/nvdimm: Use per-DIMM device handle
Chao Yu chao@kernel.org f2fs: fix return value of f2fs_recover_fsync_data()
Chao Yu chao@kernel.org f2fs: fix to not account invalid blocks in get_left_section_blocks()
Chao Yu chao@kernel.org f2fs: fix to detect recoverable inode during dryrun of find_fsync_dnodes()
Xiaole He hexiaole1994@126.com f2fs: fix uninitialized one_time_gc in victim_sel_policy
Xiaole He hexiaole1994@126.com f2fs: fix age extent cache insertion skip on counter overflow
Chao Yu chao@kernel.org f2fs: use global inline_xattr_slab instead of per-sb slab cache
Deepanshu Kartikey kartikey406@gmail.com f2fs: invalidate dentry cache on failed whiteout creation
Chao Yu chao@kernel.org f2fs: fix to avoid updating zero-sized extent in extent cache
Chao Yu chao@kernel.org f2fs: fix to propagate error from f2fs_enable_checkpoint()
Chao Yu chao@kernel.org f2fs: fix to avoid potential deadlock
Chao Yu chao@kernel.org f2fs: fix to avoid updating compression context during writeback
Jan Prusakowski jprusakowski@google.com f2fs: ensure node page reads complete before f2fs_put_super() finishes
Seunghwan Baek sh8267.baek@samsung.com scsi: ufs: core: Add ufshcd_update_evt_hist() for UFS suspend error
Chandrakanth Patil chandrakanth.patil@broadcom.com scsi: mpi3mr: Read missing IOCFacts flag for reply queue full overflow
Andrey Vatoropin a.vatoropin@crpt.ru scsi: target: Reset t_task_cdb pointer in error case
Dai Ngo dai.ngo@oracle.com NFSD: use correct reservation type in nfsd4_scsi_fence_client
Junrui Luo moonafterrain@outlook.com scsi: aic94xx: fix use-after-free in device removal path
Tony Battersby tonyb@cybernetics.com scsi: Revert "scsi: qla2xxx: Perform lockless command completion in abort path"
Miaoqian Lin linmq006@gmail.com cpufreq: nforce2: fix reference count leak in nforce2
Rafael J. Wysocki rafael.j.wysocki@intel.com cpuidle: governors: teo: Drop misguided target residency check
Claudiu Beznea claudiu.beznea.uj@bp.renesas.com serial: sh-sci: Check that the DMA cookie is valid
j.turek jakub.turek@elsta.tech serial: xilinx_uartps: fix rs485 delay_rts_after_send
Alexander Stein alexander.stein@ew.tq-group.com serial: core: Fix serial device initialization
Andy Shevchenko andriy.shevchenko@linux.intel.com serial: core: Restore sysfs fwnode information
Junxiao Chang junxiao.chang@intel.com mei: gsc: add dependency on Xe driver
Ma Ke make24@iscas.ac.cn mei: Fix error handling in mei_register
Ma Ke make24@iscas.ac.cn intel_th: Fix error handling in intel_th_output_open
Srinivas Kandagatla srinivas.kandagatla@oss.qualcomm.com dt-bindings: slimbus: fix warning from example
Tianchu Chen flynnnchen@tencent.com char: applicom: fix NULL pointer dereference in ac_ioctl
Łukasz Bartosik ukaszb@chromium.org xhci: dbgtty: fix device unregister: fixup
Haoxiang Li haoxiang_li2024@163.com usb: renesas_usbhs: Fix a resource leak in usbhs_pipe_malloc()
Udipto Goswami udipto.goswami@oss.qualcomm.com usb: dwc3: keep susphy enabled during exit to avoid controller faults
Miaoqian Lin linmq006@gmail.com usb: dwc3: of-simple: fix clock resource leak in dwc3_of_simple_probe
Johan Hovold johan@kernel.org usb: gadget: lpc32xx_udc: fix clock imbalance in error path
Johan Hovold johan@kernel.org usb: phy: isp1301: fix non-OF device reference imbalance
Duoming Zhou duoming@zju.edu.cn usb: phy: fsl-usb: Fix use-after-free in delayed work during device removal
Ma Ke make24@iscas.ac.cn USB: lpc32xx_udc: Fix error handling in probe
Haoxiang Li lihaoxiang@isrc.iscas.ac.cn usb: typec: altmodes/displayport: Drop the device reference in dp_altmode_probe()
Arnd Bergmann arnd@arndb.de usb: typec: ucsi: huawei-gaokin: add DRM dependency
Johan Hovold johan@kernel.org usb: ohci-nxp: fix device leak on probe failure
Johan Hovold johan@kernel.org phy: broadcom: bcm63xx-usbh: fix section mismatches
Colin Ian King colin.i.king@gmail.com media: pvrusb2: Fix incorrect variable used in trace message
Jeongjun Park aha310510@gmail.com media: dvb-usb: dtv5100: fix out-of-bounds in dtv5100_i2c_msg()
Chen Changcheng chenchangcheng@kylinos.cn usb: usb-storage: Maintain minimal modifications to the bcdDevice range.
Paolo Abeni pabeni@redhat.com mptcp: avoid deadlock on fallback while reinjecting
Paolo Abeni pabeni@redhat.com mptcp: schedule rtx timer only after pushing data
Matthieu Baerts (NGI0) matttbe@kernel.org selftests: mptcp: pm: ensure unknown flags are ignored
Matthieu Baerts (NGI0) matttbe@kernel.org mptcp: pm: ignore unknown endpoint flags
Harry Yoo harry.yoo@oracle.com mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction
Hans de Goede johannes.goede@oss.qualcomm.com dma-mapping: Fix DMA_BIT_MASK() macro being broken
Sourabh Jain sourabhjain@linux.ibm.com crash: let architecture decide crash memory export to iomem_resource
Jarkko Sakkinen jarkko@kernel.org tpm2-sessions: Fix tpm2_read_public range checks
Jarkko Sakkinen jarkko@kernel.org tpm2-sessions: Fix out of range indexing in name_size
Wei Yang richard.weiyang@gmail.com mm/huge_memory: add pmd folio to ds_queue in do_huge_zero_wp_pmd()
Laurent Pinchart laurent.pinchart@ideasonboard.com media: v4l2-mem2mem: Fix outdated documentation
xu xin xu.xin16@zte.com.cn mm/ksm: fix exec/fork inheritance support for prctl
Bart Van Assche bvanassche@acm.org block: Remove queue freezing from several sysfs store callbacks
Byungchul Park byungchul@sk.com jbd2: use a weaker annotation in journal handling
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp jbd2: use a per-journal lock_class_key for jbd2_trans_commit_key
Baokun Li libaokun1@huawei.com ext4: align max orphan file size with e2fsprogs limit
Yongjian Sun sunyongjian1@huawei.com ext4: fix incorrect group number assertion in mb_check_buddy
Haibo Chen haibo.chen@nxp.com ext4: clear i_state_flags when alloc inode
Karina Yankevich k.yankevich@omp.ru ext4: xattr: fix null pointer deref in ext4_raw_inode()
Fedor Pchelkin pchelkin@ispras.ru ext4: check if mount_opts is NUL-terminated in ext4_ioctl_set_tune_sb()
Fedor Pchelkin pchelkin@ispras.ru ext4: fix string copying in parse_apply_sb_mount_options()
John Ogness john.ogness@linutronix.de printk: Avoid irq_work for printk_deferred() on suspend
John Ogness john.ogness@linutronix.de printk: Allow printk_trigger_flush() to flush all types
Rafael J. Wysocki rafael.j.wysocki@intel.com fs: PM: Fix reverse check in filesystems_freeze_callback()
Jarkko Sakkinen jarkko@kernel.org tpm: Cap the number of PCR banks
Steven Rostedt rostedt@goodmis.org ktest.pl: Fix uninitialized var in config-bisect.pl
Konstantin Komarov almaz.alexandrovich@paragon-software.com fs/ntfs3: fix mount failure for sparse runs in run_unpack()
Zheng Yejian zhengyejian@huaweicloud.com kallsyms: Fix wrong "big" kernel symbol type read from procfs
Eric Biggers ebiggers@kernel.org crypto: scatterwalk - Fix memcpy_sglist() to always succeed
Rene Rebe rene@exactco.de floppy: fix for PAGE_SIZE != 4KB
Ye Bin yebin10@huawei.com jbd2: fix the inconsistency between checksum and data in memory for journal sb
Li Chen chenl311@chinatelecom.cn block: rate-limit capacity change info log
Alexey Velichayshiy a.velichayshiy@ispras.ru gfs2: fix freeze error handling
Josef Bacik josef@toxicpanda.com btrfs: don't rewrite ret from inode_permission
Sven Eckelmann (Plasma Cloud) se@simonwunderlich.de wifi: mt76: Fix DTS power-limits on little endian systems
Stefan Haberland sth@linux.ibm.com s390/dasd: Fix gendisk parent after copy pair swap
Eric Biggers ebiggers@kernel.org lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
Ma Ke make24@iscas.ac.cn perf: arm_cspmu: fix error handling in arm_cspmu_impl_unregister()
Ard Biesheuvel ardb@kernel.org efi: Add missing static initializer for efi_mm::cpus_allowed_lock
André Draszik andre.draszik@linaro.org phy: exynos5-usbdrd: fix clock prepare imbalance
Alexey Minnekhanov alexeymin@postmarketos.org dt-bindings: clock: mmcc-sdm660: Add missing MDSS reset
Sarthak Garg sarthak.garg@oss.qualcomm.com mmc: sdhci-msm: Avoid early clock doubling during HS400 transition
Avadhut Naik avadhut.naik@amd.com x86/mce: Do not clear bank's poll bit in mce_poll_banks on AMD SMCA systems
Tejun Heo tj@kernel.org sched_ext: Fix missing post-enqueue handling in move_local_task_to_local_dsq()
Tejun Heo tj@kernel.org sched_ext: Fix bypass depth leak on scx_enable() failure
Zqiang qiang.zhang@linux.dev sched_ext: Fix the memleak for sch->helper objects
Tejun Heo tj@kernel.org sched_ext: Factor out local_dsq_post_enq() from dispatch_enqueue()
John Ogness john.ogness@linutronix.de printk: Avoid scheduling irq_work on suspend
Prithvi Tambewagh activprithvi@gmail.com io_uring: fix filename leak in __io_openat_prep()
Jens Axboe axboe@kernel.dk io_uring: fix min_wait wakeups for SQPOLL
Jens Axboe axboe@kernel.dk io_uring/poll: correctly handle io_poll_add() return value on update
Johan Hovold johan@kernel.org clk: keystone: syscon-clk: fix regmap leak on probe failure
Jarkko Sakkinen jarkko@kernel.org KEYS: trusted: Fix a memory leak in tpm2_load_cmd
Alice Ryhl aliceryhl@google.com rust: io: add typedef for phys_addr_t
Alice Ryhl aliceryhl@google.com rust: io: move ResourceSize to top-level io module
Alice Ryhl aliceryhl@google.com rust: io: define ResourceSize as resource_size_t
Marko Turk mt@markoturk.info samples: rust: fix endianness issue in rust_driver_pci
FUJITA Tomonori fujita.tomonori@gmail.com rust: dma: add helpers for architectures without CONFIG_HAS_DMA
Alice Ryhl aliceryhl@google.com rust_binder: avoid mem::take on delivered_deaths
Lyude Paul lyude@redhat.com rust/drm/gem: Fix missing header in `Object` rustdoc
Zilin Guan zilin@seu.edu.cn cifs: Fix memory and information leak in smb3_reconfigure()
Stefano Garzarella sgarzare@redhat.com vhost/vsock: improve RCU read sections around vhost_vsock_get()
Dan Carpenter dan.carpenter@linaro.org block: rnbd-clt: Fix signedness bug in init_dev()
Caleb Sander Mateos csander@purestorage.com ublk: clean up user copy references on ublk server exit
Alok Tiwari alok.a.tiwari@oracle.com drm/msm/a6xx: move preempt_prepare_postamble after error check
Neil Armstrong neil.armstrong@linaro.org drm/msm: adreno: fix deferencing ifpc_reglist when not declared
John Garry john.g.garry@oracle.com scsi: scsi_debug: Fix atomic write enable module param description
Gregory CLEMENT gregory.clement@bootlin.com MIPS: ftrace: Fix memory corruption when kernel is located beyond 32 bits
Chia-Lin Kao (AceLan) acelan.kao@canonical.com platform/x86/intel/hid: Add Dell Pro Rugged 10/12 tablet to VGBS DMI quirks
Pei Xiao xiaopei01@kylinos.cn hwmon: (emc2305) fix double put in emc2305_probe_childs_from_dt
Justin Tee justintee8345@gmail.com nvme-fabrics: add ENOKEY to no retry criteria for authentication failures
Pei Xiao xiaopei01@kylinos.cn hwmon: (emc2305) fix device node refcount leak in error path
Daniel Wagner wagi@kernel.org nvme-fc: don't hold rport lock when putting ctrl
Derek J. Clark derekjohn.clark@gmail.com platform/x86: wmi-gamezone: Add Legion Go 2 Quirks
Jinhui Guo guojinhui.liam@bytedance.com i2c: designware: Disable SMBus interrupts to prevent storms from mis-configured firmware
Jens Reidel adrian@mainlining.org clk: qcom: dispcc-sm7150: Fix dispcc_mdss_pclk0_clk_src
Ian Rogers irogers@google.com libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map
Wenhua Lin Wenhua.Lin@unisoc.com serial: sprd: Return -EPROBE_DEFER when uart clock is not ready
Michal Pecio michal.pecio@gmail.com usb: xhci: Don't unchain link TRBs on quirky HCs
Chen Changcheng chenchangcheng@kylinos.cn usb: usb-storage: No additional quirks need to be added to the EL-R12 optical drive.
Hongyu Xie xiehongyu1@kylinos.cn usb: xhci: limit run_graceperiod for only usb 3.0 devices
Pei Xiao xiaopei01@kylinos.cn iio: adc: ti_am335x_adc: Limit step_avg to valid range for gcc complains
Mark Pearson mpearson-lenovo@squebb.ca usb: typec: ucsi: Handle incorrect num_connectors capability
Lizhi Xu lizhi.xu@windriver.com usbip: Fix locking bug in RT-enabled kernels
Yuezhang Mo Yuezhang.Mo@sony.com exfat: zero out post-EOF page cache on file extension
Yuezhang Mo Yuezhang.Mo@sony.com exfat: fix remount failure in different process environments
Encrow Thorne jyc0019@gmail.com reset: fix BIT macro reference
Al Viro viro@zeniv.linux.org.uk functionfs: fix the open/removal races
Li Qiang liqiang01@kylinos.cn via_wdt: fix critical boot hang due to unnamed resource allocation
Bernd Schubert bschubert@ddn.com fuse: Invalidate the page cache after FOPEN_DIRECT_IO write
Bernd Schubert bschubert@ddn.com fuse: Always flush the page cache before FOPEN_DIRECT_IO write
Tony Battersby tonyb@cybernetics.com scsi: qla2xxx: Use reinit_completion on mbx_intr_comp
Tony Battersby tonyb@cybernetics.com scsi: qla2xxx: Fix initiator mode with qlini_mode=exclusive
Tony Battersby tonyb@cybernetics.com scsi: qla2xxx: Fix lost interrupts with qlini_mode=disabled
Ben Collins bcollins@kernel.org powerpc/addnote: Fix overflow on 32-bit builds
Josua Mayer josua@solid-run.com clk: mvebu: cp110 add CLK_IGNORE_UNUSED to pcie_x10, pcie_x11 & pcie_x4
Justin Tee justin.tee@broadcom.com scsi: lpfc: Fix reusing an ndlp that is marked NLP_DROPPED during FLOGI
David Strahan David.Strahan@microchip.com scsi: smartpqi: Add support for Hurray Data new controller PCI device
Matthias Schiffer matthias.schiffer@tq-group.com ti-sysc: allow OMAP2 and OMAP4 timers to be reserved on AM33xx
Johannes Berg johannes.berg@intel.com um: init cpu_tasks[] earlier
Peng Fan peng.fan@nxp.com firmware: imx: scu-irq: Init workqueue before request mbox channel
Peter Wang peter.wang@mediatek.com scsi: ufs: host: mediatek: Fix shutdown/suspend race condition
Jinhui Guo guojinhui.liam@bytedance.com ipmi: Fix __scan_channels() failing to rescan channels
Jinhui Guo guojinhui.liam@bytedance.com ipmi: Fix the race between __scan_channels() and deliver_response()
Stefan Binding sbinding@opensource.cirrus.com ASoC: ops: fix snd_soc_get_volsw for sx controls
Shuming Fan shumingf@realtek.com ASoC: SDCA: support Q7.8 volume format
Shardul Bankar shardul.b@mpiricsoftware.com nfsd: fix memory leak in nfsd_create_serv error paths
Shengjiu Wang shengjiu.wang@nxp.com ASoC: ak4458: remove the reset operation in probe and remove
Shipei Qu qu@darknavy.com ALSA: usb-mixer: us16x08: validate meter packet indices
Haotian Zhang vulab@iscas.ac.cn ALSA: pcmcia: Fix resource leak in snd_pdacf_probe error path
Haotian Zhang vulab@iscas.ac.cn ALSA: vxpocket: Fix resource leak in vxpocket_probe error path
Chancel Liu chancel.liu@nxp.com ASoC: fsl_sai: Constrain sample rates from audio PLLs only in master mode
Tal Zussman tz2294@columbia.edu x86/mm/tlb/trace: Export the TLB_REMOTE_WRONG_CPU enum in <trace/events/tlb.h>
Yongxin Liu yongxin.liu@windriver.com x86/fpu: Fix FPU state core dump truncation on CPUs with no extended xfeatures
Thomas Gleixner tglx@linutronix.de x86/msi: Make irq_retrigger() functional for posted MSI
Peter Zijlstra peterz@infradead.org x86/bug: Fix old GCC compile fails
Shaurya Rane ssrane_b23@ee.vjti.ac.in net/hsr: fix NULL pointer dereference in prp_get_untagged_frame()
Andrew Jeffery andrew@codeconstruct.com.au dt-bindings: mmc: sdhci-of-aspeed: Switch ref to sdhci-common.yaml
Sai Krishna Potthuri sai.krishna.potthuri@amd.com mmc: sdhci-of-arasan: Increase CD stable timeout to 2 seconds
Jared Kangas jkangas@redhat.com mmc: sdhci-esdhc-imx: add alternate ARCH_S32 dependency to Kconfig
Christophe Leroy christophe.leroy@csgroup.eu spi: fsl-cpm: Check length parity before switching to 16 bit mode
Pengjie Zhang zhangpengjie2@huawei.com ACPI: CPPC: Fix missing PCC check for guaranteed_perf
Pengjie Zhang zhangpengjie2@huawei.com ACPI: PCC: Fix race condition by removing static qualifier
Yongxin Liu yongxin.liu@windriver.com platform/x86: intel_pmc_ipc: fix ACPI buffer memory leak
Kartik Rajput kkartik@nvidia.com soc/tegra: fuse: Do not register SoC device on ACPI boot
Marc Kleine-Budde mkl@pengutronix.de can: gs_usb: gs_can_open(): fix error handling
Christoph Hellwig hch@lst.de xfs: don't leak a locked dquot when xfs_dquot_attach_buf fails
Christoffer Sandberg cs@tuxedo.de Input: i8042 - add TUXEDO InfinityBook Max Gen10 AMD to i8042 quirk table
Duoming Zhou duoming@zju.edu.cn Input: alps - fix use-after-free bugs caused by dev3_register_work
Minseong Kim ii4gsp@gmail.com Input: lkkbd - disable pending work before freeing device
Junjie Cao junjie.cao@intel.com Input: ti_am335x_tsc - fix off-by-one error in wire_order validation
Sanjay Govind sanjay.govind9@gmail.com Input: xpad - add support for CRKD Guitars
Sasha Finkelstein fnkl.kernel@gmail.com Input: apple_z2 - fix reading incorrect reports after exiting sleep
Ping Cheng pinglinux@gmail.com HID: input: map HID_GD_Z to ABS_DISTANCE for stylus/pen
Namjae Jeon linkinjeon@kernel.org ksmbd: fix buffer validation by including null terminator size in EA length
Namjae Jeon linkinjeon@kernel.org ksmbd: Fix refcount leak when invalid session is found on session lookup
Qianchang Zhao pioooooooooip@gmail.com ksmbd: skip lock-range check on equal size to avoid size==0 underflow
Nuno Sá nuno.sa@analog.com hwmon: (ltc4282): Fix reset_history file permissions
Rob Herring (Arm) robh@kernel.org arm64: dts: mediatek: Apply mt8395-radxa DT overlay at build time
Sairaj Kodilkar sarunkod@amd.com amd/iommu: Preserve domain ids inside the kdump kernel
Ashutosh Dixit ashutosh.dixit@intel.com drm/xe/oa: Always set OAG_OAGLBCTXCTRL_COUNTER_RESUME
Shuicheng Lin shuicheng.lin@intel.com drm/xe/oa: Limit num_syncs to prevent oversized allocations
Shuicheng Lin shuicheng.lin@intel.com drm/xe: Limit num_syncs to prevent oversized allocations
Thomas Fourier fourier.thomas@gmail.com block: rnbd-clt: Fix leaked ID in init_dev()
Ming Lei ming.lei@redhat.com ublk: fix deadlock when reading partition table
Ming Lei ming.lei@redhat.com ublk: refactor auto buffer register in ublk_dispatch_req()
Ming Lei ming.lei@redhat.com ublk: add `union ublk_io_buf` with improved naming
Ming Lei ming.lei@redhat.com ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg()
huang-jl huang-jl@deepseek.com io_uring: fix nr_segs calculation in io_import_kbuf
Anurag Dutta a-dutta@ti.com spi: cadence-quadspi: Fix clock disable on probe failure path
Alex Deucher alexander.deucher@amd.com drm/amdgpu: fix a job->pasid access race in gpu recovery
Jianpeng Chang jianpeng.chang.cn@windriver.com arm64: kdump: Fix elfcorehdr overlap caused by reserved memory processing reorder
Juergen Gross jgross@suse.com x86/xen: Fix sparse warning in enlighten_pv.c
Marijn Suijten marijn.suijten@somainline.org drm/panel: sony-td4353-jdi: Enable prepare_prev_first
Haoxiang Li haoxiang_li2024@163.com MIPS: Fix a reference leak bug in ip22_check_gio()
Jan Maslak jan.maslak@intel.com drm/xe: Restore engine registers before restarting schedulers after GT reset
Jagmeet Randhawa jagmeet.randhawa@intel.com drm/xe: Increase TDF timeout
Junxiao Chang junxiao.chang@intel.com drm/me/gsc: mei interrupt top half should be in irq disabled context
Arnd Bergmann arnd@arndb.de drm/xe: fix drm_gpusvm_init() arguments
Vinay Belgaumkar vinay.belgaumkar@intel.com drm/xe: Apply Wa_14020316580 in xe_gt_idle_enable_pg()
Shuicheng Lin shuicheng.lin@intel.com drm/xe: Fix freq kobject leak on sysfs_create_files failure
Alexey Simakov bigalex934@gmail.com hwmon: (tmp401) fix overflow caused by default conversion rate value
Junrui Luo moonafterrain@outlook.com hwmon: (ibmpex) fix use-after-free in high/low store
Denis Sergeev denserg.edu@gmail.com hwmon: (dell-smm) Limit fan multiplier to avoid overflow
Christophe JAILLET christophe.jaillet@wanadoo.fr spi: mpfs: Fix an error handling path in mpfs_spi_probe()
Prajna Rajendra Kumar prajna.rajendrakumar@microchip.com spi: microchip: rename driver file and internal identifiers
Ming Lei ming.lei@redhat.com block: fix race between wbt_enable_default and IO submission
Nilay Shroff nilay@linux.ibm.com block: use {alloc|free}_sched data methods
Nilay Shroff nilay@linux.ibm.com block: introduce alloc_sched_data and free_sched_data elevator methods
Nilay Shroff nilay@linux.ibm.com block: move elevator tags into struct elevator_resources
Nilay Shroff nilay@linux.ibm.com block: unify elevator tags and type xarrays into struct elv_change_ctx
Ming Lei ming.lei@redhat.com selftests: ublk: fix overflow in ublk_queue_auto_zc_fallback()
José Expósito jose.exposito89@gmail.com drm/tests: Handle EDEADLK in set_up_atomic_state()
José Expósito jose.exposito89@gmail.com drm/tests: Handle EDEADLK in drm_test_check_valid_clones()
José Expósito jose.exposito89@gmail.com drm/tests: hdmi: Handle drm_kunit_helper_enable_crtc_connector() returning EDEADLK
Jian Shen shenjian15@huawei.com net: hns3: add VLAN id validation before using
Jian Shen shenjian15@huawei.com net: hns3: using the num_tqps to check whether tqp_index is out of range when vf get ring info from mbx
Jian Shen shenjian15@huawei.com net: hns3: using the num_tqps in the vf driver to apply for resources
Wei Fang wei.fang@nxp.com net: enetc: do not transmit redirected XDP frames when the link is down
Scott Mayhew smayhew@redhat.com net/handshake: duplicate handshake cancellations leak socket
Cosmin Ratiu cratiu@nvidia.com net/mlx5e: Don't include PSP in the hard MTU calculations
Jianbo Liu jianbol@nvidia.com net/mlx5e: Trigger neighbor resolution for unresolved destinations
Jianbo Liu jianbol@nvidia.com net/mlx5e: Use ip6_dst_lookup instead of ipv6_dst_lookup_flow for MAC init
Shay Drory shayd@nvidia.com net/mlx5: Serialize firmware reset with devlink
Shay Drory shayd@nvidia.com net/mlx5: fw_tracer, Handle escaped percent properly
Shay Drory shayd@nvidia.com net/mlx5: fw_tracer, Validate format string parameters
Moshe Shemesh moshe@nvidia.com net/mlx5: Drain firmware reset in shutdown callback
Moshe Shemesh moshe@nvidia.com net/mlx5: fw reset, clear reset requested on drain_fw_reset
Gal Pressman gal@nvidia.com ethtool: Avoid overflowing userspace buffer on stats query
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp can: j1939: make j1939_sk_bind() fail if device is no longer registered
Jason Gunthorpe jgg@ziepe.ca iommufd/selftest: Check for overflow in IOMMU_TEST_OP_ADD_RESERVED
Jason Gunthorpe jgg@ziepe.ca iommufd/selftest: Make it clearer to gcc that the access is not out of bounds
Florian Westphal fw@strlen.de selftests: netfilter: packetdrill: avoid failure on HZ=100 kernel
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: remove redundant chain validation on register store
Florian Westphal fw@strlen.de netfilter: nf_nat: remove bogus direction check
Dan Carpenter dan.carpenter@linaro.org nfc: pn533: Fix error code in pn533_acr122_poweron_rdr()
Victor Nogueira victor@mojatatu.com net/sched: ets: Remove drr class from the active list if it changes to strict
Junrui Luo moonafterrain@outlook.com caif: fix integer underflow in cffrml_receive()
Florian Westphal fw@strlen.de selftests: netfilter: prefer xfail in case race wasn't triggered
Slavin Liu slavin452@gmail.com ipvs: fix ipv4 null-ptr-deref in route error path
Fernando Fernandez Mancera fmancera@suse.de netfilter: nf_conncount: fix leaked ct in error paths
Jakub Kicinski kuba@kernel.org inet: frags: flush pending skbs in fqdir_pre_exit()
Jakub Kicinski kuba@kernel.org inet: frags: add inet_frag_queue_flush()
Jakub Kicinski kuba@kernel.org inet: frags: avoid theoretical race in ip_frag_reinit()
Guenter Roeck linux@roeck-us.net selftests: net: tfo: Fix build warning
Guenter Roeck linux@roeck-us.net selftests: net: Fix build warnings
Guenter Roeck linux@roeck-us.net selftest: af_unix: Support compilers without flex-array-member-not-at-end support
Alexey Simakov bigalex934@gmail.com broadcom: b44: prevent uninitialized value usage
Arnd Bergmann arnd@arndb.de net: ti: icssg-prueth: add PTP_1588_CLOCK_OPTIONAL dependency
Ilya Maximets i.maximets@ovn.org net: openvswitch: fix middle attribute validation in push_nsh() action
Michael Chan michael.chan@broadcom.com bnxt_en: Fix XDP_TX path
Ido Schimmel idosch@nvidia.com mlxsw: spectrum_mr: Fix use-after-free when updating multicast route stats
Ido Schimmel idosch@nvidia.com mlxsw: spectrum_router: Fix neighbour use-after-free
Ido Schimmel idosch@nvidia.com mlxsw: spectrum_router: Fix possible neighbour reference count leak
Gerd Bayer gbayer@linux.ibm.com net/mlx5: Fix double unregister of HCA_PORTS component
Dmitry Skorodumov skorodumov.dmitry@huawei.com ipvlan: Ignore PACKET_LOOPBACK in handle_mode_l2()
Ivan Galkin ivan.galkin@axis.com net: phy: RTL8211FVD: Restore disabling of PHY-mode EEE
Vladimir Oltean vladimir.oltean@nxp.com net: phy: realtek: create rtl8211f_config_phy_eee() helper
Vladimir Oltean vladimir.oltean@nxp.com net: phy: realtek: eliminate priv->phycr1 variable
Vladimir Oltean vladimir.oltean@nxp.com net: phy: realtek: allow CLKOUT to be disabled on RTL8211F(D)(I)-VD-CG
Vladimir Oltean vladimir.oltean@nxp.com net: phy: realtek: eliminate has_phycr2 variable
Vladimir Oltean vladimir.oltean@nxp.com net: phy: realtek: eliminate priv->phycr2 variable
Cosmin Ratiu cratiu@nvidia.com net/mlx5e: Avoid unregistering PSP twice
Moshe Shemesh moshe@nvidia.com net/mlx5: make enable_mpesw idempotent
Jamal Hadi Salim jhs@mojatatu.com net/sched: ets: Always remove class from active list before deleting in ets_qdisc_change
Wang Liang wangliang74@huawei.com netrom: Fix memory leak in nr_sendmsg()
Wei Fang wei.fang@nxp.com net: fec: ERR007885 Workaround for XDP TX path
Andreas Gruenbacher agruenba@redhat.com gfs2: Fix use of bio_chain
Max Chou max.chou@realtek.com Bluetooth: btusb: Add new VID/PID 0x0489/0xE12F for RTL8852BE-VT
Gongwei Li ligongwei@kylinos.cn Bluetooth: btusb: Add new VID/PID 13d3/3533 for RTL8821CE
Shuai Zhang quic_shuaz@quicinc.com Bluetooth: btusb: add new custom firmwares
Chris Lu chris.lu@mediatek.com Bluetooth: btusb: MT7920: Add VID/PID 0489/e135
Chris Lu chris.lu@mediatek.com Bluetooth: btusb: MT7922: Add VID/PID 0489/e170
Chingbin Li liqb365@163.com Bluetooth: btusb: Add new VID/PID 2b89/6275 for RTL8761BUV
Qianchang Zhao pioooooooooip@gmail.com ksmbd: vfs: fix race on m_flags in vfs_cache
Namjae Jeon linkinjeon@kernel.org ksmbd: fix use-after-free in ksmbd_tree_connect_put under concurrency
ChenXiaoSong chenxiaosong@kylinos.cn smb/server: fix return value of smb2_ioctl()
Andreas Gruenbacher agruenba@redhat.com gfs2: Fix "gfs2: Switch to wait_event in gfs2_quotad"
Andreas Gruenbacher agruenba@redhat.com gfs2: fix remote evict for read-only filesystems
Viacheslav Dubeyko slava@dubeyko.com hfsplus: fix volume corruption issue for generic/101
Qu Wenruo wqu@suse.com btrfs: scrub: always update btrfs_scrub_progress::last_physical
Hans de Goede hansg@kernel.org wifi: brcmfmac: Add DMI nvram filename quirk for Acer A1 840 tablet
Quan Zhou quan.zhou@mediatek.com wifi: mt76: mt792x: fix wifi init fail by setting MCU_RUNNING after CLC load
Johannes Berg johannes.berg@intel.com wifi: cfg80211: use cfg80211_leave() in iftype change
Johannes Berg johannes.berg@intel.com wifi: cfg80211: stop radar detection in cfg80211_leave()
Bitterblue Smith rtl8821cerfe2@gmail.com wifi: rtl8xxxu: Fix HT40 channel config for RTL8192CU, RTL8723AU
Konstantin Komarov almaz.alexandrovich@paragon-software.com fs/ntfs3: check for shutdown in fsync
Viacheslav Dubeyko slava@dubeyko.com hfsplus: fix volume corruption issue for generic/073
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp hfsplus: Verify inode mode when loading from disk
Yang Chenzhi yang.chenzhi@vivo.com hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create
Viacheslav Dubeyko slava@dubeyko.com hfsplus: fix volume corruption issue for generic/070
Pedro Demarchi Gomes pedrodemargomes@gmail.com ntfs: set dummy blocksize to read boot_block when mounting
Mikhail Malyshev mike.malyshev@gmail.com kbuild: Use objtree for module signing key path
Konstantin Komarov almaz.alexandrovich@paragon-software.com fs/ntfs3: Support timestamps prior to epoch
Mario Limonciello (AMD) superm1@kernel.org crypto: ccp - Add support for PCI device 0x115A
Song Liu song@kernel.org livepatch: Match old_sympos 0 and 1 in klp_find_func()
Mauro Carvalho Chehab mchehab+huawei@kernel.org scripts: kdoc_parser.py: warn about Python version only once
Yu Peng pengyu@kylinos.cn x86/microcode: Mark early_parse_cmdline() as __init
Aboorva Devarajan aboorvad@linux.ibm.com cpuidle: menu: Use residency threshold in polling state override decisions
Shuhao Fu sfual@cse.ust.hk cpufreq: s5pv210: fix refcount leak
Armin Wolf W_Armin@gmx.de ACPI: fan: Workaround for 64-bit firmware bug
Hal Feng hal.feng@starfivetech.com cpufreq: dt-platdev: Add JH7110S SOC to the allowlist
Sakari Ailus sakari.ailus@linux.intel.com ACPI: property: Use ACPI functions in acpi_graph_get_next_endpoint() only
Cryolitia PukNgae cryolitia.pukngae@linux.dev ACPICA: Avoid walking the Namespace if start_node is NULL
Peter Zijlstra peterz@infradead.org x86/ptrace: Always inline trivial accessors
Peter Zijlstra peterz@infradead.org sched/fair: Revert max_newidle_lb_cost bump
Doug Berger opendmb@gmail.com sched/deadline: only set free_cpus for online runqueues
George Kennedy george.kennedy@oracle.com perf/x86/amd: Check event before enable to avoid GPF
Pankaj Raghav p.raghav@samsung.com scripts/faddr2line: Fix "Argument list too long" error
Joanne Koong joannelkoong@gmail.com iomap: account for unaligned end offsets when truncating read range
Joanne Koong joannelkoong@gmail.com iomap: adjust read range correctly for non-block-aligned positions
Al Viro viro@zeniv.linux.org.uk shmem: fix recovery on rename failures
Filipe Manana fdmanana@suse.com btrfs: fix changeset leak on mmap write after failure to reserve metadata
Deepanshu Kartikey kartikey406@gmail.com btrfs: fix memory leak of fs_devices in degraded seed device path
Shuran Liu electronlsr@gmail.com bpf: Fix verifier assumptions of bpf_d_path's output buffer
T.J. Mercier tjmercier@google.com bpf: Fix truncated dmabuf iterator reads
Ondrej Mosnacek omosnace@redhat.com bpf, arm64: Do not audit capability check in do_jit()
Qu Wenruo wqu@suse.com btrfs: fix a potential path leak in print_data_reloc_error()
Filipe Manana fdmanana@suse.com btrfs: do not skip logging new dentries when logging a new name
-------------
Diffstat:
.../devicetree/bindings/mmc/aspeed,sdhci.yaml | 2 +- .../devicetree/bindings/pci/qcom,pcie-sc7280.yaml | 5 + .../bindings/pci/qcom,pcie-sc8280xp.yaml | 3 + .../devicetree/bindings/pci/qcom,pcie-sm8150.yaml | 5 + .../devicetree/bindings/pci/qcom,pcie-sm8250.yaml | 5 + .../devicetree/bindings/pci/qcom,pcie-sm8350.yaml | 5 + .../devicetree/bindings/pci/qcom,pcie-sm8450.yaml | 5 + .../devicetree/bindings/pci/qcom,pcie-sm8550.yaml | 5 + .../devicetree/bindings/slimbus/slimbus.yaml | 16 +- Makefile | 4 +- arch/arm/boot/dts/microchip/sama5d2.dtsi | 10 +- arch/arm/boot/dts/microchip/sama7d65.dtsi | 6 +- arch/arm/boot/dts/microchip/sama7g5.dtsi | 4 +- arch/arm64/boot/dts/mediatek/Makefile | 2 + arch/arm64/crypto/ghash-ce-glue.c | 2 +- arch/arm64/kernel/process.c | 1 + arch/arm64/net/bpf_jit_comp.c | 2 +- arch/mips/kernel/ftrace.c | 25 ++- arch/mips/sgi-ip22/ip22-gio.c | 3 +- arch/powerpc/boot/addnote.c | 7 +- arch/powerpc/include/asm/crash_reserve.h | 8 + arch/powerpc/kernel/btext.c | 3 +- arch/powerpc/kexec/core_64.c | 19 ++ arch/riscv/crypto/Kconfig | 12 +- arch/s390/include/uapi/asm/ipl.h | 1 + arch/s390/kernel/ipl.c | 48 +++-- arch/um/kernel/process.c | 4 +- arch/um/kernel/um_arch.c | 2 - arch/x86/events/amd/core.c | 7 +- arch/x86/include/asm/bug.h | 2 +- arch/x86/include/asm/irq_remapping.h | 7 + arch/x86/include/asm/kvm_host.h | 1 - arch/x86/include/asm/ptrace.h | 20 +- arch/x86/kernel/cpu/mce/threshold.c | 3 +- arch/x86/kernel/cpu/microcode/core.c | 2 +- arch/x86/kernel/fpu/xstate.c | 4 +- arch/x86/kernel/irq.c | 23 +++ arch/x86/kvm/cpuid.c | 11 +- arch/x86/kvm/lapic.c | 32 +++- arch/x86/kvm/svm/nested.c | 6 +- arch/x86/kvm/svm/svm.c | 54 ++++-- arch/x86/kvm/svm/svm.h | 7 +- arch/x86/kvm/vmx/nested.c | 3 +- arch/x86/kvm/vmx/tdx.c | 56 +++--- arch/x86/kvm/vmx/tdx.h | 1 - arch/x86/kvm/x86.c | 34 ++-- arch/x86/xen/enlighten_pv.c | 2 +- block/bfq-iosched.c | 2 +- block/blk-mq-sched.c | 117 +++++++++--- block/blk-mq-sched.h | 40 +++- block/blk-mq.c | 50 ++--- block/blk-sysfs.c | 28 +-- block/blk-wbt.c | 20 +- block/blk-wbt.h | 5 + block/blk-zoned.c | 42 +++-- block/blk.h | 7 +- block/elevator.c | 84 ++++----- block/elevator.h | 27 ++- block/genhd.c | 2 +- crypto/af_alg.c | 5 +- crypto/algif_hash.c | 3 +- crypto/algif_rng.c | 3 +- crypto/scatterwalk.c | 95 ++++++++-- drivers/acpi/acpi_pcc.c | 2 +- drivers/acpi/acpica/nswalk.c | 9 +- drivers/acpi/cppc_acpi.c | 3 +- drivers/acpi/fan.h | 33 ++++ drivers/acpi/fan_hwmon.c | 10 +- drivers/acpi/property.c | 8 +- drivers/amba/tegra-ahb.c | 1 + drivers/android/binder/process.rs | 8 +- drivers/base/power/runtime.c | 22 ++- drivers/block/floppy.c | 2 +- drivers/block/rnbd/rnbd-clt.c | 13 +- drivers/block/rnbd/rnbd-clt.h | 2 +- drivers/block/ublk_drv.c | 139 +++++++++----- drivers/block/zloop.c | 12 +- drivers/bluetooth/btusb.c | 11 ++ drivers/bus/ti-sysc.c | 11 +- drivers/char/applicom.c | 5 +- drivers/char/ipmi/ipmi_msghandler.c | 20 +- drivers/char/tpm/tpm-chip.c | 1 - drivers/char/tpm/tpm1-cmd.c | 5 - drivers/char/tpm/tpm2-cmd.c | 34 +++- drivers/char/tpm/tpm2-sessions.c | 194 ++++++++++++------- drivers/clk/keystone/syscon-clk.c | 2 +- drivers/clk/mvebu/cp110-system-controller.c | 20 ++ drivers/clk/qcom/dispcc-sm7150.c | 2 +- drivers/cpufreq/cpufreq-dt-platdev.c | 1 + drivers/cpufreq/cpufreq-nforce2.c | 3 + drivers/cpufreq/s5pv210-cpufreq.c | 6 +- drivers/cpuidle/governors/menu.c | 9 +- drivers/cpuidle/governors/teo.c | 7 +- drivers/crypto/caam/caamrng.c | 4 +- drivers/crypto/ccp/sp-pci.c | 19 ++ drivers/firmware/efi/efi.c | 3 + drivers/firmware/imx/imx-scu-irq.c | 4 +- drivers/gpio/gpio-loongson-64bit.c | 10 +- drivers/gpio/gpio-regmap.c | 2 +- drivers/gpio/gpiolib-acpi-quirks.c | 22 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +- .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 59 +++--- drivers/gpu/drm/amd/display/dc/core/dc_surface.c | 2 +- .../amd/display/dc/resource/dcn35/dcn35_resource.c | 8 +- .../display/dc/resource/dcn351/dcn351_resource.c | 8 +- drivers/gpu/drm/drm_displayid.c | 19 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 18 +- drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 4 +- drivers/gpu/drm/panel/panel-sony-td4353-jdi.c | 2 + drivers/gpu/drm/tests/drm_atomic_state_test.c | 40 +++- drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c | 143 ++++++++++++++ drivers/gpu/drm/xe/xe_device.c | 2 +- drivers/gpu/drm/xe/xe_exec.c | 3 +- drivers/gpu/drm/xe/xe_gt.c | 7 +- drivers/gpu/drm/xe/xe_gt_freq.c | 4 +- drivers/gpu/drm/xe/xe_gt_idle.c | 8 + drivers/gpu/drm/xe/xe_heci_gsc.c | 4 +- drivers/gpu/drm/xe/xe_oa.c | 10 +- drivers/gpu/drm/xe/xe_svm.h | 2 +- drivers/gpu/drm/xe/xe_vm.c | 3 + drivers/gpu/drm/xe/xe_wa.c | 8 - drivers/gpu/drm/xe/xe_wa_oob.rules | 1 + drivers/hid/hid-input.c | 18 +- drivers/hwmon/dell-smm-hwmon.c | 9 + drivers/hwmon/emc2305.c | 8 +- drivers/hwmon/ibmpex.c | 9 +- drivers/hwmon/ltc4282.c | 9 +- drivers/hwmon/max16065.c | 7 +- drivers/hwmon/max6697.c | 2 +- drivers/hwmon/tmp401.c | 2 +- drivers/hwmon/w83791d.c | 19 +- drivers/hwmon/w83l786ng.c | 26 ++- drivers/hwtracing/intel_th/core.c | 20 +- drivers/i2c/busses/i2c-amd-mp2-pci.c | 5 +- drivers/i2c/busses/i2c-designware-core.h | 1 + drivers/i2c/busses/i2c-designware-master.c | 7 + drivers/iio/adc/ti_am335x_adc.c | 2 +- drivers/input/joystick/xpad.c | 5 + drivers/input/keyboard/lkkbd.c | 5 +- drivers/input/mouse/alps.c | 1 + drivers/input/serio/i8042-acpipnpio.h | 7 + drivers/input/touchscreen/apple_z2.c | 4 + drivers/input/touchscreen/ti_am335x_tsc.c | 2 +- drivers/interconnect/qcom/sdx75.c | 26 --- drivers/interconnect/qcom/sdx75.h | 2 - drivers/iommu/amd/init.c | 23 ++- drivers/iommu/intel/irq_remapping.c | 8 +- drivers/iommu/iommufd/selftest.c | 8 +- drivers/iommu/mtk_iommu.c | 25 ++- drivers/md/dm-pcache/cache.c | 8 +- drivers/md/dm-pcache/cache_segment.c | 8 +- drivers/media/platform/qcom/iris/iris_vb2.c | 8 +- drivers/media/test-drivers/vidtv/vidtv_channel.c | 3 + drivers/media/usb/dvb-usb/dtv5100.c | 5 + drivers/media/usb/pvrusb2/pvrusb2-hdw.c | 2 +- drivers/misc/mei/Kconfig | 2 +- drivers/misc/mei/main.c | 1 + drivers/mmc/host/Kconfig | 4 +- drivers/mmc/host/sdhci-msm.c | 27 +-- drivers/mmc/host/sdhci-of-arasan.c | 2 +- drivers/net/can/usb/gs_usb.c | 2 +- drivers/net/ethernet/broadcom/b44.c | 3 + drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 3 +- drivers/net/ethernet/freescale/enetc/enetc.c | 3 +- drivers/net/ethernet/freescale/fec_main.c | 7 +- .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 3 + .../net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 4 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 +- drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 5 + .../ethernet/mellanox/mlx5/core/diag/fw_tracer.c | 97 ++++++++-- .../ethernet/mellanox/mlx5/core/diag/fw_tracer.h | 1 + drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +- .../ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 8 +- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 1 - .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 6 + drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 48 ++++- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h | 1 + drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 1 + .../net/ethernet/mellanox/mlx5/core/lag/mpesw.c | 11 +- drivers/net/ethernet/mellanox/mlx5/core/main.c | 1 + drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c | 2 + .../net/ethernet/mellanox/mlxsw/spectrum_router.c | 27 +-- drivers/net/ethernet/realtek/r8169_main.c | 5 +- drivers/net/ethernet/ti/Kconfig | 3 +- drivers/net/ipvlan/ipvlan_core.c | 3 + drivers/net/phy/marvell-88q2xxx.c | 2 +- drivers/net/phy/realtek/realtek_main.c | 114 ++++++----- .../net/wireless/broadcom/brcm80211/brcmfmac/dmi.c | 14 ++ drivers/net/wireless/mediatek/mt76/eeprom.c | 37 ++-- drivers/net/wireless/mediatek/mt76/mt7921/mcu.c | 2 +- drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 2 +- drivers/net/wireless/realtek/rtl8xxxu/core.c | 7 +- drivers/nfc/pn533/usb.c | 2 +- drivers/nvme/host/fabrics.c | 2 +- drivers/nvme/host/fc.c | 6 +- drivers/of/fdt.c | 2 +- drivers/parisc/gsc.c | 4 +- drivers/perf/arm_cspmu/arm_cspmu.c | 4 +- drivers/phy/broadcom/phy-bcm63xx-usbh.c | 6 +- drivers/phy/samsung/phy-exynos5-usbdrd.c | 2 +- drivers/pinctrl/renesas/pinctrl-rzg2l.c | 71 ++++--- drivers/platform/chrome/cros_ec_ishtp.c | 1 + drivers/platform/x86/intel/chtwc_int33fe.c | 29 ++- drivers/platform/x86/intel/hid.c | 12 ++ drivers/platform/x86/lenovo/wmi-gamezone.c | 17 +- drivers/pwm/pwm-rzg2l-gpt.c | 15 +- drivers/rpmsg/qcom_glink_native.c | 8 + drivers/s390/block/dasd_eckd.c | 8 + drivers/scsi/aic94xx/aic94xx_init.c | 3 + drivers/scsi/lpfc/lpfc_els.c | 36 +++- drivers/scsi/lpfc/lpfc_hbadisc.c | 4 +- drivers/scsi/mpi3mr/mpi/mpi30_ioc.h | 1 + drivers/scsi/mpi3mr/mpi3mr_fw.c | 2 + drivers/scsi/qla2xxx/qla_def.h | 1 - drivers/scsi/qla2xxx/qla_gbl.h | 2 +- drivers/scsi/qla2xxx/qla_isr.c | 32 +--- drivers/scsi/qla2xxx/qla_mbx.c | 2 + drivers/scsi/qla2xxx/qla_mid.c | 4 +- drivers/scsi/qla2xxx/qla_os.c | 14 +- drivers/scsi/scsi_debug.c | 2 +- drivers/scsi/smartpqi/smartpqi_init.c | 4 + drivers/soc/amlogic/meson-canvas.c | 5 +- drivers/soc/apple/mailbox.c | 15 +- drivers/soc/qcom/ocmem.c | 2 +- drivers/soc/qcom/qcom-pbs.c | 2 + drivers/soc/samsung/exynos-pmu.c | 2 + drivers/soc/tegra/fuse/fuse-tegra.c | 2 - drivers/spi/Kconfig | 19 +- drivers/spi/Makefile | 2 +- drivers/spi/spi-cadence-quadspi.c | 4 +- drivers/spi/spi-fsl-spi.c | 2 +- drivers/spi/{spi-microchip-core.c => spi-mpfs.c} | 208 +++++++++++---------- drivers/target/target_core_transport.c | 1 + drivers/tty/serial/serial_base_bus.c | 11 +- drivers/tty/serial/sh-sci.c | 2 +- drivers/tty/serial/sprd_serial.c | 6 + drivers/tty/serial/xilinx_uartps.c | 14 +- drivers/ufs/core/ufshcd.c | 5 +- drivers/ufs/host/ufs-mediatek.c | 5 + drivers/usb/dwc3/dwc3-of-simple.c | 7 +- drivers/usb/dwc3/gadget.c | 2 +- drivers/usb/dwc3/host.c | 2 +- drivers/usb/gadget/function/f_fs.c | 53 +++++- drivers/usb/gadget/udc/lpc32xx_udc.c | 21 ++- drivers/usb/host/ohci-nxp.c | 2 + drivers/usb/host/xhci-dbgtty.c | 2 +- drivers/usb/host/xhci-hub.c | 2 +- drivers/usb/host/xhci-ring.c | 27 +-- drivers/usb/phy/phy-fsl-usb.c | 1 + drivers/usb/phy/phy-isp1301.c | 7 +- drivers/usb/renesas_usbhs/pipe.c | 2 + drivers/usb/storage/unusual_uas.h | 2 +- drivers/usb/typec/altmodes/displayport.c | 8 +- drivers/usb/typec/ucsi/Kconfig | 1 + drivers/usb/typec/ucsi/ucsi.c | 6 + drivers/usb/usbip/vhci_hcd.c | 6 +- drivers/vdpa/octeon_ep/octep_vdpa_main.c | 1 + drivers/vfio/device_cdev.c | 2 +- drivers/vhost/vsock.c | 15 +- drivers/watchdog/via_wdt.c | 1 + fs/btrfs/file.c | 3 +- fs/btrfs/inode.c | 1 + fs/btrfs/ioctl.c | 4 +- fs/btrfs/scrub.c | 5 + fs/btrfs/tree-log.c | 46 ++++- fs/btrfs/volumes.c | 1 + fs/exfat/file.c | 5 + fs/exfat/super.c | 19 +- fs/ext4/ialloc.c | 1 - fs/ext4/inode.c | 1 - fs/ext4/ioctl.c | 4 + fs/ext4/mballoc.c | 2 + fs/ext4/orphan.c | 4 +- fs/ext4/super.c | 6 +- fs/ext4/xattr.c | 6 +- fs/f2fs/compress.c | 5 +- fs/f2fs/data.c | 17 ++ fs/f2fs/extent_cache.c | 5 +- fs/f2fs/f2fs.h | 13 +- fs/f2fs/file.c | 12 +- fs/f2fs/gc.c | 2 +- fs/f2fs/namei.c | 6 +- fs/f2fs/recovery.c | 20 +- fs/f2fs/segment.c | 9 +- fs/f2fs/segment.h | 8 +- fs/f2fs/super.c | 116 +++++------- fs/f2fs/xattr.c | 32 ++-- fs/f2fs/xattr.h | 10 +- fs/fuse/dev.c | 2 +- fs/fuse/dev_uring.c | 6 +- fs/fuse/file.c | 37 +++- fs/fuse/fuse_dev_i.h | 1 + fs/gfs2/glops.c | 3 +- fs/gfs2/lops.c | 2 +- fs/gfs2/quota.c | 2 +- fs/gfs2/super.c | 4 +- fs/hfsplus/bnode.c | 4 +- fs/hfsplus/dir.c | 7 +- fs/hfsplus/hfsplus_fs.h | 2 + fs/hfsplus/inode.c | 41 +++- fs/hfsplus/super.c | 87 +++++---- fs/iomap/buffered-io.c | 41 +++- fs/jbd2/journal.c | 20 +- fs/jbd2/transaction.c | 2 +- fs/libfs.c | 50 +++-- fs/nfsd/blocklayout.c | 3 +- fs/nfsd/export.c | 2 +- fs/nfsd/nfs4xdr.c | 5 + fs/nfsd/nfsd.h | 8 +- fs/nfsd/nfssvc.c | 5 +- fs/nfsd/vfs.h | 3 +- fs/notify/fsnotify.c | 9 +- fs/ntfs3/file.c | 14 +- fs/ntfs3/ntfs_fs.h | 9 +- fs/ntfs3/run.c | 6 +- fs/ntfs3/super.c | 5 + fs/ocfs2/suballoc.c | 10 + fs/smb/client/fs_context.c | 2 + fs/smb/server/mgmt/tree_connect.c | 18 +- fs/smb/server/mgmt/tree_connect.h | 1 - fs/smb/server/mgmt/user_session.c | 4 +- fs/smb/server/smb2pdu.c | 16 +- fs/smb/server/vfs.c | 5 +- fs/smb/server/vfs_cache.c | 88 ++++++--- fs/super.c | 2 +- fs/xfs/libxfs/xfs_sb.c | 15 ++ fs/xfs/scrub/attr_repair.c | 2 +- fs/xfs/xfs_attr_item.c | 2 +- fs/xfs/xfs_buf_item.c | 1 + fs/xfs/xfs_qm.c | 5 +- fs/xfs/xfs_rtalloc.c | 14 +- include/crypto/scatterwalk.h | 52 +++--- include/dt-bindings/clock/qcom,mmcc-sdm660.h | 1 + include/linux/blkdev.h | 2 +- include/linux/crash_reserve.h | 6 + include/linux/dma-mapping.h | 2 +- include/linux/fs.h | 2 +- include/linux/jbd2.h | 6 + include/linux/ksm.h | 4 +- include/linux/platform_data/x86/intel_pmc_ipc.h | 4 +- include/linux/reset.h | 1 + include/linux/slab.h | 7 + include/linux/tpm.h | 21 ++- include/media/v4l2-mem2mem.h | 3 +- include/net/inet_frag.h | 18 +- include/net/ipv6_frag.h | 9 +- include/sound/soc.h | 1 + include/trace/events/tlb.h | 5 +- include/uapi/drm/xe_drm.h | 1 + include/uapi/linux/mptcp.h | 1 + io_uring/io_uring.c | 3 + io_uring/openclose.c | 2 +- io_uring/poll.c | 9 +- io_uring/rsrc.c | 13 +- kernel/bpf/dmabuf_iter.c | 56 +++++- kernel/cgroup/rstat.c | 13 +- kernel/crash_reserve.c | 3 + kernel/kallsyms.c | 5 +- kernel/livepatch/core.c | 8 +- kernel/printk/internal.h | 8 +- kernel/printk/nbcon.c | 9 +- kernel/printk/printk.c | 83 ++++++-- kernel/sched/cpudeadline.c | 34 +--- kernel/sched/cpudeadline.h | 4 +- kernel/sched/deadline.c | 8 +- kernel/sched/ext.c | 62 ++++-- kernel/sched/fair.c | 19 +- kernel/scs.c | 2 +- kernel/trace/bpf_trace.c | 2 +- kernel/trace/trace_events.c | 2 + kernel/trace/trace_events_synth.c | 1 - lib/crypto/Kconfig | 9 +- lib/crypto/riscv/.gitignore | 2 + lib/crypto/riscv/chacha-riscv64-zvkb.S | 5 +- lib/crypto/x86/blake2s-core.S | 4 +- mm/huge_memory.c | 2 +- mm/ksm.c | 20 +- mm/shmem.c | 24 ++- mm/slab.h | 1 + mm/slab_common.c | 52 ++++-- mm/slub.c | 59 +++--- net/caif/cffrml.c | 9 +- net/can/j1939/socket.c | 6 + net/ceph/osdmap.c | 116 ++++++------ net/ethtool/ioctl.c | 30 ++- net/handshake/request.c | 8 +- net/hsr/hsr_forward.c | 2 + net/ipv4/inet_fragment.c | 55 +++++- net/ipv4/ip_fragment.c | 22 +-- net/mptcp/pm_netlink.c | 3 +- net/mptcp/protocol.c | 22 ++- net/netfilter/ipvs/ip_vs_xmit.c | 3 + net/netfilter/nf_conncount.c | 25 +-- net/netfilter/nf_nat_core.c | 14 +- net/netfilter/nf_tables_api.c | 11 -- net/netrom/nr_out.c | 4 +- net/openvswitch/flow_netlink.c | 13 +- net/sched/sch_ets.c | 6 +- net/sunrpc/auth_gss/svcauth_gss.c | 3 +- net/sunrpc/xprtrdma/svc_rdma_rw.c | 7 +- net/wireless/core.c | 1 + net/wireless/core.h | 1 + net/wireless/mlme.c | 19 ++ net/wireless/util.c | 23 +-- rust/helpers/dma.c | 21 +++ rust/kernel/devres.rs | 18 +- rust/kernel/drm/gem/mod.rs | 2 +- rust/kernel/io.rs | 26 ++- rust/kernel/io/resource.rs | 13 +- samples/rust/rust_driver_pci.rs | 2 +- scripts/Makefile.modinst | 2 +- scripts/faddr2line | 13 +- scripts/lib/kdoc/kdoc_parser.py | 7 +- security/keys/trusted-keys/trusted_tpm2.c | 35 +++- sound/hda/codecs/realtek/alc269.c | 11 +- sound/pcmcia/pdaudiocf/pdaudiocf.c | 8 +- sound/pcmcia/vx/vxpocket.c | 8 +- sound/soc/codecs/ak4458.c | 4 - sound/soc/fsl/fsl_sai.c | 10 +- sound/soc/sdca/sdca_asoc.c | 34 +--- sound/soc/soc-ops.c | 84 +++++++-- sound/soc/sof/ipc4-topology.c | 24 ++- sound/usb/mixer_us16x08.c | 20 +- tools/lib/perf/cpumap.c | 10 +- tools/testing/ktest/config-bisect.pl | 4 +- tools/testing/nvdimm/test/nfit.c | 7 +- tools/testing/selftests/iommu/iommufd.c | 8 +- tools/testing/selftests/kvm/Makefile | 2 +- tools/testing/selftests/kvm/rseq_test.c | 1 + tools/testing/selftests/net/af_unix/Makefile | 7 +- tools/testing/selftests/net/lib/ksft.h | 6 +- tools/testing/selftests/net/mptcp/pm_netlink.sh | 4 + tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 11 ++ .../selftests/net/netfilter/conntrack_clash.sh | 9 +- .../net/netfilter/conntrack_reverse_clash.c | 13 +- .../net/netfilter/conntrack_reverse_clash.sh | 2 + .../packetdrill/conntrack_syn_challenge_ack.pkt | 2 +- tools/testing/selftests/net/tfo.c | 3 +- tools/testing/selftests/ublk/kublk.h | 12 +- tools/tracing/rtla/src/timerlat.bpf.c | 3 + virt/kvm/kvm_main.c | 4 +- 441 files changed, 4077 insertions(+), 1949 deletions(-)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 5630f7557de61264ccb4f031d4734a1a97eaed16 ]
When we are logging a directory and the log context indicates that we are logging a new name for some other file (that is or was inside that directory), we skip logging the inodes for new dentries in the directory.
This is ok most of the time, but if after the rename or link operation that triggered the logging of that directory, we have an explicit fsync of that directory without the directory inode being evicted and reloaded, we end up never logging the inodes for the new dentries that we found during the new name logging, as the next directory fsync will only process dentries that were added after the last time we logged the directory (we are doing an incremental directory logging).
So make sure we always log new dentries for a directory even if we are in a context of logging a new name.
We started skipping logging inodes for new dentries as of commit c48792c6ee7a ("btrfs: do not log new dentries when logging that a new name exists") and it was fine back then, because when logging a directory we always iterated over all the directory entries (for leaves changed in the current transaction) so a subsequent fsync would always log anything that was previously skipped while logging a directory when logging a new name (with btrfs_log_new_name()). But later support for incrementally logging a directory was added in commit dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory"), to avoid checking all dir items every time we log a directory, so the check to skip dentry logging added in the first commit should have been removed when the incremental support for logging a directory was added.
A test case for fstests will follow soon.
Reported-by: Vyacheslav Kovalevsky slava.kovalevskiy.2014@gmail.com Link: https://lore.kernel.org/linux-btrfs/84c4e713-85d6-42b9-8dcf-0722ed26cb05@gma... Fixes: dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory") Reviewed-by: Boris Burkov boris@bur.io Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/tree-log.c | 8 -------- 1 file changed, 8 deletions(-)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 30f3c3b849c14..f55d886debe2f 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -5872,14 +5872,6 @@ static int log_new_dir_dentries(struct btrfs_trans_handle *trans, struct btrfs_inode *curr_inode = start_inode; int ret = 0;
- /* - * If we are logging a new name, as part of a link or rename operation, - * don't bother logging new dentries, as we just want to log the names - * of an inode and that any new parents exist. - */ - if (ctx->logging_new_name) - return 0; - path = btrfs_alloc_path(); if (!path) return -ENOMEM;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
[ Upstream commit 313ef70a9f0f637a09d9ef45222f5bdcf30a354b ]
Inside print_data_reloc_error(), if extent_from_logical() failed we return immediately.
However there are the following cases where extent_from_logical() can return error but still holds a path:
- btrfs_search_slot() returned 0
- No backref item found in extent tree
- No flags_ret provided This is not possible in this call site though.
So for the above two cases, we can return without releasing the path, causing extent buffer leaks.
Fixes: b9a9a85059cd ("btrfs: output affected files when relocation fails") Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/inode.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 6282911e536f0..51401d586a7b6 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -256,6 +256,7 @@ static void print_data_reloc_error(const struct btrfs_inode *inode, u64 file_off if (ret < 0) { btrfs_err_rl(fs_info, "failed to lookup extent item for logical %llu: %d", logical, ret); + btrfs_release_path(&path); return; } eb = path.nodes[0];
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ondrej Mosnacek omosnace@redhat.com
[ Upstream commit 189e5deb944a6f9c7992355d60bffd8ec2e54a9c ]
Analogically to the x86 commit 881a9c9cb785 ("bpf: Do not audit capability check in do_jit()"), change the capable() call to ns_capable_noaudit() in order to avoid spurious SELinux denials in audit log.
The commit log from that commit applies here as well: """ The failure of this check only results in a security mitigation being applied, slightly affecting performance of the compiled BPF program. It doesn't result in a failed syscall, an thus auditing a failed LSM permission check for it is unwanted. For example with SELinux, it causes a denial to be reported for confined processes running as root, which tends to be flagged as a problem to be fixed in the policy. Yet dontauditing or allowing CAP_SYS_ADMIN to the domain may not be desirable, as it would allow/silence also other checks - either going against the principle of least privilege or making debugging potentially harder.
Fix it by changing it from capable() to ns_capable_noaudit(), which instructs the LSMs to not audit the resulting denials. """
Fixes: f300769ead03 ("arm64: bpf: Only mitigate cBPF programs loaded by unprivileged users") Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Link: https://lore.kernel.org/r/20251204125916.441021-1-omosnace@redhat.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/net/bpf_jit_comp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 0c9a50a1e73e7..0dfefeedfe56c 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -1004,7 +1004,7 @@ static void __maybe_unused build_bhb_mitigation(struct jit_ctx *ctx) arm64_get_spectre_v2_state() == SPECTRE_VULNERABLE) return;
- if (capable(CAP_SYS_ADMIN)) + if (ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)) return;
if (supports_clearbhb(SCOPE_SYSTEM)) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: T.J. Mercier tjmercier@google.com
[ Upstream commit 234483565dbb2b264fdd165927c89fbf3ecf4733 ]
If there is a large number (hundreds) of dmabufs allocated, the text output generated from dmabuf_iter_seq_show can exceed common user buffer sizes (e.g. PAGE_SIZE) necessitating multiple start/stop cycles to iterate through all dmabufs. However the dmabuf iterator currently returns NULL in dmabuf_iter_seq_start for all non-zero pos values, which results in the truncation of the output before all dmabufs are handled.
After dma_buf_iter_begin / dma_buf_iter_next, the refcount of the buffer is elevated so that the BPF iterator program can run without holding any locks. When a stop occurs, instead of immediately dropping the reference on the buffer, stash a pointer to the buffer in seq->priv until either start is called or the iterator is released. This also enables the resumption of iteration without first walking through the list of dmabufs based on the pos value.
Fixes: 76ea95534995 ("bpf: Add dmabuf iterator") Signed-off-by: T.J. Mercier tjmercier@google.com Link: https://lore.kernel.org/r/20251204000348.1413593-1-tjmercier@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/dmabuf_iter.c | 56 +++++++++++++++++++++++++++++++++++----- 1 file changed, 49 insertions(+), 7 deletions(-)
diff --git a/kernel/bpf/dmabuf_iter.c b/kernel/bpf/dmabuf_iter.c index 4dd7ef7c145ca..cd500248abd95 100644 --- a/kernel/bpf/dmabuf_iter.c +++ b/kernel/bpf/dmabuf_iter.c @@ -6,10 +6,33 @@ #include <linux/kernel.h> #include <linux/seq_file.h>
+struct dmabuf_iter_priv { + /* + * If this pointer is non-NULL, the buffer's refcount is elevated to + * prevent destruction between stop/start. If reading is not resumed and + * start is never called again, then dmabuf_iter_seq_fini drops the + * reference when the iterator is released. + */ + struct dma_buf *dmabuf; +}; + static void *dmabuf_iter_seq_start(struct seq_file *seq, loff_t *pos) { - if (*pos) - return NULL; + struct dmabuf_iter_priv *p = seq->private; + + if (*pos) { + struct dma_buf *dmabuf = p->dmabuf; + + if (!dmabuf) + return NULL; + + /* + * Always resume from where we stopped, regardless of the value + * of pos. + */ + p->dmabuf = NULL; + return dmabuf; + }
return dma_buf_iter_begin(); } @@ -54,8 +77,11 @@ static void dmabuf_iter_seq_stop(struct seq_file *seq, void *v) { struct dma_buf *dmabuf = v;
- if (dmabuf) - dma_buf_put(dmabuf); + if (dmabuf) { + struct dmabuf_iter_priv *p = seq->private; + + p->dmabuf = dmabuf; + } }
static const struct seq_operations dmabuf_iter_seq_ops = { @@ -71,11 +97,27 @@ static void bpf_iter_dmabuf_show_fdinfo(const struct bpf_iter_aux_info *aux, seq_puts(seq, "dmabuf iter\n"); }
+static int dmabuf_iter_seq_init(void *priv, struct bpf_iter_aux_info *aux) +{ + struct dmabuf_iter_priv *p = (struct dmabuf_iter_priv *)priv; + + p->dmabuf = NULL; + return 0; +} + +static void dmabuf_iter_seq_fini(void *priv) +{ + struct dmabuf_iter_priv *p = (struct dmabuf_iter_priv *)priv; + + if (p->dmabuf) + dma_buf_put(p->dmabuf); +} + static const struct bpf_iter_seq_info dmabuf_iter_seq_info = { .seq_ops = &dmabuf_iter_seq_ops, - .init_seq_private = NULL, - .fini_seq_private = NULL, - .seq_priv_size = 0, + .init_seq_private = dmabuf_iter_seq_init, + .fini_seq_private = dmabuf_iter_seq_fini, + .seq_priv_size = sizeof(struct dmabuf_iter_priv), };
static struct bpf_iter_reg bpf_dmabuf_reg_info = {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuran Liu electronlsr@gmail.com
[ Upstream commit ac44dcc788b950606793e8f9690c30925f59df02 ]
Commit 37cce22dbd51 ("bpf: verifier: Refactor helper access type tracking") started distinguishing read vs write accesses performed by helpers.
The second argument of bpf_d_path() is a pointer to a buffer that the helper fills with the resulting path. However, its prototype currently uses ARG_PTR_TO_MEM without MEM_WRITE.
Before 37cce22dbd51, helper accesses were conservatively treated as potential writes, so this mismatch did not cause issues. Since that commit, the verifier may incorrectly assume that the buffer contents are unchanged across the helper call and base its optimizations on this wrong assumption. This can lead to misbehaviour in BPF programs that read back the buffer, such as prefix comparisons on the returned path.
Fix this by marking the second argument of bpf_d_path() as ARG_PTR_TO_MEM | MEM_WRITE so that the verifier correctly models the write to the caller-provided buffer.
Fixes: 37cce22dbd51 ("bpf: verifier: Refactor helper access type tracking") Co-developed-by: Zesen Liu ftyg@live.com Signed-off-by: Zesen Liu ftyg@live.com Co-developed-by: Peili Gao gplhust955@gmail.com Signed-off-by: Peili Gao gplhust955@gmail.com Co-developed-by: Haoran Ni haoran.ni.cs@gmail.com Signed-off-by: Haoran Ni haoran.ni.cs@gmail.com Signed-off-by: Shuran Liu electronlsr@gmail.com Reviewed-by: Matt Bobrowski mattbobrowski@google.com Link: https://lore.kernel.org/r/20251206141210.3148-2-electronlsr@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/bpf_trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 4f87c16d915a0..49e0bdaa7a1bf 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -965,7 +965,7 @@ static const struct bpf_func_proto bpf_d_path_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_BTF_ID, .arg1_btf_id = &bpf_d_path_btf_ids[0], - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_WRITE, .arg3_type = ARG_CONST_SIZE_OR_ZERO, .allowed = bpf_d_path_allowed, };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Deepanshu Kartikey kartikey406@gmail.com
[ Upstream commit b57f2ddd28737db6ff0e9da8467f0ab9d707e997 ]
In open_seed_devices(), when find_fsid() fails and we're in DEGRADED mode, a new fs_devices is allocated via alloc_fs_devices() but is never added to the seed_list before returning. This contrasts with the normal path where fs_devices is properly added via list_add().
If any error occurs later in read_one_dev() or btrfs_read_chunk_tree(), the cleanup code iterates seed_list to free seed devices, but this orphaned fs_devices is never found and never freed, causing a memory leak. Any devices allocated via add_missing_dev() and attached to this fs_devices are also leaked.
Fix this by adding the newly allocated fs_devices to seed_list in the degraded path, consistent with the normal path.
Fixes: 5f37583569442 ("Btrfs: move the missing device to its own fs device list") Reported-by: syzbot+eadd98df8bceb15d7fed@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=eadd98df8bceb15d7fed Tested-by: syzbot+eadd98df8bceb15d7fed@syzkaller.appspotmail.com Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Deepanshu Kartikey kartikey406@gmail.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 2bec544d8ba30..48e717c105c35 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -7178,6 +7178,7 @@ static struct btrfs_fs_devices *open_seed_devices(struct btrfs_fs_info *fs_info,
fs_devices->seeding = true; fs_devices->opened = 1; + list_add(&fs_devices->seed_list, &fs_info->fs_devices->seed_list); return fs_devices; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 37343524f000d2a64359867d7024a73233d3b438 ]
If the call to btrfs_delalloc_reserve_metadata() fails we jump to the 'out_noreserve' label and there we never free the extent_changeset allocated by the previous call to btrfs_check_data_free_space() (if qgroups are enabled). Fix this by calling extent_changeset_free() under the 'out_noreserve' label.
Fixes: 6599716de2d6 ("btrfs: fix -ENOSPC mmap write failure on NOCOW files/extents") Reported-by: syzbot+2f8aa76e6acc9fce6638@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/693a635a.a70a0220.33cd7b.0029.GAE@google... Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/file.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index fa82def46e395..0fee45d35a5f8 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2018,13 +2018,14 @@ static vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf) else btrfs_delalloc_release_space(inode, data_reserved, page_start, reserved_space, true); - extent_changeset_free(data_reserved); out_noreserve: if (only_release_metadata) btrfs_check_nocow_unlock(inode);
sb_end_pagefault(inode->vfs_inode.i_sb);
+ extent_changeset_free(data_reserved); + if (ret < 0) return vmf_error(ret);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit e1b4c6a58304fd490124cc2b454d80edc786665c ]
maple_tree insertions can fail if we are seriously short on memory; simple_offset_rename() does not recover well if it runs into that. The same goes for simple_offset_rename_exchange().
Moreover, shmem_whiteout() expects that if it succeeds, the caller will progress to d_move(), i.e. that shmem_rename2() won't fail past the successful call of shmem_whiteout().
Not hard to fix, fortunately - mtree_store() can't fail if the index we are trying to store into is already present in the tree as a singleton.
For simple_offset_rename_exchange() that's enough - we just need to be careful about the order of operations.
For simple_offset_rename() solution is to preinsert the target into the tree for new_dir; the rest can be done without any potentially failing operations.
That preinsertion has to be done in shmem_rename2() rather than in simple_offset_rename() itself - otherwise we'd need to deal with the possibility of failure after successful shmem_whiteout().
Fixes: a2e459555c5f ("shmem: stable directory offsets") Reviewed-by: Christian Brauner brauner@kernel.org Reviewed-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/libfs.c | 50 +++++++++++++++++++--------------------------- include/linux/fs.h | 2 +- mm/shmem.c | 18 ++++++++++++----- 3 files changed, 35 insertions(+), 35 deletions(-)
diff --git a/fs/libfs.c b/fs/libfs.c index ce8c496a6940a..6be233c787fd8 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -346,22 +346,22 @@ void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry) * User space expects the directory offset value of the replaced * (new) directory entry to be unchanged after a rename. * - * Returns zero on success, a negative errno value on failure. + * Caller must have grabbed a slot for new_dentry in the maple_tree + * associated with new_dir, even if dentry is negative. */ -int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry, - struct inode *new_dir, struct dentry *new_dentry) +void simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry, + struct inode *new_dir, struct dentry *new_dentry) { struct offset_ctx *old_ctx = old_dir->i_op->get_offset_ctx(old_dir); struct offset_ctx *new_ctx = new_dir->i_op->get_offset_ctx(new_dir); long new_offset = dentry2offset(new_dentry);
- simple_offset_remove(old_ctx, old_dentry); + if (WARN_ON(!new_offset)) + return;
- if (new_offset) { - offset_set(new_dentry, 0); - return simple_offset_replace(new_ctx, old_dentry, new_offset); - } - return simple_offset_add(new_ctx, old_dentry); + simple_offset_remove(old_ctx, old_dentry); + offset_set(new_dentry, 0); + WARN_ON(simple_offset_replace(new_ctx, old_dentry, new_offset)); }
/** @@ -388,31 +388,23 @@ int simple_offset_rename_exchange(struct inode *old_dir, long new_index = dentry2offset(new_dentry); int ret;
- simple_offset_remove(old_ctx, old_dentry); - simple_offset_remove(new_ctx, new_dentry); + if (WARN_ON(!old_index || !new_index)) + return -EINVAL;
- ret = simple_offset_replace(new_ctx, old_dentry, new_index); - if (ret) - goto out_restore; + ret = mtree_store(&new_ctx->mt, new_index, old_dentry, GFP_KERNEL); + if (WARN_ON(ret)) + return ret;
- ret = simple_offset_replace(old_ctx, new_dentry, old_index); - if (ret) { - simple_offset_remove(new_ctx, old_dentry); - goto out_restore; + ret = mtree_store(&old_ctx->mt, old_index, new_dentry, GFP_KERNEL); + if (WARN_ON(ret)) { + mtree_store(&new_ctx->mt, new_index, new_dentry, GFP_KERNEL); + return ret; }
- ret = simple_rename_exchange(old_dir, old_dentry, new_dir, new_dentry); - if (ret) { - simple_offset_remove(new_ctx, old_dentry); - simple_offset_remove(old_ctx, new_dentry); - goto out_restore; - } + offset_set(old_dentry, new_index); + offset_set(new_dentry, old_index); + simple_rename_exchange(old_dir, old_dentry, new_dir, new_dentry); return 0; - -out_restore: - (void)simple_offset_replace(old_ctx, old_dentry, old_index); - (void)simple_offset_replace(new_ctx, new_dentry, new_index); - return ret; }
/** diff --git a/include/linux/fs.h b/include/linux/fs.h index dd3b57cfadeeb..9b2230fb2332f 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3676,7 +3676,7 @@ struct offset_ctx { void simple_offset_init(struct offset_ctx *octx); int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry); void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry); -int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry, +void simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry); int simple_offset_rename_exchange(struct inode *old_dir, struct dentry *old_dentry, diff --git a/mm/shmem.c b/mm/shmem.c index 5a3f0f754dc0c..d09ccb2ba21e9 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -4039,6 +4039,7 @@ static int shmem_rename2(struct mnt_idmap *idmap, { struct inode *inode = d_inode(old_dentry); int they_are_dirs = S_ISDIR(inode->i_mode); + bool had_offset = false; int error;
if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT)) @@ -4051,16 +4052,23 @@ static int shmem_rename2(struct mnt_idmap *idmap, if (!simple_empty(new_dentry)) return -ENOTEMPTY;
+ error = simple_offset_add(shmem_get_offset_ctx(new_dir), new_dentry); + if (error == -EBUSY) + had_offset = true; + else if (unlikely(error)) + return error; + if (flags & RENAME_WHITEOUT) { error = shmem_whiteout(idmap, old_dir, old_dentry); - if (error) + if (error) { + if (!had_offset) + simple_offset_remove(shmem_get_offset_ctx(new_dir), + new_dentry); return error; + } }
- error = simple_offset_rename(old_dir, old_dentry, new_dir, new_dentry); - if (error) - return error; - + simple_offset_rename(old_dir, old_dentry, new_dir, new_dentry); if (d_really_is_positive(new_dentry)) { (void) shmem_unlink(new_dir, new_dentry); if (they_are_dirs) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joanne Koong joannelkoong@gmail.com
[ Upstream commit 7aa6bc3e8766990824f66ca76c19596ce10daf3e ]
iomap_adjust_read_range() assumes that the position and length passed in are block-aligned. This is not always the case however, as shown in the syzbot generated case for erofs. This causes too many bytes to be skipped for uptodate blocks, which results in returning the incorrect position and length to read in. If all the blocks are uptodate, this underflows length and returns a position beyond the folio.
Fix the calculation to also take into account the block offset when calculating how many bytes can be skipped for uptodate blocks.
Signed-off-by: Joanne Koong joannelkoong@gmail.com Tested-by: syzbot@syzkaller.appspotmail.com Reviewed-by: Brian Foster bfoster@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/iomap/buffered-io.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 8b847a1e27f13..1c95a0a7b302d 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -240,17 +240,24 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio, * to avoid reading in already uptodate ranges. */ if (ifs) { - unsigned int i; + unsigned int i, blocks_skipped;
/* move forward for each leading block marked uptodate */ - for (i = first; i <= last; i++) { + for (i = first; i <= last; i++) if (!ifs_block_is_uptodate(ifs, i)) break; - *pos += block_size; - poff += block_size; - plen -= block_size; - first++; + + blocks_skipped = i - first; + if (blocks_skipped) { + unsigned long block_offset = *pos & (block_size - 1); + unsigned bytes_skipped = + (blocks_skipped << block_bits) - block_offset; + + *pos += bytes_skipped; + poff += bytes_skipped; + plen -= bytes_skipped; } + first = i;
/* truncate len if we find any trailing uptodate block(s) */ while (++i <= last) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joanne Koong joannelkoong@gmail.com
[ Upstream commit 9d875e0eef8ec15b6b1da0cb9a0f8ed13efee89e ]
The end position to start truncating from may be at an offset into a block, which under the current logic would result in overtruncation.
Adjust the calculation to account for unaligned end offsets.
Signed-off-by: Joanne Koong joannelkoong@gmail.com Link: https://patch.msgid.link/20251111193658.3495942-3-joannelkoong@gmail.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/iomap/buffered-io.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 1c95a0a7b302d..c0fa6acf19375 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -217,6 +217,22 @@ static void ifs_free(struct folio *folio) kfree(ifs); }
+/* + * Calculate how many bytes to truncate based off the number of blocks to + * truncate and the end position to start truncating from. + */ +static size_t iomap_bytes_to_truncate(loff_t end_pos, unsigned block_bits, + unsigned blocks_truncated) +{ + unsigned block_size = 1 << block_bits; + unsigned block_offset = end_pos & (block_size - 1); + + if (!block_offset) + return blocks_truncated << block_bits; + + return ((blocks_truncated - 1) << block_bits) + block_offset; +} + /* * Calculate the range inside the folio that we actually need to read. */ @@ -262,7 +278,8 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio, /* truncate len if we find any trailing uptodate block(s) */ while (++i <= last) { if (ifs_block_is_uptodate(ifs, i)) { - plen -= (last - i + 1) * block_size; + plen -= iomap_bytes_to_truncate(*pos + plen, + block_bits, last - i + 1); last = i - 1; break; } @@ -278,7 +295,8 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio, unsigned end = offset_in_folio(folio, isize - 1) >> block_bits;
if (first <= end && last > end) - plen -= (last - end) * block_size; + plen -= iomap_bytes_to_truncate(*pos + plen, block_bits, + last - end); }
*offp = poff;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pankaj Raghav p.raghav@samsung.com
[ Upstream commit ff5c0466486ba8d07ab2700380e8fd6d5344b4e9 ]
The run_readelf() function reads the entire output of readelf into a single shell variable. For large object files with extensive debug information, the size of this variable can exceed the system's command-line argument length limit.
When this variable is subsequently passed to sed via `echo "${out}"`, it triggers an "Argument list too long" error, causing the script to fail.
Fix this by redirecting the output of readelf to a temporary file instead of a variable. The sed commands are then modified to read from this file, avoiding the argument length limitation entirely.
Signed-off-by: Pankaj Raghav p.raghav@samsung.com Signed-off-by: Josh Poimboeuf jpoimboe@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/faddr2line | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/scripts/faddr2line b/scripts/faddr2line index 1fa6beef9f978..477b6d2aa3179 100755 --- a/scripts/faddr2line +++ b/scripts/faddr2line @@ -107,14 +107,19 @@ find_dir_prefix() {
run_readelf() { local objfile=$1 - local out=$(${READELF} --file-header --section-headers --symbols --wide $objfile) + local tmpfile + tmpfile=$(mktemp) + + ${READELF} --file-header --section-headers --symbols --wide "$objfile" > "$tmpfile"
# This assumes that readelf first prints the file header, then the section headers, then the symbols. # Note: It seems that GNU readelf does not prefix section headers with the "There are X section headers" # line when multiple options are given, so let's also match with the "Section Headers:" line. - ELF_FILEHEADER=$(echo "${out}" | sed -n '/There are [0-9]* section headers, starting at offset|Section Headers:/q;p') - ELF_SECHEADERS=$(echo "${out}" | sed -n '/There are [0-9]* section headers, starting at offset|Section Headers:/,$p' | sed -n '/Symbol table .* contains [0-9]* entries:/q;p') - ELF_SYMS=$(echo "${out}" | sed -n '/Symbol table .* contains [0-9]* entries:/,$p') + ELF_FILEHEADER=$(sed -n '/There are [0-9]* section headers, starting at offset|Section Headers:/q;p' "$tmpfile") + ELF_SECHEADERS=$(sed -n '/There are [0-9]* section headers, starting at offset|Section Headers:/,$p' "$tmpfile" | sed -n '/Symbol table .* contains [0-9]* entries:/q;p') + ELF_SYMS=$(sed -n '/Symbol table .* contains [0-9]* entries:/,$p' "$tmpfile") + + rm -f -- "$tmpfile" }
check_vmlinux() {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: George Kennedy george.kennedy@oracle.com
[ Upstream commit 866cf36bfee4fba6a492d2dcc5133f857e3446b0 ]
On AMD machines cpuc->events[idx] can become NULL in a subtle race condition with NMI->throttle->x86_pmu_stop().
Check event for NULL in amd_pmu_enable_all() before enable to avoid a GPF. This appears to be an AMD only issue.
Syzkaller reported a GPF in amd_pmu_enable_all.
INFO: NMI handler (perf_event_nmi_handler) took too long to run: 13.143 msecs Oops: general protection fault, probably for non-canonical address 0xdffffc0000000034: 0000 PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x00000000000001a0-0x00000000000001a7] CPU: 0 UID: 0 PID: 328415 Comm: repro_36674776 Not tainted 6.12.0-rc1-syzk RIP: 0010:x86_pmu_enable_event (arch/x86/events/perf_event.h:1195 arch/x86/events/core.c:1430) RSP: 0018:ffff888118009d60 EFLAGS: 00010012 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000034 RSI: 0000000000000000 RDI: 00000000000001a0 RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 R13: ffff88811802a440 R14: ffff88811802a240 R15: ffff8881132d8601 FS: 00007f097dfaa700(0000) GS:ffff888118000000(0000) GS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000200001c0 CR3: 0000000103d56000 CR4: 00000000000006f0 Call Trace: <IRQ> amd_pmu_enable_all (arch/x86/events/amd/core.c:760 (discriminator 2)) x86_pmu_enable (arch/x86/events/core.c:1360) event_sched_out (kernel/events/core.c:1191 kernel/events/core.c:1186 kernel/events/core.c:2346) __perf_remove_from_context (kernel/events/core.c:2435) event_function (kernel/events/core.c:259) remote_function (kernel/events/core.c:92 (discriminator 1) kernel/events/core.c:72 (discriminator 1)) __flush_smp_call_function_queue (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/csd.h:64 kernel/smp.c:135 kernel/smp.c:540) __sysvec_call_function_single (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./arch/x86/include/asm/trace/irq_vectors.h:99 arch/x86/kernel/smp.c:272) sysvec_call_function_single (arch/x86/kernel/smp.c:266 (discriminator 47) arch/x86/kernel/smp.c:266 (discriminator 47)) </IRQ>
Reported-by: syzkaller syzkaller@googlegroups.com Signed-off-by: George Kennedy george.kennedy@oracle.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/amd/core.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c index b20661b8621d1..8868f5f5379ba 100644 --- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -763,7 +763,12 @@ static void amd_pmu_enable_all(int added) if (!test_bit(idx, cpuc->active_mask)) continue;
- amd_pmu_enable_event(cpuc->events[idx]); + /* + * FIXME: cpuc->events[idx] can become NULL in a subtle race + * condition with NMI->throttle->x86_pmu_stop(). + */ + if (cpuc->events[idx]) + amd_pmu_enable_event(cpuc->events[idx]); } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Doug Berger opendmb@gmail.com
[ Upstream commit 382748c05e58a9f1935f5a653c352422375566ea ]
Commit 16b269436b72 ("sched/deadline: Modify cpudl::free_cpus to reflect rd->online") introduced the cpudl_set/clear_freecpu functions to allow the cpu_dl::free_cpus mask to be manipulated by the deadline scheduler class rq_on/offline callbacks so the mask would also reflect this state.
Commit 9659e1eeee28 ("sched/deadline: Remove cpu_active_mask from cpudl_find()") removed the check of the cpu_active_mask to save some processing on the premise that the cpudl::free_cpus mask already reflected the runqueue online state.
Unfortunately, there are cases where it is possible for the cpudl_clear function to set the free_cpus bit for a CPU when the deadline runqueue is offline. When this occurs while a CPU is connected to the default root domain the flag may retain the bad state after the CPU has been unplugged. Later, a different CPU that is transitioning through the default root domain may push a deadline task to the powered down CPU when cpudl_find sees its free_cpus bit is set. If this happens the task will not have the opportunity to run.
One example is outlined here: https://lore.kernel.org/lkml/20250110233010.2339521-1-opendmb@gmail.com
Another occurs when the last deadline task is migrated from a CPU that has an offlined runqueue. The dequeue_task member of the deadline scheduler class will eventually call cpudl_clear and set the free_cpus bit for the CPU.
This commit modifies the cpudl_clear function to be aware of the online state of the deadline runqueue so that the free_cpus mask can be updated appropriately.
It is no longer necessary to manage the mask outside of the cpudl_set/clear functions so the cpudl_set/clear_freecpu functions are removed. In addition, since the free_cpus mask is now only updated under the cpudl lock the code was changed to use the non-atomic __cpumask functions.
Signed-off-by: Doug Berger opendmb@gmail.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/cpudeadline.c | 34 +++++++++------------------------- kernel/sched/cpudeadline.h | 4 +--- kernel/sched/deadline.c | 8 ++++---- 3 files changed, 14 insertions(+), 32 deletions(-)
diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c index cdd740b3f7743..37b572cc8aca2 100644 --- a/kernel/sched/cpudeadline.c +++ b/kernel/sched/cpudeadline.c @@ -166,12 +166,13 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p, * cpudl_clear - remove a CPU from the cpudl max-heap * @cp: the cpudl max-heap context * @cpu: the target CPU + * @online: the online state of the deadline runqueue * * Notes: assumes cpu_rq(cpu)->lock is locked * * Returns: (void) */ -void cpudl_clear(struct cpudl *cp, int cpu) +void cpudl_clear(struct cpudl *cp, int cpu, bool online) { int old_idx, new_cpu; unsigned long flags; @@ -184,7 +185,7 @@ void cpudl_clear(struct cpudl *cp, int cpu) if (old_idx == IDX_INVALID) { /* * Nothing to remove if old_idx was invalid. - * This could happen if a rq_offline_dl is + * This could happen if rq_online_dl or rq_offline_dl is * called for a CPU without -dl tasks running. */ } else { @@ -195,9 +196,12 @@ void cpudl_clear(struct cpudl *cp, int cpu) cp->elements[new_cpu].idx = old_idx; cp->elements[cpu].idx = IDX_INVALID; cpudl_heapify(cp, old_idx); - - cpumask_set_cpu(cpu, cp->free_cpus); } + if (likely(online)) + __cpumask_set_cpu(cpu, cp->free_cpus); + else + __cpumask_clear_cpu(cpu, cp->free_cpus); + raw_spin_unlock_irqrestore(&cp->lock, flags); }
@@ -228,7 +232,7 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl) cp->elements[new_idx].cpu = cpu; cp->elements[cpu].idx = new_idx; cpudl_heapify_up(cp, new_idx); - cpumask_clear_cpu(cpu, cp->free_cpus); + __cpumask_clear_cpu(cpu, cp->free_cpus); } else { cp->elements[old_idx].dl = dl; cpudl_heapify(cp, old_idx); @@ -237,26 +241,6 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl) raw_spin_unlock_irqrestore(&cp->lock, flags); }
-/* - * cpudl_set_freecpu - Set the cpudl.free_cpus - * @cp: the cpudl max-heap context - * @cpu: rd attached CPU - */ -void cpudl_set_freecpu(struct cpudl *cp, int cpu) -{ - cpumask_set_cpu(cpu, cp->free_cpus); -} - -/* - * cpudl_clear_freecpu - Clear the cpudl.free_cpus - * @cp: the cpudl max-heap context - * @cpu: rd attached CPU - */ -void cpudl_clear_freecpu(struct cpudl *cp, int cpu) -{ - cpumask_clear_cpu(cpu, cp->free_cpus); -} - /* * cpudl_init - initialize the cpudl structure * @cp: the cpudl max-heap context diff --git a/kernel/sched/cpudeadline.h b/kernel/sched/cpudeadline.h index 11c0f1faa7e11..d7699468eedd5 100644 --- a/kernel/sched/cpudeadline.h +++ b/kernel/sched/cpudeadline.h @@ -19,8 +19,6 @@ struct cpudl {
int cpudl_find(struct cpudl *cp, struct task_struct *p, struct cpumask *later_mask); void cpudl_set(struct cpudl *cp, int cpu, u64 dl); -void cpudl_clear(struct cpudl *cp, int cpu); +void cpudl_clear(struct cpudl *cp, int cpu, bool online); int cpudl_init(struct cpudl *cp); -void cpudl_set_freecpu(struct cpudl *cp, int cpu); -void cpudl_clear_freecpu(struct cpudl *cp, int cpu); void cpudl_cleanup(struct cpudl *cp); diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 7b7671060bf9e..19b1a8b81c76c 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -1811,7 +1811,7 @@ static void dec_dl_deadline(struct dl_rq *dl_rq, u64 deadline) if (!dl_rq->dl_nr_running) { dl_rq->earliest_dl.curr = 0; dl_rq->earliest_dl.next = 0; - cpudl_clear(&rq->rd->cpudl, rq->cpu); + cpudl_clear(&rq->rd->cpudl, rq->cpu, rq->online); cpupri_set(&rq->rd->cpupri, rq->cpu, rq->rt.highest_prio.curr); } else { struct rb_node *leftmost = rb_first_cached(&dl_rq->root); @@ -2883,9 +2883,10 @@ static void rq_online_dl(struct rq *rq) if (rq->dl.overloaded) dl_set_overload(rq);
- cpudl_set_freecpu(&rq->rd->cpudl, rq->cpu); if (rq->dl.dl_nr_running > 0) cpudl_set(&rq->rd->cpudl, rq->cpu, rq->dl.earliest_dl.curr); + else + cpudl_clear(&rq->rd->cpudl, rq->cpu, true); }
/* Assumes rq->lock is held */ @@ -2894,8 +2895,7 @@ static void rq_offline_dl(struct rq *rq) if (rq->dl.overloaded) dl_clear_overload(rq);
- cpudl_clear(&rq->rd->cpudl, rq->cpu); - cpudl_clear_freecpu(&rq->rd->cpudl, rq->cpu); + cpudl_clear(&rq->rd->cpudl, rq->cpu, false); }
void __init init_sched_dl_class(void)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit d206fbad9328ddb68ebabd7cf7413392acd38081 ]
Many people reported regressions on their database workloads due to:
155213a2aed4 ("sched/fair: Bump sd->max_newidle_lb_cost when newidle balance fails")
For instance Adam Li reported a 6% regression on SpecJBB.
Conversely this will regress schbench again; on my machine from 2.22 Mrps/s down to 2.04 Mrps/s.
Reported-by: Joseph Salisbury joseph.salisbury@oracle.com Reported-by: Adam Li adamli@os.amperecomputing.com Reported-by: Dietmar Eggemann dietmar.eggemann@arm.com Reported-by: Hazem Mohamed Abuelfotoh abuehaze@amazon.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Dietmar Eggemann dietmar.eggemann@arm.com Tested-by: Dietmar Eggemann dietmar.eggemann@arm.com Tested-by: Chris Mason clm@meta.com Link: https://lkml.kernel.org/r/20250626144017.1510594-2-clm@fb.com Link: https://lkml.kernel.org/r/006c9df2-b691-47f1-82e6-e233c3f91faf@oracle.com Link: https://patch.msgid.link/20251107161739.406147760@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 19 +++---------------- 1 file changed, 3 insertions(+), 16 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 71d3f4125b0e7..967ca52fb2311 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -12174,14 +12174,8 @@ static inline bool update_newidle_cost(struct sched_domain *sd, u64 cost) /* * Track max cost of a domain to make sure to not delay the * next wakeup on the CPU. - * - * sched_balance_newidle() bumps the cost whenever newidle - * balance fails, and we don't want things to grow out of - * control. Use the sysctl_sched_migration_cost as the upper - * limit, plus a litle extra to avoid off by ones. */ - sd->max_newidle_lb_cost = - min(cost, sysctl_sched_migration_cost + 200); + sd->max_newidle_lb_cost = cost; sd->last_decay_max_lb_cost = jiffies; } else if (time_after(jiffies, sd->last_decay_max_lb_cost + HZ)) { /* @@ -12873,17 +12867,10 @@ static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf)
t1 = sched_clock_cpu(this_cpu); domain_cost = t1 - t0; + update_newidle_cost(sd, domain_cost); + curr_cost += domain_cost; t0 = t1; - - /* - * Failing newidle means it is not effective; - * bump the cost so we end up doing less of it. - */ - if (!pulled_task) - domain_cost = (3 * sd->max_newidle_lb_cost) / 2; - - update_newidle_cost(sd, domain_cost); }
/*
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 1fe4002cf7f23d70c79bda429ca2a9423ebcfdfa ]
A KASAN build bloats these single load/store helpers such that it fails to inline them:
vmlinux.o: error: objtool: irqentry_exit+0x5e8: call to instruction_pointer_set() with UACCESS enabled
Make sure the compiler isn't allowed to do stupid.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://patch.msgid.link/20251031105435.GU4068168@noisy.programming.kicks-as... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/ptrace.h | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index 50f75467f73d0..b5dec859bc75a 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -187,12 +187,12 @@ convert_ip_to_linear(struct task_struct *child, struct pt_regs *regs); extern void send_sigtrap(struct pt_regs *regs, int error_code, int si_code);
-static inline unsigned long regs_return_value(struct pt_regs *regs) +static __always_inline unsigned long regs_return_value(struct pt_regs *regs) { return regs->ax; }
-static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc) +static __always_inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc) { regs->ax = rc; } @@ -277,34 +277,34 @@ static __always_inline bool ip_within_syscall_gap(struct pt_regs *regs) } #endif
-static inline unsigned long kernel_stack_pointer(struct pt_regs *regs) +static __always_inline unsigned long kernel_stack_pointer(struct pt_regs *regs) { return regs->sp; }
-static inline unsigned long instruction_pointer(struct pt_regs *regs) +static __always_inline unsigned long instruction_pointer(struct pt_regs *regs) { return regs->ip; }
-static inline void instruction_pointer_set(struct pt_regs *regs, - unsigned long val) +static __always_inline +void instruction_pointer_set(struct pt_regs *regs, unsigned long val) { regs->ip = val; }
-static inline unsigned long frame_pointer(struct pt_regs *regs) +static __always_inline unsigned long frame_pointer(struct pt_regs *regs) { return regs->bp; }
-static inline unsigned long user_stack_pointer(struct pt_regs *regs) +static __always_inline unsigned long user_stack_pointer(struct pt_regs *regs) { return regs->sp; }
-static inline void user_stack_pointer_set(struct pt_regs *regs, - unsigned long val) +static __always_inline +void user_stack_pointer_set(struct pt_regs *regs, unsigned long val) { regs->sp = val; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cryolitia PukNgae cryolitia.pukngae@linux.dev
[ Upstream commit 9d6c58dae8f6590c746ac5d0012ffe14a77539f0 ]
Although commit 0c9992315e73 ("ACPICA: Avoid walking the ACPI Namespace if it is not there") fixed the situation when both start_node and acpi_gbl_root_node are NULL, the Linux kernel mainline now still crashed on Honor Magicbook 14 Pro [1].
That happens due to the access to the member of parent_node in acpi_ns_get_next_node(). The NULL pointer dereference will always happen, no matter whether or not the start_node is equal to ACPI_ROOT_OBJECT, so move the check of start_node being NULL out of the if block.
Unfortunately, all the attempts to contact Honor have failed, they refused to provide any technical support for Linux.
The bad DSDT table's dump could be found on GitHub [2].
DMI: HONOR FMB-P/FMB-P-PCB, BIOS 1.13 05/08/2025
Link: https://github.com/acpica/acpica/commit/1c1b57b9eba4554cb132ee658dd942c0210e... Link: https://gist.github.com/Cryolitia/a860ffc97437dcd2cd988371d5b73ed7 [1] Link: https://github.com/denis-bb/honor-fmb-p-dsdt [2] Signed-off-by: Cryolitia PukNgae cryolitia.pukngae@linux.dev Reviewed-by: WangYuli wangyl5933@chinaunicom.cn [ rjw: Subject adjustment, changelog edits ] Link: https://patch.msgid.link/20251125-acpica-v1-1-99e63b1b25f8@linux.dev Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/acpica/nswalk.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/acpi/acpica/nswalk.c b/drivers/acpi/acpica/nswalk.c index a2ac06a26e921..5670ff5a43cd4 100644 --- a/drivers/acpi/acpica/nswalk.c +++ b/drivers/acpi/acpica/nswalk.c @@ -169,9 +169,12 @@ acpi_ns_walk_namespace(acpi_object_type type,
if (start_node == ACPI_ROOT_OBJECT) { start_node = acpi_gbl_root_node; - if (!start_node) { - return_ACPI_STATUS(AE_NO_NAMESPACE); - } + } + + /* Avoid walking the namespace if the StartNode is NULL */ + + if (!start_node) { + return_ACPI_STATUS(AE_NO_NAMESPACE); }
/* Null child means "get first node" */
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sakari Ailus sakari.ailus@linux.intel.com
[ Upstream commit 5d010473cdeaabf6a2d3a9e2aed2186c1b73c213 ]
Calling fwnode_get_next_child_node() in ACPI implementation of the fwnode property API is somewhat problematic as the latter is used in the impelementation of the former. Instead of using fwnode_get_next_child_node() in acpi_graph_get_next_endpoint(), call acpi_get_next_subnode() directly instead.
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Reviewed-by: Laurent Pinchart laurent.pinchart+renesas@ideasonboard.com Reviewed-by: Jonathan Cameron jonathan.cameron@huawei.com Link: https://patch.msgid.link/20251001104320.1272752-3-sakari.ailus@linux.intel.c... Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/property.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c index b12057baaae7b..19737f9e1e16f 100644 --- a/drivers/acpi/property.c +++ b/drivers/acpi/property.c @@ -1472,7 +1472,7 @@ static struct fwnode_handle *acpi_graph_get_next_endpoint(
if (!prev) { do { - port = fwnode_get_next_child_node(fwnode, port); + port = acpi_get_next_subnode(fwnode, port); /* * The names of the port nodes begin with "port@" * followed by the number of the port node and they also @@ -1490,13 +1490,13 @@ static struct fwnode_handle *acpi_graph_get_next_endpoint( if (!port) return NULL;
- endpoint = fwnode_get_next_child_node(port, prev); + endpoint = acpi_get_next_subnode(port, prev); while (!endpoint) { - port = fwnode_get_next_child_node(fwnode, port); + port = acpi_get_next_subnode(fwnode, port); if (!port) break; if (is_acpi_graph_node(port, "port")) - endpoint = fwnode_get_next_child_node(port, NULL); + endpoint = acpi_get_next_subnode(port, NULL); }
/*
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hal Feng hal.feng@starfivetech.com
[ Upstream commit 6e7970cab51d01b8f7c56f120486c571c22e1b80 ]
Add the compatible strings for supporting the generic cpufreq driver on the StarFive JH7110S SoC.
Signed-off-by: Hal Feng hal.feng@starfivetech.com Reviewed-by: Heinrich Schuchardt heinrich.schuchardt@canonical.com Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/cpufreq-dt-platdev.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c index cd1816a12bb99..dc11b62399ad5 100644 --- a/drivers/cpufreq/cpufreq-dt-platdev.c +++ b/drivers/cpufreq/cpufreq-dt-platdev.c @@ -87,6 +87,7 @@ static const struct of_device_id allowlist[] __initconst = { { .compatible = "st-ericsson,u9540", },
{ .compatible = "starfive,jh7110", }, + { .compatible = "starfive,jh7110s", },
{ .compatible = "ti,omap2", }, { .compatible = "ti,omap4", },
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Armin Wolf W_Armin@gmx.de
[ Upstream commit 2e00f7a4bb0ac25ec7477b55fe482da39fb4dce8 ]
Some firmware implementations use the "Ones" ASL opcode to produce an integer with all bits set in order to indicate missing speed or power readings. This however only works when using 32-bit integers, as the ACPI spec requires a 32-bit integer (0xFFFFFFFF) to be returned for missing speed/power readings. With 64-bit integers the "Ones" opcode produces a 64-bit integer with all bits set, violating the ACPI spec regarding the placeholder value for missing readings.
Work around such buggy firmware implementation by also checking for 64-bit integers with all bits set when reading _FST.
Signed-off-by: Armin Wolf W_Armin@gmx.de [ rjw: Typo fix in the changelog ] Link: https://patch.msgid.link/20251007234149.2769-3-W_Armin@gmx.de Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/fan.h | 33 +++++++++++++++++++++++++++++++++ drivers/acpi/fan_hwmon.c | 10 +++------- 2 files changed, 36 insertions(+), 7 deletions(-)
diff --git a/drivers/acpi/fan.h b/drivers/acpi/fan.h index bedbab0e8e4e9..0d73433c38892 100644 --- a/drivers/acpi/fan.h +++ b/drivers/acpi/fan.h @@ -11,6 +11,7 @@ #define _ACPI_FAN_H_
#include <linux/kconfig.h> +#include <linux/limits.h>
#define ACPI_FAN_DEVICE_IDS \ {"INT3404", }, /* Fan */ \ @@ -60,6 +61,38 @@ struct acpi_fan { struct device_attribute fine_grain_control; };
+/** + * acpi_fan_speed_valid - Check if fan speed value is valid + * @speeed: Speed value returned by the ACPI firmware + * + * Check if the fan speed value returned by the ACPI firmware is valid. This function is + * necessary as ACPI firmware implementations can return 0xFFFFFFFF to signal that the + * ACPI fan does not support speed reporting. Additionally, some buggy ACPI firmware + * implementations return a value larger than the 32-bit integer value defined by + * the ACPI specification when using placeholder values. Such invalid values are also + * detected by this function. + * + * Returns: True if the fan speed value is valid, false otherwise. + */ +static inline bool acpi_fan_speed_valid(u64 speed) +{ + return speed < U32_MAX; +} + +/** + * acpi_fan_power_valid - Check if fan power value is valid + * @power: Power value returned by the ACPI firmware + * + * Check if the fan power value returned by the ACPI firmware is valid. + * See acpi_fan_speed_valid() for details. + * + * Returns: True if the fan power value is valid, false otherwise. + */ +static inline bool acpi_fan_power_valid(u64 power) +{ + return power < U32_MAX; +} + int acpi_fan_get_fst(acpi_handle handle, struct acpi_fan_fst *fst); int acpi_fan_create_attributes(struct acpi_device *device); void acpi_fan_delete_attributes(struct acpi_device *device); diff --git a/drivers/acpi/fan_hwmon.c b/drivers/acpi/fan_hwmon.c index 4b2c2007f2d7f..47a02ef5a6067 100644 --- a/drivers/acpi/fan_hwmon.c +++ b/drivers/acpi/fan_hwmon.c @@ -15,10 +15,6 @@
#include "fan.h"
-/* Returned when the ACPI fan does not support speed reporting */ -#define FAN_SPEED_UNAVAILABLE U32_MAX -#define FAN_POWER_UNAVAILABLE U32_MAX - static struct acpi_fan_fps *acpi_fan_get_current_fps(struct acpi_fan *fan, u64 control) { unsigned int i; @@ -77,7 +73,7 @@ static umode_t acpi_fan_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_ * when the associated attribute should not be created. */ for (i = 0; i < fan->fps_count; i++) { - if (fan->fps[i].power != FAN_POWER_UNAVAILABLE) + if (acpi_fan_power_valid(fan->fps[i].power)) return 0444; }
@@ -106,7 +102,7 @@ static int acpi_fan_hwmon_read(struct device *dev, enum hwmon_sensor_types type, case hwmon_fan: switch (attr) { case hwmon_fan_input: - if (fst.speed == FAN_SPEED_UNAVAILABLE) + if (!acpi_fan_speed_valid(fst.speed)) return -ENODEV;
if (fst.speed > LONG_MAX) @@ -134,7 +130,7 @@ static int acpi_fan_hwmon_read(struct device *dev, enum hwmon_sensor_types type, if (!fps) return -EIO;
- if (fps->power == FAN_POWER_UNAVAILABLE) + if (!acpi_fan_power_valid(fps->power)) return -ENODEV;
if (fps->power > LONG_MAX / MICROWATT_PER_MILLIWATT)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuhao Fu sfual@cse.ust.hk
[ Upstream commit 2de5cb96060a1664880d65b120e59485a73588a8 ]
In function `s5pv210_cpu_init`, a possible refcount inconsistency has been identified, causing a resource leak.
Why it is a bug: 1. For every clk_get, there should be a matching clk_put on every successive error handling path. 2. After calling `clk_get(dmc1_clk)`, variable `dmc1_clk` will not be freed even if any error happens.
How it is fixed: For every failed path, an extra goto label is added to ensure `dmc1_clk` will be freed regardlessly.
Signed-off-by: Shuhao Fu sfual@cse.ust.hk Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/s5pv210-cpufreq.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/s5pv210-cpufreq.c b/drivers/cpufreq/s5pv210-cpufreq.c index 4215621deb3fe..ba8a1c96427a1 100644 --- a/drivers/cpufreq/s5pv210-cpufreq.c +++ b/drivers/cpufreq/s5pv210-cpufreq.c @@ -518,7 +518,7 @@ static int s5pv210_cpu_init(struct cpufreq_policy *policy)
if (policy->cpu != 0) { ret = -EINVAL; - goto out_dmc1; + goto out; }
/* @@ -530,7 +530,7 @@ static int s5pv210_cpu_init(struct cpufreq_policy *policy) if ((mem_type != LPDDR) && (mem_type != LPDDR2)) { pr_err("CPUFreq doesn't support this memory type\n"); ret = -EINVAL; - goto out_dmc1; + goto out; }
/* Find current refresh counter and frequency each DMC */ @@ -544,6 +544,8 @@ static int s5pv210_cpu_init(struct cpufreq_policy *policy) cpufreq_generic_init(policy, s5pv210_freq_table, 40000); return 0;
+out: + clk_put(dmc1_clk); out_dmc1: clk_put(dmc0_clk); out_dmc0:
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aboorva Devarajan aboorvad@linux.ibm.com
[ Upstream commit 07d815701274d156ad8c7c088a52e01642156fb8 ]
On virtualized PowerPC (pseries) systems, where only one polling state (Snooze) and one deep state (CEDE) are available, selecting CEDE when the predicted idle duration is less than the target residency of CEDE state can hurt performance. In such cases, the entry/exit overhead of CEDE outweighs the power savings, leading to unnecessary state transitions and higher latency.
Menu governor currently contains a special-case rule that prioritizes the first non-polling state over polling, even when its target residency is much longer than the predicted idle duration. On PowerPC/pseries, where the gap between the polling state (Snooze) and the first non-polling state (CEDE) is large, this behavior causes performance regressions.
Refine that special case by adding an extra requirement: the first non-polling state can only be chosen if its target residency is below the defined RESIDENCY_THRESHOLD_NS. If this condition is not satisfied, polling is allowed instead, avoiding suboptimal non-polling state entries.
This change is limited to the single special-case rule for the first non-polling state. The general non-polling state selection logic in the menu governor remains unchanged.
Performance improvement observed with pgbench on PowerPC (pseries) system: +---------------------------+------------+------------+------------+ | Metric | Baseline | Patched | Change (%) | +---------------------------+------------+------------+------------+ | Transactions/sec (TPS) | 495,210 | 536,982 | +8.45% | | Avg latency (ms) | 0.163 | 0.150 | -7.98% | +---------------------------+------------+------------+------------+
CPUIdle state usage: +--------------+--------------+-------------+ | Metric | Baseline | Patched | +--------------+--------------+-------------+ | Total usage | 12,735,820 | 13,918,442 | | Above usage | 11,401,520 | 1,598,210 | | Below usage | 20,145 | 702,395 | +--------------+--------------+-------------+
Above/Total and Below/Total usage percentages: +------------------------+-----------+---------+ | Metric | Baseline | Patched | +------------------------+-----------+---------+ | Above % (Above/Total) | 89.56% | 11.49% | | Below % (Below/Total) | 0.16% | 5.05% | | Total cpuidle miss (%) | 89.72% | 16.54% | +------------------------+-----------+---------+
The results indicate that restricting CEDE selection to cases where its residency matches the predicted idle time reduces mispredictions, lowers unnecessary state transitions, and improves overall throughput.
Reviewed-by: Christian Loehle christian.loehle@arm.com Signed-off-by: Aboorva Devarajan aboorvad@linux.ibm.com [ rjw: Changelog edits, rebase ] Link: https://patch.msgid.link/20251006013954.17972-1-aboorvad@linux.ibm.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpuidle/governors/menu.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index 23239b0c04f95..64d6f7a1c7766 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -317,12 +317,13 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, }
/* - * Use a physical idle state, not busy polling, unless a timer - * is going to trigger soon enough or the exit latency of the - * idle state in question is greater than the predicted idle - * duration. + * Use a physical idle state instead of busy polling so long as + * its target residency is below the residency threshold, its + * exit latency is not greater than the predicted idle duration, + * and the next timer doesn't expire soon. */ if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) && + s->target_residency_ns < RESIDENCY_THRESHOLD_NS && s->target_residency_ns <= data->next_timer_ns && s->exit_latency_ns <= predicted_ns) { predicted_ns = s->target_residency_ns;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yu Peng pengyu@kylinos.cn
[ Upstream commit ca8313fd83399ea1d18e695c2ae9b259985c9e1f ]
Fix section mismatch warning reported by modpost:
.text:early_parse_cmdline() -> .init.data:boot_command_line
The function early_parse_cmdline() is only called during init and accesses init data, so mark it __init to match its usage.
[ bp: This happens only when the toolchain fails to inline the function and I haven't been able to reproduce it with any toolchain I'm using. Patch is obviously correct regardless. ]
Signed-off-by: Yu Peng pengyu@kylinos.cn Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Link: https://patch.msgid.link/all/20251030123757.1410904-1-pengyu@kylinos.cn Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cpu/microcode/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c index f75c140906d00..539edd6d6dc8c 100644 --- a/arch/x86/kernel/cpu/microcode/core.c +++ b/arch/x86/kernel/cpu/microcode/core.c @@ -136,7 +136,7 @@ bool __init microcode_loader_disabled(void) return dis_ucode_ldr; }
-static void early_parse_cmdline(void) +static void __init early_parse_cmdline(void) { char cmd_buf[64] = {}; char *s, *p = cmd_buf;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mauro Carvalho Chehab mchehab+huawei@kernel.org
[ Upstream commit ade9b9576e2f000fb2ef0ac3bcd26e1167fd813b ]
When running kernel-doc over multiple documents, it emits one error message per file with is not what we want:
$ python3.6 scripts/kernel-doc.py . --none ... Warning: ./include/trace/events/swiotlb.h:0 Python 3.7 or later is required for correct results Warning: ./include/trace/events/iommu.h:0 Python 3.7 or later is required for correct results Warning: ./include/trace/events/sock.h:0 Python 3.7 or later is required for correct results ...
Change the logic to warn it only once at the library:
$ python3.6 scripts/kernel-doc.py . --none Warning: Python 3.7 or later is required for correct results Warning: ./include/cxl/features.h:0 Python 3.7 or later is required for correct results
When running from command line, it warns twice, but that sounds ok.
Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Message-ID: 68e54cf8b1201d1f683aad9bc710a99421910356.1758196090.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet corbet@lwn.net Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/lib/kdoc/kdoc_parser.py | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py index 2376f180b1fa9..89d920e0b65ca 100644 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -350,6 +350,7 @@ class KernelEntry: self.section = SECTION_DEFAULT self._contents = []
+python_warning = False
class KernelDoc: """ @@ -383,9 +384,13 @@ class KernelDoc: # We need Python 3.7 for its "dicts remember the insertion # order" guarantee # - if sys.version_info.major == 3 and sys.version_info.minor < 7: + global python_warning + if (not python_warning and + sys.version_info.major == 3 and sys.version_info.minor < 7): + self.emit_msg(0, 'Python 3.7 or later is required for correct results') + python_warning = True
def emit_msg(self, ln, msg, warning=True): """Emit a message"""
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Song Liu song@kernel.org
[ Upstream commit 139560e8b973402140cafeb68c656c1374bd4c20 ]
When there is only one function of the same name, old_sympos of 0 and 1 are logically identical. Match them in klp_find_func().
This is to avoid a corner case with different toolchain behavior.
In this specific issue, two versions of kpatch-build were used to build livepatch for the same kernel. One assigns old_sympos == 0 for unique local functions, the other assigns old_sympos == 1 for unique local functions. Both versions work fine by themselves. (PS: This behavior change was introduced in a downstream version of kpatch-build. This change does not exist in upstream kpatch-build.)
However, during livepatch upgrade (with the replace flag set) from a patch built with one version of kpatch-build to the same fix built with the other version of kpatch-build, livepatching fails with errors like:
[ 14.218706] sysfs: cannot create duplicate filename 'xxx/somefunc,1' ... [ 14.219466] Call Trace: [ 14.219468] <TASK> [ 14.219469] dump_stack_lvl+0x47/0x60 [ 14.219474] sysfs_warn_dup.cold+0x17/0x27 [ 14.219476] sysfs_create_dir_ns+0x95/0xb0 [ 14.219479] kobject_add_internal+0x9e/0x260 [ 14.219483] kobject_add+0x68/0x80 [ 14.219485] ? kstrdup+0x3c/0xa0 [ 14.219486] klp_enable_patch+0x320/0x830 [ 14.219488] patch_init+0x443/0x1000 [ccc_0_6] [ 14.219491] ? 0xffffffffa05eb000 [ 14.219492] do_one_initcall+0x2e/0x190 [ 14.219494] do_init_module+0x67/0x270 [ 14.219496] init_module_from_file+0x75/0xa0 [ 14.219499] idempotent_init_module+0x15a/0x240 [ 14.219501] __x64_sys_finit_module+0x61/0xc0 [ 14.219503] do_syscall_64+0x5b/0x160 [ 14.219505] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 14.219507] RIP: 0033:0x7f545a4bd96d ... [ 14.219516] kobject: kobject_add_internal failed for somefunc,1 with -EEXIST, don't try to register things with the same name ...
This happens because klp_find_func() thinks somefunc with old_sympos==0 is not the same as somefunc with old_sympos==1, and klp_add_object_nops adds another xxx/func,1 to the list of functions to patch.
Signed-off-by: Song Liu song@kernel.org Acked-by: Josh Poimboeuf jpoimboe@kernel.org [pmladek@suse.com: Fixed some typos.] Reviewed-by: Petr Mladek pmladek@suse.com Tested-by: Petr Mladek pmladek@suse.com Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/livepatch/core.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index 0e73fac55f8eb..4e7a5cbc40a91 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -88,8 +88,14 @@ static struct klp_func *klp_find_func(struct klp_object *obj, struct klp_func *func;
klp_for_each_func(obj, func) { + /* + * Besides identical old_sympos, also consider old_sympos + * of 0 and 1 are identical. + */ if ((strcmp(old_func->old_name, func->old_name) == 0) && - (old_func->old_sympos == func->old_sympos)) { + ((old_func->old_sympos == func->old_sympos) || + (old_func->old_sympos == 0 && func->old_sympos == 1) || + (old_func->old_sympos == 1 && func->old_sympos == 0))) { return func; } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello (AMD) superm1@kernel.org
[ Upstream commit 9fc6290117259a8dbf8247cb54559df62fd1550f ]
PCI device 0x115A is similar to pspv5, except it doesn't have platform access mailbox support.
Signed-off-by: Mario Limonciello (AMD) superm1@kernel.org Acked-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/ccp/sp-pci.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
diff --git a/drivers/crypto/ccp/sp-pci.c b/drivers/crypto/ccp/sp-pci.c index e7bb803912a6d..8891ceee1d7d0 100644 --- a/drivers/crypto/ccp/sp-pci.c +++ b/drivers/crypto/ccp/sp-pci.c @@ -459,6 +459,17 @@ static const struct psp_vdata pspv6 = { .intsts_reg = 0x10514, /* P2CMSG_INTSTS */ };
+static const struct psp_vdata pspv7 = { + .tee = &teev2, + .cmdresp_reg = 0x10944, /* C2PMSG_17 */ + .cmdbuff_addr_lo_reg = 0x10948, /* C2PMSG_18 */ + .cmdbuff_addr_hi_reg = 0x1094c, /* C2PMSG_19 */ + .bootloader_info_reg = 0x109ec, /* C2PMSG_59 */ + .feature_reg = 0x109fc, /* C2PMSG_63 */ + .inten_reg = 0x10510, /* P2CMSG_INTEN */ + .intsts_reg = 0x10514, /* P2CMSG_INTSTS */ +}; + #endif
static const struct sp_dev_vdata dev_vdata[] = { @@ -525,6 +536,13 @@ static const struct sp_dev_vdata dev_vdata[] = { .psp_vdata = &pspv6, #endif }, + { /* 9 */ + .bar = 2, +#ifdef CONFIG_CRYPTO_DEV_SP_PSP + .psp_vdata = &pspv7, +#endif + }, + }; static const struct pci_device_id sp_pci_table[] = { { PCI_VDEVICE(AMD, 0x1537), (kernel_ulong_t)&dev_vdata[0] }, @@ -539,6 +557,7 @@ static const struct pci_device_id sp_pci_table[] = { { PCI_VDEVICE(AMD, 0x17E0), (kernel_ulong_t)&dev_vdata[7] }, { PCI_VDEVICE(AMD, 0x156E), (kernel_ulong_t)&dev_vdata[8] }, { PCI_VDEVICE(AMD, 0x17D8), (kernel_ulong_t)&dev_vdata[8] }, + { PCI_VDEVICE(AMD, 0x115A), (kernel_ulong_t)&dev_vdata[9] }, /* Last entry must be zero */ { 0, } };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konstantin Komarov almaz.alexandrovich@paragon-software.com
[ Upstream commit 5180138604323895b5c291eca6aa7c20be494ade ]
Before it used an unsigned 64-bit type, which prevented proper handling of timestamps earlier than 1970-01-01. Switch to a signed 64-bit type to support pre-epoch timestamps. The issue was caught by xfstests.
Signed-off-by: Konstantin Komarov almaz.alexandrovich@paragon-software.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ntfs3/ntfs_fs.h | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h index 630128716ea73..2649fbe16669d 100644 --- a/fs/ntfs3/ntfs_fs.h +++ b/fs/ntfs3/ntfs_fs.h @@ -979,11 +979,12 @@ static inline __le64 kernel2nt(const struct timespec64 *ts) */ static inline void nt2kernel(const __le64 tm, struct timespec64 *ts) { - u64 t = le64_to_cpu(tm) - _100ns2seconds * SecondsToStartOf1970; + s32 t32; + /* use signed 64 bit to support timestamps prior to epoch. xfstest 258. */ + s64 t = le64_to_cpu(tm) - _100ns2seconds * SecondsToStartOf1970;
- // WARNING: do_div changes its first argument(!) - ts->tv_nsec = do_div(t, _100ns2seconds) * 100; - ts->tv_sec = t; + ts->tv_sec = div_s64_rem(t, _100ns2seconds, &t32); + ts->tv_nsec = t32 * 100; }
static inline struct ntfs_sb_info *ntfs_sb(struct super_block *sb)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikhail Malyshev mike.malyshev@gmail.com
[ Upstream commit af61da281f52aba0c5b090bafb3a31c5739850ff ]
When building out-of-tree modules with CONFIG_MODULE_SIG_FORCE=y, module signing fails because the private key path uses $(srctree) while the public key path uses $(objtree). Since signing keys are generated in the build directory during kernel compilation, both paths should use $(objtree) for consistency.
This causes SSL errors like: SSL error:02001002:system library:fopen:No such file or directory sign-file: /kernel-src/certs/signing_key.pem
The issue occurs because: - sig-key uses: $(srctree)/certs/signing_key.pem (source tree) - cmd_sign uses: $(objtree)/certs/signing_key.x509 (build tree)
But both keys are generated in $(objtree) during the build.
This complements commit 25ff08aa43e37 ("kbuild: Fix signing issue for external modules") which fixed the scripts path and public key path, but missed the private key path inconsistency.
Fixes out-of-tree module signing for configurations with separate source and build directories (e.g., O=/kernel-out).
Signed-off-by: Mikhail Malyshev mike.malyshev@gmail.com Reviewed-by: Nathan Chancellor nathan@kernel.org Tested-by: Nicolas Schier nsc@kernel.org Link: https://patch.msgid.link/20251015163452.3754286-1-mike.malyshev@gmail.com Signed-off-by: Nicolas Schier nsc@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/Makefile.modinst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/Makefile.modinst b/scripts/Makefile.modinst index 1628198f3e830..9ba45e5b32b18 100644 --- a/scripts/Makefile.modinst +++ b/scripts/Makefile.modinst @@ -100,7 +100,7 @@ endif # Don't stop modules_install even if we can't sign external modules. # ifeq ($(filter pkcs11:%, $(CONFIG_MODULE_SIG_KEY)),) -sig-key := $(if $(wildcard $(CONFIG_MODULE_SIG_KEY)),,$(srctree)/)$(CONFIG_MODULE_SIG_KEY) +sig-key := $(if $(wildcard $(CONFIG_MODULE_SIG_KEY)),,$(objtree)/)$(CONFIG_MODULE_SIG_KEY) else sig-key := $(CONFIG_MODULE_SIG_KEY) endif
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pedro Demarchi Gomes pedrodemargomes@gmail.com
[ Upstream commit d1693a7d5a38acf6424235a6070bcf5b186a360d ]
When mounting, sb->s_blocksize is used to read the boot_block without being defined or validated. Set a dummy blocksize before attempting to read the boot_block.
The issue can be triggered with the following syz reproducer:
mkdirat(0xffffffffffffff9c, &(0x7f0000000080)='./file1\x00', 0x0) r4 = openat$nullb(0xffffffffffffff9c, &(0x7f0000000040), 0x121403, 0x0) ioctl$FS_IOC_SETFLAGS(r4, 0x40081271, &(0x7f0000000980)=0x4000) mount(&(0x7f0000000140)=@nullb, &(0x7f0000000040)='./cgroup\x00', &(0x7f0000000000)='ntfs3\x00', 0x2208004, 0x0) syz_clone(0x88200200, 0x0, 0x0, 0x0, 0x0, 0x0)
Here, the ioctl sets the bdev block size to 16384. During mount, get_tree_bdev_flags() calls sb_set_blocksize(sb, block_size(bdev)), but since block_size(bdev) > PAGE_SIZE, sb_set_blocksize() leaves sb->s_blocksize at zero.
Later, ntfs_init_from_boot() attempts to read the boot_block while sb->s_blocksize is still zero, which triggers the bug.
Reported-by: syzbot+f4f84b57a01d6b8364ad@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f4f84b57a01d6b8364ad Signed-off-by: Pedro Demarchi Gomes pedrodemargomes@gmail.com [almaz.alexandrovich@paragon-software.com: changed comment style, added return value handling] Signed-off-by: Konstantin Komarov almaz.alexandrovich@paragon-software.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ntfs3/super.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c index ddff94c091b8c..e6c0908e27c29 100644 --- a/fs/ntfs3/super.c +++ b/fs/ntfs3/super.c @@ -933,6 +933,11 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
sbi->volume.blocks = dev_size >> PAGE_SHIFT;
+ /* Set dummy blocksize to read boot_block. */ + if (!sb_min_blocksize(sb, PAGE_SIZE)) { + return -EINVAL; + } + read_boot: bh = ntfs_bread(sb, boot_block); if (!bh)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viacheslav Dubeyko slava@dubeyko.com
[ Upstream commit ed490f36f439b877393c12a2113601e4145a5a56 ]
The xfstests' test-case generic/070 leaves HFS+ volume in corrupted state:
sudo ./check generic/070 FSTYP -- hfsplus PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.17.0-rc1+ #4 SMP PREEMPT_DYNAMIC Wed Oct 1 15:02:44 PDT 2025 MKFS_OPTIONS -- /dev/loop51 MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch
generic/070 _check_generic_filesystem: filesystem on /dev/loop50 is inconsistent (see xfstests-dev/results//generic/070.full for details)
Ran: generic/070 Failures: generic/070 Failed 1 of 1 tests
sudo fsck.hfsplus -d /dev/loop50 ** /dev/loop50 Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K. Executing fsck_hfs (version 540.1-Linux). ** Checking non-journaled HFS Plus Volume. The volume name is test ** Checking extents overflow file. Unused node is not erased (node = 1) ** Checking catalog file. ** Checking multi-linked files. ** Checking catalog hierarchy. ** Checking extended attributes file. ** Checking volume bitmap. ** Checking volume information. Verify Status: VIStat = 0x0000, ABTStat = 0x0000 EBTStat = 0x0004 CBTStat = 0x0000 CatStat = 0x00000000 ** Repairing volume. ** Rechecking volume. ** Checking non-journaled HFS Plus Volume. The volume name is test ** Checking extents overflow file. ** Checking catalog file. ** Checking multi-linked files. ** Checking catalog hierarchy. ** Checking extended attributes file. ** Checking volume bitmap. ** Checking volume information. ** The volume test was repaired successfully.
It is possible to see that fsck.hfsplus detected not erased and unused node for the case of extents overflow file. The HFS+ logic has special method that defines if the node should be erased:
bool hfs_bnode_need_zeroout(struct hfs_btree *tree) { struct super_block *sb = tree->inode->i_sb; struct hfsplus_sb_info *sbi = HFSPLUS_SB(sb); const u32 volume_attr = be32_to_cpu(sbi->s_vhdr->attributes);
return tree->cnid == HFSPLUS_CAT_CNID && volume_attr & HFSPLUS_VOL_UNUSED_NODE_FIX; }
However, it is possible to see that this method works only for the case of catalog file. But debugging of the issue has shown that HFSPLUS_VOL_UNUSED_NODE_FIX attribute has been requested for the extents overflow file too:
catalog file kernel: hfsplus: node 4, num_recs 0, flags 0x10 kernel: hfsplus: tree->cnid 4, volume_attr 0x80000800
extents overflow file kernel: hfsplus: node 1, num_recs 0, flags 0x10 kernel: hfsplus: tree->cnid 3, volume_attr 0x80000800
This patch modifies the hfs_bnode_need_zeroout() by checking only volume_attr but not the b-tree ID because node zeroing can be requested for all HFS+ b-tree types.
sudo ./check generic/070 FSTYP -- hfsplus PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.18.0-rc3+ #79 SMP PREEMPT_DYNAMIC Fri Oct 31 16:07:42 PDT 2025 MKFS_OPTIONS -- /dev/loop51 MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch
generic/070 33s ... 34s Ran: generic/070 Passed all 1 tests
Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de cc: Yangtao Li frank.li@vivo.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20251101001229.247432-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfsplus/bnode.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/hfsplus/bnode.c b/fs/hfsplus/bnode.c index 63e652ad1e0de..edf7e27e1e375 100644 --- a/fs/hfsplus/bnode.c +++ b/fs/hfsplus/bnode.c @@ -704,6 +704,5 @@ bool hfs_bnode_need_zeroout(struct hfs_btree *tree) struct hfsplus_sb_info *sbi = HFSPLUS_SB(sb); const u32 volume_attr = be32_to_cpu(sbi->s_vhdr->attributes);
- return tree->cnid == HFSPLUS_CAT_CNID && - volume_attr & HFSPLUS_VOL_UNUSED_NODE_FIX; + return volume_attr & HFSPLUS_VOL_UNUSED_NODE_FIX; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yang Chenzhi yang.chenzhi@vivo.com
[ Upstream commit 152af114287851583cf7e0abc10129941f19466a ]
When sync() and link() are called concurrently, both threads may enter hfs_bnode_find() without finding the node in the hash table and proceed to create it.
Thread A: hfsplus_write_inode() -> hfsplus_write_system_inode() -> hfs_btree_write() -> hfs_bnode_find(tree, 0) -> __hfs_bnode_create(tree, 0)
Thread B: hfsplus_create_cat() -> hfs_brec_insert() -> hfs_bnode_split() -> hfs_bmap_alloc() -> hfs_bnode_find(tree, 0) -> __hfs_bnode_create(tree, 0)
In this case, thread A creates the bnode, sets refcnt=1, and hashes it. Thread B also tries to create the same bnode, notices it has already been inserted, drops its own instance, and uses the hashed one without getting the node.
```
node2 = hfs_bnode_findhash(tree, cnid); if (!node2) { <- Thread A hash = hfs_bnode_hash(cnid); node->next_hash = tree->node_hash[hash]; tree->node_hash[hash] = node; tree->node_hash_cnt++; } else { <- Thread B spin_unlock(&tree->hash_lock); kfree(node); wait_event(node2->lock_wq, !test_bit(HFS_BNODE_NEW, &node2->flags)); return node2; } ```
However, hfs_bnode_find() requires each call to take a reference. Here both threads end up setting refcnt=1. When they later put the node, this triggers:
BUG_ON(!atomic_read(&node->refcnt))
In this scenario, Thread B in fact finds the node in the hash table rather than creating a new one, and thus must take a reference.
Fix this by calling hfs_bnode_get() when reusing a bnode newly created by another thread to ensure the refcount is updated correctly.
A similar bug was fixed in HFS long ago in commit a9dc087fd3c4 ("fix missing hfs_bnode_get() in __hfs_bnode_create") but the same issue remained in HFS+ until now.
Reported-by: syzbot+005d2a9ecd9fbf525f6a@syzkaller.appspotmail.com Signed-off-by: Yang Chenzhi yang.chenzhi@vivo.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Link: https://lore.kernel.org/r/20250829093912.611853-1-yang.chenzhi@vivo.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfsplus/bnode.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/hfsplus/bnode.c b/fs/hfsplus/bnode.c index edf7e27e1e375..482a6c5faa197 100644 --- a/fs/hfsplus/bnode.c +++ b/fs/hfsplus/bnode.c @@ -481,6 +481,7 @@ static struct hfs_bnode *__hfs_bnode_create(struct hfs_btree *tree, u32 cnid) tree->node_hash[hash] = node; tree->node_hash_cnt++; } else { + hfs_bnode_get(node2); spin_unlock(&tree->hash_lock); kfree(node); wait_event(node2->lock_wq,
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit 005d4b0d33f6b4a23d382b7930f7a96b95b01f39 ]
syzbot is reporting that S_IFMT bits of inode->i_mode can become bogus when the S_IFMT bits of the 16bits "mode" field loaded from disk are corrupted.
According to [1], the permissions field was treated as reserved in Mac OS 8 and 9. According to [2], the reserved field was explicitly initialized with 0, and that field must remain 0 as long as reserved. Therefore, when the "mode" field is not 0 (i.e. no longer reserved), the file must be S_IFDIR if dir == 1, and the file must be one of S_IFREG/S_IFLNK/S_IFCHR/ S_IFBLK/S_IFIFO/S_IFSOCK if dir == 0.
Reported-by: syzbot syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d Link: https://developer.apple.com/library/archive/technotes/tn/tn1150.html#HFSPlus... [1] Link: https://developer.apple.com/library/archive/technotes/tn/tn1150.html#Reserve... [2] Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Reviewed-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Link: https://lore.kernel.org/r/04ded9f9-73fb-496c-bfa5-89c4f5d1d7bb@I-love.SAKURA... Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfsplus/inode.c | 32 ++++++++++++++++++++++++++++---- 1 file changed, 28 insertions(+), 4 deletions(-)
diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c index b51a411ecd237..e290e417ed3a7 100644 --- a/fs/hfsplus/inode.c +++ b/fs/hfsplus/inode.c @@ -180,13 +180,29 @@ const struct dentry_operations hfsplus_dentry_operations = { .d_compare = hfsplus_compare_dentry, };
-static void hfsplus_get_perms(struct inode *inode, - struct hfsplus_perm *perms, int dir) +static int hfsplus_get_perms(struct inode *inode, + struct hfsplus_perm *perms, int dir) { struct hfsplus_sb_info *sbi = HFSPLUS_SB(inode->i_sb); u16 mode;
mode = be16_to_cpu(perms->mode); + if (dir) { + if (mode && !S_ISDIR(mode)) + goto bad_type; + } else if (mode) { + switch (mode & S_IFMT) { + case S_IFREG: + case S_IFLNK: + case S_IFCHR: + case S_IFBLK: + case S_IFIFO: + case S_IFSOCK: + break; + default: + goto bad_type; + } + }
i_uid_write(inode, be32_to_cpu(perms->owner)); if ((test_bit(HFSPLUS_SB_UID, &sbi->flags)) || (!i_uid_read(inode) && !mode)) @@ -212,6 +228,10 @@ static void hfsplus_get_perms(struct inode *inode, inode->i_flags |= S_APPEND; else inode->i_flags &= ~S_APPEND; + return 0; +bad_type: + pr_err("invalid file type 0%04o for inode %lu\n", mode, inode->i_ino); + return -EIO; }
static int hfsplus_file_open(struct inode *inode, struct file *file) @@ -516,7 +536,9 @@ int hfsplus_cat_read_inode(struct inode *inode, struct hfs_find_data *fd) } hfs_bnode_read(fd->bnode, &entry, fd->entryoffset, sizeof(struct hfsplus_cat_folder)); - hfsplus_get_perms(inode, &folder->permissions, 1); + res = hfsplus_get_perms(inode, &folder->permissions, 1); + if (res) + goto out; set_nlink(inode, 1); inode->i_size = 2 + be32_to_cpu(folder->valence); inode_set_atime_to_ts(inode, hfsp_mt2ut(folder->access_date)); @@ -545,7 +567,9 @@ int hfsplus_cat_read_inode(struct inode *inode, struct hfs_find_data *fd)
hfsplus_inode_read_fork(inode, HFSPLUS_IS_RSRC(inode) ? &file->rsrc_fork : &file->data_fork); - hfsplus_get_perms(inode, &file->permissions, 0); + res = hfsplus_get_perms(inode, &file->permissions, 0); + if (res) + goto out; set_nlink(inode, 1); if (S_ISREG(inode->i_mode)) { if (file->permissions.dev)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viacheslav Dubeyko slava@dubeyko.com
[ Upstream commit 24e17a29cf7537f0947f26a50f85319abd723c6c ]
The xfstests' test-case generic/073 leaves HFS+ volume in corrupted state:
sudo ./check generic/073 FSTYP -- hfsplus PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.17.0-rc1+ #4 SMP PREEMPT_DYNAMIC Wed Oct 1 15:02:44 PDT 2025 MKFS_OPTIONS -- /dev/loop51 MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch
generic/073 _check_generic_filesystem: filesystem on /dev/loop51 is inconsistent (see XFSTESTS-2/xfstests-dev/results//generic/073.full for details)
Ran: generic/073 Failures: generic/073 Failed 1 of 1 tests
sudo fsck.hfsplus -d /dev/loop51 ** /dev/loop51 Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K. Executing fsck_hfs (version 540.1-Linux). ** Checking non-journaled HFS Plus Volume. The volume name is untitled ** Checking extents overflow file. ** Checking catalog file. ** Checking multi-linked files. ** Checking catalog hierarchy. Invalid directory item count (It should be 1 instead of 0) ** Checking extended attributes file. ** Checking volume bitmap. ** Checking volume information. Verify Status: VIStat = 0x0000, ABTStat = 0x0000 EBTStat = 0x0000 CBTStat = 0x0000 CatStat = 0x00004000 ** Repairing volume. ** Rechecking volume. ** Checking non-journaled HFS Plus Volume. The volume name is untitled ** Checking extents overflow file. ** Checking catalog file. ** Checking multi-linked files. ** Checking catalog hierarchy. ** Checking extended attributes file. ** Checking volume bitmap. ** Checking volume information. ** The volume untitled was repaired successfully.
The test is doing these steps on final phase:
mv $SCRATCH_MNT/testdir_1/bar $SCRATCH_MNT/testdir_2/bar $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir_1 $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo
So, we move file bar from testdir_1 into testdir_2 folder. It means that HFS+ logic decrements the number of entries in testdir_1 and increments number of entries in testdir_2. Finally, we do fsync only for testdir_1 and foo but not for testdir_2. As a result, this is the reason why fsck.hfsplus detects the volume corruption afterwards.
This patch fixes the issue by means of adding the hfsplus_cat_write_inode() call for old_dir and new_dir in hfsplus_rename() after the successful ending of hfsplus_rename_cat(). This method makes modification of in-core inode objects for old_dir and new_dir but it doesn't save these modifications in Catalog File's entries. It was expected that hfsplus_write_inode() will save these modifications afterwards. However, because generic/073 does fsync only for testdir_1 and foo then testdir_2 modification hasn't beed saved into Catalog File's entry and it was flushed without this modification. And it was detected by fsck.hfsplus. Now, hfsplus_rename() stores in Catalog File all modified entries and correct state of Catalog File will be flushed during hfsplus_file_fsync() call. Finally, it makes fsck.hfsplus happy.
sudo ./check generic/073 FSTYP -- hfsplus PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.18.0-rc3+ #93 SMP PREEMPT_DYNAMIC Wed Nov 12 14:37:49 PST 2025 MKFS_OPTIONS -- /dev/loop51 MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch
generic/073 32s ... 32s Ran: generic/073 Passed all 1 tests
Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de cc: Yangtao Li frank.li@vivo.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20251112232522.814038-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfsplus/dir.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/hfsplus/dir.c b/fs/hfsplus/dir.c index 1b3e27a0d5e03..cadf0b5f93422 100644 --- a/fs/hfsplus/dir.c +++ b/fs/hfsplus/dir.c @@ -552,8 +552,13 @@ static int hfsplus_rename(struct mnt_idmap *idmap, res = hfsplus_rename_cat((u32)(unsigned long)old_dentry->d_fsdata, old_dir, &old_dentry->d_name, new_dir, &new_dentry->d_name); - if (!res) + if (!res) { new_dentry->d_fsdata = old_dentry->d_fsdata; + + res = hfsplus_cat_write_inode(old_dir); + if (!res) + res = hfsplus_cat_write_inode(new_dir); + } return res; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konstantin Komarov almaz.alexandrovich@paragon-software.com
[ Upstream commit 1b2ae190ea43bebb8c73d21f076addc8a8c71849 ]
Ensure fsync() returns -EIO when the ntfs3 filesystem is in forced shutdown, instead of silently succeeding via generic_file_fsync().
Signed-off-by: Konstantin Komarov almaz.alexandrovich@paragon-software.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ntfs3/file.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c index 4c90ec2fa2eae..83f0072f0896c 100644 --- a/fs/ntfs3/file.c +++ b/fs/ntfs3/file.c @@ -1375,6 +1375,18 @@ static ssize_t ntfs_file_splice_write(struct pipe_inode_info *pipe, return iter_file_splice_write(pipe, file, ppos, len, flags); }
+/* + * ntfs_file_fsync - file_operations::fsync + */ +static int ntfs_file_fsync(struct file *file, loff_t start, loff_t end, int datasync) +{ + struct inode *inode = file_inode(file); + if (unlikely(ntfs3_forced_shutdown(inode->i_sb))) + return -EIO; + + return generic_file_fsync(file, start, end, datasync); +} + // clang-format off const struct inode_operations ntfs_file_inode_operations = { .getattr = ntfs_getattr, @@ -1397,7 +1409,7 @@ const struct file_operations ntfs_file_operations = { .splice_write = ntfs_file_splice_write, .mmap_prepare = ntfs_file_mmap_prepare, .open = ntfs_file_open, - .fsync = generic_file_fsync, + .fsync = ntfs_file_fsync, .fallocate = ntfs_fallocate, .release = ntfs_file_release, };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bitterblue Smith rtl8821cerfe2@gmail.com
[ Upstream commit 5511ba3de434892e5ef3594d6eabbd12b1629356 ]
Flip the response rate subchannel. It was backwards, causing low speeds when using 40 MHz channel width. "iw dev ... station dump" showed a low RX rate, 11M or less.
Also fix the channel width field of RF6052_REG_MODE_AG.
Tested only with RTL8192CU, but these settings are identical for RTL8723AU.
Signed-off-by: Bitterblue Smith rtl8821cerfe2@gmail.com Reviewed-by: Ping-Ke Shih pkshih@realtek.com Signed-off-by: Ping-Ke Shih pkshih@realtek.com Link: https://patch.msgid.link/1f46571d-855b-43e1-8bfc-abacceb96043@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/realtek/rtl8xxxu/core.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtl8xxxu/core.c b/drivers/net/wireless/realtek/rtl8xxxu/core.c index be39463bd6c46..3e87c571e2419 100644 --- a/drivers/net/wireless/realtek/rtl8xxxu/core.c +++ b/drivers/net/wireless/realtek/rtl8xxxu/core.c @@ -1252,7 +1252,7 @@ void rtl8xxxu_gen1_config_channel(struct ieee80211_hw *hw) opmode &= ~BW_OPMODE_20MHZ; rtl8xxxu_write8(priv, REG_BW_OPMODE, opmode); rsr &= ~RSR_RSC_BANDWIDTH_40M; - if (sec_ch_above) + if (!sec_ch_above) rsr |= RSR_RSC_UPPER_SUB_CHANNEL; else rsr |= RSR_RSC_LOWER_SUB_CHANNEL; @@ -1321,9 +1321,8 @@ void rtl8xxxu_gen1_config_channel(struct ieee80211_hw *hw)
for (i = RF_A; i < priv->rf_paths; i++) { val32 = rtl8xxxu_read_rfreg(priv, i, RF6052_REG_MODE_AG); - if (hw->conf.chandef.width == NL80211_CHAN_WIDTH_40) - val32 &= ~MODE_AG_CHANNEL_20MHZ; - else + val32 &= ~MODE_AG_BW_MASK; + if (hw->conf.chandef.width != NL80211_CHAN_WIDTH_40) val32 |= MODE_AG_CHANNEL_20MHZ; rtl8xxxu_write_rfreg(priv, i, RF6052_REG_MODE_AG, val32); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 9f33477b9a31a1edfe2df9f1a0359cccb0e16b4c ]
If an interface is set down or, per the previous patch, changes type, radar detection for it should be cancelled. This is done for AP mode in mac80211 (somewhat needlessly, since cfg80211 can do it, but didn't until now), but wasn't handled for mesh, so if radar detection was started and then the interface set down or its type switched (the latter sometimes happning in the hwsim test 'mesh_peer_connected_dfs'), radar detection would be around with the interface unknown to the driver, later leading to some warnings around chanctx usage.
Link: https://patch.msgid.link/20251121174021.290120e419e3.I2a5650c9062e29c988992d... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/core.c | 1 + net/wireless/core.h | 1 + net/wireless/mlme.c | 19 +++++++++++++++++++ 3 files changed, 21 insertions(+)
diff --git a/net/wireless/core.c b/net/wireless/core.c index 54a34d8d356e0..5e5c1bc380a89 100644 --- a/net/wireless/core.c +++ b/net/wireless/core.c @@ -1365,6 +1365,7 @@ void cfg80211_leave(struct cfg80211_registered_device *rdev,
cfg80211_pmsr_wdev_down(wdev);
+ cfg80211_stop_radar_detection(wdev); cfg80211_stop_background_radar_detection(wdev);
switch (wdev->iftype) { diff --git a/net/wireless/core.h b/net/wireless/core.h index b6bd7f4d6385a..d5d78752227af 100644 --- a/net/wireless/core.h +++ b/net/wireless/core.h @@ -489,6 +489,7 @@ cfg80211_start_background_radar_detection(struct cfg80211_registered_device *rde struct wireless_dev *wdev, struct cfg80211_chan_def *chandef);
+void cfg80211_stop_radar_detection(struct wireless_dev *wdev); void cfg80211_stop_background_radar_detection(struct wireless_dev *wdev);
void cfg80211_background_cac_done_wk(struct work_struct *work); diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c index 46394eb2086f6..3fc175f9f8686 100644 --- a/net/wireless/mlme.c +++ b/net/wireless/mlme.c @@ -1295,6 +1295,25 @@ cfg80211_start_background_radar_detection(struct cfg80211_registered_device *rde return 0; }
+void cfg80211_stop_radar_detection(struct wireless_dev *wdev) +{ + struct wiphy *wiphy = wdev->wiphy; + struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); + int link_id; + + for_each_valid_link(wdev, link_id) { + struct cfg80211_chan_def chandef; + + if (!wdev->links[link_id].cac_started) + continue; + + chandef = *wdev_chandef(wdev, link_id); + rdev_end_cac(rdev, wdev->netdev, link_id); + nl80211_radar_notify(rdev, &chandef, NL80211_RADAR_CAC_ABORTED, + wdev->netdev, GFP_KERNEL); + } +} + void cfg80211_stop_background_radar_detection(struct wireless_dev *wdev) { struct wiphy *wiphy = wdev->wiphy;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 7a27b73943a70ee226fa125327101fb18e94701d ]
When changing the interface type, all activity on the interface has to be stopped first. This was done independent of existing code in cfg80211_leave(), so didn't handle e.g. background radar detection. Use cfg80211_leave() to handle it the same way.
Note that cfg80211_leave() behaves slightly differently for IBSS in wireless extensions, it won't send an event in that case. We could handle that, but since nl80211 was used to change the type, IBSS is rare, and wext is already a corner case, it doesn't seem worth it.
Link: https://patch.msgid.link/20251121174021.922ef48ce007.I970c8514252ef8a864a7fb... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/util.c | 23 +---------------------- 1 file changed, 1 insertion(+), 22 deletions(-)
diff --git a/net/wireless/util.c b/net/wireless/util.c index 56724b33af045..4eb028ad16836 100644 --- a/net/wireless/util.c +++ b/net/wireless/util.c @@ -1203,28 +1203,7 @@ int cfg80211_change_iface(struct cfg80211_registered_device *rdev, dev->ieee80211_ptr->use_4addr = false; rdev_set_qos_map(rdev, dev, NULL);
- switch (otype) { - case NL80211_IFTYPE_AP: - case NL80211_IFTYPE_P2P_GO: - cfg80211_stop_ap(rdev, dev, -1, true); - break; - case NL80211_IFTYPE_ADHOC: - cfg80211_leave_ibss(rdev, dev, false); - break; - case NL80211_IFTYPE_STATION: - case NL80211_IFTYPE_P2P_CLIENT: - cfg80211_disconnect(rdev, dev, - WLAN_REASON_DEAUTH_LEAVING, true); - break; - case NL80211_IFTYPE_MESH_POINT: - /* mesh should be handled? */ - break; - case NL80211_IFTYPE_OCB: - cfg80211_leave_ocb(rdev, dev); - break; - default: - break; - } + cfg80211_leave(rdev, dev->ieee80211_ptr);
cfg80211_process_rdev_events(rdev); cfg80211_mlme_purge_registrations(dev->ieee80211_ptr);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Quan Zhou quan.zhou@mediatek.com
[ Upstream commit 066f417be5fd8c7fe581c5550206364735dad7a3 ]
Set the MT76_STATE_MCU_RUNNING bit only after mt7921_load_clc() has successfully completed. Previously, the MCU_RUNNING state was set before loading CLC, which could cause conflict between chip mcu_init retry and mac_reset flow, result in chip init fail and chip abnormal status. By moving the state set after CLC load, firmware initialization becomes robust and resolves init fail issue.
Signed-off-by: Quan Zhou quan.zhou@mediatek.com Reviewed-by: druth@chromium.org Link: https://patch.msgid.link/19ec8e4465142e774f17801025accd0ae2214092.1763465933... Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/mt7921/mcu.c | 2 +- drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c index 86bd33b916a9d..edc1df3c071e5 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c @@ -646,10 +646,10 @@ int mt7921_run_firmware(struct mt792x_dev *dev) if (err) return err;
- set_bit(MT76_STATE_MCU_RUNNING, &dev->mphy.state); err = mt7921_load_clc(dev, mt792x_ram_name(dev)); if (err) return err; + set_bit(MT76_STATE_MCU_RUNNING, &dev->mphy.state);
return mt7921_mcu_fw_log_2_host(dev, 1); } diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index 8eda407e4135e..c12b71b71cfc7 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1003,10 +1003,10 @@ int mt7925_run_firmware(struct mt792x_dev *dev) if (err) return err;
- set_bit(MT76_STATE_MCU_RUNNING, &dev->mphy.state); err = mt7925_load_clc(dev, mt792x_ram_name(dev)); if (err) return err; + set_bit(MT76_STATE_MCU_RUNNING, &dev->mphy.state);
return mt7925_mcu_fw_log_2_host(dev, 1); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hansg@kernel.org
[ Upstream commit a8e5a110c0c38e08e5dd66356cd1156e91cf88e1 ]
The Acer A1 840 tablet contains quite generic names in the sys_vendor and product_name DMI strings, without this patch brcmfmac will try to load: brcmfmac43340-sdio.Insyde-BayTrail.txt as nvram file which is a bit too generic.
Add a DMI quirk so that a unique and clearly identifiable nvram file name is used on the Acer A1 840 tablet.
Acked-by: Arend van Spriel arend.vanspriel@broadcom.com Signed-off-by: Hans de Goede hansg@kernel.org Link: https://patch.msgid.link/20251103100314.353826-1-hansg@kernel.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/wireless/broadcom/brcm80211/brcmfmac/dmi.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/dmi.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/dmi.c index c3a602197662b..abe7f6501e5ed 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/dmi.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/dmi.c @@ -24,6 +24,10 @@ static const struct brcmf_dmi_data acepc_t8_data = { BRCM_CC_4345_CHIP_ID, 6, "acepc-t8" };
+static const struct brcmf_dmi_data acer_a1_840_data = { + BRCM_CC_43340_CHIP_ID, 2, "acer-a1-840" +}; + /* The Chuwi Hi8 Pro uses the same Ampak AP6212 module as the Chuwi Vi8 Plus * and the nvram for the Vi8 Plus is already in linux-firmware, so use that. */ @@ -91,6 +95,16 @@ static const struct dmi_system_id dmi_platform_data[] = { }, .driver_data = (void *)&acepc_t8_data, }, + { + /* Acer Iconia One 8 A1-840 (non FHD version) */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Insyde"), + DMI_MATCH(DMI_PRODUCT_NAME, "BayTrail"), + /* Above strings are too generic also match BIOS date */ + DMI_MATCH(DMI_BIOS_DATE, "04/01/2014"), + }, + .driver_data = (void *)&acer_a1_840_data, + }, { /* Chuwi Hi8 Pro with D2D3_Hi8Pro.233 BIOS */ .matches = {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
[ Upstream commit 54df8b80cc63aa0f22c4590cad11542731ed43ff ]
[BUG] When a scrub failed immediately without any byte scrubbed, the returned btrfs_scrub_progress::last_physical will always be 0, even if there is a non-zero @start passed into btrfs_scrub_dev() for resume cases.
This will reset the progress and make later scrub resume start from the beginning.
[CAUSE] The function btrfs_scrub_dev() accepts a @progress parameter to copy its updated progress to the caller, there are cases where we either don't touch progress::last_physical at all or copy 0 into last_physical:
- last_physical not updated at all If some error happened before scrubbing any super block or chunk, we will not copy the progress, leaving the @last_physical untouched.
E.g. failed to allocate @sctx, scrubbing a missing device or even there is already a running scrub and so on.
All those cases won't touch @progress at all, resulting the last_physical untouched and will be left as 0 for most cases.
- Error out before scrubbing any bytes In those case we allocated @sctx, and sctx->stat.last_physical is all zero (initialized by kvzalloc()). Unfortunately some critical errors happened during scrub_enumerate_chunks() or scrub_supers() before any stripe is really scrubbed.
In that case although we will copy sctx->stat back to @progress, since no byte is really scrubbed, last_physical will be overwritten to 0.
[FIX] Make sure the parameter @progress always has its @last_physical member updated to @start parameter inside btrfs_scrub_dev().
At the very beginning of the function, set @progress->last_physical to @start, so that even if we error out without doing progress copying, last_physical is still at @start.
Then after we got @sctx allocated, set sctx->stat.last_physical to @start, this will make sure even if we didn't get any byte scrubbed, at the progress copying stage the @last_physical is not left as zero.
This should resolve the resume progress reset problem.
Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/scrub.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 65361af302343..b6a7ea105eb13 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -3039,6 +3039,10 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, unsigned int nofs_flag; bool need_commit = false;
+ /* Set the basic fallback @last_physical before we got a sctx. */ + if (progress) + progress->last_physical = start; + if (btrfs_fs_closing(fs_info)) return -EAGAIN;
@@ -3057,6 +3061,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, sctx = scrub_setup_ctx(fs_info, is_dev_replace); if (IS_ERR(sctx)) return PTR_ERR(sctx); + sctx->stat.last_physical = start;
ret = scrub_workers_get(fs_info); if (ret)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viacheslav Dubeyko slava@dubeyko.com
[ Upstream commit 3f04ee216bc1406cb6214ceaa7e544114108e0fa ]
The xfstests' test-case generic/101 leaves HFS+ volume in corrupted state:
sudo ./check generic/101 FSTYP -- hfsplus PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.17.0-rc1+ #4 SMP PREEMPT_DYNAMIC Wed Oct 1 15:02:44 PDT 2025 MKFS_OPTIONS -- /dev/loop51 MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch
generic/101 _check_generic_filesystem: filesystem on /dev/loop51 is inconsistent (see XFSTESTS-2/xfstests-dev/results//generic/101.full for details)
Ran: generic/101 Failures: generic/101 Failed 1 of 1 tests
sudo fsck.hfsplus -d /dev/loop51 ** /dev/loop51 Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K. Executing fsck_hfs (version 540.1-Linux). ** Checking non-journaled HFS Plus Volume. The volume name is untitled ** Checking extents overflow file. ** Checking catalog file. ** Checking multi-linked files. ** Checking catalog hierarchy. ** Checking extended attributes file. ** Checking volume bitmap. ** Checking volume information. Invalid volume free block count (It should be 2614350 instead of 2614382) Verify Status: VIStat = 0x8000, ABTStat = 0x0000 EBTStat = 0x0000 CBTStat = 0x0000 CatStat = 0x00000000 ** Repairing volume. ** Rechecking volume. ** Checking non-journaled HFS Plus Volume. The volume name is untitled ** Checking extents overflow file. ** Checking catalog file. ** Checking multi-linked files. ** Checking catalog hierarchy. ** Checking extended attributes file. ** Checking volume bitmap. ** Checking volume information. ** The volume untitled was repaired successfully.
This test executes such steps: "Test that if we truncate a file to a smaller size, then truncate it to its original size or a larger size, then fsyncing it and a power failure happens, the file will have the range [first_truncate_size, last_size[ with all bytes having a value of 0x00 if we read it the next time the filesystem is mounted.".
HFS+ keeps volume's free block count in the superblock. However, hfsplus_file_fsync() doesn't store superblock's content. As a result, superblock contains not correct value of free blocks if a power failure happens.
This patch adds functionality of saving superblock's content during hfsplus_file_fsync() call.
sudo ./check generic/101 FSTYP -- hfsplus PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.18.0-rc3+ #96 SMP PREEMPT_DYNAMIC Wed Nov 19 12:47:37 PST 2025 MKFS_OPTIONS -- /dev/loop51 MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch
generic/101 32s ... 30s Ran: generic/101 Passed all 1 tests
sudo fsck.hfsplus -d /dev/loop51 ** /dev/loop51 Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K. Executing fsck_hfs (version 540.1-Linux). ** Checking non-journaled HFS Plus Volume. The volume name is untitled ** Checking extents overflow file. ** Checking catalog file. ** Checking multi-linked files. ** Checking catalog hierarchy. ** Checking extended attributes file. ** Checking volume bitmap. ** Checking volume information. ** The volume untitled appears to be OK.
Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de cc: Yangtao Li frank.li@vivo.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20251119223219.1824434-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfsplus/hfsplus_fs.h | 2 + fs/hfsplus/inode.c | 9 +++++ fs/hfsplus/super.c | 87 +++++++++++++++++++++++++---------------- 3 files changed, 65 insertions(+), 33 deletions(-)
diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h index 89e8b19c127b0..de801942ae471 100644 --- a/fs/hfsplus/hfsplus_fs.h +++ b/fs/hfsplus/hfsplus_fs.h @@ -477,6 +477,8 @@ int hfs_part_find(struct super_block *sb, sector_t *part_start, /* super.c */ struct inode *hfsplus_iget(struct super_block *sb, unsigned long ino); void hfsplus_mark_mdb_dirty(struct super_block *sb); +void hfsplus_prepare_volume_header_for_commit(struct hfsplus_vh *vhdr); +int hfsplus_commit_superblock(struct super_block *sb);
/* tables.c */ extern u16 hfsplus_case_fold_table[]; diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c index e290e417ed3a7..7ae6745ca7ae1 100644 --- a/fs/hfsplus/inode.c +++ b/fs/hfsplus/inode.c @@ -325,6 +325,7 @@ int hfsplus_file_fsync(struct file *file, loff_t start, loff_t end, struct inode *inode = file->f_mapping->host; struct hfsplus_inode_info *hip = HFSPLUS_I(inode); struct hfsplus_sb_info *sbi = HFSPLUS_SB(inode->i_sb); + struct hfsplus_vh *vhdr = sbi->s_vhdr; int error = 0, error2;
error = file_write_and_wait_range(file, start, end); @@ -368,6 +369,14 @@ int hfsplus_file_fsync(struct file *file, loff_t start, loff_t end, error = error2; }
+ mutex_lock(&sbi->vh_mutex); + hfsplus_prepare_volume_header_for_commit(vhdr); + mutex_unlock(&sbi->vh_mutex); + + error2 = hfsplus_commit_superblock(inode->i_sb); + if (!error) + error = error2; + if (!test_bit(HFSPLUS_SB_NOBARRIER, &sbi->flags)) blkdev_issue_flush(inode->i_sb->s_bdev);
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c index 16bc4abc67e08..67a7a2a093476 100644 --- a/fs/hfsplus/super.c +++ b/fs/hfsplus/super.c @@ -187,40 +187,15 @@ static void hfsplus_evict_inode(struct inode *inode) } }
-static int hfsplus_sync_fs(struct super_block *sb, int wait) +int hfsplus_commit_superblock(struct super_block *sb) { struct hfsplus_sb_info *sbi = HFSPLUS_SB(sb); struct hfsplus_vh *vhdr = sbi->s_vhdr; int write_backup = 0; - int error, error2; - - if (!wait) - return 0; + int error = 0, error2;
hfs_dbg("starting...\n");
- /* - * Explicitly write out the special metadata inodes. - * - * While these special inodes are marked as hashed and written - * out peridocically by the flusher threads we redirty them - * during writeout of normal inodes, and thus the life lock - * prevents us from getting the latest state to disk. - */ - error = filemap_write_and_wait(sbi->cat_tree->inode->i_mapping); - error2 = filemap_write_and_wait(sbi->ext_tree->inode->i_mapping); - if (!error) - error = error2; - if (sbi->attr_tree) { - error2 = - filemap_write_and_wait(sbi->attr_tree->inode->i_mapping); - if (!error) - error = error2; - } - error2 = filemap_write_and_wait(sbi->alloc_file->i_mapping); - if (!error) - error = error2; - mutex_lock(&sbi->vh_mutex); mutex_lock(&sbi->alloc_mutex); vhdr->free_blocks = cpu_to_be32(sbi->free_blocks); @@ -249,11 +224,52 @@ static int hfsplus_sync_fs(struct super_block *sb, int wait) sbi->part_start + sbi->sect_count - 2, sbi->s_backup_vhdr_buf, NULL, REQ_OP_WRITE); if (!error) - error2 = error; + error = error2; out: mutex_unlock(&sbi->alloc_mutex); mutex_unlock(&sbi->vh_mutex);
+ hfs_dbg("finished: err %d\n", error); + + return error; +} + +static int hfsplus_sync_fs(struct super_block *sb, int wait) +{ + struct hfsplus_sb_info *sbi = HFSPLUS_SB(sb); + int error, error2; + + if (!wait) + return 0; + + hfs_dbg("starting...\n"); + + /* + * Explicitly write out the special metadata inodes. + * + * While these special inodes are marked as hashed and written + * out peridocically by the flusher threads we redirty them + * during writeout of normal inodes, and thus the life lock + * prevents us from getting the latest state to disk. + */ + error = filemap_write_and_wait(sbi->cat_tree->inode->i_mapping); + error2 = filemap_write_and_wait(sbi->ext_tree->inode->i_mapping); + if (!error) + error = error2; + if (sbi->attr_tree) { + error2 = + filemap_write_and_wait(sbi->attr_tree->inode->i_mapping); + if (!error) + error = error2; + } + error2 = filemap_write_and_wait(sbi->alloc_file->i_mapping); + if (!error) + error = error2; + + error2 = hfsplus_commit_superblock(sb); + if (!error) + error = error2; + if (!test_bit(HFSPLUS_SB_NOBARRIER, &sbi->flags)) blkdev_issue_flush(sb->s_bdev);
@@ -395,6 +411,15 @@ static const struct super_operations hfsplus_sops = { .show_options = hfsplus_show_options, };
+void hfsplus_prepare_volume_header_for_commit(struct hfsplus_vh *vhdr) +{ + vhdr->last_mount_vers = cpu_to_be32(HFSP_MOUNT_VERSION); + vhdr->modify_date = hfsp_now2mt(); + be32_add_cpu(&vhdr->write_count, 1); + vhdr->attributes &= cpu_to_be32(~HFSPLUS_VOL_UNMNT); + vhdr->attributes |= cpu_to_be32(HFSPLUS_VOL_INCNSTNT); +} + static int hfsplus_fill_super(struct super_block *sb, struct fs_context *fc) { struct hfsplus_vh *vhdr; @@ -562,11 +587,7 @@ static int hfsplus_fill_super(struct super_block *sb, struct fs_context *fc) * H+LX == hfsplusutils, H+Lx == this driver, H+lx is unused * all three are registered with Apple for our use */ - vhdr->last_mount_vers = cpu_to_be32(HFSP_MOUNT_VERSION); - vhdr->modify_date = hfsp_now2mt(); - be32_add_cpu(&vhdr->write_count, 1); - vhdr->attributes &= cpu_to_be32(~HFSPLUS_VOL_UNMNT); - vhdr->attributes |= cpu_to_be32(HFSPLUS_VOL_INCNSTNT); + hfsplus_prepare_volume_header_for_commit(vhdr); hfsplus_sync_fs(sb, 1);
if (!sbi->hidden_dir) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andreas Gruenbacher agruenba@redhat.com
[ Upstream commit 64c10ed9274bc46416f502afea48b4ae11279669 ]
When a node tries to delete an inode, it first requests exclusive access to the iopen glock. This triggers demote requests on all remote nodes currently holding the iopen glock. To satisfy those requests, the remote nodes evict the inode in question, or they poke the corresponding inode glock to signal that the inode is still in active use.
This behavior doesn't depend on whether or not a filesystem is read-only, so remove the incorrect read-only check.
Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/glops.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c index 0c0a80b3bacab..0c68ab4432b08 100644 --- a/fs/gfs2/glops.c +++ b/fs/gfs2/glops.c @@ -630,8 +630,7 @@ static void iopen_go_callback(struct gfs2_glock *gl, bool remote) struct gfs2_inode *ip = gl->gl_object; struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
- if (!remote || sb_rdonly(sdp->sd_vfs) || - test_bit(SDF_KILL, &sdp->sd_flags)) + if (!remote || test_bit(SDF_KILL, &sdp->sd_flags)) return;
if (gl->gl_demote_state == LM_ST_UNLOCKED &&
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andreas Gruenbacher agruenba@redhat.com
[ Upstream commit dff1fb6d8b7abe5b1119fa060f5d6b3370bf10ac ]
Commit e4a8b5481c59a ("gfs2: Switch to wait_event in gfs2_quotad") broke cyclic statfs syncing, so the numbers reported by "df" could easily get completely out of sync with reality. Fix this by reverting part of commit e4a8b5481c59a for now.
A follow-up commit will clean this code up later.
Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/quota.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c index 2298e06797ac3..f2df01f801b81 100644 --- a/fs/gfs2/quota.c +++ b/fs/gfs2/quota.c @@ -1616,7 +1616,7 @@ int gfs2_quotad(void *data)
t = min(quotad_timeo, statfs_timeo);
- t = wait_event_freezable_timeout(sdp->sd_quota_wait, + t -= wait_event_freezable_timeout(sdp->sd_quota_wait, sdp->sd_statfs_force_sync || gfs2_withdrawing_or_withdrawn(sdp) || kthread_should_stop(),
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: ChenXiaoSong chenxiaosong@kylinos.cn
[ Upstream commit 269df046c1e15ab34fa26fd90db9381f022a0963 ]
__process_request() will not print error messages if smb2_ioctl() always returns 0.
Fix this by returning the correct value at the end of function.
Signed-off-by: ChenXiaoSong chenxiaosong@kylinos.cn Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/smb2pdu.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index f901ae18e68ad..a2830ec67e782 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -8164,7 +8164,7 @@ int smb2_ioctl(struct ksmbd_work *work) id = req->VolatileFileId;
if (req->Flags != cpu_to_le32(SMB2_0_IOCTL_IS_FSCTL)) { - rsp->hdr.Status = STATUS_NOT_SUPPORTED; + ret = -EOPNOTSUPP; goto out; }
@@ -8184,8 +8184,9 @@ int smb2_ioctl(struct ksmbd_work *work) case FSCTL_DFS_GET_REFERRALS: case FSCTL_DFS_GET_REFERRALS_EX: /* Not support DFS yet */ + ret = -EOPNOTSUPP; rsp->hdr.Status = STATUS_FS_DRIVER_REQUIRED; - goto out; + goto out2; case FSCTL_CREATE_OR_GET_OBJECT_ID: { struct file_object_buf_type1_ioctl_rsp *obj_buf; @@ -8475,8 +8476,10 @@ int smb2_ioctl(struct ksmbd_work *work) rsp->hdr.Status = STATUS_BUFFER_TOO_SMALL; else if (ret < 0 || rsp->hdr.Status == 0) rsp->hdr.Status = STATUS_INVALID_PARAMETER; + +out2: smb2_set_err_rsp(work); - return 0; + return ret; }
/**
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
[ Upstream commit b39a1833cc4a2755b02603eec3a71a85e9dff926 ]
Under high concurrency, A tree-connection object (tcon) is freed on a disconnect path while another path still holds a reference and later executes *_put()/write on it.
Reported-by: Qianchang Zhao pioooooooooip@gmail.com Reported-by: Zhitong Liu liuzhitong1993@gmail.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/mgmt/tree_connect.c | 18 ++++-------------- fs/smb/server/mgmt/tree_connect.h | 1 - fs/smb/server/smb2pdu.c | 3 --- 3 files changed, 4 insertions(+), 18 deletions(-)
diff --git a/fs/smb/server/mgmt/tree_connect.c b/fs/smb/server/mgmt/tree_connect.c index ecfc575086712..d3483d9c757c7 100644 --- a/fs/smb/server/mgmt/tree_connect.c +++ b/fs/smb/server/mgmt/tree_connect.c @@ -78,7 +78,6 @@ ksmbd_tree_conn_connect(struct ksmbd_work *work, const char *share_name) tree_conn->t_state = TREE_NEW; status.tree_conn = tree_conn; atomic_set(&tree_conn->refcount, 1); - init_waitqueue_head(&tree_conn->refcount_q);
ret = xa_err(xa_store(&sess->tree_conns, tree_conn->id, tree_conn, KSMBD_DEFAULT_GFP)); @@ -100,14 +99,8 @@ ksmbd_tree_conn_connect(struct ksmbd_work *work, const char *share_name)
void ksmbd_tree_connect_put(struct ksmbd_tree_connect *tcon) { - /* - * Checking waitqueue to releasing tree connect on - * tree disconnect. waitqueue_active is safe because it - * uses atomic operation for condition. - */ - if (!atomic_dec_return(&tcon->refcount) && - waitqueue_active(&tcon->refcount_q)) - wake_up(&tcon->refcount_q); + if (atomic_dec_and_test(&tcon->refcount)) + kfree(tcon); }
int ksmbd_tree_conn_disconnect(struct ksmbd_session *sess, @@ -119,14 +112,11 @@ int ksmbd_tree_conn_disconnect(struct ksmbd_session *sess, xa_erase(&sess->tree_conns, tree_conn->id); write_unlock(&sess->tree_conns_lock);
- if (!atomic_dec_and_test(&tree_conn->refcount)) - wait_event(tree_conn->refcount_q, - atomic_read(&tree_conn->refcount) == 0); - ret = ksmbd_ipc_tree_disconnect_request(sess->id, tree_conn->id); ksmbd_release_tree_conn_id(sess, tree_conn->id); ksmbd_share_config_put(tree_conn->share_conf); - kfree(tree_conn); + if (atomic_dec_and_test(&tree_conn->refcount)) + kfree(tree_conn); return ret; }
diff --git a/fs/smb/server/mgmt/tree_connect.h b/fs/smb/server/mgmt/tree_connect.h index a42cdd0510411..f0023d86716f2 100644 --- a/fs/smb/server/mgmt/tree_connect.h +++ b/fs/smb/server/mgmt/tree_connect.h @@ -33,7 +33,6 @@ struct ksmbd_tree_connect { int maximal_access; bool posix_extensions; atomic_t refcount; - wait_queue_head_t refcount_q; unsigned int t_state; };
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index a2830ec67e782..dba29881debdc 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -2200,7 +2200,6 @@ int smb2_tree_disconnect(struct ksmbd_work *work) goto err_out; }
- WARN_ON_ONCE(atomic_dec_and_test(&tcon->refcount)); tcon->t_state = TREE_DISCONNECTED; write_unlock(&sess->tree_conns_lock);
@@ -2210,8 +2209,6 @@ int smb2_tree_disconnect(struct ksmbd_work *work) goto err_out; }
- work->tcon = NULL; - rsp->StructureSize = cpu_to_le16(4); err = ksmbd_iov_pin_rsp(work, rsp, sizeof(struct smb2_tree_disconnect_rsp));
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qianchang Zhao pioooooooooip@gmail.com
[ Upstream commit 991f8a79db99b14c48d20d2052c82d65b9186cad ]
ksmbd maintains delete-on-close and pending-delete state in ksmbd_inode->m_flags. In vfs_cache.c this field is accessed under inconsistent locking: some paths read and modify m_flags under ci->m_lock while others do so without taking the lock at all.
Examples:
- ksmbd_query_inode_status() and __ksmbd_inode_close() use ci->m_lock when checking or updating m_flags. - ksmbd_inode_pending_delete(), ksmbd_set_inode_pending_delete(), ksmbd_clear_inode_pending_delete() and ksmbd_fd_set_delete_on_close() used to read and modify m_flags without ci->m_lock.
This creates a potential data race on m_flags when multiple threads open, close and delete the same file concurrently. In the worst case delete-on-close and pending-delete bits can be lost or observed in an inconsistent state, leading to confusing delete semantics (files that stay on disk after delete-on-close, or files that disappear while still in use).
Fix it by:
- Making ksmbd_query_inode_status() look at m_flags under ci->m_lock after dropping inode_hash_lock. - Adding ci->m_lock protection to all helpers that read or modify m_flags (ksmbd_inode_pending_delete(), ksmbd_set_inode_pending_delete(), ksmbd_clear_inode_pending_delete(), ksmbd_fd_set_delete_on_close()). - Keeping the existing ci->m_lock protection in __ksmbd_inode_close(), and moving the actual unlink/xattr removal outside the lock.
This unifies the locking around m_flags and removes the data race while preserving the existing delete-on-close behaviour.
Reported-by: Qianchang Zhao pioooooooooip@gmail.com Reported-by: Zhitong Liu liuzhitong1993@gmail.com Signed-off-by: Qianchang Zhao pioooooooooip@gmail.com Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/vfs_cache.c | 88 +++++++++++++++++++++++++++------------ 1 file changed, 62 insertions(+), 26 deletions(-)
diff --git a/fs/smb/server/vfs_cache.c b/fs/smb/server/vfs_cache.c index dfed6fce89049..6ef116585af64 100644 --- a/fs/smb/server/vfs_cache.c +++ b/fs/smb/server/vfs_cache.c @@ -112,40 +112,62 @@ int ksmbd_query_inode_status(struct dentry *dentry)
read_lock(&inode_hash_lock); ci = __ksmbd_inode_lookup(dentry); - if (ci) { - ret = KSMBD_INODE_STATUS_OK; - if (ci->m_flags & (S_DEL_PENDING | S_DEL_ON_CLS)) - ret = KSMBD_INODE_STATUS_PENDING_DELETE; - atomic_dec(&ci->m_count); - } read_unlock(&inode_hash_lock); + if (!ci) + return ret; + + down_read(&ci->m_lock); + if (ci->m_flags & (S_DEL_PENDING | S_DEL_ON_CLS)) + ret = KSMBD_INODE_STATUS_PENDING_DELETE; + else + ret = KSMBD_INODE_STATUS_OK; + up_read(&ci->m_lock); + + atomic_dec(&ci->m_count); return ret; }
bool ksmbd_inode_pending_delete(struct ksmbd_file *fp) { - return (fp->f_ci->m_flags & (S_DEL_PENDING | S_DEL_ON_CLS)); + struct ksmbd_inode *ci = fp->f_ci; + int ret; + + down_read(&ci->m_lock); + ret = (ci->m_flags & (S_DEL_PENDING | S_DEL_ON_CLS)); + up_read(&ci->m_lock); + + return ret; }
void ksmbd_set_inode_pending_delete(struct ksmbd_file *fp) { - fp->f_ci->m_flags |= S_DEL_PENDING; + struct ksmbd_inode *ci = fp->f_ci; + + down_write(&ci->m_lock); + ci->m_flags |= S_DEL_PENDING; + up_write(&ci->m_lock); }
void ksmbd_clear_inode_pending_delete(struct ksmbd_file *fp) { - fp->f_ci->m_flags &= ~S_DEL_PENDING; + struct ksmbd_inode *ci = fp->f_ci; + + down_write(&ci->m_lock); + ci->m_flags &= ~S_DEL_PENDING; + up_write(&ci->m_lock); }
void ksmbd_fd_set_delete_on_close(struct ksmbd_file *fp, int file_info) { - if (ksmbd_stream_fd(fp)) { - fp->f_ci->m_flags |= S_DEL_ON_CLS_STREAM; - return; - } + struct ksmbd_inode *ci = fp->f_ci;
- fp->f_ci->m_flags |= S_DEL_ON_CLS; + down_write(&ci->m_lock); + if (ksmbd_stream_fd(fp)) + ci->m_flags |= S_DEL_ON_CLS_STREAM; + else + ci->m_flags |= S_DEL_ON_CLS; + up_write(&ci->m_lock); }
static void ksmbd_inode_hash(struct ksmbd_inode *ci) @@ -257,27 +279,41 @@ static void __ksmbd_inode_close(struct ksmbd_file *fp) struct file *filp;
filp = fp->filp; - if (ksmbd_stream_fd(fp) && (ci->m_flags & S_DEL_ON_CLS_STREAM)) { - ci->m_flags &= ~S_DEL_ON_CLS_STREAM; - err = ksmbd_vfs_remove_xattr(file_mnt_idmap(filp), - &filp->f_path, - fp->stream.name, - true); - if (err) - pr_err("remove xattr failed : %s\n", - fp->stream.name); + + if (ksmbd_stream_fd(fp)) { + bool remove_stream_xattr = false; + + down_write(&ci->m_lock); + if (ci->m_flags & S_DEL_ON_CLS_STREAM) { + ci->m_flags &= ~S_DEL_ON_CLS_STREAM; + remove_stream_xattr = true; + } + up_write(&ci->m_lock); + + if (remove_stream_xattr) { + err = ksmbd_vfs_remove_xattr(file_mnt_idmap(filp), + &filp->f_path, + fp->stream.name, + true); + if (err) + pr_err("remove xattr failed : %s\n", + fp->stream.name); + } }
if (atomic_dec_and_test(&ci->m_count)) { + bool do_unlink = false; + down_write(&ci->m_lock); if (ci->m_flags & (S_DEL_ON_CLS | S_DEL_PENDING)) { ci->m_flags &= ~(S_DEL_ON_CLS | S_DEL_PENDING); - up_write(&ci->m_lock); - ksmbd_vfs_unlink(filp); - down_write(&ci->m_lock); + do_unlink = true; } up_write(&ci->m_lock);
+ if (do_unlink) + ksmbd_vfs_unlink(filp); + ksmbd_inode_free(ci); } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chingbin Li liqb365@163.com
[ Upstream commit 8dbbb5423c0802ec21266765de80fd491868fab1 ]
Add VID 2b89 & PID 6275 for Realtek RTL8761BUV USB Bluetooth chip.
The information in /sys/kernel/debug/usb/devices about the Bluetooth device is listed as the below.
T: Bus=01 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#= 6 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=2b89 ProdID=6275 Rev= 2.00 S: Manufacturer=Realtek S: Product=Bluetooth Radio S: SerialNumber=00E04C239987 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
Signed-off-by: Chingbin Li liqb365@163.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index fa683bb7f0b49..c70e79e69be8d 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -781,6 +781,8 @@ static const struct usb_device_id quirks_table[] = { BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x2b89, 0x8761), .driver_info = BTUSB_REALTEK | BTUSB_WIDEBAND_SPEECH }, + { USB_DEVICE(0x2b89, 0x6275), .driver_info = BTUSB_REALTEK | + BTUSB_WIDEBAND_SPEECH },
/* Additional Realtek 8821AE Bluetooth devices */ { USB_DEVICE(0x0b05, 0x17dc), .driver_info = BTUSB_REALTEK },
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chris Lu chris.lu@mediatek.com
[ Upstream commit 5a6700a31c953af9a17a7e2681335f31d922614d ]
Add VID 0489 & PID e170 for MediaTek MT7922 USB Bluetooth chip.
The information in /sys/kernel/debug/usb/devices about the Bluetooth device is listed as the below.
T: Bus=06 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0489 ProdID=e170 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I: If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 512 Ivl=125us
Signed-off-by: Chris Lu chris.lu@mediatek.com Reviewed-by: Paul Menzel pmenzel@molgen.mpg.de Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index c70e79e69be8d..36f18f2657ab8 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -685,6 +685,8 @@ static const struct usb_device_id quirks_table[] = { BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x0489, 0xe153), .driver_info = BTUSB_MEDIATEK | BTUSB_WIDEBAND_SPEECH }, + { USB_DEVICE(0x0489, 0xe170), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x04ca, 0x3804), .driver_info = BTUSB_MEDIATEK | BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x04ca, 0x38e4), .driver_info = BTUSB_MEDIATEK |
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chris Lu chris.lu@mediatek.com
[ Upstream commit c126f98c011f5796ba118ef2093122d02809d30d ]
Add VID 0489 & PID e135 for MediaTek MT7920 USB Bluetooth chip.
The information in /sys/kernel/debug/usb/devices about the Bluetooth device is listed as the below.
T: Bus=06 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0489 ProdID=e135 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I: If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us
Signed-off-by: Chris Lu chris.lu@mediatek.com Reviewed-by: Paul Menzel pmenzel@molgen.mpg.de Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 36f18f2657ab8..cc03c8c38b16f 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -621,6 +621,8 @@ static const struct usb_device_id quirks_table[] = { /* Additional MediaTek MT7920 Bluetooth devices */ { USB_DEVICE(0x0489, 0xe134), .driver_info = BTUSB_MEDIATEK | BTUSB_WIDEBAND_SPEECH }, + { USB_DEVICE(0x0489, 0xe135), .driver_info = BTUSB_MEDIATEK | + BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x13d3, 0x3620), .driver_info = BTUSB_MEDIATEK | BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x13d3, 0x3621), .driver_info = BTUSB_MEDIATEK |
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuai Zhang quic_shuaz@quicinc.com
[ Upstream commit a8b38d19857d42a1f2e90c9d9b0f74de2500acd7 ]
The new platform uses the QCA2066 chip along with a new board ID, which requires a dedicated firmware file to ensure proper initialization. Without this entry, the driver cannot locate and load the correct firmware, resulting in Bluetooth bring-up failure.
This patch adds a new entry to the firmware table for QCA2066 so that the driver can correctly identify the board ID and load the appropriate firmware from 'qca/QCA2066/' in the linux-firmware repository.
Signed-off-by: Shuai Zhang quic_shuaz@quicinc.com Acked-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index cc03c8c38b16f..22f1932fe9126 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -3267,6 +3267,7 @@ static const struct qca_device_info qca_devices_table[] = {
static const struct qca_custom_firmware qca_custom_btfws[] = { { 0x00130201, 0x030A, "QCA2066" }, + { 0x00130201, 0x030B, "QCA2066" }, { }, };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gongwei Li ligongwei@kylinos.cn
[ Upstream commit 525459da4bd62a81142fea3f3d52188ceb4d8907 ]
Add VID 13d3 & PID 3533 for Realtek RTL8821CE USB Bluetooth chip.
The information in /sys/kernel/debug/usb/devices about the Bluetooth device is listed as the below.
T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3533 Rev= 1.10 S: Manufacturer=Realtek S: Product=Bluetooth Radio S: SerialNumber=00e04c000001 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
Signed-off-by: Gongwei Li ligongwei@kylinos.cn Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 22f1932fe9126..bd26a8db64096 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -504,6 +504,8 @@ static const struct usb_device_id quirks_table[] = { /* Realtek 8821CE Bluetooth devices */ { USB_DEVICE(0x13d3, 0x3529), .driver_info = BTUSB_REALTEK | BTUSB_WIDEBAND_SPEECH }, + { USB_DEVICE(0x13d3, 0x3533), .driver_info = BTUSB_REALTEK | + BTUSB_WIDEBAND_SPEECH },
/* Realtek 8822CE Bluetooth devices */ { USB_DEVICE(0x0bda, 0xb00c), .driver_info = BTUSB_REALTEK |
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Chou max.chou@realtek.com
[ Upstream commit 32caa197b9b603e20f49fd3a0dffecd0cd620499 ]
Add the support ID(0x0489, 0xE12F) to usb_device_id table for Realtek RTL8852BE-VT.
The device info from /sys/kernel/debug/usb/devices as below.
T: Bus=04 Lev=02 Prnt=02 Port=05 Cnt=01 Dev#= 86 Spd=12 MxCh= 0 D: Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0489 ProdID=e12f Rev= 0.00 S: Manufacturer=Realtek S: Product=Bluetooth Radio S: SerialNumber=00e04c000001 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
Signed-off-by: Max Chou max.chou@realtek.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index bd26a8db64096..b92bfd131567e 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -587,6 +587,8 @@ static const struct usb_device_id quirks_table[] = { /* Realtek 8852BT/8852BE-VT Bluetooth devices */ { USB_DEVICE(0x0bda, 0x8520), .driver_info = BTUSB_REALTEK | BTUSB_WIDEBAND_SPEECH }, + { USB_DEVICE(0x0489, 0xe12f), .driver_info = BTUSB_REALTEK | + BTUSB_WIDEBAND_SPEECH },
/* Realtek 8922AE Bluetooth devices */ { USB_DEVICE(0x0bda, 0x8922), .driver_info = BTUSB_REALTEK |
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andreas Gruenbacher agruenba@redhat.com
[ Upstream commit 8a157e0a0aa5143b5d94201508c0ca1bb8cfb941 ]
In gfs2_chain_bio(), the call to bio_chain() has its arguments swapped. The result is leaked bios and incorrect synchronization (only the last bio will actually be waited for). This code is only used during mount and filesystem thaw, so the bug normally won't be noticeable.
Reported-by: Stephen Zhang starzhangzsd@gmail.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/lops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c index 9c8c305a75c46..914d03f6c4e82 100644 --- a/fs/gfs2/lops.c +++ b/fs/gfs2/lops.c @@ -487,7 +487,7 @@ static struct bio *gfs2_chain_bio(struct bio *prev, unsigned int nr_iovecs) new = bio_alloc(prev->bi_bdev, nr_iovecs, prev->bi_opf, GFP_NOIO); bio_clone_blkg_association(new, prev); new->bi_iter.bi_sector = bio_end_sector(prev); - bio_chain(new, prev); + bio_chain(prev, new); submit_bio(prev); return new; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wei Fang wei.fang@nxp.com
[ Upstream commit e8e032cd24dda7cceaa27bc2eb627f82843f0466 ]
The ERR007885 will lead to a TDAR race condition for mutliQ when the driver sets TDAR and the UDMA clears TDAR simultaneously or in a small window (2-4 cycles). And it will cause the udma_tx and udma_tx_arbiter state machines to hang. Therefore, the commit 53bb20d1faba ("net: fec: add variable reg_desc_active to speed things up") and the commit a179aad12bad ("net: fec: ERR007885 Workaround for conventional TX") have added the workaround to fix the potential issue for the conventional TX path. Similarly, the XDP TX path should also have the potential hang issue, so add the workaround for XDP TX path.
Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support") Signed-off-by: Wei Fang wei.fang@nxp.com Link: https://patch.msgid.link/20251128025915.2486943-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec_main.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 3222359ac15b7..e2b75d1970ae6 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -3948,7 +3948,12 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep, txq->bd.cur = bdp;
/* Trigger transmission start */ - writel(0, txq->bd.reg_desc_active); + if (!(fep->quirks & FEC_QUIRK_ERR007885) || + !readl(txq->bd.reg_desc_active) || + !readl(txq->bd.reg_desc_active) || + !readl(txq->bd.reg_desc_active) || + !readl(txq->bd.reg_desc_active)) + writel(0, txq->bd.reg_desc_active);
return 0; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wang Liang wangliang74@huawei.com
[ Upstream commit 613d12dd794e078be8ff3cf6b62a6b9acf7f4619 ]
syzbot reported a memory leak [1].
When function sock_alloc_send_skb() return NULL in nr_output(), the original skb is not freed, which was allocated in nr_sendmsg(). Fix this by freeing it before return.
[1] BUG: memory leak unreferenced object 0xffff888129f35500 (size 240): comm "syz.0.17", pid 6119, jiffies 4294944652 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 10 52 28 81 88 ff ff ..........R(.... backtrace (crc 1456a3e4): kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline] slab_post_alloc_hook mm/slub.c:4983 [inline] slab_alloc_node mm/slub.c:5288 [inline] kmem_cache_alloc_node_noprof+0x36f/0x5e0 mm/slub.c:5340 __alloc_skb+0x203/0x240 net/core/skbuff.c:660 alloc_skb include/linux/skbuff.h:1383 [inline] alloc_skb_with_frags+0x69/0x3f0 net/core/skbuff.c:6671 sock_alloc_send_pskb+0x379/0x3e0 net/core/sock.c:2965 sock_alloc_send_skb include/net/sock.h:1859 [inline] nr_sendmsg+0x287/0x450 net/netrom/af_netrom.c:1105 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg net/socket.c:742 [inline] sock_write_iter+0x293/0x2a0 net/socket.c:1195 new_sync_write fs/read_write.c:593 [inline] vfs_write+0x45d/0x710 fs/read_write.c:686 ksys_write+0x143/0x170 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xa4/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Reported-by: syzbot+d7abc36bbbb6d7d40b58@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d7abc36bbbb6d7d40b58 Tested-by: syzbot+d7abc36bbbb6d7d40b58@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Wang Liang wangliang74@huawei.com Link: https://patch.msgid.link/20251129041315.1550766-1-wangliang74@huawei.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/netrom/nr_out.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/netrom/nr_out.c b/net/netrom/nr_out.c index 5e531394a724b..2b3cbceb0b52d 100644 --- a/net/netrom/nr_out.c +++ b/net/netrom/nr_out.c @@ -43,8 +43,10 @@ void nr_output(struct sock *sk, struct sk_buff *skb) frontlen = skb_headroom(skb);
while (skb->len > 0) { - if ((skbn = sock_alloc_send_skb(sk, frontlen + NR_MAX_PACKET_SIZE, 0, &err)) == NULL) + if ((skbn = sock_alloc_send_skb(sk, frontlen + NR_MAX_PACKET_SIZE, 0, &err)) == NULL) { + kfree_skb(skb); return; + }
skb_reserve(skbn, frontlen);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jamal Hadi Salim jhs@mojatatu.com
[ Upstream commit ce052b9402e461a9aded599f5b47e76bc727f7de ]
zdi-disclosures@trendmicro.com says:
The vulnerability is a race condition between `ets_qdisc_dequeue` and `ets_qdisc_change`. It leads to UAF on `struct Qdisc` object. Attacker requires the capability to create new user and network namespace in order to trigger the bug. See my additional commentary at the end of the analysis.
Analysis:
static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt, struct netlink_ext_ack *extack) { ...
// (1) this lock is preventing .change handler (`ets_qdisc_change`) //to race with .dequeue handler (`ets_qdisc_dequeue`) sch_tree_lock(sch);
for (i = nbands; i < oldbands; i++) { if (i >= q->nstrict && q->classes[i].qdisc->q.qlen) list_del_init(&q->classes[i].alist); qdisc_purge_queue(q->classes[i].qdisc); }
WRITE_ONCE(q->nbands, nbands); for (i = nstrict; i < q->nstrict; i++) { if (q->classes[i].qdisc->q.qlen) { // (2) the class is added to the q->active list_add_tail(&q->classes[i].alist, &q->active); q->classes[i].deficit = quanta[i]; } } WRITE_ONCE(q->nstrict, nstrict); memcpy(q->prio2band, priomap, sizeof(priomap));
for (i = 0; i < q->nbands; i++) WRITE_ONCE(q->classes[i].quantum, quanta[i]);
for (i = oldbands; i < q->nbands; i++) { q->classes[i].qdisc = queues[i]; if (q->classes[i].qdisc != &noop_qdisc) qdisc_hash_add(q->classes[i].qdisc, true); }
// (3) the qdisc is unlocked, now dequeue can be called in parallel // to the rest of .change handler sch_tree_unlock(sch);
ets_offload_change(sch); for (i = q->nbands; i < oldbands; i++) { // (4) we're reducing the refcount for our class's qdisc and // freeing it qdisc_put(q->classes[i].qdisc); // (5) If we call .dequeue between (4) and (5), we will have // a strong UAF and we can control RIP q->classes[i].qdisc = NULL; WRITE_ONCE(q->classes[i].quantum, 0); q->classes[i].deficit = 0; gnet_stats_basic_sync_init(&q->classes[i].bstats); memset(&q->classes[i].qstats, 0, sizeof(q->classes[i].qstats)); } return 0; }
Comment: This happens because some of the classes have their qdiscs assigned to NULL, but remain in the active list. This commit fixes this issue by always removing the class from the active list before deleting and freeing its associated qdisc
Reproducer Steps (trimmed version of what was sent by zdi-disclosures@trendmicro.com)
``` DEV="${DEV:-lo}" ROOT_HANDLE="${ROOT_HANDLE:-1:}" BAND2_HANDLE="${BAND2_HANDLE:-20:}" # child under 1:2 PING_BYTES="${PING_BYTES:-48}" PING_COUNT="${PING_COUNT:-200000}" PING_DST="${PING_DST:-127.0.0.1}"
SLOW_TBF_RATE="${SLOW_TBF_RATE:-8bit}" SLOW_TBF_BURST="${SLOW_TBF_BURST:-100b}" SLOW_TBF_LAT="${SLOW_TBF_LAT:-1s}"
cleanup() { tc qdisc del dev "$DEV" root 2>/dev/null } trap cleanup EXIT
ip link set "$DEV" up
tc qdisc del dev "$DEV" root 2>/dev/null || true
tc qdisc add dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 2
tc qdisc add dev "$DEV" parent 1:2 handle "$BAND2_HANDLE" \ tbf rate "$SLOW_TBF_RATE" burst "$SLOW_TBF_BURST" latency "$SLOW_TBF_LAT"
tc filter add dev "$DEV" parent 1: protocol all prio 1 u32 match u32 0 0 flowid 1:2 tc -s qdisc ls dev $DEV
ping -I "$DEV" -f -c "$PING_COUNT" -s "$PING_BYTES" -W 0.001 "$PING_DST" \
/dev/null 2>&1 &
tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 0 tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 2 tc -s qdisc ls dev $DEV tc qdisc del dev "$DEV" parent 1:2 || true tc -s qdisc ls dev $DEV tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 1 strict 1 ```
KASAN report ``` ================================================================== BUG: KASAN: slab-use-after-free in ets_qdisc_dequeue+0x1071/0x11b0 kernel/net/sched/sch_ets.c:481 Read of size 8 at addr ffff8880502fc018 by task ping/12308
CPU: 0 UID: 0 PID: 12308 Comm: ping Not tainted 6.18.0-rc4-dirty #1 PREEMPT(full) Hardware name: QEMU Ubuntu 25.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <IRQ> __dump_stack kernel/lib/dump_stack.c:94 dump_stack_lvl+0x100/0x190 kernel/lib/dump_stack.c:120 print_address_description kernel/mm/kasan/report.c:378 print_report+0x156/0x4c9 kernel/mm/kasan/report.c:482 kasan_report+0xdf/0x110 kernel/mm/kasan/report.c:595 ets_qdisc_dequeue+0x1071/0x11b0 kernel/net/sched/sch_ets.c:481 dequeue_skb kernel/net/sched/sch_generic.c:294 qdisc_restart kernel/net/sched/sch_generic.c:399 __qdisc_run+0x1c9/0x1b00 kernel/net/sched/sch_generic.c:417 __dev_xmit_skb kernel/net/core/dev.c:4221 __dev_queue_xmit+0x2848/0x4410 kernel/net/core/dev.c:4729 dev_queue_xmit kernel/./include/linux/netdevice.h:3365 [...]
Allocated by task 17115: kasan_save_stack+0x30/0x50 kernel/mm/kasan/common.c:56 kasan_save_track+0x14/0x30 kernel/mm/kasan/common.c:77 poison_kmalloc_redzone kernel/mm/kasan/common.c:400 __kasan_kmalloc+0xaa/0xb0 kernel/mm/kasan/common.c:417 kasan_kmalloc kernel/./include/linux/kasan.h:262 __do_kmalloc_node kernel/mm/slub.c:5642 __kmalloc_node_noprof+0x34e/0x990 kernel/mm/slub.c:5648 kmalloc_node_noprof kernel/./include/linux/slab.h:987 qdisc_alloc+0xb8/0xc30 kernel/net/sched/sch_generic.c:950 qdisc_create_dflt+0x93/0x490 kernel/net/sched/sch_generic.c:1012 ets_class_graft+0x4fd/0x800 kernel/net/sched/sch_ets.c:261 qdisc_graft+0x3e4/0x1780 kernel/net/sched/sch_api.c:1196 [...]
Freed by task 9905: kasan_save_stack+0x30/0x50 kernel/mm/kasan/common.c:56 kasan_save_track+0x14/0x30 kernel/mm/kasan/common.c:77 __kasan_save_free_info+0x3b/0x70 kernel/mm/kasan/generic.c:587 kasan_save_free_info kernel/mm/kasan/kasan.h:406 poison_slab_object kernel/mm/kasan/common.c:252 __kasan_slab_free+0x5f/0x80 kernel/mm/kasan/common.c:284 kasan_slab_free kernel/./include/linux/kasan.h:234 slab_free_hook kernel/mm/slub.c:2539 slab_free kernel/mm/slub.c:6630 kfree+0x144/0x700 kernel/mm/slub.c:6837 rcu_do_batch kernel/kernel/rcu/tree.c:2605 rcu_core+0x7c0/0x1500 kernel/kernel/rcu/tree.c:2861 handle_softirqs+0x1ea/0x8a0 kernel/kernel/softirq.c:622 __do_softirq kernel/kernel/softirq.c:656 [...]
Commentary:
1. Maher Azzouzi working with Trend Micro Zero Day Initiative was reported as the person who found the issue. I requested to get a proper email to add to the reported-by tag but got no response. For this reason i will credit the person i exchanged emails with i.e zdi-disclosures@trendmicro.com
2. Neither i nor Victor who did a much more thorough testing was able to reproduce a UAF with the PoC or other approaches we tried. We were both able to reproduce a null ptr deref. After exchange with zdi-disclosures@trendmicro.com they sent a small change to be made to the code to add an extra delay which was able to simulate the UAF. i.e, this: qdisc_put(q->classes[i].qdisc); mdelay(90); q->classes[i].qdisc = NULL;
I was informed by Thomas Gleixner(tglx@linutronix.de) that adding delays was acceptable approach for demonstrating the bug, quote: "Adding such delays is common exploit validation practice" The equivalent delay could happen "by virt scheduling the vCPU out, SMIs, NMIs, PREEMPT_RT enabled kernel"
3. I asked the OP to test and report back but got no response and after a few days gave up and proceeded to submit this fix.
Fixes: de6d25924c2a ("net/sched: sch_ets: don't peek at classes beyond 'nbands'") Reported-by: zdi-disclosures@trendmicro.com Tested-by: Victor Nogueira victor@mojatatu.com Signed-off-by: Jamal Hadi Salim jhs@mojatatu.com Reviewed-by: Davide Caratti dcaratti@redhat.com Link: https://patch.msgid.link/20251128151919.576920-1-jhs@mojatatu.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_ets.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c index 82635dd2cfa59..ae46643e596d3 100644 --- a/net/sched/sch_ets.c +++ b/net/sched/sch_ets.c @@ -652,7 +652,7 @@ static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt, sch_tree_lock(sch);
for (i = nbands; i < oldbands; i++) { - if (i >= q->nstrict && q->classes[i].qdisc->q.qlen) + if (cl_is_active(&q->classes[i])) list_del_init(&q->classes[i].alist); qdisc_purge_queue(q->classes[i].qdisc); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Moshe Shemesh moshe@nvidia.com
[ Upstream commit cd7671ef4cf2edf73cd2a3dca3a2f522a4525bf5 ]
The enable_mpesw() function returns -EINVAL if ldev->mode is not MLX5_LAG_MODE_NONE. This means attempting to enable MPESW mode when it's already enabled will fail. In contrast, disable_mpesw() properly checks if the mode is MLX5_LAG_MODE_MPESW before proceeding, making it naturally idempotent and safe to call multiple times.
Fix enable_mpesw() to return success if mpesw is already enabled.
Fixes: a32327a3a02c ("net/mlx5: Lag, Control MultiPort E-Switch single FDB mode") Signed-off-by: Moshe Shemesh moshe@nvidia.com Reviewed-by: Shay Drori shayd@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/1764602008-1334866-2-git-send-email-tariqt@nvidia.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c index aad52d3a90e68..2d86af8f0d9b8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c @@ -67,12 +67,19 @@ static int mlx5_mpesw_metadata_set(struct mlx5_lag *ldev)
static int enable_mpesw(struct mlx5_lag *ldev) { - int idx = mlx5_lag_get_dev_index_by_seq(ldev, MLX5_LAG_P1); struct mlx5_core_dev *dev0; int err; + int idx; int i;
- if (idx < 0 || ldev->mode != MLX5_LAG_MODE_NONE) + if (ldev->mode == MLX5_LAG_MODE_MPESW) + return 0; + + if (ldev->mode != MLX5_LAG_MODE_NONE) + return -EINVAL; + + idx = mlx5_lag_get_dev_index_by_seq(ldev, MLX5_LAG_P1); + if (idx < 0) return -EINVAL;
dev0 = ldev->pf[idx].dev;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cosmin Ratiu cratiu@nvidia.com
[ Upstream commit 35e93736f69963337912594eb3951ab320b77521 ]
PSP is unregistered twice in: _mlx5e_remove -> mlx5e_psp_unregister mlx5e_nic_cleanup -> mlx5e_psp_unregister
This leads to a refcount underflow in some conditions: ------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 2 PID: 1694 at lib/refcount.c:28 refcount_warn_saturate+0xd8/0xe0 [...] mlx5e_psp_unregister+0x26/0x50 [mlx5_core] mlx5e_nic_cleanup+0x26/0x90 [mlx5_core] mlx5e_remove+0xe6/0x1f0 [mlx5_core] auxiliary_bus_remove+0x18/0x30 device_release_driver_internal+0x194/0x1f0 bus_remove_device+0xc6/0x130 device_del+0x159/0x3c0 mlx5_rescan_drivers_locked+0xbc/0x2a0 [mlx5_core] [...]
Do not directly remove psp from the _mlx5e_remove path, the PSP cleanup happens as part of profile cleanup.
Fixes: 89ee2d92f66c ("net/mlx5e: Support PSP offload functionality") Signed-off-by: Cosmin Ratiu cratiu@nvidia.com Reviewed-by: Dragos Tatulea dtatulea@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/1764602008-1334866-3-git-send-email-tariqt@nvidia.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 5e17eae81f4b3..1545f9c008f49 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -6805,7 +6805,6 @@ static void _mlx5e_remove(struct auxiliary_device *adev) * is already unregistered before changing to NIC profile. */ if (priv->netdev->reg_state == NETREG_REGISTERED) { - mlx5e_psp_unregister(priv); unregister_netdev(priv->netdev); _mlx5e_suspend(adev, false); } else {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 27033d06917758d47162581da7e9de8004049dee ]
The RTL8211F(D)(I)-VD-CG PHY also has support for disabling the CLKOUT, and we'd like to introduce the "realtek,clkout-disable" property for that.
But it isn't done through the PHYCR2 register, and it becomes awkward to have the driver pretend that it is. So just replace the machine-level "u16 phycr2" variable with a logical "bool disable_clk_out", which scales better to the other PHY as well.
The change is a complete functional equivalent. Before, if the device tree property was absent, priv->phycr2 would contain the RTL8211F_CLKOUT_EN bit as read from hardware. Now, we don't save priv->phycr2, but we just don't call phy_modify_paged() on it. Also, we can simply call phy_modify_paged() with the "set" argument to 0.
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://patch.msgid.link/20251117234033.345679-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 4f0638b12451 ("net: phy: RTL8211FVD: Restore disabling of PHY-mode EEE") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/realtek/realtek_main.c | 38 ++++++++++++++++---------- 1 file changed, 23 insertions(+), 15 deletions(-)
diff --git a/drivers/net/phy/realtek/realtek_main.c b/drivers/net/phy/realtek/realtek_main.c index b26f4d1c6bbfa..1625919a47be8 100644 --- a/drivers/net/phy/realtek/realtek_main.c +++ b/drivers/net/phy/realtek/realtek_main.c @@ -167,8 +167,8 @@ MODULE_LICENSE("GPL");
struct rtl821x_priv { u16 phycr1; - u16 phycr2; bool has_phycr2; + bool disable_clk_out; struct clk *clk; /* rtl8211f */ u16 iner; @@ -239,15 +239,8 @@ static int rtl821x_probe(struct phy_device *phydev) priv->phycr1 |= RTL8211F_ALDPS_PLL_OFF | RTL8211F_ALDPS_ENABLE | RTL8211F_ALDPS_XTAL_OFF;
priv->has_phycr2 = !(phy_id == RTL_8211FVD_PHYID); - if (priv->has_phycr2) { - ret = phy_read_paged(phydev, RTL8211F_PHYCR_PAGE, RTL8211F_PHYCR2); - if (ret < 0) - return ret; - - priv->phycr2 = ret & RTL8211F_CLKOUT_EN; - if (of_property_read_bool(dev->of_node, "realtek,clkout-disable")) - priv->phycr2 &= ~RTL8211F_CLKOUT_EN; - } + priv->disable_clk_out = of_property_read_bool(dev->of_node, + "realtek,clkout-disable");
phydev->priv = priv;
@@ -627,6 +620,23 @@ static int rtl8211f_config_rgmii_delay(struct phy_device *phydev) return 0; }
+static int rtl8211f_config_clk_out(struct phy_device *phydev) +{ + struct rtl821x_priv *priv = phydev->priv; + int ret; + + /* The value is preserved if the device tree property is absent */ + if (!priv->disable_clk_out) + return 0; + + ret = phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, + RTL8211F_PHYCR2, RTL8211F_CLKOUT_EN, 0); + if (ret) + return ret; + + return genphy_soft_reset(phydev); +} + static int rtl8211f_config_init(struct phy_device *phydev) { struct rtl821x_priv *priv = phydev->priv; @@ -655,16 +665,14 @@ static int rtl8211f_config_init(struct phy_device *phydev) if (ret) return ret;
- ret = phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, - RTL8211F_PHYCR2, RTL8211F_CLKOUT_EN, - priv->phycr2); - if (ret < 0) { + ret = rtl8211f_config_clk_out(phydev); + if (ret) { dev_err(dev, "clkout configuration failed: %pe\n", ERR_PTR(ret)); return ret; }
- return genphy_soft_reset(phydev); + return 0; }
static int rtl821x_suspend(struct phy_device *phydev)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 910ac7bfb1af1ae4cd141ef80e03a6729213c189 ]
This variable is assigned in rtl821x_probe() and used in rtl8211f_config_init(), which is more complex than it needs to be. Simply testing the same condition from rtl821x_probe() in rtl8211f_config_init() yields the same result (the PHY driver ID is a runtime invariant), but with one temporary variable less.
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20251117234033.345679-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 4f0638b12451 ("net: phy: RTL8211FVD: Restore disabling of PHY-mode EEE") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/realtek/realtek_main.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/phy/realtek/realtek_main.c b/drivers/net/phy/realtek/realtek_main.c index 1625919a47be8..daf2457b378bf 100644 --- a/drivers/net/phy/realtek/realtek_main.c +++ b/drivers/net/phy/realtek/realtek_main.c @@ -167,7 +167,6 @@ MODULE_LICENSE("GPL");
struct rtl821x_priv { u16 phycr1; - bool has_phycr2; bool disable_clk_out; struct clk *clk; /* rtl8211f */ @@ -218,7 +217,6 @@ static int rtl821x_probe(struct phy_device *phydev) { struct device *dev = &phydev->mdio.dev; struct rtl821x_priv *priv; - u32 phy_id = phydev->drv->phy_id; int ret;
priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL); @@ -238,7 +236,6 @@ static int rtl821x_probe(struct phy_device *phydev) if (of_property_read_bool(dev->of_node, "realtek,aldps-enable")) priv->phycr1 |= RTL8211F_ALDPS_PLL_OFF | RTL8211F_ALDPS_ENABLE | RTL8211F_ALDPS_XTAL_OFF;
- priv->has_phycr2 = !(phy_id == RTL_8211FVD_PHYID); priv->disable_clk_out = of_property_read_bool(dev->of_node, "realtek,clkout-disable");
@@ -656,7 +653,8 @@ static int rtl8211f_config_init(struct phy_device *phydev) if (ret) return ret;
- if (!priv->has_phycr2) + /* RTL8211FVD has no PHYCR2 register */ + if (phydev->drv->phy_id == RTL_8211FVD_PHYID) return 0;
/* Disable PHY-mode EEE so LPI is passed to the MAC */
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit e1a31c41bef678afe0d99b7f0dc3711a80c68447 ]
Add CLKOUT disable support for RTL8211F(D)(I)-VD-CG. Like with other PHY variants, this feature might be requested by customers when the clock output is not used, in order to reduce electromagnetic interference (EMI).
In the common driver, the CLKOUT configuration is done through PHYCR2. The RTL_8211FVD_PHYID is singled out as not having that register, and execution in rtl8211f_config_init() returns early after commit 2c67301584f2 ("net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not present").
But actually CLKOUT is configured through a different register for this PHY. Instead of pretending this is PHYCR2 (which it is not), just add some code for modifying this register inside the rtl8211f_disable_clk_out() function, and move that outside the code portion that runs only if PHYCR2 exists.
In practice this reorders the PHYCR2 writes to disable PHY-mode EEE and to disable the CLKOUT for the normal RTL8211F variants, but this should be perfectly fine.
It was not noted that RTL8211F(D)(I)-VD-CG would need a genphy_soft_reset() call after disabling the CLKOUT. Despite that, we do it out of caution and for symmetry with the other RTL8211F models.
Co-developed-by: Clark Wang xiaoning.wang@nxp.com Signed-off-by: Clark Wang xiaoning.wang@nxp.com Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20251117234033.345679-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 4f0638b12451 ("net: phy: RTL8211FVD: Restore disabling of PHY-mode EEE") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/realtek/realtek_main.c | 31 ++++++++++++++++++-------- 1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/drivers/net/phy/realtek/realtek_main.c b/drivers/net/phy/realtek/realtek_main.c index daf2457b378bf..f0348b92beec3 100644 --- a/drivers/net/phy/realtek/realtek_main.c +++ b/drivers/net/phy/realtek/realtek_main.c @@ -89,6 +89,14 @@ #define RTL8211F_LEDCR_MASK GENMASK(4, 0) #define RTL8211F_LEDCR_SHIFT 5
+/* RTL8211F(D)(I)-VD-CG CLKOUT configuration is specified via magic values + * to undocumented register pages. The names here do not reflect the datasheet. + * Unlike other PHY models, CLKOUT configuration does not go through PHYCR2. + */ +#define RTL8211FVD_CLKOUT_PAGE 0xd05 +#define RTL8211FVD_CLKOUT_REG 0x11 +#define RTL8211FVD_CLKOUT_EN BIT(8) + /* RTL8211F RGMII configuration */ #define RTL8211F_RGMII_PAGE 0xd08
@@ -626,8 +634,13 @@ static int rtl8211f_config_clk_out(struct phy_device *phydev) if (!priv->disable_clk_out) return 0;
- ret = phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, - RTL8211F_PHYCR2, RTL8211F_CLKOUT_EN, 0); + if (phydev->drv->phy_id == RTL_8211FVD_PHYID) + ret = phy_modify_paged(phydev, RTL8211FVD_CLKOUT_PAGE, + RTL8211FVD_CLKOUT_REG, + RTL8211FVD_CLKOUT_EN, 0); + else + ret = phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, + RTL8211F_PHYCR2, RTL8211F_CLKOUT_EN, 0); if (ret) return ret;
@@ -653,6 +666,13 @@ static int rtl8211f_config_init(struct phy_device *phydev) if (ret) return ret;
+ ret = rtl8211f_config_clk_out(phydev); + if (ret) { + dev_err(dev, "clkout configuration failed: %pe\n", + ERR_PTR(ret)); + return ret; + } + /* RTL8211FVD has no PHYCR2 register */ if (phydev->drv->phy_id == RTL_8211FVD_PHYID) return 0; @@ -663,13 +683,6 @@ static int rtl8211f_config_init(struct phy_device *phydev) if (ret) return ret;
- ret = rtl8211f_config_clk_out(phydev); - if (ret) { - dev_err(dev, "clkout configuration failed: %pe\n", - ERR_PTR(ret)); - return ret; - } - return 0; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit bb78b71faf60d11a15f07e3390fcfd31e5e523bb ]
Previous changes have replaced the machine-level priv->phycr2 with a high-level priv->disable_clk_out. This created a discrepancy with priv->phycr1 which is resolved here, for uniformity.
One advantage of this new implementation is that we don't read priv->phycr1 in rtl821x_probe() if we're never going to modify it.
We never test the positive return code from phy_modify_mmd_changed(), so we could just as well use phy_modify_mmd().
I took the ALDPS feature description from commit d90db36a9e74 ("net: phy: realtek: add dt property to enable ALDPS mode") and transformed it into a function comment - the feature is sufficiently non-obvious to deserve that.
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20251117234033.345679-6-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 4f0638b12451 ("net: phy: RTL8211FVD: Restore disabling of PHY-mode EEE") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/realtek/realtek_main.c | 44 ++++++++++++++++---------- 1 file changed, 28 insertions(+), 16 deletions(-)
diff --git a/drivers/net/phy/realtek/realtek_main.c b/drivers/net/phy/realtek/realtek_main.c index f0348b92beec3..35f40bfdaf113 100644 --- a/drivers/net/phy/realtek/realtek_main.c +++ b/drivers/net/phy/realtek/realtek_main.c @@ -174,7 +174,7 @@ MODULE_AUTHOR("Johnson Leung"); MODULE_LICENSE("GPL");
struct rtl821x_priv { - u16 phycr1; + bool enable_aldps; bool disable_clk_out; struct clk *clk; /* rtl8211f */ @@ -225,7 +225,6 @@ static int rtl821x_probe(struct phy_device *phydev) { struct device *dev = &phydev->mdio.dev; struct rtl821x_priv *priv; - int ret;
priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL); if (!priv) @@ -236,14 +235,8 @@ static int rtl821x_probe(struct phy_device *phydev) return dev_err_probe(dev, PTR_ERR(priv->clk), "failed to get phy clock\n");
- ret = phy_read_paged(phydev, RTL8211F_PHYCR_PAGE, RTL8211F_PHYCR1); - if (ret < 0) - return ret; - - priv->phycr1 = ret & (RTL8211F_ALDPS_PLL_OFF | RTL8211F_ALDPS_ENABLE | RTL8211F_ALDPS_XTAL_OFF); - if (of_property_read_bool(dev->of_node, "realtek,aldps-enable")) - priv->phycr1 |= RTL8211F_ALDPS_PLL_OFF | RTL8211F_ALDPS_ENABLE | RTL8211F_ALDPS_XTAL_OFF; - + priv->enable_aldps = of_property_read_bool(dev->of_node, + "realtek,aldps-enable"); priv->disable_clk_out = of_property_read_bool(dev->of_node, "realtek,clkout-disable");
@@ -647,17 +640,36 @@ static int rtl8211f_config_clk_out(struct phy_device *phydev) return genphy_soft_reset(phydev); }
-static int rtl8211f_config_init(struct phy_device *phydev) +/* Advance Link Down Power Saving (ALDPS) mode changes crystal/clock behaviour, + * which causes the RXC clock signal to stop for tens to hundreds of + * milliseconds. + * + * Some MACs need the RXC clock to support their internal RX logic, so ALDPS is + * only enabled based on an opt-in device tree property. + */ +static int rtl8211f_config_aldps(struct phy_device *phydev) { struct rtl821x_priv *priv = phydev->priv; + u16 mask = RTL8211F_ALDPS_PLL_OFF | + RTL8211F_ALDPS_ENABLE | + RTL8211F_ALDPS_XTAL_OFF; + + /* The value is preserved if the device tree property is absent */ + if (!priv->enable_aldps) + return 0; + + return phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, RTL8211F_PHYCR1, + mask, mask); +} + +static int rtl8211f_config_init(struct phy_device *phydev) +{ struct device *dev = &phydev->mdio.dev; int ret;
- ret = phy_modify_paged_changed(phydev, RTL8211F_PHYCR_PAGE, RTL8211F_PHYCR1, - RTL8211F_ALDPS_PLL_OFF | RTL8211F_ALDPS_ENABLE | RTL8211F_ALDPS_XTAL_OFF, - priv->phycr1); - if (ret < 0) { - dev_err(dev, "aldps mode configuration failed: %pe\n", + ret = rtl8211f_config_aldps(phydev); + if (ret) { + dev_err(dev, "aldps mode configuration failed: %pe\n", ERR_PTR(ret)); return ret; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 4465ae435ddc0162d5033a543658449d53d46d08 ]
To simplify the rtl8211f_config_init() control flow and get rid of "early" returns for PHYs where the PHYCR2 register is absent, move the entire logic sub-block that deals with disabling PHY-mode EEE to a separate function. There, it is much more obvious what the early "return 0" skips, and it becomes more difficult to accidentally skip unintended stuff.
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://patch.msgid.link/20251117234033.345679-7-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 4f0638b12451 ("net: phy: RTL8211FVD: Restore disabling of PHY-mode EEE") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/realtek/realtek_main.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/net/phy/realtek/realtek_main.c b/drivers/net/phy/realtek/realtek_main.c index 35f40bfdaf113..2c661346050f1 100644 --- a/drivers/net/phy/realtek/realtek_main.c +++ b/drivers/net/phy/realtek/realtek_main.c @@ -662,6 +662,17 @@ static int rtl8211f_config_aldps(struct phy_device *phydev) mask, mask); }
+static int rtl8211f_config_phy_eee(struct phy_device *phydev) +{ + /* RTL8211FVD has no PHYCR2 register */ + if (phydev->drv->phy_id == RTL_8211FVD_PHYID) + return 0; + + /* Disable PHY-mode EEE so LPI is passed to the MAC */ + return phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, RTL8211F_PHYCR2, + RTL8211F_PHYCR2_PHY_EEE_ENABLE, 0); +} + static int rtl8211f_config_init(struct phy_device *phydev) { struct device *dev = &phydev->mdio.dev; @@ -685,17 +696,7 @@ static int rtl8211f_config_init(struct phy_device *phydev) return ret; }
- /* RTL8211FVD has no PHYCR2 register */ - if (phydev->drv->phy_id == RTL_8211FVD_PHYID) - return 0; - - /* Disable PHY-mode EEE so LPI is passed to the MAC */ - ret = phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, RTL8211F_PHYCR2, - RTL8211F_PHYCR2_PHY_EEE_ENABLE, 0); - if (ret) - return ret; - - return 0; + return rtl8211f_config_phy_eee(phydev); }
static int rtl821x_suspend(struct phy_device *phydev)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ivan Galkin ivan.galkin@axis.com
[ Upstream commit 4f0638b12451112de4138689fa679315c8d388dc ]
When support for RTL8211F(D)(I)-VD-CG was introduced in commit bb726b753f75 ("net: phy: realtek: add support for RTL8211F(D)(I)-VD-CG") the implementation assumed that this PHY model doesn't have the control register PHYCR2 (Page 0xa43 Address 0x19). This assumption was based on the differences in CLKOUT configurations between RTL8211FVD and the remaining RTL8211F PHYs. In the latter commit 2c67301584f2 ("net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not present") this assumption was expanded to the PHY-mode EEE.
I performed tests on RTL8211FI-VD-CG and confirmed that disabling PHY-mode EEE works correctly and is uniform with other PHYs supported by the driver. To validate the correctness, I contacted Realtek support. Realtek confirmed that PHY-mode EEE on RTL8211F(D)(I)-VD-CG is configured via Page 0xa43 Address 0x19 bit 5.
Moreover, Realtek informed me that the most recent datasheet for RTL8211F(D)(I)-VD-CG v1.1 is incomplete and the naming of control registers is partly inconsistent. The errata I received from Realtek corrects the naming as follows:
| Register | Datasheet v1.1 | Errata | |-------------------------|----------------|--------| | Page 0xa44 Address 0x11 | PHYCR2 | PHYCR3 | | Page 0xa43 Address 0x19 | N/A | PHYCR2 |
This information confirms that the supposedly missing control register, PHYCR2, exists in the RTL8211F(D)(I)-VD-CG under the same address and the same name. It controls widely the same configs as other PHYs from the RTL8211F series (e.g. PHY-mode EEE). Clock out configuration is an exception.
Given all this information, restore disabling of the PHY-mode EEE.
Fixes: 2c67301584f2 ("net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not present") Signed-off-by: Ivan Galkin ivan.galkin@axis.com Reviewed-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://patch.msgid.link/20251202-phy_eee-v1-1-fe0bf6ab3df0@axis.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/realtek/realtek_main.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/drivers/net/phy/realtek/realtek_main.c b/drivers/net/phy/realtek/realtek_main.c index 2c661346050f1..7c3d277efaf07 100644 --- a/drivers/net/phy/realtek/realtek_main.c +++ b/drivers/net/phy/realtek/realtek_main.c @@ -664,10 +664,6 @@ static int rtl8211f_config_aldps(struct phy_device *phydev)
static int rtl8211f_config_phy_eee(struct phy_device *phydev) { - /* RTL8211FVD has no PHYCR2 register */ - if (phydev->drv->phy_id == RTL_8211FVD_PHYID) - return 0; - /* Disable PHY-mode EEE so LPI is passed to the MAC */ return phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, RTL8211F_PHYCR2, RTL8211F_PHYCR2_PHY_EEE_ENABLE, 0);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Skorodumov skorodumov.dmitry@huawei.com
[ Upstream commit 0c57ff008a11f24f7f05fa760222692a00465fec ]
Packets with pkt_type == PACKET_LOOPBACK are captured by handle_frame() function, but they don't have L2 header. We should not process them in handle_mode_l2().
This doesn't affect old L2 functionality, since handling was anyway incorrect.
Handle them the same way as in br_handle_frame(): just pass the skb.
To observe invalid behaviour, just start "ping -b" on bcast address of port-interface.
Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.") Signed-off-by: Dmitry Skorodumov skorodumov.dmitry@huawei.com Link: https://patch.msgid.link/20251202103906.4087675-1-skorodumov.dmitry@huawei.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ipvlan/ipvlan_core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c index d7e3ddbcab6f4..baf2ef3bcd54b 100644 --- a/drivers/net/ipvlan/ipvlan_core.c +++ b/drivers/net/ipvlan/ipvlan_core.c @@ -737,6 +737,9 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb, struct ethhdr *eth = eth_hdr(skb); rx_handler_result_t ret = RX_HANDLER_PASS;
+ if (unlikely(skb->pkt_type == PACKET_LOOPBACK)) + return RX_HANDLER_PASS; + if (is_multicast_ether_addr(eth->h_dest)) { if (ipvlan_external_frame(skb, port)) { struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gerd Bayer gbayer@linux.ibm.com
[ Upstream commit 6a107cfe9c99a079e578a4c5eb70038101a3599f ]
Clear hca_devcom_comp in device's private data after unregistering it in LAG teardown. Otherwise a slightly lagging second pass through mlx5_unload_one() might try to unregister it again and trip over use-after-free.
On s390 almost all PCI level recovery events trigger two passes through mxl5_unload_one() - one through the poll_health() method and one through mlx5_pci_err_detected() as callback from generic PCI error recovery. While testing PCI error recovery paths with more kernel debug features enabled, this issue reproducibly led to kernel panics with the following call chain:
Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803 ESOP-2 FSI Fault in home space mode while using kernel ASCE. AS:00000000705c4007 R3:0000000000000024 Oops: 0038 ilc:3 [#1]SMP
CPU: 14 UID: 0 PID: 156 Comm: kmcheck Kdump: loaded Not tainted 6.18.0-20251130.rc7.git0.16131a59cab1.300.fc43.s390x+debug #1 PREEMPT
Krnl PSW : 0404e00180000000 0000020fc86aa1dc (__lock_acquire+0x5c/0x15f0) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000000 0000020f00000001 6b6b6b6b6b6b6c33 0000000000000000 0000000000000000 0000000000000000 0000000000000001 0000000000000000 0000000000000000 0000020fca28b820 0000000000000000 0000010a1ced8100 0000010a1ced8100 0000020fc9775068 0000018fce14f8b8 0000018fce14f7f8 Krnl Code: 0000020fc86aa1cc: e3b003400004 lg %r11,832 0000020fc86aa1d2: a7840211 brc 8,0000020fc86aa5f4 *0000020fc86aa1d6: c09000df0b25 larl %r9,0000020fca28b820 >0000020fc86aa1dc: d50790002000 clc 0(8,%r9),0(%r2) 0000020fc86aa1e2: a7840209 brc 8,0000020fc86aa5f4 0000020fc86aa1e6: c0e001100401 larl %r14,0000020fca8aa9e8 0000020fc86aa1ec: c01000e25a00 larl %r1,0000020fca2f55ec 0000020fc86aa1f2: a7eb00e8 aghi %r14,232
Call Trace: __lock_acquire+0x5c/0x15f0 lock_acquire.part.0+0xf8/0x270 lock_acquire+0xb0/0x1b0 down_write+0x5a/0x250 mlx5_detach_device+0x42/0x110 [mlx5_core] mlx5_unload_one_devl_locked+0x50/0xc0 [mlx5_core] mlx5_unload_one+0x42/0x60 [mlx5_core] mlx5_pci_err_detected+0x94/0x150 [mlx5_core] zpci_event_attempt_error_recovery+0xcc/0x388
Fixes: 5a977b5833b7 ("net/mlx5: Lag, move devcom registration to LAG layer") Signed-off-by: Gerd Bayer gbayer@linux.ibm.com Reviewed-by: Moshe Shemesh moshe@nvidia.com Acked-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/20251202-fix_lag-v1-1-59e8177ffce0@linux.ibm.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c index 3db0387bf6dcb..8ec04a5f434dd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c @@ -1413,6 +1413,7 @@ static int __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev) static void mlx5_lag_unregister_hca_devcom_comp(struct mlx5_core_dev *dev) { mlx5_devcom_unregister_component(dev->priv.hca_devcom_comp); + dev->priv.hca_devcom_comp = NULL; }
static int mlx5_lag_register_hca_devcom_comp(struct mlx5_core_dev *dev)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit b6b638bda240395dff49a87403b2e32493e56d2a ]
mlxsw_sp_router_schedule_work() takes a reference on a neighbour, expecting a work item to release it later on. However, we might fail to schedule the work item, in which case the neighbour reference count will be leaked.
Fix by taking the reference just before scheduling the work item. Note that mlxsw_sp_router_schedule_work() can receive a NULL neighbour pointer, but neigh_clone() handles that correctly.
Spotted during code review, did not actually observe the reference count leak.
Fixes: 151b89f6025a ("mlxsw: spectrum_router: Reuse work neighbor initialization in work scheduler") Reviewed-by: Petr Machata petrm@nvidia.com Signed-off-by: Ido Schimmel idosch@nvidia.com Signed-off-by: Petr Machata petrm@nvidia.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/ec2934ae4aca187a8d8c9329a08ce93cca411378.1764695650... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c index a2033837182e8..f4e9ecaeb104f 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c @@ -2858,6 +2858,11 @@ static int mlxsw_sp_router_schedule_work(struct net *net, if (!net_work) return NOTIFY_BAD;
+ /* Take a reference to ensure the neighbour won't be destructed until + * we drop the reference in the work item. + */ + neigh_clone(n); + INIT_WORK(&net_work->work, cb); net_work->mlxsw_sp = router->mlxsw_sp; net_work->n = n; @@ -2881,11 +2886,6 @@ static int mlxsw_sp_router_schedule_neigh_work(struct mlxsw_sp_router *router, struct net *net;
net = neigh_parms_net(n->parms); - - /* Take a reference to ensure the neighbour won't be destructed until we - * drop the reference in delayed work. - */ - neigh_clone(n); return mlxsw_sp_router_schedule_work(net, router, n, mlxsw_sp_router_neigh_event_work); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit 8b0e69763ef948fb872a7767df4be665d18f5fd4 ]
We sometimes observe use-after-free when dereferencing a neighbour [1]. The problem seems to be that the driver stores a pointer to the neighbour, but without holding a reference on it. A reference is only taken when the neighbour is used by a nexthop.
Fix by simplifying the reference counting scheme. Always take a reference when storing a neighbour pointer in a neighbour entry. Avoid taking a referencing when the neighbour is used by a nexthop as the neighbour entry associated with the nexthop already holds a reference.
Tested by running the test that uncovered the problem over 300 times. Without this patch the problem was reproduced after a handful of iterations.
[1] BUG: KASAN: slab-use-after-free in mlxsw_sp_neigh_entry_update+0x2d4/0x310 Read of size 8 at addr ffff88817f8e3420 by task ip/3929
CPU: 3 UID: 0 PID: 3929 Comm: ip Not tainted 6.18.0-rc4-virtme-g36b21a067510 #3 PREEMPT(full) Hardware name: Nvidia SN5600/VMOD0013, BIOS 5.13 05/31/2023 Call Trace: <TASK> dump_stack_lvl+0x6f/0xa0 print_address_description.constprop.0+0x6e/0x300 print_report+0xfc/0x1fb kasan_report+0xe4/0x110 mlxsw_sp_neigh_entry_update+0x2d4/0x310 mlxsw_sp_router_rif_gone_sync+0x35f/0x510 mlxsw_sp_rif_destroy+0x1ea/0x730 mlxsw_sp_inetaddr_port_vlan_event+0xa1/0x1b0 __mlxsw_sp_inetaddr_lag_event+0xcc/0x130 __mlxsw_sp_inetaddr_event+0xf5/0x3c0 mlxsw_sp_router_netdevice_event+0x1015/0x1580 notifier_call_chain+0xcc/0x150 call_netdevice_notifiers_info+0x7e/0x100 __netdev_upper_dev_unlink+0x10b/0x210 netdev_upper_dev_unlink+0x79/0xa0 vrf_del_slave+0x18/0x50 do_set_master+0x146/0x7d0 do_setlink.isra.0+0x9a0/0x2880 rtnl_newlink+0x637/0xb20 rtnetlink_rcv_msg+0x6fe/0xb90 netlink_rcv_skb+0x123/0x380 netlink_unicast+0x4a3/0x770 netlink_sendmsg+0x75b/0xc90 __sock_sendmsg+0xbe/0x160 ____sys_sendmsg+0x5b2/0x7d0 ___sys_sendmsg+0xfd/0x180 __sys_sendmsg+0x124/0x1c0 do_syscall_64+0xbb/0xfd0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 [...]
Allocated by task 109: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x7b/0x90 __kmalloc_noprof+0x2c1/0x790 neigh_alloc+0x6af/0x8f0 ___neigh_create+0x63/0xe90 mlxsw_sp_nexthop_neigh_init+0x430/0x7e0 mlxsw_sp_nexthop_type_init+0x212/0x960 mlxsw_sp_nexthop6_group_info_init.constprop.0+0x81f/0x1280 mlxsw_sp_nexthop6_group_get+0x392/0x6a0 mlxsw_sp_fib6_entry_create+0x46a/0xfd0 mlxsw_sp_router_fib6_replace+0x1ed/0x5f0 mlxsw_sp_router_fib6_event_work+0x10a/0x2a0 process_one_work+0xd57/0x1390 worker_thread+0x4d6/0xd40 kthread+0x355/0x5b0 ret_from_fork+0x1d4/0x270 ret_from_fork_asm+0x11/0x20
Freed by task 154: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x43/0x70 kmem_cache_free_bulk.part.0+0x1eb/0x5e0 kvfree_rcu_bulk+0x1f2/0x260 kfree_rcu_work+0x130/0x1b0 process_one_work+0xd57/0x1390 worker_thread+0x4d6/0xd40 kthread+0x355/0x5b0 ret_from_fork+0x1d4/0x270 ret_from_fork_asm+0x11/0x20
Last potentially related work creation: kasan_save_stack+0x30/0x50 kasan_record_aux_stack+0x8c/0xa0 kvfree_call_rcu+0x93/0x5b0 mlxsw_sp_router_neigh_event_work+0x67d/0x860 process_one_work+0xd57/0x1390 worker_thread+0x4d6/0xd40 kthread+0x355/0x5b0 ret_from_fork+0x1d4/0x270 ret_from_fork_asm+0x11/0x20
Fixes: 6cf3c971dc84 ("mlxsw: spectrum_router: Add private neigh table") Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Petr Machata petrm@nvidia.com Signed-off-by: Petr Machata petrm@nvidia.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/92d75e21d95d163a41b5cea67a15cd33f547cba6.1764695650... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/mellanox/mlxsw/spectrum_router.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c index f4e9ecaeb104f..2d0e89bd2fb9c 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c @@ -2265,6 +2265,7 @@ mlxsw_sp_neigh_entry_alloc(struct mlxsw_sp *mlxsw_sp, struct neighbour *n, if (!neigh_entry) return NULL;
+ neigh_hold(n); neigh_entry->key.n = n; neigh_entry->rif = rif; INIT_LIST_HEAD(&neigh_entry->nexthop_list); @@ -2274,6 +2275,7 @@ mlxsw_sp_neigh_entry_alloc(struct mlxsw_sp *mlxsw_sp, struct neighbour *n,
static void mlxsw_sp_neigh_entry_free(struct mlxsw_sp_neigh_entry *neigh_entry) { + neigh_release(neigh_entry->key.n); kfree(neigh_entry); }
@@ -4320,6 +4322,8 @@ mlxsw_sp_nexthop_dead_neigh_replace(struct mlxsw_sp *mlxsw_sp, if (err) goto err_neigh_entry_insert;
+ neigh_release(old_n); + read_lock_bh(&n->lock); nud_state = n->nud_state; dead = n->dead; @@ -4328,14 +4332,10 @@ mlxsw_sp_nexthop_dead_neigh_replace(struct mlxsw_sp *mlxsw_sp,
list_for_each_entry(nh, &neigh_entry->nexthop_list, neigh_list_node) { - neigh_release(old_n); - neigh_clone(n); __mlxsw_sp_nexthop_neigh_update(nh, !entry_connected); mlxsw_sp_nexthop_group_refresh(mlxsw_sp, nh->nhgi->nh_grp); }
- neigh_release(n); - return 0;
err_neigh_entry_insert: @@ -4428,6 +4428,11 @@ static int mlxsw_sp_nexthop_neigh_init(struct mlxsw_sp *mlxsw_sp, } }
+ /* Release the reference taken by neigh_lookup() / neigh_create() since + * neigh_entry already holds one. + */ + neigh_release(n); + /* If that is the first nexthop connected to that neigh, add to * nexthop_neighs_list */ @@ -4454,11 +4459,9 @@ static void mlxsw_sp_nexthop_neigh_fini(struct mlxsw_sp *mlxsw_sp, struct mlxsw_sp_nexthop *nh) { struct mlxsw_sp_neigh_entry *neigh_entry = nh->neigh_entry; - struct neighbour *n;
if (!neigh_entry) return; - n = neigh_entry->key.n;
__mlxsw_sp_nexthop_neigh_update(nh, true); list_del(&nh->neigh_list_node); @@ -4472,8 +4475,6 @@ static void mlxsw_sp_nexthop_neigh_fini(struct mlxsw_sp *mlxsw_sp,
if (!neigh_entry->connected && list_empty(&neigh_entry->nexthop_list)) mlxsw_sp_neigh_entry_destroy(mlxsw_sp, neigh_entry); - - neigh_release(n); }
static bool mlxsw_sp_ipip_netdev_ul_up(struct net_device *ol_dev)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit 8ac1dacec458f55f871f7153242ed6ab60373b90 ]
Cited commit added a dedicated mutex (instead of RTNL) to protect the multicast route list, so that it will not change while the driver periodically traverses it in order to update the kernel about multicast route stats that were queried from the device.
One instance of list entry deletion (during route replace) was missed and it can result in a use-after-free [1].
Fix by acquiring the mutex before deleting the entry from the list and releasing it afterwards.
[1] BUG: KASAN: slab-use-after-free in mlxsw_sp_mr_stats_update+0x4a5/0x540 drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c:1006 [mlxsw_spectrum] Read of size 8 at addr ffff8881523c2fa8 by task kworker/2:5/22043
CPU: 2 UID: 0 PID: 22043 Comm: kworker/2:5 Not tainted 6.18.0-rc1-custom-g1a3d6d7cd014 #1 PREEMPT(full) Hardware name: Mellanox Technologies Ltd. MSN2010/SA002610, BIOS 5.6.5 08/24/2017 Workqueue: mlxsw_core mlxsw_sp_mr_stats_update [mlxsw_spectrum] Call Trace: <TASK> dump_stack_lvl+0xba/0x110 print_report+0x174/0x4f5 kasan_report+0xdf/0x110 mlxsw_sp_mr_stats_update+0x4a5/0x540 drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c:1006 [mlxsw_spectrum] process_one_work+0x9cc/0x18e0 worker_thread+0x5df/0xe40 kthread+0x3b8/0x730 ret_from_fork+0x3e9/0x560 ret_from_fork_asm+0x1a/0x30 </TASK>
Allocated by task 29933: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x8f/0xa0 mlxsw_sp_mr_route_add+0xd8/0x4770 [mlxsw_spectrum] mlxsw_sp_router_fibmr_event_work+0x371/0xad0 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:7965 [mlxsw_spectrum] process_one_work+0x9cc/0x18e0 worker_thread+0x5df/0xe40 kthread+0x3b8/0x730 ret_from_fork+0x3e9/0x560 ret_from_fork_asm+0x1a/0x30
Freed by task 29933: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_save_free_info+0x3b/0x70 __kasan_slab_free+0x43/0x70 kfree+0x14e/0x700 mlxsw_sp_mr_route_add+0x2dea/0x4770 drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c:444 [mlxsw_spectrum] mlxsw_sp_router_fibmr_event_work+0x371/0xad0 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:7965 [mlxsw_spectrum] process_one_work+0x9cc/0x18e0 worker_thread+0x5df/0xe40 kthread+0x3b8/0x730 ret_from_fork+0x3e9/0x560 ret_from_fork_asm+0x1a/0x30
Fixes: f38656d06725 ("mlxsw: spectrum_mr: Protect multicast route list with a lock") Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Petr Machata petrm@nvidia.com Signed-off-by: Petr Machata petrm@nvidia.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/f996feecfd59fde297964bfc85040b6d83ec6089.1764695650... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c index 5afe6b155ef0d..81935f87bfcd7 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c @@ -440,7 +440,9 @@ int mlxsw_sp_mr_route_add(struct mlxsw_sp_mr_table *mr_table, rhashtable_remove_fast(&mr_table->route_ht, &mr_orig_route->ht_node, mlxsw_sp_mr_route_ht_params); + mutex_lock(&mr_table->route_list_lock); list_del(&mr_orig_route->node); + mutex_unlock(&mr_table->route_list_lock); mlxsw_sp_mr_route_destroy(mr_table, mr_orig_route); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit 0373d5c387f24de749cc22e694a14b3a7c7eb515 ]
For XDP_TX action in bnxt_rx_xdp(), clearing of the event flags is not correct. __bnxt_poll_work() -> bnxt_rx_pkt() -> bnxt_rx_xdp() may be looping within NAPI and some event flags may be set in earlier iterations. In particular, if BNXT_TX_EVENT is set earlier indicating some XDP_TX packets are ready and pending, it will be cleared if it is XDP_TX action again. Normally, we will set BNXT_TX_EVENT again when we successfully call __bnxt_xmit_xdp(). But if the TX ring has no more room, the flag will not be set. This will cause the TX producer to be ahead but the driver will not hit the TX doorbell.
For multi-buf XDP_TX, there is no need to clear the event flags and set BNXT_AGG_EVENT. The BNXT_AGG_EVENT flag should have been set earlier in bnxt_rx_pkt().
The visible symptom of this is that the RX ring associated with the TX XDP ring will eventually become empty and all packets will be dropped. Because this condition will cause the driver to not refill the RX ring seeing that the TX ring has forever pending XDP_TX packets.
The fix is to only clear BNXT_RX_EVENT when we have successfully called __bnxt_xmit_xdp().
Fixes: 7f0a168b0441 ("bnxt_en: Add completion ring pointer in TX and RX ring structures") Reported-by: Pavel Dubovitsky pdubovitsky@meta.com Reviewed-by: Andy Gospodarek andrew.gospodarek@broadcom.com Reviewed-by: Pavan Chebbi pavan.chebbi@broadcom.com Reviewed-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20251203003024.2246699-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c index 3e77a96e5a3e3..c94a391b1ba5b 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c @@ -268,13 +268,11 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons, case XDP_TX: rx_buf = &rxr->rx_buf_ring[cons]; mapping = rx_buf->mapping - bp->rx_dma_offset; - *event &= BNXT_TX_CMP_EVENT;
if (unlikely(xdp_buff_has_frags(xdp))) { struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
tx_needed += sinfo->nr_frags; - *event = BNXT_AGG_EVENT; }
if (tx_avail < tx_needed) { @@ -287,6 +285,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons, dma_sync_single_for_device(&pdev->dev, mapping + offset, *len, bp->rx_dir);
+ *event &= ~BNXT_RX_EVENT; *event |= BNXT_TX_EVENT; __bnxt_xmit_xdp(bp, txr, mapping + offset, *len, NEXT_RX(rxr->rx_prod), xdp);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Maximets i.maximets@ovn.org
[ Upstream commit 5ace7ef87f059d68b5f50837ef3e8a1a4870c36e ]
The push_nsh() action structure looks like this:
OVS_ACTION_ATTR_PUSH_NSH(OVS_KEY_ATTR_NSH(OVS_NSH_KEY_ATTR_BASE,...))
The outermost OVS_ACTION_ATTR_PUSH_NSH attribute is OK'ed by the nla_for_each_nested() inside __ovs_nla_copy_actions(). The innermost OVS_NSH_KEY_ATTR_BASE/MD1/MD2 are OK'ed by the nla_for_each_nested() inside nsh_key_put_from_nlattr(). But nothing checks if the attribute in the middle is OK. We don't even check that this attribute is the OVS_KEY_ATTR_NSH. We just do a double unwrap with a pair of nla_data() calls - first time directly while calling validate_push_nsh() and the second time as part of the nla_for_each_nested() macro, which isn't safe, potentially causing invalid memory access if the size of this attribute is incorrect. The failure may not be noticed during validation due to larger netlink buffer, but cause trouble later during action execution where the buffer is allocated exactly to the size:
BUG: KASAN: slab-out-of-bounds in nsh_hdr_from_nlattr+0x1dd/0x6a0 [openvswitch] Read of size 184 at addr ffff88816459a634 by task a.out/22624
CPU: 8 UID: 0 PID: 22624 6.18.0-rc7+ #115 PREEMPT(voluntary) Call Trace: <TASK> dump_stack_lvl+0x51/0x70 print_address_description.constprop.0+0x2c/0x390 kasan_report+0xdd/0x110 kasan_check_range+0x35/0x1b0 __asan_memcpy+0x20/0x60 nsh_hdr_from_nlattr+0x1dd/0x6a0 [openvswitch] push_nsh+0x82/0x120 [openvswitch] do_execute_actions+0x1405/0x2840 [openvswitch] ovs_execute_actions+0xd5/0x3b0 [openvswitch] ovs_packet_cmd_execute+0x949/0xdb0 [openvswitch] genl_family_rcv_msg_doit+0x1d6/0x2b0 genl_family_rcv_msg+0x336/0x580 genl_rcv_msg+0x9f/0x130 netlink_rcv_skb+0x11f/0x370 genl_rcv+0x24/0x40 netlink_unicast+0x73e/0xaa0 netlink_sendmsg+0x744/0xbf0 __sys_sendto+0x3d6/0x450 do_syscall_64+0x79/0x2c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK>
Let's add some checks that the attribute is properly sized and it's the only one attribute inside the action. Technically, there is no real reason for OVS_KEY_ATTR_NSH to be there, as we know that we're pushing an NSH header already, it just creates extra nesting, but that's how uAPI works today. So, keeping as it is.
Fixes: b2d0f5d5dc53 ("openvswitch: enable NSH support") Reported-by: Junvy Yang zhuque@tencent.com Signed-off-by: Ilya Maximets i.maximets@ovn.org Acked-by: Eelco Chaudron echaudro@redhat.com Reviewed-by: Aaron Conole aconole@redhat.com Link: https://patch.msgid.link/20251204105334.900379-1-i.maximets@ovn.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/openvswitch/flow_netlink.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c index 1cb4f97335d87..2d536901309ea 100644 --- a/net/openvswitch/flow_netlink.c +++ b/net/openvswitch/flow_netlink.c @@ -2802,13 +2802,20 @@ static int validate_and_copy_set_tun(const struct nlattr *attr, return err; }
-static bool validate_push_nsh(const struct nlattr *attr, bool log) +static bool validate_push_nsh(const struct nlattr *a, bool log) { + struct nlattr *nsh_key = nla_data(a); struct sw_flow_match match; struct sw_flow_key key;
+ /* There must be one and only one NSH header. */ + if (!nla_ok(nsh_key, nla_len(a)) || + nla_total_size(nla_len(nsh_key)) != nla_len(a) || + nla_type(nsh_key) != OVS_KEY_ATTR_NSH) + return false; + ovs_match_init(&match, &key, true, NULL); - return !nsh_key_put_from_nlattr(attr, &match, false, true, log); + return !nsh_key_put_from_nlattr(nsh_key, &match, false, true, log); }
/* Return false if there are any non-masked bits set. @@ -3389,7 +3396,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr, return -EINVAL; } mac_proto = MAC_PROTO_NONE; - if (!validate_push_nsh(nla_data(a), log)) + if (!validate_push_nsh(a, log)) return -EINVAL; break;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 9e7477a427449a8a3cd00c188e20a880e3d94638 ]
The new icssg-prueth driver needs the same dependency as the other parts that use the ptp-1588:
WARNING: unmet direct dependencies detected for TI_ICSS_IEP Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_TI [=y] && PTP_1588_CLOCK_OPTIONAL [=m] && TI_PRUSS [=y] Selected by [y]: - TI_PRUETH [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_TI [=y] && PRU_REMOTEPROC [=y] && NET_SWITCHDEV [=y]
Add the correct dependency on the two drivers missing it, and remove the pointless 'imply' in the process.
Fixes: e654b85a693e ("net: ti: icssg-prueth: Add ICSSG Ethernet driver for AM65x SR1.0 platforms") Fixes: 511f6c1ae093 ("net: ti: icssm-prueth: Adds ICSSM Ethernet driver") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Vadim Fedorenko vadim.fedorenko@linux.dev Link: https://patch.msgid.link/20251204100138.1034175-1-arnd@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ti/Kconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig index a54d71155263c..fe5b2926d8ab0 100644 --- a/drivers/net/ethernet/ti/Kconfig +++ b/drivers/net/ethernet/ti/Kconfig @@ -209,6 +209,7 @@ config TI_ICSSG_PRUETH_SR1 depends on PRU_REMOTEPROC depends on NET_SWITCHDEV depends on ARCH_K3 && OF && TI_K3_UDMA_GLUE_LAYER + depends on PTP_1588_CLOCK_OPTIONAL help Support dual Gigabit Ethernet ports over the ICSSG PRU Subsystem. This subsystem is available on the AM65 SR1.0 platform. @@ -234,7 +235,7 @@ config TI_PRUETH depends on PRU_REMOTEPROC depends on NET_SWITCHDEV select TI_ICSS_IEP - imply PTP_1588_CLOCK + depends on PTP_1588_CLOCK_OPTIONAL help Some TI SoCs has Programmable Realtime Unit (PRU) cores which can support Single or Dual Ethernet ports with the help of firmware code
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Simakov bigalex934@gmail.com
[ Upstream commit 50b3db3e11864cb4e18ff099cfb38e11e7f87a68 ]
On execution path with raised B44_FLAG_EXTERNAL_PHY, b44_readphy() leaves bmcr value uninitialized and it is used later in the code.
Add check of this flag at the beginning of the b44_nway_reset() and exit early of the function with restarting autonegotiation if an external PHY is used.
Fixes: 753f492093da ("[B44]: port to native ssb support") Reviewed-by: Jonas Gorski jonas.gorski@gmail.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Alexey Simakov bigalex934@gmail.com Reviewed-by: Michael Chan michael.chan@broadcom.com Link: https://patch.msgid.link/20251205155815.4348-1-bigalex934@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/b44.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c index 0353359c3fe96..073d7d490d4b6 100644 --- a/drivers/net/ethernet/broadcom/b44.c +++ b/drivers/net/ethernet/broadcom/b44.c @@ -1789,6 +1789,9 @@ static int b44_nway_reset(struct net_device *dev) u32 bmcr; int r;
+ if (bp->flags & B44_FLAG_EXTERNAL_PHY) + return phy_ethtool_nway_reset(dev); + spin_lock_irq(&bp->lock); b44_readphy(bp, MII_BMCR, &bmcr); b44_readphy(bp, MII_BMCR, &bmcr);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 06f7cae92fe346fa49a8a9b161124b26cc5c3ed1 ]
Fix:
gcc: error: unrecognized command-line option ‘-Wflex-array-member-not-at-end’
by making the compiler option dependent on its support.
Fixes: 1838731f1072c ("selftest: af_unix: Add -Wall and -Wflex-array-member-not-at-end to CFLAGS.") Cc: Kuniyuki Iwashima kuniyu@google.com Signed-off-by: Guenter Roeck linux@roeck-us.net Link: https://patch.msgid.link/20251205171010.515236-7-linux@roeck-us.net Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/af_unix/Makefile | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/af_unix/Makefile b/tools/testing/selftests/net/af_unix/Makefile index 528d14c598bb5..2889403e35468 100644 --- a/tools/testing/selftests/net/af_unix/Makefile +++ b/tools/testing/selftests/net/af_unix/Makefile @@ -1,4 +1,9 @@ -CFLAGS += $(KHDR_INCLUDES) -Wall -Wflex-array-member-not-at-end +top_srcdir := ../../../../.. +include $(top_srcdir)/scripts/Makefile.compiler + +cc-option = $(call __cc-option, $(CC),,$(1),$(2)) + +CFLAGS += $(KHDR_INCLUDES) -Wall $(call cc-option,-Wflex-array-member-not-at-end)
TEST_GEN_PROGS := \ diag_uid \
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 59546e874403c1dd0cbc42df06fdf8c113f72022 ]
Fix
ksft.h: In function ‘ksft_ready’: ksft.h:27:9: warning: ignoring return value of ‘write’ declared with attribute ‘warn_unused_result’
ksft.h: In function ‘ksft_wait’: ksft.h:51:9: warning: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’
by checking the return value of the affected functions and displaying an error message if an error is seen.
Fixes: 2b6d490b82668 ("selftests: drv-net: Factor out ksft C helpers") Cc: Joe Damato jdamato@fastly.com Signed-off-by: Guenter Roeck linux@roeck-us.net Link: https://patch.msgid.link/20251205171010.515236-11-linux@roeck-us.net Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/lib/ksft.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/lib/ksft.h b/tools/testing/selftests/net/lib/ksft.h index 17dc34a612c64..03912902a6d30 100644 --- a/tools/testing/selftests/net/lib/ksft.h +++ b/tools/testing/selftests/net/lib/ksft.h @@ -24,7 +24,8 @@ static inline void ksft_ready(void) fd = STDOUT_FILENO; }
- write(fd, msg, sizeof(msg)); + if (write(fd, msg, sizeof(msg)) < 0) + perror("write()"); if (fd != STDOUT_FILENO) close(fd); } @@ -48,7 +49,8 @@ static inline void ksft_wait(void) fd = STDIN_FILENO; }
- read(fd, &byte, sizeof(byte)); + if (read(fd, &byte, sizeof(byte)) < 0) + perror("read()"); if (fd != STDIN_FILENO) close(fd); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 91dc09a609d9443e6b34bdb355a18d579a95e132 ]
Fix
tfo.c: In function ‘run_server’: tfo.c:84:9: warning: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’
by evaluating the return value from read() and displaying an error message if it reports an error.
Fixes: c65b5bb2329e3 ("selftests: net: add passive TFO test binary") Cc: David Wei dw@davidwei.uk Signed-off-by: Guenter Roeck linux@roeck-us.net Link: https://patch.msgid.link/20251205171010.515236-14-linux@roeck-us.net Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/tfo.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/tfo.c b/tools/testing/selftests/net/tfo.c index eb3cac5e583c9..8d82140f0f767 100644 --- a/tools/testing/selftests/net/tfo.c +++ b/tools/testing/selftests/net/tfo.c @@ -81,7 +81,8 @@ static void run_server(void) if (getsockopt(connfd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &opt, &len) < 0) error(1, errno, "getsockopt(SO_INCOMING_NAPI_ID)");
- read(connfd, buf, 64); + if (read(connfd, buf, 64) < 0) + perror("read()"); fprintf(outfile, "%d\n", opt);
fclose(outfile);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 8ef522c8a59a048117f7e05eb5213043c02f986f ]
In ip_frag_reinit() we want to move the frag timeout timer into the future. If the timer fires in the meantime we inadvertently scheduled it again, and since the timer assumes a ref on frag_queue we need to acquire one to balance things out.
This is technically racy, we should have acquired the reference _before_ we touch the timer, it may fire again before we take the ref. Avoid this entire dance by using mod_timer_pending() which only modifies the timer if its pending (and which exists since Linux v2.6.30)
Note that this was the only place we ever took a ref on frag_queue since Eric's conversion to RCU. So we could potentially replace the whole refcnt field with an atomic flag and a bit more RCU.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20251207010942.1672972-2-kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/inet_fragment.c | 4 +++- net/ipv4/ip_fragment.c | 4 +--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c index 025895eb6ec59..30f4fa50ee2d7 100644 --- a/net/ipv4/inet_fragment.c +++ b/net/ipv4/inet_fragment.c @@ -327,7 +327,9 @@ static struct inet_frag_queue *inet_frag_alloc(struct fqdir *fqdir,
timer_setup(&q->timer, f->frag_expire, 0); spin_lock_init(&q->lock); - /* One reference for the timer, one for the hash table. */ + /* One reference for the timer, one for the hash table. + * We never take any extra references, only decrement this field. + */ refcount_set(&q->refcnt, 2);
return q; diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index f7012479713ba..d7bccdc9dc693 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -242,10 +242,8 @@ static int ip_frag_reinit(struct ipq *qp) { unsigned int sum_truesize = 0;
- if (!mod_timer(&qp->q.timer, jiffies + qp->q.fqdir->timeout)) { - refcount_inc(&qp->q.refcnt); + if (!mod_timer_pending(&qp->q.timer, jiffies + qp->q.fqdir->timeout)) return -ETIMEDOUT; - }
sum_truesize = inet_frag_rbtree_purge(&qp->q.rb_fragments, SKB_DROP_REASON_FRAG_TOO_FAR);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 1231eec6994be29d6bb5c303dfa54731ed9fc0e6 ]
Instead of exporting inet_frag_rbtree_purge() which requires that caller takes care of memory accounting, add a new helper. We will need to call it from a few places in the next patch.
Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20251207010942.1672972-3-kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 006a5035b495 ("inet: frags: flush pending skbs in fqdir_pre_exit()") Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/inet_frag.h | 5 ++--- net/ipv4/inet_fragment.c | 15 ++++++++++++--- net/ipv4/ip_fragment.c | 6 +----- 3 files changed, 15 insertions(+), 11 deletions(-)
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h index 0eccd9c3a883f..3ffaceee7bbc0 100644 --- a/include/net/inet_frag.h +++ b/include/net/inet_frag.h @@ -141,9 +141,8 @@ void inet_frag_kill(struct inet_frag_queue *q, int *refs); void inet_frag_destroy(struct inet_frag_queue *q); struct inet_frag_queue *inet_frag_find(struct fqdir *fqdir, void *key);
-/* Free all skbs in the queue; return the sum of their truesizes. */ -unsigned int inet_frag_rbtree_purge(struct rb_root *root, - enum skb_drop_reason reason); +void inet_frag_queue_flush(struct inet_frag_queue *q, + enum skb_drop_reason reason);
static inline void inet_frag_putn(struct inet_frag_queue *q, int refs) { diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c index 30f4fa50ee2d7..1bf969b5a1cb5 100644 --- a/net/ipv4/inet_fragment.c +++ b/net/ipv4/inet_fragment.c @@ -263,8 +263,8 @@ static void inet_frag_destroy_rcu(struct rcu_head *head) kmem_cache_free(f->frags_cachep, q); }
-unsigned int inet_frag_rbtree_purge(struct rb_root *root, - enum skb_drop_reason reason) +static unsigned int +inet_frag_rbtree_purge(struct rb_root *root, enum skb_drop_reason reason) { struct rb_node *p = rb_first(root); unsigned int sum = 0; @@ -284,7 +284,16 @@ unsigned int inet_frag_rbtree_purge(struct rb_root *root, } return sum; } -EXPORT_SYMBOL(inet_frag_rbtree_purge); + +void inet_frag_queue_flush(struct inet_frag_queue *q, + enum skb_drop_reason reason) +{ + unsigned int sum; + + sum = inet_frag_rbtree_purge(&q->rb_fragments, reason); + sub_frag_mem_limit(q->fqdir, sum); +} +EXPORT_SYMBOL(inet_frag_queue_flush);
void inet_frag_destroy(struct inet_frag_queue *q) { diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index d7bccdc9dc693..32f1c1a46ba72 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -240,14 +240,10 @@ static int ip_frag_too_far(struct ipq *qp)
static int ip_frag_reinit(struct ipq *qp) { - unsigned int sum_truesize = 0; - if (!mod_timer_pending(&qp->q.timer, jiffies + qp->q.fqdir->timeout)) return -ETIMEDOUT;
- sum_truesize = inet_frag_rbtree_purge(&qp->q.rb_fragments, - SKB_DROP_REASON_FRAG_TOO_FAR); - sub_frag_mem_limit(qp->q.fqdir, sum_truesize); + inet_frag_queue_flush(&qp->q, SKB_DROP_REASON_FRAG_TOO_FAR);
qp->q.flags = 0; qp->q.len = 0;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 006a5035b495dec008805df249f92c22c89c3d2e ]
We have been seeing occasional deadlocks on pernet_ops_rwsem since September in NIPA. The stuck task was usually modprobe (often loading a driver like ipvlan), trying to take the lock as a Writer. lockdep does not track readers for rwsems so the read wasn't obvious from the reports.
On closer inspection the Reader holding the lock was conntrack looping forever in nf_conntrack_cleanup_net_list(). Based on past experience with occasional NIPA crashes I looked thru the tests which run before the crash and noticed that the crash follows ip_defrag.sh. An immediate red flag. Scouring thru (de)fragmentation queues reveals skbs sitting around, holding conntrack references.
The problem is that since conntrack depends on nf_defrag_ipv6, nf_defrag_ipv6 will load first. Since nf_defrag_ipv6 loads first its netns exit hooks run _after_ conntrack's netns exit hook.
Flush all fragment queue SKBs during fqdir_pre_exit() to release conntrack references before conntrack cleanup runs. Also flush the queues in timer expiry handlers when they discover fqdir->dead is set, in case packet sneaks in while we're running the pre_exit flush.
The commit under Fixes is not exactly the culprit, but I think previously the timer firing would eventually unblock the spinning conntrack.
Fixes: d5dd88794a13 ("inet: fix various use-after-free in defrags units") Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20251207010942.1672972-4-kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/inet_frag.h | 13 +------------ include/net/ipv6_frag.h | 9 ++++++--- net/ipv4/inet_fragment.c | 36 ++++++++++++++++++++++++++++++++++++ net/ipv4/ip_fragment.c | 12 +++++++----- 4 files changed, 50 insertions(+), 20 deletions(-)
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h index 3ffaceee7bbc0..365925c9d2628 100644 --- a/include/net/inet_frag.h +++ b/include/net/inet_frag.h @@ -123,18 +123,7 @@ void inet_frags_fini(struct inet_frags *);
int fqdir_init(struct fqdir **fqdirp, struct inet_frags *f, struct net *net);
-static inline void fqdir_pre_exit(struct fqdir *fqdir) -{ - /* Prevent creation of new frags. - * Pairs with READ_ONCE() in inet_frag_find(). - */ - WRITE_ONCE(fqdir->high_thresh, 0); - - /* Pairs with READ_ONCE() in inet_frag_kill(), ip_expire() - * and ip6frag_expire_frag_queue(). - */ - WRITE_ONCE(fqdir->dead, true); -} +void fqdir_pre_exit(struct fqdir *fqdir); void fqdir_exit(struct fqdir *fqdir);
void inet_frag_kill(struct inet_frag_queue *q, int *refs); diff --git a/include/net/ipv6_frag.h b/include/net/ipv6_frag.h index 38ef66826939e..41d9fc6965f9a 100644 --- a/include/net/ipv6_frag.h +++ b/include/net/ipv6_frag.h @@ -69,9 +69,6 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq) int refs = 1;
rcu_read_lock(); - /* Paired with the WRITE_ONCE() in fqdir_pre_exit(). */ - if (READ_ONCE(fq->q.fqdir->dead)) - goto out_rcu_unlock; spin_lock(&fq->q.lock);
if (fq->q.flags & INET_FRAG_COMPLETE) @@ -80,6 +77,12 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq) fq->q.flags |= INET_FRAG_DROP; inet_frag_kill(&fq->q, &refs);
+ /* Paired with the WRITE_ONCE() in fqdir_pre_exit(). */ + if (READ_ONCE(fq->q.fqdir->dead)) { + inet_frag_queue_flush(&fq->q, 0); + goto out; + } + dev = dev_get_by_index_rcu(net, fq->iif); if (!dev) goto out; diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c index 1bf969b5a1cb5..001ee5c4d962e 100644 --- a/net/ipv4/inet_fragment.c +++ b/net/ipv4/inet_fragment.c @@ -218,6 +218,41 @@ static int __init inet_frag_wq_init(void)
pure_initcall(inet_frag_wq_init);
+void fqdir_pre_exit(struct fqdir *fqdir) +{ + struct inet_frag_queue *fq; + struct rhashtable_iter hti; + + /* Prevent creation of new frags. + * Pairs with READ_ONCE() in inet_frag_find(). + */ + WRITE_ONCE(fqdir->high_thresh, 0); + + /* Pairs with READ_ONCE() in inet_frag_kill(), ip_expire() + * and ip6frag_expire_frag_queue(). + */ + WRITE_ONCE(fqdir->dead, true); + + rhashtable_walk_enter(&fqdir->rhashtable, &hti); + rhashtable_walk_start(&hti); + + while ((fq = rhashtable_walk_next(&hti))) { + if (IS_ERR(fq)) { + if (PTR_ERR(fq) != -EAGAIN) + break; + continue; + } + spin_lock_bh(&fq->lock); + if (!(fq->flags & INET_FRAG_COMPLETE)) + inet_frag_queue_flush(fq, 0); + spin_unlock_bh(&fq->lock); + } + + rhashtable_walk_stop(&hti); + rhashtable_walk_exit(&hti); +} +EXPORT_SYMBOL(fqdir_pre_exit); + void fqdir_exit(struct fqdir *fqdir) { INIT_WORK(&fqdir->destroy_work, fqdir_work_fn); @@ -290,6 +325,7 @@ void inet_frag_queue_flush(struct inet_frag_queue *q, { unsigned int sum;
+ reason = reason ?: SKB_DROP_REASON_FRAG_REASM_TIMEOUT; sum = inet_frag_rbtree_purge(&q->rb_fragments, reason); sub_frag_mem_limit(q->fqdir, sum); } diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index 32f1c1a46ba72..56b0f738d2f27 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -134,11 +134,6 @@ static void ip_expire(struct timer_list *t) net = qp->q.fqdir->net;
rcu_read_lock(); - - /* Paired with WRITE_ONCE() in fqdir_pre_exit(). */ - if (READ_ONCE(qp->q.fqdir->dead)) - goto out_rcu_unlock; - spin_lock(&qp->q.lock);
if (qp->q.flags & INET_FRAG_COMPLETE) @@ -146,6 +141,13 @@ static void ip_expire(struct timer_list *t)
qp->q.flags |= INET_FRAG_DROP; inet_frag_kill(&qp->q, &refs); + + /* Paired with WRITE_ONCE() in fqdir_pre_exit(). */ + if (READ_ONCE(qp->q.fqdir->dead)) { + inet_frag_queue_flush(&qp->q, 0); + goto out; + } + __IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS); __IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fernando Fernandez Mancera fmancera@suse.de
[ Upstream commit 2e2a720766886190a6d35c116794693aabd332b6 ]
There are some situations where ct might be leaked as error paths are skipping the refcounted check and return immediately. In order to solve it make sure that the check is always called.
Fixes: be102eb6a0e7 ("netfilter: nf_conncount: rework API to use sk_buff directly") Signed-off-by: Fernando Fernandez Mancera fmancera@suse.de Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conncount.c | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c index b84cfb5616df4..3c1b155f7a0ea 100644 --- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -172,14 +172,14 @@ static int __nf_conncount_add(struct net *net, struct nf_conn *found_ct; unsigned int collect = 0; bool refcounted = false; + int err = 0;
if (!get_ct_or_tuple_from_skb(net, skb, l3num, &ct, &tuple, &zone, &refcounted)) return -ENOENT;
if (ct && nf_ct_is_confirmed(ct)) { - if (refcounted) - nf_ct_put(ct); - return -EEXIST; + err = -EEXIST; + goto out_put; }
if ((u32)jiffies == list->last_gc) @@ -231,12 +231,16 @@ static int __nf_conncount_add(struct net *net, }
add_new_node: - if (WARN_ON_ONCE(list->count > INT_MAX)) - return -EOVERFLOW; + if (WARN_ON_ONCE(list->count > INT_MAX)) { + err = -EOVERFLOW; + goto out_put; + }
conn = kmem_cache_alloc(conncount_conn_cachep, GFP_ATOMIC); - if (conn == NULL) - return -ENOMEM; + if (conn == NULL) { + err = -ENOMEM; + goto out_put; + }
conn->tuple = tuple; conn->zone = *zone; @@ -249,7 +253,7 @@ static int __nf_conncount_add(struct net *net, out_put: if (refcounted) nf_ct_put(ct); - return 0; + return err; }
int nf_conncount_add_skb(struct net *net, @@ -446,11 +450,10 @@ insert_tree(struct net *net,
rb_link_node_rcu(&rbconn->node, parent, rbnode); rb_insert_color(&rbconn->node, root); - - if (refcounted) - nf_ct_put(ct); } out_unlock: + if (refcounted) + nf_ct_put(ct); spin_unlock_bh(&nf_conncount_locks[hash]); return count; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Slavin Liu slavin452@gmail.com
[ Upstream commit ad891bb3d079a46a821bf2b8867854645191bab0 ]
The IPv4 code path in __ip_vs_get_out_rt() calls dst_link_failure() without ensuring skb->dev is set, leading to a NULL pointer dereference in fib_compute_spec_dst() when ipv4_link_failure() attempts to send ICMP destination unreachable messages.
The issue emerged after commit ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure") started calling __ip_options_compile() from ipv4_link_failure(). This code path eventually calls fib_compute_spec_dst() which dereferences skb->dev. An attempt was made to fix the NULL skb->dev dereference in commit 0113d9c9d1cc ("ipv4: fix null-deref in ipv4_link_failure"), but it only addressed the immediate dev_net(skb->dev) dereference by using a fallback device. The fix was incomplete because fib_compute_spec_dst() later in the call chain still accesses skb->dev directly, which remains NULL when IPVS calls dst_link_failure().
The crash occurs when: 1. IPVS processes a packet in NAT mode with a misconfigured destination 2. Route lookup fails in __ip_vs_get_out_rt() before establishing a route 3. The error path calls dst_link_failure(skb) with skb->dev == NULL 4. ipv4_link_failure() → ipv4_send_dest_unreach() → __ip_options_compile() → fib_compute_spec_dst() 5. fib_compute_spec_dst() dereferences NULL skb->dev
Apply the same fix used for IPv6 in commit 326bf17ea5d4 ("ipvs: fix ipv6 route unreach panic"): set skb->dev from skb_dst(skb)->dev before calling dst_link_failure().
KASAN: null-ptr-deref in range [0x0000000000000328-0x000000000000032f] CPU: 1 PID: 12732 Comm: syz.1.3469 Not tainted 6.6.114 #2 RIP: 0010:__in_dev_get_rcu include/linux/inetdevice.h:233 RIP: 0010:fib_compute_spec_dst+0x17a/0x9f0 net/ipv4/fib_frontend.c:285 Call Trace: <TASK> spec_dst_fill net/ipv4/ip_options.c:232 spec_dst_fill net/ipv4/ip_options.c:229 __ip_options_compile+0x13a1/0x17d0 net/ipv4/ip_options.c:330 ipv4_send_dest_unreach net/ipv4/route.c:1252 ipv4_link_failure+0x702/0xb80 net/ipv4/route.c:1265 dst_link_failure include/net/dst.h:437 __ip_vs_get_out_rt+0x15fd/0x19e0 net/netfilter/ipvs/ip_vs_xmit.c:412 ip_vs_nat_xmit+0x1d8/0xc80 net/netfilter/ipvs/ip_vs_xmit.c:764
Fixes: ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure") Signed-off-by: Slavin Liu slavin452@gmail.com Acked-by: Julian Anastasov ja@ssi.bg Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipvs/ip_vs_xmit.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c index 95af252b29397..618fbe1240b54 100644 --- a/net/netfilter/ipvs/ip_vs_xmit.c +++ b/net/netfilter/ipvs/ip_vs_xmit.c @@ -409,6 +409,9 @@ __ip_vs_get_out_rt(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb, return -1;
err_unreach: + if (!skb->dev) + skb->dev = skb_dst(skb)->dev; + dst_link_failure(skb); return -1; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit b8a81b0ce539e021ac72825238aea1eb657000f0 ]
Jakub says: "We try to reserve SKIP for tests skipped because tool is missing in env, something isn't built into the kernel etc."
use xfail, we can't force the race condition to appear at will so its expected that the test 'fails' occasionally.
Fixes: 78a588363587 ("selftests: netfilter: add conntrack clash resolution test case") Reported-by: Jakub Kicinski kuba@kernel.org Closes: https://lore.kernel.org/netdev/20251206175647.5c32f419@kernel.org/ Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/netfilter/conntrack_clash.sh | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/net/netfilter/conntrack_clash.sh b/tools/testing/selftests/net/netfilter/conntrack_clash.sh index 7fc6c5dbd5516..84b8eb12143ae 100755 --- a/tools/testing/selftests/net/netfilter/conntrack_clash.sh +++ b/tools/testing/selftests/net/netfilter/conntrack_clash.sh @@ -116,7 +116,7 @@ run_one_clash_test() # not a failure: clash resolution logic did not trigger. # With right timing, xmit completed sequentially and # no parallel insertion occurs. - return $ksft_skip + return $ksft_xfail }
run_clash_test() @@ -133,12 +133,12 @@ run_clash_test() if [ $rv -eq 0 ];then echo "PASS: clash resolution test for $daddr:$dport on attempt $i" return 0 - elif [ $rv -eq $ksft_skip ]; then + elif [ $rv -eq $ksft_xfail ]; then softerr=1 fi done
- [ $softerr -eq 1 ] && echo "SKIP: clash resolution for $daddr:$dport did not trigger" + [ $softerr -eq 1 ] && echo "XFAIL: clash resolution for $daddr:$dport did not trigger" }
ip link add veth0 netns "$nsclient1" type veth peer name veth0 netns "$nsrouter" @@ -167,8 +167,7 @@ load_simple_ruleset "$nsclient2" run_clash_test "$nsclient2" "$nsclient2" 127.0.0.1 9001
if [ $clash_resolution_active -eq 0 ];then - [ "$ret" -eq 0 ] && ret=$ksft_skip - echo "SKIP: Clash resolution did not trigger" + [ "$ret" -eq 0 ] && ret=$ksft_xfail fi
exit $ret
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junrui Luo moonafterrain@outlook.com
[ Upstream commit 8a11ff0948b5ad09b71896b7ccc850625f9878d1 ]
The cffrml_receive() function extracts a length field from the packet header and, when FCS is disabled, subtracts 2 from this length without validating that len >= 2.
If an attacker sends a malicious packet with a length field of 0 or 1 to an interface with FCS disabled, the subtraction causes an integer underflow.
This can lead to memory exhaustion and kernel instability, potential information disclosure if padding contains uninitialized kernel memory.
Fix this by validating that len >= 2 before performing the subtraction.
Reported-by: Yuhao Jiang danisjiang@gmail.com Reported-by: Junrui Luo moonafterrain@outlook.com Fixes: b482cd2053e3 ("net-caif: add CAIF core protocol stack") Signed-off-by: Junrui Luo moonafterrain@outlook.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/SYBPR01MB7881511122BAFEA8212A1608AFA6A@SYBPR01MB788... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/caif/cffrml.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/net/caif/cffrml.c b/net/caif/cffrml.c index 6651a8dc62e04..d4d63586053ad 100644 --- a/net/caif/cffrml.c +++ b/net/caif/cffrml.c @@ -92,8 +92,15 @@ static int cffrml_receive(struct cflayer *layr, struct cfpkt *pkt) len = le16_to_cpu(tmp);
/* Subtract for FCS on length if FCS is not used. */ - if (!this->dofcs) + if (!this->dofcs) { + if (len < 2) { + ++cffrml_rcv_error; + pr_err("Invalid frame length (%d)\n", len); + cfpkt_destroy(pkt); + return -EPROTO; + } len -= 2; + }
if (cfpkt_setlen(pkt, len) < 0) { ++cffrml_rcv_error;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victor Nogueira victor@mojatatu.com
[ Upstream commit b1e125ae425aba9b45252e933ca8df52a843ec70 ]
Whenever a user issues an ets qdisc change command, transforming a drr class into a strict one, the ets code isn't checking whether that class was in the active list and removing it. This means that, if a user changes a strict class (which was in the active list) back to a drr one, that class will be added twice to the active list [1].
Doing so with the following commands:
tc qdisc add dev lo root handle 1: ets bands 2 strict 1 tc qdisc add dev lo parent 1:2 handle 20: \ tbf rate 8bit burst 100b latency 1s tc filter add dev lo parent 1: basic classid 1:2 ping -c1 -W0.01 -s 56 127.0.0.1 tc qdisc change dev lo root handle 1: ets bands 2 strict 2 tc qdisc change dev lo root handle 1: ets bands 2 strict 1 ping -c1 -W0.01 -s 56 127.0.0.1
Will trigger the following splat with list debug turned on:
[ 59.279014][ T365] ------------[ cut here ]------------ [ 59.279452][ T365] list_add double add: new=ffff88801d60e350, prev=ffff88801d60e350, next=ffff88801d60e2c0. [ 59.280153][ T365] WARNING: CPU: 3 PID: 365 at lib/list_debug.c:35 __list_add_valid_or_report+0x17f/0x220 [ 59.280860][ T365] Modules linked in: [ 59.281165][ T365] CPU: 3 UID: 0 PID: 365 Comm: tc Not tainted 6.18.0-rc7-00105-g7e9f13163c13-dirty #239 PREEMPT(voluntary) [ 59.281977][ T365] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 59.282391][ T365] RIP: 0010:__list_add_valid_or_report+0x17f/0x220 [ 59.282842][ T365] Code: 89 c6 e8 d4 b7 0d ff 90 0f 0b 90 90 31 c0 e9 31 ff ff ff 90 48 c7 c7 e0 a0 22 9f 48 89 f2 48 89 c1 4c 89 c6 e8 b2 b7 0d ff 90 <0f> 0b 90 90 31 c0 e9 0f ff ff ff 48 89 f7 48 89 44 24 10 4c 89 44 ... [ 59.288812][ T365] Call Trace: [ 59.289056][ T365] <TASK> [ 59.289224][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.289546][ T365] ets_qdisc_change+0xd2b/0x1e80 [ 59.289891][ T365] ? __lock_acquire+0x7e7/0x1be0 [ 59.290223][ T365] ? __pfx_ets_qdisc_change+0x10/0x10 [ 59.290546][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.290898][ T365] ? __mutex_trylock_common+0xda/0x240 [ 59.291228][ T365] ? __pfx___mutex_trylock_common+0x10/0x10 [ 59.291655][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.291993][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.292313][ T365] ? trace_contention_end+0xc8/0x110 [ 59.292656][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.293022][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.293351][ T365] tc_modify_qdisc+0x63a/0x1cf0
Fix this by always checking and removing an ets class from the active list when changing it to strict.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/sche...
Fixes: cd9b50adc6bb9 ("net/sched: ets: fix crash when flipping from 'strict' to 'quantum'") Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Victor Nogueira victor@mojatatu.com Reviewed-by: Petr Machata petrm@nvidia.com Link: https://patch.msgid.link/20251208190125.1868423-1-victor@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_ets.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c index ae46643e596d3..306e046276d46 100644 --- a/net/sched/sch_ets.c +++ b/net/sched/sch_ets.c @@ -664,6 +664,10 @@ static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt, q->classes[i].deficit = quanta[i]; } } + for (i = q->nstrict; i < nstrict; i++) { + if (cl_is_active(&q->classes[i])) + list_del_init(&q->classes[i].alist); + } WRITE_ONCE(q->nstrict, nstrict); memcpy(q->prio2band, priomap, sizeof(priomap));
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 885bebac9909994050bbbeed0829c727e42bd1b7 ]
Set the error code if "transferred != sizeof(cmd)" instead of returning success.
Fixes: dbafc28955fa ("NFC: pn533: don't send USB data off of the stack") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Link: https://patch.msgid.link/aTfIJ9tZPmeUF4W1@stanley.mountain Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nfc/pn533/usb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nfc/pn533/usb.c b/drivers/nfc/pn533/usb.c index ffd7367ce1194..018a80674f06e 100644 --- a/drivers/nfc/pn533/usb.c +++ b/drivers/nfc/pn533/usb.c @@ -406,7 +406,7 @@ static int pn533_acr122_poweron_rdr(struct pn533_usb_phy *phy) if (rc || (transferred != sizeof(cmd))) { nfc_err(&phy->udev->dev, "Reader power on cmd error %d\n", rc); - return rc; + return rc ?: -EINVAL; }
rc = usb_submit_urb(phy->in_urb, GFP_KERNEL);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit 5ec8ca26fe93103577c904644b0957f069d0051a ]
Jakub reports spurious failures of the 'conntrack_reverse_clash.sh' selftest. A bogus test makes nat core resort to port rewrite even though there is no need for this.
When the test is made, nf_nat_used_tuple() would already have caused us to return if no other CPU had added a colliding entry. Moreover, nf_nat_used_tuple() would have ignored the colliding entry if their origin tuples had been the same.
All that is left to check is if the colliding entry in the hash table is subject to NAT, and, if its not, if our entry matches in the reverse direction, e.g. hash table has
addr1:1234 -> addr2:80, and we want to commit addr2:80 -> addr1:1234.
Because we already checked that neither the new nor the committed entry is subject to NAT we only have to check origin vs. reply tuple: for non-nat entries, the reply tuple is always the inverted original.
Just in case there are more problems extend the error reporting in the selftest while at it and dump conntrack table/stats on error.
Reported-by: Jakub Kicinski kuba@kernel.org Closes: https://lore.kernel.org/netdev/20251206175135.4a56591b@kernel.org/ Fixes: d8f84a9bc7c4 ("netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_nat_core.c | 14 +------------- .../net/netfilter/conntrack_reverse_clash.c | 13 +++++++++---- .../net/netfilter/conntrack_reverse_clash.sh | 2 ++ 3 files changed, 12 insertions(+), 17 deletions(-)
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c index 78a61dac4ade8..e6b24586d2fed 100644 --- a/net/netfilter/nf_nat_core.c +++ b/net/netfilter/nf_nat_core.c @@ -294,25 +294,13 @@ nf_nat_used_tuple_new(const struct nf_conntrack_tuple *tuple,
ct = nf_ct_tuplehash_to_ctrack(thash);
- /* NB: IP_CT_DIR_ORIGINAL should be impossible because - * nf_nat_used_tuple() handles origin collisions. - * - * Handle remote chance other CPU confirmed its ct right after. - */ - if (thash->tuple.dst.dir != IP_CT_DIR_REPLY) - goto out; - /* clashing connection subject to NAT? Retry with new tuple. */ if (READ_ONCE(ct->status) & uses_nat) goto out;
if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, - &ignored_ct->tuplehash[IP_CT_DIR_REPLY].tuple) && - nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple, - &ignored_ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple)) { + &ignored_ct->tuplehash[IP_CT_DIR_REPLY].tuple)) taken = false; - goto out; - } out: nf_ct_put(ct); return taken; diff --git a/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.c b/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.c index 507930cee8cb6..462d628cc3bdb 100644 --- a/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.c +++ b/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.c @@ -33,9 +33,14 @@ static void die(const char *e) exit(111); }
-static void die_port(uint16_t got, uint16_t want) +static void die_port(const struct sockaddr_in *sin, uint16_t want) { - fprintf(stderr, "Port number changed, wanted %d got %d\n", want, ntohs(got)); + uint16_t got = ntohs(sin->sin_port); + char str[INET_ADDRSTRLEN]; + + inet_ntop(AF_INET, &sin->sin_addr, str, sizeof(str)); + + fprintf(stderr, "Port number changed, wanted %d got %d from %s\n", want, got, str); exit(1); }
@@ -100,7 +105,7 @@ int main(int argc, char *argv[]) die("child recvfrom");
if (peer.sin_port != htons(PORT)) - die_port(peer.sin_port, PORT); + die_port(&peer, PORT); } else { if (sendto(s2, buf, LEN, 0, (struct sockaddr *)&sa1, sizeof(sa1)) != LEN) continue; @@ -109,7 +114,7 @@ int main(int argc, char *argv[]) die("parent recvfrom");
if (peer.sin_port != htons((PORT + 1))) - die_port(peer.sin_port, PORT + 1); + die_port(&peer, PORT + 1); } }
diff --git a/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.sh b/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.sh index a24c896347a88..dc7e9d6da0624 100755 --- a/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.sh +++ b/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.sh @@ -45,6 +45,8 @@ if ip netns exec "$ns0" ./conntrack_reverse_clash; then echo "PASS: No SNAT performed for null bindings" else echo "ERROR: SNAT performed without any matching snat rule" + ip netns exec "$ns0" conntrack -L + ip netns exec "$ns0" conntrack -S exit 1 fi
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit a67fd55f6a09f4119b7232c19e0f348fe31ab0db ]
This validation predates the introduction of the state machine that determines when to enter slow path validation for error reporting.
Currently, table validation is perform when:
- new rule contains expressions that need validation. - new set element with jump/goto verdict.
Validation on register store skips most checks with no basechains, still this walks the graph searching for loops and ensuring expressions are called from the right hook. Remove this.
Fixes: a654de8fdc18 ("netfilter: nf_tables: fix chain dependency validation") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 11 ----------- 1 file changed, 11 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index eed434e0a9702..1a204f6371ad1 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -11678,21 +11678,10 @@ static int nft_validate_register_store(const struct nft_ctx *ctx, enum nft_data_types type, unsigned int len) { - int err; - switch (reg) { case NFT_REG_VERDICT: if (type != NFT_DATA_VERDICT) return -EINVAL; - - if (data != NULL && - (data->verdict.code == NFT_GOTO || - data->verdict.code == NFT_JUMP)) { - err = nft_chain_validate(ctx, data->verdict.chain); - if (err < 0) - return err; - } - break; default: if (type != NFT_DATA_VALUE)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit fec7b0795548b43e2c3c46e3143c34ef6070341c ]
packetdrill --ip_version=ipv4 --mtu=1500 --tolerance_usecs=1000000 --non_fatal packet conntrack_syn_challenge_ack.pkt conntrack v1.4.8 (conntrack-tools): 1 flow entries have been shown. conntrack_syn_challenge_ack.pkt:32: error executing `conntrack -f $NFCT_IP_VERSION \ -L -p tcp --dport 8080 | grep UNREPLIED | grep -q SYN_SENT` command: non-zero status 1
Affected kernel had CONFIG_HZ=100; reset packet was still sitting in backlog.
Reported-by: Yi Chen yiche@redhat.com Fixes: a8a388c2aae4 ("selftests: netfilter: add packetdrill based conntrack tests") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt b/tools/testing/selftests/net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt index 3442cd29bc932..cdb3910af95b4 100644 --- a/tools/testing/selftests/net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt +++ b/tools/testing/selftests/net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt @@ -26,7 +26,7 @@
+0.01 > R 643160523:643160523(0) win 0
-+0.01 `conntrack -f $NFCT_IP_VERSION -L -p tcp --dport 8080 2>/dev/null | grep UNREPLIED | grep -q SYN_SENT` ++0.1 `conntrack -f $NFCT_IP_VERSION -L -p tcp --dport 8080 2>/dev/null | grep UNREPLIED | grep -q SYN_SENT`
// Must go through. +0.01 > S 0:0(0) win 65535 <mss 1460,sackOK,TS val 1 ecr 0,nop,wscale 8>
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Gunthorpe jgg@nvidia.com
[ Upstream commit 5b244b077c0b0e76573fbb9542cf038e42368901 ]
GCC gets a bit confused and reports:
In function '_test_cmd_get_hw_info', inlined from 'iommufd_ioas_get_hw_info' at iommufd.c:779:3, inlined from 'wrapper_iommufd_ioas_get_hw_info' at iommufd.c:752:1:
iommufd_utils.h:804:37: warning: array subscript 'struct iommu_test_hw_info[0]' is partly outside array bounds of 'struct iommu_test_hw_info_buffer_smaller[1]' [-Warray-bounds=]
804 | assert(!info->flags); | ~~~~^~~~~~~ iommufd.c: In function 'wrapper_iommufd_ioas_get_hw_info': iommufd.c:761:11: note: object 'buffer_smaller' of size 4 761 | } buffer_smaller; | ^~~~~~~~~~~~~~
While it is true that "struct iommu_test_hw_info[0]" is partly out of bounds of the input pointer, it is not true that info->flags is out of bounds. Unclear why it warns on this.
Reuse an existing properly sized stack buffer and pass a truncated length instead to test the same thing.
Fixes: af4fde93c319 ("iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl") Link: https://patch.msgid.link/r/0-v1-63a2cffb09da+4486-iommufd_gcc_bounds_jgg@nvi... Reviewed-by: Kevin Tian kevin.tian@intel.com Reviewed-by: Nicolin Chen nicolinc@nvidia.com Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202512032344.kaAcKFIM-lkp@intel.com/ Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/iommu/iommufd.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index bb4d33dde3c89..1f52140d9d9ee 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -758,9 +758,6 @@ TEST_F(iommufd_ioas, get_hw_info) struct iommu_test_hw_info info; uint64_t trailing_bytes; } buffer_larger; - struct iommu_test_hw_info_buffer_smaller { - __u32 flags; - } buffer_smaller;
if (self->device_id) { uint8_t max_pasid = 0; @@ -792,8 +789,9 @@ TEST_F(iommufd_ioas, get_hw_info) * the fields within the size range still gets updated. */ test_cmd_get_hw_info(self->device_id, - IOMMU_HW_INFO_TYPE_DEFAULT, - &buffer_smaller, sizeof(buffer_smaller)); + IOMMU_HW_INFO_TYPE_DEFAULT, &buffer_exact, + offsetofend(struct iommu_test_hw_info, + flags)); test_cmd_get_hw_info_pasid(self->device_id, &max_pasid); ASSERT_EQ(0, max_pasid); if (variant->pasid_capable) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Gunthorpe jgg@nvidia.com
[ Upstream commit e6a973af11135439de32ece3b9cbe3bfc043bea8 ]
syzkaller found it could overflow math in the test infrastructure and cause a WARN_ON by corrupting the reserved interval tree. This only effects test kernels with CONFIG_IOMMUFD_TEST.
Validate the user input length in the test ioctl.
Fixes: f4b20bb34c83 ("iommufd: Add kernel support for testing iommufd") Link: https://patch.msgid.link/r/0-v1-cd99f6049ba5+51-iommufd_syz_add_resv_jgg@nvi... Reviewed-by: Samiullah Khawaja skhawaja@google.com Reviewed-by: Kevin Tian kevin.tian@intel.com Tested-by: Yi Liu yi.l.liu@intel.com Reported-by: syzbot+57fdb0cf6a0c5d1f15a2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69368129.a70a0220.38f243.008f.GAE@google.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/iommufd/selftest.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index de178827a078a..dc0947aaac625 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -1257,14 +1257,20 @@ static int iommufd_test_add_reserved(struct iommufd_ucmd *ucmd, unsigned int mockpt_id, unsigned long start, size_t length) { + unsigned long last; struct iommufd_ioas *ioas; int rc;
+ if (!length) + return -EINVAL; + if (check_add_overflow(start, length - 1, &last)) + return -EOVERFLOW; + ioas = iommufd_get_ioas(ucmd->ictx, mockpt_id); if (IS_ERR(ioas)) return PTR_ERR(ioas); down_write(&ioas->iopt.iova_rwsem); - rc = iopt_reserve_iova(&ioas->iopt, start, start + length - 1, NULL); + rc = iopt_reserve_iova(&ioas->iopt, start, last, NULL); up_write(&ioas->iopt.iova_rwsem); iommufd_put_object(ucmd->ictx, &ioas->obj); return rc;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit 46cea215dc9444ec32a76b1b6a9cb809e17b64d5 ]
There is a theoretical race window in j1939_sk_netdev_event_unregister() where two j1939_sk_bind() calls jump in between read_unlock_bh() and lock_sock().
The assumption jsk->priv == priv can fail if the first j1939_sk_bind() call once made jsk->priv == NULL due to failed j1939_local_ecu_get() call and the second j1939_sk_bind() call again made jsk->priv != NULL due to successful j1939_local_ecu_get() call.
Since the socket lock is held by both j1939_sk_netdev_event_unregister() and j1939_sk_bind(), checking ndev->reg_state with the socket lock held can reliably make the second j1939_sk_bind() call fail (and close this race window).
Fixes: 7fcbe5b2c6a4 ("can: j1939: implement NETDEV_UNREGISTER notification handler") Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Acked-by: Oleksij Rempel o.rempel@pengutronix.de Link: https://patch.msgid.link/5732921e-247e-4957-a364-da74bd7031d7@I-love.SAKURA.... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/j1939/socket.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c index 88e7160d42489..e3ba2e9fc0e9b 100644 --- a/net/can/j1939/socket.c +++ b/net/can/j1939/socket.c @@ -482,6 +482,12 @@ static int j1939_sk_bind(struct socket *sock, struct sockaddr *uaddr, int len) goto out_release_sock; }
+ if (ndev->reg_state != NETREG_REGISTERED) { + dev_put(ndev); + ret = -ENODEV; + goto out_release_sock; + } + can_ml = can_get_ml_priv(ndev); if (!can_ml) { dev_put(ndev);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gal Pressman gal@nvidia.com
[ Upstream commit 7b07be1ff1cb6c49869910518650e8d0abc7d25f ]
The ethtool -S command operates across three ioctl calls: ETHTOOL_GSSET_INFO for the size, ETHTOOL_GSTRINGS for the names, and ETHTOOL_GSTATS for the values.
If the number of stats changes between these calls (e.g., due to device reconfiguration), userspace's buffer allocation will be incorrect, potentially leading to buffer overflow.
Drivers are generally expected to maintain stable stat counts, but some drivers (e.g., mlx5, bnx2x, bna, ksz884x) use dynamic counters, making this scenario possible.
Some drivers try to handle this internally: - bnad_get_ethtool_stats() returns early in case stats.n_stats is not equal to the driver's stats count. - micrel/ksz884x also makes sure not to write anything beyond stats.n_stats and overflow the buffer.
However, both use stats.n_stats which is already assigned with the value returned from get_sset_count(), hence won't solve the issue described here.
Change ethtool_get_strings(), ethtool_get_stats(), ethtool_get_phy_stats() to not return anything in case of a mismatch between userspace's size and get_sset_size(), to prevent buffer overflow. The returned n_stats value will be equal to zero, to reflect that nothing has been returned.
This could result in one of two cases when using upstream ethtool, depending on when the size change is detected: 1. When detected in ethtool_get_strings(): # ethtool -S eth2 no stats available
2. When detected in get stats, all stats will be reported as zero.
Both cases are presumably transient, and a subsequent ethtool call should succeed.
Other than the overflow avoidance, these two cases are very evident (no output/cleared stats), which is arguably better than presenting incorrect/shifted stats. I also considered returning an error instead of a "silent" response, but that seems more destructive towards userspace apps.
Notes: - This patch does not claim to fix the inherent race, it only makes sure that we do not overflow the userspace buffer, and makes for a more predictable behavior.
- RTNL lock is held during each ioctl, the race window exists between the separate ioctl calls when the lock is released.
- Userspace ethtool always fills stats.n_stats, but it is likely that these stats ioctls are implemented in other userspace applications which might not fill it. The added code checks that it's not zero, to prevent any regressions.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Dragos Tatulea dtatulea@nvidia.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Gal Pressman gal@nvidia.com Link: https://patch.msgid.link/20251208121901.3203692-1-gal@nvidia.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ethtool/ioctl.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index fa83ddade4f81..9431e305b2333 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -2383,7 +2383,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr) return -ENOMEM; WARN_ON_ONCE(!ret);
- gstrings.len = ret; + if (gstrings.len && gstrings.len != ret) + gstrings.len = 0; + else + gstrings.len = ret;
if (gstrings.len) { data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN)); @@ -2509,10 +2512,13 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) if (copy_from_user(&stats, useraddr, sizeof(stats))) return -EFAULT;
- stats.n_stats = n_stats; + if (stats.n_stats && stats.n_stats != n_stats) + stats.n_stats = 0; + else + stats.n_stats = n_stats;
- if (n_stats) { - data = vzalloc(array_size(n_stats, sizeof(u64))); + if (stats.n_stats) { + data = vzalloc(array_size(stats.n_stats, sizeof(u64))); if (!data) return -ENOMEM; ops->get_ethtool_stats(dev, &stats, data); @@ -2524,7 +2530,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) if (copy_to_user(useraddr, &stats, sizeof(stats))) goto out; useraddr += sizeof(stats); - if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64)))) + if (stats.n_stats && + copy_to_user(useraddr, data, + array_size(stats.n_stats, sizeof(u64)))) goto out; ret = 0;
@@ -2560,6 +2568,10 @@ static int ethtool_get_phy_stats_phydev(struct phy_device *phydev, return -EOPNOTSUPP;
n_stats = phy_ops->get_sset_count(phydev); + if (stats->n_stats && stats->n_stats != n_stats) { + stats->n_stats = 0; + return 0; + }
ret = ethtool_vzalloc_stats_array(n_stats, data); if (ret) @@ -2580,6 +2592,10 @@ static int ethtool_get_phy_stats_ethtool(struct net_device *dev, return -EOPNOTSUPP;
n_stats = ops->get_sset_count(dev, ETH_SS_PHY_STATS); + if (stats->n_stats && stats->n_stats != n_stats) { + stats->n_stats = 0; + return 0; + }
ret = ethtool_vzalloc_stats_array(n_stats, data); if (ret) @@ -2616,7 +2632,9 @@ static int ethtool_get_phy_stats(struct net_device *dev, void __user *useraddr) }
useraddr += sizeof(stats); - if (copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64)))) + if (stats.n_stats && + copy_to_user(useraddr, data, + array_size(stats.n_stats, sizeof(u64)))) ret = -EFAULT;
out:
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Moshe Shemesh moshe@nvidia.com
[ Upstream commit 89a898d63f6f588acf5c104c65c94a38b68c69a6 ]
drain_fw_reset() waits for ongoing firmware reset events and blocks new event handling, but does not clear the reset requested flag, and may keep sync reset polling.
To fix it, call mlx5_sync_reset_clear_reset_requested() to clear the flag, stop sync reset polling, and resume health polling, ensuring health issues are still detected after the firmware reset drain.
Fixes: 16d42d313350 ("net/mlx5: Drain fw_reset when removing device") Signed-off-by: Moshe Shemesh moshe@nvidia.com Reviewed-by: Shay Drori shayd@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/1765284977-1363052-2-git-send-email-tariqt@nvidia.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index 89e399606877b..33df0418e5754 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -843,7 +843,8 @@ void mlx5_drain_fw_reset(struct mlx5_core_dev *dev) cancel_work_sync(&fw_reset->reset_reload_work); cancel_work_sync(&fw_reset->reset_now_work); cancel_work_sync(&fw_reset->reset_abort_work); - cancel_delayed_work(&fw_reset->reset_timeout_work); + if (test_bit(MLX5_FW_RESET_FLAGS_RESET_REQUESTED, &fw_reset->reset_flags)) + mlx5_sync_reset_clear_reset_requested(dev, true); }
static const struct devlink_param mlx5_fw_reset_devlink_params[] = {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Moshe Shemesh moshe@nvidia.com
[ Upstream commit 5846a365fc6476b02d6766963cf0985520f0385f ]
Invoke drain_fw_reset() in the shutdown callback to ensure all firmware reset handling is completed before shutdown proceeds.
Fixes: 16d42d313350 ("net/mlx5: Drain fw_reset when removing device") Signed-off-by: Moshe Shemesh moshe@nvidia.com Reviewed-by: Shay Drori shayd@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/1765284977-1363052-3-git-send-email-tariqt@nvidia.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 70c156591b0ba..9e0c9e6266a47 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -2189,6 +2189,7 @@ static void shutdown(struct pci_dev *pdev)
mlx5_core_info(dev, "Shutdown was called\n"); set_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state); + mlx5_drain_fw_reset(dev); mlx5_drain_health_wq(dev); err = mlx5_try_fast_unload(dev); if (err)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shay Drory shayd@nvidia.com
[ Upstream commit b35966042d20b14e2d83330049f77deec5229749 ]
Add validation for format string parameters in the firmware tracer to prevent potential security vulnerabilities and crashes from malformed format strings received from firmware.
The firmware tracer receives format strings from the device firmware and uses them to format trace messages. Without proper validation, bad firmware could provide format strings with invalid format specifiers (e.g., %s, %p, %n) that could lead to crashes, or other undefined behavior.
Add mlx5_tracer_validate_params() to validate that all format specifiers in trace strings are limited to safe integer/hex formats (%x, %d, %i, %u, %llx, %lx, etc.). Reject strings containing other format types that could be used to access arbitrary memory or cause crashes. Invalid format strings are added to the trace output for visibility with "BAD_FORMAT: " prefix.
Fixes: 70dd6fdb8987 ("net/mlx5: FW tracer, parse traces and kernel tracing support") Signed-off-by: Shay Drory shayd@nvidia.com Reviewed-by: Moshe Shemesh moshe@nvidia.com Reported-by: Breno Leitao leitao@debian.org Closes: https://lore.kernel.org/netdev/hanz6rzrb2bqbplryjrakvkbmv4y5jlmtthnvi3thg5sl... Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/1765284977-1363052-4-git-send-email-tariqt@nvidia.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../mellanox/mlx5/core/diag/fw_tracer.c | 83 ++++++++++++++++--- .../mellanox/mlx5/core/diag/fw_tracer.h | 1 + 2 files changed, 74 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c index 080e7eab52c7e..9c86c8c72d049 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c @@ -33,6 +33,7 @@ #include "lib/eq.h" #include "fw_tracer.h" #include "fw_tracer_tracepoint.h" +#include <linux/ctype.h>
static int mlx5_query_mtrc_caps(struct mlx5_fw_tracer *tracer) { @@ -358,6 +359,43 @@ static const char *VAL_PARM = "%llx"; static const char *REPLACE_64_VAL_PARM = "%x%x"; static const char *PARAM_CHAR = "%";
+static bool mlx5_is_valid_spec(const char *str) +{ + /* Parse format specifiers to find the actual type. + * Structure: %[flags][width][.precision][length]type + * Skip flags, width, precision & length. + */ + while (isdigit(*str) || *str == '#' || *str == '.' || *str == 'l') + str++; + + /* Check if it's a valid integer/hex specifier: + * Valid formats: %x, %d, %i, %u, etc. + */ + if (*str != 'x' && *str != 'X' && *str != 'd' && *str != 'i' && + *str != 'u' && *str != 'c') + return false; + + return true; +} + +static bool mlx5_tracer_validate_params(const char *str) +{ + const char *substr = str; + + if (!str) + return false; + + substr = strstr(substr, PARAM_CHAR); + while (substr) { + if (!mlx5_is_valid_spec(substr + 1)) + return false; + + substr = strstr(substr + 1, PARAM_CHAR); + } + + return true; +} + static int mlx5_tracer_message_hash(u32 message_id) { return jhash_1word(message_id, 0) & (MESSAGE_HASH_SIZE - 1); @@ -419,6 +457,10 @@ static int mlx5_tracer_get_num_of_params(char *str) char *substr, *pstr = str; int num_of_params = 0;
+ /* Validate that all parameters are valid before processing */ + if (!mlx5_tracer_validate_params(str)) + return -EINVAL; + /* replace %llx with %x%x */ substr = strstr(pstr, VAL_PARM); while (substr) { @@ -570,14 +612,17 @@ void mlx5_tracer_print_trace(struct tracer_string_format *str_frmt, { char tmp[512];
- snprintf(tmp, sizeof(tmp), str_frmt->string, - str_frmt->params[0], - str_frmt->params[1], - str_frmt->params[2], - str_frmt->params[3], - str_frmt->params[4], - str_frmt->params[5], - str_frmt->params[6]); + if (str_frmt->invalid_string) + snprintf(tmp, sizeof(tmp), "BAD_FORMAT: %s", str_frmt->string); + else + snprintf(tmp, sizeof(tmp), str_frmt->string, + str_frmt->params[0], + str_frmt->params[1], + str_frmt->params[2], + str_frmt->params[3], + str_frmt->params[4], + str_frmt->params[5], + str_frmt->params[6]);
trace_mlx5_fw(dev->tracer, trace_timestamp, str_frmt->lost, str_frmt->event_id, tmp); @@ -609,6 +654,13 @@ static int mlx5_tracer_handle_raw_string(struct mlx5_fw_tracer *tracer, return 0; }
+static void mlx5_tracer_handle_bad_format_string(struct mlx5_fw_tracer *tracer, + struct tracer_string_format *cur_string) +{ + cur_string->invalid_string = true; + list_add_tail(&cur_string->list, &tracer->ready_strings_list); +} + static int mlx5_tracer_handle_string_trace(struct mlx5_fw_tracer *tracer, struct tracer_event *tracer_event) { @@ -619,12 +671,18 @@ static int mlx5_tracer_handle_string_trace(struct mlx5_fw_tracer *tracer, if (!cur_string) return mlx5_tracer_handle_raw_string(tracer, tracer_event);
- cur_string->num_of_params = mlx5_tracer_get_num_of_params(cur_string->string); - cur_string->last_param_num = 0; cur_string->event_id = tracer_event->event_id; cur_string->tmsn = tracer_event->string_event.tmsn; cur_string->timestamp = tracer_event->string_event.timestamp; cur_string->lost = tracer_event->lost_event; + cur_string->last_param_num = 0; + cur_string->num_of_params = mlx5_tracer_get_num_of_params(cur_string->string); + if (cur_string->num_of_params < 0) { + pr_debug("%s Invalid format string parameters\n", + __func__); + mlx5_tracer_handle_bad_format_string(tracer, cur_string); + return 0; + } if (cur_string->num_of_params == 0) /* trace with no params */ list_add_tail(&cur_string->list, &tracer->ready_strings_list); } else { @@ -634,6 +692,11 @@ static int mlx5_tracer_handle_string_trace(struct mlx5_fw_tracer *tracer, __func__, tracer_event->string_event.tmsn); return mlx5_tracer_handle_raw_string(tracer, tracer_event); } + if (cur_string->num_of_params < 0) { + pr_debug("%s string parameter of invalid string, dumping\n", + __func__); + return 0; + } cur_string->last_param_num += 1; if (cur_string->last_param_num > TRACER_MAX_PARAMS) { pr_debug("%s Number of params exceeds the max (%d)\n", diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h index 5c548bb74f07b..30d0bcba88479 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h @@ -125,6 +125,7 @@ struct tracer_string_format { struct list_head list; u32 timestamp; bool lost; + bool invalid_string; };
enum mlx5_fw_tracer_ownership_state {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shay Drory shayd@nvidia.com
[ Upstream commit c0289f67f7d6a0dfba0e92cfe661a5c70c8c6e92 ]
The firmware tracer's format string validation and parameter counting did not properly handle escaped percent signs (%%). This caused fw_tracer to count more parameters when trace format strings contained literal percent characters.
To fix it, allow %% to pass string validation and skip %% sequences when counting parameters since they represent literal percent signs rather than format specifiers.
Fixes: 70dd6fdb8987 ("net/mlx5: FW tracer, parse traces and kernel tracing support") Signed-off-by: Shay Drory shayd@nvidia.com Reported-by: Breno Leitao leitao@debian.org Reviewed-by: Moshe Shemesh moshe@nvidia.com Closes: https://lore.kernel.org/netdev/hanz6rzrb2bqbplryjrakvkbmv4y5jlmtthnvi3thg5sl... Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/1765284977-1363052-5-git-send-email-tariqt@nvidia.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../mellanox/mlx5/core/diag/fw_tracer.c | 20 +++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c index 9c86c8c72d049..0b82a6a133d6c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c @@ -368,11 +368,11 @@ static bool mlx5_is_valid_spec(const char *str) while (isdigit(*str) || *str == '#' || *str == '.' || *str == 'l') str++;
- /* Check if it's a valid integer/hex specifier: + /* Check if it's a valid integer/hex specifier or %%: * Valid formats: %x, %d, %i, %u, etc. */ if (*str != 'x' && *str != 'X' && *str != 'd' && *str != 'i' && - *str != 'u' && *str != 'c') + *str != 'u' && *str != 'c' && *str != '%') return false;
return true; @@ -390,7 +390,11 @@ static bool mlx5_tracer_validate_params(const char *str) if (!mlx5_is_valid_spec(substr + 1)) return false;
- substr = strstr(substr + 1, PARAM_CHAR); + if (*(substr + 1) == '%') + substr = strstr(substr + 2, PARAM_CHAR); + else + substr = strstr(substr + 1, PARAM_CHAR); + }
return true; @@ -469,11 +473,15 @@ static int mlx5_tracer_get_num_of_params(char *str) substr = strstr(pstr, VAL_PARM); }
- /* count all the % characters */ + /* count all the % characters, but skip %% (escaped percent) */ substr = strstr(str, PARAM_CHAR); while (substr) { - num_of_params += 1; - str = substr + 1; + if (*(substr + 1) != '%') { + num_of_params += 1; + str = substr + 1; + } else { + str = substr + 2; + } substr = strstr(str, PARAM_CHAR); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shay Drory shayd@nvidia.com
[ Upstream commit 367e501f8b095eca08d2eb0ba4ccea5b5e82c169 ]
The firmware reset mechanism can be triggered by asynchronous events, which may race with other devlink operations like devlink reload or devlink dev eswitch set, potentially leading to inconsistent states.
This patch addresses the race by using the devl_lock to serialize the firmware reset against other devlink operations. When a reset is requested, the driver attempts to acquire the lock. If successful, it sets a flag to block devlink reload or eswitch changes, ACKs the reset to firmware and then releases the lock. If the lock is already held by another operation, the driver NACKs the firmware reset request, indicating that the reset cannot proceed.
Firmware reset does not keep the devl_lock and instead uses an internal firmware reset bit. This is because firmware resets can be triggered by asynchronous events, and processed in different threads. It is illegal and unsafe to acquire a lock in one thread and attempt to release it in another, as lock ownership is intrinsically thread-specific.
This change ensures that firmware resets and other devlink operations are mutually exclusive during the critical reset request phase, preventing race conditions.
Fixes: 38b9f903f22b ("net/mlx5: Handle sync reset request event") Signed-off-by: Shay Drory shayd@nvidia.com Reviewed-by: Mateusz Berezecki mberezecki@nvidia.com Reviewed-by: Moshe Shemesh moshe@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/1765284977-1363052-6-git-send-email-tariqt@nvidia.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/devlink.c | 5 +++ .../mellanox/mlx5/core/eswitch_offloads.c | 6 +++ .../ethernet/mellanox/mlx5/core/fw_reset.c | 45 +++++++++++++++++-- .../ethernet/mellanox/mlx5/core/fw_reset.h | 1 + 4 files changed, 53 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index 887adf4807d16..ea77fbd98396a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -197,6 +197,11 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change, struct pci_dev *pdev = dev->pdev; int ret = 0;
+ if (mlx5_fw_reset_in_progress(dev)) { + NL_SET_ERR_MSG_MOD(extack, "Can't reload during firmware reset"); + return -EBUSY; + } + if (mlx5_dev_is_lightweight(dev)) { if (action != DEVLINK_RELOAD_ACTION_DRIVER_REINIT) return -EOPNOTSUPP; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 44a142a041b2f..784130cdf6c07 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -52,6 +52,7 @@ #include "devlink.h" #include "lag/lag.h" #include "en/tc/post_meter.h" +#include "fw_reset.h"
/* There are two match-all miss flows, one for unicast dst mac and * one for multicast. @@ -3807,6 +3808,11 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode, if (IS_ERR(esw)) return PTR_ERR(esw);
+ if (mlx5_fw_reset_in_progress(esw->dev)) { + NL_SET_ERR_MSG_MOD(extack, "Can't change eswitch mode during firmware reset"); + return -EBUSY; + } + if (esw_mode_from_devlink(mode, &mlx5_mode)) return -EINVAL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index 33df0418e5754..4544f1968f73f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -15,6 +15,7 @@ enum { MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS, MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, MLX5_FW_RESET_FLAGS_UNLOAD_EVENT, + MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, };
struct mlx5_fw_reset { @@ -127,6 +128,16 @@ int mlx5_fw_reset_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_ty return mlx5_reg_mfrl_query(dev, reset_level, reset_type, NULL, NULL); }
+bool mlx5_fw_reset_in_progress(struct mlx5_core_dev *dev) +{ + struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset; + + if (!fw_reset) + return false; + + return test_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags); +} + static int mlx5_fw_reset_get_reset_method(struct mlx5_core_dev *dev, u8 *reset_method) { @@ -242,6 +253,8 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev) BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE)); devl_unlock(devlink); } + + clear_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags); }
static void mlx5_stop_sync_reset_poll(struct mlx5_core_dev *dev) @@ -461,27 +474,48 @@ static void mlx5_sync_reset_request_event(struct work_struct *work) struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset, reset_request_work); struct mlx5_core_dev *dev = fw_reset->dev; + bool nack_request = false; + struct devlink *devlink; int err;
err = mlx5_fw_reset_get_reset_method(dev, &fw_reset->reset_method); - if (err) + if (err) { + nack_request = true; mlx5_core_warn(dev, "Failed reading MFRL, err %d\n", err); + } else if (!mlx5_is_reset_now_capable(dev, fw_reset->reset_method) || + test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, + &fw_reset->reset_flags)) { + nack_request = true; + }
- if (err || test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags) || - !mlx5_is_reset_now_capable(dev, fw_reset->reset_method)) { + devlink = priv_to_devlink(dev); + /* For external resets, try to acquire devl_lock. Skip if devlink reset is + * pending (lock already held) + */ + if (nack_request || + (!test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, + &fw_reset->reset_flags) && + !devl_trylock(devlink))) { err = mlx5_fw_reset_set_reset_sync_nack(dev); mlx5_core_warn(dev, "PCI Sync FW Update Reset Nack %s", err ? "Failed" : "Sent"); return; } + if (mlx5_sync_reset_set_reset_requested(dev)) - return; + goto unlock; + + set_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags);
err = mlx5_fw_reset_set_reset_sync_ack(dev); if (err) mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack Failed. Error code: %d\n", err); else mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack. Device reset is expected.\n"); + +unlock: + if (!test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) + devl_unlock(devlink); }
static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev, u16 dev_id) @@ -721,6 +755,8 @@ static void mlx5_sync_reset_abort_event(struct work_struct *work)
if (mlx5_sync_reset_clear_reset_requested(dev, true)) return; + + clear_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags); mlx5_core_warn(dev, "PCI Sync FW Update Reset Aborted.\n"); }
@@ -757,6 +793,7 @@ static void mlx5_sync_reset_timeout_work(struct work_struct *work)
if (mlx5_sync_reset_clear_reset_requested(dev, true)) return; + clear_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags); mlx5_core_warn(dev, "PCI Sync FW Update Reset Timeout.\n"); }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h index d5b28525c960d..2d96b2adc1cdf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h @@ -10,6 +10,7 @@ int mlx5_fw_reset_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_ty int mlx5_fw_reset_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel, struct netlink_ext_ack *extack); int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev); +bool mlx5_fw_reset_in_progress(struct mlx5_core_dev *dev);
int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev); void mlx5_sync_reset_unload_flow(struct mlx5_core_dev *dev, bool locked);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jianbo Liu jianbol@nvidia.com
[ Upstream commit e35d7da8dd9e55b37c3e8ab548f6793af0c2ab49 ]
Replace ipv6_stub->ipv6_dst_lookup_flow() with ip6_dst_lookup() in mlx5e_ipsec_init_macs() since IPsec transformations are not needed during Security Association setup - only basic routing information is required for nexthop MAC address resolution.
This resolves an issue where XfrmOutNoStates error counter would be incremented when xfrm policy is configured before xfrm state, as the IPsec-aware routing function would attempt policy checks during SA initialization.
Fixes: 71670f766b8f ("net/mlx5e: Support routed networks during IPsec MACs initialization") Signed-off-by: Jianbo Liu jianbol@nvidia.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/1765284977-1363052-7-git-send-email-tariqt@nvidia.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c index 35d9530037a65..6c79b9cea2efb 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c @@ -342,9 +342,8 @@ static void mlx5e_ipsec_init_macs(struct mlx5e_ipsec_sa_entry *sa_entry, rt_dst_entry = &rt->dst; break; case AF_INET6: - rt_dst_entry = ipv6_stub->ipv6_dst_lookup_flow( - dev_net(netdev), NULL, &fl6, NULL); - if (IS_ERR(rt_dst_entry)) + if (!IS_ENABLED(CONFIG_IPV6) || + ip6_dst_lookup(dev_net(netdev), NULL, &rt_dst_entry, &fl6)) goto neigh; break; default:
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jianbo Liu jianbol@nvidia.com
[ Upstream commit 9ab89bde13e5251e1d0507e1cc426edcdfe19142 ]
When initializing the MAC addresses for an outbound IPsec packet offload rule in mlx5e_ipsec_init_macs, the call to dst_neigh_lookup is used to find the next-hop neighbor (typically the gateway in tunnel mode). This call might create a new neighbor entry if one doesn't already exist. This newly created entry starts in the INCOMPLETE state, as the kernel hasn't yet sent an ARP or NDISC probe to resolve the MAC address. In this case, neigh_ha_snapshot will correctly return an all-zero MAC address.
IPsec packet offload requires the actual next-hop MAC address to program the rule correctly. If the neighbor state is INCOMPLETE when the rule is created, the hardware rule is programmed with an all-zero destination MAC address. Packets sent using this rule will be subsequently dropped by the receiving network infrastructure or host.
This patch adds a check specifically for the outbound offload path. If neigh_ha_snapshot returns an all-zero MAC address, it proactively calls neigh_event_send(n, NULL). This ensures the kernel immediately sends the initial ARP or NDISC probe if one isn't already pending, accelerating the resolution process. This helps prevent the hardware rule from being programmed with an invalid MAC address and avoids packet drops due to unresolved neighbors.
Fixes: 71670f766b8f ("net/mlx5e: Support routed networks during IPsec MACs initialization") Signed-off-by: Jianbo Liu jianbol@nvidia.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/1765284977-1363052-8-git-send-email-tariqt@nvidia.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c index 6c79b9cea2efb..a8fb4bec369cf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c @@ -358,6 +358,9 @@ static void mlx5e_ipsec_init_macs(struct mlx5e_ipsec_sa_entry *sa_entry,
neigh_ha_snapshot(addr, n, netdev); ether_addr_copy(dst, addr); + if (attrs->dir == XFRM_DEV_OFFLOAD_OUT && + is_zero_ether_addr(addr)) + neigh_event_send(n, NULL); dst_release(rt_dst_entry); neigh_release(n); return;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cosmin Ratiu cratiu@nvidia.com
[ Upstream commit 4198a14c8c6252fd1191afaa742dd515dcaf3487 ]
Commit [1] added the 40 bytes required by the PSP header+trailer and the UDP header to MLX5E_ETH_HARD_MTU, which limits the device-wide max software MTU that could be set. This is not okay, because most packets are not PSP packets and it doesn't make sense to always reserve space for headers which won't get added in most cases.
As it turns out, for TCP connections, PSP overhead is already taken into account in the TCP MSS calculations via inet_csk(sk)->icsk_ext_hdr_len. This was added in commit [2]. This means that the extra space reserved in the hard MTU for mlx5 ends up unused and wasted.
Remove the unnecessary 40 byte reservation from hard MTU.
[1] commit e5a1861a298e ("net/mlx5e: Implement PSP Tx data path") [2] commit e97269257fe4 ("net: psp: update the TCP MSS to reflect PSP packet overhead")
Fixes: e5a1861a298e ("net/mlx5e: Implement PSP Tx data path") Signed-off-by: Cosmin Ratiu cratiu@nvidia.com Reviewed-by: Shahar Shitrit shshitrit@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/1765284977-1363052-10-git-send-email-tariqt@nvidia.... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index a163f81f07c13..a6479e4d8d8c6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -69,7 +69,7 @@ struct page_pool; #define MLX5E_METADATA_ETHER_TYPE (0x8CE4) #define MLX5E_METADATA_ETHER_LEN 8
-#define MLX5E_ETH_HARD_MTU (ETH_HLEN + PSP_ENCAP_HLEN + PSP_TRL_SIZE + VLAN_HLEN + ETH_FCS_LEN) +#define MLX5E_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN)
#define MLX5E_HW2SW_MTU(params, hwmtu) ((hwmtu) - ((params)->hard_mtu)) #define MLX5E_SW2HW_MTU(params, swmtu) ((swmtu) + ((params)->hard_mtu))
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Scott Mayhew smayhew@redhat.com
[ Upstream commit 15564bd67e2975002f2a8e9defee33e321d3183f ]
When a handshake request is cancelled it is removed from the handshake_net->hn_requests list, but it is still present in the handshake_rhashtbl until it is destroyed.
If a second cancellation request arrives for the same handshake request, then remove_pending() will return false... and assuming HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue processing through the out_true label, where we put another reference on the sock and a refcount underflow occurs.
This can happen for example if a handshake times out - particularly if the SUNRPC client sends the AUTH_TLS probe to the server but doesn't follow it up with the ClientHello due to a problem with tlshd. When the timeout is hit on the server, the server will send a FIN, which triggers a cancellation request via xs_reset_transport(). When the timeout is hit on the client, another cancellation request happens via xs_tls_handshake_sync().
Add a test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the pending cancel path so duplicate cancels can be detected.
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") Suggested-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Scott Mayhew smayhew@redhat.com Reviewed-by: Chuck Lever chuck.lever@oracle.com Link: https://patch.msgid.link/20251209193015.3032058-1-smayhew@redhat.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/handshake/request.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/handshake/request.c b/net/handshake/request.c index 274d2c89b6b20..f78091680bca5 100644 --- a/net/handshake/request.c +++ b/net/handshake/request.c @@ -324,7 +324,11 @@ bool handshake_req_cancel(struct sock *sk)
hn = handshake_pernet(net); if (hn && remove_pending(hn, req)) { - /* Request hadn't been accepted */ + /* Request hadn't been accepted - mark cancelled */ + if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) { + trace_handshake_cancel_busy(net, req, sk); + return false; + } goto out_true; } if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wei Fang wei.fang@nxp.com
[ Upstream commit 2939203ffee818f1e5ebd60bbb85a174d63aab9c ]
In the current implementation, the enetc_xdp_xmit() always transmits redirected XDP frames even if the link is down, but the frames cannot be transmitted from TX BD rings when the link is down, so the frames are still kept in the TX BD rings. If the XDP program is uninstalled, users will see the following warning logs.
fsl_enetc 0000:00:00.0 eno0: timeout for tx ring #6 clear
More worse, the TX BD ring cannot work properly anymore, because the HW PIR and CIR are not equal after the re-initialization of the TX BD ring. At this point, the BDs between CIR and PIR are invalid, which will cause a hardware malfunction.
Another reason is that there is internal context in the ring prefetch logic that will retain the state from the first incarnation of the ring and continue prefetching from the stale location when we re-initialize the ring. The internal context is only reset by an FLR. That is to say, for LS1028A ENETC, software cannot set the HW CIR and PIR when initializing the TX BD ring.
It does not make sense to transmit redirected XDP frames when the link is down. Add a link status check to prevent transmission in this condition. This fixes part of the issue, but more complex cases remain. For example, the TX BD ring may still contain unsent frames when the link goes down. Those situations require additional patches, which will build on this one.
Fixes: 9d2b68cc108d ("net: enetc: add support for XDP_REDIRECT") Signed-off-by: Wei Fang wei.fang@nxp.com Reviewed-by: Frank Li Frank.Li@nxp.com Reviewed-by: Hariprasad Kelam hkelam@marvell.com Reviewed-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://patch.msgid.link/20251211020919.121113-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/enetc/enetc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 0535e92404e3c..f410c245ea918 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -1778,7 +1778,8 @@ int enetc_xdp_xmit(struct net_device *ndev, int num_frames, int xdp_tx_bd_cnt, i, k; int xdp_tx_frm_cnt = 0;
- if (unlikely(test_bit(ENETC_TX_DOWN, &priv->flags))) + if (unlikely(test_bit(ENETC_TX_DOWN, &priv->flags) || + !netif_carrier_ok(ndev))) return -ENETDOWN;
enetc_lock_mdio();
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jian Shen shenjian15@huawei.com
[ Upstream commit c2a16269742e176fccdd0ef9c016a233491a49ad ]
Currently, hdev->htqp is allocated using hdev->num_tqps, and kinfo->tqp is allocated using kinfo->num_tqps. However, kinfo->num_tqps is set to min(new_tqps, hdev->num_tqps); Therefore, kinfo->num_tqps may be smaller than hdev->num_tqps, which causes some hdev->htqp[i] to remain uninitialized in hclgevf_knic_setup().
Thus, this patch allocates hdev->htqp and kinfo->tqp using hdev->num_tqps, ensuring that the lengths of hdev->htqp and kinfo->tqp are consistent and that all elements are properly initialized.
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20251211023737.2327018-2-shaojijie@huawei.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index 8fcf220a120d2..70327a73dee32 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -368,12 +368,12 @@ static int hclgevf_knic_setup(struct hclgevf_dev *hdev) new_tqps = kinfo->rss_size * num_tc; kinfo->num_tqps = min(new_tqps, hdev->num_tqps);
- kinfo->tqp = devm_kcalloc(&hdev->pdev->dev, kinfo->num_tqps, + kinfo->tqp = devm_kcalloc(&hdev->pdev->dev, hdev->num_tqps, sizeof(struct hnae3_queue *), GFP_KERNEL); if (!kinfo->tqp) return -ENOMEM;
- for (i = 0; i < kinfo->num_tqps; i++) { + for (i = 0; i < hdev->num_tqps; i++) { hdev->htqp[i].q.handle = &hdev->nic; hdev->htqp[i].q.tqp_index = i; kinfo->tqp[i] = &hdev->htqp[i].q;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jian Shen shenjian15@huawei.com
[ Upstream commit d180c11aa8a6fa735f9ac2c72c61364a9afc2ba7 ]
Currently, rss_size = num_tqps / tc_num. If tc_num is 1, then num_tqps equals rss_size. However, if the tc_num is greater than 1, then rss_size will be less than num_tqps, causing the tqp_index check for subsequent TCs using rss_size to always fail.
This patch uses the num_tqps to check whether tqp_index is out of range, instead of rss_size.
Fixes: 326334aad024 ("net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20251211023737.2327018-3-shaojijie@huawei.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c index c7ff12a6c0764..b7d4e06a55d40 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c @@ -193,10 +193,10 @@ static int hclge_get_ring_chain_from_mbx( return -EINVAL;
for (i = 0; i < ring_num; i++) { - if (req->msg.param[i].tqp_index >= vport->nic.kinfo.rss_size) { + if (req->msg.param[i].tqp_index >= vport->nic.kinfo.num_tqps) { dev_err(&hdev->pdev->dev, "tqp index(%u) is out of range(0-%u)\n", req->msg.param[i].tqp_index, - vport->nic.kinfo.rss_size - 1U); + vport->nic.kinfo.num_tqps - 1U); return -EINVAL; } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 6ef935e65902bfed53980ad2754b06a284ea8ac1 ]
Currently, the VLAN id may be used without validation when receive a VLAN configuration mailbox from VF. The length of vlan_del_fail_bmap is BITS_TO_LONGS(VLAN_N_VID). It may cause out-of-bounds memory access once the VLAN id is bigger than or equal to VLAN_N_VID.
Therefore, VLAN id needs to be checked to ensure it is within the range of VLAN_N_VID.
Fixes: fe4144d47eef ("net: hns3: sync VLAN filter entries when kill VLAN ID failed") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20251211023737.2327018-4-shaojijie@huawei.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index 782bb48c9f3d7..1b103d1154da9 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -10562,6 +10562,9 @@ int hclge_set_vlan_filter(struct hnae3_handle *handle, __be16 proto, bool writen_to_tbl = false; int ret = 0;
+ if (vlan_id >= VLAN_N_VID) + return -EINVAL; + /* When device is resetting or reset failed, firmware is unable to * handle mailbox. Just record the vlan id, and remove it after * reset finished.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: José Expósito jose.exposito89@gmail.com
[ Upstream commit fe27e709d91fb645182751b602cb88966b4a1bb6 ]
Fedora/CentOS/RHEL CI is reporting intermittent failures while running the KUnit tests present in drm_hdmi_state_helper_test.c [1].
While the specific test causing the failure change between runs, all of them are caused by drm_kunit_helper_enable_crtc_connector() returning -EDEADLK. The error trace always follow this structure:
# <test name>: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c:<line> Expected ret == 0, but ret == -35 (0xffffffffffffffdd)
As documented, if the drm_kunit_helper_enable_crtc_connector() function returns -EDEADLK (-35), the entire atomic sequence must be restarted.
Handle this error code for all function calls.
Closes: https://datawarehouse.cki-project.org/issue/4039 [1] Fixes: 6a5c0ad7e08e ("drm/tests: hdmi_state_helpers: Switch to new helper") Reviewed-by: Maxime Ripard mripard@kernel.org Signed-off-by: José Expósito jose.exposito89@gmail.com Link: https://patch.msgid.link/20251104102258.10026-1-jose.exposito89@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../drm/tests/drm_hdmi_state_helper_test.c | 143 ++++++++++++++++++ 1 file changed, 143 insertions(+)
diff --git a/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c b/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c index 8bd412735000..70f9aa702143 100644 --- a/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c +++ b/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c @@ -257,10 +257,16 @@ static void drm_test_check_broadcast_rgb_crtc_mode_changed(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -326,10 +332,16 @@ static void drm_test_check_broadcast_rgb_crtc_mode_not_changed(struct kunit *tes
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -397,10 +409,16 @@ static void drm_test_check_broadcast_rgb_auto_cea_mode(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -457,10 +475,17 @@ static void drm_test_check_broadcast_rgb_auto_cea_mode_vic_1(struct kunit *test) KUNIT_ASSERT_NOT_NULL(test, mode);
crtc = priv->crtc; + +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, mode, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -518,10 +543,16 @@ static void drm_test_check_broadcast_rgb_full_cea_mode(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -580,10 +611,17 @@ static void drm_test_check_broadcast_rgb_full_cea_mode_vic_1(struct kunit *test) KUNIT_ASSERT_NOT_NULL(test, mode);
crtc = priv->crtc; + +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, mode, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -643,10 +681,16 @@ static void drm_test_check_broadcast_rgb_limited_cea_mode(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -705,10 +749,17 @@ static void drm_test_check_broadcast_rgb_limited_cea_mode_vic_1(struct kunit *te KUNIT_ASSERT_NOT_NULL(test, mode);
crtc = priv->crtc; + +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, mode, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -870,10 +921,16 @@ static void drm_test_check_output_bpc_crtc_mode_changed(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -946,10 +1003,16 @@ static void drm_test_check_output_bpc_crtc_mode_not_changed(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -1022,10 +1085,16 @@ static void drm_test_check_output_bpc_dvi(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1069,10 +1138,16 @@ static void drm_test_check_tmds_char_rate_rgb_8bpc(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1118,10 +1193,16 @@ static void drm_test_check_tmds_char_rate_rgb_10bpc(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1167,10 +1248,16 @@ static void drm_test_check_tmds_char_rate_rgb_12bpc(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1218,10 +1305,16 @@ static void drm_test_check_hdmi_funcs_reject_rate(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
/* You shouldn't be doing that at home. */ @@ -1292,10 +1385,16 @@ static void drm_test_check_max_tmds_rate_bpc_fallback_rgb(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1440,10 +1539,16 @@ static void drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv422(struct kunit
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1669,10 +1774,17 @@ static void drm_test_check_output_bpc_format_vic_1(struct kunit *test) drm_modeset_acquire_init(&ctx, 0);
crtc = priv->crtc; + +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, mode, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1736,10 +1848,16 @@ static void drm_test_check_output_bpc_format_driver_rgb_only(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1805,10 +1923,16 @@ static void drm_test_check_output_bpc_format_display_rgb_only(struct kunit *test
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1865,10 +1989,16 @@ static void drm_test_check_output_bpc_format_driver_8bpc_only(struct kunit *test
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1927,10 +2057,16 @@ static void drm_test_check_output_bpc_format_display_8bpc_only(struct kunit *tes
drm_modeset_acquire_init(&ctx, 0);
+retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0);
conn_state = conn->state; @@ -1970,10 +2106,17 @@ static void drm_test_check_disable_connector(struct kunit *test)
drm = &priv->drm; crtc = priv->crtc; + +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: José Expósito jose.exposito89@gmail.com
[ Upstream commit 141d95e42884628314f5ad9394657b0b35424300 ]
Fedora/CentOS/RHEL CI is reporting intermittent failures while running the drm_test_check_valid_clones() KUnit test.
The error log can be either [1]:
# drm_test_check_valid_clones: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:295 Expected ret == param->expected_result, but ret == -35 (0xffffffffffffffdd) param->expected_result == 0 (0x0)
Or [2] depending on the test case:
# drm_test_check_valid_clones: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:295 Expected ret == param->expected_result, but ret == -35 (0xffffffffffffffdd) param->expected_result == -22 (0xffffffffffffffea)
Restart the atomic sequence when EDEADLK is returned.
[1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/21... [2] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/21...
Fixes: 88849f24e2ab ("drm/tests: Add test for drm_atomic_helper_check_modeset()") Closes: https://datawarehouse.cki-project.org/issue/4004 Reviewed-by: Maxime Ripard mripard@kernel.org Signed-off-by: José Expósito jose.exposito89@gmail.com Link: https://patch.msgid.link/20251104102535.12212-1-jose.exposito89@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/tests/drm_atomic_state_test.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/tests/drm_atomic_state_test.c b/drivers/gpu/drm/tests/drm_atomic_state_test.c index 2f6ac7a09f44..1e857d86574c 100644 --- a/drivers/gpu/drm/tests/drm_atomic_state_test.c +++ b/drivers/gpu/drm/tests/drm_atomic_state_test.c @@ -283,7 +283,14 @@ static void drm_test_check_valid_clones(struct kunit *test) state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, state);
+retry: crtc_state = drm_atomic_get_crtc_state(state, priv->crtc); + if (PTR_ERR(crtc_state) == -EDEADLK) { + drm_atomic_state_clear(state); + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry; + } KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state);
crtc_state->encoder_mask = param->encoder_mask; @@ -292,6 +299,12 @@ static void drm_test_check_valid_clones(struct kunit *test) crtc_state->mode_changed = true;
ret = drm_atomic_helper_check_modeset(drm, state); + if (ret == -EDEADLK) { + drm_atomic_state_clear(state); + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry; + } KUNIT_ASSERT_EQ(test, ret, param->expected_result);
drm_modeset_drop_locks(&ctx);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: José Expósito jose.exposito89@gmail.com
[ Upstream commit 526aafabd756cc56401b383d6ae554af3e21dcdd ]
Fedora/CentOS/RHEL CI is reporting intermittent failures while running the drm_validate_modeset test [1]:
# drm_test_check_connector_changed_modeset: EXPECTATION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:162 Expected ret == 0, but ret == -35 (0xffffffffffffffdd)
Change the set_up_atomic_state() helper function to return on error and restart the atomic sequence when the returned error is EDEADLK.
[1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/21...
Fixes: 73d934d7b6e3 ("drm/tests: Add test for drm_atomic_helper_commit_modeset_disables()") Closes: https://datawarehouse.cki-project.org/issue/4004 Reviewed-by: Maxime Ripard mripard@kernel.org Signed-off-by: José Expósito jose.exposito89@gmail.com Link: https://patch.msgid.link/20251104102535.12212-2-jose.exposito89@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/tests/drm_atomic_state_test.c | 27 +++++++++++++++---- 1 file changed, 22 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/tests/drm_atomic_state_test.c b/drivers/gpu/drm/tests/drm_atomic_state_test.c index 1e857d86574c..bc27f65b2823 100644 --- a/drivers/gpu/drm/tests/drm_atomic_state_test.c +++ b/drivers/gpu/drm/tests/drm_atomic_state_test.c @@ -156,24 +156,29 @@ static int set_up_atomic_state(struct kunit *test,
if (connector) { conn_state = drm_atomic_get_connector_state(state, connector); - KUNIT_ASSERT_NOT_ERR_OR_NULL(test, conn_state); + if (IS_ERR(conn_state)) + return PTR_ERR(conn_state);
ret = drm_atomic_set_crtc_for_connector(conn_state, crtc); - KUNIT_EXPECT_EQ(test, ret, 0); + if (ret) + return ret; }
crtc_state = drm_atomic_get_crtc_state(state, crtc); - KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state); + if (IS_ERR(crtc_state)) + return PTR_ERR(crtc_state);
ret = drm_atomic_set_mode_for_crtc(crtc_state, &drm_atomic_test_mode); - KUNIT_EXPECT_EQ(test, ret, 0); + if (ret) + return ret;
crtc_state->enable = true; crtc_state->active = true;
if (connector) { ret = drm_atomic_commit(state); - KUNIT_ASSERT_EQ(test, ret, 0); + if (ret) + return ret; } else { // dummy connector mask crtc_state->connector_mask = DRM_TEST_CONN_0; @@ -206,7 +211,13 @@ static void drm_test_check_connector_changed_modeset(struct kunit *test) drm_modeset_acquire_init(&ctx, 0);
// first modeset to enable +retry_set_up: ret = set_up_atomic_state(test, priv, old_conn, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_set_up; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -277,7 +288,13 @@ static void drm_test_check_valid_clones(struct kunit *test)
drm_modeset_acquire_init(&ctx, 0);
+retry_set_up: ret = set_up_atomic_state(test, priv, NULL, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_set_up; + } KUNIT_ASSERT_EQ(test, ret, 0);
state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 9637fc3bdd10c8e073f71897bd35babbd21e9b29 ]
The functions ublk_queue_use_zc(), ublk_queue_use_auto_zc(), and ublk_queue_auto_zc_fallback() were returning int, but performing bitwise AND on q->flags which is __u64.
When a flag bit is set in the upper 32 bits (beyond INT_MAX), the result of the bitwise AND operation could overflow when cast to int, leading to incorrect boolean evaluation.
For example, if UBLKS_Q_AUTO_BUF_REG_FALLBACK is 0x8000000000000000: - (u64)flags & 0x8000000000000000 = 0x8000000000000000 - Cast to int: undefined behavior / incorrect value - Used in if(): may evaluate incorrectly
Fix by: 1. Changing return type from int to bool for semantic correctness 2. Using !! to explicitly convert to boolean (0 or 1)
This ensures the functions return proper boolean values regardless of which bit position the flags occupy in the 64-bit field.
Fixes: c3a6d48f86da ("selftests: ublk: remove ublk queue self-defined flags") Signed-off-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/ublk/kublk.h | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/ublk/kublk.h b/tools/testing/selftests/ublk/kublk.h index 5e55484fb0aa..1b8833a40064 100644 --- a/tools/testing/selftests/ublk/kublk.h +++ b/tools/testing/selftests/ublk/kublk.h @@ -393,19 +393,19 @@ static inline int ublk_completed_tgt_io(struct ublk_thread *t, return --io->tgt_ios == 0; }
-static inline int ublk_queue_use_zc(const struct ublk_queue *q) +static inline bool ublk_queue_use_zc(const struct ublk_queue *q) { - return q->flags & UBLK_F_SUPPORT_ZERO_COPY; + return !!(q->flags & UBLK_F_SUPPORT_ZERO_COPY); }
-static inline int ublk_queue_use_auto_zc(const struct ublk_queue *q) +static inline bool ublk_queue_use_auto_zc(const struct ublk_queue *q) { - return q->flags & UBLK_F_AUTO_BUF_REG; + return !!(q->flags & UBLK_F_AUTO_BUF_REG); }
-static inline int ublk_queue_auto_zc_fallback(const struct ublk_queue *q) +static inline bool ublk_queue_auto_zc_fallback(const struct ublk_queue *q) { - return q->flags & UBLKS_Q_AUTO_BUF_REG_FALLBACK; + return !!(q->flags & UBLKS_Q_AUTO_BUF_REG_FALLBACK); }
static inline int ublk_queue_no_buf(const struct ublk_queue *q)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nilay Shroff nilay@linux.ibm.com
[ Upstream commit 232143b605387b372dee0ec7830f93b93df5f67d ]
Currently, the nr_hw_queues update path manages two disjoint xarrays — one for elevator tags and another for elevator type — both used during elevator switching. Maintaining these two parallel structures for the same purpose adds unnecessary complexity and potential for mismatched state.
This patch unifies both xarrays into a single structure, struct elv_change_ctx, which holds all per-queue elevator change context. A single xarray, named elv_tbl, now maps each queue (q->id) in a tagset to its corresponding elv_change_ctx entry, encapsulating the elevator tags, type and name references.
This unification simplifies the code, improves maintainability, and clarifies ownership of per-queue elevator state.
Reviewed-by: Ming Lei ming.lei@redhat.com Reviewed-by: Yu Kuai yukuai@fnnas.com Signed-off-by: Nilay Shroff nilay@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: 9869d3a6fed3 ("block: fix race between wbt_enable_default and IO submission") Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq-sched.c | 76 +++++++++++++++++++++++++++++++++----------- block/blk-mq-sched.h | 3 ++ block/blk-mq.c | 50 +++++++++++++++++------------ block/blk.h | 7 ++-- block/elevator.c | 31 ++++-------------- block/elevator.h | 15 +++++++++ 6 files changed, 115 insertions(+), 67 deletions(-)
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index e0bed16485c3..3d9386555a50 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -427,11 +427,11 @@ void blk_mq_free_sched_tags(struct elevator_tags *et, kfree(et); }
-void blk_mq_free_sched_tags_batch(struct xarray *et_table, +void blk_mq_free_sched_tags_batch(struct xarray *elv_tbl, struct blk_mq_tag_set *set) { struct request_queue *q; - struct elevator_tags *et; + struct elv_change_ctx *ctx;
lockdep_assert_held_write(&set->update_nr_hwq_lock);
@@ -444,13 +444,47 @@ void blk_mq_free_sched_tags_batch(struct xarray *et_table, * concurrently. */ if (q->elevator) { - et = xa_load(et_table, q->id); - if (unlikely(!et)) + ctx = xa_load(elv_tbl, q->id); + if (!ctx || !ctx->et) { WARN_ON_ONCE(1); - else - blk_mq_free_sched_tags(et, set); + continue; + } + blk_mq_free_sched_tags(ctx->et, set); + ctx->et = NULL; + } + } +} + +void blk_mq_free_sched_ctx_batch(struct xarray *elv_tbl) +{ + unsigned long i; + struct elv_change_ctx *ctx; + + xa_for_each(elv_tbl, i, ctx) { + xa_erase(elv_tbl, i); + kfree(ctx); + } +} + +int blk_mq_alloc_sched_ctx_batch(struct xarray *elv_tbl, + struct blk_mq_tag_set *set) +{ + struct request_queue *q; + struct elv_change_ctx *ctx; + + lockdep_assert_held_write(&set->update_nr_hwq_lock); + + list_for_each_entry(q, &set->tag_list, tag_set_list) { + ctx = kzalloc(sizeof(struct elv_change_ctx), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + if (xa_insert(elv_tbl, q->id, ctx, GFP_KERNEL)) { + kfree(ctx); + return -ENOMEM; } } + return 0; }
struct elevator_tags *blk_mq_alloc_sched_tags(struct blk_mq_tag_set *set, @@ -498,12 +532,13 @@ struct elevator_tags *blk_mq_alloc_sched_tags(struct blk_mq_tag_set *set, return NULL; }
-int blk_mq_alloc_sched_tags_batch(struct xarray *et_table, +int blk_mq_alloc_sched_tags_batch(struct xarray *elv_tbl, struct blk_mq_tag_set *set, unsigned int nr_hw_queues) { + struct elv_change_ctx *ctx; struct request_queue *q; struct elevator_tags *et; - gfp_t gfp = GFP_NOIO | __GFP_ZERO | __GFP_NOWARN | __GFP_NORETRY; + int ret = -ENOMEM;
lockdep_assert_held_write(&set->update_nr_hwq_lock);
@@ -516,26 +551,31 @@ int blk_mq_alloc_sched_tags_batch(struct xarray *et_table, * concurrently. */ if (q->elevator) { - et = blk_mq_alloc_sched_tags(set, nr_hw_queues, + ctx = xa_load(elv_tbl, q->id); + if (WARN_ON_ONCE(!ctx)) { + ret = -ENOENT; + goto out_unwind; + } + + ctx->et = blk_mq_alloc_sched_tags(set, nr_hw_queues, blk_mq_default_nr_requests(set)); - if (!et) + if (!ctx->et) goto out_unwind; - if (xa_insert(et_table, q->id, et, gfp)) - goto out_free_tags; + } } return 0; -out_free_tags: - blk_mq_free_sched_tags(et, set); out_unwind: list_for_each_entry_continue_reverse(q, &set->tag_list, tag_set_list) { if (q->elevator) { - et = xa_load(et_table, q->id); - if (et) - blk_mq_free_sched_tags(et, set); + ctx = xa_load(elv_tbl, q->id); + if (ctx && ctx->et) { + blk_mq_free_sched_tags(ctx->et, set); + ctx->et = NULL; + } } } - return -ENOMEM; + return ret; }
/* caller must have a reference to @e, will grab another one if successful */ diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h index 8e21a6b1415d..2fddbc91a235 100644 --- a/block/blk-mq-sched.h +++ b/block/blk-mq-sched.h @@ -27,6 +27,9 @@ struct elevator_tags *blk_mq_alloc_sched_tags(struct blk_mq_tag_set *set, unsigned int nr_hw_queues, unsigned int nr_requests); int blk_mq_alloc_sched_tags_batch(struct xarray *et_table, struct blk_mq_tag_set *set, unsigned int nr_hw_queues); +int blk_mq_alloc_sched_ctx_batch(struct xarray *elv_tbl, + struct blk_mq_tag_set *set); +void blk_mq_free_sched_ctx_batch(struct xarray *elv_tbl); void blk_mq_free_sched_tags(struct elevator_tags *et, struct blk_mq_tag_set *set); void blk_mq_free_sched_tags_batch(struct xarray *et_table, diff --git a/block/blk-mq.c b/block/blk-mq.c index f901aeba8552..180d45db5624 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4996,27 +4996,28 @@ struct elevator_tags *blk_mq_update_nr_requests(struct request_queue *q, * Switch back to the elevator type stored in the xarray. */ static void blk_mq_elv_switch_back(struct request_queue *q, - struct xarray *elv_tbl, struct xarray *et_tbl) + struct xarray *elv_tbl) { - struct elevator_type *e = xa_load(elv_tbl, q->id); - struct elevator_tags *t = xa_load(et_tbl, q->id); + struct elv_change_ctx *ctx = xa_load(elv_tbl, q->id); + + if (WARN_ON_ONCE(!ctx)) + return;
/* The elv_update_nr_hw_queues unfreezes the queue. */ - elv_update_nr_hw_queues(q, e, t); + elv_update_nr_hw_queues(q, ctx);
/* Drop the reference acquired in blk_mq_elv_switch_none. */ - if (e) - elevator_put(e); + if (ctx->type) + elevator_put(ctx->type); }
/* - * Stores elevator type in xarray and set current elevator to none. It uses - * q->id as an index to store the elevator type into the xarray. + * Stores elevator name and type in ctx and set current elevator to none. */ static int blk_mq_elv_switch_none(struct request_queue *q, struct xarray *elv_tbl) { - int ret = 0; + struct elv_change_ctx *ctx;
lockdep_assert_held_write(&q->tag_set->update_nr_hwq_lock);
@@ -5028,10 +5029,11 @@ static int blk_mq_elv_switch_none(struct request_queue *q, * can't run concurrently. */ if (q->elevator) { + ctx = xa_load(elv_tbl, q->id); + if (WARN_ON_ONCE(!ctx)) + return -ENOENT;
- ret = xa_insert(elv_tbl, q->id, q->elevator->type, GFP_KERNEL); - if (WARN_ON_ONCE(ret)) - return ret; + ctx->name = q->elevator->type->elevator_name;
/* * Before we switch elevator to 'none', take a reference to @@ -5042,9 +5044,14 @@ static int blk_mq_elv_switch_none(struct request_queue *q, */ __elevator_get(q->elevator->type);
+ /* + * Store elevator type so that we can release the reference + * taken above later. + */ + ctx->type = q->elevator->type; elevator_set_none(q); } - return ret; + return 0; }
static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, @@ -5054,7 +5061,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int prev_nr_hw_queues = set->nr_hw_queues; unsigned int memflags; int i; - struct xarray elv_tbl, et_tbl; + struct xarray elv_tbl; bool queues_frozen = false;
lockdep_assert_held(&set->tag_list_lock); @@ -5068,11 +5075,12 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
memflags = memalloc_noio_save();
- xa_init(&et_tbl); - if (blk_mq_alloc_sched_tags_batch(&et_tbl, set, nr_hw_queues) < 0) - goto out_memalloc_restore; - xa_init(&elv_tbl); + if (blk_mq_alloc_sched_ctx_batch(&elv_tbl, set) < 0) + goto out_free_ctx; + + if (blk_mq_alloc_sched_tags_batch(&elv_tbl, set, nr_hw_queues) < 0) + goto out_free_ctx;
list_for_each_entry(q, &set->tag_list, tag_set_list) { blk_mq_debugfs_unregister_hctxs(q); @@ -5118,7 +5126,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, /* switch_back expects queue to be frozen */ if (!queues_frozen) blk_mq_freeze_queue_nomemsave(q); - blk_mq_elv_switch_back(q, &elv_tbl, &et_tbl); + blk_mq_elv_switch_back(q, &elv_tbl); }
list_for_each_entry(q, &set->tag_list, tag_set_list) { @@ -5129,9 +5137,9 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, blk_mq_add_hw_queues_cpuhp(q); }
+out_free_ctx: + blk_mq_free_sched_ctx_batch(&elv_tbl); xa_destroy(&elv_tbl); - xa_destroy(&et_tbl); -out_memalloc_restore: memalloc_noio_restore(memflags);
/* Free the excess tags when nr_hw_queues shrink. */ diff --git a/block/blk.h b/block/blk.h index 170794632135..a7992680f9e1 100644 --- a/block/blk.h +++ b/block/blk.h @@ -11,8 +11,7 @@ #include <xen/xen.h> #include "blk-crypto-internal.h"
-struct elevator_type; -struct elevator_tags; +struct elv_change_ctx;
/* * Default upper limit for the software max_sectors limit used for regular I/Os. @@ -333,8 +332,8 @@ bool blk_bio_list_merge(struct request_queue *q, struct list_head *list,
bool blk_insert_flush(struct request *rq);
-void elv_update_nr_hw_queues(struct request_queue *q, struct elevator_type *e, - struct elevator_tags *t); +void elv_update_nr_hw_queues(struct request_queue *q, + struct elv_change_ctx *ctx); void elevator_set_default(struct request_queue *q); void elevator_set_none(struct request_queue *q);
diff --git a/block/elevator.c b/block/elevator.c index e2ebfbf107b3..cd7bdff205c8 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -45,19 +45,6 @@ #include "blk-wbt.h" #include "blk-cgroup.h"
-/* Holding context data for changing elevator */ -struct elv_change_ctx { - const char *name; - bool no_uevent; - - /* for unregistering old elevator */ - struct elevator_queue *old; - /* for registering new elevator */ - struct elevator_queue *new; - /* holds sched tags data */ - struct elevator_tags *et; -}; - static DEFINE_SPINLOCK(elv_list_lock); static LIST_HEAD(elv_list);
@@ -706,32 +693,28 @@ static int elevator_change(struct request_queue *q, struct elv_change_ctx *ctx) * The I/O scheduler depends on the number of hardware queues, this forces a * reattachment when nr_hw_queues changes. */ -void elv_update_nr_hw_queues(struct request_queue *q, struct elevator_type *e, - struct elevator_tags *t) +void elv_update_nr_hw_queues(struct request_queue *q, + struct elv_change_ctx *ctx) { struct blk_mq_tag_set *set = q->tag_set; - struct elv_change_ctx ctx = {}; int ret = -ENODEV;
WARN_ON_ONCE(q->mq_freeze_depth == 0);
- if (e && !blk_queue_dying(q) && blk_queue_registered(q)) { - ctx.name = e->elevator_name; - ctx.et = t; - + if (ctx->type && !blk_queue_dying(q) && blk_queue_registered(q)) { mutex_lock(&q->elevator_lock); /* force to reattach elevator after nr_hw_queue is updated */ - ret = elevator_switch(q, &ctx); + ret = elevator_switch(q, ctx); mutex_unlock(&q->elevator_lock); } blk_mq_unfreeze_queue_nomemrestore(q); if (!ret) - WARN_ON_ONCE(elevator_change_done(q, &ctx)); + WARN_ON_ONCE(elevator_change_done(q, ctx)); /* * Free sched tags if it's allocated but we couldn't switch elevator. */ - if (t && !ctx.new) - blk_mq_free_sched_tags(t, set); + if (ctx->et && !ctx->new) + blk_mq_free_sched_tags(ctx->et, set); }
/* diff --git a/block/elevator.h b/block/elevator.h index c4d20155065e..bad43182361e 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -32,6 +32,21 @@ struct elevator_tags { struct blk_mq_tags *tags[]; };
+/* Holding context data for changing elevator */ +struct elv_change_ctx { + const char *name; + bool no_uevent; + + /* for unregistering old elevator */ + struct elevator_queue *old; + /* for registering new elevator */ + struct elevator_queue *new; + /* store elevator type */ + struct elevator_type *type; + /* holds sched tags data */ + struct elevator_tags *et; +}; + struct elevator_mq_ops { int (*init_sched)(struct request_queue *, struct elevator_queue *); void (*exit_sched)(struct elevator_queue *);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nilay Shroff nilay@linux.ibm.com
[ Upstream commit 04728ce90966c54417fd8120a3820104d18ba68d ]
This patch introduces a new structure, struct elevator_resources, to group together all elevator-related resources that share the same lifetime. As a first step, this change moves the elevator tag pointer from struct elv_change_ctx into the new struct elevator_resources.
Additionally, rename blk_mq_alloc_sched_tags_batch() and blk_mq_free_sched_tags_batch() to blk_mq_alloc_sched_res_batch() and blk_mq_free_sched_res_batch(), respectively. Introduce two new wrapper helpers, blk_mq_alloc_sched_res() and blk_mq_free_sched_res(), around blk_mq_alloc_sched_tags() and blk_mq_free_sched_tags().
These changes pave the way for consolidating the allocation and freeing of elevator-specific resources into common helper functions. This refactoring improves encapsulation and prepares the code for future extensions, allowing additional elevator-specific data to be added to struct elevator_resources without cluttering struct elv_change_ctx.
Subsequent patches will extend struct elevator_resources to include other elevator-related data.
Reviewed-by: Ming Lei ming.lei@redhat.com Reviewed-by: Yu Kuai yukuai@fnnas.com Signed-off-by: Nilay Shroff nilay@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: 9869d3a6fed3 ("block: fix race between wbt_enable_default and IO submission") Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq-sched.c | 48 ++++++++++++++++++++++++++++++-------------- block/blk-mq-sched.h | 10 ++++++--- block/blk-mq.c | 2 +- block/elevator.c | 31 ++++++++++++++-------------- block/elevator.h | 9 +++++++-- 5 files changed, 64 insertions(+), 36 deletions(-)
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 3d9386555a50..03ff16c49976 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -427,7 +427,16 @@ void blk_mq_free_sched_tags(struct elevator_tags *et, kfree(et); }
-void blk_mq_free_sched_tags_batch(struct xarray *elv_tbl, +void blk_mq_free_sched_res(struct elevator_resources *res, + struct blk_mq_tag_set *set) +{ + if (res->et) { + blk_mq_free_sched_tags(res->et, set); + res->et = NULL; + } +} + +void blk_mq_free_sched_res_batch(struct xarray *elv_tbl, struct blk_mq_tag_set *set) { struct request_queue *q; @@ -445,12 +454,11 @@ void blk_mq_free_sched_tags_batch(struct xarray *elv_tbl, */ if (q->elevator) { ctx = xa_load(elv_tbl, q->id); - if (!ctx || !ctx->et) { + if (!ctx) { WARN_ON_ONCE(1); continue; } - blk_mq_free_sched_tags(ctx->et, set); - ctx->et = NULL; + blk_mq_free_sched_res(&ctx->res, set); } } } @@ -532,12 +540,24 @@ struct elevator_tags *blk_mq_alloc_sched_tags(struct blk_mq_tag_set *set, return NULL; }
-int blk_mq_alloc_sched_tags_batch(struct xarray *elv_tbl, +int blk_mq_alloc_sched_res(struct request_queue *q, + struct elevator_resources *res, unsigned int nr_hw_queues) +{ + struct blk_mq_tag_set *set = q->tag_set; + + res->et = blk_mq_alloc_sched_tags(set, nr_hw_queues, + blk_mq_default_nr_requests(set)); + if (!res->et) + return -ENOMEM; + + return 0; +} + +int blk_mq_alloc_sched_res_batch(struct xarray *elv_tbl, struct blk_mq_tag_set *set, unsigned int nr_hw_queues) { struct elv_change_ctx *ctx; struct request_queue *q; - struct elevator_tags *et; int ret = -ENOMEM;
lockdep_assert_held_write(&set->update_nr_hwq_lock); @@ -557,11 +577,10 @@ int blk_mq_alloc_sched_tags_batch(struct xarray *elv_tbl, goto out_unwind; }
- ctx->et = blk_mq_alloc_sched_tags(set, nr_hw_queues, - blk_mq_default_nr_requests(set)); - if (!ctx->et) + ret = blk_mq_alloc_sched_res(q, &ctx->res, + nr_hw_queues); + if (ret) goto out_unwind; - } } return 0; @@ -569,10 +588,8 @@ int blk_mq_alloc_sched_tags_batch(struct xarray *elv_tbl, list_for_each_entry_continue_reverse(q, &set->tag_list, tag_set_list) { if (q->elevator) { ctx = xa_load(elv_tbl, q->id); - if (ctx && ctx->et) { - blk_mq_free_sched_tags(ctx->et, set); - ctx->et = NULL; - } + if (ctx) + blk_mq_free_sched_res(&ctx->res, set); } } return ret; @@ -580,9 +597,10 @@ int blk_mq_alloc_sched_tags_batch(struct xarray *elv_tbl,
/* caller must have a reference to @e, will grab another one if successful */ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e, - struct elevator_tags *et) + struct elevator_resources *res) { unsigned int flags = q->tag_set->flags; + struct elevator_tags *et = res->et; struct blk_mq_hw_ctx *hctx; struct elevator_queue *eq; unsigned long i; diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h index 2fddbc91a235..1f8e58dd4b49 100644 --- a/block/blk-mq-sched.h +++ b/block/blk-mq-sched.h @@ -19,20 +19,24 @@ void __blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx); void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx);
int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e, - struct elevator_tags *et); + struct elevator_resources *res); void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e); void blk_mq_sched_free_rqs(struct request_queue *q);
struct elevator_tags *blk_mq_alloc_sched_tags(struct blk_mq_tag_set *set, unsigned int nr_hw_queues, unsigned int nr_requests); -int blk_mq_alloc_sched_tags_batch(struct xarray *et_table, +int blk_mq_alloc_sched_res(struct request_queue *q, + struct elevator_resources *res, unsigned int nr_hw_queues); +int blk_mq_alloc_sched_res_batch(struct xarray *elv_tbl, struct blk_mq_tag_set *set, unsigned int nr_hw_queues); int blk_mq_alloc_sched_ctx_batch(struct xarray *elv_tbl, struct blk_mq_tag_set *set); void blk_mq_free_sched_ctx_batch(struct xarray *elv_tbl); void blk_mq_free_sched_tags(struct elevator_tags *et, struct blk_mq_tag_set *set); -void blk_mq_free_sched_tags_batch(struct xarray *et_table, +void blk_mq_free_sched_res(struct elevator_resources *res, + struct blk_mq_tag_set *set); +void blk_mq_free_sched_res_batch(struct xarray *et_table, struct blk_mq_tag_set *set);
static inline void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx) diff --git a/block/blk-mq.c b/block/blk-mq.c index 180d45db5624..ea5f948af7a4 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -5079,7 +5079,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, if (blk_mq_alloc_sched_ctx_batch(&elv_tbl, set) < 0) goto out_free_ctx;
- if (blk_mq_alloc_sched_tags_batch(&elv_tbl, set, nr_hw_queues) < 0) + if (blk_mq_alloc_sched_res_batch(&elv_tbl, set, nr_hw_queues) < 0) goto out_free_ctx;
list_for_each_entry(q, &set->tag_list, tag_set_list) { diff --git a/block/elevator.c b/block/elevator.c index cd7bdff205c8..cbec292a4af5 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -580,7 +580,7 @@ static int elevator_switch(struct request_queue *q, struct elv_change_ctx *ctx) }
if (new_e) { - ret = blk_mq_init_sched(q, new_e, ctx->et); + ret = blk_mq_init_sched(q, new_e, &ctx->res); if (ret) goto out_unfreeze; ctx->new = q->elevator; @@ -604,7 +604,8 @@ static int elevator_switch(struct request_queue *q, struct elv_change_ctx *ctx) return ret; }
-static void elv_exit_and_release(struct request_queue *q) +static void elv_exit_and_release(struct elv_change_ctx *ctx, + struct request_queue *q) { struct elevator_queue *e; unsigned memflags; @@ -616,7 +617,7 @@ static void elv_exit_and_release(struct request_queue *q) mutex_unlock(&q->elevator_lock); blk_mq_unfreeze_queue(q, memflags); if (e) { - blk_mq_free_sched_tags(e->et, q->tag_set); + blk_mq_free_sched_res(&ctx->res, q->tag_set); kobject_put(&e->kobj); } } @@ -627,11 +628,12 @@ static int elevator_change_done(struct request_queue *q, int ret = 0;
if (ctx->old) { + struct elevator_resources res = {.et = ctx->old->et}; bool enable_wbt = test_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT, &ctx->old->flags);
elv_unregister_queue(q, ctx->old); - blk_mq_free_sched_tags(ctx->old->et, q->tag_set); + blk_mq_free_sched_res(&res, q->tag_set); kobject_put(&ctx->old->kobj); if (enable_wbt) wbt_enable_default(q->disk); @@ -639,7 +641,7 @@ static int elevator_change_done(struct request_queue *q, if (ctx->new) { ret = elv_register_queue(q, ctx->new, !ctx->no_uevent); if (ret) - elv_exit_and_release(q); + elv_exit_and_release(ctx, q); } return ret; } @@ -656,10 +658,9 @@ static int elevator_change(struct request_queue *q, struct elv_change_ctx *ctx) lockdep_assert_held(&set->update_nr_hwq_lock);
if (strncmp(ctx->name, "none", 4)) { - ctx->et = blk_mq_alloc_sched_tags(set, set->nr_hw_queues, - blk_mq_default_nr_requests(set)); - if (!ctx->et) - return -ENOMEM; + ret = blk_mq_alloc_sched_res(q, &ctx->res, set->nr_hw_queues); + if (ret) + return ret; }
memflags = blk_mq_freeze_queue(q); @@ -681,10 +682,10 @@ static int elevator_change(struct request_queue *q, struct elv_change_ctx *ctx) if (!ret) ret = elevator_change_done(q, ctx); /* - * Free sched tags if it's allocated but we couldn't switch elevator. + * Free sched resource if it's allocated but we couldn't switch elevator. */ - if (ctx->et && !ctx->new) - blk_mq_free_sched_tags(ctx->et, set); + if (!ctx->new) + blk_mq_free_sched_res(&ctx->res, set);
return ret; } @@ -711,10 +712,10 @@ void elv_update_nr_hw_queues(struct request_queue *q, if (!ret) WARN_ON_ONCE(elevator_change_done(q, ctx)); /* - * Free sched tags if it's allocated but we couldn't switch elevator. + * Free sched resource if it's allocated but we couldn't switch elevator. */ - if (ctx->et && !ctx->new) - blk_mq_free_sched_tags(ctx->et, set); + if (!ctx->new) + blk_mq_free_sched_res(&ctx->res, set); }
/* diff --git a/block/elevator.h b/block/elevator.h index bad43182361e..621a63597249 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -32,6 +32,11 @@ struct elevator_tags { struct blk_mq_tags *tags[]; };
+struct elevator_resources { + /* holds elevator tags */ + struct elevator_tags *et; +}; + /* Holding context data for changing elevator */ struct elv_change_ctx { const char *name; @@ -43,8 +48,8 @@ struct elv_change_ctx { struct elevator_queue *new; /* store elevator type */ struct elevator_type *type; - /* holds sched tags data */ - struct elevator_tags *et; + /* store elevator resources */ + struct elevator_resources res; };
struct elevator_mq_ops {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nilay Shroff nilay@linux.ibm.com
[ Upstream commit 61019afdf6ac17c8e8f9c42665aa1fa82f04a3e2 ]
The recent lockdep splat [1] highlights a potential deadlock risk involving ->elevator_lock and ->freeze_lock dependencies on -pcpu_alloc_ mutex. The trace shows that the issue occurs when the Kyber scheduler allocates dynamic memory for its elevator data during initialization.
To address this, introduce two new elevator operation callbacks: ->alloc_sched_data and ->free_sched_data. The subsequent patch would build upon these newly introduced methods to suppress lockdep splat[1].
[1] https://lore.kernel.org/all/CAGVVp+VNW4M-5DZMNoADp6o2VKFhi7KxWpTDkcnVyjO0=-D...
Signed-off-by: Nilay Shroff nilay@linux.ibm.com Reviewed-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: 9869d3a6fed3 ("block: fix race between wbt_enable_default and IO submission") Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq-sched.h | 24 ++++++++++++++++++++++++ block/elevator.h | 2 ++ 2 files changed, 26 insertions(+)
diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h index 1f8e58dd4b49..4e1b86e85a8a 100644 --- a/block/blk-mq-sched.h +++ b/block/blk-mq-sched.h @@ -38,6 +38,30 @@ void blk_mq_free_sched_res(struct elevator_resources *res, struct blk_mq_tag_set *set); void blk_mq_free_sched_res_batch(struct xarray *et_table, struct blk_mq_tag_set *set); +/* + * blk_mq_alloc_sched_data() - Allocates scheduler specific data + * Returns: + * - Pointer to allocated data on success + * - NULL if no allocation needed + * - ERR_PTR(-ENOMEM) in case of failure + */ +static inline void *blk_mq_alloc_sched_data(struct request_queue *q, + struct elevator_type *e) +{ + void *sched_data; + + if (!e || !e->ops.alloc_sched_data) + return NULL; + + sched_data = e->ops.alloc_sched_data(q); + return (sched_data) ?: ERR_PTR(-ENOMEM); +} + +static inline void blk_mq_free_sched_data(struct elevator_type *e, void *data) +{ + if (e && e->ops.free_sched_data) + e->ops.free_sched_data(data); +}
static inline void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx) { diff --git a/block/elevator.h b/block/elevator.h index 621a63597249..e34043f6da26 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -58,6 +58,8 @@ struct elevator_mq_ops { int (*init_hctx)(struct blk_mq_hw_ctx *, unsigned int); void (*exit_hctx)(struct blk_mq_hw_ctx *, unsigned int); void (*depth_updated)(struct request_queue *); + void *(*alloc_sched_data)(struct request_queue *); + void (*free_sched_data)(void *);
bool (*allow_merge)(struct request_queue *, struct request *, struct bio *); bool (*bio_merge)(struct request_queue *, struct bio *, unsigned int);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nilay Shroff nilay@linux.ibm.com
[ Upstream commit 0315476e78c050048e80f66334a310e5581b46bb ]
The previous patch introduced ->alloc_sched_data and ->free_sched_data methods. This patch builds upon that by now using these methods during elevator switch and nr_hw_queue update.
It's also ensured that scheduler-specific data is allocated and freed through the new callbacks outside of the ->freeze_lock and ->elevator_lock locking contexts, thereby preventing any dependency on pcpu_alloc_mutex.
Reviewed-by: Ming Lei ming.lei@redhat.com Reviewed-by: Yu Kuai yukuai@fnnas.com Signed-off-by: Nilay Shroff nilay@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: 9869d3a6fed3 ("block: fix race between wbt_enable_default and IO submission") Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq-sched.c | 27 +++++++++++++++++++++------ block/blk-mq-sched.h | 5 ++++- block/elevator.c | 34 ++++++++++++++++++++++------------ block/elevator.h | 4 +++- 4 files changed, 50 insertions(+), 20 deletions(-)
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 03ff16c49976..128f2be9d420 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -428,12 +428,17 @@ void blk_mq_free_sched_tags(struct elevator_tags *et, }
void blk_mq_free_sched_res(struct elevator_resources *res, + struct elevator_type *type, struct blk_mq_tag_set *set) { if (res->et) { blk_mq_free_sched_tags(res->et, set); res->et = NULL; } + if (res->data) { + blk_mq_free_sched_data(type, res->data); + res->data = NULL; + } }
void blk_mq_free_sched_res_batch(struct xarray *elv_tbl, @@ -458,7 +463,7 @@ void blk_mq_free_sched_res_batch(struct xarray *elv_tbl, WARN_ON_ONCE(1); continue; } - blk_mq_free_sched_res(&ctx->res, set); + blk_mq_free_sched_res(&ctx->res, ctx->type, set); } } } @@ -541,7 +546,9 @@ struct elevator_tags *blk_mq_alloc_sched_tags(struct blk_mq_tag_set *set, }
int blk_mq_alloc_sched_res(struct request_queue *q, - struct elevator_resources *res, unsigned int nr_hw_queues) + struct elevator_type *type, + struct elevator_resources *res, + unsigned int nr_hw_queues) { struct blk_mq_tag_set *set = q->tag_set;
@@ -550,6 +557,12 @@ int blk_mq_alloc_sched_res(struct request_queue *q, if (!res->et) return -ENOMEM;
+ res->data = blk_mq_alloc_sched_data(q, type); + if (IS_ERR(res->data)) { + blk_mq_free_sched_tags(res->et, set); + return -ENOMEM; + } + return 0; }
@@ -577,19 +590,21 @@ int blk_mq_alloc_sched_res_batch(struct xarray *elv_tbl, goto out_unwind; }
- ret = blk_mq_alloc_sched_res(q, &ctx->res, - nr_hw_queues); + ret = blk_mq_alloc_sched_res(q, q->elevator->type, + &ctx->res, nr_hw_queues); if (ret) goto out_unwind; } } return 0; + out_unwind: list_for_each_entry_continue_reverse(q, &set->tag_list, tag_set_list) { if (q->elevator) { ctx = xa_load(elv_tbl, q->id); if (ctx) - blk_mq_free_sched_res(&ctx->res, set); + blk_mq_free_sched_res(&ctx->res, + ctx->type, set); } } return ret; @@ -606,7 +621,7 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e, unsigned long i; int ret;
- eq = elevator_alloc(q, e, et); + eq = elevator_alloc(q, e, res); if (!eq) return -ENOMEM;
diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h index 4e1b86e85a8a..02c40a72e959 100644 --- a/block/blk-mq-sched.h +++ b/block/blk-mq-sched.h @@ -26,7 +26,9 @@ void blk_mq_sched_free_rqs(struct request_queue *q); struct elevator_tags *blk_mq_alloc_sched_tags(struct blk_mq_tag_set *set, unsigned int nr_hw_queues, unsigned int nr_requests); int blk_mq_alloc_sched_res(struct request_queue *q, - struct elevator_resources *res, unsigned int nr_hw_queues); + struct elevator_type *type, + struct elevator_resources *res, + unsigned int nr_hw_queues); int blk_mq_alloc_sched_res_batch(struct xarray *elv_tbl, struct blk_mq_tag_set *set, unsigned int nr_hw_queues); int blk_mq_alloc_sched_ctx_batch(struct xarray *elv_tbl, @@ -35,6 +37,7 @@ void blk_mq_free_sched_ctx_batch(struct xarray *elv_tbl); void blk_mq_free_sched_tags(struct elevator_tags *et, struct blk_mq_tag_set *set); void blk_mq_free_sched_res(struct elevator_resources *res, + struct elevator_type *type, struct blk_mq_tag_set *set); void blk_mq_free_sched_res_batch(struct xarray *et_table, struct blk_mq_tag_set *set); diff --git a/block/elevator.c b/block/elevator.c index cbec292a4af5..5b37ef44f52d 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -121,7 +121,7 @@ static struct elevator_type *elevator_find_get(const char *name) static const struct kobj_type elv_ktype;
struct elevator_queue *elevator_alloc(struct request_queue *q, - struct elevator_type *e, struct elevator_tags *et) + struct elevator_type *e, struct elevator_resources *res) { struct elevator_queue *eq;
@@ -134,7 +134,8 @@ struct elevator_queue *elevator_alloc(struct request_queue *q, kobject_init(&eq->kobj, &elv_ktype); mutex_init(&eq->sysfs_lock); hash_init(eq->hash); - eq->et = et; + eq->et = res->et; + eq->elevator_data = res->data;
return eq; } @@ -617,7 +618,7 @@ static void elv_exit_and_release(struct elv_change_ctx *ctx, mutex_unlock(&q->elevator_lock); blk_mq_unfreeze_queue(q, memflags); if (e) { - blk_mq_free_sched_res(&ctx->res, q->tag_set); + blk_mq_free_sched_res(&ctx->res, ctx->type, q->tag_set); kobject_put(&e->kobj); } } @@ -628,12 +629,15 @@ static int elevator_change_done(struct request_queue *q, int ret = 0;
if (ctx->old) { - struct elevator_resources res = {.et = ctx->old->et}; + struct elevator_resources res = { + .et = ctx->old->et, + .data = ctx->old->elevator_data + }; bool enable_wbt = test_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT, &ctx->old->flags);
elv_unregister_queue(q, ctx->old); - blk_mq_free_sched_res(&res, q->tag_set); + blk_mq_free_sched_res(&res, ctx->old->type, q->tag_set); kobject_put(&ctx->old->kobj); if (enable_wbt) wbt_enable_default(q->disk); @@ -658,7 +662,8 @@ static int elevator_change(struct request_queue *q, struct elv_change_ctx *ctx) lockdep_assert_held(&set->update_nr_hwq_lock);
if (strncmp(ctx->name, "none", 4)) { - ret = blk_mq_alloc_sched_res(q, &ctx->res, set->nr_hw_queues); + ret = blk_mq_alloc_sched_res(q, ctx->type, &ctx->res, + set->nr_hw_queues); if (ret) return ret; } @@ -681,11 +686,12 @@ static int elevator_change(struct request_queue *q, struct elv_change_ctx *ctx) blk_mq_unfreeze_queue(q, memflags); if (!ret) ret = elevator_change_done(q, ctx); + /* * Free sched resource if it's allocated but we couldn't switch elevator. */ if (!ctx->new) - blk_mq_free_sched_res(&ctx->res, set); + blk_mq_free_sched_res(&ctx->res, ctx->type, set);
return ret; } @@ -711,11 +717,12 @@ void elv_update_nr_hw_queues(struct request_queue *q, blk_mq_unfreeze_queue_nomemrestore(q); if (!ret) WARN_ON_ONCE(elevator_change_done(q, ctx)); + /* * Free sched resource if it's allocated but we couldn't switch elevator. */ if (!ctx->new) - blk_mq_free_sched_res(&ctx->res, set); + blk_mq_free_sched_res(&ctx->res, ctx->type, set); }
/* @@ -729,7 +736,6 @@ void elevator_set_default(struct request_queue *q) .no_uevent = true, }; int err; - struct elevator_type *e;
/* now we allow to switch elevator */ blk_queue_flag_clear(QUEUE_FLAG_NO_ELV_SWITCH, q); @@ -742,8 +748,8 @@ void elevator_set_default(struct request_queue *q) * have multiple queues or mq-deadline is not available, default * to "none". */ - e = elevator_find_get(ctx.name); - if (!e) + ctx.type = elevator_find_get(ctx.name); + if (!ctx.type) return;
if ((q->nr_hw_queues == 1 || @@ -753,7 +759,7 @@ void elevator_set_default(struct request_queue *q) pr_warn(""%s" elevator initialization, failed %d, falling back to "none"\n", ctx.name, err); } - elevator_put(e); + elevator_put(ctx.type); }
void elevator_set_none(struct request_queue *q) @@ -802,6 +808,7 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf, ctx.name = strstrip(elevator_name);
elv_iosched_load_module(ctx.name); + ctx.type = elevator_find_get(ctx.name);
down_read(&set->update_nr_hwq_lock); if (!blk_queue_no_elv_switch(q)) { @@ -812,6 +819,9 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf, ret = -ENOENT; } up_read(&set->update_nr_hwq_lock); + + if (ctx.type) + elevator_put(ctx.type); return ret; }
diff --git a/block/elevator.h b/block/elevator.h index e34043f6da26..3ee1d494f48a 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -33,6 +33,8 @@ struct elevator_tags { };
struct elevator_resources { + /* holds elevator data */ + void *data; /* holds elevator tags */ struct elevator_tags *et; }; @@ -185,7 +187,7 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *page, size_t count);
extern bool elv_bio_merge_ok(struct request *, struct bio *); struct elevator_queue *elevator_alloc(struct request_queue *, - struct elevator_type *, struct elevator_tags *); + struct elevator_type *, struct elevator_resources *);
/* * Helper functions.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 9869d3a6fed381f3b98404e26e1afc75d680cbf9 ]
When wbt_enable_default() is moved out of queue freezing in elevator_change(), it can cause the wbt inflight counter to become negative (-1), leading to hung tasks in the writeback path. Tasks get stuck in wbt_wait() because the counter is in an inconsistent state.
The issue occurs because wbt_enable_default() could race with IO submission, allowing the counter to be decremented before proper initialization. This manifests as:
rq_wait[0]: inflight: -1 has_waiters: True
rwb_enabled() checks the state, which can be updated exactly between wbt_wait() (rq_qos_throttle()) and wbt_track()(rq_qos_track()), then the inflight counter will become negative.
And results in hung task warnings like: task:kworker/u24:39 state:D stack:0 pid:14767 Call Trace: rq_qos_wait+0xb4/0x150 wbt_wait+0xa9/0x100 __rq_qos_throttle+0x24/0x40 blk_mq_submit_bio+0x672/0x7b0 ...
Fix this by:
1. Splitting wbt_enable_default() into: - __wbt_enable_default(): Returns true if wbt_init() should be called - wbt_enable_default(): Wrapper for existing callers (no init) - wbt_init_enable_default(): New function that checks and inits WBT
2. Using wbt_init_enable_default() in blk_register_queue() to ensure proper initialization during queue registration
3. Move wbt_init() out of wbt_enable_default() which is only for enabling disabled wbt from bfq and iocost, and wbt_init() isn't needed. Then the original lock warning can be avoided.
4. Removing the ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT flag and its handling code since it's no longer needed
This ensures WBT is properly initialized before any IO can be submitted, preventing the counter from going negative.
Cc: Nilay Shroff nilay@linux.ibm.com Cc: Yu Kuai yukuai@fnnas.com Cc: Guangwu Zhang guazhang@redhat.com Fixes: 78c271344b6f ("block: move wbt_enable_default() out of queue freezing from sched ->exit()") Signed-off-by: Ming Lei ming.lei@redhat.com Reviewed-by: Nilay Shroff nilay@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/bfq-iosched.c | 2 +- block/blk-sysfs.c | 2 +- block/blk-wbt.c | 20 ++++++++++++++++---- block/blk-wbt.h | 5 +++++ block/elevator.c | 4 ---- block/elevator.h | 1 - 6 files changed, 23 insertions(+), 11 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 4a8d3d96bfe4..6e54b1d3d8bc 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -7181,7 +7181,7 @@ static void bfq_exit_queue(struct elevator_queue *e)
blk_stat_disable_accounting(bfqd->queue); blk_queue_flag_clear(QUEUE_FLAG_DISABLE_WBT_DEF, bfqd->queue); - set_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT, &e->flags); + wbt_enable_default(bfqd->queue->disk);
kfree(bfqd); } diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 76c47fe9b8d6..c0e4daaf9610 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -942,7 +942,7 @@ int blk_register_queue(struct gendisk *disk) elevator_set_default(q);
blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q); - wbt_enable_default(disk); + wbt_init_enable_default(disk);
/* Now everything is ready and send out KOBJ_ADD uevent */ kobject_uevent(&disk->queue_kobj, KOBJ_ADD); diff --git a/block/blk-wbt.c b/block/blk-wbt.c index eb8037bae0bd..0974875f77bd 100644 --- a/block/blk-wbt.c +++ b/block/blk-wbt.c @@ -699,7 +699,7 @@ static void wbt_requeue(struct rq_qos *rqos, struct request *rq) /* * Enable wbt if defaults are configured that way */ -void wbt_enable_default(struct gendisk *disk) +static bool __wbt_enable_default(struct gendisk *disk) { struct request_queue *q = disk->queue; struct rq_qos *rqos; @@ -716,19 +716,31 @@ void wbt_enable_default(struct gendisk *disk) if (enable && RQWB(rqos)->enable_state == WBT_STATE_OFF_DEFAULT) RQWB(rqos)->enable_state = WBT_STATE_ON_DEFAULT; mutex_unlock(&disk->rqos_state_mutex); - return; + return false; } mutex_unlock(&disk->rqos_state_mutex);
/* Queue not registered? Maybe shutting down... */ if (!blk_queue_registered(q)) - return; + return false;
if (queue_is_mq(q) && enable) - wbt_init(disk); + return true; + return false; +} + +void wbt_enable_default(struct gendisk *disk) +{ + __wbt_enable_default(disk); } EXPORT_SYMBOL_GPL(wbt_enable_default);
+void wbt_init_enable_default(struct gendisk *disk) +{ + if (__wbt_enable_default(disk)) + WARN_ON_ONCE(wbt_init(disk)); +} + u64 wbt_default_latency_nsec(struct request_queue *q) { /* diff --git a/block/blk-wbt.h b/block/blk-wbt.h index e5fc653b9b76..925f22475738 100644 --- a/block/blk-wbt.h +++ b/block/blk-wbt.h @@ -5,6 +5,7 @@ #ifdef CONFIG_BLK_WBT
int wbt_init(struct gendisk *disk); +void wbt_init_enable_default(struct gendisk *disk); void wbt_disable_default(struct gendisk *disk); void wbt_enable_default(struct gendisk *disk);
@@ -16,6 +17,10 @@ u64 wbt_default_latency_nsec(struct request_queue *);
#else
+static inline void wbt_init_enable_default(struct gendisk *disk) +{ +} + static inline void wbt_disable_default(struct gendisk *disk) { } diff --git a/block/elevator.c b/block/elevator.c index 5b37ef44f52d..a2f8b2251dc6 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -633,14 +633,10 @@ static int elevator_change_done(struct request_queue *q, .et = ctx->old->et, .data = ctx->old->elevator_data }; - bool enable_wbt = test_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT, - &ctx->old->flags);
elv_unregister_queue(q, ctx->old); blk_mq_free_sched_res(&res, ctx->old->type, q->tag_set); kobject_put(&ctx->old->kobj); - if (enable_wbt) - wbt_enable_default(q->disk); } if (ctx->new) { ret = elv_register_queue(q, ctx->new, !ctx->no_uevent); diff --git a/block/elevator.h b/block/elevator.h index 3ee1d494f48a..021726376042 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -156,7 +156,6 @@ struct elevator_queue
#define ELEVATOR_FLAG_REGISTERED 0 #define ELEVATOR_FLAG_DYING 1 -#define ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT 2
/* * block elevator interface
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prajna Rajendra Kumar prajna.rajendrakumar@microchip.com
[ Upstream commit 71c814e98696f2cd53e9e6cef7501c2d667d4c5a ]
The spi-microchip-core.c driver provides support for the Microchip PolarFire SoC (MPFS) "hard" SPI controller. It was originally named "core" with the expectation that it might also cover Microchip's CoreSPI "soft" IP, but that never materialized.
The CoreSPI IP cannot be supported by this driver because its register layout differs substantially from the MPFS SPI controller. In practice most of the code would need to be replaced to handle those differences so keeping the drivers separate is the simpler approach.
The file and internal symbols are renamed to reflect MPFS support and to free up "spi-microchip-core.c" for CoreSPI driver.
Fixes: 9ac8d17694b6 ("spi: add support for microchip fpga spi controllers") Signed-off-by: Prajna Rajendra Kumar prajna.rajendrakumar@microchip.com Acked-by: Conor Dooley conor.dooley@microchip.com Link: https://patch.msgid.link/20251114104545.284765-2-prajna.rajendrakumar@microc... Signed-off-by: Mark Brown broonie@kernel.org Stable-dep-of: a8a313612af7 ("spi: mpfs: Fix an error handling path in mpfs_spi_probe()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/Kconfig | 19 +- drivers/spi/Makefile | 2 +- .../spi/{spi-microchip-core.c => spi-mpfs.c} | 207 +++++++++--------- 3 files changed, 115 insertions(+), 113 deletions(-) rename drivers/spi/{spi-microchip-core.c => spi-mpfs.c} (68%)
diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig index 55675750182e..1872f9d54a5c 100644 --- a/drivers/spi/Kconfig +++ b/drivers/spi/Kconfig @@ -706,15 +706,6 @@ config SPI_MESON_SPIFC This enables master mode support for the SPIFC (SPI flash controller) available in Amlogic Meson SoCs.
-config SPI_MICROCHIP_CORE - tristate "Microchip FPGA SPI controllers" - depends on SPI_MASTER - help - This enables the SPI driver for Microchip FPGA SPI controllers. - Say Y or M here if you want to use the "hard" controllers on - PolarFire SoC. - If built as a module, it will be called spi-microchip-core. - config SPI_MICROCHIP_CORE_QSPI tristate "Microchip FPGA QSPI controllers" depends on SPI_MASTER @@ -871,6 +862,16 @@ config SPI_PL022 controller. If you have an embedded system with an AMBA(R) bus and a PL022 controller, say Y or M here.
+config SPI_POLARFIRE_SOC + + tristate "Microchip FPGA SPI controllers" + depends on SPI_MASTER && ARCH_MICROCHIP + help + This enables the SPI driver for Microchip FPGA SPI controllers. + Say Y or M here if you want to use the "hard" controllers on + PolarFire SoC. + If built as a module, it will be called spi-mpfs. + config SPI_PPC4xx tristate "PPC4xx SPI Controller" depends on PPC32 && 4xx diff --git a/drivers/spi/Makefile b/drivers/spi/Makefile index 8ff74a13faaa..1f7c06a3091d 100644 --- a/drivers/spi/Makefile +++ b/drivers/spi/Makefile @@ -86,7 +86,6 @@ obj-$(CONFIG_SPI_LOONGSON_PLATFORM) += spi-loongson-plat.o obj-$(CONFIG_SPI_LP8841_RTC) += spi-lp8841-rtc.o obj-$(CONFIG_SPI_MESON_SPICC) += spi-meson-spicc.o obj-$(CONFIG_SPI_MESON_SPIFC) += spi-meson-spifc.o -obj-$(CONFIG_SPI_MICROCHIP_CORE) += spi-microchip-core.o obj-$(CONFIG_SPI_MICROCHIP_CORE_QSPI) += spi-microchip-core-qspi.o obj-$(CONFIG_SPI_MPC512x_PSC) += spi-mpc512x-psc.o obj-$(CONFIG_SPI_MPC52xx_PSC) += spi-mpc52xx-psc.o @@ -97,6 +96,7 @@ obj-$(CONFIG_SPI_MTK_NOR) += spi-mtk-nor.o obj-$(CONFIG_SPI_MTK_SNFI) += spi-mtk-snfi.o obj-$(CONFIG_SPI_MXIC) += spi-mxic.o obj-$(CONFIG_SPI_MXS) += spi-mxs.o +obj-$(CONFIG_SPI_POLARFIRE_SOC) += spi-mpfs.o obj-$(CONFIG_SPI_WPCM_FIU) += spi-wpcm-fiu.o obj-$(CONFIG_SPI_NPCM_FIU) += spi-npcm-fiu.o obj-$(CONFIG_SPI_NPCM_PSPI) += spi-npcm-pspi.o diff --git a/drivers/spi/spi-microchip-core.c b/drivers/spi/spi-mpfs.c similarity index 68% rename from drivers/spi/spi-microchip-core.c rename to drivers/spi/spi-mpfs.c index 9128b86c5366..9a14d1732a15 100644 --- a/drivers/spi/spi-microchip-core.c +++ b/drivers/spi/spi-mpfs.c @@ -99,7 +99,7 @@ #define REG_CTRL2 (0x48) #define REG_FRAMESUP (0x50)
-struct mchp_corespi { +struct mpfs_spi { void __iomem *regs; struct clk *clk; const u8 *tx_buf; @@ -113,34 +113,34 @@ struct mchp_corespi { int n_bytes; };
-static inline u32 mchp_corespi_read(struct mchp_corespi *spi, unsigned int reg) +static inline u32 mpfs_spi_read(struct mpfs_spi *spi, unsigned int reg) { return readl(spi->regs + reg); }
-static inline void mchp_corespi_write(struct mchp_corespi *spi, unsigned int reg, u32 val) +static inline void mpfs_spi_write(struct mpfs_spi *spi, unsigned int reg, u32 val) { writel(val, spi->regs + reg); }
-static inline void mchp_corespi_disable(struct mchp_corespi *spi) +static inline void mpfs_spi_disable(struct mpfs_spi *spi) { - u32 control = mchp_corespi_read(spi, REG_CONTROL); + u32 control = mpfs_spi_read(spi, REG_CONTROL);
control &= ~CONTROL_ENABLE;
- mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control); }
-static inline void mchp_corespi_read_fifo(struct mchp_corespi *spi, int fifo_max) +static inline void mpfs_spi_read_fifo(struct mpfs_spi *spi, int fifo_max) { for (int i = 0; i < fifo_max; i++) { u32 data;
- while (mchp_corespi_read(spi, REG_STATUS) & STATUS_RXFIFO_EMPTY) + while (mpfs_spi_read(spi, REG_STATUS) & STATUS_RXFIFO_EMPTY) ;
- data = mchp_corespi_read(spi, REG_RX_DATA); + data = mpfs_spi_read(spi, REG_RX_DATA);
spi->rx_len -= spi->n_bytes;
@@ -158,34 +158,34 @@ static inline void mchp_corespi_read_fifo(struct mchp_corespi *spi, int fifo_max } }
-static void mchp_corespi_enable_ints(struct mchp_corespi *spi) +static void mpfs_spi_enable_ints(struct mpfs_spi *spi) { - u32 control = mchp_corespi_read(spi, REG_CONTROL); + u32 control = mpfs_spi_read(spi, REG_CONTROL);
control |= INT_ENABLE_MASK; - mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control); }
-static void mchp_corespi_disable_ints(struct mchp_corespi *spi) +static void mpfs_spi_disable_ints(struct mpfs_spi *spi) { - u32 control = mchp_corespi_read(spi, REG_CONTROL); + u32 control = mpfs_spi_read(spi, REG_CONTROL);
control &= ~INT_ENABLE_MASK; - mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control); }
-static inline void mchp_corespi_set_xfer_size(struct mchp_corespi *spi, int len) +static inline void mpfs_spi_set_xfer_size(struct mpfs_spi *spi, int len) { u32 control; u32 lenpart; - u32 frames = mchp_corespi_read(spi, REG_FRAMESUP); + u32 frames = mpfs_spi_read(spi, REG_FRAMESUP);
/* * Writing to FRAMECNT in REG_CONTROL will reset the frame count, taking * a shortcut requires an explicit clear. */ if (frames == len) { - mchp_corespi_write(spi, REG_COMMAND, COMMAND_CLRFRAMECNT); + mpfs_spi_write(spi, REG_COMMAND, COMMAND_CLRFRAMECNT); return; }
@@ -208,20 +208,20 @@ static inline void mchp_corespi_set_xfer_size(struct mchp_corespi *spi, int len) * that matches the documentation. */ lenpart = len & 0xffff; - control = mchp_corespi_read(spi, REG_CONTROL); + control = mpfs_spi_read(spi, REG_CONTROL); control &= ~CONTROL_FRAMECNT_MASK; control |= lenpart << CONTROL_FRAMECNT_SHIFT; - mchp_corespi_write(spi, REG_CONTROL, control); - mchp_corespi_write(spi, REG_FRAMESUP, len); + mpfs_spi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_FRAMESUP, len); }
-static inline void mchp_corespi_write_fifo(struct mchp_corespi *spi, int fifo_max) +static inline void mpfs_spi_write_fifo(struct mpfs_spi *spi, int fifo_max) { int i = 0;
- mchp_corespi_set_xfer_size(spi, fifo_max); + mpfs_spi_set_xfer_size(spi, fifo_max);
- while ((i < fifo_max) && !(mchp_corespi_read(spi, REG_STATUS) & STATUS_TXFIFO_FULL)) { + while ((i < fifo_max) && !(mpfs_spi_read(spi, REG_STATUS) & STATUS_TXFIFO_FULL)) { u32 word;
if (spi->n_bytes == 4) @@ -231,7 +231,7 @@ static inline void mchp_corespi_write_fifo(struct mchp_corespi *spi, int fifo_ma else word = spi->tx_buf ? *spi->tx_buf : 0xaa;
- mchp_corespi_write(spi, REG_TX_DATA, word); + mpfs_spi_write(spi, REG_TX_DATA, word); if (spi->tx_buf) spi->tx_buf += spi->n_bytes; i++; @@ -240,9 +240,9 @@ static inline void mchp_corespi_write_fifo(struct mchp_corespi *spi, int fifo_ma spi->tx_len -= i * spi->n_bytes; }
-static inline void mchp_corespi_set_framesize(struct mchp_corespi *spi, int bt) +static inline void mpfs_spi_set_framesize(struct mpfs_spi *spi, int bt) { - u32 frame_size = mchp_corespi_read(spi, REG_FRAME_SIZE); + u32 frame_size = mpfs_spi_read(spi, REG_FRAME_SIZE); u32 control;
if ((frame_size & FRAME_SIZE_MASK) == bt) @@ -252,25 +252,25 @@ static inline void mchp_corespi_set_framesize(struct mchp_corespi *spi, int bt) * Disable the SPI controller. Writes to the frame size have * no effect when the controller is enabled. */ - control = mchp_corespi_read(spi, REG_CONTROL); + control = mpfs_spi_read(spi, REG_CONTROL); control &= ~CONTROL_ENABLE; - mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control);
- mchp_corespi_write(spi, REG_FRAME_SIZE, bt); + mpfs_spi_write(spi, REG_FRAME_SIZE, bt);
control |= CONTROL_ENABLE; - mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control); }
-static void mchp_corespi_set_cs(struct spi_device *spi, bool disable) +static void mpfs_spi_set_cs(struct spi_device *spi, bool disable) { u32 reg; - struct mchp_corespi *corespi = spi_controller_get_devdata(spi->controller); + struct mpfs_spi *mspi = spi_controller_get_devdata(spi->controller);
- reg = mchp_corespi_read(corespi, REG_SLAVE_SELECT); + reg = mpfs_spi_read(mspi, REG_SLAVE_SELECT); reg &= ~BIT(spi_get_chipselect(spi, 0)); reg |= !disable << spi_get_chipselect(spi, 0); - corespi->pending_slave_select = reg; + mspi->pending_slave_select = reg;
/* * Only deassert chip select immediately. Writing to some registers @@ -281,12 +281,12 @@ static void mchp_corespi_set_cs(struct spi_device *spi, bool disable) * doesn't see any spurious clock transitions whilst CS is enabled. */ if (((spi->mode & SPI_CS_HIGH) == 0) == disable) - mchp_corespi_write(corespi, REG_SLAVE_SELECT, reg); + mpfs_spi_write(mspi, REG_SLAVE_SELECT, reg); }
-static int mchp_corespi_setup(struct spi_device *spi) +static int mpfs_spi_setup(struct spi_device *spi) { - struct mchp_corespi *corespi = spi_controller_get_devdata(spi->controller); + struct mpfs_spi *mspi = spi_controller_get_devdata(spi->controller); u32 reg;
if (spi_is_csgpiod(spi)) @@ -298,21 +298,21 @@ static int mchp_corespi_setup(struct spi_device *spi) * driving their select line low. */ if (spi->mode & SPI_CS_HIGH) { - reg = mchp_corespi_read(corespi, REG_SLAVE_SELECT); + reg = mpfs_spi_read(mspi, REG_SLAVE_SELECT); reg |= BIT(spi_get_chipselect(spi, 0)); - corespi->pending_slave_select = reg; - mchp_corespi_write(corespi, REG_SLAVE_SELECT, reg); + mspi->pending_slave_select = reg; + mpfs_spi_write(mspi, REG_SLAVE_SELECT, reg); } return 0; }
-static void mchp_corespi_init(struct spi_controller *host, struct mchp_corespi *spi) +static void mpfs_spi_init(struct spi_controller *host, struct mpfs_spi *spi) { unsigned long clk_hz; - u32 control = mchp_corespi_read(spi, REG_CONTROL); + u32 control = mpfs_spi_read(spi, REG_CONTROL);
control &= ~CONTROL_ENABLE; - mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control);
control |= CONTROL_MASTER; control &= ~CONTROL_MODE_MASK; @@ -328,15 +328,15 @@ static void mchp_corespi_init(struct spi_controller *host, struct mchp_corespi * */ control |= CONTROL_SPS | CONTROL_BIGFIFO;
- mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control);
- mchp_corespi_set_framesize(spi, DEFAULT_FRAMESIZE); + mpfs_spi_set_framesize(spi, DEFAULT_FRAMESIZE);
/* max. possible spi clock rate is the apb clock rate */ clk_hz = clk_get_rate(spi->clk); host->max_speed_hz = clk_hz;
- mchp_corespi_enable_ints(spi); + mpfs_spi_enable_ints(spi);
/* * It is required to enable direct mode, otherwise control over the chip @@ -344,34 +344,34 @@ static void mchp_corespi_init(struct spi_controller *host, struct mchp_corespi * * can deal with active high targets. */ spi->pending_slave_select = SSELOUT | SSEL_DIRECT; - mchp_corespi_write(spi, REG_SLAVE_SELECT, spi->pending_slave_select); + mpfs_spi_write(spi, REG_SLAVE_SELECT, spi->pending_slave_select);
- control = mchp_corespi_read(spi, REG_CONTROL); + control = mpfs_spi_read(spi, REG_CONTROL);
control &= ~CONTROL_RESET; control |= CONTROL_ENABLE;
- mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control); }
-static inline void mchp_corespi_set_clk_gen(struct mchp_corespi *spi) +static inline void mpfs_spi_set_clk_gen(struct mpfs_spi *spi) { u32 control;
- control = mchp_corespi_read(spi, REG_CONTROL); + control = mpfs_spi_read(spi, REG_CONTROL); if (spi->clk_mode) control |= CONTROL_CLKMODE; else control &= ~CONTROL_CLKMODE;
- mchp_corespi_write(spi, REG_CLK_GEN, spi->clk_gen); - mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CLK_GEN, spi->clk_gen); + mpfs_spi_write(spi, REG_CONTROL, control); }
-static inline void mchp_corespi_set_mode(struct mchp_corespi *spi, unsigned int mode) +static inline void mpfs_spi_set_mode(struct mpfs_spi *spi, unsigned int mode) { u32 mode_val; - u32 control = mchp_corespi_read(spi, REG_CONTROL); + u32 control = mpfs_spi_read(spi, REG_CONTROL);
switch (mode & SPI_MODE_X_MASK) { case SPI_MODE_0: @@ -394,22 +394,22 @@ static inline void mchp_corespi_set_mode(struct mchp_corespi *spi, unsigned int */
control &= ~CONTROL_ENABLE; - mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control);
control &= ~(SPI_MODE_X_MASK << MODE_X_MASK_SHIFT); control |= mode_val;
- mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control);
control |= CONTROL_ENABLE; - mchp_corespi_write(spi, REG_CONTROL, control); + mpfs_spi_write(spi, REG_CONTROL, control); }
-static irqreturn_t mchp_corespi_interrupt(int irq, void *dev_id) +static irqreturn_t mpfs_spi_interrupt(int irq, void *dev_id) { struct spi_controller *host = dev_id; - struct mchp_corespi *spi = spi_controller_get_devdata(host); - u32 intfield = mchp_corespi_read(spi, REG_MIS) & 0xf; + struct mpfs_spi *spi = spi_controller_get_devdata(host); + u32 intfield = mpfs_spi_read(spi, REG_MIS) & 0xf; bool finalise = false;
/* Interrupt line may be shared and not for us at all */ @@ -417,7 +417,7 @@ static irqreturn_t mchp_corespi_interrupt(int irq, void *dev_id) return IRQ_NONE;
if (intfield & INT_RX_CHANNEL_OVERFLOW) { - mchp_corespi_write(spi, REG_INT_CLEAR, INT_RX_CHANNEL_OVERFLOW); + mpfs_spi_write(spi, REG_INT_CLEAR, INT_RX_CHANNEL_OVERFLOW); finalise = true; dev_err(&host->dev, "%s: RX OVERFLOW: rxlen: %d, txlen: %d\n", __func__, @@ -425,7 +425,7 @@ static irqreturn_t mchp_corespi_interrupt(int irq, void *dev_id) }
if (intfield & INT_TX_CHANNEL_UNDERRUN) { - mchp_corespi_write(spi, REG_INT_CLEAR, INT_TX_CHANNEL_UNDERRUN); + mpfs_spi_write(spi, REG_INT_CLEAR, INT_TX_CHANNEL_UNDERRUN); finalise = true; dev_err(&host->dev, "%s: TX UNDERFLOW: rxlen: %d, txlen: %d\n", __func__, @@ -438,8 +438,8 @@ static irqreturn_t mchp_corespi_interrupt(int irq, void *dev_id) return IRQ_HANDLED; }
-static int mchp_corespi_calculate_clkgen(struct mchp_corespi *spi, - unsigned long target_hz) +static int mpfs_spi_calculate_clkgen(struct mpfs_spi *spi, + unsigned long target_hz) { unsigned long clk_hz, spi_hz, clk_gen;
@@ -475,20 +475,20 @@ static int mchp_corespi_calculate_clkgen(struct mchp_corespi *spi, return 0; }
-static int mchp_corespi_transfer_one(struct spi_controller *host, - struct spi_device *spi_dev, - struct spi_transfer *xfer) +static int mpfs_spi_transfer_one(struct spi_controller *host, + struct spi_device *spi_dev, + struct spi_transfer *xfer) { - struct mchp_corespi *spi = spi_controller_get_devdata(host); + struct mpfs_spi *spi = spi_controller_get_devdata(host); int ret;
- ret = mchp_corespi_calculate_clkgen(spi, (unsigned long)xfer->speed_hz); + ret = mpfs_spi_calculate_clkgen(spi, (unsigned long)xfer->speed_hz); if (ret) { dev_err(&host->dev, "failed to set clk_gen for target %u Hz\n", xfer->speed_hz); return ret; }
- mchp_corespi_set_clk_gen(spi); + mpfs_spi_set_clk_gen(spi);
spi->tx_buf = xfer->tx_buf; spi->rx_buf = xfer->rx_buf; @@ -496,45 +496,46 @@ static int mchp_corespi_transfer_one(struct spi_controller *host, spi->rx_len = xfer->len; spi->n_bytes = roundup_pow_of_two(DIV_ROUND_UP(xfer->bits_per_word, BITS_PER_BYTE));
- mchp_corespi_set_framesize(spi, xfer->bits_per_word); + mpfs_spi_set_framesize(spi, xfer->bits_per_word);
- mchp_corespi_write(spi, REG_COMMAND, COMMAND_RXFIFORST | COMMAND_TXFIFORST); + mpfs_spi_write(spi, REG_COMMAND, COMMAND_RXFIFORST | COMMAND_TXFIFORST);
- mchp_corespi_write(spi, REG_SLAVE_SELECT, spi->pending_slave_select); + mpfs_spi_write(spi, REG_SLAVE_SELECT, spi->pending_slave_select);
while (spi->tx_len) { int fifo_max = DIV_ROUND_UP(min(spi->tx_len, FIFO_DEPTH), spi->n_bytes);
- mchp_corespi_write_fifo(spi, fifo_max); - mchp_corespi_read_fifo(spi, fifo_max); + mpfs_spi_write_fifo(spi, fifo_max); + mpfs_spi_read_fifo(spi, fifo_max); }
spi_finalize_current_transfer(host); return 1; }
-static int mchp_corespi_prepare_message(struct spi_controller *host, - struct spi_message *msg) +static int mpfs_spi_prepare_message(struct spi_controller *host, + struct spi_message *msg) { struct spi_device *spi_dev = msg->spi; - struct mchp_corespi *spi = spi_controller_get_devdata(host); + struct mpfs_spi *spi = spi_controller_get_devdata(host);
- mchp_corespi_set_mode(spi, spi_dev->mode); + mpfs_spi_set_mode(spi, spi_dev->mode);
return 0; }
-static int mchp_corespi_probe(struct platform_device *pdev) +static int mpfs_spi_probe(struct platform_device *pdev) { struct spi_controller *host; - struct mchp_corespi *spi; + struct mpfs_spi *spi; struct resource *res; u32 num_cs; int ret = 0;
host = devm_spi_alloc_host(&pdev->dev, sizeof(*spi)); if (!host) - return -ENOMEM; + return dev_err_probe(&pdev->dev, -ENOMEM, + "unable to allocate host for SPI controller\n");
platform_set_drvdata(pdev, host);
@@ -544,11 +545,11 @@ static int mchp_corespi_probe(struct platform_device *pdev) host->num_chipselect = num_cs; host->mode_bits = SPI_CPOL | SPI_CPHA | SPI_CS_HIGH; host->use_gpio_descriptors = true; - host->setup = mchp_corespi_setup; + host->setup = mpfs_spi_setup; host->bits_per_word_mask = SPI_BPW_RANGE_MASK(1, 32); - host->transfer_one = mchp_corespi_transfer_one; - host->prepare_message = mchp_corespi_prepare_message; - host->set_cs = mchp_corespi_set_cs; + host->transfer_one = mpfs_spi_transfer_one; + host->prepare_message = mpfs_spi_prepare_message; + host->set_cs = mpfs_spi_set_cs; host->dev.of_node = pdev->dev.of_node;
spi = spi_controller_get_devdata(host); @@ -561,7 +562,7 @@ static int mchp_corespi_probe(struct platform_device *pdev) if (spi->irq < 0) return spi->irq;
- ret = devm_request_irq(&pdev->dev, spi->irq, mchp_corespi_interrupt, + ret = devm_request_irq(&pdev->dev, spi->irq, mpfs_spi_interrupt, IRQF_SHARED, dev_name(&pdev->dev), host); if (ret) return dev_err_probe(&pdev->dev, ret, @@ -572,11 +573,11 @@ static int mchp_corespi_probe(struct platform_device *pdev) return dev_err_probe(&pdev->dev, PTR_ERR(spi->clk), "could not get clk\n");
- mchp_corespi_init(host, spi); + mpfs_spi_init(host, spi);
ret = devm_spi_register_controller(&pdev->dev, host); if (ret) { - mchp_corespi_disable(spi); + mpfs_spi_disable(spi); return dev_err_probe(&pdev->dev, ret, "unable to register host for SPI controller\n"); } @@ -586,13 +587,13 @@ static int mchp_corespi_probe(struct platform_device *pdev) return 0; }
-static void mchp_corespi_remove(struct platform_device *pdev) +static void mpfs_spi_remove(struct platform_device *pdev) { struct spi_controller *host = platform_get_drvdata(pdev); - struct mchp_corespi *spi = spi_controller_get_devdata(host); + struct mpfs_spi *spi = spi_controller_get_devdata(host);
- mchp_corespi_disable_ints(spi); - mchp_corespi_disable(spi); + mpfs_spi_disable_ints(spi); + mpfs_spi_disable(spi); }
#define MICROCHIP_SPI_PM_OPS (NULL) @@ -602,23 +603,23 @@ static void mchp_corespi_remove(struct platform_device *pdev) */
#if defined(CONFIG_OF) -static const struct of_device_id mchp_corespi_dt_ids[] = { +static const struct of_device_id mpfs_spi_dt_ids[] = { { .compatible = "microchip,mpfs-spi" }, { /* sentinel */ } }; -MODULE_DEVICE_TABLE(of, mchp_corespi_dt_ids); +MODULE_DEVICE_TABLE(of, mpfs_spi_dt_ids); #endif
-static struct platform_driver mchp_corespi_driver = { - .probe = mchp_corespi_probe, +static struct platform_driver mpfs_spi_driver = { + .probe = mpfs_spi_probe, .driver = { - .name = "microchip-corespi", + .name = "microchip-spi", .pm = MICROCHIP_SPI_PM_OPS, - .of_match_table = of_match_ptr(mchp_corespi_dt_ids), + .of_match_table = of_match_ptr(mpfs_spi_dt_ids), }, - .remove = mchp_corespi_remove, + .remove = mpfs_spi_remove, }; -module_platform_driver(mchp_corespi_driver); +module_platform_driver(mpfs_spi_driver); MODULE_DESCRIPTION("Microchip coreSPI SPI controller driver"); MODULE_AUTHOR("Daire McNamara daire.mcnamara@microchip.com"); MODULE_AUTHOR("Conor Dooley conor.dooley@microchip.com");
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit a8a313612af7a55083ba5720f14f1835319debee ]
mpfs_spi_init() calls mpfs_spi_enable_ints(), so mpfs_spi_disable_ints() should be called if an error occurs after calling mpfs_spi_init(), as already done in the remove function.
Fixes: 9ac8d17694b6 ("spi: add support for microchip fpga spi controllers") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Link: https://patch.msgid.link/eb35f168517cc402ef7e78f26da02863e2f45c03.1765612110... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-mpfs.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/spi/spi-mpfs.c b/drivers/spi/spi-mpfs.c index 9a14d1732a15..7e9e64d8e6c8 100644 --- a/drivers/spi/spi-mpfs.c +++ b/drivers/spi/spi-mpfs.c @@ -577,6 +577,7 @@ static int mpfs_spi_probe(struct platform_device *pdev)
ret = devm_spi_register_controller(&pdev->dev, host); if (ret) { + mpfs_spi_disable_ints(spi); mpfs_spi_disable(spi); return dev_err_probe(&pdev->dev, ret, "unable to register host for SPI controller\n");
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Denis Sergeev denserg.edu@gmail.com
[ Upstream commit 46c28bbbb150b80827e4bcbea231560af9d16854 ]
The fan nominal speed returned by SMM is limited to 16 bits, but the driver allows the fan multiplier to be set via a module parameter.
Clamp the computed fan multiplier so that fan_nominal_speed * i8k_fan_mult always fits into a signed 32-bit integer and refuse to initialize the driver if the value is too large.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 20bdeebc88269 ("hwmon: (dell-smm) Introduce helper function for data init") Signed-off-by: Denis Sergeev denserg.edu@gmail.com Link: https://lore.kernel.org/r/20251209063706.49008-1-denserg.edu@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/dell-smm-hwmon.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c index cbe1a74a3dee..f0e8a9bc0d0e 100644 --- a/drivers/hwmon/dell-smm-hwmon.c +++ b/drivers/hwmon/dell-smm-hwmon.c @@ -76,6 +76,9 @@ #define DELL_SMM_NO_TEMP 10 #define DELL_SMM_NO_FANS 4
+/* limit fan multiplier to avoid overflow */ +#define DELL_SMM_MAX_FAN_MULT (INT_MAX / U16_MAX) + struct smm_regs { unsigned int eax; unsigned int ebx; @@ -1253,6 +1256,12 @@ static int dell_smm_init_data(struct device *dev, const struct dell_smm_ops *ops data->ops = ops; /* All options must not be 0 */ data->i8k_fan_mult = fan_mult ? : I8K_FAN_MULT; + if (data->i8k_fan_mult > DELL_SMM_MAX_FAN_MULT) { + dev_err(dev, + "fan multiplier %u is too large (max %u)\n", + data->i8k_fan_mult, DELL_SMM_MAX_FAN_MULT); + return -EINVAL; + } data->i8k_fan_max = fan_max ? : I8K_FAN_HIGH; data->i8k_pwm_mult = DIV_ROUND_UP(255, data->i8k_fan_max);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junrui Luo moonafterrain@outlook.com
[ Upstream commit 6946c726c3f4c36f0f049e6f97e88c510b15f65d ]
The ibmpex_high_low_store() function retrieves driver data using dev_get_drvdata() and uses it without validation. This creates a race condition where the sysfs callback can be invoked after the data structure is freed, leading to use-after-free.
Fix by adding a NULL check after dev_get_drvdata(), and reordering operations in the deletion path to prevent TOCTOU.
Reported-by: Yuhao Jiang danisjiang@gmail.com Reported-by: Junrui Luo moonafterrain@outlook.com Fixes: 57c7c3a0fdea ("hwmon: IBM power meter driver") Signed-off-by: Junrui Luo moonafterrain@outlook.com Link: https://lore.kernel.org/r/MEYPR01MB7886BE2F51BFE41875B74B60AFA0A@MEYPR01MB78... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/ibmpex.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/ibmpex.c b/drivers/hwmon/ibmpex.c index 228c5f6c6f38..129f3a9e8fe9 100644 --- a/drivers/hwmon/ibmpex.c +++ b/drivers/hwmon/ibmpex.c @@ -277,6 +277,9 @@ static ssize_t ibmpex_high_low_store(struct device *dev, { struct ibmpex_bmc_data *data = dev_get_drvdata(dev);
+ if (!data) + return -ENODEV; + ibmpex_reset_high_low_data(data);
return count; @@ -508,6 +511,9 @@ static void ibmpex_bmc_delete(struct ibmpex_bmc_data *data) { int i, j;
+ hwmon_device_unregister(data->hwmon_dev); + dev_set_drvdata(data->bmc_device, NULL); + device_remove_file(data->bmc_device, &sensor_dev_attr_reset_high_low.dev_attr); device_remove_file(data->bmc_device, &dev_attr_name.attr); @@ -521,8 +527,7 @@ static void ibmpex_bmc_delete(struct ibmpex_bmc_data *data) }
list_del(&data->list); - dev_set_drvdata(data->bmc_device, NULL); - hwmon_device_unregister(data->hwmon_dev); + ipmi_destroy_user(data->user); kfree(data->sensors); kfree(data);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Simakov bigalex934@gmail.com
[ Upstream commit 82f2aab35a1ab2e1460de06ef04c726460aed51c ]
The driver computes conversion intervals using the formula:
interval = (1 << (7 - rate)) * 125ms
where 'rate' is the sensor's conversion rate register value. According to the datasheet, the power-on reset value of this register is 0x8, which could be assigned to the register, after handling i2c general call. Using this default value causes a result greater than the bit width of left operand and an undefined behaviour in the calculation above, since shifting by values larger than the bit width is undefined behaviour as per C language standard.
Limit the maximum usable 'rate' value to 7 to prevent undefined behaviour in calculations.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Note (groeck): This does not matter in practice unless someone overwrites the chip configuration from outside the driver while the driver is loaded. The conversion time register is initialized with a value of 5 (500ms) when the driver is loaded, and the driver never writes a bad value.
Fixes: ca53e7640de7 ("hwmon: (tmp401) Convert to _info API") Signed-off-by: Alexey Simakov bigalex934@gmail.com Link: https://lore.kernel.org/r/20251211164342.6291-1-bigalex934@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp401.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/tmp401.c b/drivers/hwmon/tmp401.c index 02c5a3bb1071..84aaf817144c 100644 --- a/drivers/hwmon/tmp401.c +++ b/drivers/hwmon/tmp401.c @@ -401,7 +401,7 @@ static int tmp401_chip_read(struct device *dev, u32 attr, int channel, long *val ret = regmap_read(data->regmap, TMP401_CONVERSION_RATE, ®val); if (ret < 0) return ret; - *val = (1 << (7 - regval)) * 125; + *val = (1 << (7 - min(regval, 7))) * 125; break; case hwmon_chip_temp_reset_history: *val = 0;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuicheng Lin shuicheng.lin@intel.com
[ Upstream commit b32045d73bb4333a2cebc5d3c005807adb03ab58 ]
Ensure gt->freq is released when sysfs_create_files() fails in xe_gt_freq_init(). Without this, the kobject would leak. Add kobject_put() before returning the error.
Fixes: fdc81c43f0c1 ("drm/xe: use devm_add_action_or_reset() helper") Signed-off-by: Shuicheng Lin shuicheng.lin@intel.com Reviewed-by: Alex Zuo alex.zuo@intel.com Reviewed-by: Xin Wang x.wang@intel.com Link: https://patch.msgid.link/20251114205638.2184529-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com (cherry picked from commit 251be5fb4982ebb0f5a81b62d975bd770f3ad5c2) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_gt_freq.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_freq.c b/drivers/gpu/drm/xe/xe_gt_freq.c index 4ff1b6b58d6b..e8e70fd2e8c4 100644 --- a/drivers/gpu/drm/xe/xe_gt_freq.c +++ b/drivers/gpu/drm/xe/xe_gt_freq.c @@ -296,8 +296,10 @@ int xe_gt_freq_init(struct xe_gt *gt) return -ENOMEM;
err = sysfs_create_files(gt->freq, freq_attrs); - if (err) + if (err) { + kobject_put(gt->freq); return err; + }
err = devm_add_action_or_reset(xe->drm.dev, freq_fini, gt->freq); if (err)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vinay Belgaumkar vinay.belgaumkar@intel.com
[ Upstream commit c88a0731ed95f9705deb127a7f1927fa59aa742b ]
Wa_14020316580 was getting clobbered by power gating init code later in the driver load sequence. Move the Wa so that it applies correctly.
Fixes: 7cd05ef89c9d ("drm/xe/xe2hpm: Add initial set of workarounds") Suggested-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Vinay Belgaumkar vinay.belgaumkar@intel.com Reviewed-by: Riana Tauro riana.tauro@intel.com Reviewed-by: Matt Roper matthew.d.roper@intel.com Link: https://patch.msgid.link/20251129052548.70766-1-vinay.belgaumkar@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com (cherry picked from commit 8b5502145351bde87f522df082b9e41356898ba3) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_gt_idle.c | 8 ++++++++ drivers/gpu/drm/xe/xe_wa.c | 8 -------- drivers/gpu/drm/xe/xe_wa_oob.rules | 1 + 3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_idle.c b/drivers/gpu/drm/xe/xe_gt_idle.c index bdc9d9877ec4..3e3d1d52f630 100644 --- a/drivers/gpu/drm/xe/xe_gt_idle.c +++ b/drivers/gpu/drm/xe/xe_gt_idle.c @@ -5,6 +5,7 @@
#include <drm/drm_managed.h>
+#include <generated/xe_wa_oob.h> #include "xe_force_wake.h" #include "xe_device.h" #include "xe_gt.h" @@ -16,6 +17,7 @@ #include "xe_mmio.h" #include "xe_pm.h" #include "xe_sriov.h" +#include "xe_wa.h"
/** * DOC: Xe GT Idle @@ -145,6 +147,12 @@ void xe_gt_idle_enable_pg(struct xe_gt *gt) xe_mmio_write32(mmio, RENDER_POWERGATE_IDLE_HYSTERESIS, 25); }
+ if (XE_GT_WA(gt, 14020316580)) + gtidle->powergate_enable &= ~(VDN_HCP_POWERGATE_ENABLE(0) | + VDN_MFXVDENC_POWERGATE_ENABLE(0) | + VDN_HCP_POWERGATE_ENABLE(2) | + VDN_MFXVDENC_POWERGATE_ENABLE(2)); + xe_mmio_write32(mmio, POWERGATE_ENABLE, gtidle->powergate_enable); xe_force_wake_put(gt_to_fw(gt), fw_ref); } diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c index 3cf30718b200..d209434fd7fc 100644 --- a/drivers/gpu/drm/xe/xe_wa.c +++ b/drivers/gpu/drm/xe/xe_wa.c @@ -270,14 +270,6 @@ static const struct xe_rtp_entry_sr gt_was[] = { XE_RTP_ACTIONS(SET(VDBOX_CGCTL3F1C(0), MFXPIPE_CLKGATE_DIS)), XE_RTP_ENTRY_FLAG(FOREACH_ENGINE), }, - { XE_RTP_NAME("14020316580"), - XE_RTP_RULES(MEDIA_VERSION(1301)), - XE_RTP_ACTIONS(CLR(POWERGATE_ENABLE, - VDN_HCP_POWERGATE_ENABLE(0) | - VDN_MFXVDENC_POWERGATE_ENABLE(0) | - VDN_HCP_POWERGATE_ENABLE(2) | - VDN_MFXVDENC_POWERGATE_ENABLE(2))), - }, { XE_RTP_NAME("14019449301"), XE_RTP_RULES(MEDIA_VERSION(1301), ENGINE_CLASS(VIDEO_DECODE)), XE_RTP_ACTIONS(SET(VDBOX_CGCTL3F08(0), CG3DDISHRS_CLKGATE_DIS)), diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules index f3a6d5d239ce..1d32fa1c3b2d 100644 --- a/drivers/gpu/drm/xe/xe_wa_oob.rules +++ b/drivers/gpu/drm/xe/xe_wa_oob.rules @@ -81,3 +81,4 @@
15015404425_disable PLATFORM(PANTHERLAKE), MEDIA_STEP(B0, FOREVER) 16026007364 MEDIA_VERSION(3000) +14020316580 MEDIA_VERSION(1301)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 9acc3295813b9b846791fd3eab0a78a3144af560 ]
The Xe driver fails to build when CONFIG_DRM_XE_GPUSVM is disabled but CONFIG_DRM_GPUSVM is turned on, due to the clash of two commits:
In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:8: drivers/gpu/drm/xe/xe_svm.h: In function 'xe_svm_init': include/linux/stddef.h:8:14: error: passing argument 5 of 'drm_gpusvm_init' makes integer from pointer without a cast [-Wint-conversion] drivers/gpu/drm/xe/xe_svm.h:217:38: note: in expansion of macro 'NULL' 217 | NULL, NULL, 0, 0, 0, NULL, NULL, 0); | ^~~~ In file included from drivers/gpu/drm/xe/xe_bo_types.h:11, from drivers/gpu/drm/xe/xe_bo.h:11, from drivers/gpu/drm/xe/xe_vm_madvise.c:11: include/drm/drm_gpusvm.h:254:35: note: expected 'long unsigned int' but argument is of type 'void *' 254 | unsigned long mm_start, unsigned long mm_range, | ~~~~~~~~~~~~~~^~~~~~~~ In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:14: drivers/gpu/drm/xe/xe_svm.h:216:16: error: too many arguments to function 'drm_gpusvm_init'; expected 10, have 11 216 | return drm_gpusvm_init(&vm->svm.gpusvm, "Xe SVM (simple)", &vm->xe->drm, | ^~~~~~~~~~~~~~~ 217 | NULL, NULL, 0, 0, 0, NULL, NULL, 0); | ~ include/drm/drm_gpusvm.h:251:5: note: declared here
Adapt the caller to the new argument list by removing the extraneous NULL argument.
Fixes: 9e9787414882 ("drm/xe/userptr: replace xe_hmm with gpusvm") Fixes: 10aa5c806030 ("drm/gpusvm, drm/xe: Fix userptr to not allow device private pages") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Thomas Hellström thomas.hellstrom@linux.intel.com Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Link: https://patch.msgid.link/20251204094704.1030933-1-arnd@kernel.org (cherry picked from commit 29bce9c8b41d5c378263a927acb9a9074d0e7a0e) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_svm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 0955d2ac8d74..fa757dd07954 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -214,7 +214,7 @@ int xe_svm_init(struct xe_vm *vm) { #if IS_ENABLED(CONFIG_DRM_GPUSVM) return drm_gpusvm_init(&vm->svm.gpusvm, "Xe SVM (simple)", &vm->xe->drm, - NULL, NULL, 0, 0, 0, NULL, NULL, 0); + NULL, 0, 0, 0, NULL, NULL, 0); #else return 0; #endif
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junxiao Chang junxiao.chang@intel.com
[ Upstream commit 17445af7dcc7d645b6fb8951fd10c8b72cc7f23f ]
MEI GSC interrupt comes from i915 or xe driver. It has top half and bottom half. Top half is called from i915/xe interrupt handler. It should be in irq disabled context.
With RT kernel(PREEMPT_RT enabled), by default IRQ handler is in threaded IRQ. MEI GSC top half might be in threaded IRQ context. generic_handle_irq_safe API could be called from either IRQ or process context, it disables local IRQ then calls MEI GSC interrupt top half.
This change fixes B580 GPU boot issue with RT enabled.
Fixes: e02cea83d32d ("drm/xe/gsc: add Battlemage support") Tested-by: Baoli Zhang baoli.zhang@intel.com Signed-off-by: Junxiao Chang junxiao.chang@intel.com Reviewed-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Reviewed-by: Matthew Brost matthew.brost@intel.com Link: https://patch.msgid.link/20251107033152.834960-1-junxiao.chang@intel.com Signed-off-by: Maarten Lankhorst dev@lankhorst.se (cherry picked from commit 3efadf028783a49ab2941294187c8b6dd86bf7da) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_heci_gsc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_heci_gsc.c b/drivers/gpu/drm/xe/xe_heci_gsc.c index a415ca488791..32d509b11391 100644 --- a/drivers/gpu/drm/xe/xe_heci_gsc.c +++ b/drivers/gpu/drm/xe/xe_heci_gsc.c @@ -221,7 +221,7 @@ void xe_heci_gsc_irq_handler(struct xe_device *xe, u32 iir) if (xe->heci_gsc.irq < 0) return;
- ret = generic_handle_irq(xe->heci_gsc.irq); + ret = generic_handle_irq_safe(xe->heci_gsc.irq); if (ret) drm_err_ratelimited(&xe->drm, "error handling GSC irq: %d\n", ret); } @@ -241,7 +241,7 @@ void xe_heci_csc_irq_handler(struct xe_device *xe, u32 iir) if (xe->heci_gsc.irq < 0) return;
- ret = generic_handle_irq(xe->heci_gsc.irq); + ret = generic_handle_irq_safe(xe->heci_gsc.irq); if (ret) drm_err_ratelimited(&xe->drm, "error handling GSC irq: %d\n", ret); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jagmeet Randhawa jagmeet.randhawa@intel.com
[ Upstream commit eafb6f62093f756535a7be1fc4559374a511e460 ]
There are some corner cases where flushing transient data may take slightly longer than the 150us timeout we currently allow. Update the driver to use a 300us timeout instead based on the latest guidance from the hardware team. An update to the bspec to formally document this is expected to arrive soon.
Fixes: c01c6066e6fa ("drm/xe/device: implement transient flush") Signed-off-by: Jagmeet Randhawa jagmeet.randhawa@intel.com Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Reviewed-by: Matt Roper matthew.d.roper@intel.com Link: https://patch.msgid.link/0201b1d6ec64d3651fcbff1ea21026efa915126a.1765487866... Signed-off-by: Matt Roper matthew.d.roper@intel.com (cherry picked from commit d69d3636f5f7a84bae7cd43473b3701ad9b7d544) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index 456899238377..5f757790d6f5 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -1046,7 +1046,7 @@ static void tdf_request_sync(struct xe_device *xe) * transient and need to be flushed.. */ if (xe_mmio_wait32(>->mmio, XE2_TDF_CTRL, TRANSIENT_FLUSH_REQUEST, 0, - 150, NULL, false)) + 300, NULL, false)) xe_gt_err_once(gt, "TD flush timeout\n");
xe_force_wake_put(gt_to_fw(gt), fw_ref);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Maslak jan.maslak@intel.com
[ Upstream commit eed5b815fa49c17d513202f54e980eb91955d3ed ]
During GT reset recovery in do_gt_restart(), xe_uc_start() was called before xe_reg_sr_apply_mmio() restored engine-specific registers. This created a race window where the scheduler could run jobs before hardware state was fully restored.
This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint- waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE) wasn't restored before jobs started executing. Breakpoints would fail to trigger SIP entry because the debug enable bit wasn't set yet.
Fix by moving xe_uc_start() after all MMIO register restoration, including engine registers and CCS mode configuration, ensuring all hardware state is fully restored before any jobs can be scheduled.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Jan Maslak jan.maslak@intel.com Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com Signed-off-by: Matthew Brost matthew.brost@intel.com Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com (cherry picked from commit 825aed0328588b2837636c1c5a0c48795d724617) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_gt.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index 6d3db5e55d98..61bed3b04ded 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -784,9 +784,6 @@ static int do_gt_restart(struct xe_gt *gt) xe_gt_sriov_pf_init_hw(gt);
xe_mocs_init(gt); - err = xe_uc_start(>->uc); - if (err) - return err;
for_each_hw_engine(hwe, gt, id) xe_reg_sr_apply_mmio(&hwe->reg_sr, gt); @@ -794,6 +791,10 @@ static int do_gt_restart(struct xe_gt *gt) /* Get CCS mode in sync between sw/hw */ xe_gt_apply_ccs_mode(gt);
+ err = xe_uc_start(>->uc); + if (err) + return err; + /* Restore GT freq to expected values */ xe_gt_sanitize_freq(gt);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haoxiang Li haoxiang_li2024@163.com
[ Upstream commit 680ad315caaa2860df411cb378bf3614d96c7648 ]
If gio_device_register fails, gio_dev_put() is required to drop the gio_dev device reference.
Fixes: e84de0c61905 ("MIPS: GIO bus support for SGI IP22/28") Signed-off-by: Haoxiang Li haoxiang_li2024@163.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/sgi-ip22/ip22-gio.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/mips/sgi-ip22/ip22-gio.c b/arch/mips/sgi-ip22/ip22-gio.c index 5893ea4e382c..19b70928d6dc 100644 --- a/arch/mips/sgi-ip22/ip22-gio.c +++ b/arch/mips/sgi-ip22/ip22-gio.c @@ -372,7 +372,8 @@ static void ip22_check_gio(int slotno, unsigned long addr, int irq) gio_dev->resource.flags = IORESOURCE_MEM; gio_dev->irq = irq; dev_set_name(&gio_dev->dev, "%d", slotno); - gio_device_register(gio_dev); + if (gio_device_register(gio_dev)) + gio_dev_put(gio_dev); } else printk(KERN_INFO "GIO: slot %d : Empty\n", slotno); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marijn Suijten marijn.suijten@somainline.org
[ Upstream commit 2b973ca48ff3ef1952091c8f988d7796781836c8 ]
The DSI host must be enabled before our prepare function can run, which has to send its init sequence over DSI. Without enabling the host first the panel will not probe.
Fixes: 9e15123eca79 ("drm/msm/dsi: Stop unconditionally powering up DSI hosts at modeset") Signed-off-by: Marijn Suijten marijn.suijten@somainline.org Reviewed-by: Douglas Anderson dianders@chromium.org Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Reviewed-by: Martin Botka martin.botka@somainline.org Signed-off-by: Douglas Anderson dianders@chromium.org Link: https://patch.msgid.link/20251130-sony-akari-fix-panel-v1-1-1d27c60a55f5@som... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-sony-td4353-jdi.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/panel/panel-sony-td4353-jdi.c b/drivers/gpu/drm/panel/panel-sony-td4353-jdi.c index 7c989b70ab51..a14c86c60d19 100644 --- a/drivers/gpu/drm/panel/panel-sony-td4353-jdi.c +++ b/drivers/gpu/drm/panel/panel-sony-td4353-jdi.c @@ -212,6 +212,8 @@ static int sony_td4353_jdi_probe(struct mipi_dsi_device *dsi) if (ret) return dev_err_probe(dev, ret, "Failed to get backlight\n");
+ ctx->panel.prepare_prev_first = true; + drm_panel_add(&ctx->panel);
ret = mipi_dsi_attach(dsi);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juergen Gross jgross@suse.com
[ Upstream commit e5aff444e3a7bdeef5ea796a2099fc3c60a070fa ]
The sparse tool issues a warning for arch/x76/xen/enlighten_pv.c:
arch/x86/xen/enlighten_pv.c:120:9: sparse: sparse: incorrect type in initializer (different address spaces) expected void const [noderef] __percpu *__vpp_verify got bool *
This is due to the percpu variable xen_in_preemptible_hcall being exported via EXPORT_SYMBOL_GPL() instead of EXPORT_PER_CPU_SYMBOL_GPL().
Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202512140856.Ic6FetG6-lkp@intel.com/ Fixes: fdfd811ddde3 ("x86/xen: allow privcmd hypercalls to be preempted") Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Juergen Gross jgross@suse.com Message-ID: 20251215115112.15072-1-jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/xen/enlighten_pv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 4806cc28d7ca..b74ff8bc7f2a 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -108,7 +108,7 @@ static int xen_cpu_dead_pv(unsigned int cpu); * calls. */ DEFINE_PER_CPU(bool, xen_in_preemptible_hcall); -EXPORT_SYMBOL_GPL(xen_in_preemptible_hcall); +EXPORT_PER_CPU_SYMBOL_GPL(xen_in_preemptible_hcall);
/* * In case of scheduling the flag must be cleared and restored after
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jianpeng Chang jianpeng.chang.cn@windriver.com
[ Upstream commit 3e8ade58b71b48913d21b647b2089e03e81f117e ]
Commit 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved memory regions are processed") changed the processing order of reserved memory regions, causing elfcorehdr to overlap with dynamically allocated reserved memory regions during kdump kernel boot.
The issue occurs because: 1. kexec-tools allocates elfcorehdr in the last crashkernel reserved memory region and passes it to the second kernel 2. The problematic commit moved dynamic reserved memory allocation (like bman-fbpr) to occur during fdt_scan_reserved_mem(), before elfcorehdr reservation in fdt_reserve_elfcorehdr() 3. bman-fbpr with 16MB alignment requirement can get allocated at addresses that overlap with the elfcorehdr location 4. When fdt_reserve_elfcorehdr() tries to reserve elfcorehdr memory, overlap detection identifies the conflict and skips reservation 5. kdump kernel fails with "Unable to handle kernel paging request" because elfcorehdr memory is not properly reserved
The boot log: Before 8a6e02d0c00e: OF: fdt: Reserving 1 KiB of memory at 0xf4fff000 for elfcorehdr OF: reserved mem: 0xf3000000..0xf3ffffff bman-fbpr
After 8a6e02d0c00e: OF: reserved mem: 0xf4000000..0xf4ffffff bman-fbpr OF: fdt: elfcorehdr is overlapped
Fix this by ensuring elfcorehdr reservation occurs before dynamic reserved memory allocation.
Fixes: 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved memory regions are processed") Signed-off-by: Jianpeng Chang jianpeng.chang.cn@windriver.com Link: https://patch.msgid.link/20251205015934.700016-1-jianpeng.chang.cn@windriver... Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/fdt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index fdaee4906836..3851ce244585 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -503,8 +503,8 @@ void __init early_init_fdt_scan_reserved_mem(void) if (!initial_boot_params) return;
- fdt_scan_reserved_mem(); fdt_reserve_elfcorehdr(); + fdt_scan_reserved_mem();
/* Process header /memreserve/ fields */ for (n = 0; ; n++) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
[ Upstream commit 77f73253015cbc7893fca1821ac3eae9eb4bc943 ]
Avoid a possible UAF in GPU recovery due to a race between the sched timeout callback and the tdr work queue.
The gpu recovery function calls drm_sched_stop() and later drm_sched_start(). drm_sched_start() restarts the tdr queue which will eventually free the job. If the tdr queue frees the job before time out callback completes, the job will be freed and we'll get a UAF when accessing the pasid. Cache it early to avoid the UAF.
Example KASAN trace: [ 493.058141] BUG: KASAN: slab-use-after-free in amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.067530] Read of size 4 at addr ffff88b0ce3f794c by task kworker/u128:1/323 [ 493.074892] [ 493.076485] CPU: 9 UID: 0 PID: 323 Comm: kworker/u128:1 Tainted: G E 6.16.0-1289896.2.zuul.bf4f11df81c1410bbe901c4373305a31 #1 PREEMPT(voluntary) [ 493.076493] Tainted: [E]=UNSIGNED_MODULE [ 493.076495] Hardware name: TYAN B8021G88V2HR-2T/S8021GM2NR-2T, BIOS V1.03.B10 04/01/2019 [ 493.076500] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched] [ 493.076512] Call Trace: [ 493.076515] <TASK> [ 493.076518] dump_stack_lvl+0x64/0x80 [ 493.076529] print_report+0xce/0x630 [ 493.076536] ? _raw_spin_lock_irqsave+0x86/0xd0 [ 493.076541] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 493.076545] ? amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.077253] kasan_report+0xb8/0xf0 [ 493.077258] ? amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.077965] amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.078672] ? __pfx_amdgpu_device_gpu_recover+0x10/0x10 [amdgpu] [ 493.079378] ? amdgpu_coredump+0x1fd/0x4c0 [amdgpu] [ 493.080111] amdgpu_job_timedout+0x642/0x1400 [amdgpu] [ 493.080903] ? pick_task_fair+0x24e/0x330 [ 493.080910] ? __pfx_amdgpu_job_timedout+0x10/0x10 [amdgpu] [ 493.081702] ? _raw_spin_lock+0x75/0xc0 [ 493.081708] ? __pfx__raw_spin_lock+0x10/0x10 [ 493.081712] drm_sched_job_timedout+0x1b0/0x4b0 [gpu_sched] [ 493.081721] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 493.081725] process_one_work+0x679/0xff0 [ 493.081732] worker_thread+0x6ce/0xfd0 [ 493.081736] ? __pfx_worker_thread+0x10/0x10 [ 493.081739] kthread+0x376/0x730 [ 493.081744] ? __pfx_kthread+0x10/0x10 [ 493.081748] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 493.081751] ? __pfx_kthread+0x10/0x10 [ 493.081755] ret_from_fork+0x247/0x330 [ 493.081761] ? __pfx_kthread+0x10/0x10 [ 493.081764] ret_from_fork_asm+0x1a/0x30 [ 493.081771] </TASK>
Fixes: a72002cb181f ("drm/amdgpu: Make use of drm_wedge_task_info") Link: https://github.com/HansKristian-Work/vkd3d-proton/pull/2670 Cc: SRINIVASAN.SHANMUGAM@amd.com Cc: vitaly.prosyak@amd.com Cc: christian.koenig@amd.com Suggested-by: Matthew Brost matthew.brost@intel.com Reviewed-by: Srinivasan Shanmugam srinivasan.shanmugam@amd.com Reviewed-by: Lijo Lazar lijo.lazar@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 20880a3fd5dd7bca1a079534cf6596bda92e107d) Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 96b6738e6252..843770e61e42 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -6476,6 +6476,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, struct amdgpu_hive_info *hive = NULL; int r = 0; bool need_emergency_restart = false; + /* save the pasid here as the job may be freed before the end of the reset */ + int pasid = job ? job->pasid : -EINVAL;
/* * If it reaches here because of hang/timeout and a RAS error is @@ -6572,8 +6574,12 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, if (!r) { struct amdgpu_task_info *ti = NULL;
- if (job) - ti = amdgpu_vm_get_task_info_pasid(adev, job->pasid); + /* + * The job may already be freed at this point via the sched tdr workqueue so + * use the cached pasid. + */ + if (pasid >= 0) + ti = amdgpu_vm_get_task_info_pasid(adev, pasid);
drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, ti ? &ti->task : NULL);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anurag Dutta a-dutta@ti.com
[ Upstream commit 1889dd2081975ce1f6275b06cdebaa8d154847a9 ]
When cqspi_request_mmap_dma() returns -EPROBE_DEFER after runtime PM is enabled, the error path calls clk_disable_unprepare() on an already disabled clock, causing an imbalance.
Use pm_runtime_get_sync() to increment the usage counter and resume the device. This prevents runtime_suspend() from being invoked and causing a double clock disable.
Fixes: 140623410536 ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller") Signed-off-by: Anurag Dutta a-dutta@ti.com Tested-by: Nishanth Menon nm@ti.com Link: https://patch.msgid.link/20251212072312.2711806-3-a-dutta@ti.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-cadence-quadspi.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index af6d050da1c8..3231bdaf9bd0 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -2024,7 +2024,9 @@ static int cqspi_probe(struct platform_device *pdev) probe_reset_failed: if (cqspi->is_jh7110) cqspi_jh7110_disable_clk(pdev, cqspi); - clk_disable_unprepare(cqspi->clk); + + if (pm_runtime_get_sync(&pdev->dev) >= 0) + clk_disable_unprepare(cqspi->clk); probe_clk_failed: return ret; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: huang-jl huang-jl@deepseek.com
[ Upstream commit 114ea9bbaf7681c4d363e13b7916e6fef6a4963a ]
io_import_kbuf() calculates nr_segs incorrectly when iov_offset is non-zero after iov_iter_advance(). It doesn't account for the partial consumption of the first bvec.
The problem comes when meet the following conditions: 1. Use UBLK_F_AUTO_BUF_REG feature of ublk. 2. The kernel will help to register the buffer, into the io uring. 3. Later, the ublk server try to send IO request using the registered buffer in the io uring, to read/write to fuse-based filesystem, with O_DIRECT.
From a userspace perspective, the ublk server thread is blocked in the
kernel, and will see "soft lockup" in the kernel dmesg.
When ublk registers a buffer with mixed-size bvecs like [4K]*6 + [12K] and a request partially consumes a bvec, the next request's nr_segs calculation uses bvec->bv_len instead of (bv_len - iov_offset).
This causes fuse_get_user_pages() to loop forever because nr_segs indicates fewer pages than actually needed.
Specifically, the infinite loop happens at: fuse_get_user_pages() -> iov_iter_extract_pages() -> iov_iter_extract_bvec_pages() Since the nr_segs is miscalculated, the iov_iter_extract_bvec_pages returns when finding that i->nr_segs is zero. Then iov_iter_extract_pages returns zero. However, fuse_get_user_pages does still not get enough data/pages, causing infinite loop.
Example: - Bvecs: [4K, 4K, 4K, 4K, 4K, 4K, 12K, ...] - Request 1: 32K at offset 0, uses 6*4K + 8K of the 12K bvec - Request 2: 32K at offset 32K - iov_offset = 8K (8K already consumed from 12K bvec) - Bug: calculates using 12K, not (12K - 8K) = 4K - Result: nr_segs too small, infinite loop in fuse_get_user_pages.
Fix by accounting for iov_offset when calculating the first segment's available length.
Fixes: b419bed4f0a6 ("io_uring/rsrc: ensure segments counts are correct on kbuf buffers") Signed-off-by: huang-jl huang-jl@deepseek.com Reviewed-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- io_uring/rsrc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index 0010c4992490..5b6b73c6a62b 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -1057,6 +1057,7 @@ static int io_import_kbuf(int ddir, struct iov_iter *iter, if (count < imu->len) { const struct bio_vec *bvec = iter->bvec;
+ len += iter->iov_offset; while (len > bvec->bv_len) { len -= bvec->bv_len; bvec++;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 3035b9b46b0611898babc0b96ede65790d3566f7 ]
Add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg() and prepare for reusing this helper for the coming UBLK_BATCH_IO feature, which can fetch & commit one batch of io commands via single uring_cmd.
Reviewed-by: Caleb Sander Mateos csander@purestorage.com Signed-off-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: c258f5c4502c ("ublk: fix deadlock when reading partition table") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/ublk_drv.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index 359564c40cb5..599571634b7a 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1242,11 +1242,12 @@ ublk_auto_buf_reg_fallback(const struct ublk_queue *ubq, struct ublk_io *io) }
static bool ublk_auto_buf_reg(const struct ublk_queue *ubq, struct request *req, - struct ublk_io *io, unsigned int issue_flags) + struct ublk_io *io, struct io_uring_cmd *cmd, + unsigned int issue_flags) { int ret;
- ret = io_buffer_register_bvec(io->cmd, req, ublk_io_release, + ret = io_buffer_register_bvec(cmd, req, ublk_io_release, io->buf.index, issue_flags); if (ret) { if (io->buf.flags & UBLK_AUTO_BUF_REG_FALLBACK) { @@ -1258,18 +1259,19 @@ static bool ublk_auto_buf_reg(const struct ublk_queue *ubq, struct request *req, }
io->task_registered_buffers = 1; - io->buf_ctx_handle = io_uring_cmd_ctx_handle(io->cmd); + io->buf_ctx_handle = io_uring_cmd_ctx_handle(cmd); io->flags |= UBLK_IO_FLAG_AUTO_BUF_REG; return true; }
static bool ublk_prep_auto_buf_reg(struct ublk_queue *ubq, struct request *req, struct ublk_io *io, + struct io_uring_cmd *cmd, unsigned int issue_flags) { ublk_init_req_ref(ubq, io); if (ublk_support_auto_buf_reg(ubq) && ublk_rq_has_data(req)) - return ublk_auto_buf_reg(ubq, req, io, issue_flags); + return ublk_auto_buf_reg(ubq, req, io, cmd, issue_flags);
return true; } @@ -1344,7 +1346,7 @@ static void ublk_dispatch_req(struct ublk_queue *ubq, if (!ublk_start_io(ubq, req, io)) return;
- if (ublk_prep_auto_buf_reg(ubq, req, io, issue_flags)) + if (ublk_prep_auto_buf_reg(ubq, req, io, io->cmd, issue_flags)) ublk_complete_io_cmd(io, req, UBLK_IO_RES_OK, issue_flags); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 8d61ece156bd4f2b9e7d3b2a374a26d42c7a4a06 ]
Add `union ublk_io_buf` for naming the anonymous union of struct ublk_io's addr and buf fields, meantime apply it to `struct ublk_io` for storing either ublk auto buffer register data or ublk server io buffer address.
The union uses clear field names: - `addr`: for regular ublk server io buffer addresses - `auto_reg`: for ublk auto buffer registration data
This eliminates confusing access patterns and improves code readability.
Reviewed-by: Caleb Sander Mateos csander@purestorage.com Signed-off-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: c258f5c4502c ("ublk: fix deadlock when reading partition table") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/ublk_drv.c | 40 ++++++++++++++++++++++------------------ 1 file changed, 22 insertions(+), 18 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index 599571634b7a..c9d258b99090 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -155,12 +155,13 @@ struct ublk_uring_cmd_pdu { */ #define UBLK_REFCOUNT_INIT (REFCOUNT_MAX / 2)
+union ublk_io_buf { + __u64 addr; + struct ublk_auto_buf_reg auto_reg; +}; + struct ublk_io { - /* userspace buffer address from io cmd */ - union { - __u64 addr; - struct ublk_auto_buf_reg buf; - }; + union ublk_io_buf buf; unsigned int flags; int res;
@@ -499,7 +500,7 @@ static blk_status_t ublk_setup_iod_zoned(struct ublk_queue *ubq, iod->op_flags = ublk_op | ublk_req_build_flags(req); iod->nr_sectors = blk_rq_sectors(req); iod->start_sector = blk_rq_pos(req); - iod->addr = io->addr; + iod->addr = io->buf.addr;
return BLK_STS_OK; } @@ -1047,7 +1048,7 @@ static int ublk_map_io(const struct ublk_queue *ubq, const struct request *req, struct iov_iter iter; const int dir = ITER_DEST;
- import_ubuf(dir, u64_to_user_ptr(io->addr), rq_bytes, &iter); + import_ubuf(dir, u64_to_user_ptr(io->buf.addr), rq_bytes, &iter); return ublk_copy_user_pages(req, 0, &iter, dir); } return rq_bytes; @@ -1068,7 +1069,7 @@ static int ublk_unmap_io(bool need_map,
WARN_ON_ONCE(io->res > rq_bytes);
- import_ubuf(dir, u64_to_user_ptr(io->addr), io->res, &iter); + import_ubuf(dir, u64_to_user_ptr(io->buf.addr), io->res, &iter); return ublk_copy_user_pages(req, 0, &iter, dir); } return rq_bytes; @@ -1134,7 +1135,7 @@ static blk_status_t ublk_setup_iod(struct ublk_queue *ubq, struct request *req) iod->op_flags = ublk_op | ublk_req_build_flags(req); iod->nr_sectors = blk_rq_sectors(req); iod->start_sector = blk_rq_pos(req); - iod->addr = io->addr; + iod->addr = io->buf.addr;
return BLK_STS_OK; } @@ -1248,9 +1249,9 @@ static bool ublk_auto_buf_reg(const struct ublk_queue *ubq, struct request *req, int ret;
ret = io_buffer_register_bvec(cmd, req, ublk_io_release, - io->buf.index, issue_flags); + io->buf.auto_reg.index, issue_flags); if (ret) { - if (io->buf.flags & UBLK_AUTO_BUF_REG_FALLBACK) { + if (io->buf.auto_reg.flags & UBLK_AUTO_BUF_REG_FALLBACK) { ublk_auto_buf_reg_fallback(ubq, io); return true; } @@ -1539,7 +1540,7 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq) */ io->flags &= UBLK_IO_FLAG_CANCELED; io->cmd = NULL; - io->addr = 0; + io->buf.addr = 0;
/* * old task is PF_EXITING, put it now @@ -2100,13 +2101,16 @@ static inline int ublk_check_cmd_op(u32 cmd_op)
static inline int ublk_set_auto_buf_reg(struct ublk_io *io, struct io_uring_cmd *cmd) { - io->buf = ublk_sqe_addr_to_auto_buf_reg(READ_ONCE(cmd->sqe->addr)); + struct ublk_auto_buf_reg buf; + + buf = ublk_sqe_addr_to_auto_buf_reg(READ_ONCE(cmd->sqe->addr));
- if (io->buf.reserved0 || io->buf.reserved1) + if (buf.reserved0 || buf.reserved1) return -EINVAL;
- if (io->buf.flags & ~UBLK_AUTO_BUF_REG_F_MASK) + if (buf.flags & ~UBLK_AUTO_BUF_REG_F_MASK) return -EINVAL; + io->buf.auto_reg = buf; return 0; }
@@ -2128,7 +2132,7 @@ static int ublk_handle_auto_buf_reg(struct ublk_io *io, * this ublk request gets stuck. */ if (io->buf_ctx_handle == io_uring_cmd_ctx_handle(cmd)) - *buf_idx = io->buf.index; + *buf_idx = io->buf.auto_reg.index; }
return ublk_set_auto_buf_reg(io, cmd); @@ -2156,7 +2160,7 @@ ublk_config_io_buf(const struct ublk_device *ub, struct ublk_io *io, if (ublk_dev_support_auto_buf_reg(ub)) return ublk_handle_auto_buf_reg(io, cmd, buf_idx);
- io->addr = buf_addr; + io->buf.addr = buf_addr; return 0; }
@@ -2353,7 +2357,7 @@ static bool ublk_get_data(const struct ublk_queue *ubq, struct ublk_io *io, */ io->flags &= ~UBLK_IO_FLAG_NEED_GET_DATA; /* update iod->addr because ublksrv may have passed a new io buffer */ - ublk_get_iod(ubq, req->tag)->addr = io->addr; + ublk_get_iod(ubq, req->tag)->addr = io->buf.addr; pr_devel("%s: update iod->addr: qid %d tag %d io_flags %x addr %llx\n", __func__, ubq->q_id, req->tag, io->flags, ublk_get_iod(ubq, req->tag)->addr);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 0a9beafa7c633e6ff66b05b81eea78231b7e6520 ]
Refactor auto buffer register code and prepare for supporting batch IO feature, and the main motivation is to put 'ublk_io' operation code together, so that per-io lock can be applied for the code block.
The key changes are: - Rename ublk_auto_buf_reg() as ublk_do_auto_buf_reg() - Introduce an enum `auto_buf_reg_res` to represent the result of the buffer registration attempt (FAIL, FALLBACK, OK). - Split the existing `ublk_do_auto_buf_reg` function into two: - `__ublk_do_auto_buf_reg`: Performs the actual buffer registration and returns the `auto_buf_reg_res` status. - `ublk_do_auto_buf_reg`: A wrapper that calls the internal function and handles the I/O preparation based on the result. - Introduce `ublk_prep_auto_buf_reg_io` to encapsulate the logic for preparing the I/O for completion after buffer registration. - Pass the `tag` directly to `ublk_auto_buf_reg_fallback` to avoid recalculating it.
This refactoring makes the control flow clearer and isolates the different stages of the auto buffer registration process.
Reviewed-by: Caleb Sander Mateos csander@purestorage.com Signed-off-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: c258f5c4502c ("ublk: fix deadlock when reading partition table") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/ublk_drv.c | 64 +++++++++++++++++++++++++++------------- 1 file changed, 43 insertions(+), 21 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index c9d258b99090..bdb897d44089 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1234,17 +1234,37 @@ static inline void __ublk_abort_rq(struct ublk_queue *ubq, }
static void -ublk_auto_buf_reg_fallback(const struct ublk_queue *ubq, struct ublk_io *io) +ublk_auto_buf_reg_fallback(const struct ublk_queue *ubq, unsigned tag) { - unsigned tag = io - ubq->ios; struct ublksrv_io_desc *iod = ublk_get_iod(ubq, tag);
iod->op_flags |= UBLK_IO_F_NEED_REG_BUF; }
-static bool ublk_auto_buf_reg(const struct ublk_queue *ubq, struct request *req, - struct ublk_io *io, struct io_uring_cmd *cmd, - unsigned int issue_flags) +enum auto_buf_reg_res { + AUTO_BUF_REG_FAIL, + AUTO_BUF_REG_FALLBACK, + AUTO_BUF_REG_OK, +}; + +static void ublk_prep_auto_buf_reg_io(const struct ublk_queue *ubq, + struct request *req, struct ublk_io *io, + struct io_uring_cmd *cmd, + enum auto_buf_reg_res res) +{ + if (res == AUTO_BUF_REG_OK) { + io->task_registered_buffers = 1; + io->buf_ctx_handle = io_uring_cmd_ctx_handle(cmd); + io->flags |= UBLK_IO_FLAG_AUTO_BUF_REG; + } + ublk_init_req_ref(ubq, io); + __ublk_prep_compl_io_cmd(io, req); +} + +static enum auto_buf_reg_res +__ublk_do_auto_buf_reg(const struct ublk_queue *ubq, struct request *req, + struct ublk_io *io, struct io_uring_cmd *cmd, + unsigned int issue_flags) { int ret;
@@ -1252,29 +1272,27 @@ static bool ublk_auto_buf_reg(const struct ublk_queue *ubq, struct request *req, io->buf.auto_reg.index, issue_flags); if (ret) { if (io->buf.auto_reg.flags & UBLK_AUTO_BUF_REG_FALLBACK) { - ublk_auto_buf_reg_fallback(ubq, io); - return true; + ublk_auto_buf_reg_fallback(ubq, req->tag); + return AUTO_BUF_REG_FALLBACK; } blk_mq_end_request(req, BLK_STS_IOERR); - return false; + return AUTO_BUF_REG_FAIL; }
- io->task_registered_buffers = 1; - io->buf_ctx_handle = io_uring_cmd_ctx_handle(cmd); - io->flags |= UBLK_IO_FLAG_AUTO_BUF_REG; - return true; + return AUTO_BUF_REG_OK; }
-static bool ublk_prep_auto_buf_reg(struct ublk_queue *ubq, - struct request *req, struct ublk_io *io, - struct io_uring_cmd *cmd, - unsigned int issue_flags) +static void ublk_do_auto_buf_reg(const struct ublk_queue *ubq, struct request *req, + struct ublk_io *io, struct io_uring_cmd *cmd, + unsigned int issue_flags) { - ublk_init_req_ref(ubq, io); - if (ublk_support_auto_buf_reg(ubq) && ublk_rq_has_data(req)) - return ublk_auto_buf_reg(ubq, req, io, cmd, issue_flags); + enum auto_buf_reg_res res = __ublk_do_auto_buf_reg(ubq, req, io, cmd, + issue_flags);
- return true; + if (res != AUTO_BUF_REG_FAIL) { + ublk_prep_auto_buf_reg_io(ubq, req, io, cmd, res); + io_uring_cmd_done(cmd, UBLK_IO_RES_OK, issue_flags); + } }
static bool ublk_start_io(const struct ublk_queue *ubq, struct request *req, @@ -1347,8 +1365,12 @@ static void ublk_dispatch_req(struct ublk_queue *ubq, if (!ublk_start_io(ubq, req, io)) return;
- if (ublk_prep_auto_buf_reg(ubq, req, io, io->cmd, issue_flags)) + if (ublk_support_auto_buf_reg(ubq) && ublk_rq_has_data(req)) { + ublk_do_auto_buf_reg(ubq, req, io, io->cmd, issue_flags); + } else { + ublk_init_req_ref(ubq, io); ublk_complete_io_cmd(io, req, UBLK_IO_RES_OK, issue_flags); + } }
static void ublk_cmd_tw_cb(struct io_uring_cmd *cmd,
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming Lei ming.lei@redhat.com
[ Upstream commit c258f5c4502c9667bccf5d76fa731ab9c96687c1 ]
When one process(such as udev) opens ublk block device (e.g., to read the partition table via bdev_open()), a deadlock[1] can occur:
1. bdev_open() grabs disk->open_mutex 2. The process issues read I/O to ublk backend to read partition table 3. In __ublk_complete_rq(), blk_update_request() or blk_mq_end_request() runs bio->bi_end_io() callbacks 4. If this triggers fput() on file descriptor of ublk block device, the work may be deferred to current task's task work (see fput() implementation) 5. This eventually calls blkdev_release() from the same context 6. blkdev_release() tries to grab disk->open_mutex again 7. Deadlock: same task waiting for a mutex it already holds
The fix is to run blk_update_request() and blk_mq_end_request() with bottom halves disabled. This forces blkdev_release() to run in kernel work-queue context instead of current task work context, and allows ublk server to make forward progress, and avoids the deadlock.
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Link: https://github.com/ublk-org/ublksrv/issues/170 [1] Signed-off-by: Ming Lei ming.lei@redhat.com Reviewed-by: Caleb Sander Mateos csander@purestorage.com [axboe: rewrite comment in ublk] Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/ublk_drv.c | 32 ++++++++++++++++++++++++++++---- 1 file changed, 28 insertions(+), 4 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index bdb897d44089..fa7b0481ea04 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1146,12 +1146,20 @@ static inline struct ublk_uring_cmd_pdu *ublk_get_uring_cmd_pdu( return io_uring_cmd_to_pdu(ioucmd, struct ublk_uring_cmd_pdu); }
+static void ublk_end_request(struct request *req, blk_status_t error) +{ + local_bh_disable(); + blk_mq_end_request(req, error); + local_bh_enable(); +} + /* todo: handle partial completion */ static inline void __ublk_complete_rq(struct request *req, struct ublk_io *io, bool need_map) { unsigned int unmapped_bytes; blk_status_t res = BLK_STS_OK; + bool requeue;
/* failed read IO if nothing is read */ if (!io->res && req_op(req) == REQ_OP_READ) @@ -1183,14 +1191,30 @@ static inline void __ublk_complete_rq(struct request *req, struct ublk_io *io, if (unlikely(unmapped_bytes < io->res)) io->res = unmapped_bytes;
- if (blk_update_request(req, BLK_STS_OK, io->res)) + /* + * Run bio->bi_end_io() with softirqs disabled. If the final fput + * happens off this path, then that will prevent ublk's blkdev_release() + * from being called on current's task work, see fput() implementation. + * + * Otherwise, ublk server may not provide forward progress in case of + * reading the partition table from bdev_open() with disk->open_mutex + * held, and causes dead lock as we could already be holding + * disk->open_mutex here. + * + * Preferably we would not be doing IO with a mutex held that is also + * used for release, but this work-around will suffice for now. + */ + local_bh_disable(); + requeue = blk_update_request(req, BLK_STS_OK, io->res); + local_bh_enable(); + if (requeue) blk_mq_requeue_request(req, true); else if (likely(!blk_should_fake_timeout(req->q))) __blk_mq_end_request(req, BLK_STS_OK);
return; exit: - blk_mq_end_request(req, res); + ublk_end_request(req, res); }
static struct io_uring_cmd *__ublk_prep_compl_io_cmd(struct ublk_io *io, @@ -1230,7 +1254,7 @@ static inline void __ublk_abort_rq(struct ublk_queue *ubq, if (ublk_nosrv_dev_should_queue_io(ubq->dev)) blk_mq_requeue_request(rq, false); else - blk_mq_end_request(rq, BLK_STS_IOERR); + ublk_end_request(rq, BLK_STS_IOERR); }
static void @@ -1275,7 +1299,7 @@ __ublk_do_auto_buf_reg(const struct ublk_queue *ubq, struct request *req, ublk_auto_buf_reg_fallback(ubq, req->tag); return AUTO_BUF_REG_FALLBACK; } - blk_mq_end_request(req, BLK_STS_IOERR); + ublk_end_request(req, BLK_STS_IOERR); return AUTO_BUF_REG_FAIL; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Fourier fourier.thomas@gmail.com
[ Upstream commit c9b5645fd8ca10f310e41b07540f98e6a9720f40 ]
If kstrdup() fails in init_dev(), then the newly allocated ID is lost.
Fixes: 64e8a6ece1a5 ("block/rnbd-clt: Dynamically alloc buffer for pathname & blk_symlink_name") Signed-off-by: Thomas Fourier fourier.thomas@gmail.com Acked-by: Jack Wang jinpu.wang@ionos.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/rnbd/rnbd-clt.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c index f1409e54010a..d1c354636315 100644 --- a/drivers/block/rnbd/rnbd-clt.c +++ b/drivers/block/rnbd/rnbd-clt.c @@ -1423,9 +1423,11 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess, goto out_alloc; }
- ret = ida_alloc_max(&index_ida, (1 << (MINORBITS - RNBD_PART_BITS)) - 1, - GFP_KERNEL); - if (ret < 0) { + dev->clt_device_id = ida_alloc_max(&index_ida, + (1 << (MINORBITS - RNBD_PART_BITS)) - 1, + GFP_KERNEL); + if (dev->clt_device_id < 0) { + ret = dev->clt_device_id; pr_err("Failed to initialize device '%s' from session %s, allocating idr failed, err: %d\n", pathname, sess->sessname, ret); goto out_queues; @@ -1434,10 +1436,9 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess, dev->pathname = kstrdup(pathname, GFP_KERNEL); if (!dev->pathname) { ret = -ENOMEM; - goto out_queues; + goto out_ida; }
- dev->clt_device_id = ret; dev->sess = sess; dev->access_mode = access_mode; dev->nr_poll_queues = nr_poll_queues; @@ -1453,6 +1454,8 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess,
return dev;
+out_ida: + ida_free(&index_ida, dev->clt_device_id); out_queues: kfree(dev->hw_queues); out_alloc:
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuicheng Lin shuicheng.lin@intel.com
[ Upstream commit 8e461304009135270e9ccf2d7e2dfe29daec9b60 ]
The exec and vm_bind ioctl allow userspace to specify an arbitrary num_syncs value. Without bounds checking, a very large num_syncs can force an excessively large allocation, leading to kernel warnings from the page allocator as below.
Introduce DRM_XE_MAX_SYNCS (set to 1024) and reject any request exceeding this limit.
" ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1217 at mm/page_alloc.c:5124 __alloc_frozen_pages_noprof+0x2f8/0x2180 mm/page_alloc.c:5124 ... Call Trace: <TASK> alloc_pages_mpol+0xe4/0x330 mm/mempolicy.c:2416 ___kmalloc_large_node+0xd8/0x110 mm/slub.c:4317 __kmalloc_large_node_noprof+0x18/0xe0 mm/slub.c:4348 __do_kmalloc_node mm/slub.c:4364 [inline] __kmalloc_noprof+0x3d4/0x4b0 mm/slub.c:4388 kmalloc_noprof include/linux/slab.h:909 [inline] kmalloc_array_noprof include/linux/slab.h:948 [inline] xe_exec_ioctl+0xa47/0x1e70 drivers/gpu/drm/xe/xe_exec.c:158 drm_ioctl_kernel+0x1f1/0x3e0 drivers/gpu/drm/drm_ioctl.c:797 drm_ioctl+0x5e7/0xc50 drivers/gpu/drm/drm_ioctl.c:894 xe_drm_ioctl+0x10b/0x170 drivers/gpu/drm/xe/xe_device.c:224 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:598 [inline] __se_sys_ioctl fs/ioctl.c:584 [inline] __x64_sys_ioctl+0x18b/0x210 fs/ioctl.c:584 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xbb/0x380 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... "
v2: Add "Reported-by" and Cc stable kernels. v3: Change XE_MAX_SYNCS from 64 to 1024. (Matt & Ashutosh) v4: s/XE_MAX_SYNCS/DRM_XE_MAX_SYNCS/ (Matt) v5: Do the check at the top of the exec func. (Matt)
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reported-by: Koen Koning koen.koning@intel.com Reported-by: Peter Senna Tschudin peter.senna@linux.intel.com Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6450 Cc: stable@vger.kernel.org # v6.12+ Cc: Matthew Brost matthew.brost@intel.com Cc: Michal Mrozek michal.mrozek@intel.com Cc: Carl Zhang carl.zhang@intel.com Cc: José Roberto de Souza jose.souza@intel.com Cc: Lionel Landwerlin lionel.g.landwerlin@intel.com Cc: Ivan Briano ivan.briano@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Ashutosh Dixit ashutosh.dixit@intel.com Signed-off-by: Shuicheng Lin shuicheng.lin@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com Signed-off-by: Matthew Brost matthew.brost@intel.com Link: https://patch.msgid.link/20251205234715.2476561-5-shuicheng.lin@intel.com (cherry picked from commit b07bac9bd708ec468cd1b8a5fe70ae2ac9b0a11c) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Stable-dep-of: f8dd66bfb4e1 ("drm/xe/oa: Limit num_syncs to prevent oversized allocations") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_exec.c | 3 ++- drivers/gpu/drm/xe/xe_vm.c | 3 +++ include/uapi/drm/xe_drm.h | 1 + 3 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c index a8ab363a8046..ca85f7c15fab 100644 --- a/drivers/gpu/drm/xe/xe_exec.c +++ b/drivers/gpu/drm/xe/xe_exec.c @@ -130,7 +130,8 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
if (XE_IOCTL_DBG(xe, args->extensions) || XE_IOCTL_DBG(xe, args->pad[0] || args->pad[1] || args->pad[2]) || - XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1])) + XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]) || + XE_IOCTL_DBG(xe, args->num_syncs > DRM_XE_MAX_SYNCS)) return -EINVAL;
q = xe_exec_queue_lookup(xef, args->exec_queue_id); diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index cdd1dc540a59..f0f699baa9f6 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -3282,6 +3282,9 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, struct xe_vm *vm, if (XE_IOCTL_DBG(xe, args->extensions)) return -EINVAL;
+ if (XE_IOCTL_DBG(xe, args->num_syncs > DRM_XE_MAX_SYNCS)) + return -EINVAL; + if (args->num_binds > 1) { u64 __user *bind_user = u64_to_user_ptr(args->vector_of_binds); diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 517489a7ec60..400555a8af18 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -1459,6 +1459,7 @@ struct drm_xe_exec { /** @exec_queue_id: Exec queue ID for the batch buffer */ __u32 exec_queue_id;
+#define DRM_XE_MAX_SYNCS 1024 /** @num_syncs: Amount of struct drm_xe_sync in array. */ __u32 num_syncs;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuicheng Lin shuicheng.lin@intel.com
[ Upstream commit f8dd66bfb4e184c71bd26418a00546ebe7f5c17a ]
The OA open parameters did not validate num_syncs, allowing userspace to pass arbitrarily large values, potentially leading to excessive allocations.
Add check to ensure that num_syncs does not exceed DRM_XE_MAX_SYNCS, returning -EINVAL when the limit is violated.
v2: use XE_IOCTL_DBG() and drop duplicated check. (Ashutosh)
Fixes: c8507a25cebd ("drm/xe/oa/uapi: Define and parse OA sync properties") Cc: Matthew Brost matthew.brost@intel.com Cc: Ashutosh Dixit ashutosh.dixit@intel.com Signed-off-by: Shuicheng Lin shuicheng.lin@intel.com Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Signed-off-by: Matthew Brost matthew.brost@intel.com Link: https://patch.msgid.link/20251205234715.2476561-6-shuicheng.lin@intel.com (cherry picked from commit e057b2d2b8d815df3858a87dffafa2af37e5945b) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_oa.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c index 125698a9ecf1..10047373e184 100644 --- a/drivers/gpu/drm/xe/xe_oa.c +++ b/drivers/gpu/drm/xe/xe_oa.c @@ -1253,6 +1253,9 @@ static int xe_oa_set_no_preempt(struct xe_oa *oa, u64 value, static int xe_oa_set_prop_num_syncs(struct xe_oa *oa, u64 value, struct xe_oa_open_param *param) { + if (XE_IOCTL_DBG(oa->xe, value > DRM_XE_MAX_SYNCS)) + return -EINVAL; + param->num_syncs = value; return 0; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ashutosh Dixit ashutosh.dixit@intel.com
[ Upstream commit 256edb267a9d0b5aef70e408e9fba4f930f9926e ]
Reports can be written out to the OA buffer using ways other than periodic sampling. These include mmio trigger and context switches. To support these use cases, when periodic sampling is not enabled, OAG_OAGLBCTXCTRL_COUNTER_RESUME must be set.
Fixes: 1db9a9dc90ae ("drm/xe/oa: OA stream initialization (OAG)") Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Reviewed-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Link: https://patch.msgid.link/20251205212613.826224-4-ashutosh.dixit@intel.com (cherry picked from commit 88d98e74adf3e20f678bb89581a5c3149fdbdeaa) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_oa.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c index 10047373e184..d0ceb67af83e 100644 --- a/drivers/gpu/drm/xe/xe_oa.c +++ b/drivers/gpu/drm/xe/xe_oa.c @@ -1104,11 +1104,12 @@ static int xe_oa_enable_metric_set(struct xe_oa_stream *stream) oag_buf_size_select(stream) | oag_configure_mmio_trigger(stream, true));
- xe_mmio_write32(mmio, __oa_regs(stream)->oa_ctx_ctrl, stream->periodic ? - (OAG_OAGLBCTXCTRL_COUNTER_RESUME | + xe_mmio_write32(mmio, __oa_regs(stream)->oa_ctx_ctrl, + OAG_OAGLBCTXCTRL_COUNTER_RESUME | + (stream->periodic ? OAG_OAGLBCTXCTRL_TIMER_ENABLE | REG_FIELD_PREP(OAG_OAGLBCTXCTRL_TIMER_PERIOD_MASK, - stream->period_exponent)) : 0); + stream->period_exponent) : 0));
/* * Initialize Super Queue Internal Cnt Register
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sairaj Kodilkar sarunkod@amd.com
[ Upstream commit c2e8dc1222c2136e714d5d972dce7e64924e4ed8 ]
Currently AMD IOMMU driver does not reserve domain ids programmed in the DTE while reusing the device table inside kdump kernel. This can cause reallocation of these domain ids for newer domains that are created by the kdump kernel, which can lead to potential IO_PAGE_FAULTs
Hence reserve these ids inside pdom_ids.
Fixes: 38e5f33ee359 ("iommu/amd: Reuse device table for kdump") Signed-off-by: Sairaj Kodilkar sarunkod@amd.com Reported-by: Jason Gunthorpe jgg@nvidia.com Reviewed-by: Vasant Hegde vasant.hegde@amd.com Reviewed-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Joerg Roedel joerg.roedel@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/amd/init.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index f2991c11867c..14eb9de33ccb 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1136,9 +1136,13 @@ static void set_dte_bit(struct dev_table_entry *dte, u8 bit) static bool __reuse_device_table(struct amd_iommu *iommu) { struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; - u32 lo, hi, old_devtb_size; + struct dev_table_entry *old_dev_tbl_entry; + u32 lo, hi, old_devtb_size, devid; phys_addr_t old_devtb_phys; + u16 dom_id; + bool dte_v; u64 entry; + int ret;
/* Each IOMMU use separate device table with the same size */ lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET); @@ -1173,6 +1177,23 @@ static bool __reuse_device_table(struct amd_iommu *iommu) return false; }
+ for (devid = 0; devid <= pci_seg->last_bdf; devid++) { + old_dev_tbl_entry = &pci_seg->old_dev_tbl_cpy[devid]; + dte_v = FIELD_GET(DTE_FLAG_V, old_dev_tbl_entry->data[0]); + dom_id = FIELD_GET(DEV_DOMID_MASK, old_dev_tbl_entry->data[1]); + + if (!dte_v || !dom_id) + continue; + /* + * ID reservation can fail with -ENOSPC when there + * are multiple devices present in the same domain, + * hence check only for -ENOMEM. + */ + ret = ida_alloc_range(&pdom_ids, dom_id, dom_id, GFP_KERNEL); + if (ret == -ENOMEM) + return false; + } + return true; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rob Herring (Arm) robh@kernel.org
[ Upstream commit ce7b1d58609abc2941a1f38094147f439fb74233 ]
It's a requirement that DT overlays be applied at build time in order to validate them as overlays are not validated on their own.
Add missing target for mt8395-radxa hd panel overlay.
Fixes: 4c8ff61199a7 ("arm64: dts: mediatek: mt8395-radxa-nio-12l: Add Radxa 8 HD panel") Signed-off-by: Frank Wunderlich frank-w@public-files.de Acked-by: AngeloGioacchino Del Regno angelogiaocchino.delregno@collabora.com Link: https://patch.msgid.link/20251205215940.19287-1-linux@fw-web.de Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/mediatek/Makefile | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/arm64/boot/dts/mediatek/Makefile b/arch/arm64/boot/dts/mediatek/Makefile index a4df4c21399e..b50799b2a65f 100644 --- a/arch/arm64/boot/dts/mediatek/Makefile +++ b/arch/arm64/boot/dts/mediatek/Makefile @@ -104,6 +104,8 @@ dtb-$(CONFIG_ARCH_MEDIATEK) += mt8390-genio-700-evk.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8395-kontron-3-5-sbc-i1200.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8395-radxa-nio-12l.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8395-radxa-nio-12l-8-hd-panel.dtbo +mt8395-radxa-nio-12l-8-hd-panel-dtbs := mt8395-radxa-nio-12l.dtb mt8395-radxa-nio-12l-8-hd-panel.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt8395-radxa-nio-12l-8-hd-panel.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8516-pumpkin.dtb
# Device tree overlays support
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nuno Sá nuno.sa@analog.com
[ Upstream commit b3db91c3bfea69a6c6258fea508f25a59c0feb1a ]
The reset_history attributes are write only. Hence don't report them as readable just to return -EOPNOTSUPP later on.
Fixes: cbc29538dbf7 ("hwmon: Add driver for LTC4282") Signed-off-by: Nuno Sá nuno.sa@analog.com Link: https://lore.kernel.org/r/20251219-ltc4282-fix-reset-history-v1-1-8eab974c12... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/ltc4282.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/hwmon/ltc4282.c b/drivers/hwmon/ltc4282.c index 1d664a2d7b3c..58c2d3a62432 100644 --- a/drivers/hwmon/ltc4282.c +++ b/drivers/hwmon/ltc4282.c @@ -1018,8 +1018,9 @@ static umode_t ltc4282_in_is_visible(const struct ltc4282_state *st, u32 attr) case hwmon_in_max: case hwmon_in_min: case hwmon_in_enable: - case hwmon_in_reset_history: return 0644; + case hwmon_in_reset_history: + return 0200; default: return 0; } @@ -1038,8 +1039,9 @@ static umode_t ltc4282_curr_is_visible(u32 attr) return 0444; case hwmon_curr_max: case hwmon_curr_min: - case hwmon_curr_reset_history: return 0644; + case hwmon_curr_reset_history: + return 0200; default: return 0; } @@ -1057,8 +1059,9 @@ static umode_t ltc4282_power_is_visible(u32 attr) return 0444; case hwmon_power_max: case hwmon_power_min: - case hwmon_power_reset_history: return 0644; + case hwmon_power_reset_history: + return 0200; default: return 0; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qianchang Zhao pioooooooooip@gmail.com
commit 5d510ac31626ed157d2182149559430350cf2104 upstream.
When size equals the current i_size (including 0), the code used to call check_lock_range(filp, i_size, size - 1, WRITE), which computes `size - 1` and can underflow for size==0. Skip the equal case.
Cc: stable@vger.kernel.org Reported-by: Qianchang Zhao pioooooooooip@gmail.com Reported-by: Zhitong Liu liuzhitong1993@gmail.com Signed-off-by: Qianchang Zhao pioooooooooip@gmail.com Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/vfs.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/fs/smb/server/vfs.c +++ b/fs/smb/server/vfs.c @@ -324,6 +324,9 @@ static int check_lock_range(struct file struct file_lock_context *ctx = locks_inode_context(file_inode(filp)); int error = 0;
+ if (start == end) + return 0; + if (!ctx || list_empty_careful(&ctx->flc_posix)) return 0;
@@ -828,7 +831,7 @@ int ksmbd_vfs_truncate(struct ksmbd_work if (size < inode->i_size) { err = check_lock_range(filp, size, inode->i_size - 1, WRITE); - } else { + } else if (size > inode->i_size) { err = check_lock_range(filp, inode->i_size, size - 1, WRITE); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
commit cafb57f7bdd57abba87725eb4e82bbdca4959644 upstream.
When a session is found but its state is not SMB2_SESSION_VALID, It indicates that no valid session was found, but it is missing to decrement the reference count acquired by the session lookup, which results in a reference count leak. This patch fixes the issue by explicitly calling ksmbd_user_session_put to release the reference to the session.
Cc: stable@vger.kernel.org Reported-by: Alexandre roger.andersen@protonmail.com Reported-by: Stanislas Polu spolu@dust.tt Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/mgmt/user_session.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/smb/server/mgmt/user_session.c +++ b/fs/smb/server/mgmt/user_session.c @@ -325,8 +325,10 @@ struct ksmbd_session *ksmbd_session_look sess = ksmbd_session_lookup(conn, id); if (!sess && conn->binding) sess = ksmbd_session_lookup_slowpath(id); - if (sess && sess->state != SMB2_SESSION_VALID) + if (sess && sess->state != SMB2_SESSION_VALID) { + ksmbd_user_session_put(sess); sess = NULL; + } return sess; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
commit 95d7a890e4b03e198836d49d699408fd1867cb55 upstream.
The smb2_set_ea function, which handles Extended Attributes (EA), was performing buffer validation checks that incorrectly omitted the size of the null terminating character (+1 byte) for EA Name. This patch fixes the issue by explicitly adding '+ 1' to EaNameLength where the null terminator is expected to be present in the buffer, ensuring the validation accurately reflects the total required buffer size.
Cc: stable@vger.kernel.org Reported-by: Roger roger.andersen@protonmail.com Reported-by: Stanislas Polu spolu@dust.tt Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/smb2pdu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -2373,7 +2373,7 @@ static int smb2_set_ea(struct smb2_ea_in int rc = 0; unsigned int next = 0;
- if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength + + if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength + 1 + le16_to_cpu(eabuf->EaValueLength)) return -EINVAL;
@@ -2450,7 +2450,7 @@ next: break; }
- if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength + + if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength + 1 + le16_to_cpu(eabuf->EaValueLength)) { rc = -EINVAL; break;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ping Cheng pinglinux@gmail.com
commit 7953794f741e94d30df9dafaaa4c031c85b891d6 upstream.
HID_GD_Z is mapped to ABS_Z for stylus and pen in hid-input.c. But HID_GD_Z should be used to report ABS_DISTANCE for stylus and pen as described at: Documentation/input/event-codes.rst#n226
* ABS_DISTANCE:
- Used to describe the distance of a tool from an interaction surface. This event should only be emitted while the tool is hovering, meaning in close proximity of the device and while the value of the BTN_TOUCH code is 0. If the input device may be used freely in three dimensions, consider ABS_Z instead. - BTN_TOOL_<name> should be set to 1 when the tool comes into detectable proximity and set to 0 when the tool leaves detectable proximity. BTN_TOOL_<name> signals the type of tool that is currently detected by the hardware and is otherwise independent of ABS_DISTANCE and/or BTN_TOUCH.
This patch makes the correct mapping. The ABS_DISTANCE is currently not mapped by any HID usage in hid-generic driver.
Signed-off-by: Ping Cheng ping.cheng@wacom.com Cc: stable@kernel.org Signed-off-by: Jiri Kosina jkosina@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/hid-input.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
--- a/drivers/hid/hid-input.c +++ b/drivers/hid/hid-input.c @@ -878,7 +878,7 @@ static void hidinput_configure_usage(str
switch (usage->hid) { /* These usage IDs map directly to the usage codes. */ - case HID_GD_X: case HID_GD_Y: case HID_GD_Z: + case HID_GD_X: case HID_GD_Y: case HID_GD_RX: case HID_GD_RY: case HID_GD_RZ: if (field->flags & HID_MAIN_ITEM_RELATIVE) map_rel(usage->hid & 0xf); @@ -886,6 +886,22 @@ static void hidinput_configure_usage(str map_abs_clear(usage->hid & 0xf); break;
+ case HID_GD_Z: + /* HID_GD_Z is mapped to ABS_DISTANCE for stylus/pen */ + if (field->flags & HID_MAIN_ITEM_RELATIVE) { + map_rel(usage->hid & 0xf); + } else { + if (field->application == HID_DG_PEN || + field->physical == HID_DG_PEN || + field->logical == HID_DG_STYLUS || + field->physical == HID_DG_STYLUS || + field->application == HID_DG_DIGITIZER) + map_abs_clear(ABS_DISTANCE); + else + map_abs_clear(usage->hid & 0xf); + } + break; + case HID_GD_WHEEL: if (field->flags & HID_MAIN_ITEM_RELATIVE) { set_bit(REL_WHEEL, input->relbit);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sasha Finkelstein fnkl.kernel@gmail.com
commit d579478cee228bdc0029a0c12a1f6a63ea9d1c77 upstream.
Under certain conditions (more prevalent after a suspend/resume cycle), the touchscreen controller can send the "boot complete" interrupt before it actually finished booting. In those cases, attempting to read touch data resuls in a stream of "not ready" messages being read and interpreted as a touch report. Check that the response is in fact a touch report and discard it otherwise.
Reported-by: pitust piotr@stelmaszek.com Closes: https://oftc.catirclogs.org/asahi/2025-12-17#34878715; Fixes: 471a92f8a21a ("Input: apple_z2 - add a driver for Apple Z2 touchscreens") Signed-off-by: Sasha Finkelstein fnkl.kernel@gmail.com Link: https://patch.msgid.link/20251218-z2-init-fix-v1-1-48e3aa239caf@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/touchscreen/apple_z2.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/input/touchscreen/apple_z2.c +++ b/drivers/input/touchscreen/apple_z2.c @@ -21,6 +21,7 @@ #define APPLE_Z2_TOUCH_STARTED 3 #define APPLE_Z2_TOUCH_MOVED 4 #define APPLE_Z2_CMD_READ_INTERRUPT_DATA 0xEB +#define APPLE_Z2_REPLY_INTERRUPT_DATA 0xE1 #define APPLE_Z2_HBPP_CMD_BLOB 0x3001 #define APPLE_Z2_FW_MAGIC 0x5746325A #define LOAD_COMMAND_INIT_PAYLOAD 0 @@ -142,6 +143,9 @@ static int apple_z2_read_packet(struct a if (error) return error;
+ if (z2->rx_buf[0] != APPLE_Z2_REPLY_INTERRUPT_DATA) + return 0; + pkt_len = (get_unaligned_le16(z2->rx_buf + 1) + 8) & 0xfffffffc;
error = spi_read(z2->spidev, z2->rx_buf, pkt_len);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sanjay Govind sanjay.govind9@gmail.com
commit 806ec7b797adc1cc9b11535307638a55ddfb873c upstream.
Add support for various CRKD Guitar Controllers.
Signed-off-by: Sanjay Govind sanjay.govind9@gmail.com Link: https://patch.msgid.link/20251129073720.2750-2-sanjay.govind9@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/joystick/xpad.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/input/joystick/xpad.c +++ b/drivers/input/joystick/xpad.c @@ -133,6 +133,8 @@ static const struct xpad_device { } xpad_device[] = { /* Please keep this list sorted by vendor and product ID. */ { 0x0079, 0x18d4, "GPD Win 2 X-Box Controller", 0, XTYPE_XBOX360 }, + { 0x0351, 0x1000, "CRKD LP Blueberry Burst Pro Edition (Xbox)", 0, XTYPE_XBOX360 }, + { 0x0351, 0x2000, "CRKD LP Black Tribal Edition (Xbox) ", 0, XTYPE_XBOX360 }, { 0x03eb, 0xff01, "Wooting One (Legacy)", 0, XTYPE_XBOX360 }, { 0x03eb, 0xff02, "Wooting Two (Legacy)", 0, XTYPE_XBOX360 }, { 0x03f0, 0x038D, "HyperX Clutch", 0, XTYPE_XBOX360 }, /* wired */ @@ -420,6 +422,7 @@ static const struct xpad_device { { 0x3285, 0x0663, "Nacon Evol-X", 0, XTYPE_XBOXONE }, { 0x3537, 0x1004, "GameSir T4 Kaleid", 0, XTYPE_XBOX360 }, { 0x3537, 0x1010, "GameSir G7 SE", 0, XTYPE_XBOXONE }, + { 0x3651, 0x1000, "CRKD SG", 0, XTYPE_XBOX360 }, { 0x366c, 0x0005, "ByoWave Proteus Controller", MAP_SHARE_BUTTON, XTYPE_XBOXONE, FLAG_DELAY_INIT }, { 0x3767, 0x0101, "Fanatec Speedster 3 Forceshock Wheel", 0, XTYPE_XBOX }, { 0x37d7, 0x2501, "Flydigi Apex 5", 0, XTYPE_XBOX360 }, @@ -518,6 +521,7 @@ static const struct usb_device_id xpad_t */ { USB_INTERFACE_INFO('X', 'B', 0) }, /* Xbox USB-IF not-approved class */ XPAD_XBOX360_VENDOR(0x0079), /* GPD Win 2 controller */ + XPAD_XBOX360_VENDOR(0x0351), /* CRKD Controllers */ XPAD_XBOX360_VENDOR(0x03eb), /* Wooting Keyboards (Legacy) */ XPAD_XBOX360_VENDOR(0x03f0), /* HP HyperX Xbox 360 controllers */ XPAD_XBOXONE_VENDOR(0x03f0), /* HP HyperX Xbox One controllers */ @@ -578,6 +582,7 @@ static const struct usb_device_id xpad_t XPAD_XBOXONE_VENDOR(0x3285), /* Nacon Evol-X */ XPAD_XBOX360_VENDOR(0x3537), /* GameSir Controllers */ XPAD_XBOXONE_VENDOR(0x3537), /* GameSir Controllers */ + XPAD_XBOX360_VENDOR(0x3651), /* CRKD Controllers */ XPAD_XBOXONE_VENDOR(0x366c), /* ByoWave controllers */ XPAD_XBOX360_VENDOR(0x37d7), /* Flydigi Controllers */ XPAD_XBOX360_VENDOR(0x413d), /* Black Shark Green Ghost Controller */
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junjie Cao junjie.cao@intel.com
commit 248d3a73a0167dce15ba100477c3e778c4787178 upstream.
The current validation 'wire_order[i] > ARRAY_SIZE(config_pins)' allows wire_order[i] to equal ARRAY_SIZE(config_pins), which causes out-of-bounds access when used as index in 'config_pins[wire_order[i]]'.
Since config_pins has 4 elements (indices 0-3), the valid range for wire_order should be 0-3. Fix the off-by-one error by using >= instead of > in the validation check.
Signed-off-by: Junjie Cao junjie.cao@intel.com Link: https://patch.msgid.link/20251114062817.852698-1-junjie.cao@intel.com Fixes: bb76dc09ddfc ("input: ti_am33x_tsc: Order of TSC wires, made configurable") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/touchscreen/ti_am335x_tsc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/input/touchscreen/ti_am335x_tsc.c +++ b/drivers/input/touchscreen/ti_am335x_tsc.c @@ -85,7 +85,7 @@ static int titsc_config_wires(struct tit wire_order[i] = ts_dev->config_inp[i] & 0x0F; if (WARN_ON(analog_line[i] > 7)) return -EINVAL; - if (WARN_ON(wire_order[i] > ARRAY_SIZE(config_pins))) + if (WARN_ON(wire_order[i] >= ARRAY_SIZE(config_pins))) return -EINVAL; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Minseong Kim ii4gsp@gmail.com
commit e58c88f0cb2d8ed89de78f6f17409d29cfab6c5c upstream.
lkkbd_interrupt() schedules lk->tq via schedule_work(), and the work handler lkkbd_reinit() dereferences the lkkbd structure and its serio/input_dev fields.
lkkbd_disconnect() and error paths in lkkbd_connect() free the lkkbd structure without preventing the reinit work from being queued again until serio_close() returns. This can allow the work handler to run after the structure has been freed, leading to a potential use-after-free.
Use disable_work_sync() instead of cancel_work_sync() to ensure the reinit work cannot be re-queued, and call it both in lkkbd_disconnect() and in lkkbd_connect() error paths after serio_open().
Signed-off-by: Minseong Kim ii4gsp@gmail.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251212052314.16139-1-ii4gsp@gmail.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/keyboard/lkkbd.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/input/keyboard/lkkbd.c +++ b/drivers/input/keyboard/lkkbd.c @@ -670,7 +670,8 @@ static int lkkbd_connect(struct serio *s
return 0;
- fail3: serio_close(serio); + fail3: disable_work_sync(&lk->tq); + serio_close(serio); fail2: serio_set_drvdata(serio, NULL); fail1: input_free_device(input_dev); kfree(lk); @@ -684,6 +685,8 @@ static void lkkbd_disconnect(struct seri { struct lkkbd *lk = serio_get_drvdata(serio);
+ disable_work_sync(&lk->tq); + input_get_device(lk->dev); input_unregister_device(lk->dev); serio_close(serio);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Duoming Zhou duoming@zju.edu.cn
commit bf40644ef8c8a288742fa45580897ed0e0289474 upstream.
The dev3_register_work delayed work item is initialized within alps_reconnect() and scheduled upon receipt of the first bare PS/2 packet from an external PS/2 device connected to the ALPS touchpad. During device detachment, the original implementation calls flush_workqueue() in psmouse_disconnect() to ensure completion of dev3_register_work. However, the flush_workqueue() in psmouse_disconnect() only blocks and waits for work items that were already queued to the workqueue prior to its invocation. Any work items submitted after flush_workqueue() is called are not included in the set of tasks that the flush operation awaits. This means that after flush_workqueue() has finished executing, the dev3_register_work could still be scheduled. Although the psmouse state is set to PSMOUSE_CMD_MODE in psmouse_disconnect(), the scheduling of dev3_register_work remains unaffected.
The race condition can occur as follows:
CPU 0 (cleanup path) | CPU 1 (delayed work) psmouse_disconnect() | psmouse_set_state() | flush_workqueue() | alps_report_bare_ps2_packet() alps_disconnect() | psmouse_queue_work() kfree(priv); // FREE | alps_register_bare_ps2_mouse() | priv = container_of(work...); // USE | priv->dev3 // USE
Add disable_delayed_work_sync() in alps_disconnect() to ensure that dev3_register_work is properly canceled and prevented from executing after the alps_data structure has been deallocated.
This bug is identified by static analysis.
Fixes: 04aae283ba6a ("Input: ALPS - do not mix trackstick and external PS/2 mouse data") Cc: stable@kernel.org Signed-off-by: Duoming Zhou duoming@zju.edu.cn Link: https://patch.msgid.link/b57b0a9ccca51a3f06be141bfc02b9ffe69d1845.1765939397... Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/mouse/alps.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/input/mouse/alps.c +++ b/drivers/input/mouse/alps.c @@ -2975,6 +2975,7 @@ static void alps_disconnect(struct psmou
psmouse_reset(psmouse); timer_shutdown_sync(&priv->timer); + disable_delayed_work_sync(&priv->dev3_register_work); if (priv->dev2) input_unregister_device(priv->dev2); if (!IS_ERR_OR_NULL(priv->dev3))
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoffer Sandberg cs@tuxedo.de
commit aed3716db7fff74919cc5775ca3a80c8bb246489 upstream.
The device occasionally wakes up from suspend with missing input on the internal keyboard and the following suspend attempt results in an instant wake-up. The quirks fix both issues for this device.
Signed-off-by: Christoffer Sandberg cs@tuxedo.de Signed-off-by: Werner Sembach wse@tuxedocomputers.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251124203336.64072-1-wse@tuxedocomputers.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/serio/i8042-acpipnpio.h | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/input/serio/i8042-acpipnpio.h +++ b/drivers/input/serio/i8042-acpipnpio.h @@ -1169,6 +1169,13 @@ static const struct dmi_system_id i8042_ .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS | SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) }, + { + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "X5KK45xS_X5SP45xS"), + }, + .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS | + SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) + }, /* * A lot of modern Clevo barebones have touchpad and/or keyboard issues * after suspend fixable with the forcenorestore quirk.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
commit 204c8f77e8d4a3006f8abe40331f221a597ce608 upstream.
xfs_qm_quotacheck_dqadjust acquired the dquot through xfs_qm_dqget, which means it owns a reference and holds q_qlock. Both need to be dropped on an error exit.
Cc: stable@vger.kernel.org # v6.13 Fixes: ca378189fdfa ("xfs: convert quotacheck to attach dquot buffers") Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Carlos Maiolino cem@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_qm.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -1318,7 +1318,7 @@ xfs_qm_quotacheck_dqadjust(
error = xfs_dquot_attach_buf(NULL, dqp); if (error) - return error; + goto out_unlock;
trace_xfs_dqadjust(dqp);
@@ -1348,8 +1348,9 @@ xfs_qm_quotacheck_dqadjust( }
dqp->q_flags |= XFS_DQFLAG_DIRTY; +out_unlock: xfs_qm_dqput(dqp); - return 0; + return error; }
/*
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
commit 3e54d3b4a8437b6783d4145c86962a2aa51022f3 upstream.
Commit 2603be9e8167 ("can: gs_usb: gs_can_open(): improve error handling") added missing error handling to the gs_can_open() function.
The driver uses 2 USB anchors to track the allocated URBs: the TX URBs in struct gs_can::tx_submitted for each netdev and the RX URBs in struct gs_usb::rx_submitted for the USB device. gs_can_open() allocates the RX URBs, while TX URBs are allocated during gs_can_start_xmit().
The cleanup in gs_can_open() kills all anchored dev->tx_submitted URBs (which is not necessary since the netdev is not yet registered), but misses the parent->rx_submitted URBs.
Fix the problem by killing the rx_submitted instead of the tx_submitted.
Fixes: 2603be9e8167 ("can: gs_usb: gs_can_open(): improve error handling") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251210-gs_usb-fix-error-handling-v1-1-d6a5a03f10b... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/gs_usb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -1074,7 +1074,7 @@ out_usb_free_urb: usb_free_urb(urb); out_usb_kill_anchored_urbs: if (!parent->active_channels) { - usb_kill_anchored_urbs(&dev->tx_submitted); + usb_kill_anchored_urbs(&parent->rx_submitted);
if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) gs_usb_timestamp_stop(parent);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kartik Rajput kkartik@nvidia.com
commit c87f820bc4748fdd4d50969e8930cd88d1b61582 upstream.
On Tegra platforms using ACPI, the SMCCC driver already registers the SoC device. This makes the registration performed by the Tegra fuse driver redundant.
When booted via ACPI, skip registering the SoC device and suppress printing SKU information from the Tegra fuse driver, as this information is already provided by the SMCCC driver.
Fixes: 972167c69080 ("soc/tegra: fuse: Add ACPI support for Tegra194 and Tegra234") Cc: stable@vger.kernel.org Signed-off-by: Kartik Rajput kkartik@nvidia.com Signed-off-by: Thierry Reding treding@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/soc/tegra/fuse/fuse-tegra.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/soc/tegra/fuse/fuse-tegra.c +++ b/drivers/soc/tegra/fuse/fuse-tegra.c @@ -182,8 +182,6 @@ static int tegra_fuse_probe(struct platf }
fuse->soc->init(fuse); - tegra_fuse_print_sku_info(&tegra_sku_info); - tegra_soc_device_register();
err = tegra_fuse_add_lookups(fuse); if (err)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yongxin Liu yongxin.liu@windriver.com
commit 611cf41ef6ac8301d23daadd8e78b013db0c5071 upstream.
The intel_pmc_ipc() function uses ACPI_ALLOCATE_BUFFER to allocate memory for the ACPI evaluation result but never frees it, causing a 192-byte memory leak on each call.
This leak is triggered during network interface initialization when the stmmac driver calls intel_mac_finish() -> intel_pmc_ipc().
unreferenced object 0xffff96a848d6ea80 (size 192): comm "dhcpcd", pid 541, jiffies 4294684345 hex dump (first 32 bytes): 04 00 00 00 05 00 00 00 98 ea d6 48 a8 96 ff ff ...........H.... 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................ backtrace (crc b1564374): kmemleak_alloc+0x2d/0x40 __kmalloc_noprof+0x2fa/0x730 acpi_ut_initialize_buffer+0x83/0xc0 acpi_evaluate_object+0x29a/0x2f0 intel_pmc_ipc+0xfd/0x170 intel_mac_finish+0x168/0x230 stmmac_mac_finish+0x3d/0x50 phylink_major_config+0x22b/0x5b0 phylink_mac_initial_config.constprop.0+0xf1/0x1b0 phylink_start+0x8e/0x210 __stmmac_open+0x12c/0x2b0 stmmac_open+0x23c/0x380 __dev_open+0x11d/0x2c0 __dev_change_flags+0x1d2/0x250 netif_change_flags+0x2b/0x70 dev_change_flags+0x40/0xb0
Add __free(kfree) for ACPI object to properly release the allocated buffer.
Cc: stable@vger.kernel.org Fixes: 7e2f7e25f6ff ("arch: x86: add IPC mailbox accessor function and add SoC register access") Signed-off-by: Yongxin Liu yongxin.liu@windriver.com Link: https://patch.msgid.link/20251128102437.3412891-2-yongxin.liu@windriver.com Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/platform_data/x86/intel_pmc_ipc.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/platform_data/x86/intel_pmc_ipc.h +++ b/include/linux/platform_data/x86/intel_pmc_ipc.h @@ -9,6 +9,7 @@ #ifndef INTEL_PMC_IPC_H #define INTEL_PMC_IPC_H #include <linux/acpi.h> +#include <linux/cleanup.h>
#define IPC_SOC_REGISTER_ACCESS 0xAA #define IPC_SOC_SUB_CMD_READ 0x00 @@ -48,7 +49,6 @@ static inline int intel_pmc_ipc(struct p {.type = ACPI_TYPE_INTEGER,}, }; struct acpi_object_list arg_list = { PMC_IPCS_PARAM_COUNT, params }; - union acpi_object *obj; int status;
if (!ipc_cmd || !rbuf) @@ -72,7 +72,7 @@ static inline int intel_pmc_ipc(struct p if (ACPI_FAILURE(status)) return -ENODEV;
- obj = buffer.pointer; + union acpi_object *obj __free(kfree) = buffer.pointer;
if (obj && obj->type == ACPI_TYPE_PACKAGE && obj->package.count == VALID_IPC_RESPONSE) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pengjie Zhang zhangpengjie2@huawei.com
commit f103fa127c93016bcd89b05d8e11dc1a84f6990d upstream.
Local variable 'ret' in acpi_pcc_address_space_setup() is currently declared as 'static'. This can lead to race conditions in a multithreaded environment.
Remove the 'static' qualifier to ensure that 'ret' will be allocated directly on the stack as a local variable.
Fixes: a10b1c99e2dc ("ACPI: PCC: Setup PCC Opregion handler only if platform interrupt is available") Signed-off-by: Pengjie Zhang zhangpengjie2@huawei.com Reviewed-by: Sudeep Holla sudeep.holla@arm.com Acked-by: lihuisong@huawei.com Cc: 6.2+ stable@vger.kernel.org # 6.2+ [ rjw: Changelog edits ] Link: https://patch.msgid.link/20251210132634.2050033-1-zhangpengjie2@huawei.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/acpi_pcc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/acpi/acpi_pcc.c +++ b/drivers/acpi/acpi_pcc.c @@ -52,7 +52,7 @@ acpi_pcc_address_space_setup(acpi_handle struct pcc_data *data; struct acpi_pcc_info *ctx = handler_context; struct pcc_mbox_chan *pcc_chan; - static acpi_status ret; + acpi_status ret;
data = kzalloc(sizeof(*data), GFP_KERNEL); if (!data)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pengjie Zhang zhangpengjie2@huawei.com
commit 6ea3a44cef28add2d93b1ef119d84886cb1e3c9b upstream.
The current implementation overlooks the 'guaranteed_perf' register in this check.
If the Guaranteed Performance register is located in the PCC subspace, the function currently attempts to read it without acquiring the lock and without sending the CMD_READ doorbell to the firmware. This can result in reading stale data.
Fixes: 29523f095397 ("ACPI / CPPC: Add support for guaranteed performance") Signed-off-by: Pengjie Zhang zhangpengjie2@huawei.com Cc: 4.20+ stable@vger.kernel.org # 4.20+ [ rjw: Subject and changelog edits ] Link: https://patch.msgid.link/20251210132227.1988380-1-zhangpengjie2@huawei.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/cppc_acpi.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -1366,7 +1366,8 @@ int cppc_get_perf_caps(int cpunum, struc /* Are any of the regs PCC ?*/ if (CPC_IN_PCC(highest_reg) || CPC_IN_PCC(lowest_reg) || CPC_IN_PCC(lowest_non_linear_reg) || CPC_IN_PCC(nominal_reg) || - CPC_IN_PCC(low_freq_reg) || CPC_IN_PCC(nom_freq_reg)) { + CPC_IN_PCC(low_freq_reg) || CPC_IN_PCC(nom_freq_reg) || + CPC_IN_PCC(guaranteed_reg)) { if (pcc_ss_id < 0) { pr_debug("Invalid pcc_ss_id\n"); return -ENODEV;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@csgroup.eu
commit 1417927df8049a0194933861e9b098669a95c762 upstream.
Commit fc96ec826bce ("spi: fsl-cpm: Use 16 bit mode for large transfers with even size") failed to make sure that the size is really even before switching to 16 bit mode. Until recently the problem went unnoticed because kernfs uses a pre-allocated bounce buffer of size PAGE_SIZE for reading EEPROM.
But commit 8ad6249c51d0 ("eeprom: at25: convert to spi-mem API") introduced an additional dynamically allocated bounce buffer whose size is exactly the size of the transfer, leading to a buffer overrun in the fsl-cpm driver when that size is odd.
Add the missing length parity verification and remain in 8 bit mode when the length is not even.
Fixes: fc96ec826bce ("spi: fsl-cpm: Use 16 bit mode for large transfers with even size") Cc: stable@vger.kernel.org Closes: https://lore.kernel.org/all/638496dd-ec60-4e53-bad7-eb657f67d580@csgroup.eu/ Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Reviewed-by: Sverdlin Alexander alexander.sverdlin@siemens.com Link: https://patch.msgid.link/3c4d81c3923c93f95ec56702a454744a4bad3cfc.1763627618... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/spi/spi-fsl-spi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/spi/spi-fsl-spi.c +++ b/drivers/spi/spi-fsl-spi.c @@ -335,7 +335,7 @@ static int fsl_spi_prepare_message(struc if (t->bits_per_word == 16 || t->bits_per_word == 32) t->bits_per_word = 8; /* pretend its 8 bits */ if (t->bits_per_word == 8 && t->len >= 256 && - (mpc8xxx_spi->flags & SPI_CPM1)) + !(t->len & 1) && (mpc8xxx_spi->flags & SPI_CPM1)) t->bits_per_word = 16; } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jared Kangas jkangas@redhat.com
commit d3ecb12e2e04ce53c95f933c462f2d8b150b965b upstream.
MMC_SDHCI_ESDHC_IMX requires ARCH_MXC despite also being used on ARCH_S32, which results in unmet dependencies when compiling strictly for ARCH_S32. Resolve this by adding ARCH_S32 as an alternative to ARCH_MXC in the driver's dependencies.
Fixes: 5c4f00627c9a ("mmc: sdhci-esdhc-imx: add NXP S32G2 support") Cc: stable@bvger.kernel.org Signed-off-by: Jared Kangas jkangas@redhat.com Reviewed-by: Haibo Chen haibo.chen@nxp.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/Kconfig | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/mmc/host/Kconfig +++ b/drivers/mmc/host/Kconfig @@ -315,14 +315,14 @@ config MMC_SDHCI_ESDHC_MCF
config MMC_SDHCI_ESDHC_IMX tristate "SDHCI support for the Freescale eSDHC/uSDHC i.MX controller" - depends on ARCH_MXC || COMPILE_TEST + depends on ARCH_MXC || ARCH_S32 || COMPILE_TEST depends on MMC_SDHCI_PLTFM depends on OF select MMC_SDHCI_IO_ACCESSORS select MMC_CQHCI help This selects the Freescale eSDHC/uSDHC controller support - found on i.MX25, i.MX35 i.MX5x and i.MX6x. + found on i.MX25, i.MX35, i.MX5x, i.MX6x, and S32G.
If you have a controller with this interface, say Y or M here.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sai Krishna Potthuri sai.krishna.potthuri@amd.com
commit a9c4c9085ec8ce3ce01be21b75184789e74f5f19 upstream.
On Xilinx/AMD platforms, the CD stable bit take slightly longer than one second(about an additional 100ms) to assert after a host controller reset. Although no functional failure observed with the existing one second delay but to ensure reliable initialization, increase the CD stable timeout to 2 seconds.
Fixes: e251709aaddb ("mmc: sdhci-of-arasan: Ensure CD logic stabilization before power-up") Cc: stable@vger.kernel.org Signed-off-by: Sai Krishna Potthuri sai.krishna.potthuri@amd.com Acked-by: Adrian Hunter adrian.hunter@intel.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-of-arasan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mmc/host/sdhci-of-arasan.c +++ b/drivers/mmc/host/sdhci-of-arasan.c @@ -99,7 +99,7 @@ #define HIWORD_UPDATE(val, mask, shift) \ ((val) << (shift) | (mask) << ((shift) + 16))
-#define CD_STABLE_TIMEOUT_US 1000000 +#define CD_STABLE_TIMEOUT_US 2000000 #define CD_STABLE_MAX_SLEEP_US 10
/**
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrew Jeffery andrew@codeconstruct.com.au
commit ed724ea1b82a800af4704311cb89e5ef1b4ea7ac upstream.
Enable use of common SDHCI-related properties such as sdhci-caps-mask as found in the AST2600 EVB DTS.
Cc: stable@vger.kernel.org # v6.2+ Signed-off-by: Andrew Jeffery andrew@codeconstruct.com.au Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml +++ b/Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml @@ -41,7 +41,7 @@ properties: patternProperties: "^sdhci@[0-9a-f]+$": type: object - $ref: mmc-controller.yaml + $ref: sdhci-common.yaml unevaluatedProperties: false
properties:
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shaurya Rane ssrane_b23@ee.vjti.ac.in
commit 188e0fa5a679570ea35474575e724d8211423d17 upstream.
prp_get_untagged_frame() calls __pskb_copy() to create frame->skb_std but doesn't check if the allocation failed. If __pskb_copy() returns NULL, skb_clone() is called with a NULL pointer, causing a crash:
Oops: general protection fault, probably for non-canonical address 0xdffffc000000000f: 0000 [#1] SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000078-0x000000000000007f] CPU: 0 UID: 0 PID: 5625 Comm: syz.1.18 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:skb_clone+0xd7/0x3a0 net/core/skbuff.c:2041 Code: 03 42 80 3c 20 00 74 08 4c 89 f7 e8 23 29 05 f9 49 83 3e 00 0f 85 a0 01 00 00 e8 94 dd 9d f8 48 8d 6b 7e 49 89 ee 49 c1 ee 03 <43> 0f b6 04 26 84 c0 0f 85 d1 01 00 00 44 0f b6 7d 00 41 83 e7 0c RSP: 0018:ffffc9000d00f200 EFLAGS: 00010207 RAX: ffffffff892235a1 RBX: 0000000000000000 RCX: ffff88803372a480 RDX: 0000000000000000 RSI: 0000000000000820 RDI: 0000000000000000 RBP: 000000000000007e R08: ffffffff8f7d0f77 R09: 1ffffffff1efa1ee R10: dffffc0000000000 R11: fffffbfff1efa1ef R12: dffffc0000000000 R13: 0000000000000820 R14: 000000000000000f R15: ffff88805144cc00 FS: 0000555557f6d500(0000) GS:ffff88808d72f000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000555581d35808 CR3: 000000005040e000 CR4: 0000000000352ef0 Call Trace: <TASK> hsr_forward_do net/hsr/hsr_forward.c:-1 [inline] hsr_forward_skb+0x1013/0x2860 net/hsr/hsr_forward.c:741 hsr_handle_frame+0x6ce/0xa70 net/hsr/hsr_slave.c:84 __netif_receive_skb_core+0x10b9/0x4380 net/core/dev.c:5966 __netif_receive_skb_one_core net/core/dev.c:6077 [inline] __netif_receive_skb+0x72/0x380 net/core/dev.c:6192 netif_receive_skb_internal net/core/dev.c:6278 [inline] netif_receive_skb+0x1cb/0x790 net/core/dev.c:6337 tun_rx_batched+0x1b9/0x730 drivers/net/tun.c:1485 tun_get_user+0x2b65/0x3e90 drivers/net/tun.c:1953 tun_chr_write_iter+0x113/0x200 drivers/net/tun.c:1999 new_sync_write fs/read_write.c:593 [inline] vfs_write+0x5c9/0xb30 fs/read_write.c:686 ksys_write+0x145/0x250 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f0449f8e1ff Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 f9 92 02 00 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 4c 93 02 00 48 RSP: 002b:00007ffd7ad94c90 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007f044a1e5fa0 RCX: 00007f0449f8e1ff RDX: 000000000000003e RSI: 0000200000000500 RDI: 00000000000000c8 RBP: 00007ffd7ad94d20 R08: 0000000000000000 R09: 0000000000000000 R10: 000000000000003e R11: 0000000000000293 R12: 0000000000000001 R13: 00007f044a1e5fa0 R14: 00007f044a1e5fa0 R15: 0000000000000003 </TASK>
Add a NULL check immediately after __pskb_copy() to handle allocation failures gracefully.
Reported-by: syzbot+2fa344348a579b779e05@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2fa344348a579b779e05 Fixes: f266a683a480 ("net/hsr: Better frame dispatch") Cc: stable@vger.kernel.org Signed-off-by: Shaurya Rane ssrane_b23@ee.vjti.ac.in Reviewed-by: Felix Maurer fmaurer@redhat.com Tested-by: Felix Maurer fmaurer@redhat.com Link: https://patch.msgid.link/20251129093718.25320-1-ssrane_b23@ee.vjti.ac.in Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/hsr/hsr_forward.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/hsr/hsr_forward.c +++ b/net/hsr/hsr_forward.c @@ -205,6 +205,8 @@ struct sk_buff *prp_get_untagged_frame(s __pskb_copy(frame->skb_prp, skb_headroom(frame->skb_prp), GFP_ATOMIC); + if (!frame->skb_std) + return NULL; } else { /* Unexpected */ WARN_ONCE(1, "%s:%d: Unexpected frame received (port_src %s)\n",
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit c56a12c71ad38f381105f6e5036dede64ad2dfee upstream.
For some mysterious reasons the GCC 8 and 9 preprocessor manages to sporadically fumble _ASM_BYTES(0x0f, 0x0b):
$ grep ".byte[ ]*0x0f" defconfig-build/drivers/net/wireless/realtek/rtlwifi/base.s 1: .byte0x0f,0x0b ; 1: .byte 0x0f,0x0b ;
which makes the assembler upset and all that. While there are more _ASM_BYTES() users (notably the NOP instructions), those don't seem affected. Therefore replace the offending ASM_UD2 with one using the ud2 mnemonic.
Reported-by: Jean Delvare jdelvare@suse.de Suggested-by: Uros Bizjak ubizjak@gmail.com Fixes: 85a2d4a890dc ("x86,ibt: Use UDB instead of 0xEA") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://patch.msgid.link/20251218104659.GT3911114@noisy.programming.kicks-as... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/bug.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/include/asm/bug.h +++ b/arch/x86/include/asm/bug.h @@ -10,7 +10,7 @@ /* * Despite that some emulators terminate on UD2, we use it for WARN(). */ -#define ASM_UD2 _ASM_BYTES(0x0f, 0x0b) +#define ASM_UD2 __ASM_FORM(ud2) #define INSN_UD2 0x0b0f #define LEN_UD2 2
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit 0edc78b82bea85e1b2165d8e870a5c3535919695 upstream.
Luigi reported that retriggering a posted MSI interrupt does not work correctly.
The reason is that the retrigger happens at the vector domain by sending an IPI to the actual vector on the target CPU. That works correctly exactly once because the posted MSI interrupt chip does not issue an EOI as that's only required for the posted MSI notification vector itself.
As a consequence the vector becomes stale in the ISR, which not only affects this vector but also any lower priority vector in the affected APIC because the ISR bit is not cleared.
Luigi proposed to set the vector in the remap PIR bitmap and raise the posted MSI notification vector. That works, but that still does not cure a related problem:
If there is ever a stray interrupt on such a vector, then the related APIC ISR bit becomes stale due to the lack of EOI as described above. Unlikely to happen, but if it happens it's not debuggable at all.
So instead of playing games with the PIR, this can be actually solved for both cases by:
1) Keeping track of the posted interrupt vector handler state
2) Implementing a posted MSI specific irq_ack() callback which checks that state. If the posted vector handler is inactive it issues an EOI, otherwise it delegates that to the posted handler.
This is correct versus affinity changes and concurrent events on the posted vector as the actual handler invocation is serialized through the interrupt descriptor lock.
Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs") Reported-by: Luigi Rizzo lrizzo@google.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Luigi Rizzo lrizzo@google.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251125214631.044440658@linutronix.de Closes: https://lore.kernel.org/lkml/20251124104836.3685533-1-lrizzo@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/irq_remapping.h | 7 +++++++ arch/x86/kernel/irq.c | 23 +++++++++++++++++++++++ drivers/iommu/intel/irq_remapping.c | 8 ++++---- 3 files changed, 34 insertions(+), 4 deletions(-)
--- a/arch/x86/include/asm/irq_remapping.h +++ b/arch/x86/include/asm/irq_remapping.h @@ -87,4 +87,11 @@ static inline void panic_if_irq_remap(co }
#endif /* CONFIG_IRQ_REMAP */ + +#ifdef CONFIG_X86_POSTED_MSI +void intel_ack_posted_msi_irq(struct irq_data *irqd); +#else +#define intel_ack_posted_msi_irq NULL +#endif + #endif /* __X86_IRQ_REMAPPING_H */ --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -396,6 +396,7 @@ DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm
/* Posted Interrupt Descriptors for coalesced MSIs to be posted */ DEFINE_PER_CPU_ALIGNED(struct pi_desc, posted_msi_pi_desc); +static DEFINE_PER_CPU_CACHE_HOT(bool, posted_msi_handler_active);
void intel_posted_msi_init(void) { @@ -413,6 +414,25 @@ void intel_posted_msi_init(void) this_cpu_write(posted_msi_pi_desc.ndst, destination); }
+void intel_ack_posted_msi_irq(struct irq_data *irqd) +{ + irq_move_irq(irqd); + + /* + * Handle the rare case that irq_retrigger() raised the actual + * assigned vector on the target CPU, which means that it was not + * invoked via the posted MSI handler below. In that case APIC EOI + * is required as otherwise the ISR entry becomes stale and lower + * priority interrupts are never going to be delivered after that. + * + * If the posted handler invoked the device interrupt handler then + * the EOI would be premature because it would acknowledge the + * posted vector. + */ + if (unlikely(!__this_cpu_read(posted_msi_handler_active))) + apic_eoi(); +} + static __always_inline bool handle_pending_pir(unsigned long *pir, struct pt_regs *regs) { unsigned long pir_copy[NR_PIR_WORDS]; @@ -445,6 +465,8 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi
pid = this_cpu_ptr(&posted_msi_pi_desc);
+ /* Mark the handler active for intel_ack_posted_msi_irq() */ + __this_cpu_write(posted_msi_handler_active, true); inc_irq_stat(posted_msi_notification_count); irq_enter();
@@ -473,6 +495,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi
apic_eoi(); irq_exit(); + __this_cpu_write(posted_msi_handler_active, false); set_irq_regs(old_regs); } #endif /* X86_POSTED_MSI */ --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1303,17 +1303,17 @@ static struct irq_chip intel_ir_chip = { * irq_enter(); * handle_edge_irq() * irq_chip_ack_parent() - * irq_move_irq(); // No EOI + * intel_ack_posted_msi_irq(); // No EOI * handle_irq_event() * driver_handler() * handle_edge_irq() * irq_chip_ack_parent() - * irq_move_irq(); // No EOI + * intel_ack_posted_msi_irq(); // No EOI * handle_irq_event() * driver_handler() * handle_edge_irq() * irq_chip_ack_parent() - * irq_move_irq(); // No EOI + * intel_ack_posted_msi_irq(); // No EOI * handle_irq_event() * driver_handler() * apic_eoi() @@ -1322,7 +1322,7 @@ static struct irq_chip intel_ir_chip = { */ static struct irq_chip intel_ir_chip_post_msi = { .name = "INTEL-IR-POST", - .irq_ack = irq_move_irq, + .irq_ack = intel_ack_posted_msi_irq, .irq_set_affinity = intel_ir_set_affinity, .irq_compose_msi_msg = intel_ir_compose_msi_msg, .irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity,
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yongxin Liu yongxin.liu@windriver.com
[ Upstream commit c8161e5304abb26e6c0bec6efc947992500fa6c5 ]
Zero can be a valid value of num_records. For example, on Intel Atom x6425RE, only x87 and SSE are supported (features 0, 1), and fpu_user_cfg.max_features is 3. The for_each_extended_xfeature() loop only iterates feature 2, which is not enabled, so num_records = 0. This is valid and should not cause core dump failure.
The issue is that dump_xsave_layout_desc() returns 0 for both genuine errors (dump_emit() failure) and valid cases (no extended features). Use negative return values for errors and only abort on genuine failures.
Fixes: ba386777a30b ("x86/elf: Add a new FPU buffer layout info to x86 core files") Signed-off-by: Yongxin Liu yongxin.liu@windriver.com Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://patch.msgid.link/20251210000219.4094353-2-yongxin.liu@windriver.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/fpu/xstate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 28e4fd65c9da..5f54a207ace4 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -1945,7 +1945,7 @@ static int dump_xsave_layout_desc(struct coredump_params *cprm) };
if (!dump_emit(cprm, &xc, sizeof(xc))) - return 0; + return -1;
num_records++; } @@ -1983,7 +1983,7 @@ int elf_coredump_extra_notes_write(struct coredump_params *cprm) return 1;
num_records = dump_xsave_layout_desc(cprm); - if (!num_records) + if (num_records < 0) return 1;
/* Total size should be equal to the number of records */
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tal Zussman tz2294@columbia.edu
[ Upstream commit 8b62e64e6d30fa047b3aefb1a36e1f80c8acb3d2 ]
When the TLB_REMOTE_WRONG_CPU enum was introduced for the tlb_flush tracepoint, the enum was not exported to user-space. Add it to the appropriate macro definition to enable parsing by userspace tools, as per:
Link: https://lore.kernel.org/all/20150403013802.220157513@goodmis.org
[ mingo: Capitalize IPI, etc. ]
Fixes: 2815a56e4b72 ("x86/mm/tlb: Add tracepoint for TLB flush IPI to stale CPU") Signed-off-by: Tal Zussman tz2294@columbia.edu Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Rik van Riel riel@surriel.com Link: https://patch.msgid.link/20251212-tlb-trace-fix-v2-1-d322e0ad9b69@columbia.e... Signed-off-by: Sasha Levin sashal@kernel.org --- include/trace/events/tlb.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/trace/events/tlb.h b/include/trace/events/tlb.h index b4d8e7dc38f8..fb8369511685 100644 --- a/include/trace/events/tlb.h +++ b/include/trace/events/tlb.h @@ -12,8 +12,9 @@ EM( TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" ) \ EM( TLB_REMOTE_SHOOTDOWN, "remote shootdown" ) \ EM( TLB_LOCAL_SHOOTDOWN, "local shootdown" ) \ - EM( TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" ) \ - EMe( TLB_REMOTE_SEND_IPI, "remote ipi send" ) + EM( TLB_LOCAL_MM_SHOOTDOWN, "local MM shootdown" ) \ + EM( TLB_REMOTE_SEND_IPI, "remote IPI send" ) \ + EMe( TLB_REMOTE_WRONG_CPU, "remote wrong CPU" )
/* * First define the enums in TLB_FLUSH_REASON to be exported to userspace
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chancel Liu chancel.liu@nxp.com
[ Upstream commit 9f4d0899efd9892fc7514c9488270e1bb7dedd2b ]
If SAI works in master mode it will generate clocks for external codec from audio PLLs. Thus sample rates should be constrained according to audio PLL clocks. While SAI works in slave mode which means clocks are generated externally then constraints are independent of audio PLLs.
Fixes: 4edc98598be4 ("ASoC: fsl_sai: Add sample rate constraint") Signed-off-by: Chancel Liu chancel.liu@nxp.com Link: https://patch.msgid.link/20251210062109.2577735-1-chancel.liu@nxp.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/fsl/fsl_sai.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c index 72bfc91e21b9..86730c214914 100644 --- a/sound/soc/fsl/fsl_sai.c +++ b/sound/soc/fsl/fsl_sai.c @@ -917,8 +917,14 @@ static int fsl_sai_startup(struct snd_pcm_substream *substream, tx ? sai->dma_params_tx.maxburst : sai->dma_params_rx.maxburst);
- ret = snd_pcm_hw_constraint_list(substream->runtime, 0, - SNDRV_PCM_HW_PARAM_RATE, &sai->constraint_rates); + if (sai->is_consumer_mode[tx]) + ret = snd_pcm_hw_constraint_list(substream->runtime, 0, + SNDRV_PCM_HW_PARAM_RATE, + &fsl_sai_rate_constraints); + else + ret = snd_pcm_hw_constraint_list(substream->runtime, 0, + SNDRV_PCM_HW_PARAM_RATE, + &sai->constraint_rates);
return ret; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haotian Zhang vulab@iscas.ac.cn
[ Upstream commit 2a03b40deacbd293ac9aed0f9b11197dad54fe5f ]
When vxpocket_config() fails, vxpocket_probe() returns the error code directly without freeing the sound card resources allocated by snd_card_new(), which leads to a memory leak.
Add proper error handling to free the sound card and clear the allocation bit when vxpocket_config() fails.
Fixes: 15b99ac17295 ("[PATCH] pcmcia: add return value to _config() functions") Signed-off-by: Haotian Zhang vulab@iscas.ac.cn Link: https://patch.msgid.link/20251215042652.695-1-vulab@iscas.ac.cn Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pcmcia/vx/vxpocket.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/sound/pcmcia/vx/vxpocket.c b/sound/pcmcia/vx/vxpocket.c index 2e09f2a513a6..9a5c9aa8eec4 100644 --- a/sound/pcmcia/vx/vxpocket.c +++ b/sound/pcmcia/vx/vxpocket.c @@ -284,7 +284,13 @@ static int vxpocket_probe(struct pcmcia_device *p_dev)
vxp->p_dev = p_dev;
- return vxpocket_config(p_dev); + err = vxpocket_config(p_dev); + if (err < 0) { + card_alloc &= ~(1 << i); + snd_card_free(card); + return err; + } + return 0; }
static void vxpocket_detach(struct pcmcia_device *link)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haotian Zhang vulab@iscas.ac.cn
[ Upstream commit 5032347c04ba7ff9ba878f262e075d745c06a2a8 ]
When pdacf_config() fails, snd_pdacf_probe() returns the error code directly without freeing the sound card resources allocated by snd_card_new(), which leads to a memory leak.
Add proper error handling to free the sound card and clear the card list entry when pdacf_config() fails.
Fixes: 15b99ac17295 ("[PATCH] pcmcia: add return value to _config() functions") Suggested-by: Takashi Iwai tiwai@suse.de Signed-off-by: Haotian Zhang vulab@iscas.ac.cn Link: https://patch.msgid.link/20251215090433.211-1-vulab@iscas.ac.cn Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pcmcia/pdaudiocf/pdaudiocf.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/sound/pcmcia/pdaudiocf/pdaudiocf.c b/sound/pcmcia/pdaudiocf/pdaudiocf.c index 13419837dfb7..a3291e626440 100644 --- a/sound/pcmcia/pdaudiocf/pdaudiocf.c +++ b/sound/pcmcia/pdaudiocf/pdaudiocf.c @@ -131,7 +131,13 @@ static int snd_pdacf_probe(struct pcmcia_device *link) link->config_index = 1; link->config_regs = PRESENT_OPTION;
- return pdacf_config(link); + err = pdacf_config(link); + if (err < 0) { + card_list[i] = NULL; + snd_card_free(card); + return err; + } + return 0; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shipei Qu qu@darknavy.com
[ Upstream commit 5526c1c6ba1d0913c7dfcbbd6fe1744ea7c55f1e ]
get_meter_levels_from_urb() parses the 64-byte meter packets sent by the device and fills the per-channel arrays meter_level[], comp_level[] and master_level[] in struct snd_us16x08_meter_store.
Currently the function derives the channel index directly from the meter packet (MUB2(meter_urb, s) - 1) and uses it to index those arrays without validating the range. If the packet contains a negative or out-of-range channel number, the driver may write past the end of these arrays.
Introduce a local channel variable and validate it before updating the arrays. We reject negative indices, limit meter_level[] and comp_level[] to SND_US16X08_MAX_CHANNELS, and guard master_level[] updates with ARRAY_SIZE(master_level).
Fixes: d2bb390a2081 ("ALSA: usb-audio: Tascam US-16x08 DSP mixer quirk") Reported-by: DARKNAVY (@DarkNavyOrg) vr@darknavy.com Closes: https://lore.kernel.org/tencent_21C112743C44C1A2517FF219@qq.com Signed-off-by: Shipei Qu qu@darknavy.com Link: https://patch.msgid.link/20251217024630.59576-1-qu@darknavy.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/mixer_us16x08.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/sound/usb/mixer_us16x08.c b/sound/usb/mixer_us16x08.c index 1c5712c31f5e..f9df40730eff 100644 --- a/sound/usb/mixer_us16x08.c +++ b/sound/usb/mixer_us16x08.c @@ -655,17 +655,25 @@ static void get_meter_levels_from_urb(int s, u8 *meter_urb) { int val = MUC2(meter_urb, s) + (MUC3(meter_urb, s) << 8); + int ch = MUB2(meter_urb, s) - 1; + + if (ch < 0) + return;
if (MUA0(meter_urb, s) == 0x61 && MUA1(meter_urb, s) == 0x02 && MUA2(meter_urb, s) == 0x04 && MUB0(meter_urb, s) == 0x62) { - if (MUC0(meter_urb, s) == 0x72) - store->meter_level[MUB2(meter_urb, s) - 1] = val; - if (MUC0(meter_urb, s) == 0xb2) - store->comp_level[MUB2(meter_urb, s) - 1] = val; + if (ch < SND_US16X08_MAX_CHANNELS) { + if (MUC0(meter_urb, s) == 0x72) + store->meter_level[ch] = val; + if (MUC0(meter_urb, s) == 0xb2) + store->comp_level[ch] = val; + } } if (MUA0(meter_urb, s) == 0x61 && MUA1(meter_urb, s) == 0x02 && - MUA2(meter_urb, s) == 0x02 && MUB0(meter_urb, s) == 0x62) - store->master_level[MUB2(meter_urb, s) - 1] = val; + MUA2(meter_urb, s) == 0x02 && MUB0(meter_urb, s) == 0x62) { + if (ch < ARRAY_SIZE(store->master_level)) + store->master_level[ch] = val; + } }
/* Function to retrieve current meter values from the device.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shengjiu Wang shengjiu.wang@nxp.com
[ Upstream commit 00b960a83c764208b0623089eb70af3685e3906f ]
The reset_control handler has the reference count for usage, as there is reset operation in runtime suspend and resume, then reset operation in probe() would cause the reference count of reset not balanced.
Previously add reset operation in probe and remove is to fix the compile issue with !CONFIG_PM, as the driver has been update to use RUNTIME_PM_OPS(), so that change can be reverted.
Fixes: 1e0dff741b0a ("ASoC: ak4458: remove "reset-gpios" property handler") Signed-off-by: Shengjiu Wang shengjiu.wang@nxp.com Link: https://patch.msgid.link/20251216070201.358477-1-shengjiu.wang@nxp.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/ak4458.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/sound/soc/codecs/ak4458.c b/sound/soc/codecs/ak4458.c index a6c04dd3de3e..abe742edb10f 100644 --- a/sound/soc/codecs/ak4458.c +++ b/sound/soc/codecs/ak4458.c @@ -783,16 +783,12 @@ static int ak4458_i2c_probe(struct i2c_client *i2c)
pm_runtime_enable(&i2c->dev); regcache_cache_only(ak4458->regmap, true); - ak4458_reset(ak4458, false);
return 0; }
static void ak4458_i2c_remove(struct i2c_client *i2c) { - struct ak4458_priv *ak4458 = i2c_get_clientdata(i2c); - - ak4458_reset(ak4458, true); pm_runtime_disable(&i2c->dev); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shardul Bankar shardul.b@mpiricsoftware.com
[ Upstream commit df8d829bba3adcf3cc744c01d933b6fd7cf06e91 ]
When nfsd_create_serv() calls percpu_ref_init() to initialize nn->nfsd_net_ref, it allocates both a percpu reference counter and a percpu_ref_data structure (64 bytes). However, if the function fails later due to svc_create_pooled() returning NULL or svc_bind() returning an error, these allocations are not cleaned up, resulting in a memory leak.
The leak manifests as: - Unreferenced percpu allocation (8 bytes per CPU) - Unreferenced percpu_ref_data structure (64 bytes)
Fix this by adding percpu_ref_exit() calls in both error paths to properly clean up the percpu_ref_init() allocations.
This patch fixes the percpu_ref leak in nfsd_create_serv() seen as an auxiliary leak in syzbot report 099461f8558eb0a1f4f3; the prepare_creds() and vsock-related leaks in the same report remain to be addressed separately.
Reported-by: syzbot+099461f8558eb0a1f4f3@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=099461f8558eb0a1f4f3 Fixes: 47e988147f40 ("nfsd: add nfsd_serv_try_get and nfsd_serv_put") Signed-off-by: Shardul Bankar shardul.b@mpiricsoftware.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfsd/nfssvc.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index 7057ddd7a0a8..32cc03a7e7be 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -633,12 +633,15 @@ int nfsd_create_serv(struct net *net) serv = svc_create_pooled(nfsd_programs, ARRAY_SIZE(nfsd_programs), &nn->nfsd_svcstats, nfsd_max_blksize, nfsd); - if (serv == NULL) + if (serv == NULL) { + percpu_ref_exit(&nn->nfsd_net_ref); return -ENOMEM; + }
error = svc_bind(serv, net); if (error < 0) { svc_destroy(&serv); + percpu_ref_exit(&nn->nfsd_net_ref); return error; } spin_lock(&nfsd_notifier_lock);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuming Fan shumingf@realtek.com
[ Upstream commit 1b0f3f9ee41ee2bdd206667f85ea2aa36dfe6e69 ]
The SDCA specification uses Q7.8 volume format. This patch adds a field to indicate whether it is SDCA volume control and supports the volume settings.
Signed-off-by: Shuming Fan shumingf@realtek.com Reviewed-by: Charles Keepax ckeepax@opensource.cirrus.com Link: https://patch.msgid.link/20251106093335.1363237-1-shumingf@realtek.com Signed-off-by: Mark Brown broonie@kernel.org Stable-dep-of: 095d62114182 ("ASoC: ops: fix snd_soc_get_volsw for sx controls") Signed-off-by: Sasha Levin sashal@kernel.org --- include/sound/soc.h | 1 + sound/soc/sdca/sdca_asoc.c | 34 ++++++--------------- sound/soc/soc-ops.c | 62 +++++++++++++++++++++++++++++++------- 3 files changed, 61 insertions(+), 36 deletions(-)
diff --git a/include/sound/soc.h b/include/sound/soc.h index ddc508ff7b9b..a9c058b06ab4 100644 --- a/include/sound/soc.h +++ b/include/sound/soc.h @@ -1225,6 +1225,7 @@ struct soc_mixer_control { unsigned int sign_bit; unsigned int invert:1; unsigned int autodisable:1; + unsigned int sdca_q78:1; #ifdef CONFIG_SND_SOC_TOPOLOGY struct snd_soc_dobj dobj; #endif diff --git a/sound/soc/sdca/sdca_asoc.c b/sound/soc/sdca/sdca_asoc.c index c493ec530cc5..892b7c028fae 100644 --- a/sound/soc/sdca/sdca_asoc.c +++ b/sound/soc/sdca/sdca_asoc.c @@ -795,7 +795,6 @@ static int control_limit_kctl(struct device *dev, struct sdca_control_range *range; int min, max, step; unsigned int *tlv; - int shift;
if (control->type != SDCA_CTL_DATATYPE_Q7P8DB) return 0; @@ -814,37 +813,22 @@ static int control_limit_kctl(struct device *dev, min = sign_extend32(min, control->nbits - 1); max = sign_extend32(max, control->nbits - 1);
- /* - * FIXME: Only support power of 2 step sizes as this can be supported - * by a simple shift. - */ - if (hweight32(step) != 1) { - dev_err(dev, "%s: %s: currently unsupported step size\n", - entity->label, control->label); - return -EINVAL; - } - - /* - * The SDCA volumes are in steps of 1/256th of a dB, a step down of - * 64 (shift of 6) gives 1/4dB. 1/4dB is the smallest unit that is also - * representable in the ALSA TLVs which are in 1/100ths of a dB. - */ - shift = max(ffs(step) - 1, 6); - tlv = devm_kcalloc(dev, 4, sizeof(*tlv), GFP_KERNEL); if (!tlv) return -ENOMEM;
- tlv[0] = SNDRV_CTL_TLVT_DB_SCALE; + tlv[0] = SNDRV_CTL_TLVT_DB_MINMAX; tlv[1] = 2 * sizeof(*tlv); tlv[2] = (min * 100) >> 8; - tlv[3] = ((1 << shift) * 100) >> 8; + tlv[3] = (max * 100) >> 8; + + step = (step * 100) >> 8;
- mc->min = min >> shift; - mc->max = max >> shift; - mc->shift = shift; - mc->rshift = shift; - mc->sign_bit = 15 - shift; + mc->min = ((int)tlv[2] / step); + mc->max = ((int)tlv[3] / step); + mc->shift = step; + mc->sign_bit = 15; + mc->sdca_q78 = 1;
kctl->tlv.p = tlv; kctl->access |= SNDRV_CTL_ELEM_ACCESS_TLV_READ; diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c index d2b6fb8e0b6c..ce86978c158d 100644 --- a/sound/soc/soc-ops.c +++ b/sound/soc/soc-ops.c @@ -110,6 +110,36 @@ int snd_soc_put_enum_double(struct snd_kcontrol *kcontrol, } EXPORT_SYMBOL_GPL(snd_soc_put_enum_double);
+static int sdca_soc_q78_reg_to_ctl(struct soc_mixer_control *mc, unsigned int reg_val, + unsigned int mask, unsigned int shift, int max) +{ + int val = reg_val; + + if (WARN_ON(!mc->shift)) + return -EINVAL; + + val = sign_extend32(val, mc->sign_bit); + val = (((val * 100) >> 8) / (int)mc->shift); + val -= mc->min; + + return val & mask; +} + +static unsigned int sdca_soc_q78_ctl_to_reg(struct soc_mixer_control *mc, int val, + unsigned int mask, unsigned int shift, int max) +{ + unsigned int ret_val; + int reg_val; + + if (WARN_ON(!mc->shift)) + return -EINVAL; + + reg_val = val + mc->min; + ret_val = (int)((reg_val * mc->shift) << 8) / 100; + + return ret_val & mask; +} + static int soc_mixer_reg_to_ctl(struct soc_mixer_control *mc, unsigned int reg_val, unsigned int mask, unsigned int shift, int max) { @@ -197,19 +227,27 @@ static int soc_put_volsw(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol, struct soc_mixer_control *mc, int mask, int max) { + unsigned int (*ctl_to_reg)(struct soc_mixer_control *, int, unsigned int, unsigned int, int); struct snd_soc_component *component = snd_kcontrol_chip(kcontrol); unsigned int val1, val_mask; unsigned int val2 = 0; bool double_r = false; int ret;
+ if (mc->sdca_q78) { + ctl_to_reg = sdca_soc_q78_ctl_to_reg; + val_mask = mask; + } else { + ctl_to_reg = soc_mixer_ctl_to_reg; + val_mask = mask << mc->shift; + } + ret = soc_mixer_valid_ctl(mc, ucontrol->value.integer.value[0], max); if (ret) return ret;
- val1 = soc_mixer_ctl_to_reg(mc, ucontrol->value.integer.value[0], + val1 = ctl_to_reg(mc, ucontrol->value.integer.value[0], mask, mc->shift, max); - val_mask = mask << mc->shift;
if (snd_soc_volsw_is_stereo(mc)) { ret = soc_mixer_valid_ctl(mc, ucontrol->value.integer.value[1], max); @@ -217,14 +255,10 @@ static int soc_put_volsw(struct snd_kcontrol *kcontrol, return ret;
if (mc->reg == mc->rreg) { - val1 |= soc_mixer_ctl_to_reg(mc, - ucontrol->value.integer.value[1], - mask, mc->rshift, max); + val1 |= ctl_to_reg(mc, ucontrol->value.integer.value[1], mask, mc->rshift, max); val_mask |= mask << mc->rshift; } else { - val2 = soc_mixer_ctl_to_reg(mc, - ucontrol->value.integer.value[1], - mask, mc->shift, max); + val2 = ctl_to_reg(mc, ucontrol->value.integer.value[1], mask, mc->shift, max); double_r = true; } } @@ -248,21 +282,27 @@ static int soc_get_volsw(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol, struct soc_mixer_control *mc, int mask, int max) { + int (*reg_to_ctl)(struct soc_mixer_control *, unsigned int, unsigned int, unsigned int, int); struct snd_soc_component *component = snd_kcontrol_chip(kcontrol); unsigned int reg_val; int val;
+ if (mc->sdca_q78) + reg_to_ctl = sdca_soc_q78_reg_to_ctl; + else + reg_to_ctl = soc_mixer_reg_to_ctl; + reg_val = snd_soc_component_read(component, mc->reg); - val = soc_mixer_reg_to_ctl(mc, reg_val, mask, mc->shift, max); + val = reg_to_ctl(mc, reg_val, mask, mc->shift, max);
ucontrol->value.integer.value[0] = val;
if (snd_soc_volsw_is_stereo(mc)) { if (mc->reg == mc->rreg) { - val = soc_mixer_reg_to_ctl(mc, reg_val, mask, mc->rshift, max); + val = reg_to_ctl(mc, reg_val, mask, mc->rshift, max); } else { reg_val = snd_soc_component_read(component, mc->rreg); - val = soc_mixer_reg_to_ctl(mc, reg_val, mask, mc->shift, max); + val = reg_to_ctl(mc, reg_val, mask, mc->shift, max); }
ucontrol->value.integer.value[1] = val;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Binding sbinding@opensource.cirrus.com
[ Upstream commit 095d621141826a2841dae85b52c784c147ea99d3 ]
SX controls are currently broken, since the clamp introduced in commit a0ce874cfaaa ("ASoC: ops: improve snd_soc_get_volsw") does not handle SX controls, for example where the min value in the clamp is greater than the max value in the clamp.
Add clamp parameter to prevent clamping in SX controls. The nature of SX controls mean that it wraps around 0, with a variable number of bits, therefore clamping the value becomes complicated and prone to error.
Fixes 35 kunit tests for soc_ops_test_access.
Fixes: a0ce874cfaaa ("ASoC: ops: improve snd_soc_get_volsw")
Co-developed-by: Charles Keepax ckeepax@opensource.cirrus.com Signed-off-by: Stefan Binding sbinding@opensource.cirrus.com Tested-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://patch.msgid.link/20251216134938.788625-1-sbinding@opensource.cirrus.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/soc-ops.c | 32 ++++++++++++++++++++------------ 1 file changed, 20 insertions(+), 12 deletions(-)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c index ce86978c158d..624e9269fc25 100644 --- a/sound/soc/soc-ops.c +++ b/sound/soc/soc-ops.c @@ -111,7 +111,8 @@ int snd_soc_put_enum_double(struct snd_kcontrol *kcontrol, EXPORT_SYMBOL_GPL(snd_soc_put_enum_double);
static int sdca_soc_q78_reg_to_ctl(struct soc_mixer_control *mc, unsigned int reg_val, - unsigned int mask, unsigned int shift, int max) + unsigned int mask, unsigned int shift, int max, + bool sx) { int val = reg_val;
@@ -141,20 +142,26 @@ static unsigned int sdca_soc_q78_ctl_to_reg(struct soc_mixer_control *mc, int va }
static int soc_mixer_reg_to_ctl(struct soc_mixer_control *mc, unsigned int reg_val, - unsigned int mask, unsigned int shift, int max) + unsigned int mask, unsigned int shift, int max, + bool sx) { int val = (reg_val >> shift) & mask;
if (mc->sign_bit) val = sign_extend32(val, mc->sign_bit);
- val = clamp(val, mc->min, mc->max); - val -= mc->min; + if (sx) { + val -= mc->min; // SX controls intentionally can overflow here + val = min_t(unsigned int, val & mask, max); + } else { + val = clamp(val, mc->min, mc->max); + val -= mc->min; + }
if (mc->invert) val = max - val;
- return val & mask; + return val; }
static unsigned int soc_mixer_ctl_to_reg(struct soc_mixer_control *mc, int val, @@ -280,9 +287,10 @@ static int soc_put_volsw(struct snd_kcontrol *kcontrol,
static int soc_get_volsw(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol, - struct soc_mixer_control *mc, int mask, int max) + struct soc_mixer_control *mc, int mask, int max, bool sx) { - int (*reg_to_ctl)(struct soc_mixer_control *, unsigned int, unsigned int, unsigned int, int); + int (*reg_to_ctl)(struct soc_mixer_control *, unsigned int, unsigned int, + unsigned int, int, bool); struct snd_soc_component *component = snd_kcontrol_chip(kcontrol); unsigned int reg_val; int val; @@ -293,16 +301,16 @@ static int soc_get_volsw(struct snd_kcontrol *kcontrol, reg_to_ctl = soc_mixer_reg_to_ctl;
reg_val = snd_soc_component_read(component, mc->reg); - val = reg_to_ctl(mc, reg_val, mask, mc->shift, max); + val = reg_to_ctl(mc, reg_val, mask, mc->shift, max, sx);
ucontrol->value.integer.value[0] = val;
if (snd_soc_volsw_is_stereo(mc)) { if (mc->reg == mc->rreg) { - val = reg_to_ctl(mc, reg_val, mask, mc->rshift, max); + val = reg_to_ctl(mc, reg_val, mask, mc->rshift, max, sx); } else { reg_val = snd_soc_component_read(component, mc->rreg); - val = reg_to_ctl(mc, reg_val, mask, mc->shift, max); + val = reg_to_ctl(mc, reg_val, mask, mc->shift, max, sx); }
ucontrol->value.integer.value[1] = val; @@ -371,7 +379,7 @@ int snd_soc_get_volsw(struct snd_kcontrol *kcontrol, (struct soc_mixer_control *)kcontrol->private_value; unsigned int mask = soc_mixer_mask(mc);
- return soc_get_volsw(kcontrol, ucontrol, mc, mask, mc->max - mc->min); + return soc_get_volsw(kcontrol, ucontrol, mc, mask, mc->max - mc->min, false); } EXPORT_SYMBOL_GPL(snd_soc_get_volsw);
@@ -413,7 +421,7 @@ int snd_soc_get_volsw_sx(struct snd_kcontrol *kcontrol, (struct soc_mixer_control *)kcontrol->private_value; unsigned int mask = soc_mixer_sx_mask(mc);
- return soc_get_volsw(kcontrol, ucontrol, mc, mask, mc->max); + return soc_get_volsw(kcontrol, ucontrol, mc, mask, mc->max, true); } EXPORT_SYMBOL_GPL(snd_soc_get_volsw_sx);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jinhui Guo guojinhui.liam@bytedance.com
[ Upstream commit 936750fdba4c45e13bbd17f261bb140dd55f5e93 ]
The race window between __scan_channels() and deliver_response() causes the parameters of some channels to be set to 0.
1.[CPUA] __scan_channels() issues an IPMI request and waits with wait_event() until all channels have been scanned. wait_event() internally calls might_sleep(), which might yield the CPU. (Moreover, an interrupt can preempt wait_event() and force the task to yield the CPU.) 2.[CPUB] deliver_response() is invoked when the CPU receives the IPMI response. After processing a IPMI response, deliver_response() directly assigns intf->wchannels to intf->channel_list and sets intf->channels_ready to true. However, not all channels are actually ready for use. 3.[CPUA] Since intf->channels_ready is already true, wait_event() never enters __wait_event(). __scan_channels() immediately clears intf->null_user_handler and exits. 4.[CPUB] Once intf->null_user_handler is set to NULL, deliver_response() ignores further IPMI responses, leaving the remaining channels zero-initialized and unusable.
CPUA CPUB ------------------------------- ----------------------------- __scan_channels() intf->null_user_handler = channel_handler; send_channel_info_cmd(intf, 0); wait_event(intf->waitq, intf->channels_ready); do { might_sleep(); deliver_response() channel_handler() intf->channel_list = intf->wchannels + set; intf->channels_ready = true; send_channel_info_cmd(intf, intf->curr_channel); if (condition) break; __wait_event(wq_head, condition); } while(0) intf->null_user_handler = NULL; deliver_response() if (!msg->user) if (intf->null_user_handler) rv = -EINVAL; return rv; ------------------------------- -----------------------------
Fix the race between __scan_channels() and deliver_response() by deferring both the assignment intf->channel_list = intf->wchannels and the flag intf->channels_ready = true until all channels have been successfully scanned or until the IPMI request has failed.
Signed-off-by: Jinhui Guo guojinhui.liam@bytedance.com Message-ID: 20250930074239.2353-2-guojinhui.liam@bytedance.com Signed-off-by: Corey Minyard corey@minyard.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/ipmi/ipmi_msghandler.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c index 3700ab4eba3e..d3f84deee451 100644 --- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -3417,8 +3417,6 @@ channel_handler(struct ipmi_smi *intf, struct ipmi_recv_msg *msg) intf->channels_ready = true; wake_up(&intf->waitq); } else { - intf->channel_list = intf->wchannels + set; - intf->channels_ready = true; rv = send_channel_info_cmd(intf, intf->curr_channel); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jinhui Guo guojinhui.liam@bytedance.com
[ Upstream commit 6bd30d8fc523fb880b4be548e8501bc0fe8f42d4 ]
channel_handler() sets intf->channels_ready to true but never clears it, so __scan_channels() skips any rescan. When the BMC firmware changes a rescan is required. Allow it by clearing the flag before starting a new scan.
Signed-off-by: Jinhui Guo guojinhui.liam@bytedance.com Message-ID: 20250930074239.2353-3-guojinhui.liam@bytedance.com Signed-off-by: Corey Minyard corey@minyard.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/ipmi/ipmi_msghandler.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c index d3f84deee451..0a886399f9da 100644 --- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -599,7 +599,8 @@ static void __ipmi_bmc_unregister(struct ipmi_smi *intf); static int __ipmi_bmc_register(struct ipmi_smi *intf, struct ipmi_device_id *id, bool guid_set, guid_t *guid, int intf_num); -static int __scan_channels(struct ipmi_smi *intf, struct ipmi_device_id *id); +static int __scan_channels(struct ipmi_smi *intf, + struct ipmi_device_id *id, bool rescan);
static void free_ipmi_user(struct kref *ref) { @@ -2668,7 +2669,7 @@ static int __bmc_get_device_id(struct ipmi_smi *intf, struct bmc_device *bmc, if (__ipmi_bmc_register(intf, &id, guid_set, &guid, intf_num)) need_waiter(intf); /* Retry later on an error. */ else - __scan_channels(intf, &id); + __scan_channels(intf, &id, false);
if (!intf_set) { @@ -2688,7 +2689,7 @@ static int __bmc_get_device_id(struct ipmi_smi *intf, struct bmc_device *bmc, goto out_noprocessing; } else if (memcmp(&bmc->fetch_id, &bmc->id, sizeof(bmc->id))) /* Version info changes, scan the channels again. */ - __scan_channels(intf, &bmc->fetch_id); + __scan_channels(intf, &bmc->fetch_id, true);
bmc->dyn_id_expiry = jiffies + IPMI_DYN_DEV_ID_EXPIRY;
@@ -3438,10 +3439,17 @@ channel_handler(struct ipmi_smi *intf, struct ipmi_recv_msg *msg) /* * Must be holding intf->bmc_reg_mutex to call this. */ -static int __scan_channels(struct ipmi_smi *intf, struct ipmi_device_id *id) +static int __scan_channels(struct ipmi_smi *intf, + struct ipmi_device_id *id, + bool rescan) { int rv;
+ if (rescan) { + /* Clear channels_ready to force channels rescan. */ + intf->channels_ready = false; + } + if (ipmi_version_major(id) > 1 || (ipmi_version_major(id) == 1 && ipmi_version_minor(id) >= 5)) { @@ -3656,7 +3664,7 @@ int ipmi_add_smi(struct module *owner, }
mutex_lock(&intf->bmc_reg_mutex); - rv = __scan_channels(intf, &id); + rv = __scan_channels(intf, &id, false); mutex_unlock(&intf->bmc_reg_mutex); if (rv) goto out_err_bmc_reg;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Wang peter.wang@mediatek.com
[ Upstream commit 014de20bb36ba03e0e0b0a7e0a1406ab900c9fda ]
Address a race condition between shutdown and suspend operations in the UFS Mediatek driver. Before entering suspend, check if a shutdown is in progress to prevent conflicts and ensure system stability.
Signed-off-by: Peter Wang peter.wang@mediatek.com Acked-by: Chun-Hung Wu chun-hung.wu@mediatek.com Link: https://patch.msgid.link/20250924094527.2992256-6-peter.wang@mediatek.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ufs/host/ufs-mediatek.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/ufs/host/ufs-mediatek.c b/drivers/ufs/host/ufs-mediatek.c index 758a393a9de1..d0cbd96ad29d 100644 --- a/drivers/ufs/host/ufs-mediatek.c +++ b/drivers/ufs/host/ufs-mediatek.c @@ -2373,6 +2373,11 @@ static int ufs_mtk_system_suspend(struct device *dev) struct arm_smccc_res res; int ret;
+ if (hba->shutting_down) { + ret = -EBUSY; + goto out; + } + ret = ufshcd_system_suspend(dev); if (ret) goto out;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peng Fan peng.fan@nxp.com
[ Upstream commit 81fb53feb66a3aefbf6fcab73bb8d06f5b0c54ad ]
With mailbox channel requested, there is possibility that interrupts may come in, so need to make sure the workqueue is initialized before the queue is scheduled by mailbox rx callback.
Reviewed-by: Frank Li Frank.Li@nxp.com Signed-off-by: Peng Fan peng.fan@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/imx/imx-scu-irq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/firmware/imx/imx-scu-irq.c b/drivers/firmware/imx/imx-scu-irq.c index f2b902e95b73..b9f6128d56f7 100644 --- a/drivers/firmware/imx/imx-scu-irq.c +++ b/drivers/firmware/imx/imx-scu-irq.c @@ -214,6 +214,8 @@ int imx_scu_enable_general_irq_channel(struct device *dev) cl->dev = dev; cl->rx_callback = imx_scu_irq_callback;
+ INIT_WORK(&imx_sc_irq_work, imx_scu_irq_work_handler); + /* SCU general IRQ uses general interrupt channel 3 */ ch = mbox_request_channel_byname(cl, "gip3"); if (IS_ERR(ch)) { @@ -223,8 +225,6 @@ int imx_scu_enable_general_irq_channel(struct device *dev) return ret; }
- INIT_WORK(&imx_sc_irq_work, imx_scu_irq_work_handler); - if (!of_parse_phandle_with_args(dev->of_node, "mboxes", "#mbox-cells", 0, &spec)) { i = of_alias_get_id(spec.np, "mu");
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 7b5d4416964c07c902163822a30a622111172b01 ]
This is currently done in uml_finishsetup(), but e.g. with KCOV enabled we'll crash because some init code can call into e.g. memparse(), which has coverage annotations, and then the checks in check_kcov_mode() crash because current is NULL.
Simply initialize the cpu_tasks[] array statically, which fixes the crash. For the later SMP work, it seems to have not really caused any problems yet, but initialize all of the entries anyway.
Link: https://patch.msgid.link/20250924113214.c76cd74d0583.I974f691ebb1a2b47915bd2... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/um/kernel/process.c | 4 +++- arch/um/kernel/um_arch.c | 2 -- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c index 9c9c66dc45f0..13d461712c99 100644 --- a/arch/um/kernel/process.c +++ b/arch/um/kernel/process.c @@ -43,7 +43,9 @@ * cares about its entry, so it's OK if another processor is modifying its * entry. */ -struct task_struct *cpu_tasks[NR_CPUS]; +struct task_struct *cpu_tasks[NR_CPUS] = { + [0 ... NR_CPUS - 1] = &init_task, +}; EXPORT_SYMBOL(cpu_tasks);
void free_stack(unsigned long stack, int order) diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c index cfbbbf8500c3..ed2f67848a50 100644 --- a/arch/um/kernel/um_arch.c +++ b/arch/um/kernel/um_arch.c @@ -239,8 +239,6 @@ static struct notifier_block panic_exit_notifier = {
void uml_finishsetup(void) { - cpu_tasks[0] = &init_task; - atomic_notifier_chain_register(&panic_notifier_list, &panic_exit_notifier);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Schiffer matthias.schiffer@tq-group.com
[ Upstream commit 3f61783920504b2cf99330b372d82914bb004d8e ]
am33xx.dtsi has the same clock setup as am35xx.dtsi, setting ti,no-reset-on-init and ti,no-idle on timer1_target and timer2_target, so AM33 needs the same workaround as AM35 to avoid ti-sysc probe failing on certain target modules.
Signed-off-by: Matthias Schiffer matthias.schiffer@tq-group.com Signed-off-by: Alexander Stein alexander.stein@ew.tq-group.com Link: https://lore.kernel.org/r/20250825131114.2206804-1-alexander.stein@ew.tq-gro... Signed-off-by: Kevin Hilman khilman@baylibre.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bus/ti-sysc.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c index 5566ad11399e..610354ce7f8f 100644 --- a/drivers/bus/ti-sysc.c +++ b/drivers/bus/ti-sysc.c @@ -48,6 +48,7 @@ enum sysc_soc { SOC_UNKNOWN, SOC_2420, SOC_2430, + SOC_AM33, SOC_3430, SOC_AM35, SOC_3630, @@ -2912,6 +2913,7 @@ static void ti_sysc_idle(struct work_struct *work) static const struct soc_device_attribute sysc_soc_match[] = { SOC_FLAG("OMAP242*", SOC_2420), SOC_FLAG("OMAP243*", SOC_2430), + SOC_FLAG("AM33*", SOC_AM33), SOC_FLAG("AM35*", SOC_AM35), SOC_FLAG("OMAP3[45]*", SOC_3430), SOC_FLAG("OMAP3[67]*", SOC_3630), @@ -3117,10 +3119,15 @@ static int sysc_check_active_timer(struct sysc *ddata) * can be dropped if we stop supporting old beagleboard revisions * A to B4 at some point. */ - if (sysc_soc->soc == SOC_3430 || sysc_soc->soc == SOC_AM35) + switch (sysc_soc->soc) { + case SOC_AM33: + case SOC_3430: + case SOC_AM35: error = -ENXIO; - else + break; + default: error = -EBUSY; + }
if ((ddata->cfg.quirks & SYSC_QUIRK_NO_RESET_ON_INIT) && (ddata->cfg.quirks & SYSC_QUIRK_NO_IDLE))
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Strahan David.Strahan@microchip.com
[ Upstream commit 48e6b7e708029cea451e53a8c16fc8c16039ecdc ]
Add support for new Hurray Data controller.
All entries are in HEX.
Add PCI IDs for Hurray Data controllers: VID / DID / SVID / SDID ---- ---- ---- ---- 9005 028f 207d 4840
Reviewed-by: Scott Benesh scott.benesh@microchip.com Reviewed-by: Scott Teel scott.teel@microchip.com Reviewed-by: Mike McGowen mike.mcgowen@microchip.com Signed-off-by: David Strahan David.Strahan@microchip.com Signed-off-by: Don Brace don.brace@microchip.com Link: https://patch.msgid.link/20251106163823.786828-4-don.brace@microchip.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/smartpqi/smartpqi_init.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index fdc856845a05..98e93900254c 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -10127,6 +10127,10 @@ static const struct pci_device_id pqi_pci_id_table[] = { PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f, 0x207d, 0x4240) }, + { + PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f, + 0x207d, 0x4840) + }, { PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f, PCI_VENDOR_ID_ADVANTECH, 0x8312)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Justin Tee justin.tee@broadcom.com
[ Upstream commit 07caedc6a3887938813727beafea40f07c497705 ]
It's possible for an unstable link to repeatedly bounce allowing a FLOGI retry, but then bounce again forcing an abort of the FLOGI. Ensure that the initial reference count on the FLOGI ndlp is restored in this faulty link scenario.
Signed-off-by: Justin Tee justin.tee@broadcom.com Link: https://patch.msgid.link/20251106224639.139176-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_els.c | 36 +++++++++++++++++++++++++------- drivers/scsi/lpfc/lpfc_hbadisc.c | 4 +++- 2 files changed, 32 insertions(+), 8 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index b71db7d7d747..c08237f04bce 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -934,10 +934,15 @@ lpfc_cmpl_els_flogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, /* Check to see if link went down during discovery */ if (lpfc_els_chk_latt(vport)) { /* One additional decrement on node reference count to - * trigger the release of the node + * trigger the release of the node. Make sure the ndlp + * is marked NLP_DROPPED. */ - if (!(ndlp->fc4_xpt_flags & SCSI_XPT_REGD)) + if (!test_bit(NLP_IN_DEV_LOSS, &ndlp->nlp_flag) && + !test_bit(NLP_DROPPED, &ndlp->nlp_flag) && + !(ndlp->fc4_xpt_flags & SCSI_XPT_REGD)) { + set_bit(NLP_DROPPED, &ndlp->nlp_flag); lpfc_nlp_put(ndlp); + } goto out; }
@@ -995,9 +1000,10 @@ lpfc_cmpl_els_flogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, IOERR_LOOP_OPEN_FAILURE))) lpfc_vlog_msg(vport, KERN_WARNING, LOG_ELS, "2858 FLOGI Status:x%x/x%x TMO" - ":x%x Data x%lx x%x\n", + ":x%x Data x%lx x%x x%lx x%x\n", ulp_status, ulp_word4, tmo, - phba->hba_flag, phba->fcf.fcf_flag); + phba->hba_flag, phba->fcf.fcf_flag, + ndlp->nlp_flag, ndlp->fc4_xpt_flags);
/* Check for retry */ if (lpfc_els_retry(phba, cmdiocb, rspiocb)) { @@ -1015,14 +1021,17 @@ lpfc_cmpl_els_flogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, * reference to trigger node release. */ if (!test_bit(NLP_IN_DEV_LOSS, &ndlp->nlp_flag) && - !(ndlp->fc4_xpt_flags & SCSI_XPT_REGD)) + !test_bit(NLP_DROPPED, &ndlp->nlp_flag) && + !(ndlp->fc4_xpt_flags & SCSI_XPT_REGD)) { + set_bit(NLP_DROPPED, &ndlp->nlp_flag); lpfc_nlp_put(ndlp); + }
lpfc_printf_vlog(vport, KERN_WARNING, LOG_ELS, "0150 FLOGI Status:x%x/x%x " - "xri x%x TMO:x%x refcnt %d\n", + "xri x%x iotag x%x TMO:x%x refcnt %d\n", ulp_status, ulp_word4, cmdiocb->sli4_xritag, - tmo, kref_read(&ndlp->kref)); + cmdiocb->iotag, tmo, kref_read(&ndlp->kref));
/* If this is not a loop open failure, bail out */ if (!(ulp_status == IOSTAT_LOCAL_REJECT && @@ -1279,6 +1288,19 @@ lpfc_issue_els_flogi(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, uint32_t tmo, did; int rc;
+ /* It's possible for lpfc to reissue a FLOGI on an ndlp that is marked + * NLP_DROPPED. This happens when the FLOGI completed with the XB bit + * set causing lpfc to reference the ndlp until the XRI_ABORTED CQE is + * issued. The time window for the XRI_ABORTED CQE can be as much as + * 2*2*RA_TOV allowing for ndlp reuse of this type when the link is + * cycling quickly. When true, restore the initial reference and remove + * the NLP_DROPPED flag as lpfc is retrying. + */ + if (test_and_clear_bit(NLP_DROPPED, &ndlp->nlp_flag)) { + if (!lpfc_nlp_get(ndlp)) + return 1; + } + cmdsize = (sizeof(uint32_t) + sizeof(struct serv_parm)); elsiocb = lpfc_prep_els_iocb(vport, 1, cmdsize, retry, ndlp, ndlp->nlp_DID, ELS_CMD_FLOGI); diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c index 43d246c5c049..717ae56c8e4b 100644 --- a/drivers/scsi/lpfc/lpfc_hbadisc.c +++ b/drivers/scsi/lpfc/lpfc_hbadisc.c @@ -424,6 +424,7 @@ lpfc_check_nlp_post_devloss(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp) { if (test_and_clear_bit(NLP_IN_RECOV_POST_DEV_LOSS, &ndlp->save_flags)) { + clear_bit(NLP_DROPPED, &ndlp->nlp_flag); lpfc_nlp_get(ndlp); lpfc_printf_vlog(vport, KERN_INFO, LOG_DISCOVERY | LOG_NODE, "8438 Devloss timeout reversed on DID x%x " @@ -566,7 +567,8 @@ lpfc_dev_loss_tmo_handler(struct lpfc_nodelist *ndlp) return fcf_inuse; }
- lpfc_nlp_put(ndlp); + if (!test_and_set_bit(NLP_DROPPED, &ndlp->nlp_flag)) + lpfc_nlp_put(ndlp); return fcf_inuse; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josua Mayer josua@solid-run.com
[ Upstream commit f0e6bc0c3ef4b4afb299bd6912586cafd5d864e9 ]
CP110 based platforms rely on the bootloader for pci port initialization. TF-A actively prevents non-uboot re-configuration of pci lanes, and many boards do not have software control over the pci card reset.
If a pci port had link at boot-time and the clock is stopped at a later point, the link fails and can not be recovered.
PCI controller driver probe - and by extension ownership of a driver for the pci clocks - may be delayed especially on large modular kernels, causing the clock core to start disabling unused clocks.
Add the CLK_IGNORE_UNUSED flag to the three pci port's clocks to ensure they are not stopped before the pci controller driver has taken ownership and tested for an existing link.
This fixes failed pci link detection when controller driver probes late, e.g. with arm64 defconfig and CONFIG_PHY_MVEBU_CP110_COMPHY=m.
Closes: https://lore.kernel.org/r/b71596c7-461b-44b6-89ab-3cfbd492639f@solid-run.com Signed-off-by: Josua Mayer josua@solid-run.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Gregory CLEMENT gregory.clement@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/mvebu/cp110-system-controller.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
diff --git a/drivers/clk/mvebu/cp110-system-controller.c b/drivers/clk/mvebu/cp110-system-controller.c index 03c59bf22106..b47c86906046 100644 --- a/drivers/clk/mvebu/cp110-system-controller.c +++ b/drivers/clk/mvebu/cp110-system-controller.c @@ -110,6 +110,25 @@ static const char * const gate_base_names[] = { [CP110_GATE_EIP197] = "eip197" };
+static unsigned long gate_flags(const u8 bit_idx) +{ + switch (bit_idx) { + case CP110_GATE_PCIE_X1_0: + case CP110_GATE_PCIE_X1_1: + case CP110_GATE_PCIE_X4: + /* + * If a port had an active link at boot time, stopping + * the clock creates a failed state from which controller + * driver can not recover. + * Prevent stopping this clock till after a driver has taken + * ownership. + */ + return CLK_IGNORE_UNUSED; + default: + return 0; + } +}; + struct cp110_gate_clk { struct clk_hw hw; struct regmap *regmap; @@ -171,6 +190,7 @@ static struct clk_hw *cp110_register_gate(const char *name, init.ops = &cp110_gate_ops; init.parent_names = &parent_name; init.num_parents = 1; + init.flags = gate_flags(bit_idx);
gate->regmap = regmap; gate->bit_idx = bit_idx;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ben Collins bcollins@kernel.org
[ Upstream commit 825ce89a3ef17f84cf2c0eacfa6b8dc9fd11d13f ]
The PUT_64[LB]E() macros need to cast the value to unsigned long long like the GET_64[LB]E() macros. Caused lots of warnings when compiled on 32-bit, and clobbered addresses (36-bit P4080).
Signed-off-by: Ben Collins bcollins@kernel.org Reviewed-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Madhavan Srinivasan maddy@linux.ibm.com Link: https://patch.msgid.link/2025042122-mustard-wrasse-694572@boujee-and-buff Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/boot/addnote.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/boot/addnote.c b/arch/powerpc/boot/addnote.c index 53b3b2621457..78704927453a 100644 --- a/arch/powerpc/boot/addnote.c +++ b/arch/powerpc/boot/addnote.c @@ -68,8 +68,8 @@ static int e_class = ELFCLASS32; #define PUT_16BE(off, v)(buf[off] = ((v) >> 8) & 0xff, \ buf[(off) + 1] = (v) & 0xff) #define PUT_32BE(off, v)(PUT_16BE((off), (v) >> 16L), PUT_16BE((off) + 2, (v))) -#define PUT_64BE(off, v)((PUT_32BE((off), (v) >> 32L), \ - PUT_32BE((off) + 4, (v)))) +#define PUT_64BE(off, v)((PUT_32BE((off), (unsigned long long)(v) >> 32L), \ + PUT_32BE((off) + 4, (unsigned long long)(v))))
#define GET_16LE(off) ((buf[off]) + (buf[(off)+1] << 8)) #define GET_32LE(off) (GET_16LE(off) + (GET_16LE((off)+2U) << 16U)) @@ -78,7 +78,8 @@ static int e_class = ELFCLASS32; #define PUT_16LE(off, v) (buf[off] = (v) & 0xff, \ buf[(off) + 1] = ((v) >> 8) & 0xff) #define PUT_32LE(off, v) (PUT_16LE((off), (v)), PUT_16LE((off) + 2, (v) >> 16L)) -#define PUT_64LE(off, v) (PUT_32LE((off), (v)), PUT_32LE((off) + 4, (v) >> 32L)) +#define PUT_64LE(off, v) (PUT_32LE((off), (unsigned long long)(v)), \ + PUT_32LE((off) + 4, (unsigned long long)(v) >> 32L))
#define GET_16(off) (e_data == ELFDATA2MSB ? GET_16BE(off) : GET_16LE(off)) #define GET_32(off) (e_data == ELFDATA2MSB ? GET_32BE(off) : GET_32LE(off))
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tony Battersby tonyb@cybernetics.com
[ Upstream commit 4f6aaade2a22ac428fa99ed716cf2b87e79c9837 ]
When qla2xxx is loaded with qlini_mode=disabled, ha->flags.disable_msix_handshake is used before it is set, resulting in the wrong interrupt handler being used on certain HBAs (qla2xxx_msix_rsp_q_hs() is used when qla2xxx_msix_rsp_q() should be used). The only difference between these two interrupt handlers is that the _hs() version writes to a register to clear the "RISC" interrupt, whereas the other version does not. So this bug results in the RISC interrupt being cleared when it should not be. This occasionally causes a different interrupt handler qla24xx_msix_default() for a different vector to see ((stat & HSRX_RISC_INT) == 0) and ignore its interrupt, which then causes problems like:
qla2xxx [0000:02:00.0]-d04c:6: MBX Command timeout for cmd 20, iocontrol=8 jiffies=1090c0300 mb[0-3]=[0x4000 0x0 0x40 0xda] mb7 0x500 host_status 0x40000010 hccr 0x3f00 qla2xxx [0000:02:00.0]-101e:6: Mailbox cmd timeout occurred, cmd=0x20, mb[0]=0x20. Scheduling ISP abort (the cmd varies; sometimes it is 0x20, 0x22, 0x54, 0x5a, 0x5d, or 0x6a)
This problem can be reproduced with a 16 or 32 Gbps HBA by loading qla2xxx with qlini_mode=disabled and running a high IOPS test while triggering frequent RSCN database change events.
While analyzing the problem I discovered that even with disable_msix_handshake forced to 0, it is not necessary to clear the RISC interrupt from qla2xxx_msix_rsp_q_hs() (more below). So just completely remove qla2xxx_msix_rsp_q_hs() and the logic for selecting it, which also fixes the bug with qlini_mode=disabled.
The test below describes the justification for not needing qla2xxx_msix_rsp_q_hs():
Force disable_msix_handshake to 0: qla24xx_config_rings(): if (0 && (ha->fw_attributes & BIT_6) && (IS_MSIX_NACK_CAPABLE(ha)) && (ha->flags.msix_enabled)) {
In qla24xx_msix_rsp_q() and qla2xxx_msix_rsp_q_hs(), check: (rd_reg_dword(®->host_status) & HSRX_RISC_INT)
Count the number of calls to each function with HSRX_RISC_INT set and the number with HSRX_RISC_INT not set while performing some I/O.
If qla2xxx_msix_rsp_q_hs() clears the RISC interrupt (original code): qla24xx_msix_rsp_q: 50% of calls have HSRX_RISC_INT set qla2xxx_msix_rsp_q_hs: 5% of calls have HSRX_RISC_INT set (# of qla2xxx_msix_rsp_q_hs interrupts) = (# of qla24xx_msix_rsp_q interrupts) * 3
If qla2xxx_msix_rsp_q_hs() does not clear the RISC interrupt (patched code): qla24xx_msix_rsp_q: 100% of calls have HSRX_RISC_INT set qla2xxx_msix_rsp_q_hs: 9% of calls have HSRX_RISC_INT set (# of qla2xxx_msix_rsp_q_hs interrupts) = (# of qla24xx_msix_rsp_q interrupts) * 3
In the case of the original code, qla24xx_msix_rsp_q() was seeing HSRX_RISC_INT set only 50% of the time because qla2xxx_msix_rsp_q_hs() was clearing it when it shouldn't have been. In the patched code, qla24xx_msix_rsp_q() sees HSRX_RISC_INT set 100% of the time, which makes sense if that interrupt handler needs to clear the RISC interrupt (which it does). qla2xxx_msix_rsp_q_hs() sees HSRX_RISC_INT only 9% of the time, which is just overlap from the other interrupt during the high IOPS test.
Tested with SCST on: QLE2742 FW:v9.08.02 (32 Gbps 2-port) QLE2694L FW:v9.10.11 (16 Gbps 4-port) QLE2694L FW:v9.08.02 (16 Gbps 4-port) QLE2672 FW:v8.07.12 (16 Gbps 2-port) both initiator and target mode
Signed-off-by: Tony Battersby tonyb@cybernetics.com Link: https://patch.msgid.link/56d378eb-14ad-49c7-bae9-c649b6c7691e@cybernetics.co... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_def.h | 1 - drivers/scsi/qla2xxx/qla_gbl.h | 2 +- drivers/scsi/qla2xxx/qla_isr.c | 32 +++----------------------------- drivers/scsi/qla2xxx/qla_mid.c | 4 +--- 4 files changed, 5 insertions(+), 34 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index cb95b7b12051..b3265952c4be 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -3503,7 +3503,6 @@ struct isp_operations { #define QLA_MSIX_RSP_Q 0x01 #define QLA_ATIO_VECTOR 0x02 #define QLA_MSIX_QPAIR_MULTIQ_RSP_Q 0x03 -#define QLA_MSIX_QPAIR_MULTIQ_RSP_Q_HS 0x04
#define QLA_MIDX_DEFAULT 0 #define QLA_MIDX_RSP_Q 1 diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h index 145defc420f2..55d531c19e6b 100644 --- a/drivers/scsi/qla2xxx/qla_gbl.h +++ b/drivers/scsi/qla2xxx/qla_gbl.h @@ -766,7 +766,7 @@ extern int qla2x00_dfs_remove(scsi_qla_host_t *);
/* Globa function prototypes for multi-q */ extern int qla25xx_request_irq(struct qla_hw_data *, struct qla_qpair *, - struct qla_msix_entry *, int); + struct qla_msix_entry *); extern int qla25xx_init_req_que(struct scsi_qla_host *, struct req_que *); extern int qla25xx_init_rsp_que(struct scsi_qla_host *, struct rsp_que *); extern int qla25xx_create_req_que(struct qla_hw_data *, uint16_t, uint8_t, diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index c4c6b5c6658c..a3971afc2dd1 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -4467,32 +4467,6 @@ qla2xxx_msix_rsp_q(int irq, void *dev_id) return IRQ_HANDLED; }
-irqreturn_t -qla2xxx_msix_rsp_q_hs(int irq, void *dev_id) -{ - struct qla_hw_data *ha; - struct qla_qpair *qpair; - struct device_reg_24xx __iomem *reg; - unsigned long flags; - - qpair = dev_id; - if (!qpair) { - ql_log(ql_log_info, NULL, 0x505b, - "%s: NULL response queue pointer.\n", __func__); - return IRQ_NONE; - } - ha = qpair->hw; - - reg = &ha->iobase->isp24; - spin_lock_irqsave(&ha->hardware_lock, flags); - wrt_reg_dword(®->hccr, HCCRX_CLR_RISC_INT); - spin_unlock_irqrestore(&ha->hardware_lock, flags); - - queue_work(ha->wq, &qpair->q_work); - - return IRQ_HANDLED; -} - /* Interrupt handling helpers. */
struct qla_init_msix_entry { @@ -4505,7 +4479,6 @@ static const struct qla_init_msix_entry msix_entries[] = { { "rsp_q", qla24xx_msix_rsp_q }, { "atio_q", qla83xx_msix_atio_q }, { "qpair_multiq", qla2xxx_msix_rsp_q }, - { "qpair_multiq_hs", qla2xxx_msix_rsp_q_hs }, };
static const struct qla_init_msix_entry qla82xx_msix_entries[] = { @@ -4792,9 +4765,10 @@ qla2x00_free_irqs(scsi_qla_host_t *vha) }
int qla25xx_request_irq(struct qla_hw_data *ha, struct qla_qpair *qpair, - struct qla_msix_entry *msix, int vector_type) + struct qla_msix_entry *msix) { - const struct qla_init_msix_entry *intr = &msix_entries[vector_type]; + const struct qla_init_msix_entry *intr = + &msix_entries[QLA_MSIX_QPAIR_MULTIQ_RSP_Q]; scsi_qla_host_t *vha = pci_get_drvdata(ha->pdev); int ret;
diff --git a/drivers/scsi/qla2xxx/qla_mid.c b/drivers/scsi/qla2xxx/qla_mid.c index 8b71ac0b1d99..0abc47e72e0b 100644 --- a/drivers/scsi/qla2xxx/qla_mid.c +++ b/drivers/scsi/qla2xxx/qla_mid.c @@ -899,9 +899,7 @@ qla25xx_create_rsp_que(struct qla_hw_data *ha, uint16_t options, rsp->options, rsp->id, rsp->rsp_q_in, rsp->rsp_q_out);
- ret = qla25xx_request_irq(ha, qpair, qpair->msix, - ha->flags.disable_msix_handshake ? - QLA_MSIX_QPAIR_MULTIQ_RSP_Q : QLA_MSIX_QPAIR_MULTIQ_RSP_Q_HS); + ret = qla25xx_request_irq(ha, qpair, qpair->msix); if (ret) goto que_failed;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tony Battersby tonyb@cybernetics.com
[ Upstream commit 8f58fc64d559b5fda1b0a5e2a71422be61e79ab9 ]
When given the module parameter qlini_mode=exclusive, qla2xxx in initiator mode is initially unable to successfully send SCSI commands to devices it finds while scanning, resulting in an escalating series of resets until an adapter reset clears the issue. Fix by checking the active mode instead of the module parameter.
Signed-off-by: Tony Battersby tonyb@cybernetics.com Link: https://patch.msgid.link/1715ec14-ba9a-45dc-9cf2-d41aa6b81b5e@cybernetics.co... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_os.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 70a579cf9c3f..6b0f616b1ca0 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -3460,13 +3460,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id) ha->mqenable = 0;
if (ha->mqenable) { - bool startit = false; - - if (QLA_TGT_MODE_ENABLED()) - startit = false; - - if (ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED) - startit = true; + bool startit = !!(host->active_mode & MODE_INITIATOR);
/* Create start of day qpairs for Block MQ */ for (i = 0; i < ha->max_qpairs; i++)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tony Battersby tonyb@cybernetics.com
[ Upstream commit 957aa5974989fba4ae4f807ebcb27f12796edd4d ]
If a mailbox command completes immediately after wait_for_completion_timeout() times out, ha->mbx_intr_comp could be left in an inconsistent state, causing the next mailbox command not to wait for the hardware. Fix by reinitializing the completion before use.
Signed-off-by: Tony Battersby tonyb@cybernetics.com Link: https://patch.msgid.link/11b6485e-0bfd-4784-8f99-c06a196dad94@cybernetics.co... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_mbx.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c index 32eb0ce8b170..1f01576f044b 100644 --- a/drivers/scsi/qla2xxx/qla_mbx.c +++ b/drivers/scsi/qla2xxx/qla_mbx.c @@ -253,6 +253,7 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp) /* Issue set host interrupt command to send cmd out. */ ha->flags.mbox_int = 0; clear_bit(MBX_INTERRUPT, &ha->mbx_cmd_flags); + reinit_completion(&ha->mbx_intr_comp);
/* Unlock mbx registers and wait for interrupt */ ql_dbg(ql_dbg_mbx, vha, 0x100f, @@ -279,6 +280,7 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp) "cmd=%x Timeout.\n", command); spin_lock_irqsave(&ha->hardware_lock, flags); clear_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags); + reinit_completion(&ha->mbx_intr_comp); spin_unlock_irqrestore(&ha->hardware_lock, flags);
if (chip_reset != ha->chip_reset) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bernd Schubert bschubert@ddn.com
[ Upstream commit 1ce120dcefc056ce8af2486cebbb77a458aad4c3 ]
This was done as condition on direct_io_allow_mmap, but I believe this is not right, as a file might be open two times - once with write-back enabled another time with FOPEN_DIRECT_IO.
Signed-off-by: Bernd Schubert bschubert@ddn.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fuse/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c index f1ef77a0be05..c5c82b380791 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -1607,7 +1607,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, if (!ia) return -ENOMEM;
- if (fopen_direct_io && fc->direct_io_allow_mmap) { + if (fopen_direct_io) { res = filemap_write_and_wait_range(mapping, pos, pos + count - 1); if (res) { fuse_io_free(ia);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bernd Schubert bschubert@ddn.com
[ Upstream commit b359af8275a982a458e8df6c6beab1415be1f795 ]
generic_file_direct_write() also does this and has a large comment about.
Reproducer here is xfstest's generic/209, which is exactly to have competing DIO write and cached IO read.
Signed-off-by: Bernd Schubert bschubert@ddn.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fuse/file.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c index c5c82b380791..bb4ecfd469a5 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -1681,6 +1681,15 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, if (res > 0) *ppos = pos;
+ if (res > 0 && write && fopen_direct_io) { + /* + * As in generic_file_direct_write(), invalidate after the + * write, to invalidate read-ahead cache that may have competed + * with the write. + */ + invalidate_inode_pages2_range(mapping, idx_from, idx_to); + } + return res > 0 ? res : err; } EXPORT_SYMBOL_GPL(fuse_direct_io);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Qiang liqiang01@kylinos.cn
[ Upstream commit 7aa31ee9ec92915926e74731378c009c9cc04928 ]
The VIA watchdog driver uses allocate_resource() to reserve a MMIO region for the watchdog control register. However, the allocated resource was not given a name, which causes the kernel resource tree to contain an entry marked as "<BAD>" under /proc/iomem on x86 platforms.
During boot, this unnamed resource can lead to a critical hang because subsequent resource lookups and conflict checks fail to handle the invalid entry properly.
Signed-off-by: Li Qiang liqiang01@kylinos.cn Reviewed-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/via_wdt.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/watchdog/via_wdt.c b/drivers/watchdog/via_wdt.c index d647923d68fe..f55576392651 100644 --- a/drivers/watchdog/via_wdt.c +++ b/drivers/watchdog/via_wdt.c @@ -165,6 +165,7 @@ static int wdt_probe(struct pci_dev *pdev, dev_err(&pdev->dev, "cannot enable PCI device\n"); return -ENODEV; } + wdt_res.name = "via_wdt";
/* * Allocate a MMIO region which contains watchdog control register
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit e5bf5ee266633cb18fff6f98f0b7d59a62819eee ]
ffs_epfile_open() can race with removal, ending up with file->private_data pointing to freed object.
There is a total count of opened files on functionfs (both ep0 and dynamic ones) and when it hits zero, dynamic files get removed. Unfortunately, that removal can happen while another thread is in ffs_epfile_open(), but has not incremented the count yet. In that case open will succeed, leaving us with UAF on any subsequent read() or write().
The root cause is that ffs->opened is misused; atomic_dec_and_test() vs. atomic_add_return() is not a good idea, when object remains visible all along.
To untangle that * serialize openers on ffs->mutex (both for ep0 and for dynamic files) * have dynamic ones use atomic_inc_not_zero() and fail if we had zero ->opened; in that case the file we are opening is doomed. * have the inodes of dynamic files marked on removal (from the callback of simple_recursive_removal()) - clear ->i_private there. * have open of dynamic ones verify they hadn't been already removed, along with checking that state is FFS_ACTIVE.
Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/function/f_fs.c | 53 ++++++++++++++++++++++++------ 1 file changed, 43 insertions(+), 10 deletions(-)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index 47cfbe41fdff..69f6e3c0f7e0 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -640,13 +640,22 @@ static ssize_t ffs_ep0_read(struct file *file, char __user *buf,
static int ffs_ep0_open(struct inode *inode, struct file *file) { - struct ffs_data *ffs = inode->i_private; + struct ffs_data *ffs = inode->i_sb->s_fs_info; + int ret;
- if (ffs->state == FFS_CLOSING) - return -EBUSY; + /* Acquire mutex */ + ret = ffs_mutex_lock(&ffs->mutex, file->f_flags & O_NONBLOCK); + if (ret < 0) + return ret;
- file->private_data = ffs; ffs_data_opened(ffs); + if (ffs->state == FFS_CLOSING) { + ffs_data_closed(ffs); + mutex_unlock(&ffs->mutex); + return -EBUSY; + } + mutex_unlock(&ffs->mutex); + file->private_data = ffs;
return stream_open(inode, file); } @@ -1193,14 +1202,33 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data) static int ffs_epfile_open(struct inode *inode, struct file *file) { - struct ffs_epfile *epfile = inode->i_private; + struct ffs_data *ffs = inode->i_sb->s_fs_info; + struct ffs_epfile *epfile; + int ret;
- if (WARN_ON(epfile->ffs->state != FFS_ACTIVE)) + /* Acquire mutex */ + ret = ffs_mutex_lock(&ffs->mutex, file->f_flags & O_NONBLOCK); + if (ret < 0) + return ret; + + if (!atomic_inc_not_zero(&ffs->opened)) { + mutex_unlock(&ffs->mutex); + return -ENODEV; + } + /* + * we want the state to be FFS_ACTIVE; FFS_ACTIVE alone is + * not enough, though - we might have been through FFS_CLOSING + * and back to FFS_ACTIVE, with our file already removed. + */ + epfile = smp_load_acquire(&inode->i_private); + if (unlikely(ffs->state != FFS_ACTIVE || !epfile)) { + mutex_unlock(&ffs->mutex); + ffs_data_closed(ffs); return -ENODEV; + } + mutex_unlock(&ffs->mutex);
file->private_data = epfile; - ffs_data_opened(epfile->ffs); - return stream_open(inode, file); }
@@ -1332,7 +1360,7 @@ static void ffs_dmabuf_put(struct dma_buf_attachment *attach) static int ffs_epfile_release(struct inode *inode, struct file *file) { - struct ffs_epfile *epfile = inode->i_private; + struct ffs_epfile *epfile = file->private_data; struct ffs_dmabuf_priv *priv, *tmp; struct ffs_data *ffs = epfile->ffs;
@@ -2352,6 +2380,11 @@ static int ffs_epfiles_create(struct ffs_data *ffs) return 0; }
+static void clear_one(struct dentry *dentry) +{ + smp_store_release(&dentry->d_inode->i_private, NULL); +} + static void ffs_epfiles_destroy(struct ffs_epfile *epfiles, unsigned count) { struct ffs_epfile *epfile = epfiles; @@ -2359,7 +2392,7 @@ static void ffs_epfiles_destroy(struct ffs_epfile *epfiles, unsigned count) for (; count; --count, ++epfile) { BUG_ON(mutex_is_locked(&epfile->mutex)); if (epfile->dentry) { - simple_recursive_removal(epfile->dentry, NULL); + simple_recursive_removal(epfile->dentry, clear_one); epfile->dentry = NULL; } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Encrow Thorne jyc0019@gmail.com
[ Upstream commit f3d8b64ee46c9b4b0b82b1a4642027728bac95b8 ]
RESET_CONTROL_FLAGS_BIT_* macros use BIT(), but reset.h does not include bits.h. This causes compilation errors when including reset.h standalone.
Include bits.h to make reset.h self-contained.
Suggested-by: Troy Mitchell troy.mitchell@linux.dev Reviewed-by: Troy Mitchell troy.mitchell@linux.dev Reviewed-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Encrow Thorne jyc0019@gmail.com Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/reset.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/include/linux/reset.h b/include/linux/reset.h index 840d75d172f6..44f9e3415f92 100644 --- a/include/linux/reset.h +++ b/include/linux/reset.h @@ -2,6 +2,7 @@ #ifndef _LINUX_RESET_H_ #define _LINUX_RESET_H_
+#include <linux/bits.h> #include <linux/err.h> #include <linux/errno.h> #include <linux/types.h>
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuezhang Mo Yuezhang.Mo@sony.com
[ Upstream commit 51fc7b4ce10ccab8ea5e4876bcdc42cf5202a0ef ]
The kernel test robot reported that the exFAT remount operation failed. The reason for the failure was that the process's umask is different between mount and remount, causing fs_fmask and fs_dmask are changed.
Potentially, both gid and uid may also be changed. Therefore, when initializing fs_context for remount, inherit these mount options from the options used during mount.
Reported-by: kernel test robot oliver.sang@intel.com Closes: https://lore.kernel.org/oe-lkp/202511251637.81670f5c-lkp@intel.com Signed-off-by: Yuezhang Mo Yuezhang.Mo@sony.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/exfat/super.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/fs/exfat/super.c b/fs/exfat/super.c index 74d451f732c7..581754001128 100644 --- a/fs/exfat/super.c +++ b/fs/exfat/super.c @@ -813,10 +813,21 @@ static int exfat_init_fs_context(struct fs_context *fc) ratelimit_state_init(&sbi->ratelimit, DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST);
- sbi->options.fs_uid = current_uid(); - sbi->options.fs_gid = current_gid(); - sbi->options.fs_fmask = current->fs->umask; - sbi->options.fs_dmask = current->fs->umask; + if (fc->purpose == FS_CONTEXT_FOR_RECONFIGURE && fc->root) { + struct super_block *sb = fc->root->d_sb; + struct exfat_mount_options *cur_opts = &EXFAT_SB(sb)->options; + + sbi->options.fs_uid = cur_opts->fs_uid; + sbi->options.fs_gid = cur_opts->fs_gid; + sbi->options.fs_fmask = cur_opts->fs_fmask; + sbi->options.fs_dmask = cur_opts->fs_dmask; + } else { + sbi->options.fs_uid = current_uid(); + sbi->options.fs_gid = current_gid(); + sbi->options.fs_fmask = current->fs->umask; + sbi->options.fs_dmask = current->fs->umask; + } + sbi->options.allow_utime = -1; sbi->options.errors = EXFAT_ERRORS_RO; exfat_set_iocharset(&sbi->options, exfat_default_iocharset);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuezhang Mo Yuezhang.Mo@sony.com
[ Upstream commit 4e163c39dd4e70fcdce948b8774d96e0482b4a11 ]
xfstests generic/363 was failing due to unzeroed post-EOF page cache that allowed mmap writes beyond EOF to become visible after file extension.
For example, in following xfs_io sequence, 0x22 should not be written to the file but would become visible after the extension:
xfs_io -f -t -c "pwrite -S 0x11 0 8" \ -c "mmap 0 4096" \ -c "mwrite -S 0x22 32 32" \ -c "munmap" \ -c "pwrite -S 0x33 512 32" \ $testfile
This violates the expected behavior where writes beyond EOF via mmap should not persist after the file is extended. Instead, the extended region should contain zeros.
Fix this by using truncate_pagecache() to truncate the page cache after the current EOF when extending the file.
Signed-off-by: Yuezhang Mo Yuezhang.Mo@sony.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/exfat/file.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/fs/exfat/file.c b/fs/exfat/file.c index adc37b4d7fc2..536c8078f0c1 100644 --- a/fs/exfat/file.c +++ b/fs/exfat/file.c @@ -25,6 +25,8 @@ static int exfat_cont_expand(struct inode *inode, loff_t size) struct exfat_sb_info *sbi = EXFAT_SB(sb); struct exfat_chain clu;
+ truncate_pagecache(inode, i_size_read(inode)); + ret = inode_newsize_ok(inode, size); if (ret) return ret; @@ -639,6 +641,9 @@ static ssize_t exfat_file_write_iter(struct kiocb *iocb, struct iov_iter *iter)
inode_lock(inode);
+ if (pos > i_size_read(inode)) + truncate_pagecache(inode, i_size_read(inode)); + valid_size = ei->valid_size;
ret = generic_write_checks(iocb, iter);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lizhi Xu lizhi.xu@windriver.com
[ Upstream commit 09bf21bf5249880f62fe759b53b14b4b52900c6c ]
Interrupts are disabled before entering usb_hcd_giveback_urb(). A spinlock_t becomes a sleeping lock on PREEMPT_RT, so it cannot be acquired with disabled interrupts.
Save the interrupt status and restore it after usb_hcd_giveback_urb().
syz reported: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 Call Trace: dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 rt_spin_lock+0xc7/0x2c0 kernel/locking/spinlock_rt.c:57 spin_lock include/linux/spinlock_rt.h:44 [inline] mon_bus_complete drivers/usb/mon/mon_main.c:134 [inline] mon_complete+0x5c/0x200 drivers/usb/mon/mon_main.c:147 usbmon_urb_complete include/linux/usb/hcd.h:738 [inline] __usb_hcd_giveback_urb+0x254/0x5e0 drivers/usb/core/hcd.c:1647 vhci_urb_enqueue+0xb4f/0xe70 drivers/usb/usbip/vhci_hcd.c:818
Reported-by: syzbot+205ef33a3b636b4181fb@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=205ef33a3b636b4181fb Signed-off-by: Lizhi Xu lizhi.xu@windriver.com Acked-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/20250916014143.1439759-1-lizhi.xu@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/usbip/vhci_hcd.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c index 0d6c10a8490c..f7e405abe608 100644 --- a/drivers/usb/usbip/vhci_hcd.c +++ b/drivers/usb/usbip/vhci_hcd.c @@ -831,15 +831,15 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag no_need_xmit: usb_hcd_unlink_urb_from_ep(hcd, urb); no_need_unlink: - spin_unlock_irqrestore(&vhci->lock, flags); if (!ret) { /* usb_hcd_giveback_urb() should be called with * irqs disabled */ - local_irq_disable(); + spin_unlock(&vhci->lock); usb_hcd_giveback_urb(hcd, urb, urb->status); - local_irq_enable(); + spin_lock(&vhci->lock); } + spin_unlock_irqrestore(&vhci->lock, flags); return ret; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Pearson mpearson-lenovo@squebb.ca
[ Upstream commit 30cd2cb1abf4c4acdb1ddb468c946f68939819fb ]
The UCSI spec states that the num_connectors field is 7 bits, and the 8th bit is reserved and should be set to zero. Some buggy FW has been known to set this bit, and it can lead to a system not booting. Flag that the FW is not behaving correctly, and auto-fix the value so that the system boots correctly.
Found on Lenovo P1 G8 during Linux enablement program. The FW will be fixed, but seemed worth addressing in case it hit platforms that aren't officially Linux supported.
Signed-off-by: Mark Pearson mpearson-lenovo@squebb.ca Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20250821185319.2585023-1-mpearson-lenovo@squebb.ca Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/ucsi/ucsi.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index 3f568f790f39..3995483a0aa0 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -1807,6 +1807,12 @@ static int ucsi_init(struct ucsi *ucsi) ret = -ENODEV; goto err_reset; } + /* Check if reserved bit set. This is out of spec but happens in buggy FW */ + if (ucsi->cap.num_connectors & 0x80) { + dev_warn(ucsi->dev, "UCSI: Invalid num_connectors %d. Likely buggy FW\n", + ucsi->cap.num_connectors); + ucsi->cap.num_connectors &= 0x7f; // clear bit and carry on + }
/* Allocate the connectors. Released in ucsi_unregister() */ connector = kcalloc(ucsi->cap.num_connectors + 1, sizeof(*connector), GFP_KERNEL);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pei Xiao xiaopei01@kylinos.cn
[ Upstream commit c9fb952360d0c78bbe98239bd6b702f05c2dbb31 ]
FIELD_PREP() checks that a value fits into the available bitfield, add a check for step_avg to fix gcc complains.
which gcc complains about: drivers/iio/adc/ti_am335x_adc.c: In function 'tiadc_step_config': include/linux/compiler_types.h:572:38: error: call to '__compiletime_assert_491' declared with attribute error: FIELD_PREP: value too large for the field include/linux/mfd/ti_am335x_tscadc.h:58:29: note: in expansion of macro 'FIELD_PREP' #define STEPCONFIG_AVG(val) FIELD_PREP(GENMASK(4, 2), (val)) ^~~~~~~~~~ drivers/iio/adc/ti_am335x_adc.c:127:17: note: in expansion of macro 'STEPCONFIG_AVG' stepconfig = STEPCONFIG_AVG(ffs(adc_dev->step_avg[i]) - 1)
Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202510102117.Jqxrw1vF-lkp@intel.com/ Signed-off-by: Pei Xiao xiaopei01@kylinos.cn Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/adc/ti_am335x_adc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/adc/ti_am335x_adc.c b/drivers/iio/adc/ti_am335x_adc.c index 99f274adc870..a1a28584de93 100644 --- a/drivers/iio/adc/ti_am335x_adc.c +++ b/drivers/iio/adc/ti_am335x_adc.c @@ -123,7 +123,7 @@ static void tiadc_step_config(struct iio_dev *indio_dev)
chan = adc_dev->channel_line[i];
- if (adc_dev->step_avg[i]) + if (adc_dev->step_avg[i] && adc_dev->step_avg[i] <= STEPCONFIG_AVG_16) stepconfig = STEPCONFIG_AVG(ffs(adc_dev->step_avg[i]) - 1) | STEPCONFIG_FIFO1; else
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hongyu Xie xiehongyu1@kylinos.cn
[ Upstream commit 8d34983720155b8f05de765f0183d9b0e1345cc0 ]
run_graceperiod blocks usb 2.0 devices from auto suspending after xhci_start for 500ms.
Log shows: [ 13.387170] xhci_hub_control:1271: xhci-hcd PNP0D10:03: Get port status 7-1 read: 0x2a0, return 0x100 [ 13.387177] hub_event:5779: hub 7-0:1.0: state 7 ports 1 chg 0000 evt 0000 [ 13.387182] hub_suspend:3903: hub 7-0:1.0: hub_suspend [ 13.387188] hcd_bus_suspend:2250: usb usb7: bus auto-suspend, wakeup 1 [ 13.387191] hcd_bus_suspend:2279: usb usb7: suspend raced with wakeup event [ 13.387193] hcd_bus_resume:2303: usb usb7: usb auto-resume [ 13.387296] hub_event:5779: hub 3-0:1.0: state 7 ports 1 chg 0000 evt 0000 [ 13.393343] handle_port_status:2034: xhci-hcd PNP0D10:02: handle_port_status: starting usb5 port polling. [ 13.393353] xhci_hub_control:1271: xhci-hcd PNP0D10:02: Get port status 5-1 read: 0x206e1, return 0x10101 [ 13.400047] hub_suspend:3903: hub 3-0:1.0: hub_suspend [ 13.403077] hub_resume:3948: hub 7-0:1.0: hub_resume [ 13.403080] xhci_hub_control:1271: xhci-hcd PNP0D10:03: Get port status 7-1 read: 0x2a0, return 0x100 [ 13.403085] hub_event:5779: hub 7-0:1.0: state 7 ports 1 chg 0000 evt 0000 [ 13.403087] hub_suspend:3903: hub 7-0:1.0: hub_suspend [ 13.403090] hcd_bus_suspend:2250: usb usb7: bus auto-suspend, wakeup 1 [ 13.403093] hcd_bus_suspend:2279: usb usb7: suspend raced with wakeup event [ 13.403095] hcd_bus_resume:2303: usb usb7: usb auto-resume [ 13.405002] handle_port_status:1913: xhci-hcd PNP0D10:04: Port change event, 9-1, id 1, portsc: 0x6e1 [ 13.405016] hub_activate:1169: usb usb5-port1: status 0101 change 0001 [ 13.405026] xhci_clear_port_change_bit:658: xhci-hcd PNP0D10:02: clear port1 connect change, portsc: 0x6e1 [ 13.413275] hcd_bus_suspend:2250: usb usb3: bus auto-suspend, wakeup 1 [ 13.419081] hub_resume:3948: hub 7-0:1.0: hub_resume [ 13.419086] xhci_hub_control:1271: xhci-hcd PNP0D10:03: Get port status 7-1 read: 0x2a0, return 0x100 [ 13.419095] hub_event:5779: hub 7-0:1.0: state 7 ports 1 chg 0000 evt 0000 [ 13.419100] hub_suspend:3903: hub 7-0:1.0: hub_suspend [ 13.419106] hcd_bus_suspend:2250: usb usb7: bus auto-suspend, wakeup 1 [ 13.419110] hcd_bus_suspend:2279: usb usb7: suspend raced with wakeup event [ 13.419112] hcd_bus_resume:2303: usb usb7: usb auto-resume [ 13.420455] handle_port_status:2034: xhci-hcd PNP0D10:04: handle_port_status: starting usb9 port polling. [ 13.420493] handle_port_status:1913: xhci-hcd PNP0D10:05: Port change event, 10-1, id 1, portsc: 0x6e1 [ 13.425332] hcd_bus_suspend:2279: usb usb3: suspend raced with wakeup event [ 13.431931] handle_port_status:2034: xhci-hcd PNP0D10:05: handle_port_status: starting usb10 port polling. [ 13.435080] hub_resume:3948: hub 7-0:1.0: hub_resume [ 13.435084] xhci_hub_control:1271: xhci-hcd PNP0D10:03: Get port status 7-1 read: 0x2a0, return 0x100 [ 13.435092] hub_event:5779: hub 7-0:1.0: state 7 ports 1 chg 0000 evt 0000 [ 13.435096] hub_suspend:3903: hub 7-0:1.0: hub_suspend [ 13.435102] hcd_bus_suspend:2250: usb usb7: bus auto-suspend, wakeup 1 [ 13.435106] hcd_bus_suspend:2279: usb usb7: suspend raced with wakeup event
usb7 and other usb 2.0 root hub were rapidly toggling between suspend and resume states. More, "suspend raced with wakeup event" confuses people.
So, limit run_graceperiod for only usb 3.0 devices
Signed-off-by: Hongyu Xie xiehongyu1@kylinos.cn Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://patch.msgid.link/20251119142417.2820519-2-mathias.nyman@linux.intel.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/xhci-hub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c index b3a59ce1b3f4..5e1442e91743 100644 --- a/drivers/usb/host/xhci-hub.c +++ b/drivers/usb/host/xhci-hub.c @@ -1671,7 +1671,7 @@ int xhci_hub_status_data(struct usb_hcd *hcd, char *buf) * SS devices are only visible to roothub after link training completes. * Keep polling roothubs for a grace period after xHC start */ - if (xhci->run_graceperiod) { + if (hcd->speed >= HCD_USB3 && xhci->run_graceperiod) { if (time_before(jiffies, xhci->run_graceperiod)) status = 1; else
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Changcheng chenchangcheng@kylinos.cn
[ Upstream commit 955a48a5353f4fe009704a9a4272a3adf627cd35 ]
The optical drive of EL-R12 has the same vid and pid as INIC-3069, as follows: T: Bus=02 Lev=02 Prnt=02 Port=01 Cnt=01 Dev#= 3 Spd=5000 MxCh= 0 D: Ver= 3.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=13fd ProdID=3940 Rev= 3.10 S: Manufacturer=HL-DT-ST S: Product= DVD+-RW GT80N S: SerialNumber=423349524E4E38303338323439202020 C:* #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=144mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=02 Prot=50 Driver=usb-storage E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=0a(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
This will result in the optical drive device also adding the quirks of US_FL_NO_ATA_1X. When performing an erase operation, it will fail, and the reason for the failure is as follows: [ 388.967742] sr 5:0:0:0: [sr0] tag#0 Send: scmd 0x00000000d20c33a7 [ 388.967742] sr 5:0:0:0: [sr0] tag#0 CDB: ATA command pass through(12)/Blank a1 11 00 00 00 00 00 00 00 00 00 00 [ 388.967773] sr 5:0:0:0: [sr0] tag#0 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK cmd_age=0s [ 388.967773] sr 5:0:0:0: [sr0] tag#0 CDB: ATA command pass through(12)/Blank a1 11 00 00 00 00 00 00 00 00 00 00 [ 388.967803] sr 5:0:0:0: [sr0] tag#0 Sense Key : Illegal Request [current] [ 388.967803] sr 5:0:0:0: [sr0] tag#0 Add. Sense: Invalid field in cdb [ 388.967803] sr 5:0:0:0: [sr0] tag#0 scsi host busy 1 failed 0 [ 388.967803] sr 5:0:0:0: Notifying upper driver of completion (result 8100002) [ 388.967834] sr 5:0:0:0: [sr0] tag#0 0 sectors total, 0 bytes done.
For the EL-R12 standard optical drive, all operational commands and usage scenarios were tested without adding the IGNORE_RESIDUE quirks, and no issues were encountered. It can be reasonably concluded that removing the IGNORE_RESIDUE quirks has no impact.
Signed-off-by: Chen Changcheng chenchangcheng@kylinos.cn Link: https://patch.msgid.link/20251121064020.29332-1-chenchangcheng@kylinos.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/storage/unusual_uas.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/storage/unusual_uas.h b/drivers/usb/storage/unusual_uas.h index 1477e31d7763..b695f5ba9a40 100644 --- a/drivers/usb/storage/unusual_uas.h +++ b/drivers/usb/storage/unusual_uas.h @@ -98,7 +98,7 @@ UNUSUAL_DEV(0x125f, 0xa94a, 0x0160, 0x0160, US_FL_NO_ATA_1X),
/* Reported-by: Benjamin Tissoires benjamin.tissoires@redhat.com */ -UNUSUAL_DEV(0x13fd, 0x3940, 0x0000, 0x9999, +UNUSUAL_DEV(0x13fd, 0x3940, 0x0309, 0x0309, "Initio Corporation", "INIC-3069", USB_SC_DEVICE, USB_PR_DEVICE, NULL,
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Pecio michal.pecio@gmail.com
[ Upstream commit e6aec6d9f5794e85d2312497a5d81296d885090e ]
Some old HCs ignore transfer ring link TRBs whose chain bit is unset. This breaks endpoint operation and sometimes makes it execute other ring's TDs, which may corrupt their buffers or cause unwanted device action. We avoid this by chaining all link TRBs on affected rings.
Fix an omission which allows them to be unchained by cancelling TDs.
The patch was tested by reproducing this condition on an isochronous endpoint (non-power-of-two TDs are sometimes split not to cross 64K) and printing link TRBs in trb_to_noop() on good and buggy HCs.
Actual hardware malfunction is rare since it requires Missed Service Error shortly before the unchained link TRB, at least on NEC and AMD. I have never seen it after commit bb0ba4cb1065 ("usb: xhci: Apply the link chain quirk on NEC isoc endpoints"), but it's Russian roulette and I can't test all affected hosts and workloads. Fairly often MSEs happen after cancellation because the endpoint was stopped.
Signed-off-by: Michal Pecio michal.pecio@gmail.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://patch.msgid.link/20251119142417.2820519-11-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/xhci-ring.c | 27 ++++++++++++++++----------- 1 file changed, 16 insertions(+), 11 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 5bdcf9ab2b99..25185552287c 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -128,11 +128,11 @@ static void inc_td_cnt(struct urb *urb) urb_priv->num_tds_done++; }
-static void trb_to_noop(union xhci_trb *trb, u32 noop_type) +static void trb_to_noop(union xhci_trb *trb, u32 noop_type, bool unchain_links) { if (trb_is_link(trb)) { - /* unchain chained link TRBs */ - trb->link.control &= cpu_to_le32(~TRB_CHAIN); + if (unchain_links) + trb->link.control &= cpu_to_le32(~TRB_CHAIN); } else { trb->generic.field[0] = 0; trb->generic.field[1] = 0; @@ -465,7 +465,7 @@ static void xhci_handle_stopped_cmd_ring(struct xhci_hcd *xhci, xhci_dbg(xhci, "Turn aborted command %p to no-op\n", i_cmd->command_trb);
- trb_to_noop(i_cmd->command_trb, TRB_CMD_NOOP); + trb_to_noop(i_cmd->command_trb, TRB_CMD_NOOP, false);
/* * caller waiting for completion is called when command @@ -797,13 +797,18 @@ static int xhci_move_dequeue_past_td(struct xhci_hcd *xhci, * (The last TRB actually points to the ring enqueue pointer, which is not part * of this TD.) This is used to remove partially enqueued isoc TDs from a ring. */ -static void td_to_noop(struct xhci_td *td, bool flip_cycle) +static void td_to_noop(struct xhci_hcd *xhci, struct xhci_virt_ep *ep, + struct xhci_td *td, bool flip_cycle) { + bool unchain_links; struct xhci_segment *seg = td->start_seg; union xhci_trb *trb = td->start_trb;
+ /* link TRBs should now be unchained, but some old HCs expect otherwise */ + unchain_links = !xhci_link_chain_quirk(xhci, ep->ring ? ep->ring->type : TYPE_STREAM); + while (1) { - trb_to_noop(trb, TRB_TR_NOOP); + trb_to_noop(trb, TRB_TR_NOOP, unchain_links);
/* flip cycle if asked to */ if (flip_cycle && trb != td->start_trb && trb != td->end_trb) @@ -1091,16 +1096,16 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep) "Found multiple active URBs %p and %p in stream %u?\n", td->urb, cached_td->urb, td->urb->stream_id); - td_to_noop(cached_td, false); + td_to_noop(xhci, ep, cached_td, false); cached_td->cancel_status = TD_CLEARED; } - td_to_noop(td, false); + td_to_noop(xhci, ep, td, false); td->cancel_status = TD_CLEARING_CACHE; cached_td = td; break; } } else { - td_to_noop(td, false); + td_to_noop(xhci, ep, td, false); td->cancel_status = TD_CLEARED; } } @@ -1125,7 +1130,7 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep) continue; xhci_warn(xhci, "Failed to clear cancelled cached URB %p, mark clear anyway\n", td->urb); - td_to_noop(td, false); + td_to_noop(xhci, ep, td, false); td->cancel_status = TD_CLEARED; } } @@ -4273,7 +4278,7 @@ static int xhci_queue_isoc_tx(struct xhci_hcd *xhci, gfp_t mem_flags, */ urb_priv->td[0].end_trb = ep_ring->enqueue; /* Every TRB except the first & last will have its cycle bit flipped. */ - td_to_noop(&urb_priv->td[0], true); + td_to_noop(xhci, xep, &urb_priv->td[0], true);
/* Reset the ring enqueue back to the first TRB and its cycle bit. */ ep_ring->enqueue = urb_priv->td[0].start_trb;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wenhua Lin Wenhua.Lin@unisoc.com
[ Upstream commit 29e8a0c587e328ed458380a45d6028adf64d7487 ]
In sprd_clk_init(), when devm_clk_get() returns -EPROBE_DEFER for either uart or source clock, we should propagate the error instead of just warning and continuing with NULL clocks.
Currently the driver only emits a warning when clock acquisition fails and proceeds with NULL clock pointers. This can lead to issues later when the clocks are actually needed. More importantly, when the clock provider is not ready yet and returns -EPROBE_DEFER, we should return this error to allow deferred probing.
This change adds explicit checks for -EPROBE_DEFER after both: 1. devm_clk_get(uport->dev, uart) 2. devm_clk_get(uport->dev, source)
When -EPROBE_DEFER is encountered, the function now returns -EPROBE_DEFER to let the driver framework retry probing later when the clock dependencies are resolved.
Signed-off-by: Wenhua Lin Wenhua.Lin@unisoc.com Link: https://patch.msgid.link/20251022030840.956589-1-Wenhua.Lin@unisoc.com Reviewed-by: Cixi Geng cixi.geng@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/sprd_serial.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/tty/serial/sprd_serial.c b/drivers/tty/serial/sprd_serial.c index 8c9366321f8e..092755f35683 100644 --- a/drivers/tty/serial/sprd_serial.c +++ b/drivers/tty/serial/sprd_serial.c @@ -1133,6 +1133,9 @@ static int sprd_clk_init(struct uart_port *uport)
clk_uart = devm_clk_get(uport->dev, "uart"); if (IS_ERR(clk_uart)) { + if (PTR_ERR(clk_uart) == -EPROBE_DEFER) + return -EPROBE_DEFER; + dev_warn(uport->dev, "uart%d can't get uart clock\n", uport->line); clk_uart = NULL; @@ -1140,6 +1143,9 @@ static int sprd_clk_init(struct uart_port *uport)
clk_parent = devm_clk_get(uport->dev, "source"); if (IS_ERR(clk_parent)) { + if (PTR_ERR(clk_parent) == -EPROBE_DEFER) + return -EPROBE_DEFER; + dev_warn(uport->dev, "uart%d can't get source clock\n", uport->line); clk_parent = NULL;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
[ Upstream commit a0a4173631bfcfd3520192c0a61cf911d6a52c3a ]
Passing an empty map to perf_cpu_map__max triggered a SEGV. Explicitly test for the empty map.
Reported-by: Ingo Molnar mingo@kernel.org Closes: https://lore.kernel.org/linux-perf-users/aSwt7yzFjVJCEmVp@gmail.com/ Tested-by: Ingo Molnar mingo@kernel.org Signed-off-by: Ian Rogers irogers@google.com Tested-by: Thomas Richter tmricht@linux.ibm.com Signed-off-by: Namhyung Kim namhyung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/perf/cpumap.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index b20a5280f2b3..2bbbe1c782b8 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -368,10 +368,12 @@ struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map) .cpu = -1 };
- // cpu_map__trim_new() qsort()s it, cpu_map__default_new() sorts it as well. - return __perf_cpu_map__nr(map) > 0 - ? __perf_cpu_map__cpu(map, __perf_cpu_map__nr(map) - 1) - : result; + if (!map) + return result; + + // The CPUs are always sorted and nr is always > 0 as 0 length map is + // encoded as NULL. + return __perf_cpu_map__cpu(map, __perf_cpu_map__nr(map) - 1); }
/** Is 'b' a subset of 'a'. */
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Reidel adrian@mainlining.org
[ Upstream commit e3c13e0caa8ceb7dec1a7c4fcfd9dbef56a69fbe ]
Set CLK_OPS_PARENT_ENABLE to ensure the parent gets prepared and enabled when switching to it, fixing an "rcg didn't update its configuration" warning.
Signed-off-by: Jens Reidel adrian@mainlining.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com Link: https://lore.kernel.org/r/20250919-sm7150-dispcc-fixes-v1-3-308ad47c5fce@mai... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/dispcc-sm7150.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/qcom/dispcc-sm7150.c b/drivers/clk/qcom/dispcc-sm7150.c index bdfff246ed3f..ddc7230b8aea 100644 --- a/drivers/clk/qcom/dispcc-sm7150.c +++ b/drivers/clk/qcom/dispcc-sm7150.c @@ -356,7 +356,7 @@ static struct clk_rcg2 dispcc_mdss_pclk0_clk_src = { .name = "dispcc_mdss_pclk0_clk_src", .parent_data = dispcc_parent_data_4, .num_parents = ARRAY_SIZE(dispcc_parent_data_4), - .flags = CLK_SET_RATE_PARENT, + .flags = CLK_SET_RATE_PARENT | CLK_OPS_PARENT_ENABLE, .ops = &clk_pixel_ops, }, };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jinhui Guo guojinhui.liam@bytedance.com
[ Upstream commit d3429178ee51dd7155445d15a5ab87a45fae3c73 ]
When probing the I2C master, disable SMBus interrupts to prevent storms caused by broken firmware mis-configuring IC_SMBUS=1; the handler never services them and a mis-configured SMBUS Master extend-clock timeout or SMBUS Slave extend-clock timeout can flood the CPU.
Signed-off-by: Jinhui Guo guojinhui.liam@bytedance.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Andi Shyti andi.shyti@kernel.org Link: https://lore.kernel.org/r/20251021075714.3712-2-guojinhui.liam@bytedance.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-designware-core.h | 1 + drivers/i2c/busses/i2c-designware-master.c | 7 +++++++ 2 files changed, 8 insertions(+)
diff --git a/drivers/i2c/busses/i2c-designware-core.h b/drivers/i2c/busses/i2c-designware-core.h index 347843b4f5dd..436555543c79 100644 --- a/drivers/i2c/busses/i2c-designware-core.h +++ b/drivers/i2c/busses/i2c-designware-core.h @@ -78,6 +78,7 @@ #define DW_IC_TX_ABRT_SOURCE 0x80 #define DW_IC_ENABLE_STATUS 0x9c #define DW_IC_CLR_RESTART_DET 0xa8 +#define DW_IC_SMBUS_INTR_MASK 0xcc #define DW_IC_COMP_PARAM_1 0xf4 #define DW_IC_COMP_VERSION 0xf8 #define DW_IC_SDA_HOLD_MIN_VERS 0x3131312A /* "111*" == v1.11* */ diff --git a/drivers/i2c/busses/i2c-designware-master.c b/drivers/i2c/busses/i2c-designware-master.c index 41e9b5ecad20..45bfca05bb30 100644 --- a/drivers/i2c/busses/i2c-designware-master.c +++ b/drivers/i2c/busses/i2c-designware-master.c @@ -220,6 +220,13 @@ static int i2c_dw_init_master(struct dw_i2c_dev *dev) /* Disable the adapter */ __i2c_dw_disable(dev);
+ /* + * Mask SMBus interrupts to block storms from broken + * firmware that leaves IC_SMBUS=1; the handler never + * services them. + */ + regmap_write(dev->map, DW_IC_SMBUS_INTR_MASK, 0); + /* Write standard speed timing parameters */ regmap_write(dev->map, DW_IC_SS_SCL_HCNT, dev->ss_hcnt); regmap_write(dev->map, DW_IC_SS_SCL_LCNT, dev->ss_lcnt);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Derek J. Clark derekjohn.clark@gmail.com
[ Upstream commit 55715d7ad5e772d621c3201da3895f250591bce8 ]
Add Legion Go 2 SKU's to the Extreme Mode quirks table.
Signed-off-by: Derek J. Clark derekjohn.clark@gmail.com Reviewed-by: Armin Wolf W_Armin@gmx.de Reviewed-by: Mark Pearson mpearson-lenovo@squebb.ca Link: https://patch.msgid.link/20251127151605.1018026-4-derekjohn.clark@gmail.com Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/lenovo/wmi-gamezone.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/x86/lenovo/wmi-gamezone.c b/drivers/platform/x86/lenovo/wmi-gamezone.c index 0eb7fe8222f4..b26806b37d96 100644 --- a/drivers/platform/x86/lenovo/wmi-gamezone.c +++ b/drivers/platform/x86/lenovo/wmi-gamezone.c @@ -274,8 +274,23 @@ static const struct dmi_system_id fwbug_list[] = { }, .driver_data = &quirk_no_extreme_bug, }, + { + .ident = "Legion Go 8ASP2", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_MATCH(DMI_PRODUCT_VERSION, "Legion Go 8ASP2"), + }, + .driver_data = &quirk_no_extreme_bug, + }, + { + .ident = "Legion Go 8AHP2", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_MATCH(DMI_PRODUCT_VERSION, "Legion Go 8AHP2"), + }, + .driver_data = &quirk_no_extreme_bug, + }, {}, - };
/**
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Wagner wagi@kernel.org
[ Upstream commit b71cbcf7d170e51148d5467820ae8a72febcb651 ]
nvme_fc_ctrl_put can acquire the rport lock when freeing the ctrl object:
nvme_fc_ctrl_put nvme_fc_ctrl_free spin_lock_irqsave(rport->lock)
Thus we can't hold the rport lock when calling nvme_fc_ctrl_put.
Justin suggested use the safe list iterator variant because nvme_fc_ctrl_put will also modify the rport->list.
Cc: Justin Tee justin.tee@broadcom.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Daniel Wagner wagi@kernel.org Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/fc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index 2c903729b0b9..8324230c5371 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -1468,14 +1468,14 @@ nvme_fc_match_disconn_ls(struct nvme_fc_rport *rport, { struct fcnvme_ls_disconnect_assoc_rqst *rqst = &lsop->rqstbuf->rq_dis_assoc; - struct nvme_fc_ctrl *ctrl, *ret = NULL; + struct nvme_fc_ctrl *ctrl, *tmp, *ret = NULL; struct nvmefc_ls_rcv_op *oldls = NULL; u64 association_id = be64_to_cpu(rqst->associd.association_id); unsigned long flags;
spin_lock_irqsave(&rport->lock, flags);
- list_for_each_entry(ctrl, &rport->ctrl_list, ctrl_list) { + list_for_each_entry_safe(ctrl, tmp, &rport->ctrl_list, ctrl_list) { if (!nvme_fc_ctrl_get(ctrl)) continue; spin_lock(&ctrl->lock); @@ -1488,7 +1488,9 @@ nvme_fc_match_disconn_ls(struct nvme_fc_rport *rport, if (ret) /* leave the ctrl get reference */ break; + spin_unlock_irqrestore(&rport->lock, flags); nvme_fc_ctrl_put(ctrl); + spin_lock_irqsave(&rport->lock, flags); }
spin_unlock_irqrestore(&rport->lock, flags);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pei Xiao xiaopei01@kylinos.cn
[ Upstream commit 4910da6b36b122db50a27fabf6ab7f8611b60bf8 ]
The for_each_child_of_node() macro automatically manages device node reference counts during normal iteration. However, when breaking out of the loop early with return, the current iteration's node is not automatically released, leading to a reference count leak.
Fix this by adding of_node_put(child) before returning from the loop when emc2305_set_single_tz() fails.
This issue could lead to memory leaks over multiple probe cycles.
Signed-off-by: Pei Xiao xiaopei01@kylinos.cn Link: https://lore.kernel.org/r/tencent_5CDC08544C901D5ECA270573D5AEE3117108@qq.co... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/emc2305.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/hwmon/emc2305.c b/drivers/hwmon/emc2305.c index 60809289f816..84cb9b72cb6c 100644 --- a/drivers/hwmon/emc2305.c +++ b/drivers/hwmon/emc2305.c @@ -685,8 +685,10 @@ static int emc2305_probe(struct i2c_client *client) i = 0; for_each_child_of_node(dev->of_node, child) { ret = emc2305_set_single_tz(dev, child, i); - if (ret != 0) + if (ret != 0) { + of_node_put(child); return ret; + } i++; } } else {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Justin Tee justintee8345@gmail.com
[ Upstream commit 13989207ee29c40501e719512e8dc90768325895 ]
With authentication, in addition to EKEYREJECTED there is also no point in retrying reconnects when status is ENOKEY. Thus, add -ENOKEY as another criteria to determine when to stop retries.
Cc: Daniel Wagner wagi@kernel.org Cc: Hannes Reinecke hare@suse.de Closes: https://lore.kernel.org/linux-nvme/20250829-nvme-fc-sync-v3-0-d69c87e63aee@k... Signed-off-by: Justin Tee justintee8345@gmail.com Tested-by: Daniel Wagner wagi@kernel.org Reviewed-by: Daniel Wagner wagi@kernel.org Reviewed-by: Hannes Reinecke hare@suse.de Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/fabrics.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c index 2e58a7ce1090..55a8afd2efd5 100644 --- a/drivers/nvme/host/fabrics.c +++ b/drivers/nvme/host/fabrics.c @@ -592,7 +592,7 @@ bool nvmf_should_reconnect(struct nvme_ctrl *ctrl, int status) if (status > 0 && (status & NVME_STATUS_DNR)) return false;
- if (status == -EKEYREJECTED) + if (status == -EKEYREJECTED || status == -ENOKEY) return false;
if (ctrl->opts->max_reconnects == -1 ||
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pei Xiao xiaopei01@kylinos.cn
[ Upstream commit 541dfb49dcb80c2509e030842de77adfb77820f5 ]
./drivers/hwmon/emc2305.c:597:4-15: ERROR: probable double put
Device node iterators put the previous value of the index variable, so an explicit put causes a double put.
Signed-off-by: Pei Xiao xiaopei01@kylinos.cn Link: https://lore.kernel.org/r/tencent_CD373F952BE48697C949E39CB5EB77841D06@qq.co... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/emc2305.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/hwmon/emc2305.c b/drivers/hwmon/emc2305.c index 84cb9b72cb6c..ceae96c07ac4 100644 --- a/drivers/hwmon/emc2305.c +++ b/drivers/hwmon/emc2305.c @@ -593,10 +593,8 @@ static int emc2305_probe_childs_from_dt(struct device *dev) for_each_child_of_node(dev->of_node, child) { if (of_property_present(child, "reg")) { ret = emc2305_of_parse_pwm_child(dev, child, data); - if (ret) { - of_node_put(child); + if (ret) continue; - } count++; } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chia-Lin Kao (AceLan) acelan.kao@canonical.com
[ Upstream commit b169e1733cadb614e87f69d7a5ae1b186c50d313 ]
Dell Pro Rugged 10/12 tablets has a reliable VGBS method. If VGBS is not called on boot, the on-screen keyboard won't appear if the device is booted without a keyboard.
Call VGBS on boot on thess devices to get the initial state of SW_TABLET_MODE in a reliable way.
Signed-off-by: Chia-Lin Kao (AceLan) acelan.kao@canonical.com Reviewed-by: Hans de Goede johannes.goede@oss.qualcomm.com Link: https://patch.msgid.link/20251127070407.656463-1-acelan.kao@canonical.com Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/intel/hid.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/platform/x86/intel/hid.c b/drivers/platform/x86/intel/hid.c index 9c07a7faf18f..560cc063198e 100644 --- a/drivers/platform/x86/intel/hid.c +++ b/drivers/platform/x86/intel/hid.c @@ -177,6 +177,18 @@ static const struct dmi_system_id dmi_vgbs_allow_list[] = { DMI_MATCH(DMI_PRODUCT_NAME, "HP Elite Dragonfly G2 Notebook PC"), }, }, + { + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Dell Pro Rugged 10 Tablet RA00260"), + }, + }, + { + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Dell Pro Rugged 12 Tablet RA02260"), + }, + }, { } };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gregory CLEMENT gregory.clement@bootlin.com
[ Upstream commit 36dac9a3dda1f2bae343191bc16b910c603cac25 ]
Since commit e424054000878 ("MIPS: Tracing: Reduce the overhead of dynamic Function Tracer"), the macro UASM_i_LA_mostly has been used, and this macro can generate more than 2 instructions. At the same time, the code in ftrace assumes that no more than 2 instructions can be generated, which is why it stores them in an int[2] array. However, as previously noted, the macro UASM_i_LA_mostly (and now UASM_i_LA) causes a buffer overflow when _mcount is beyond 32 bits. This leads to corruption of the variables located in the __read_mostly section.
This corruption was observed because the variable __cpu_primary_thread_mask was corrupted, causing a hang very early during boot.
This fix prevents the corruption by avoiding the generation of instructions if they could exceed 2 instructions in length. Fortunately, insn_la_mcount is only used if the instrumented code is located outside the kernel code section, so dynamic ftrace can still be used, albeit in a more limited scope. This is still preferable to corrupting memory and/or crashing the kernel.
Signed-off-by: Gregory CLEMENT gregory.clement@bootlin.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/kernel/ftrace.c | 25 +++++++++++++++++++++---- 1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c index f39e85fd58fa..b15615b28569 100644 --- a/arch/mips/kernel/ftrace.c +++ b/arch/mips/kernel/ftrace.c @@ -54,10 +54,20 @@ static inline void ftrace_dyn_arch_init_insns(void) u32 *buf; unsigned int v1;
- /* la v1, _mcount */ - v1 = 3; - buf = (u32 *)&insn_la_mcount[0]; - UASM_i_LA(&buf, v1, MCOUNT_ADDR); + /* If we are not in compat space, the number of generated + * instructions will exceed the maximum expected limit of 2. + * To prevent buffer overflow, we avoid generating them. + * insn_la_mcount will not be used later in ftrace_make_call. + */ + if (uasm_in_compat_space_p(MCOUNT_ADDR)) { + /* la v1, _mcount */ + v1 = 3; + buf = (u32 *)&insn_la_mcount[0]; + UASM_i_LA(&buf, v1, MCOUNT_ADDR); + } else { + pr_warn("ftrace: mcount address beyond 32 bits is not supported (%lX)\n", + MCOUNT_ADDR); + }
/* jal (ftrace_caller + 8), jump over the first two instruction */ buf = (u32 *)&insn_jal_ftrace_caller; @@ -189,6 +199,13 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) unsigned int new; unsigned long ip = rec->ip;
+ /* When the code to patch does not belong to the kernel code + * space, we must use insn_la_mcount. However, if MCOUNT_ADDR + * is not in compat space, insn_la_mcount is not usable. + */ + if (!core_kernel_text(ip) && !uasm_in_compat_space_p(MCOUNT_ADDR)) + return -EFAULT; + new = core_kernel_text(ip) ? insn_jal_ftrace_caller : insn_la_mcount[0];
#ifdef CONFIG_64BIT
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Garry john.g.garry@oracle.com
[ Upstream commit 1f7d6e2efeedd8f545d3e0e9bf338023bf4ea584 ]
The atomic write enable module param is "atomic_wr", and not "atomic_write", so fix the module param description.
Fixes: 84f3a3c01d70 ("scsi: scsi_debug: Atomic write support") Signed-off-by: John Garry john.g.garry@oracle.com Reviewed-by: Bart Van Assche bvanassche@acm.org Link: https://patch.msgid.link/20251211100651.9056-1-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_debug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index b2ab97be5db3..047d56d23bea 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -7410,7 +7410,7 @@ MODULE_PARM_DESC(lbprz, MODULE_PARM_DESC(lbpu, "enable LBP, support UNMAP command (def=0)"); MODULE_PARM_DESC(lbpws, "enable LBP, support WRITE SAME(16) with UNMAP bit (def=0)"); MODULE_PARM_DESC(lbpws10, "enable LBP, support WRITE SAME(10) with UNMAP bit (def=0)"); -MODULE_PARM_DESC(atomic_write, "enable ATOMIC WRITE support, support WRITE ATOMIC(16) (def=0)"); +MODULE_PARM_DESC(atomic_wr, "enable ATOMIC WRITE support, support WRITE ATOMIC(16) (def=0)"); MODULE_PARM_DESC(lowest_aligned, "lowest aligned lba (def=0)"); MODULE_PARM_DESC(lun_format, "LUN format: 0->peripheral (def); 1 --> flat address method"); MODULE_PARM_DESC(max_luns, "number of LUNs per target to simulate(def=1)");
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Neil Armstrong neil.armstrong@linaro.org
[ Upstream commit 129049d4fe22c998ae9fd1ec479fbb4ed5338c15 ]
On plaforms with an a7xx GPU not supporting IFPC, the ifpc_reglist if still deferenced in a7xx_patch_pwrup_reglist() which causes a kernel crash: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 ... pc : a6xx_hw_init+0x155c/0x1e4c [msm] lr : a6xx_hw_init+0x9a8/0x1e4c [msm] ... Call trace: a6xx_hw_init+0x155c/0x1e4c [msm] (P) msm_gpu_hw_init+0x58/0x88 [msm] adreno_load_gpu+0x94/0x1fc [msm] msm_open+0xe4/0xf4 [msm] drm_file_alloc+0x1a0/0x2e4 [drm] drm_client_init+0x7c/0x104 [drm] drm_fbdev_client_setup+0x94/0xcf0 [drm_client_lib] drm_client_setup+0xb4/0xd8 [drm_client_lib] msm_drm_kms_post_init+0x2c/0x3c [msm] msm_drm_init+0x1a4/0x228 [msm] msm_drm_bind+0x30/0x3c [msm] ...
Check the validity of ifpc_reglist before deferencing the table to setup the register values.
Fixes: a6a0157cc68e ("drm/msm/a6xx: Enable IFPC on Adreno X1-85") Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Reviewed-by: Akhil P Oommen akhilpo@oss.qualcomm.com Patchwork: https://patchwork.freedesktop.org/patch/688944/ Message-ID: 20251117-topic-sm8x50-fix-a6xx-non-ifpc-v1-1-e4473cbf5903@linaro.org Signed-off-by: Rob Clark robin.clark@oss.qualcomm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 6f7ed07670b1..d1eaa849b197 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -837,15 +837,17 @@ static void a7xx_patch_pwrup_reglist(struct msm_gpu *gpu) lock->gpu_req = lock->cpu_req = lock->turn = 0;
reglist = adreno_gpu->info->a6xx->ifpc_reglist; - lock->ifpc_list_len = reglist->count; + if (reglist) { + lock->ifpc_list_len = reglist->count;
- /* - * For each entry in each of the lists, write the offset and the current - * register value into the GPU buffer - */ - for (i = 0; i < reglist->count; i++) { - *dest++ = reglist->regs[i]; - *dest++ = gpu_read(gpu, reglist->regs[i]); + /* + * For each entry in each of the lists, write the offset and the current + * register value into the GPU buffer + */ + for (i = 0; i < reglist->count; i++) { + *dest++ = reglist->regs[i]; + *dest++ = gpu_read(gpu, reglist->regs[i]); + } }
reglist = adreno_gpu->info->a6xx->pwrup_reglist;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alok Tiwari alok.a.tiwari@oracle.com
[ Upstream commit ef3b04091fd8bc737dc45312375df8625b8318e2 ]
Move the call to preempt_prepare_postamble() after verifying that preempt_postamble_ptr is valid. If preempt_postamble_ptr is NULL, dereferencing it in preempt_prepare_postamble() would lead to a crash.
This change avoids calling the preparation function when the postamble allocation has failed, preventing potential NULL pointer dereference and ensuring proper error handling.
Fixes: 50117cad0c50 ("drm/msm/a6xx: Use posamble to reset counters on preemption") Signed-off-by: Alok Tiwari alok.a.tiwari@oracle.com Patchwork: https://patchwork.freedesktop.org/patch/687659/ Message-ID: 20251113082839.3821867-1-alok.a.tiwari@oracle.com Signed-off-by: Rob Clark robin.clark@oss.qualcomm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c index afc5f4aa3b17..747a22afad9f 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c @@ -454,11 +454,11 @@ void a6xx_preempt_init(struct msm_gpu *gpu) gpu->vm, &a6xx_gpu->preempt_postamble_bo, &a6xx_gpu->preempt_postamble_iova);
- preempt_prepare_postamble(a6xx_gpu); - if (IS_ERR(a6xx_gpu->preempt_postamble_ptr)) goto fail;
+ preempt_prepare_postamble(a6xx_gpu); + timer_setup(&a6xx_gpu->preempt_timer, a6xx_preempt_timer, 0);
return;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Caleb Sander Mateos csander@purestorage.com
[ Upstream commit daa24603d9f0808929514ee62ced30052ca7221c ]
If a ublk server process releases a ublk char device file, any requests dispatched to the ublk server but not yet completed will retain a ref value of UBLK_REFCOUNT_INIT. Before commit e63d2228ef83 ("ublk: simplify aborting ublk request"), __ublk_fail_req() would decrement the reference count before completing the failed request. However, that commit optimized __ublk_fail_req() to call __ublk_complete_rq() directly without decrementing the request reference count. The leaked reference count incorrectly allows user copy and zero copy operations on the completed ublk request. It also triggers the WARN_ON_ONCE(refcount_read(&io->ref)) warnings in ublk_queue_reinit() and ublk_deinit_queue(). Commit c5c5eb24ed61 ("ublk: avoid ublk_io_release() called after ublk char dev is closed") already fixed the issue for ublk devices using UBLK_F_SUPPORT_ZERO_COPY or UBLK_F_AUTO_BUF_REG. However, the reference count leak also affects UBLK_F_USER_COPY, the other reference-counted data copy mode. Fix the condition in ublk_check_and_reset_active_ref() to include all reference-counted data copy modes. This ensures that any ublk requests still owned by the ublk server when it exits have their reference counts reset to 0.
Signed-off-by: Caleb Sander Mateos csander@purestorage.com Fixes: e63d2228ef83 ("ublk: simplify aborting ublk request") Reviewed-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/ublk_drv.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index fa7b0481ea04..d8079ea8f8ca 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1674,8 +1674,7 @@ static bool ublk_check_and_reset_active_ref(struct ublk_device *ub) { int i, j;
- if (!(ub->dev_info.flags & (UBLK_F_SUPPORT_ZERO_COPY | - UBLK_F_AUTO_BUF_REG))) + if (!ublk_dev_need_req_ref(ub)) return false;
for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 1ddb815fdfd45613c32e9bd1f7137428f298e541 ]
The "dev->clt_device_id" variable is set using ida_alloc_max() which returns an int and in particular it returns negative error codes. Change the type from u32 to int to fix the error checking.
Fixes: c9b5645fd8ca ("block: rnbd-clt: Fix leaked ID in init_dev()") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/rnbd/rnbd-clt.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/block/rnbd/rnbd-clt.h b/drivers/block/rnbd/rnbd-clt.h index a48e040abe63..fbc1ed766025 100644 --- a/drivers/block/rnbd/rnbd-clt.h +++ b/drivers/block/rnbd/rnbd-clt.h @@ -112,7 +112,7 @@ struct rnbd_clt_dev { struct rnbd_queue *hw_queues; u32 device_id; /* local Idr index - used to track minor number allocations. */ - u32 clt_device_id; + int clt_device_id; struct mutex lock; enum rnbd_clt_dev_state dev_state; refcount_t refcount;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefano Garzarella sgarzare@redhat.com
[ Upstream commit d8ee3cfdc89b75dc059dc21c27bef2c1440f67eb ]
vhost_vsock_get() uses hash_for_each_possible_rcu() to find the `vhost_vsock` associated with the `guest_cid`. hash_for_each_possible_rcu() should only be called within an RCU read section, as mentioned in the following comment in include/linux/rculist.h:
/** * hlist_for_each_entry_rcu - iterate over rcu list of given type * @pos: the type * to use as a loop cursor. * @head: the head for your list. * @member: the name of the hlist_node within the struct. * @cond: optional lockdep expression if called from non-RCU protection. * * This list-traversal primitive may safely run concurrently with * the _rcu list-mutation primitives such as hlist_add_head_rcu() * as long as the traversal is guarded by rcu_read_lock(). */
Currently, all calls to vhost_vsock_get() are between rcu_read_lock() and rcu_read_unlock() except for calls in vhost_vsock_set_cid() and vhost_vsock_reset_orphans(). In both cases, the current code is safe, but we can make improvements to make it more robust.
About vhost_vsock_set_cid(), when building the kernel with CONFIG_PROVE_RCU_LIST enabled, we get the following RCU warning when the user space issues `ioctl(dev, VHOST_VSOCK_SET_GUEST_CID, ...)` :
WARNING: suspicious RCU usage 6.18.0-rc7 #62 Not tainted ----------------------------- drivers/vhost/vsock.c:74 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 1 lock held by rpc-libvirtd/3443: #0: ffffffffc05032a8 (vhost_vsock_mutex){+.+.}-{4:4}, at: vhost_vsock_dev_ioctl+0x2ff/0x530 [vhost_vsock]
stack backtrace: CPU: 2 UID: 0 PID: 3443 Comm: rpc-libvirtd Not tainted 6.18.0-rc7 #62 PREEMPT(none) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-7.fc42 06/10/2025 Call Trace: <TASK> dump_stack_lvl+0x75/0xb0 dump_stack+0x14/0x1a lockdep_rcu_suspicious.cold+0x4e/0x97 vhost_vsock_get+0x8f/0xa0 [vhost_vsock] vhost_vsock_dev_ioctl+0x307/0x530 [vhost_vsock] __x64_sys_ioctl+0x4f2/0xa00 x64_sys_call+0xed0/0x1da0 do_syscall_64+0x73/0xfa0 entry_SYSCALL_64_after_hwframe+0x76/0x7e ... </TASK>
This is not a real problem, because the vhost_vsock_get() caller, i.e. vhost_vsock_set_cid(), holds the `vhost_vsock_mutex` used by the hash table writers. Anyway, to prevent that warning, add lockdep_is_held() condition to hash_for_each_possible_rcu() to verify that either the caller is in an RCU read section or `vhost_vsock_mutex` is held when CONFIG_PROVE_RCU_LIST is enabled; and also clarify the comment for vhost_vsock_get() to better describe the locking requirements and the scope of the returned pointer validity.
About vhost_vsock_reset_orphans(), currently this function is only called via vsock_for_each_connected_socket(), which holds the `vsock_table_lock` spinlock (which is also an RCU read-side critical section). However, add an explicit RCU read lock there to make the code more robust and explicit about the RCU requirements, and to prevent issues if the calling context changes in the future or if vhost_vsock_reset_orphans() is called from other contexts.
Fixes: 834e772c8db0 ("vhost/vsock: fix use-after-free in network stack callers") Cc: stefanha@redhat.com Signed-off-by: Stefano Garzarella sgarzare@redhat.com Reviewed-by: Stefan Hajnoczi stefanha@redhat.com Message-Id: 20251126133826.142496-1-sgarzare@redhat.com Message-ID: 20251126210313.GA499503@fedora Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vhost/vsock.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index ae01457ea2cd..78cc66fbb3dd 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -64,14 +64,15 @@ static u32 vhost_transport_get_local_cid(void) return VHOST_VSOCK_DEFAULT_HOST_CID; }
-/* Callers that dereference the return value must hold vhost_vsock_mutex or the - * RCU read lock. +/* Callers must be in an RCU read section or hold the vhost_vsock_mutex. + * The return value can only be dereferenced while within the section. */ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid) { struct vhost_vsock *vsock;
- hash_for_each_possible_rcu(vhost_vsock_hash, vsock, hash, guest_cid) { + hash_for_each_possible_rcu(vhost_vsock_hash, vsock, hash, guest_cid, + lockdep_is_held(&vhost_vsock_mutex)) { u32 other_cid = vsock->guest_cid;
/* Skip instances that have no CID yet */ @@ -707,9 +708,15 @@ static void vhost_vsock_reset_orphans(struct sock *sk) * executing. */
+ rcu_read_lock(); + /* If the peer is still valid, no need to reset connection */ - if (vhost_vsock_get(vsk->remote_addr.svm_cid)) + if (vhost_vsock_get(vsk->remote_addr.svm_cid)) { + rcu_read_unlock(); return; + } + + rcu_read_unlock();
/* If the close timeout is pending, let it expire. This avoids races * with the timeout callback.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zilin Guan zilin@seu.edu.cn
[ Upstream commit cb6d5aa9c0f10074f1ad056c3e2278ad2cc7ec8d ]
In smb3_reconfigure(), if smb3_sync_session_ctx_passwords() fails, the function returns immediately without freeing and erasing the newly allocated new_password and new_password2. This causes both a memory leak and a potential information leak.
Fix this by calling kfree_sensitive() on both password buffers before returning in this error case.
Fixes: 0f0e357902957 ("cifs: during remount, make sure passwords are in sync") Signed-off-by: Zilin Guan zilin@seu.edu.cn Reviewed-by: ChenXiaoSong chenxiaosong@kylinos.cn Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/fs_context.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c index 2a0d8b87bd8e..d8bd3cdc535d 100644 --- a/fs/smb/client/fs_context.c +++ b/fs/smb/client/fs_context.c @@ -1080,6 +1080,8 @@ static int smb3_reconfigure(struct fs_context *fc) rc = smb3_sync_session_ctx_passwords(cifs_sb, ses); if (rc) { mutex_unlock(&ses->session_mutex); + kfree_sensitive(new_password); + kfree_sensitive(new_password2); return rc; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lyude Paul lyude@redhat.com
commit e54ad0cd3673c93cdafda58505eaa81610fe3aef upstream.
Invariants should be prefixed with a # to turn it into a header.
There are no functional changes in this patch.
Cc: stable@vger.kernel.org Fixes: c284d3e42338 ("rust: drm: gem: Add GEM object abstraction") Signed-off-by: Lyude Paul lyude@redhat.com Link: https://patch.msgid.link/20251107202603.465932-1-lyude@redhat.com Signed-off-by: Alice Ryhl aliceryhl@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- rust/kernel/drm/gem/mod.rs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/rust/kernel/drm/gem/mod.rs +++ b/rust/kernel/drm/gem/mod.rs @@ -184,7 +184,7 @@ impl<T: IntoGEMObject> BaseObject for T
/// A base GEM object. /// -/// Invariants +/// # Invariants /// /// - `self.obj` is a valid instance of a `struct drm_gem_object`. /// - `self.dev` is always a valid pointer to a `struct drm_device`.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alice Ryhl aliceryhl@google.com
commit 6c37bebd8c926ad01ef157c0d123633a203e5c0d upstream.
Similar to the previous commit, List::remove is used on delivered_deaths, so do not use mem::take on it as that may result in violations of the List::remove safety requirements.
I don't think this particular case can be triggered because it requires fd close to run in parallel with an ioctl on the same fd. But let's not tempt fate.
Cc: stable@vger.kernel.org Fixes: eafedbc7c050 ("rust_binder: add Rust Binder driver") Signed-off-by: Alice Ryhl aliceryhl@google.com Acked-by: Miguel Ojeda ojeda@kernel.org Link: https://patch.msgid.link/20251111-binder-fix-list-remove-v1-2-8ed14a0da63d@g... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/android/binder/process.rs | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/android/binder/process.rs b/drivers/android/binder/process.rs index 27323070f30f..fd5dcdc8788c 100644 --- a/drivers/android/binder/process.rs +++ b/drivers/android/binder/process.rs @@ -1362,8 +1362,12 @@ fn deferred_release(self: Arc<Self>) { work.into_arc().cancel(); }
- let delivered_deaths = take(&mut self.inner.lock().delivered_deaths); - drop(delivered_deaths); + // Clear delivered_deaths list. + // + // Scope ensures that MutexGuard is dropped while executing the body. + while let Some(delivered_death) = { self.inner.lock().delivered_deaths.pop_front() } { + drop(delivered_death); + }
// Free any resources kept alive by allocated buffers. let omapping = self.inner.lock().mapping.take();
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: FUJITA Tomonori fujita.tomonori@gmail.com
commit d8932355f8c5673106eca49abd142f8fe0c1fe8b upstream.
Add dma_set_mask(), dma_set_coherent_mask(), dma_map_sgtable(), and dma_max_mapping_size() helpers to fix a build error when CONFIG_HAS_DMA is not enabled.
Note that when CONFIG_HAS_DMA is enabled, they are included in both bindings_generated.rs and bindings_helpers_generated.rs. The former takes precedence so behavior remains unchanged in that case.
This fixes the following build error on UML:
error[E0425]: cannot find function `dma_set_mask` in crate `bindings` --> rust/kernel/dma.rs:46:38 | 46 | to_result(unsafe { bindings::dma_set_mask(self.as_ref().as_raw(), mask.value()) }) | ^^^^^^^^^^^^ help: a function with a similar name exists: `xa_set_mark` | ::: rust/bindings/bindings_generated.rs:24690:5 | 24690 | pub fn xa_set_mark(arg1: *mut xarray, index: ffi::c_ulong, arg2: xa_mark_t); | ---------------------------------------------------------------------------- similarly named function `xa_set_mark` defined here
error[E0425]: cannot find function `dma_set_coherent_mask` in crate `bindings` --> rust/kernel/dma.rs:63:38 | 63 | to_result(unsafe { bindings::dma_set_coherent_mask(self.as_ref().as_raw(), mask.value()) }) | ^^^^^^^^^^^^^^^^^^^^^ help: a function with a similar name exists: `dma_coherent_ok` | ::: rust/bindings/bindings_generated.rs:52745:5 | 52745 | pub fn dma_coherent_ok(dev: *mut device, phys: phys_addr_t, size: usize) -> bool_; | ---------------------------------------------------------------------------------- similarly named function `dma_coherent_ok` defined here
error[E0425]: cannot find function `dma_map_sgtable` in crate `bindings` --> rust/kernel/scatterlist.rs:212:23 | 212 | bindings::dma_map_sgtable(dev.as_raw(), sgt.as_ptr(), dir.into(), 0) | ^^^^^^^^^^^^^^^ help: a function with a similar name exists: `dma_unmap_sgtable` | ::: rust/bindings/bindings_helpers_generated.rs:1351:5 | 1351 | / pub fn dma_unmap_sgtable( 1352 | | dev: *mut device, 1353 | | sgt: *mut sg_table, 1354 | | dir: dma_data_direction, 1355 | | attrs: ffi::c_ulong, 1356 | | ); | |______- similarly named function `dma_unmap_sgtable` defined here
error[E0425]: cannot find function `dma_max_mapping_size` in crate `bindings` --> rust/kernel/scatterlist.rs:356:52 | 356 | let max_segment = match unsafe { bindings::dma_max_mapping_size(dev.as_raw()) } { | ^^^^^^^^^^^^^^^^^^^^ not found in `bindings`
error: aborting due to 4 previous errors
Cc: stable@vger.kernel.org # v6.17+ Fixes: 101d66828a4ee ("rust: dma: add DMA addressing capabilities") Signed-off-by: FUJITA Tomonori fujita.tomonori@gmail.com Reviewed-by: David Gow davidgow@google.com Reviewed-by: Alice Ryhl aliceryhl@google.com Link: https://patch.msgid.link/20251204160639.364936-1-fujita.tomonori@gmail.com [ Use relative paths in the error splat; add 'dma' prefix. - Danilo ] Signed-off-by: Danilo Krummrich dakr@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- rust/helpers/dma.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
diff --git a/rust/helpers/dma.c b/rust/helpers/dma.c index 6e741c197242..2afa32c21c94 100644 --- a/rust/helpers/dma.c +++ b/rust/helpers/dma.c @@ -19,3 +19,24 @@ int rust_helper_dma_set_mask_and_coherent(struct device *dev, u64 mask) { return dma_set_mask_and_coherent(dev, mask); } + +int rust_helper_dma_set_mask(struct device *dev, u64 mask) +{ + return dma_set_mask(dev, mask); +} + +int rust_helper_dma_set_coherent_mask(struct device *dev, u64 mask) +{ + return dma_set_coherent_mask(dev, mask); +} + +int rust_helper_dma_map_sgtable(struct device *dev, struct sg_table *sgt, + enum dma_data_direction dir, unsigned long attrs) +{ + return dma_map_sgtable(dev, sgt, dir, attrs); +} + +size_t rust_helper_dma_max_mapping_size(struct device *dev) +{ + return dma_max_mapping_size(dev); +}
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marko Turk mt@markoturk.info
commit e2f1081ca8f18c146e8f928486deac61eca2b517 upstream.
MMIO backend of PCI Bar always assumes little-endian devices and will convert to CPU endianness automatically. Remove the u32::from_le conversion which would cause a bug on big-endian machines.
Cc: stable@vger.kernel.org Reviewed-by: Dirk Behme dirk.behme@de.bosch.com Signed-off-by: Marko Turk mt@markoturk.info Fixes: 685376d18e9a ("samples: rust: add Rust PCI sample driver") Link: https://patch.msgid.link/20251210112503.62925-2-mt@markoturk.info Signed-off-by: Danilo Krummrich dakr@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- samples/rust/rust_driver_pci.rs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/samples/rust/rust_driver_pci.rs b/samples/rust/rust_driver_pci.rs index 5823787bea8e..fa677991a5c4 100644 --- a/samples/rust/rust_driver_pci.rs +++ b/samples/rust/rust_driver_pci.rs @@ -48,7 +48,7 @@ fn testdev(index: &TestIndex, bar: &Bar0) -> Result<u32> { // Select the test. bar.write8(index.0, Regs::TEST);
- let offset = u32::from_le(bar.read32(Regs::OFFSET)) as usize; + let offset = bar.read32(Regs::OFFSET) as usize; let data = bar.read8(Regs::DATA);
// Write `data` to `offset` to increase `count` by one.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alice Ryhl aliceryhl@google.com
commit 919b72922717e396be9435c83916b9969505bd23 upstream.
These typedefs are always equivalent so this should not change anything, but the code makes a lot more sense like this.
Cc: stable@vger.kernel.org Signed-off-by: Alice Ryhl aliceryhl@google.com Fixes: 493fc33ec252 ("rust: io: add resource abstraction") Link: https://patch.msgid.link/20251112-resource-phys-typedefs-v2-1-538307384f82@g... Signed-off-by: Danilo Krummrich dakr@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- rust/kernel/io/resource.rs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/rust/kernel/io/resource.rs +++ b/rust/kernel/io/resource.rs @@ -16,7 +16,7 @@ use crate::types::Opaque; /// /// This is a type alias to either `u32` or `u64` depending on the config option /// `CONFIG_PHYS_ADDR_T_64BIT`, and it can be a u64 even on 32-bit architectures. -pub type ResourceSize = bindings::phys_addr_t; +pub type ResourceSize = bindings::resource_size_t;
/// A region allocated from a parent [`Resource`]. ///
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alice Ryhl aliceryhl@google.com
commit dfd67993044f507ba8fd6ee9956f923ba4b7e851 upstream.
Resource sizes are a general concept for dealing with physical addresses, and not specific to the Resource type, which is just one way to access physical addresses. Thus, move the typedef to the io module.
Still keep a re-export under resource. This avoids this commit from being a flag-day, but I also think it's a useful re-export in general so that you can import
use kernel::io::resource::{Resource, ResourceSize};
instead of having to write
use kernel::io::{ resource::Resource, ResourceSize, };
in the specific cases where you need ResourceSize because you are using the Resource type. Therefore I think it makes sense to keep this re-export indefinitely and it is *not* intended as a temporary re-export for migration purposes.
Cc: stable@vger.kernel.org # for v6.18 [1] Signed-off-by: Alice Ryhl aliceryhl@google.com Link: https://patch.msgid.link/20251112-resource-phys-typedefs-v2-2-538307384f82@g... Link: https://lore.kernel.org/all/20251112-resource-phys-typedefs-v2-0-538307384f8... [1] Signed-off-by: Danilo Krummrich dakr@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- rust/kernel/io.rs | 6 ++++++ rust/kernel/io/resource.rs | 6 +----- 2 files changed, 7 insertions(+), 5 deletions(-)
--- a/rust/kernel/io.rs +++ b/rust/kernel/io.rs @@ -13,6 +13,12 @@ pub mod resource;
pub use resource::Resource;
+/// Resource Size type. +/// +/// This is a type alias to either `u32` or `u64` depending on the config option +/// `CONFIG_PHYS_ADDR_T_64BIT`, and it can be a u64 even on 32-bit architectures. +pub type ResourceSize = bindings::resource_size_t; + /// Raw representation of an MMIO region. /// /// By itself, the existence of an instance of this structure does not provide any guarantees that --- a/rust/kernel/io/resource.rs +++ b/rust/kernel/io/resource.rs @@ -12,11 +12,7 @@ use crate::prelude::*; use crate::str::{CStr, CString}; use crate::types::Opaque;
-/// Resource Size type. -/// -/// This is a type alias to either `u32` or `u64` depending on the config option -/// `CONFIG_PHYS_ADDR_T_64BIT`, and it can be a u64 even on 32-bit architectures. -pub type ResourceSize = bindings::resource_size_t; +pub use super::ResourceSize;
/// A region allocated from a parent [`Resource`]. ///
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alice Ryhl aliceryhl@google.com
commit dd6ff5cf56fb183fce605ca6a5bfce228cd8888b upstream.
The C typedef phys_addr_t is missing an analogue in Rust, meaning that we end up using bindings::phys_addr_t or ResourceSize as a replacement in various places throughout the kernel. Fix that by introducing a new typedef on the Rust side. Place it next to the existing ResourceSize typedef since they're quite related to each other.
Cc: stable@vger.kernel.org # for v6.18 [1] Signed-off-by: Alice Ryhl aliceryhl@google.com Link: https://patch.msgid.link/20251112-resource-phys-typedefs-v2-4-538307384f82@g... Link: https://lore.kernel.org/all/20251112-resource-phys-typedefs-v2-0-538307384f8... [1] Signed-off-by: Danilo Krummrich dakr@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- rust/kernel/devres.rs | 18 +++++++++++++++--- rust/kernel/io.rs | 20 +++++++++++++++++--- rust/kernel/io/resource.rs | 9 ++++++--- 3 files changed, 38 insertions(+), 9 deletions(-)
--- a/rust/kernel/devres.rs +++ b/rust/kernel/devres.rs @@ -52,8 +52,20 @@ struct Inner<T: Send> { /// # Examples /// /// ```no_run -/// # use kernel::{bindings, device::{Bound, Device}, devres::Devres, io::{Io, IoRaw}}; -/// # use core::ops::Deref; +/// use kernel::{ +/// bindings, +/// device::{ +/// Bound, +/// Device, +/// }, +/// devres::Devres, +/// io::{ +/// Io, +/// IoRaw, +/// PhysAddr, +/// }, +/// }; +/// use core::ops::Deref; /// /// // See also [`pci::Bar`] for a real example. /// struct IoMem<const SIZE: usize>(IoRaw<SIZE>); @@ -66,7 +78,7 @@ struct Inner<T: Send> { /// unsafe fn new(paddr: usize) -> Result<Self>{ /// // SAFETY: By the safety requirements of this function [`paddr`, `paddr` + `SIZE`) is /// // valid for `ioremap`. -/// let addr = unsafe { bindings::ioremap(paddr as bindings::phys_addr_t, SIZE) }; +/// let addr = unsafe { bindings::ioremap(paddr as PhysAddr, SIZE) }; /// if addr.is_null() { /// return Err(ENOMEM); /// } --- a/rust/kernel/io.rs +++ b/rust/kernel/io.rs @@ -13,6 +13,12 @@ pub mod resource;
pub use resource::Resource;
+/// Physical address type. +/// +/// This is a type alias to either `u32` or `u64` depending on the config option +/// `CONFIG_PHYS_ADDR_T_64BIT`, and it can be a u64 even on 32-bit architectures. +pub type PhysAddr = bindings::phys_addr_t; + /// Resource Size type. /// /// This is a type alias to either `u32` or `u64` depending on the config option @@ -68,8 +74,16 @@ impl<const SIZE: usize> IoRaw<SIZE> { /// # Examples /// /// ```no_run -/// # use kernel::{bindings, ffi::c_void, io::{Io, IoRaw}}; -/// # use core::ops::Deref; +/// use kernel::{ +/// bindings, +/// ffi::c_void, +/// io::{ +/// Io, +/// IoRaw, +/// PhysAddr, +/// }, +/// }; +/// use core::ops::Deref; /// /// // See also [`pci::Bar`] for a real example. /// struct IoMem<const SIZE: usize>(IoRaw<SIZE>); @@ -82,7 +96,7 @@ impl<const SIZE: usize> IoRaw<SIZE> { /// unsafe fn new(paddr: usize) -> Result<Self>{ /// // SAFETY: By the safety requirements of this function [`paddr`, `paddr` + `SIZE`) is /// // valid for `ioremap`. -/// let addr = unsafe { bindings::ioremap(paddr as bindings::phys_addr_t, SIZE) }; +/// let addr = unsafe { bindings::ioremap(paddr as PhysAddr, SIZE) }; /// if addr.is_null() { /// return Err(ENOMEM); /// } --- a/rust/kernel/io/resource.rs +++ b/rust/kernel/io/resource.rs @@ -12,7 +12,10 @@ use crate::prelude::*; use crate::str::{CStr, CString}; use crate::types::Opaque;
-pub use super::ResourceSize; +pub use super::{ + PhysAddr, + ResourceSize, // +};
/// A region allocated from a parent [`Resource`]. /// @@ -93,7 +96,7 @@ impl Resource { /// the region, or a part of it, is already in use. pub fn request_region( &self, - start: ResourceSize, + start: PhysAddr, size: ResourceSize, name: CString, flags: Flags, @@ -127,7 +130,7 @@ impl Resource { }
/// Returns the start address of the resource. - pub fn start(&self) -> ResourceSize { + pub fn start(&self) -> PhysAddr { let inner = self.0.get(); // SAFETY: Safe as per the invariants of `Resource`. unsafe { (*inner).start }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Sakkinen jarkko@kernel.org
commit 62cd5d480b9762ce70d720a81fa5b373052ae05f upstream.
'tpm2_load_cmd' allocates a tempoary blob indirectly via 'tpm2_key_decode' but it is not freed in the failure paths. Address this by wrapping the blob into with a cleanup helper.
Cc: stable@vger.kernel.org # v5.13+ Fixes: f2219745250f ("security: keys: trusted: use ASN.1 TPM2 key format for the blobs") Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/keys/trusted-keys/trusted_tpm2.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/security/keys/trusted-keys/trusted_tpm2.c +++ b/security/keys/trusted-keys/trusted_tpm2.c @@ -387,6 +387,7 @@ static int tpm2_load_cmd(struct tpm_chip struct trusted_key_options *options, u32 *blob_handle) { + u8 *blob_ref __free(kfree) = NULL; struct tpm_buf buf; unsigned int private_len; unsigned int public_len; @@ -400,6 +401,9 @@ static int tpm2_load_cmd(struct tpm_chip /* old form */ blob = payload->blob; payload->old_format = 1; + } else { + /* Bind for cleanup: */ + blob_ref = blob; }
/* new format carries keyhandle but old format doesn't */ @@ -464,8 +468,6 @@ static int tpm2_load_cmd(struct tpm_chip (__be32 *) &buf.data[TPM_HEADER_SIZE]);
out: - if (blob != payload->blob) - kfree(blob); tpm_buf_destroy(&buf);
if (rc > 0)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 9c75986a298f121ed2c6599b05e51d9a34e77068 upstream.
The mmio regmap allocated during probe is never freed.
Switch to using the device managed allocator so that the regmap is released on probe failures (e.g. probe deferral) and on driver unbind.
Fixes: a250cd4c1901 ("clk: keystone: syscon-clk: Do not use syscon helper to build regmap") Cc: stable@vger.kernel.org # 6.15 Cc: Andrew Davis afd@ti.com Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/keystone/syscon-clk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/keystone/syscon-clk.c b/drivers/clk/keystone/syscon-clk.c index c509929da854..ecf180a7949c 100644 --- a/drivers/clk/keystone/syscon-clk.c +++ b/drivers/clk/keystone/syscon-clk.c @@ -129,7 +129,7 @@ static int ti_syscon_gate_clk_probe(struct platform_device *pdev) if (IS_ERR(base)) return PTR_ERR(base);
- regmap = regmap_init_mmio(dev, base, &ti_syscon_regmap_cfg); + regmap = devm_regmap_init_mmio(dev, base, &ti_syscon_regmap_cfg); if (IS_ERR(regmap)) return dev_err_probe(dev, PTR_ERR(regmap), "failed to get regmap\n");
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
commit 84230ad2d2afbf0c44c32967e525c0ad92e26b4e upstream.
When the core of io_uring was updated to handle completions consistently and with fixed return codes, the POLL_REMOVE opcode with updates got slightly broken. If a POLL_ADD is pending and then POLL_REMOVE is used to update the events of that request, if that update causes the POLL_ADD to now trigger, then that completion is lost and a CQE is never posted.
Additionally, ensure that if an update does cause an existing POLL_ADD to complete, that the completion value isn't always overwritten with -ECANCELED. For that case, whatever io_poll_add() set the value to should just be retained.
Cc: stable@vger.kernel.org Fixes: 97b388d70b53 ("io_uring: handle completions in the core") Reported-by: syzbot+641eec6b7af1f62f2b99@syzkaller.appspotmail.com Tested-by: syzbot+641eec6b7af1f62f2b99@syzkaller.appspotmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/poll.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/io_uring/poll.c +++ b/io_uring/poll.c @@ -936,12 +936,17 @@ int io_poll_remove(struct io_kiocb *req,
ret2 = io_poll_add(preq, issue_flags & ~IO_URING_F_UNLOCKED); /* successfully updated, don't complete poll request */ - if (!ret2 || ret2 == -EIOCBQUEUED) + if (ret2 == IOU_ISSUE_SKIP_COMPLETE) goto out; + /* request completed as part of the update, complete it */ + else if (ret2 == IOU_COMPLETE) + goto complete; }
- req_set_fail(preq); io_req_set_res(preq, -ECANCELED, 0); +complete: + if (preq->cqe.res < 0) + req_set_fail(preq); preq->io_task_work.func = io_req_task_complete; io_req_task_work_add(preq); out:
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
commit e15cb2200b934e507273510ba6bc747d5cde24a3 upstream.
Using min_wait, two timeouts are given:
1) The min_wait timeout, within which up to 'wait_nr' events are waited for. 2) The overall long timeout, which is entered if no events are generated in the min_wait window.
If the min_wait has expired, any event being posted must wake the task. For SQPOLL, that isn't the case, as it won't trigger the io_has_work() condition, as it will have already processed the task_work that happened when an event was posted. This causes any event to trigger post the min_wait to not always cause the waiting application to wakeup, and instead it will wait until the overall timeout has expired. This can be shown in a test case that has a 1 second min_wait, with a 5 second overall wait, even if an event triggers after 1.5 seconds:
axboe@m2max-kvm /d/iouring-mre (master)> zig-out/bin/iouring info: MIN_TIMEOUT supported: true, features: 0x3ffff info: Testing: min_wait=1000ms, timeout=5s, wait_nr=4 info: 1 cqes in 5000.2ms
where the expected result should be:
axboe@m2max-kvm /d/iouring-mre (master)> zig-out/bin/iouring info: MIN_TIMEOUT supported: true, features: 0x3ffff info: Testing: min_wait=1000ms, timeout=5s, wait_nr=4 info: 1 cqes in 1500.3ms
When the min_wait timeout triggers, reset the number of completions needed to wake the task. This should ensure that any future events will wake the task, regardless of how many events it originally wanted to wait for.
Reported-by: Tip ten Brink tip@tenbrinkmeijs.com Cc: stable@vger.kernel.org Fixes: 1100c4a2656d ("io_uring: add support for batch wait timeout") Link: https://github.com/axboe/liburing/issues/1477 Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2551,6 +2551,9 @@ static enum hrtimer_restart io_cqring_mi goto out_wake; }
+ /* any generated CQE posted past this time should wake us up */ + iowq->cq_tail = iowq->cq_min_tail; + hrtimer_update_function(&iowq->t, io_cqring_timer_wakeup); hrtimer_set_expires(timer, iowq->timeout); return HRTIMER_RESTART;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prithvi Tambewagh activprithvi@gmail.com
commit b14fad555302a2104948feaff70503b64c80ac01 upstream.
__io_openat_prep() allocates a struct filename using getname(). However, for the condition of the file being installed in the fixed file table as well as having O_CLOEXEC flag set, the function returns early. At that point, the request doesn't have REQ_F_NEED_CLEANUP flag set. Due to this, the memory for the newly allocated struct filename is not cleaned up, causing a memory leak.
Fix this by setting the REQ_F_NEED_CLEANUP for the request just after the successful getname() call, so that when the request is torn down, the filename will be cleaned up, along with other resources needing cleanup.
Reported-by: syzbot+00e61c43eb5e4740438f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=00e61c43eb5e4740438f Tested-by: syzbot+00e61c43eb5e4740438f@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Prithvi Tambewagh activprithvi@gmail.com Fixes: b9445598d8c6 ("io_uring: openat directly into fixed fd table") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/openclose.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/io_uring/openclose.c +++ b/io_uring/openclose.c @@ -73,13 +73,13 @@ static int __io_openat_prep(struct io_ki open->filename = NULL; return ret; } + req->flags |= REQ_F_NEED_CLEANUP;
open->file_slot = READ_ONCE(sqe->file_index); if (open->file_slot && (open->how.flags & O_CLOEXEC)) return -EINVAL;
open->nofile = rlimit(RLIMIT_NOFILE); - req->flags |= REQ_F_NEED_CLEANUP; if (io_openat_force_async(open)) req->flags |= REQ_F_FORCE_ASYNC; return 0;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Ogness john.ogness@linutronix.de
commit 26873e3e7f0cb26c45e6ad63656f9fe36b2aa31b upstream.
Allowing irq_work to be scheduled while trying to suspend has shown to cause problems as some architectures interpret the pending interrupts as a reason to not suspend. This became a problem for printk() with the introduction of NBCON consoles. With every printk() call, NBCON console printing kthreads are woken by queueing irq_work. This means that irq_work continues to be queued due to printk() calls late in the suspend procedure.
Avoid this problem by preventing printk() from queueing irq_work once console suspending has begun. This applies to triggering NBCON and legacy deferred printing as well as klogd waiters.
Since triggering of NBCON threaded printing relies on irq_work, the pr_flush() within console_suspend_all() is used to perform the final flushing before suspending consoles and blocking irq_work queueing. NBCON consoles that are not suspended (due to the usage of the "no_console_suspend" boot argument) transition to atomic flushing.
Introduce a new global variable @console_irqwork_blocked to flag when irq_work queueing is to be avoided. The flag is used by printk_get_console_flush_type() to avoid allowing deferred printing and switch NBCON consoles to atomic flushing. It is also used by vprintk_emit() to avoid klogd waking.
Add WARN_ON_ONCE(console_irqwork_blocked) to the irq_work queuing functions to catch any code that attempts to queue printk irq_work during the suspending/resuming procedure.
Cc: stable@vger.kernel.org # 6.13.x because no drivers in 6.12.x Fixes: 6b93bb41f6ea ("printk: Add non-BKL (nbcon) console basic infrastructure") Closes: https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04M... Signed-off-by: John Ogness john.ogness@linutronix.de Reviewed-by: Petr Mladek pmladek@suse.com Tested-by: Sherry Sun sherry.sun@nxp.com Link: https://patch.msgid.link/20251113160351.113031-3-john.ogness@linutronix.de Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/printk/internal.h | 8 ++++-- kernel/printk/nbcon.c | 7 +++++ kernel/printk/printk.c | 58 ++++++++++++++++++++++++++++++++++------------- 3 files changed, 55 insertions(+), 18 deletions(-)
--- a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -230,6 +230,8 @@ struct console_flush_type { bool legacy_offload; };
+extern bool console_irqwork_blocked; + /* * Identify which console flushing methods should be used in the context of * the caller. @@ -241,7 +243,7 @@ static inline void printk_get_console_fl switch (nbcon_get_default_prio()) { case NBCON_PRIO_NORMAL: if (have_nbcon_console && !have_boot_console) { - if (printk_kthreads_running) + if (printk_kthreads_running && !console_irqwork_blocked) ft->nbcon_offload = true; else ft->nbcon_atomic = true; @@ -251,7 +253,7 @@ static inline void printk_get_console_fl if (have_legacy_console || have_boot_console) { if (!is_printk_legacy_deferred()) ft->legacy_direct = true; - else + else if (!console_irqwork_blocked) ft->legacy_offload = true; } break; @@ -264,7 +266,7 @@ static inline void printk_get_console_fl if (have_legacy_console || have_boot_console) { if (!is_printk_legacy_deferred()) ft->legacy_direct = true; - else + else if (!console_irqwork_blocked) ft->legacy_offload = true; } break; --- a/kernel/printk/nbcon.c +++ b/kernel/printk/nbcon.c @@ -1276,6 +1276,13 @@ void nbcon_kthreads_wake(void) if (!printk_kthreads_running) return;
+ /* + * It is not allowed to call this function when console irq_work + * is blocked. + */ + if (WARN_ON_ONCE(console_irqwork_blocked)) + return; + cookie = console_srcu_read_lock(); for_each_console_srcu(con) { if (!(console_srcu_read_flags(con) & CON_NBCON)) --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -462,6 +462,9 @@ bool have_boot_console; /* See printk_legacy_allow_panic_sync() for details. */ bool legacy_allow_panic_sync;
+/* Avoid using irq_work when suspending. */ +bool console_irqwork_blocked; + #ifdef CONFIG_PRINTK DECLARE_WAIT_QUEUE_HEAD(log_wait); static DECLARE_WAIT_QUEUE_HEAD(legacy_wait); @@ -2426,7 +2429,7 @@ asmlinkage int vprintk_emit(int facility
if (ft.legacy_offload) defer_console_output(); - else + else if (!console_irqwork_blocked) wake_up_klogd();
return printed_len; @@ -2730,10 +2733,20 @@ void console_suspend_all(void) { struct console *con;
+ if (console_suspend_enabled) + pr_info("Suspending console(s) (use no_console_suspend to debug)\n"); + + /* + * Flush any console backlog and then avoid queueing irq_work until + * console_resume_all(). Until then deferred printing is no longer + * triggered, NBCON consoles transition to atomic flushing, and + * any klogd waiters are not triggered. + */ + pr_flush(1000, true); + console_irqwork_blocked = true; + if (!console_suspend_enabled) return; - pr_info("Suspending console(s) (use no_console_suspend to debug)\n"); - pr_flush(1000, true);
console_list_lock(); for_each_console(con) @@ -2754,26 +2767,34 @@ void console_resume_all(void) struct console_flush_type ft; struct console *con;
- if (!console_suspend_enabled) - return; - - console_list_lock(); - for_each_console(con) - console_srcu_write_flags(con, con->flags & ~CON_SUSPENDED); - console_list_unlock(); - /* - * Ensure that all SRCU list walks have completed. All printing - * contexts must be able to see they are no longer suspended so - * that they are guaranteed to wake up and resume printing. + * Allow queueing irq_work. After restoring console state, deferred + * printing and any klogd waiters need to be triggered in case there + * is now a console backlog. */ - synchronize_srcu(&console_srcu); + console_irqwork_blocked = false; + + if (console_suspend_enabled) { + console_list_lock(); + for_each_console(con) + console_srcu_write_flags(con, con->flags & ~CON_SUSPENDED); + console_list_unlock(); + + /* + * Ensure that all SRCU list walks have completed. All printing + * contexts must be able to see they are no longer suspended so + * that they are guaranteed to wake up and resume printing. + */ + synchronize_srcu(&console_srcu); + }
printk_get_console_flush_type(&ft); if (ft.nbcon_offload) nbcon_kthreads_wake(); if (ft.legacy_offload) defer_console_output(); + else + wake_up_klogd();
pr_flush(1000, true); } @@ -4511,6 +4532,13 @@ static void __wake_up_klogd(int val) if (!printk_percpu_data_ready()) return;
+ /* + * It is not allowed to call this function when console irq_work + * is blocked. + */ + if (WARN_ON_ONCE(console_irqwork_blocked)) + return; + preempt_disable(); /* * Guarantee any new records can be seen by tasks preparing to wait
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tejun Heo tj@kernel.org
commit 530b6637c79e728d58f1d9b66bd4acf4b735b86d upstream.
Factor out local_dsq_post_enq() which performs post-enqueue handling for local DSQs - triggering resched_curr() if SCX_ENQ_PREEMPT is specified or if the current CPU is idle. No functional change.
This will be used by the next patch to fix move_local_task_to_local_dsq().
Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Andrea Righi arighi@nvidia.com Reviewed-by: Emil Tsalapatis emil@etsalapatis.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/ext.c | 34 +++++++++++++++++++--------------- 1 file changed, 19 insertions(+), 15 deletions(-)
--- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -906,6 +906,22 @@ static void refill_task_slice_dfl(struct __scx_add_event(sch, SCX_EV_REFILL_SLICE_DFL, 1); }
+static void local_dsq_post_enq(struct scx_dispatch_q *dsq, struct task_struct *p, + u64 enq_flags) +{ + struct rq *rq = container_of(dsq, struct rq, scx.local_dsq); + bool preempt = false; + + if ((enq_flags & SCX_ENQ_PREEMPT) && p != rq->curr && + rq->curr->sched_class == &ext_sched_class) { + rq->curr->scx.slice = 0; + preempt = true; + } + + if (preempt || sched_class_above(&ext_sched_class, rq->curr->sched_class)) + resched_curr(rq); +} + static void dispatch_enqueue(struct scx_sched *sch, struct scx_dispatch_q *dsq, struct task_struct *p, u64 enq_flags) { @@ -1003,22 +1019,10 @@ static void dispatch_enqueue(struct scx_ if (enq_flags & SCX_ENQ_CLEAR_OPSS) atomic_long_set_release(&p->scx.ops_state, SCX_OPSS_NONE);
- if (is_local) { - struct rq *rq = container_of(dsq, struct rq, scx.local_dsq); - bool preempt = false; - - if ((enq_flags & SCX_ENQ_PREEMPT) && p != rq->curr && - rq->curr->sched_class == &ext_sched_class) { - rq->curr->scx.slice = 0; - preempt = true; - } - - if (preempt || sched_class_above(&ext_sched_class, - rq->curr->sched_class)) - resched_curr(rq); - } else { + if (is_local) + local_dsq_post_enq(dsq, p, enq_flags); + else raw_spin_unlock(&dsq->lock); - } }
static void task_unlink_from_dsq(struct task_struct *p,
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zqiang qiang.zhang@linux.dev
commit 517a44d18537ef8ab888f71197c80116c14cee0a upstream.
This commit use kthread_destroy_worker() to release sch->helper objects to fix the following kmemleak:
unreferenced object 0xffff888121ec7b00 (size 128): comm "scx_simple", pid 1197, jiffies 4295884415 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 ad 4e ad de .............N.. ff ff ff ff 00 00 00 00 ff ff ff ff ff ff ff ff ................ backtrace (crc 587b3352): kmemleak_alloc+0x62/0xa0 __kmalloc_cache_noprof+0x28d/0x3e0 kthread_create_worker_on_node+0xd5/0x1f0 scx_enable.isra.210+0x6c2/0x25b0 bpf_scx_reg+0x12/0x20 bpf_struct_ops_link_create+0x2c3/0x3b0 __sys_bpf+0x3102/0x4b00 __x64_sys_bpf+0x79/0xc0 x64_sys_call+0x15d9/0x1dd0 do_syscall_64+0xf0/0x470 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: bff3b5aec1b7 ("sched_ext: Move disable machinery into scx_sched") Cc: stable@vger.kernel.org # v6.16+ Signed-off-by: Zqiang qiang.zhang@linux.dev Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/ext.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -3512,7 +3512,7 @@ static void scx_sched_free_rcu_work(stru int node;
irq_work_sync(&sch->error_irq_work); - kthread_stop(sch->helper->task); + kthread_destroy_worker(sch->helper);
free_percpu(sch->pcpu);
@@ -4504,7 +4504,7 @@ static struct scx_sched *scx_alloc_and_a return sch;
err_stop_helper: - kthread_stop(sch->helper->task); + kthread_destroy_worker(sch->helper); err_free_pcpu: free_percpu(sch->pcpu); err_free_gdsqs:
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tejun Heo tj@kernel.org
commit 9f769637a93fac81689b80df6855f545839cf999 upstream.
scx_enable() calls scx_bypass(true) to initialize in bypass mode and then scx_bypass(false) on success to exit. If scx_enable() fails during task initialization - e.g. scx_cgroup_init() or scx_init_task() returns an error - it jumps to err_disable while bypass is still active. scx_disable_workfn() then calls scx_bypass(true/false) for its own bypass, leaving the bypass depth at 1 instead of 0. This causes the system to remain permanently in bypass mode after a failed scx_enable().
Failures after task initialization is complete - e.g. scx_tryset_enable_state() at the end - already call scx_bypass(false) before reaching the error path and are not affected. This only affects a subset of failure modes.
Fix it by tracking whether scx_enable() called scx_bypass(true) in a bool and having scx_disable_workfn() call an extra scx_bypass(false) to clear it. This is a temporary measure as the bypass depth will be moved into the sched instance, which will make this tracking unnecessary.
Fixes: 8c2090c504e9 ("sched_ext: Initialize in bypass mode") Cc: stable@vger.kernel.org # v6.12+ Reported-by: Chris Mason clm@meta.com Reviewed-by: Emil Tsalapatis emil@etsalapatis.com Link: https://lore.kernel.org/stable/286e6f7787a81239e1ce2989b52391ce%40kernel.org Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/ext.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -40,6 +40,13 @@ static bool scx_init_task_enabled; static bool scx_switching_all; DEFINE_STATIC_KEY_FALSE(__scx_switched_all);
+/* + * Tracks whether scx_enable() called scx_bypass(true). Used to balance bypass + * depth on enable failure. Will be removed when bypass depth is moved into the + * sched instance. + */ +static bool scx_bypassed_for_enable; + static atomic_long_t scx_nr_rejected = ATOMIC_LONG_INIT(0); static atomic_long_t scx_hotplug_seq = ATOMIC_LONG_INIT(0);
@@ -4051,6 +4058,11 @@ static void scx_disable_workfn(struct kt scx_dsp_max_batch = 0; free_kick_pseqs();
+ if (scx_bypassed_for_enable) { + scx_bypassed_for_enable = false; + scx_bypass(false); + } + mutex_unlock(&scx_enable_mutex);
WARN_ON_ONCE(scx_set_enable_state(SCX_DISABLED) != SCX_DISABLING); @@ -4676,6 +4688,7 @@ static int scx_enable(struct sched_ext_o * Init in bypass mode to guarantee forward progress. */ scx_bypass(true); + scx_bypassed_for_enable = true;
for (i = SCX_OPI_NORMAL_BEGIN; i < SCX_OPI_NORMAL_END; i++) if (((void (**)(void))ops)[i]) @@ -4780,6 +4793,7 @@ static int scx_enable(struct sched_ext_o scx_task_iter_stop(&sti); percpu_up_write(&scx_fork_rwsem);
+ scx_bypassed_for_enable = false; scx_bypass(false);
if (!scx_tryset_enable_state(SCX_ENABLED, SCX_ENABLING)) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tejun Heo tj@kernel.org
commit f5e1e5ec204da11fa87fdf006d451d80ce06e118 upstream.
move_local_task_to_local_dsq() is used when moving a task from a non-local DSQ to a local DSQ on the same CPU. It directly manipulates the local DSQ without going through dispatch_enqueue() and was missing the post-enqueue handling that triggers preemption when SCX_ENQ_PREEMPT is set or the idle task is running.
The function is used by move_task_between_dsqs() which backs scx_bpf_dsq_move() and may be called while the CPU is busy.
Add local_dsq_post_enq() call to move_local_task_to_local_dsq(). As the dispatch path doesn't need post-enqueue handling, add SCX_RQ_IN_BALANCE early exit to keep consume_dispatch_q() behavior unchanged and avoid triggering unnecessary resched when scx_bpf_dsq_move() is used from the dispatch path.
Fixes: 4c30f5ce4f7a ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()") Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Andrea Righi arighi@nvidia.com Reviewed-by: Emil Tsalapatis emil@etsalapatis.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/ext.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -919,6 +919,14 @@ static void local_dsq_post_enq(struct sc struct rq *rq = container_of(dsq, struct rq, scx.local_dsq); bool preempt = false;
+ /* + * If @rq is in balance, the CPU is already vacant and looking for the + * next task to run. No need to preempt or trigger resched after moving + * @p into its local DSQ. + */ + if (rq->scx.flags & SCX_RQ_IN_BALANCE) + return; + if ((enq_flags & SCX_ENQ_PREEMPT) && p != rq->curr && rq->curr->sched_class == &ext_sched_class) { rq->curr->scx.slice = 0; @@ -1524,6 +1532,8 @@ static void move_local_task_to_local_dsq
dsq_mod_nr(dst_dsq, 1); p->scx.dsq = dst_dsq; + + local_dsq_post_enq(dst_dsq, p, enq_flags); }
/**
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Avadhut Naik avadhut.naik@amd.com
commit d7ac083f095d894a0b8ac0573516bfd035e6b25a upstream.
Currently, when a CMCI storm detected on a Machine Check bank, subsides, the bank's corresponding bit in the mce_poll_banks per-CPU variable is cleared unconditionally by cmci_storm_end().
On AMD SMCA systems, this essentially disables polling on that particular bank on that CPU. Consequently, any subsequent correctable errors or storms will not be logged.
Since AMD SMCA systems allow banks to be managed by both polling and interrupts, the polling banks bitmap for a CPU, i.e., mce_poll_banks, should not be modified when a storm subsides.
Fixes: 7eae17c4add5 ("x86/mce: Add per-bank CMCI storm mitigation") Signed-off-by: Avadhut Naik avadhut.naik@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251121190542.2447913-2-avadhut.naik@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/mce/threshold.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/mce/threshold.c +++ b/arch/x86/kernel/cpu/mce/threshold.c @@ -85,7 +85,8 @@ void cmci_storm_end(unsigned int bank) { struct mca_storm_desc *storm = this_cpu_ptr(&storm_desc);
- __clear_bit(bank, this_cpu_ptr(mce_poll_banks)); + if (!mce_flags.amd_threshold) + __clear_bit(bank, this_cpu_ptr(mce_poll_banks)); storm->banks[bank].history = 0; storm->banks[bank].in_storm_mode = false;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sarthak Garg sarthak.garg@oss.qualcomm.com
commit b1f856b1727c2eaa4be2c6d7cd7a8ed052bbeb87 upstream.
According to the hardware programming guide, the clock frequency must remain below 52MHz during the transition to HS400 mode.
However,in the current implementation, the timing is set to HS400 (a DDR mode) before adjusting the clock. This causes the clock to double prematurely to 104MHz during the transition phase, violating the specification and potentially resulting in CRC errors or CMD timeouts.
This change ensures that clock doubling is avoided during intermediate transitions and is applied only when the card requires a 200MHz clock for HS400 operation.
Signed-off-by: Sarthak Garg sarthak.garg@oss.qualcomm.com Reviewed-by: Bjorn Andersson andersson@kernel.org Acked-by: Adrian Hunter adrian.hunter@intel.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-msm.c | 27 +++++++++++++++------------ 1 file changed, 15 insertions(+), 12 deletions(-)
--- a/drivers/mmc/host/sdhci-msm.c +++ b/drivers/mmc/host/sdhci-msm.c @@ -344,41 +344,43 @@ static void sdhci_msm_v5_variant_writel_ writel_relaxed(val, host->ioaddr + offset); }
-static unsigned int msm_get_clock_mult_for_bus_mode(struct sdhci_host *host) +static unsigned int msm_get_clock_mult_for_bus_mode(struct sdhci_host *host, + unsigned int clock, + unsigned int timing) { - struct mmc_ios ios = host->mmc->ios; /* * The SDHC requires internal clock frequency to be double the * actual clock that will be set for DDR mode. The controller * uses the faster clock(100/400MHz) for some of its parts and * send the actual required clock (50/200MHz) to the card. */ - if (ios.timing == MMC_TIMING_UHS_DDR50 || - ios.timing == MMC_TIMING_MMC_DDR52 || - ios.timing == MMC_TIMING_MMC_HS400 || + if (timing == MMC_TIMING_UHS_DDR50 || + timing == MMC_TIMING_MMC_DDR52 || + (timing == MMC_TIMING_MMC_HS400 && + clock == MMC_HS200_MAX_DTR) || host->flags & SDHCI_HS400_TUNING) return 2; return 1; }
static void msm_set_clock_rate_for_bus_mode(struct sdhci_host *host, - unsigned int clock) + unsigned int clock, + unsigned int timing) { struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host); struct sdhci_msm_host *msm_host = sdhci_pltfm_priv(pltfm_host); - struct mmc_ios curr_ios = host->mmc->ios; struct clk *core_clk = msm_host->bulk_clks[0].clk; unsigned long achieved_rate; unsigned int desired_rate; unsigned int mult; int rc;
- mult = msm_get_clock_mult_for_bus_mode(host); + mult = msm_get_clock_mult_for_bus_mode(host, clock, timing); desired_rate = clock * mult; rc = dev_pm_opp_set_rate(mmc_dev(host->mmc), desired_rate); if (rc) { pr_err("%s: Failed to set clock at rate %u at timing %d\n", - mmc_hostname(host->mmc), desired_rate, curr_ios.timing); + mmc_hostname(host->mmc), desired_rate, timing); return; }
@@ -397,7 +399,7 @@ static void msm_set_clock_rate_for_bus_m msm_host->clk_rate = desired_rate;
pr_debug("%s: Setting clock at rate %lu at timing %d\n", - mmc_hostname(host->mmc), achieved_rate, curr_ios.timing); + mmc_hostname(host->mmc), achieved_rate, timing); }
/* Platform specific tuning */ @@ -1239,7 +1241,7 @@ static int sdhci_msm_execute_tuning(stru */ if (host->flags & SDHCI_HS400_TUNING) { sdhci_msm_hc_select_mode(host); - msm_set_clock_rate_for_bus_mode(host, ios.clock); + msm_set_clock_rate_for_bus_mode(host, ios.clock, ios.timing); host->flags &= ~SDHCI_HS400_TUNING; }
@@ -1864,6 +1866,7 @@ static void sdhci_msm_set_clock(struct s { struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host); struct sdhci_msm_host *msm_host = sdhci_pltfm_priv(pltfm_host); + struct mmc_ios ios = host->mmc->ios;
if (!clock) { host->mmc->actual_clock = msm_host->clk_rate = 0; @@ -1872,7 +1875,7 @@ static void sdhci_msm_set_clock(struct s
sdhci_msm_hc_select_mode(host);
- msm_set_clock_rate_for_bus_mode(host, clock); + msm_set_clock_rate_for_bus_mode(host, ios.clock, ios.timing); out: __sdhci_msm_set_clock(host, clock); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Minnekhanov alexeymin@postmarketos.org
commit c57210bc15371caa06a5d4040e7d8aaeed4cb661 upstream.
Add definition for display subsystem reset control, so display driver can reset display controller properly, clearing any configuration left there by bootloader. Since 6.17 after PM domains rework it became necessary for display to function.
Fixes: 0e789b491ba0 ("pmdomain: core: Leave powered-on genpds on until sync_state") Cc: stable@vger.kernel.org # 6.17 Signed-off-by: Alexey Minnekhanov alexeymin@postmarketos.org Acked-by: Krzysztof Kozlowski krzk@kernel.org Link: https://lore.kernel.org/r/20251116-sdm660-mdss-reset-v2-1-6219bec0a97f@postm... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/dt-bindings/clock/qcom,mmcc-sdm660.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/include/dt-bindings/clock/qcom,mmcc-sdm660.h b/include/dt-bindings/clock/qcom,mmcc-sdm660.h index f9dbc21cb5c7..ee2a89dae72d 100644 --- a/include/dt-bindings/clock/qcom,mmcc-sdm660.h +++ b/include/dt-bindings/clock/qcom,mmcc-sdm660.h @@ -157,6 +157,7 @@ #define BIMC_SMMU_GDSC 7
#define CAMSS_MICRO_BCR 0 +#define MDSS_BCR 1
#endif
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: André Draszik andre.draszik@linaro.org
commit 5e428e45bf17a8f3784099ca5ded16e3b5d59766 upstream.
Commit f4fb9c4d7f94 ("phy: exynos5-usbdrd: allow DWC3 runtime suspend with UDC bound (E850+)") incorrectly added clk_bulk_disable() as the inverse of clk_bulk_prepare_enable() while it should have of course used clk_bulk_disable_unprepare(). This means incorrect reference counts to the CMU driver remain.
Update the code accordingly.
Fixes: f4fb9c4d7f94 ("phy: exynos5-usbdrd: allow DWC3 runtime suspend with UDC bound (E850+)") CC: stable@vger.kernel.org Signed-off-by: André Draszik andre.draszik@linaro.org Reviewed-by: Sam Protsenko semen.protsenko@linaro.org Reviewed-by: Peter Griffin peter.griffin@linaro.org Link: https://patch.msgid.link/20251006-gs101-usb-phy-clk-imbalance-v1-1-205b20612... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/phy/samsung/phy-exynos5-usbdrd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/phy/samsung/phy-exynos5-usbdrd.c +++ b/drivers/phy/samsung/phy-exynos5-usbdrd.c @@ -1823,7 +1823,7 @@ static int exynos5_usbdrd_orien_sw_set(s phy_drd->orientation = orientation; }
- clk_bulk_disable(phy_drd->drv_data->n_clks, phy_drd->clks); + clk_bulk_disable_unprepare(phy_drd->drv_data->n_clks, phy_drd->clks);
return 0; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ard Biesheuvel ardb@kernel.org
commit 40374d308e4e456048d83991e937f13fc8bda8bf upstream.
Initialize the cpus_allowed_lock struct member of efi_mm.
Cc: stable@vger.kernel.org Signed-off-by: Ard Biesheuvel ardb@kernel.org Acked-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/efi/efi.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -74,6 +74,9 @@ struct mm_struct efi_mm = { .page_table_lock = __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock), .mmlist = LIST_HEAD_INIT(efi_mm.mmlist), .cpu_bitmap = { [BITS_TO_LONGS(NR_CPUS)] = 0}, +#ifdef CONFIG_SCHED_MM_CID + .cpus_allowed_lock = __RAW_SPIN_LOCK_UNLOCKED(efi_mm.cpus_allowed_lock), +#endif };
struct workqueue_struct *efi_rts_wq;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
commit 970e1e41805f0bd49dc234330a9390f4708d097d upstream.
driver_find_device() calls get_device() to increment the reference count once a matching device is found. device_release_driver() releases the driver, but it does not decrease the reference count that was incremented by driver_find_device(). At the end of the loop, there is no put_device() to balance the reference count. To avoid reference count leakage, add put_device() to decrease the reference count.
Found by code review.
Cc: stable@vger.kernel.org Fixes: bfc653aa89cb ("perf: arm_cspmu: Separate Arm and vendor module") Signed-off-by: Ma Ke make24@iscas.ac.cn Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/perf/arm_cspmu/arm_cspmu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/perf/arm_cspmu/arm_cspmu.c +++ b/drivers/perf/arm_cspmu/arm_cspmu.c @@ -1365,8 +1365,10 @@ void arm_cspmu_impl_unregister(const str
/* Unbind the driver from all matching backend devices. */ while ((dev = driver_find_device(&arm_cspmu_driver.driver, NULL, - match, arm_cspmu_match_device))) + match, arm_cspmu_match_device))) { device_release_driver(dev); + put_device(dev); + }
mutex_lock(&arm_cspmu_lock);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@kernel.org
commit 2f22115709fc7ebcfa40af3367a508fbbd2f71e9 upstream.
In the C code, the 'inc' argument to the assembly functions blake2s_compress_ssse3() and blake2s_compress_avx512() is declared with type u32, matching blake2s_compress(). The assembly code then reads it from the 64-bit %rcx. However, the ABI doesn't guarantee zero-extension to 64 bits, nor do gcc or clang guarantee it. Therefore, fix these functions to read this argument from the 32-bit %ecx.
In theory, this bug could have caused the wrong 'inc' value to be used, causing incorrect BLAKE2s hashes. In practice, probably not: I've fixed essentially this same bug in many other assembly files too, but there's never been a real report of it having caused a problem. In x86_64, all writes to 32-bit registers are zero-extended to 64 bits. That results in zero-extension in nearly all situations. I've only been able to demonstrate a lack of zero-extension with a somewhat contrived example involving truncation, e.g. when the C code has a u64 variable holding 0x1234567800000040 and passes it as a u32 expecting it to be truncated to 0x40 (64). But that's not what the real code does, of course.
Fixes: ed0356eda153 ("crypto: blake2s - x86_64 SIMD implementation") Cc: stable@vger.kernel.org Reviewed-by: Ard Biesheuvel ardb@kernel.org Link: https://lore.kernel.org/r/20251102234209.62133-2-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/crypto/x86/blake2s-core.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/lib/crypto/x86/blake2s-core.S +++ b/lib/crypto/x86/blake2s-core.S @@ -52,7 +52,7 @@ SYM_FUNC_START(blake2s_compress_ssse3) movdqa ROT16(%rip),%xmm12 movdqa ROR328(%rip),%xmm13 movdqu 0x20(%rdi),%xmm14 - movq %rcx,%xmm15 + movd %ecx,%xmm15 leaq SIGMA+0xa0(%rip),%r8 jmp .Lbeginofloop .align 32 @@ -176,7 +176,7 @@ SYM_FUNC_START(blake2s_compress_avx512) vmovdqu (%rdi),%xmm0 vmovdqu 0x10(%rdi),%xmm1 vmovdqu 0x20(%rdi),%xmm4 - vmovq %rcx,%xmm5 + vmovd %ecx,%xmm5 vmovdqa IV(%rip),%xmm14 vmovdqa IV+16(%rip),%xmm15 jmp .Lblake2s_compress_avx512_mainloop
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Haberland sth@linux.ibm.com
commit c943bfc6afb8d0e781b9b7406f36caa8bbf95cb9 upstream.
After a copy pair swap the block device's "device" symlink points to the secondary CCW device, but the gendisk's parent remained the primary, leaving /sys/block/<dasdx> under the wrong parent.
Move the gendisk to the secondary's device with device_move(), keeping the sysfs topology consistent after the swap.
Fixes: 413862caad6f ("s390/dasd: add copy pair swap capability") Cc: stable@vger.kernel.org #6.1 Reviewed-by: Jan Hoeppner hoeppner@linux.ibm.com Signed-off-by: Stefan Haberland sth@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/block/dasd_eckd.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/s390/block/dasd_eckd.c +++ b/drivers/s390/block/dasd_eckd.c @@ -6150,6 +6150,7 @@ static int dasd_eckd_copy_pair_swap(stru struct dasd_copy_relation *copy; struct dasd_block *block; struct gendisk *gdp; + int rc;
copy = device->copy; if (!copy) @@ -6184,6 +6185,13 @@ static int dasd_eckd_copy_pair_swap(stru /* swap blocklayer device link */ gdp = block->gdp; dasd_add_link_to_gendisk(gdp, secondary); + rc = device_move(disk_to_dev(gdp), &secondary->cdev->dev, DPM_ORDER_NONE); + if (rc) { + dev_err(&primary->cdev->dev, + "copy_pair_swap: moving blockdevice parent %s->%s failed (%d)\n", + dev_name(&primary->cdev->dev), + dev_name(&secondary->cdev->dev), rc); + }
/* re-enable device */ dasd_device_remove_stop_bits(primary, DASD_STOPPED_PPRC);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sven Eckelmann (Plasma Cloud) se@simonwunderlich.de
commit 38b845e1f9e810869b0a0b69f202b877b7b7fb12 upstream.
The power-limits for ru and mcs and stored in the devicetree as bytewise array (often with sizes which are not a multiple of 4). These arrays have a prefix which defines for how many modes a line is applied. This prefix is also only a byte - but the code still tried to fix the endianness of this byte with a be32 operation. As result, loading was mostly failing or was sending completely unexpected values to the firmware.
Since the other rates are also stored in the devicetree as bytewise arrays, just drop the u32 access + be32_to_cpu conversion and directly access them as bytes arrays.
Cc: stable@vger.kernel.org Fixes: 22b980badc0f ("mt76: add functions for parsing rate power limits from DT") Fixes: a9627d992b5e ("mt76: extend DT rate power limits to support 11ax devices") Signed-off-by: Sven Eckelmann (Plasma Cloud) se@simonwunderlich.de Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/mediatek/mt76/eeprom.c | 37 ++++++++++++++++++---------- 1 file changed, 24 insertions(+), 13 deletions(-)
--- a/drivers/net/wireless/mediatek/mt76/eeprom.c +++ b/drivers/net/wireless/mediatek/mt76/eeprom.c @@ -253,6 +253,19 @@ mt76_get_of_array(struct device_node *np return prop->value; }
+static const s8 * +mt76_get_of_array_s8(struct device_node *np, char *name, size_t *len, int min) +{ + struct property *prop = of_find_property(np, name, NULL); + + if (!prop || !prop->value || prop->length < min) + return NULL; + + *len = prop->length; + + return prop->value; +} + struct device_node * mt76_find_channel_node(struct device_node *np, struct ieee80211_channel *chan) { @@ -294,7 +307,7 @@ mt76_get_txs_delta(struct device_node *n }
static void -mt76_apply_array_limit(s8 *pwr, size_t pwr_len, const __be32 *data, +mt76_apply_array_limit(s8 *pwr, size_t pwr_len, const s8 *data, s8 target_power, s8 nss_delta, s8 *max_power) { int i; @@ -303,15 +316,14 @@ mt76_apply_array_limit(s8 *pwr, size_t p return;
for (i = 0; i < pwr_len; i++) { - pwr[i] = min_t(s8, target_power, - be32_to_cpu(data[i]) + nss_delta); + pwr[i] = min_t(s8, target_power, data[i] + nss_delta); *max_power = max(*max_power, pwr[i]); } }
static void mt76_apply_multi_array_limit(s8 *pwr, size_t pwr_len, s8 pwr_num, - const __be32 *data, size_t len, s8 target_power, + const s8 *data, size_t len, s8 target_power, s8 nss_delta, s8 *max_power) { int i, cur; @@ -319,8 +331,7 @@ mt76_apply_multi_array_limit(s8 *pwr, si if (!data) return;
- len /= 4; - cur = be32_to_cpu(data[0]); + cur = data[0]; for (i = 0; i < pwr_num; i++) { if (len < pwr_len + 1) break; @@ -335,7 +346,7 @@ mt76_apply_multi_array_limit(s8 *pwr, si if (!len) break;
- cur = be32_to_cpu(data[0]); + cur = data[0]; } }
@@ -346,7 +357,7 @@ s8 mt76_get_rate_power_limits(struct mt7 { struct mt76_dev *dev = phy->dev; struct device_node *np; - const __be32 *val; + const s8 *val; char name[16]; u32 mcs_rates = dev->drv->mcs_rates; u32 ru_rates = ARRAY_SIZE(dest->ru[0]); @@ -392,21 +403,21 @@ s8 mt76_get_rate_power_limits(struct mt7
txs_delta = mt76_get_txs_delta(np, hweight16(phy->chainmask));
- val = mt76_get_of_array(np, "rates-cck", &len, ARRAY_SIZE(dest->cck)); + val = mt76_get_of_array_s8(np, "rates-cck", &len, ARRAY_SIZE(dest->cck)); mt76_apply_array_limit(dest->cck, ARRAY_SIZE(dest->cck), val, target_power, txs_delta, &max_power);
- val = mt76_get_of_array(np, "rates-ofdm", - &len, ARRAY_SIZE(dest->ofdm)); + val = mt76_get_of_array_s8(np, "rates-ofdm", + &len, ARRAY_SIZE(dest->ofdm)); mt76_apply_array_limit(dest->ofdm, ARRAY_SIZE(dest->ofdm), val, target_power, txs_delta, &max_power);
- val = mt76_get_of_array(np, "rates-mcs", &len, mcs_rates + 1); + val = mt76_get_of_array_s8(np, "rates-mcs", &len, mcs_rates + 1); mt76_apply_multi_array_limit(dest->mcs[0], ARRAY_SIZE(dest->mcs[0]), ARRAY_SIZE(dest->mcs), val, len, target_power, txs_delta, &max_power);
- val = mt76_get_of_array(np, "rates-ru", &len, ru_rates + 1); + val = mt76_get_of_array_s8(np, "rates-ru", &len, ru_rates + 1); mt76_apply_multi_array_limit(dest->ru[0], ARRAY_SIZE(dest->ru[0]), ARRAY_SIZE(dest->ru), val, len, target_power, txs_delta, &max_power);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
commit 0185c2292c600993199bc6b1f342ad47a9e8c678 upstream.
In our user safe ino resolve ioctl we'll just turn any ret into -EACCES from inode_permission(). This is redundant, and could potentially be wrong if we had an ENOMEM in the security layer or some such other error, so simply return the actual return value.
Note: The patch was taken from v5 of fscrypt patchset (https://lore.kernel.org/linux-btrfs/cover.1706116485.git.josef@toxicpanda.co...) which was handled over time by various people: Omar Sandoval, Sweet Tea Dorminy, Josef Bacik.
Fixes: 23d0b79dfaed ("btrfs: Add unprivileged version of ino_lookup ioctl") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Daniel Vacek neelx@suse.com Reviewed-by: David Sterba dsterba@suse.com [ add note ] Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/ioctl.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1913,10 +1913,8 @@ static int btrfs_search_path_in_tree_use ret = inode_permission(idmap, &temp_inode->vfs_inode, MAY_READ | MAY_EXEC); iput(&temp_inode->vfs_inode); - if (ret) { - ret = -EACCES; + if (ret) goto out_put; - }
if (key.offset == upper_limit) break;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Velichayshiy a.velichayshiy@ispras.ru
commit 4cfc7d5a4a01d2133b278cdbb1371fba1b419174 upstream.
After commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"), the freeze error handling is broken because gfs2_do_thaw() overwrites the 'error' variable, causing incorrect processing of the original freeze error.
Fix this by calling gfs2_do_thaw() when gfs2_lock_fs_check_clean() fails but ignoring its return value to preserve the original freeze error for proper reporting.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: b77b4a4815a9 ("gfs2: Rework freeze / thaw logic") Cc: stable@vger.kernel.org # v6.5+ Signed-off-by: Alexey Velichayshiy a.velichayshiy@ispras.ru Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/gfs2/super.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/gfs2/super.c +++ b/fs/gfs2/super.c @@ -749,9 +749,7 @@ static int gfs2_freeze_super(struct supe break; }
- error = gfs2_do_thaw(sdp, who, freeze_owner); - if (error) - goto out; + (void)gfs2_do_thaw(sdp, who, freeze_owner);
if (error == -EBUSY) fs_err(sdp, "waiting for recovery before freeze\n");
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Chen chenl311@chinatelecom.cn
commit 3179a5f7f86bcc3acd5d6fb2a29f891ef5615852 upstream.
loop devices under heavy stress-ng loop streessor can trigger many capacity change events in a short time. Each event prints an info message from set_capacity_and_notify(), flooding the console and contributing to soft lockups on slow consoles.
Switch the printk in set_capacity_and_notify() to pr_info_ratelimited() so frequent capacity changes do not spam the log while still reporting occasional changes.
Cc: stable@vger.kernel.org Signed-off-by: Li Chen chenl311@chinatelecom.cn Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/genhd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/block/genhd.c +++ b/block/genhd.c @@ -90,7 +90,7 @@ bool set_capacity_and_notify(struct gend (disk->flags & GENHD_FL_HIDDEN)) return false;
- pr_info("%s: detected capacity change from %lld to %lld\n", + pr_info_ratelimited("%s: detected capacity change from %lld to %lld\n", disk->disk_name, capacity, size);
/*
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ye Bin yebin10@huawei.com
commit 6abfe107894af7e8ce3a2e120c619d81ee764ad5 upstream.
Copying the file system while it is mounted as read-only results in a mount failure: [~]# mkfs.ext4 -F /dev/sdc [~]# mount /dev/sdc -o ro /mnt/test [~]# dd if=/dev/sdc of=/dev/sda bs=1M [~]# mount /dev/sda /mnt/test1 [ 1094.849826] JBD2: journal checksum error [ 1094.850927] EXT4-fs (sda): Could not load journal inode mount: mount /dev/sda on /mnt/test1 failed: Bad message
The process described above is just an abstracted way I came up with to reproduce the issue. In the actual scenario, the file system was mounted read-only and then copied while it was still mounted. It was found that the mount operation failed. The user intended to verify the data or use it as a backup, and this action was performed during a version upgrade. Above issue may happen as follows: ext4_fill_super set_journal_csum_feature_set(sb) if (ext4_has_metadata_csum(sb)) incompat = JBD2_FEATURE_INCOMPAT_CSUM_V3; if (test_opt(sb, JOURNAL_CHECKSUM) jbd2_journal_set_features(sbi->s_journal, compat, 0, incompat); lock_buffer(journal->j_sb_buffer); sb->s_feature_incompat |= cpu_to_be32(incompat); //The data in the journal sb was modified, but the checksum was not updated, so the data remaining in memory has a mismatch between the data and the checksum. unlock_buffer(journal->j_sb_buffer);
In this case, the journal sb copied over is in a state where the checksum and data are inconsistent, so mounting fails. To solve the above issue, update the checksum in memory after modifying the journal sb.
Fixes: 4fd5ea43bc11 ("jbd2: checksum journal superblock") Signed-off-by: Ye Bin yebin10@huawei.com Reviewed-by: Baokun Li libaokun1@huawei.com Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Jan Kara jack@suse.cz Message-ID: 20251103010123.3753631-1-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/jbd2/journal.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -2349,6 +2349,12 @@ int jbd2_journal_set_features(journal_t sb->s_feature_compat |= cpu_to_be32(compat); sb->s_feature_ro_compat |= cpu_to_be32(ro); sb->s_feature_incompat |= cpu_to_be32(incompat); + /* + * Update the checksum now so that it is valid even for read-only + * filesystems where jbd2_write_superblock() doesn't get called. + */ + if (jbd2_journal_has_csum_v2or3(journal)) + sb->s_checksum = jbd2_superblock_csum(sb); unlock_buffer(journal->j_sb_buffer); jbd2_journal_init_transaction_limits(journal);
@@ -2378,9 +2384,17 @@ void jbd2_journal_clear_features(journal
sb = journal->j_superblock;
+ lock_buffer(journal->j_sb_buffer); sb->s_feature_compat &= ~cpu_to_be32(compat); sb->s_feature_ro_compat &= ~cpu_to_be32(ro); sb->s_feature_incompat &= ~cpu_to_be32(incompat); + /* + * Update the checksum now so that it is valid even for read-only + * filesystems where jbd2_write_superblock() doesn't get called. + */ + if (jbd2_journal_has_csum_v2or3(journal)) + sb->s_checksum = jbd2_superblock_csum(sb); + unlock_buffer(journal->j_sb_buffer); jbd2_journal_init_transaction_limits(journal); } EXPORT_SYMBOL(jbd2_journal_clear_features);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rene Rebe rene@exactco.de
commit 82d20481024cbae2ea87fe8b86d12961bfda7169 upstream.
For years I wondered why the floppy driver does not just work on sparc64, e.g:
root@SUNW_375_0066:# disktype /dev/fd0 disktype: Can't open /dev/fd0: No such device or address
[ 525.341906] disktype: attempt to access beyond end of device fd0: rw=0, sector=0, nr_sectors = 16 limit=8 [ 525.341991] floppy: error 10 while reading block 0
Turns out floppy.c __floppy_read_block_0 tries to read one page for the first test read to determine the disk size and thus fails if that is greater than 4k. Adjust minimum MAX_DISK_SIZE to PAGE_SIZE to fix floppy on sparc64 and likely all other PAGE_SIZE != 4KB configs.
Cc: stable@vger.kernel.org Signed-off-by: René Rebe rene@exactco.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/floppy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/block/floppy.c +++ b/drivers/block/floppy.c @@ -329,7 +329,7 @@ static bool initialized; * This default is used whenever the current disk size is unknown. * [Now it is rather a minimum] */ -#define MAX_DISK_SIZE 4 /* 3984 */ +#define MAX_DISK_SIZE (PAGE_SIZE / 1024)
/* * globals used by 'result()'
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@kernel.org
commit 4dffc9bbffb9ccfcda730d899c97c553599e7ca8 upstream.
The original implementation of memcpy_sglist() was broken because it didn't handle scatterlists that describe exactly the same memory, which is a case that many callers rely on. The current implementation is broken too because it calls the skcipher_walk functions which can fail. It ignores any errors from those functions.
Fix it by replacing it with a new implementation written from scratch. It always succeeds. It's also a bit faster, since it avoids the overhead of skcipher_walk. skcipher_walk includes a lot of functionality (such as alignmask handling) that's irrelevant here.
Reported-by: Colin Ian King coking@nvidia.com Closes: https://lore.kernel.org/r/20251114122620.111623-1-coking@nvidia.com Fixes: 131bdceca1f0 ("crypto: scatterwalk - Add memcpy_sglist") Fixes: 0f8d42bf128d ("crypto: scatterwalk - Move skcipher walk and use it for memcpy_sglist") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- crypto/scatterwalk.c | 95 +++++++++++++++++++++++++++++++----- include/crypto/scatterwalk.h | 52 ++++++++++++-------- 2 files changed, 114 insertions(+), 33 deletions(-)
diff --git a/crypto/scatterwalk.c b/crypto/scatterwalk.c index 1d010e2a1b1a..b95e5974e327 100644 --- a/crypto/scatterwalk.c +++ b/crypto/scatterwalk.c @@ -101,26 +101,97 @@ void memcpy_to_sglist(struct scatterlist *sg, unsigned int start, } EXPORT_SYMBOL_GPL(memcpy_to_sglist);
+/** + * memcpy_sglist() - Copy data from one scatterlist to another + * @dst: The destination scatterlist. Can be NULL if @nbytes == 0. + * @src: The source scatterlist. Can be NULL if @nbytes == 0. + * @nbytes: Number of bytes to copy + * + * The scatterlists can describe exactly the same memory, in which case this + * function is a no-op. No other overlaps are supported. + * + * Context: Any context + */ void memcpy_sglist(struct scatterlist *dst, struct scatterlist *src, unsigned int nbytes) { - struct skcipher_walk walk = {}; + unsigned int src_offset, dst_offset;
- if (unlikely(nbytes == 0)) /* in case sg == NULL */ + if (unlikely(nbytes == 0)) /* in case src and/or dst is NULL */ return;
- walk.total = nbytes; + src_offset = src->offset; + dst_offset = dst->offset; + for (;;) { + /* Compute the length to copy this step. */ + unsigned int len = min3(src->offset + src->length - src_offset, + dst->offset + dst->length - dst_offset, + nbytes); + struct page *src_page = sg_page(src); + struct page *dst_page = sg_page(dst); + const void *src_virt; + void *dst_virt;
- scatterwalk_start(&walk.in, src); - scatterwalk_start(&walk.out, dst); + if (IS_ENABLED(CONFIG_HIGHMEM)) { + /* HIGHMEM: we may have to actually map the pages. */ + const unsigned int src_oip = offset_in_page(src_offset); + const unsigned int dst_oip = offset_in_page(dst_offset); + const unsigned int limit = PAGE_SIZE;
- skcipher_walk_first(&walk, true); - do { - if (walk.src.virt.addr != walk.dst.virt.addr) - memcpy(walk.dst.virt.addr, walk.src.virt.addr, - walk.nbytes); - skcipher_walk_done(&walk, 0); - } while (walk.nbytes); + /* Further limit len to not cross a page boundary. */ + len = min3(len, limit - src_oip, limit - dst_oip); + + /* Compute the source and destination pages. */ + src_page += src_offset / PAGE_SIZE; + dst_page += dst_offset / PAGE_SIZE; + + if (src_page != dst_page) { + /* Copy between different pages. */ + memcpy_page(dst_page, dst_oip, + src_page, src_oip, len); + flush_dcache_page(dst_page); + } else if (src_oip != dst_oip) { + /* Copy between different parts of same page. */ + dst_virt = kmap_local_page(dst_page); + memcpy(dst_virt + dst_oip, dst_virt + src_oip, + len); + kunmap_local(dst_virt); + flush_dcache_page(dst_page); + } /* Else, it's the same memory. No action needed. */ + } else { + /* + * !HIGHMEM: no mapping needed. Just work in the linear + * buffer of each sg entry. Note that we can cross page + * boundaries, as they are not significant in this case. + */ + src_virt = page_address(src_page) + src_offset; + dst_virt = page_address(dst_page) + dst_offset; + if (src_virt != dst_virt) { + memcpy(dst_virt, src_virt, len); + if (ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE) + __scatterwalk_flush_dcache_pages( + dst_page, dst_offset, len); + } /* Else, it's the same memory. No action needed. */ + } + nbytes -= len; + if (nbytes == 0) /* No more to copy? */ + break; + + /* + * There's more to copy. Advance the offsets by the length + * copied this step, and advance the sg entries as needed. + */ + src_offset += len; + if (src_offset >= src->offset + src->length) { + src = sg_next(src); + src_offset = src->offset; + } + dst_offset += len; + if (dst_offset >= dst->offset + dst->length) { + dst = sg_next(dst); + dst_offset = dst->offset; + } + } } EXPORT_SYMBOL_GPL(memcpy_sglist);
diff --git a/include/crypto/scatterwalk.h b/include/crypto/scatterwalk.h index 83d14376ff2b..f485454e3955 100644 --- a/include/crypto/scatterwalk.h +++ b/include/crypto/scatterwalk.h @@ -227,6 +227,34 @@ static inline void scatterwalk_done_src(struct scatter_walk *walk, scatterwalk_advance(walk, nbytes); }
+/* + * Flush the dcache of any pages that overlap the region + * [offset, offset + nbytes) relative to base_page. + * + * This should be called only when ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, to ensure + * that all relevant code (including the call to sg_page() in the caller, if + * applicable) gets fully optimized out when !ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE. + */ +static inline void __scatterwalk_flush_dcache_pages(struct page *base_page, + unsigned int offset, + unsigned int nbytes) +{ + unsigned int num_pages; + + base_page += offset / PAGE_SIZE; + offset %= PAGE_SIZE; + + /* + * This is an overflow-safe version of + * num_pages = DIV_ROUND_UP(offset + nbytes, PAGE_SIZE). + */ + num_pages = nbytes / PAGE_SIZE; + num_pages += DIV_ROUND_UP(offset + (nbytes % PAGE_SIZE), PAGE_SIZE); + + for (unsigned int i = 0; i < num_pages; i++) + flush_dcache_page(base_page + i); +} + /** * scatterwalk_done_dst() - Finish one step of a walk of destination scatterlist * @walk: the scatter_walk @@ -240,27 +268,9 @@ static inline void scatterwalk_done_dst(struct scatter_walk *walk, unsigned int nbytes) { scatterwalk_unmap(walk); - /* - * Explicitly check ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE instead of just - * relying on flush_dcache_page() being a no-op when not implemented, - * since otherwise the BUG_ON in sg_page() does not get optimized out. - * This also avoids having to consider whether the loop would get - * reliably optimized out or not. - */ - if (ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE) { - struct page *base_page; - unsigned int offset; - int start, end, i; - - base_page = sg_page(walk->sg); - offset = walk->offset; - start = offset >> PAGE_SHIFT; - end = start + (nbytes >> PAGE_SHIFT); - end += (offset_in_page(offset) + offset_in_page(nbytes) + - PAGE_SIZE - 1) >> PAGE_SHIFT; - for (i = start; i < end; i++) - flush_dcache_page(base_page + i); - } + if (ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE) + __scatterwalk_flush_dcache_pages(sg_page(walk->sg), + walk->offset, nbytes); scatterwalk_advance(walk, nbytes); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zheng Yejian zhengyejian@huaweicloud.com
commit f3f9f42232dee596d15491ca3f611d02174db49c upstream.
Currently when the length of a symbol is longer than 0x7f characters, its type shown in /proc/kallsyms can be incorrect.
I found this issue when reading the code, but it can be reproduced by following steps:
1. Define a function which symbol length is 130 characters:
#define X13(x) x##x##x##x##x##x##x##x##x##x##x##x##x static noinline void X13(x123456789)(void) { printk("hello world\n"); }
2. The type in vmlinux is 't':
$ nm vmlinux | grep x123456 ffffffff816290f0 t x123456789x123456789x123456789x12[...]
3. Then boot the kernel, the type shown in /proc/kallsyms becomes 'g' instead of the expected 't':
# cat /proc/kallsyms | grep x123456 ffffffff816290f0 g x123456789x123456789x123456789x12[...]
The root cause is that, after commit 73bbb94466fd ("kallsyms: support "big" kernel symbols"), ULEB128 was used to encode symbol name length. That is, for "big" kernel symbols of which name length is longer than 0x7f characters, the length info is encoded into 2 bytes.
kallsyms_get_symbol_type() expects to read the first char of the symbol name which indicates the symbol type. However, due to the "big" symbol case not being handled, the symbol type read from /proc/kallsyms may be wrong, so handle it properly.
Cc: stable@vger.kernel.org Fixes: 73bbb94466fd ("kallsyms: support "big" kernel symbols") Signed-off-by: Zheng Yejian zhengyejian@huaweicloud.com Acked-by: Gary Guo gary@garyguo.net Link: https://patch.msgid.link/20241011143853.3022643-1-zhengyejian@huaweicloud.co... Signed-off-by: Miguel Ojeda ojeda@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/kallsyms.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/kernel/kallsyms.c +++ b/kernel/kallsyms.c @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(uns { /* * Get just the first code, look it up in the token table, - * and return the first char from this token. + * and return the first char from this token. If MSB of length + * is 1, it is a "big" symbol, so needs an additional byte. */ + if (kallsyms_names[off] & 0x80) + off++; return kallsyms_token_table[kallsyms_token_index[kallsyms_names[off + 1]]]; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konstantin Komarov almaz.alexandrovich@paragon-software.com
commit 801f614ba263cb37624982b27b4c82f3c3c597a9 upstream.
Some NTFS volumes failed to mount because sparse data runs were not handled correctly during runlist unpacking. The code performed arithmetic on the special SPARSE_LCN64 marker, leading to invalid LCN values and mount errors.
Add an explicit check for the case described above, marking the run as sparse without applying arithmetic.
Fixes: 736fc7bf5f68 ("fs: ntfs3: Fix integer overflow in run_unpack()") Cc: stable@vger.kernel.org Signed-off-by: Konstantin Komarov almaz.alexandrovich@paragon-software.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ntfs3/run.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/fs/ntfs3/run.c +++ b/fs/ntfs3/run.c @@ -984,8 +984,12 @@ int run_unpack(struct runs_tree *run, st if (!dlcn) return -EINVAL;
- if (check_add_overflow(prev_lcn, dlcn, &lcn)) + /* Check special combination: 0 + SPARSE_LCN64. */ + if (!prev_lcn && dlcn == SPARSE_LCN64) { + lcn = SPARSE_LCN64; + } else if (check_add_overflow(prev_lcn, dlcn, &lcn)) { return -EINVAL; + } prev_lcn = lcn; } else { /* The size of 'dlcn' can't be > 8. */
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit d3042cbe84a060b4df764eb6c5300bbe20d125ca upstream.
The error path of copying the old config used the wrong variable in the error message:
$ mkdir /tmp/build $ ./tools/testing/ktest/config-bisect.pl -b /tmp/build config-good /tmp/config-bad $ chmod 0 /tmp/build $ ./tools/testing/ktest/config-bisect.pl -b /tmp/build config-good /tmp/config-bad good cp /tmp/build//.config config-good.tmp ... [0 seconds] FAILED! Use of uninitialized value $config in concatenation (.) or string at ./tools/testing/ktest/config-bisect.pl line 744. failed to copy to config-good.tmp
When it should have shown:
failed to copy /tmp/build//.config to config-good.tmp
Cc: stable@vger.kernel.org Cc: John 'Warthog9' Hawley warthog9@kernel.org Fixes: 0f0db065999cf ("ktest: Add standalone config-bisect.pl program") Link: https://patch.msgid.link/20251203180924.6862bd26@gandalf.local.home Reported-by: "John W. Krahn" jwkrahn@shaw.ca Signed-off-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/ktest/config-bisect.pl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/testing/ktest/config-bisect.pl +++ b/tools/testing/ktest/config-bisect.pl @@ -741,9 +741,9 @@ if ($start) { die "Can not find file $bad\n"; } if ($val eq "good") { - run_command "cp $output_config $good" or die "failed to copy $config to $good\n"; + run_command "cp $output_config $good" or die "failed to copy $output_config to $good\n"; } elsif ($val eq "bad") { - run_command "cp $output_config $bad" or die "failed to copy $config to $bad\n"; + run_command "cp $output_config $bad" or die "failed to copy $output_config to $bad\n"; } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Sakkinen jarkko.sakkinen@opinsys.com
commit faf07e611dfa464b201223a7253e9dc5ee0f3c9e upstream.
tpm2_get_pcr_allocation() does not cap any upper limit for the number of banks. Cap the limit to eight banks so that out of bounds values coming from external I/O cause on only limited harm.
Cc: stable@vger.kernel.org # v5.10+ Fixes: bcfff8384f6c ("tpm: dynamically allocate the allocated_banks array") Tested-by: Lai Yi yi1.lai@linux.intel.com Reviewed-by: Jonathan McDowell noodles@meta.com Reviewed-by: Roberto Sassu roberto.sassu@huawei.com Signed-off-by: Jarkko Sakkinen jarkko.sakkinen@opinsys.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm-chip.c | 1 - drivers/char/tpm/tpm1-cmd.c | 5 ----- drivers/char/tpm/tpm2-cmd.c | 8 +++----- include/linux/tpm.h | 8 +++++--- 4 files changed, 8 insertions(+), 14 deletions(-)
--- a/drivers/char/tpm/tpm-chip.c +++ b/drivers/char/tpm/tpm-chip.c @@ -282,7 +282,6 @@ static void tpm_dev_release(struct devic
kfree(chip->work_space.context_buf); kfree(chip->work_space.session_buf); - kfree(chip->allocated_banks); #ifdef CONFIG_TCG_TPM2_HMAC kfree(chip->auth); #endif --- a/drivers/char/tpm/tpm1-cmd.c +++ b/drivers/char/tpm/tpm1-cmd.c @@ -799,11 +799,6 @@ int tpm1_pm_suspend(struct tpm_chip *chi */ int tpm1_get_pcr_allocation(struct tpm_chip *chip) { - chip->allocated_banks = kcalloc(1, sizeof(*chip->allocated_banks), - GFP_KERNEL); - if (!chip->allocated_banks) - return -ENOMEM; - chip->allocated_banks[0].alg_id = TPM_ALG_SHA1; chip->allocated_banks[0].digest_size = hash_digest_size[HASH_ALGO_SHA1]; chip->allocated_banks[0].crypto_id = HASH_ALGO_SHA1; --- a/drivers/char/tpm/tpm2-cmd.c +++ b/drivers/char/tpm/tpm2-cmd.c @@ -538,11 +538,9 @@ ssize_t tpm2_get_pcr_allocation(struct t
nr_possible_banks = be32_to_cpup( (__be32 *)&buf.data[TPM_HEADER_SIZE + 5]); - - chip->allocated_banks = kcalloc(nr_possible_banks, - sizeof(*chip->allocated_banks), - GFP_KERNEL); - if (!chip->allocated_banks) { + if (nr_possible_banks > TPM2_MAX_PCR_BANKS) { + pr_err("tpm: out of bank capacity: %u > %u\n", + nr_possible_banks, TPM2_MAX_PCR_BANKS); rc = -ENOMEM; goto out; } --- a/include/linux/tpm.h +++ b/include/linux/tpm.h @@ -26,7 +26,9 @@ #include <crypto/aes.h>
#define TPM_DIGEST_SIZE 20 /* Max TPM v1.2 PCR size */ -#define TPM_MAX_DIGEST_SIZE SHA512_DIGEST_SIZE + +#define TPM2_MAX_DIGEST_SIZE SHA512_DIGEST_SIZE +#define TPM2_MAX_PCR_BANKS 8
struct tpm_chip; struct trusted_key_payload; @@ -68,7 +70,7 @@ enum tpm2_curves {
struct tpm_digest { u16 alg_id; - u8 digest[TPM_MAX_DIGEST_SIZE]; + u8 digest[TPM2_MAX_DIGEST_SIZE]; } __packed;
struct tpm_bank_info { @@ -189,7 +191,7 @@ struct tpm_chip { unsigned int groups_cnt;
u32 nr_allocated_banks; - struct tpm_bank_info *allocated_banks; + struct tpm_bank_info allocated_banks[TPM2_MAX_PCR_BANKS]; #ifdef CONFIG_ACPI acpi_handle acpi_dev_handle; char ppi_version[TPM_PPI_VERSION_LEN + 1];
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit 222047f68e8565c558728f792f6fef152a1d4d51 upstream.
The freeze_all_ptr check in filesystems_freeze_callback() introduced by commit a3f8f8662771 ("power: always freeze efivarfs") is reverse which quite confusingly causes all file systems to be frozen when filesystem_freeze_enabled is false.
On my systems it causes the WARN_ON_ONCE() in __set_task_frozen() to trigger, most likely due to an attempt to freeze a file system that is not ready for that.
Add a logical negation to the check in question to reverse it as appropriate.
Fixes: a3f8f8662771 ("power: always freeze efivarfs") Cc: 6.18+ stable@vger.kernel.org # 6.18+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Link: https://patch.msgid.link/12788397.O9o76ZdvQC@rafael.j.wysocki Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/super.c +++ b/fs/super.c @@ -1188,7 +1188,7 @@ static void filesystems_freeze_callback( if (!sb->s_op->freeze_fs && !sb->s_op->freeze_super) return;
- if (freeze_all_ptr && !(sb->s_type->fs_flags & FS_POWER_FREEZE)) + if (!freeze_all_ptr && !(sb->s_type->fs_flags & FS_POWER_FREEZE)) return;
if (!get_active_super(sb))
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Ogness john.ogness@linutronix.de
commit d01ff281bd9b1bfeac9ab98ec8a9ee41da900d5e upstream.
Currently printk_trigger_flush() only triggers legacy offloaded flushing, even if that may not be the appropriate method to flush for currently registered consoles. (The function predates the NBCON consoles.)
Since commit 6690d6b52726 ("printk: Add helper for flush type logic") there is printk_get_console_flush_type(), which also considers NBCON consoles and reports all the methods of flushing appropriate based on the system state and consoles available.
Update printk_trigger_flush() to use printk_get_console_flush_type() to appropriately flush registered consoles.
Suggested-by: Petr Mladek pmladek@suse.com Signed-off-by: John Ogness john.ogness@linutronix.de Reviewed-by: Petr Mladek pmladek@suse.com Link: https://lore.kernel.org/stable/20251113160351.113031-2-john.ogness%40linutro... Tested-by: Sherry Sun sherry.sun@nxp.com Link: https://patch.msgid.link/20251113160351.113031-2-john.ogness@linutronix.de Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/printk/nbcon.c | 2 +- kernel/printk/printk.c | 23 ++++++++++++++++++++++- 2 files changed, 23 insertions(+), 2 deletions(-)
--- a/kernel/printk/nbcon.c +++ b/kernel/printk/nbcon.c @@ -1856,7 +1856,7 @@ void nbcon_device_release(struct console if (console_trylock()) console_unlock(); } else if (ft.legacy_offload) { - printk_trigger_flush(); + defer_console_output(); } } console_srcu_read_unlock(cookie); --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -4595,9 +4595,30 @@ void defer_console_output(void) __wake_up_klogd(PRINTK_PENDING_WAKEUP | PRINTK_PENDING_OUTPUT); }
+/** + * printk_trigger_flush - Attempt to flush printk buffer to consoles. + * + * If possible, flush the printk buffer to all consoles in the caller's + * context. If offloading is available, trigger deferred printing. + * + * This is best effort. Depending on the system state, console states, + * and caller context, no actual flushing may result from this call. + */ void printk_trigger_flush(void) { - defer_console_output(); + struct console_flush_type ft; + + printk_get_console_flush_type(&ft); + if (ft.nbcon_atomic) + nbcon_atomic_flush_pending(); + if (ft.nbcon_offload) + nbcon_kthreads_wake(); + if (ft.legacy_direct) { + if (console_trylock()) + console_unlock(); + } + if (ft.legacy_offload) + defer_console_output(); }
int vprintk_deferred(const char *fmt, va_list args)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Ogness john.ogness@linutronix.de
commit 66e7c1e0ee08cfb6db64f8f3f6e5a3cc930145c8 upstream.
With commit ("printk: Avoid scheduling irq_work on suspend") the implementation of printk_get_console_flush_type() was modified to avoid offloading when irq_work should be blocked during suspend. Since printk uses the returned flush type to determine what flushing methods are used, this was thought to be sufficient for avoiding irq_work usage during the suspend phase.
However, vprintk_emit() implements a hack to support printk_deferred(). In this hack, the returned flush type is adjusted to make sure no legacy direct printing occurs when printk_deferred() was used.
Because of this hack, the legacy offloading flushing method can still be used, causing irq_work to be queued when it should not be.
Adjust the vprintk_emit() hack to also consider @console_irqwork_blocked so that legacy offloading will not be chosen when irq_work should be blocked.
Link: https://lore.kernel.org/lkml/87fra90xv4.fsf@jogness.linutronix.de Signed-off-by: John Ogness john.ogness@linutronix.de Fixes: 26873e3e7f0c ("printk: Avoid scheduling irq_work on suspend") Reviewed-by: Petr Mladek pmladek@suse.com Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/printk/printk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2393,7 +2393,7 @@ asmlinkage int vprintk_emit(int facility /* If called from the scheduler, we can not call up(). */ if (level == LOGLEVEL_SCHED) { level = LOGLEVEL_DEFAULT; - ft.legacy_offload |= ft.legacy_direct; + ft.legacy_offload |= ft.legacy_direct && !console_irqwork_blocked; ft.legacy_direct = false; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fedor Pchelkin pchelkin@ispras.ru
commit ee5a977b4e771cc181f39d504426dbd31ed701cc upstream.
strscpy_pad() can't be used to copy a non-NUL-term string into a NUL-term string of possibly bigger size. Commit 0efc5990bca5 ("string.h: Introduce memtostr() and memtostr_pad()") provides additional information in that regard. So if this happens, the following warning is observed:
strnlen: detected buffer overflow: 65 byte read of buffer size 64 WARNING: CPU: 0 PID: 28655 at lib/string_helpers.c:1032 __fortify_report+0x96/0xc0 lib/string_helpers.c:1032 Modules linked in: CPU: 0 UID: 0 PID: 28655 Comm: syz-executor.3 Not tainted 6.12.54-syzkaller-00144-g5f0270f1ba00 #0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:__fortify_report+0x96/0xc0 lib/string_helpers.c:1032 Call Trace: <TASK> __fortify_panic+0x1f/0x30 lib/string_helpers.c:1039 strnlen include/linux/fortify-string.h:235 [inline] sized_strscpy include/linux/fortify-string.h:309 [inline] parse_apply_sb_mount_options fs/ext4/super.c:2504 [inline] __ext4_fill_super fs/ext4/super.c:5261 [inline] ext4_fill_super+0x3c35/0xad00 fs/ext4/super.c:5706 get_tree_bdev_flags+0x387/0x620 fs/super.c:1636 vfs_get_tree+0x93/0x380 fs/super.c:1814 do_new_mount fs/namespace.c:3553 [inline] path_mount+0x6ae/0x1f70 fs/namespace.c:3880 do_mount fs/namespace.c:3893 [inline] __do_sys_mount fs/namespace.c:4103 [inline] __se_sys_mount fs/namespace.c:4080 [inline] __x64_sys_mount+0x280/0x300 fs/namespace.c:4080 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x64/0x140 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Since userspace is expected to provide s_mount_opts field to be at most 63 characters long with the ending byte being NUL-term, use a 64-byte buffer which matches the size of s_mount_opts, so that strscpy_pad() does its job properly. Return with error if the user still managed to provide a non-NUL-term string here.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 8ecb790ea8c3 ("ext4: avoid potential buffer over-read in parse_apply_sb_mount_options()") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Reviewed-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Message-ID: 20251101160430.222297-1-pchelkin@ispras.ru Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/super.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -2475,7 +2475,7 @@ static int parse_apply_sb_mount_options( struct ext4_fs_context *m_ctx) { struct ext4_sb_info *sbi = EXT4_SB(sb); - char s_mount_opts[65]; + char s_mount_opts[64]; struct ext4_fs_context *s_ctx = NULL; struct fs_context *fc = NULL; int ret = -ENOMEM; @@ -2483,7 +2483,8 @@ static int parse_apply_sb_mount_options( if (!sbi->s_es->s_mount_opts[0]) return 0;
- strscpy_pad(s_mount_opts, sbi->s_es->s_mount_opts); + if (strscpy_pad(s_mount_opts, sbi->s_es->s_mount_opts) < 0) + return -E2BIG;
fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL); if (!fc)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fedor Pchelkin pchelkin@ispras.ru
commit 3db63d2c2d1d1e78615dd742568c5a2d55291ad1 upstream.
params.mount_opts may come as potentially non-NUL-term string. Userspace is expected to pass a NUL-term string. Add an extra check to ensure this holds true. Note that further code utilizes strscpy_pad() so this is just for proper informing the user of incorrect data being provided.
Found by Linux Verification Center (linuxtesting.org).
Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Reviewed-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Message-ID: 20251101160430.222297-2-pchelkin@ispras.ru Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/ioctl.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -1394,6 +1394,10 @@ static int ext4_ioctl_set_tune_sb(struct if (copy_from_user(¶ms, in, sizeof(params))) return -EFAULT;
+ if (strnlen(params.mount_opts, sizeof(params.mount_opts)) == + sizeof(params.mount_opts)) + return -E2BIG; + if ((params.set_flags & ~TUNE_OPS_SUPPORTED) != 0) return -EOPNOTSUPP;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Karina Yankevich k.yankevich@omp.ru
commit b97cb7d6a051aa6ebd57906df0e26e9e36c26d14 upstream.
If ext4_get_inode_loc() fails (e.g. if it returns -EFSCORRUPTED), iloc.bh will remain set to NULL. Since ext4_xattr_inode_dec_ref_all() lacks error checking, this will lead to a null pointer dereference in ext4_raw_inode(), called right after ext4_get_inode_loc().
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: c8e008b60492 ("ext4: ignore xattrs past end") Cc: stable@kernel.org Signed-off-by: Karina Yankevich k.yankevich@omp.ru Reviewed-by: Sergey Shtylyov s.shtylyov@omp.ru Reviewed-by: Baokun Li libaokun1@huawei.com Message-ID: 20251022093253.3546296-1-k.yankevich@omp.ru Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/xattr.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -1174,7 +1174,11 @@ ext4_xattr_inode_dec_ref_all(handle_t *h if (block_csum) end = (void *)bh->b_data + bh->b_size; else { - ext4_get_inode_loc(parent, &iloc); + err = ext4_get_inode_loc(parent, &iloc); + if (err) { + EXT4_ERROR_INODE(parent, "parent inode loc (error %d)", err); + return; + } end = (void *)ext4_raw_inode(&iloc) + EXT4_SB(parent->i_sb)->s_inode_size; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haibo Chen haibo.chen@nxp.com
commit 4091c8206cfd2e3bb529ef260887296b90d9b6a2 upstream.
i_state_flags used on 32-bit archs, need to clear this flag when alloc inode. Find this issue when umount ext4, sometimes track the inode as orphan accidently, cause ext4 mesg dump.
Fixes: acf943e9768e ("ext4: fix checks for orphan inodes") Signed-off-by: Haibo Chen haibo.chen@nxp.com Reviewed-by: Baokun Li libaokun1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Message-ID: 20251104-ext4-v1-1-73691a0800f9@nxp.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/ialloc.c | 1 - fs/ext4/inode.c | 1 - fs/ext4/super.c | 1 + 3 files changed, 1 insertion(+), 2 deletions(-)
--- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -1293,7 +1293,6 @@ got: ei->i_csum_seed = ext4_chksum(csum, (__u8 *)&gen, sizeof(gen)); }
- ext4_clear_state_flags(ei); /* Only relevant on 32-bit archs */ ext4_set_inode_state(inode, EXT4_STATE_NEW);
ei->i_extra_isize = sbi->s_want_extra_isize; --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5288,7 +5288,6 @@ struct inode *__ext4_iget(struct super_b ei->i_projid = make_kprojid(&init_user_ns, i_projid); set_nlink(inode, le16_to_cpu(raw_inode->i_links_count));
- ext4_clear_state_flags(ei); /* Only relevant on 32-bit archs */ ei->i_inline_off = 0; ei->i_dir_start_lookup = 0; ei->i_dtime = le32_to_cpu(raw_inode->i_dtime); --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1396,6 +1396,7 @@ static struct inode *ext4_alloc_inode(st
inode_set_iversion(&ei->vfs_inode, 1); ei->i_flags = 0; + ext4_clear_state_flags(ei); /* Only relevant on 32-bit archs */ spin_lock_init(&ei->i_raw_lock); ei->i_prealloc_node = RB_ROOT; atomic_set(&ei->i_prealloc_active, 0);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yongjian Sun sunyongjian1@huawei.com
commit 3f7a79d05c692c7cfec70bf104b1b3c3d0ce6247 upstream.
When the MB_CHECK_ASSERT macro is enabled, an assertion failure can occur in __mb_check_buddy when checking preallocated blocks (pa) in a block group:
Assertion failure in mb_free_blocks() : "groupnr == e4b->bd_group"
This happens when a pa at the very end of a block group (e.g., pa_pstart=32765, pa_len=3 in a group of 32768 blocks) becomes exhausted - its pa_pstart is advanced by pa_len to 32768, which lies in the next block group. If this exhausted pa (with pa_len == 0) is still in the bb_prealloc_list during the buddy check, the assertion incorrectly flags it as belonging to the wrong group. A possible sequence is as follows:
ext4_mb_new_blocks ext4_mb_release_context pa->pa_pstart += EXT4_C2B(sbi, ac->ac_b_ex.fe_len) pa->pa_len -= ac->ac_b_ex.fe_len
__mb_check_buddy for each pa in group ext4_get_group_no_and_offset MB_CHECK_ASSERT(groupnr == e4b->bd_group)
To fix this, we modify the check to skip block group validation for exhausted preallocations (where pa_len == 0). Such entries are in a transitional state and will be removed from the list soon, so they should not trigger an assertion. This change prevents the false positive while maintaining the integrity of the checks for active allocations.
Fixes: c9de560ded61f ("ext4: Add multi block allocator for ext4") Signed-off-by: Yongjian Sun sunyongjian1@huawei.com Reviewed-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Message-ID: 20251106060614.631382-2-sunyongjian@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/mballoc.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -783,6 +783,8 @@ static void __mb_check_buddy(struct ext4 ext4_group_t groupnr; struct ext4_prealloc_space *pa; pa = list_entry(cur, struct ext4_prealloc_space, pa_group_list); + if (!pa->pa_len) + continue; ext4_get_group_no_and_offset(sb, pa->pa_pstart, &groupnr, &k); MB_CHECK_ASSERT(groupnr == e4b->bd_group); for (i = 0; i < pa->pa_len; i++)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
commit 7c11c56eb32eae96893eebafdbe3decadefe88ad upstream.
Kernel commit 0a6ce20c1564 ("ext4: verify orphan file size is not too big") limits the maximum supported orphan file size to 8 << 20.
However, in e2fsprogs, the orphan file size is set to 32–512 filesystem blocks when creating a filesystem.
With 64k block size, formatting an ext4 fs >32G gives an orphan file bigger than the kernel allows, so mount prints an error and fails:
EXT4-fs (vdb): orphan file too big: 8650752 EXT4-fs (vdb): mount failed
To prevent this issue and allow previously created 64KB filesystems to mount, we updates the maximum allowed orphan file size in the kernel to 512 filesystem blocks.
Fixes: 0a6ce20c1564 ("ext4: verify orphan file size is not too big") Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Message-ID: 20251120134233.2994147-1-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/orphan.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/ext4/orphan.c +++ b/fs/ext4/orphan.c @@ -8,6 +8,8 @@ #include "ext4.h" #include "ext4_jbd2.h"
+#define EXT4_MAX_ORPHAN_FILE_BLOCKS 512 + static int ext4_orphan_file_add(handle_t *handle, struct inode *inode) { int i, j, start; @@ -588,7 +590,7 @@ int ext4_init_orphan_info(struct super_b * consuming absurd amounts of memory when pinning blocks of orphan * file in memory. */ - if (inode->i_size > 8 << 20) { + if (inode->i_size > (EXT4_MAX_ORPHAN_FILE_BLOCKS << inode->i_blkbits)) { ext4_msg(sb, KERN_ERR, "orphan file too big: %llu", (unsigned long long)inode->i_size); ret = -EFSCORRUPTED;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
commit 524c3853831cf4f7e1db579e487c757c3065165c upstream.
syzbot is reporting possibility of deadlock due to sharing lock_class_key for jbd2_handle across ext4 and ocfs2. But this is a false positive, for one disk partition can't have two filesystems at the same time.
Reported-by: syzbot+6e493c165d26d6fcbf72@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6e493c165d26d6fcbf72 Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Tested-by: syzbot+6e493c165d26d6fcbf72@syzkaller.appspotmail.com Reviewed-by: Jan Kara jack@suse.cz Message-ID: 987110fc-5470-457a-a218-d286a09dd82f@I-love.SAKURA.ne.jp Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/jbd2/journal.c | 6 ++++-- include/linux/jbd2.h | 6 ++++++ 2 files changed, 10 insertions(+), 2 deletions(-)
--- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -1521,7 +1521,6 @@ static journal_t *journal_init_common(st struct block_device *fs_dev, unsigned long long start, int len, int blocksize) { - static struct lock_class_key jbd2_trans_commit_key; journal_t *journal; int err; int n; @@ -1530,6 +1529,7 @@ static journal_t *journal_init_common(st if (!journal) return ERR_PTR(-ENOMEM);
+ lockdep_register_key(&journal->jbd2_trans_commit_key); journal->j_blocksize = blocksize; journal->j_dev = bdev; journal->j_fs_dev = fs_dev; @@ -1560,7 +1560,7 @@ static journal_t *journal_init_common(st journal->j_max_batch_time = 15000; /* 15ms */ atomic_set(&journal->j_reserved_credits, 0); lockdep_init_map(&journal->j_trans_commit_map, "jbd2_handle", - &jbd2_trans_commit_key, 0); + &journal->jbd2_trans_commit_key, 0);
/* The journal is marked for error until we succeed with recovery! */ journal->j_flags = JBD2_ABORT; @@ -1611,6 +1611,7 @@ err_cleanup: kfree(journal->j_wbuf); jbd2_journal_destroy_revoke(journal); journal_fail_superblock(journal); + lockdep_unregister_key(&journal->jbd2_trans_commit_key); kfree(journal); return ERR_PTR(err); } @@ -2187,6 +2188,7 @@ int jbd2_journal_destroy(journal_t *jour jbd2_journal_destroy_revoke(journal); kfree(journal->j_fc_wbuf); kfree(journal->j_wbuf); + lockdep_unregister_key(&journal->jbd2_trans_commit_key); kfree(journal);
return err; --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -1253,6 +1253,12 @@ struct journal_s */ struct lockdep_map j_trans_commit_map; #endif + /** + * @jbd2_trans_commit_key: + * + * "struct lock_class_key" for @j_trans_commit_map + */ + struct lock_class_key jbd2_trans_commit_key;
/** * @j_fc_cleanup_callback:
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Byungchul Park byungchul@sk.com
commit 40a71b53d5a6d4ea17e4d54b99b2ac03a7f5e783 upstream.
jbd2 journal handling code doesn't want jbd2_might_wait_for_commit() to be placed between start_this_handle() and stop_this_handle(). So it marks the region with rwsem_acquire_read() and rwsem_release().
However, the annotation is too strong for that purpose. We don't have to use more than try lock annotation for that.
rwsem_acquire_read() implies:
1. might be a waiter on contention of the lock. 2. enter to the critical section of the lock.
All we need in here is to act 2, not 1. So trylock version of annotation is sufficient for that purpose. Now that dept partially relies on lockdep annotaions, dept interpets rwsem_acquire_read() as a potential wait and might report a deadlock by the wait.
Replace it with trylock version of annotation.
Signed-off-by: Byungchul Park byungchul@sk.com Reviewed-by: Jan Kara jack@suse.cz Cc: stable@kernel.org Message-ID: 20251024073940.1063-1-byungchul@sk.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/jbd2/transaction.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -441,7 +441,7 @@ repeat: read_unlock(&journal->j_state_lock); current->journal_info = handle;
- rwsem_acquire_read(&journal->j_trans_commit_map, 0, 0, _THIS_IP_); + rwsem_acquire_read(&journal->j_trans_commit_map, 0, 1, _THIS_IP_); jbd2_journal_free_transaction(new_transaction); /* * Ensure that no allocations done while the transaction is open are
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bart Van Assche bvanassche@acm.org
commit 935a20d1bebf6236076785fac3ff81e3931834e9 upstream.
Freezing the request queue from inside sysfs store callbacks may cause a deadlock in combination with the dm-multipath driver and the queue_if_no_path option. Additionally, freezing the request queue slows down system boot on systems where sysfs attributes are set synchronously.
Fix this by removing the blk_mq_freeze_queue() / blk_mq_unfreeze_queue() calls from the store callbacks that do not strictly need these callbacks. Add the __data_racy annotation to request_queue.rq_timeout to suppress KCSAN data race reports about the rq_timeout reads.
This patch may cause a small delay in applying the new settings.
For all the attributes affected by this patch, I/O will complete correctly whether the old or the new value of the attribute is used.
This patch affects the following sysfs attributes: * io_poll_delay * io_timeout * nomerges * read_ahead_kb * rq_affinity
Here is an example of a deadlock triggered by running test srp/002 if this patch is not applied:
task:multipathd Call Trace: <TASK> __schedule+0x8c1/0x1bf0 schedule+0xdd/0x270 schedule_preempt_disabled+0x1c/0x30 __mutex_lock+0xb89/0x1650 mutex_lock_nested+0x1f/0x30 dm_table_set_restrictions+0x823/0xdf0 __bind+0x166/0x590 dm_swap_table+0x2a7/0x490 do_resume+0x1b1/0x610 dev_suspend+0x55/0x1a0 ctl_ioctl+0x3a5/0x7e0 dm_ctl_ioctl+0x12/0x20 __x64_sys_ioctl+0x127/0x1a0 x64_sys_call+0xe2b/0x17d0 do_syscall_64+0x96/0x3a0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 </TASK> task:(udev-worker) Call Trace: <TASK> __schedule+0x8c1/0x1bf0 schedule+0xdd/0x270 blk_mq_freeze_queue_wait+0xf2/0x140 blk_mq_freeze_queue_nomemsave+0x23/0x30 queue_ra_store+0x14e/0x290 queue_attr_store+0x23e/0x2c0 sysfs_kf_write+0xde/0x140 kernfs_fop_write_iter+0x3b2/0x630 vfs_write+0x4fd/0x1390 ksys_write+0xfd/0x230 __x64_sys_write+0x76/0xc0 x64_sys_call+0x276/0x17d0 do_syscall_64+0x96/0x3a0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 </TASK>
Cc: Christoph Hellwig hch@lst.de Cc: Ming Lei ming.lei@redhat.com Cc: Nilay Shroff nilay@linux.ibm.com Cc: Martin Wilck mwilck@suse.com Cc: Benjamin Marzinski bmarzins@redhat.com Cc: stable@vger.kernel.org Fixes: af2814149883 ("block: freeze the queue in queue_attr_store") Signed-off-by: Bart Van Assche bvanassche@acm.org Reviewed-by: Nilay Shroff nilay@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/blk-sysfs.c | 26 ++++++++------------------ include/linux/blkdev.h | 2 +- 2 files changed, 9 insertions(+), 19 deletions(-)
--- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -143,21 +143,22 @@ queue_ra_store(struct gendisk *disk, con { unsigned long ra_kb; ssize_t ret; - unsigned int memflags; struct request_queue *q = disk->queue;
ret = queue_var_store(&ra_kb, page, count); if (ret < 0) return ret; /* - * ->ra_pages is protected by ->limits_lock because it is usually - * calculated from the queue limits by queue_limits_commit_update. + * The ->ra_pages change below is protected by ->limits_lock because it + * is usually calculated from the queue limits by + * queue_limits_commit_update(). + * + * bdi->ra_pages reads are not serialized against bdi->ra_pages writes. + * Use WRITE_ONCE() to write bdi->ra_pages once. */ mutex_lock(&q->limits_lock); - memflags = blk_mq_freeze_queue(q); - disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10); + WRITE_ONCE(disk->bdi->ra_pages, ra_kb >> (PAGE_SHIFT - 10)); mutex_unlock(&q->limits_lock); - blk_mq_unfreeze_queue(q, memflags);
return ret; } @@ -375,21 +376,18 @@ static ssize_t queue_nomerges_store(stru size_t count) { unsigned long nm; - unsigned int memflags; struct request_queue *q = disk->queue; ssize_t ret = queue_var_store(&nm, page, count);
if (ret < 0) return ret;
- memflags = blk_mq_freeze_queue(q); blk_queue_flag_clear(QUEUE_FLAG_NOMERGES, q); blk_queue_flag_clear(QUEUE_FLAG_NOXMERGES, q); if (nm == 2) blk_queue_flag_set(QUEUE_FLAG_NOMERGES, q); else if (nm) blk_queue_flag_set(QUEUE_FLAG_NOXMERGES, q); - blk_mq_unfreeze_queue(q, memflags);
return ret; } @@ -409,7 +407,6 @@ queue_rq_affinity_store(struct gendisk * #ifdef CONFIG_SMP struct request_queue *q = disk->queue; unsigned long val; - unsigned int memflags;
ret = queue_var_store(&val, page, count); if (ret < 0) @@ -421,7 +418,6 @@ queue_rq_affinity_store(struct gendisk * * are accessed individually using atomic test_bit operation. So we * don't grab any lock while updating these flags. */ - memflags = blk_mq_freeze_queue(q); if (val == 2) { blk_queue_flag_set(QUEUE_FLAG_SAME_COMP, q); blk_queue_flag_set(QUEUE_FLAG_SAME_FORCE, q); @@ -432,7 +428,6 @@ queue_rq_affinity_store(struct gendisk * blk_queue_flag_clear(QUEUE_FLAG_SAME_COMP, q); blk_queue_flag_clear(QUEUE_FLAG_SAME_FORCE, q); } - blk_mq_unfreeze_queue(q, memflags); #endif return ret; } @@ -446,11 +441,9 @@ static ssize_t queue_poll_delay_store(st static ssize_t queue_poll_store(struct gendisk *disk, const char *page, size_t count) { - unsigned int memflags; ssize_t ret = count; struct request_queue *q = disk->queue;
- memflags = blk_mq_freeze_queue(q); if (!(q->limits.features & BLK_FEAT_POLL)) { ret = -EINVAL; goto out; @@ -459,7 +452,6 @@ static ssize_t queue_poll_store(struct g pr_info_ratelimited("writes to the poll attribute are ignored.\n"); pr_info_ratelimited("please use driver specific parameters instead.\n"); out: - blk_mq_unfreeze_queue(q, memflags); return ret; }
@@ -472,7 +464,7 @@ static ssize_t queue_io_timeout_show(str static ssize_t queue_io_timeout_store(struct gendisk *disk, const char *page, size_t count) { - unsigned int val, memflags; + unsigned int val; int err; struct request_queue *q = disk->queue;
@@ -480,9 +472,7 @@ static ssize_t queue_io_timeout_store(st if (err || val == 0) return -EINVAL;
- memflags = blk_mq_freeze_queue(q); blk_queue_rq_timeout(q, msecs_to_jiffies(val)); - blk_mq_unfreeze_queue(q, memflags);
return count; } --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -485,7 +485,7 @@ struct request_queue { */ unsigned long queue_flags;
- unsigned int rq_timeout; + unsigned int __data_racy rq_timeout;
unsigned int queue_depth;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: xu xin xu.xin16@zte.com.cn
commit 590c03ca6a3fbb114396673314e2aa483839608b upstream.
Patch series "ksm: fix exec/fork inheritance", v2.
This series fixes exec/fork inheritance. See the detailed description of the issue below.
This patch (of 2):
Background ==========
commit d7597f59d1d33 ("mm: add new api to enable ksm per process") introduced MMF_VM_MERGE_ANY for mm->flags, and allowed user to set it by prctl() so that the process's VMAs are forcibly scanned by ksmd.
Subsequently, the 3c6f33b7273a ("mm/ksm: support fork/exec for prctl") supported inheriting the MMF_VM_MERGE_ANY flag when a task calls execve().
Finally, commit 3a9e567ca45fb ("mm/ksm: fix ksm exec support for prctl") fixed the issue that ksmd doesn't scan the mm_struct with MMF_VM_MERGE_ANY by adding the mm_slot to ksm_mm_head in __bprm_mm_init().
Problem =======
In some extreme scenarios, however, this inheritance of MMF_VM_MERGE_ANY during exec/fork can fail. For example, when the scanning frequency of ksmd is tuned extremely high, a process carrying MMF_VM_MERGE_ANY may still fail to pass it to the newly exec'd process. This happens because ksm_execve() is executed too early in the do_execve flow (prematurely adding the new mm_struct to the ksm_mm_slot list).
As a result, before do_execve completes, ksmd may have already performed a scan and found that this new mm_struct has no VM_MERGEABLE VMAs, thus clearing its MMF_VM_MERGE_ANY flag. Consequently, when the new program executes, the flag MMF_VM_MERGE_ANY inheritance missed.
Root reason ===========
commit d7597f59d1d33 ("mm: add new api to enable ksm per process") clear the flag MMF_VM_MERGE_ANY when ksmd found no VM_MERGEABLE VMAs.
Solution ========
Firstly, Don't clear MMF_VM_MERGE_ANY when ksmd found no VM_MERGEABLE VMAs, because perhaps their mm_struct has just been added to ksm_mm_slot list, and its process has not yet officially started running or has not yet performed mmap/brk to allocate anonymous VMAS.
Secondly, recheck MMF_VM_MERGEABLE again if a process takes MMF_VM_MERGE_ANY, and create a mm_slot and join it into ksm_scan_list again.
Link: https://lkml.kernel.org/r/20251007182504440BJgK8VXRHh8TD7IGSUIY4@zte.com.cn Link: https://lkml.kernel.org/r/20251007182821572h_SoFqYZXEP1mvWI4n9VL@zte.com.cn Fixes: 3c6f33b7273a ("mm/ksm: support fork/exec for prctl") Fixes: d7597f59d1d3 ("mm: add new api to enable ksm per process") Signed-off-by: xu xin xu.xin16@zte.com.cn Cc: Stefan Roesch shr@devkernel.io Cc: David Hildenbrand david@redhat.com Cc: Jinjiang Tu tujinjiang@huawei.com Cc: Wang Yaxin wang.yaxin@zte.com.cn Cc: Yang Yang yang.yang29@zte.com.cn Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/ksm.h | 4 ++-- mm/ksm.c | 20 +++++++++++++++++--- 2 files changed, 19 insertions(+), 5 deletions(-)
--- a/include/linux/ksm.h +++ b/include/linux/ksm.h @@ -17,7 +17,7 @@ #ifdef CONFIG_KSM int ksm_madvise(struct vm_area_struct *vma, unsigned long start, unsigned long end, int advice, vm_flags_t *vm_flags); -vm_flags_t ksm_vma_flags(const struct mm_struct *mm, const struct file *file, +vm_flags_t ksm_vma_flags(struct mm_struct *mm, const struct file *file, vm_flags_t vm_flags); int ksm_enable_merge_any(struct mm_struct *mm); int ksm_disable_merge_any(struct mm_struct *mm); @@ -103,7 +103,7 @@ bool ksm_process_mergeable(struct mm_str
#else /* !CONFIG_KSM */
-static inline vm_flags_t ksm_vma_flags(const struct mm_struct *mm, +static inline vm_flags_t ksm_vma_flags(struct mm_struct *mm, const struct file *file, vm_flags_t vm_flags) { return vm_flags; --- a/mm/ksm.c +++ b/mm/ksm.c @@ -2712,8 +2712,14 @@ no_vmas: spin_unlock(&ksm_mmlist_lock);
mm_slot_free(mm_slot_cache, mm_slot); + /* + * Only clear MMF_VM_MERGEABLE. We must not clear + * MMF_VM_MERGE_ANY, because for those MMF_VM_MERGE_ANY process, + * perhaps their mm_struct has just been added to ksm_mm_slot + * list, and its process has not yet officially started running + * or has not yet performed mmap/brk to allocate anonymous VMAS. + */ mm_flags_clear(MMF_VM_MERGEABLE, mm); - mm_flags_clear(MMF_VM_MERGE_ANY, mm); mmap_read_unlock(mm); mmdrop(mm); } else { @@ -2831,12 +2837,20 @@ static int __ksm_del_vma(struct vm_area_ * * Returns: @vm_flags possibly updated to mark mergeable. */ -vm_flags_t ksm_vma_flags(const struct mm_struct *mm, const struct file *file, +vm_flags_t ksm_vma_flags(struct mm_struct *mm, const struct file *file, vm_flags_t vm_flags) { if (mm_flags_test(MMF_VM_MERGE_ANY, mm) && - __ksm_should_add_vma(file, vm_flags)) + __ksm_should_add_vma(file, vm_flags)) { vm_flags |= VM_MERGEABLE; + /* + * Generally, the flags here always include MMF_VM_MERGEABLE. + * However, in rare cases, this flag may be cleared by ksmd who + * scans a cycle without finding any mergeable vma. + */ + if (unlikely(!mm_flags_test(MMF_VM_MERGEABLE, mm))) + __ksm_enter(mm); + }
return vm_flags; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Laurent Pinchart laurent.pinchart@ideasonboard.com
commit 082b86919b7a94de01d849021b4da820a6cb89dc upstream.
Commit cbd9463da1b1 ("media: v4l2-mem2mem: Avoid calling .device_run in v4l2_m2m_job_finish") deferred calls to .device_run() to a work queue to avoid recursive calls when a job is finished right away from .device_run(). It failed to update the v4l2_m2m_job_finish() documentation that still states the function must not be called from .device_run(). Fix it.
Fixes: cbd9463da1b1 ("media: v4l2-mem2mem: Avoid calling .device_run in v4l2_m2m_job_finish") Cc: stable@vger.kernel.org Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/media/v4l2-mem2mem.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/include/media/v4l2-mem2mem.h +++ b/include/media/v4l2-mem2mem.h @@ -192,8 +192,7 @@ void v4l2_m2m_try_schedule(struct v4l2_m * other instances to take control of the device. * * This function has to be called only after &v4l2_m2m_ops->device_run - * callback has been called on the driver. To prevent recursion, it should - * not be called directly from the &v4l2_m2m_ops->device_run callback though. + * callback has been called on the driver. */ void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev, struct v4l2_m2m_ctx *m2m_ctx);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wei Yang richard.weiyang@gmail.com
commit 2a1351cd4176ee1809b0900d386919d03b7652f8 upstream.
We add pmd folio into ds_queue on the first page fault in __do_huge_pmd_anonymous_page(), so that we can split it in case of memory pressure. This should be the same for a pmd folio during wp page fault.
Commit 1ced09e0331f ("mm: allocate THP on hugezeropage wp-fault") miss to add it to ds_queue, which means system may not reclaim enough memory in case of memory pressure even the pmd folio is under used.
Move deferred_split_folio() into map_anon_folio_pmd() to make the pmd folio installation consistent.
Link: https://lkml.kernel.org/r/20251008095453.18772-2-richard.weiyang@gmail.com Fixes: 1ced09e0331f ("mm: allocate THP on hugezeropage wp-fault") Signed-off-by: Wei Yang richard.weiyang@gmail.com Acked-by: David Hildenbrand david@redhat.com Reviewed-by: Lance Yang lance.yang@linux.dev Reviewed-by: Dev Jain dev.jain@arm.com Acked-by: Usama Arif usamaarif642@gmail.com Reviewed-by: Zi Yan ziy@nvidia.com Reviewed-by: Baolin Wang baolin.wang@linux.alibaba.com Cc: Matthew Wilcox willy@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/huge_memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1233,6 +1233,7 @@ static void map_anon_folio_pmd(struct fo count_vm_event(THP_FAULT_ALLOC); count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC); count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC); + deferred_split_folio(folio, false); }
static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf) @@ -1273,7 +1274,6 @@ static vm_fault_t __do_huge_pmd_anonymou pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable); map_anon_folio_pmd(folio, vmf->pmd, vma, haddr); mm_inc_nr_ptes(vma->vm_mm); - deferred_split_folio(folio, false); spin_unlock(vmf->ptl); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Sakkinen jarkko@kernel.org
commit 6e9722e9a7bfe1bbad649937c811076acf86e1fd upstream.
'name_size' does not have any range checks, and it just directly indexes with TPM_ALG_ID, which could lead into memory corruption at worst.
Address the issue by only processing known values and returning -EINVAL for unrecognized values.
Make also 'tpm_buf_append_name' and 'tpm_buf_fill_hmac_session' fallible so that errors are detected before causing any spurious TPM traffic.
End also the authorization session on failure in both of the functions, as the session state would be then by definition corrupted.
Cc: stable@vger.kernel.org # v6.10+ Fixes: 1085b8276bb4 ("tpm: Add the rest of the session HMAC API") Reviewed-by: Jonathan McDowell noodles@meta.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm2-cmd.c | 23 ++++- drivers/char/tpm/tpm2-sessions.c | 132 ++++++++++++++++++++---------- include/linux/tpm.h | 13 +- security/keys/trusted-keys/trusted_tpm2.c | 29 +++++- 4 files changed, 142 insertions(+), 55 deletions(-)
--- a/drivers/char/tpm/tpm2-cmd.c +++ b/drivers/char/tpm/tpm2-cmd.c @@ -187,7 +187,11 @@ int tpm2_pcr_extend(struct tpm_chip *chi }
if (!disable_pcr_integrity) { - tpm_buf_append_name(chip, &buf, pcr_idx, NULL); + rc = tpm_buf_append_name(chip, &buf, pcr_idx, NULL); + if (rc) { + tpm_buf_destroy(&buf); + return rc; + } tpm_buf_append_hmac_session(chip, &buf, 0, NULL, 0); } else { tpm_buf_append_handle(chip, &buf, pcr_idx); @@ -202,8 +206,14 @@ int tpm2_pcr_extend(struct tpm_chip *chi chip->allocated_banks[i].digest_size); }
- if (!disable_pcr_integrity) - tpm_buf_fill_hmac_session(chip, &buf); + if (!disable_pcr_integrity) { + rc = tpm_buf_fill_hmac_session(chip, &buf); + if (rc) { + tpm_buf_destroy(&buf); + return rc; + } + } + rc = tpm_transmit_cmd(chip, &buf, 0, "attempting extend a PCR value"); if (!disable_pcr_integrity) rc = tpm_buf_check_hmac_response(chip, &buf, rc); @@ -261,7 +271,12 @@ int tpm2_get_random(struct tpm_chip *chi | TPM2_SA_CONTINUE_SESSION, NULL, 0); tpm_buf_append_u16(&buf, num_bytes); - tpm_buf_fill_hmac_session(chip, &buf); + err = tpm_buf_fill_hmac_session(chip, &buf); + if (err) { + tpm_buf_destroy(&buf); + return err; + } + err = tpm_transmit_cmd(chip, &buf, offsetof(struct tpm2_get_random_out, buffer), --- a/drivers/char/tpm/tpm2-sessions.c +++ b/drivers/char/tpm/tpm2-sessions.c @@ -144,16 +144,23 @@ struct tpm2_auth { /* * Name Size based on TPM algorithm (assumes no hash bigger than 255) */ -static u8 name_size(const u8 *name) +static int name_size(const u8 *name) { - static u8 size_map[] = { - [TPM_ALG_SHA1] = SHA1_DIGEST_SIZE, - [TPM_ALG_SHA256] = SHA256_DIGEST_SIZE, - [TPM_ALG_SHA384] = SHA384_DIGEST_SIZE, - [TPM_ALG_SHA512] = SHA512_DIGEST_SIZE, - }; - u16 alg = get_unaligned_be16(name); - return size_map[alg] + 2; + u16 hash_alg = get_unaligned_be16(name); + + switch (hash_alg) { + case TPM_ALG_SHA1: + return SHA1_DIGEST_SIZE + 2; + case TPM_ALG_SHA256: + return SHA256_DIGEST_SIZE + 2; + case TPM_ALG_SHA384: + return SHA384_DIGEST_SIZE + 2; + case TPM_ALG_SHA512: + return SHA512_DIGEST_SIZE + 2; + default: + pr_warn("tpm: unsupported name algorithm: 0x%04x\n", hash_alg); + return -EINVAL; + } }
static int tpm2_parse_read_public(char *name, struct tpm_buf *buf) @@ -161,6 +168,7 @@ static int tpm2_parse_read_public(char * struct tpm_header *head = (struct tpm_header *)buf->data; off_t offset = TPM_HEADER_SIZE; u32 tot_len = be32_to_cpu(head->length); + int ret; u32 val;
/* we're starting after the header so adjust the length */ @@ -173,8 +181,13 @@ static int tpm2_parse_read_public(char * offset += val; /* name */ val = tpm_buf_read_u16(buf, &offset); - if (val != name_size(&buf->data[offset])) + ret = name_size(&buf->data[offset]); + if (ret < 0) + return ret; + + if (val != ret) return -EINVAL; + memcpy(name, &buf->data[offset], val); /* forget the rest */ return 0; @@ -221,46 +234,72 @@ static int tpm2_read_public(struct tpm_c * As with most tpm_buf operations, success is assumed because failure * will be caused by an incorrect programming model and indicated by a * kernel message. + * + * Ends the authorization session on failure. */ -void tpm_buf_append_name(struct tpm_chip *chip, struct tpm_buf *buf, - u32 handle, u8 *name) +int tpm_buf_append_name(struct tpm_chip *chip, struct tpm_buf *buf, + u32 handle, u8 *name) { #ifdef CONFIG_TCG_TPM2_HMAC enum tpm2_mso_type mso = tpm2_handle_mso(handle); struct tpm2_auth *auth; int slot; + int ret; #endif
if (!tpm2_chip_auth(chip)) { tpm_buf_append_handle(chip, buf, handle); - return; + return 0; }
#ifdef CONFIG_TCG_TPM2_HMAC slot = (tpm_buf_length(buf) - TPM_HEADER_SIZE) / 4; if (slot >= AUTH_MAX_NAMES) { - dev_err(&chip->dev, "TPM: too many handles\n"); - return; + dev_err(&chip->dev, "too many handles\n"); + ret = -EIO; + goto err; } auth = chip->auth; - WARN(auth->session != tpm_buf_length(buf), - "name added in wrong place\n"); + if (auth->session != tpm_buf_length(buf)) { + dev_err(&chip->dev, "session state malformed"); + ret = -EIO; + goto err; + } tpm_buf_append_u32(buf, handle); auth->session += 4;
if (mso == TPM2_MSO_PERSISTENT || mso == TPM2_MSO_VOLATILE || mso == TPM2_MSO_NVRAM) { - if (!name) - tpm2_read_public(chip, handle, auth->name[slot]); + if (!name) { + ret = tpm2_read_public(chip, handle, auth->name[slot]); + if (ret) + goto err; + } } else { - if (name) - dev_err(&chip->dev, "TPM: Handle does not require name but one is specified\n"); + if (name) { + dev_err(&chip->dev, "handle 0x%08x does not use a name\n", + handle); + ret = -EIO; + goto err; + } }
auth->name_h[slot] = handle; - if (name) - memcpy(auth->name[slot], name, name_size(name)); + if (name) { + ret = name_size(name); + if (ret < 0) + goto err; + + memcpy(auth->name[slot], name, ret); + } +#endif + return 0; + +#ifdef CONFIG_TCG_TPM2_HMAC +err: + tpm2_end_auth_session(chip); + return tpm_ret_to_err(ret); #endif } EXPORT_SYMBOL_GPL(tpm_buf_append_name); @@ -533,11 +572,9 @@ static void tpm_buf_append_salt(struct t * encryption key and encrypts the first parameter of the command * buffer with it. * - * As with most tpm_buf operations, success is assumed because failure - * will be caused by an incorrect programming model and indicated by a - * kernel message. + * Ends the authorization session on failure. */ -void tpm_buf_fill_hmac_session(struct tpm_chip *chip, struct tpm_buf *buf) +int tpm_buf_fill_hmac_session(struct tpm_chip *chip, struct tpm_buf *buf) { u32 cc, handles, val; struct tpm2_auth *auth = chip->auth; @@ -549,9 +586,12 @@ void tpm_buf_fill_hmac_session(struct tp u8 cphash[SHA256_DIGEST_SIZE]; struct sha256_ctx sctx; struct hmac_sha256_ctx hctx; + int ret;
- if (!auth) - return; + if (!auth) { + ret = -EIO; + goto err; + }
/* save the command code in BE format */ auth->ordinal = head->ordinal; @@ -560,9 +600,11 @@ void tpm_buf_fill_hmac_session(struct tp
i = tpm2_find_cc(chip, cc); if (i < 0) { - dev_err(&chip->dev, "Command 0x%x not found in TPM\n", cc); - return; + dev_err(&chip->dev, "command 0x%08x not found\n", cc); + ret = -EIO; + goto err; } + attrs = chip->cc_attrs_tbl[i];
handles = (attrs >> TPM2_CC_ATTR_CHANDLES) & GENMASK(2, 0); @@ -576,9 +618,9 @@ void tpm_buf_fill_hmac_session(struct tp u32 handle = tpm_buf_read_u32(buf, &offset_s);
if (auth->name_h[i] != handle) { - dev_err(&chip->dev, "TPM: handle %d wrong for name\n", - i); - return; + dev_err(&chip->dev, "invalid handle 0x%08x\n", handle); + ret = -EIO; + goto err; } } /* point offset_s to the start of the sessions */ @@ -609,12 +651,14 @@ void tpm_buf_fill_hmac_session(struct tp offset_s += len; } if (offset_s != offset_p) { - dev_err(&chip->dev, "TPM session length is incorrect\n"); - return; + dev_err(&chip->dev, "session length is incorrect\n"); + ret = -EIO; + goto err; } if (!hmac) { - dev_err(&chip->dev, "TPM could not find HMAC session\n"); - return; + dev_err(&chip->dev, "could not find HMAC session\n"); + ret = -EIO; + goto err; }
/* encrypt before HMAC */ @@ -646,8 +690,11 @@ void tpm_buf_fill_hmac_session(struct tp if (mso == TPM2_MSO_PERSISTENT || mso == TPM2_MSO_VOLATILE || mso == TPM2_MSO_NVRAM) { - sha256_update(&sctx, auth->name[i], - name_size(auth->name[i])); + ret = name_size(auth->name[i]); + if (ret < 0) + goto err; + + sha256_update(&sctx, auth->name[i], ret); } else { __be32 h = cpu_to_be32(auth->name_h[i]);
@@ -668,6 +715,11 @@ void tpm_buf_fill_hmac_session(struct tp hmac_sha256_update(&hctx, auth->tpm_nonce, sizeof(auth->tpm_nonce)); hmac_sha256_update(&hctx, &auth->attrs, 1); hmac_sha256_final(&hctx, hmac); + return 0; + +err: + tpm2_end_auth_session(chip); + return ret; } EXPORT_SYMBOL(tpm_buf_fill_hmac_session);
--- a/include/linux/tpm.h +++ b/include/linux/tpm.h @@ -526,8 +526,8 @@ static inline struct tpm2_auth *tpm2_chi #endif }
-void tpm_buf_append_name(struct tpm_chip *chip, struct tpm_buf *buf, - u32 handle, u8 *name); +int tpm_buf_append_name(struct tpm_chip *chip, struct tpm_buf *buf, + u32 handle, u8 *name); void tpm_buf_append_hmac_session(struct tpm_chip *chip, struct tpm_buf *buf, u8 attributes, u8 *passphrase, int passphraselen); @@ -560,7 +560,7 @@ static inline void tpm_buf_append_hmac_s #ifdef CONFIG_TCG_TPM2_HMAC
int tpm2_start_auth_session(struct tpm_chip *chip); -void tpm_buf_fill_hmac_session(struct tpm_chip *chip, struct tpm_buf *buf); +int tpm_buf_fill_hmac_session(struct tpm_chip *chip, struct tpm_buf *buf); int tpm_buf_check_hmac_response(struct tpm_chip *chip, struct tpm_buf *buf, int rc); void tpm2_end_auth_session(struct tpm_chip *chip); @@ -574,10 +574,13 @@ static inline int tpm2_start_auth_sessio static inline void tpm2_end_auth_session(struct tpm_chip *chip) { } -static inline void tpm_buf_fill_hmac_session(struct tpm_chip *chip, - struct tpm_buf *buf) + +static inline int tpm_buf_fill_hmac_session(struct tpm_chip *chip, + struct tpm_buf *buf) { + return 0; } + static inline int tpm_buf_check_hmac_response(struct tpm_chip *chip, struct tpm_buf *buf, int rc) --- a/security/keys/trusted-keys/trusted_tpm2.c +++ b/security/keys/trusted-keys/trusted_tpm2.c @@ -283,7 +283,10 @@ int tpm2_seal_trusted(struct tpm_chip *c goto out_put; }
- tpm_buf_append_name(chip, &buf, options->keyhandle, NULL); + rc = tpm_buf_append_name(chip, &buf, options->keyhandle, NULL); + if (rc) + goto out; + tpm_buf_append_hmac_session(chip, &buf, TPM2_SA_DECRYPT, options->keyauth, TPM_DIGEST_SIZE);
@@ -331,7 +334,10 @@ int tpm2_seal_trusted(struct tpm_chip *c goto out; }
- tpm_buf_fill_hmac_session(chip, &buf); + rc = tpm_buf_fill_hmac_session(chip, &buf); + if (rc) + goto out; + rc = tpm_transmit_cmd(chip, &buf, 4, "sealing data"); rc = tpm_buf_check_hmac_response(chip, &buf, rc); if (rc) @@ -448,7 +454,10 @@ static int tpm2_load_cmd(struct tpm_chip return rc; }
- tpm_buf_append_name(chip, &buf, options->keyhandle, NULL); + rc = tpm_buf_append_name(chip, &buf, options->keyhandle, NULL); + if (rc) + goto out; + tpm_buf_append_hmac_session(chip, &buf, 0, options->keyauth, TPM_DIGEST_SIZE);
@@ -460,7 +469,10 @@ static int tpm2_load_cmd(struct tpm_chip goto out; }
- tpm_buf_fill_hmac_session(chip, &buf); + rc = tpm_buf_fill_hmac_session(chip, &buf); + if (rc) + goto out; + rc = tpm_transmit_cmd(chip, &buf, 4, "loading blob"); rc = tpm_buf_check_hmac_response(chip, &buf, rc); if (!rc) @@ -508,7 +520,9 @@ static int tpm2_unseal_cmd(struct tpm_ch return rc; }
- tpm_buf_append_name(chip, &buf, blob_handle, NULL); + rc = tpm_buf_append_name(chip, &buf, options->keyhandle, NULL); + if (rc) + goto out;
if (!options->policyhandle) { tpm_buf_append_hmac_session(chip, &buf, TPM2_SA_ENCRYPT, @@ -533,7 +547,10 @@ static int tpm2_unseal_cmd(struct tpm_ch NULL, 0); }
- tpm_buf_fill_hmac_session(chip, &buf); + rc = tpm_buf_fill_hmac_session(chip, &buf); + if (rc) + goto out; + rc = tpm_transmit_cmd(chip, &buf, 6, "unsealing"); rc = tpm_buf_check_hmac_response(chip, &buf, rc); if (rc > 0)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Sakkinen jarkko@kernel.org
commit bda1cbf73c6e241267c286427f2ed52b5735d872 upstream.
tpm2_read_public() has some rudimentary range checks but the function does not ensure that the response buffer has enough bytes for the full TPMT_HA payload.
Re-implement the function with necessary checks and validation, and return name and name size for all handle types back to the caller.
Cc: stable@vger.kernel.org # v6.10+ Fixes: d0a25bb961e6 ("tpm: Add HMAC session name/handle append") Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Reviewed-by: Jonathan McDowell noodles@meta.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm2-cmd.c | 3 + drivers/char/tpm/tpm2-sessions.c | 94 ++++++++++++++++++++------------------- 2 files changed, 53 insertions(+), 44 deletions(-)
--- a/drivers/char/tpm/tpm2-cmd.c +++ b/drivers/char/tpm/tpm2-cmd.c @@ -11,8 +11,11 @@ * used by the kernel internally. */
+#include "linux/dev_printk.h" +#include "linux/tpm.h" #include "tpm.h" #include <crypto/hash_info.h> +#include <linux/unaligned.h>
static bool disable_pcr_integrity; module_param(disable_pcr_integrity, bool, 0444); --- a/drivers/char/tpm/tpm2-sessions.c +++ b/drivers/char/tpm/tpm2-sessions.c @@ -163,53 +163,61 @@ static int name_size(const u8 *name) } }
-static int tpm2_parse_read_public(char *name, struct tpm_buf *buf) +static int tpm2_read_public(struct tpm_chip *chip, u32 handle, void *name) { - struct tpm_header *head = (struct tpm_header *)buf->data; + u32 mso = tpm2_handle_mso(handle); off_t offset = TPM_HEADER_SIZE; - u32 tot_len = be32_to_cpu(head->length); - int ret; - u32 val; - - /* we're starting after the header so adjust the length */ - tot_len -= TPM_HEADER_SIZE; - - /* skip public */ - val = tpm_buf_read_u16(buf, &offset); - if (val > tot_len) - return -EINVAL; - offset += val; - /* name */ - val = tpm_buf_read_u16(buf, &offset); - ret = name_size(&buf->data[offset]); - if (ret < 0) - return ret; - - if (val != ret) - return -EINVAL; - - memcpy(name, &buf->data[offset], val); - /* forget the rest */ - return 0; -} - -static int tpm2_read_public(struct tpm_chip *chip, u32 handle, char *name) -{ + int rc, name_size_alg; struct tpm_buf buf; - int rc; + + if (mso != TPM2_MSO_PERSISTENT && mso != TPM2_MSO_VOLATILE && + mso != TPM2_MSO_NVRAM) { + memcpy(name, &handle, sizeof(u32)); + return sizeof(u32); + }
rc = tpm_buf_init(&buf, TPM2_ST_NO_SESSIONS, TPM2_CC_READ_PUBLIC); if (rc) return rc;
tpm_buf_append_u32(&buf, handle); - rc = tpm_transmit_cmd(chip, &buf, 0, "read public"); - if (rc == TPM2_RC_SUCCESS) - rc = tpm2_parse_read_public(name, &buf);
- tpm_buf_destroy(&buf); + rc = tpm_transmit_cmd(chip, &buf, 0, "TPM2_ReadPublic"); + if (rc) { + tpm_buf_destroy(&buf); + return tpm_ret_to_err(rc); + } + + /* Skip TPMT_PUBLIC: */ + offset += tpm_buf_read_u16(&buf, &offset);
- return rc; + /* + * Ensure space for the length field of TPM2B_NAME and hashAlg field of + * TPMT_HA (the extra four bytes). + */ + if (offset + 4 > tpm_buf_length(&buf)) { + tpm_buf_destroy(&buf); + return -EIO; + } + + rc = tpm_buf_read_u16(&buf, &offset); + name_size_alg = name_size(&buf.data[offset]); + + if (name_size_alg < 0) + return name_size_alg; + + if (rc != name_size_alg) { + tpm_buf_destroy(&buf); + return -EIO; + } + + if (offset + rc > tpm_buf_length(&buf)) { + tpm_buf_destroy(&buf); + return -EIO; + } + + memcpy(name, &buf.data[offset], rc); + return name_size_alg; } #endif /* CONFIG_TCG_TPM2_HMAC */
@@ -243,6 +251,7 @@ int tpm_buf_append_name(struct tpm_chip #ifdef CONFIG_TCG_TPM2_HMAC enum tpm2_mso_type mso = tpm2_handle_mso(handle); struct tpm2_auth *auth; + u16 name_size_alg; int slot; int ret; #endif @@ -273,8 +282,10 @@ int tpm_buf_append_name(struct tpm_chip mso == TPM2_MSO_NVRAM) { if (!name) { ret = tpm2_read_public(chip, handle, auth->name[slot]); - if (ret) + if (ret < 0) goto err; + + name_size_alg = ret; } } else { if (name) { @@ -286,13 +297,8 @@ int tpm_buf_append_name(struct tpm_chip }
auth->name_h[slot] = handle; - if (name) { - ret = name_size(name); - if (ret < 0) - goto err; - - memcpy(auth->name[slot], name, ret); - } + if (name) + memcpy(auth->name[slot], name, name_size_alg); #endif return 0;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sourabh Jain sourabhjain@linux.ibm.com
commit adc15829fb73e402903b7030729263b6ee4a7232 upstream.
With the generic crashkernel reservation, the kernel emits the following warning on powerpc:
WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/mem.c:341 add_system_ram_resources+0xfc/0x180 Modules linked in: CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-auto-12607-g5472d60c129f #1 VOLUNTARY Hardware name: IBM,9080-HEX Power11 (architected) 0x820200 0xf000007 of:IBM,FW1110.01 (NH1110_069) hv:phyp pSeries NIP: c00000000201de3c LR: c00000000201de34 CTR: 0000000000000000 REGS: c000000127cef8a0 TRAP: 0700 Not tainted (6.17.0-auto-12607-g5472d60c129f) MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 84000840 XER: 20040010 CFAR: c00000000017eed0 IRQMASK: 0 GPR00: c00000000201de34 c000000127cefb40 c0000000016a8100 0000000000000001 GPR04: c00000012005aa00 0000000020000000 c000000002b705c8 0000000000000000 GPR08: 000000007fffffff fffffffffffffff0 c000000002db8100 000000011fffffff GPR12: c00000000201dd40 c000000002ff0000 c0000000000112bc 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 c0000000015a3808 GPR24: c00000000200468c c000000001699888 0000000000000106 c0000000020d1950 GPR28: c0000000014683f8 0000000081000200 c0000000015c1868 c000000002b9f710 NIP [c00000000201de3c] add_system_ram_resources+0xfc/0x180 LR [c00000000201de34] add_system_ram_resources+0xf4/0x180 Call Trace: add_system_ram_resources+0xf4/0x180 (unreliable) do_one_initcall+0x60/0x36c do_initcalls+0x120/0x220 kernel_init_freeable+0x23c/0x390 kernel_init+0x34/0x26c ret_from_kernel_user_thread+0x14/0x1c
This warning occurs due to a conflict between crashkernel and System RAM iomem resources.
The generic crashkernel reservation adds the crashkernel memory range to /proc/iomem during early initialization. Later, all memblock ranges are added to /proc/iomem as System RAM. If the crashkernel region overlaps with any memblock range, it causes a conflict while adding those memblock regions as iomem resources, triggering the above warning. The conflicting memblock regions are then omitted from /proc/iomem.
For example, if the following crashkernel region is added to /proc/iomem: 20000000-11fffffff : Crash kernel
then the following memblock regions System RAM regions fail to be inserted: 00000000-7fffffff : System RAM 80000000-257fffffff : System RAM
Fix this by not adding the crashkernel memory to /proc/iomem on powerpc. Introduce an architecture hook to let each architecture decide whether to export the crashkernel region to /proc/iomem.
For more info checkout commit c40dd2f766440 ("powerpc: Add System RAM to /proc/iomem") and commit bce074bdbc36 ("powerpc: insert System RAM resource to prevent crashkernel conflict")
Note: Before switching to the generic crashkernel reservation, powerpc never exported the crashkernel region to /proc/iomem.
Link: https://lkml.kernel.org/r/20251016142831.144515-1-sourabhjain@linux.ibm.com Fixes: e3185ee438c2 ("powerpc/crash: use generic crashkernel reservation"). Signed-off-by: Sourabh Jain sourabhjain@linux.ibm.com Reported-by: Venkat Rao Bagalkote venkat88@linux.ibm.com Closes: https://lore.kernel.org/all/90937fe0-2e76-4c82-b27e-7b8a7fe3ac69@linux.ibm.c... Tested-by: Venkat Rao Bagalkote venkat88@linux.ibm.com Cc: Baoquan he bhe@redhat.com Cc: Hari Bathini hbathini@linux.ibm.com Cc: Madhavan Srinivasan maddy@linux.ibm.com Cc: Mahesh Salgaonkar mahesh@linux.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Ritesh Harjani (IBM) ritesh.list@gmail.com Cc: Vivek Goyal vgoyal@redhat.com Cc: Dave Young dyoung@redhat.com Cc: Mike Rapoport rppt@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/crash_reserve.h | 8 ++++++++ include/linux/crash_reserve.h | 6 ++++++ kernel/crash_reserve.c | 3 +++ 3 files changed, 17 insertions(+)
diff --git a/arch/powerpc/include/asm/crash_reserve.h b/arch/powerpc/include/asm/crash_reserve.h index 6467ce29b1fa..d1b570ddbf98 100644 --- a/arch/powerpc/include/asm/crash_reserve.h +++ b/arch/powerpc/include/asm/crash_reserve.h @@ -5,4 +5,12 @@ /* crash kernel regions are Page size agliged */ #define CRASH_ALIGN PAGE_SIZE
+#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION +static inline bool arch_add_crash_res_to_iomem(void) +{ + return false; +} +#define arch_add_crash_res_to_iomem arch_add_crash_res_to_iomem +#endif + #endif /* _ASM_POWERPC_CRASH_RESERVE_H */ diff --git a/include/linux/crash_reserve.h b/include/linux/crash_reserve.h index 7b44b41d0a20..f0dc03d94ca2 100644 --- a/include/linux/crash_reserve.h +++ b/include/linux/crash_reserve.h @@ -32,6 +32,12 @@ int __init parse_crashkernel(char *cmdline, unsigned long long system_ram, void __init reserve_crashkernel_cma(unsigned long long cma_size);
#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION +#ifndef arch_add_crash_res_to_iomem +static inline bool arch_add_crash_res_to_iomem(void) +{ + return true; +} +#endif #ifndef DEFAULT_CRASH_KERNEL_LOW_SIZE #define DEFAULT_CRASH_KERNEL_LOW_SIZE (128UL << 20) #endif diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c index 87bf4d41eabb..62e60e0223cf 100644 --- a/kernel/crash_reserve.c +++ b/kernel/crash_reserve.c @@ -524,6 +524,9 @@ void __init reserve_crashkernel_cma(unsigned long long cma_size) #ifndef HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY static __init int insert_crashkernel_resources(void) { + if (!arch_add_crash_res_to_iomem()) + return 0; + if (crashk_res.start < crashk_res.end) insert_resource(&iomem_resource, &crashk_res);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede johannes.goede@oss.qualcomm.com
commit 31b931bebd11a0f00967114f62c8c38952f483e5 upstream.
After commit a50f7456f853 ("dma-mapping: Allow use of DMA_BIT_MASK(64) in global scope"), the DMA_BIT_MASK() macro is broken when passed non trivial statements for the value of 'n'. This is caused by the new version missing parenthesis around 'n' when evaluating 'n'.
One example of this breakage is the IPU6 driver now crashing due to it getting DMA-addresses with address bit 32 set even though it has tried to set a 32 bit DMA mask.
The IPU6 CSI2 engine has a DMA mask of either 31 or 32 bits depending on if it is in secure mode or not and it sets this masks like this:
mmu_info->aperture_end = (dma_addr_t)DMA_BIT_MASK(isp->secure_mode ? IPU6_MMU_ADDR_BITS : IPU6_MMU_ADDR_BITS_NON_SECURE);
So the 'n' argument here is "isp->secure_mode ? IPU6_MMU_ADDR_BITS : IPU6_MMU_ADDR_BITS_NON_SECURE" which gets expanded into:
isp->secure_mode ? IPU6_MMU_ADDR_BITS : IPU6_MMU_ADDR_BITS_NON_SECURE - 1
With the -1 only being applied in the non secure case, causing the secure mode mask to be one 1 bit too large.
Fixes: a50f7456f853 ("dma-mapping: Allow use of DMA_BIT_MASK(64) in global scope") Cc: Sakari Ailus sakari.ailus@linux.intel.com Cc: James Clark james.clark@linaro.org Cc: Nathan Chancellor nathan@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede johannes.goede@oss.qualcomm.com Reviewed-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Link: https://lore.kernel.org/r/20251207184756.97904-1-johannes.goede@oss.qualcomm... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/dma-mapping.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 2ceda49c609f..aa36a0d1d9df 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -90,7 +90,7 @@ */ #define DMA_MAPPING_ERROR (~(dma_addr_t)0)
-#define DMA_BIT_MASK(n) GENMASK_ULL(n - 1, 0) +#define DMA_BIT_MASK(n) GENMASK_ULL((n) - 1, 0)
struct dma_iova_state { dma_addr_t addr;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harry Yoo harry.yoo@oracle.com
commit 0f35040de59371ad542b915d7b91176c9910dadc upstream.
Currently, kvfree_rcu_barrier() flushes RCU sheaves across all slab caches when a cache is destroyed. This is unnecessary; only the RCU sheaves belonging to the cache being destroyed need to be flushed.
As suggested by Vlastimil Babka, introduce a weaker form of kvfree_rcu_barrier() that operates on a specific slab cache.
Factor out flush_rcu_sheaves_on_cache() from flush_all_rcu_sheaves() and call it from flush_all_rcu_sheaves() and kvfree_rcu_barrier_on_cache().
Call kvfree_rcu_barrier_on_cache() instead of kvfree_rcu_barrier() on cache destruction.
The performance benefit is evaluated on a 12 core 24 threads AMD Ryzen 5900X machine (1 socket), by loading slub_kunit module.
Before: Total calls: 19 Average latency (us): 18127 Total time (us): 344414
After: Total calls: 19 Average latency (us): 10066 Total time (us): 191264
Two performance regression have been reported: - stress module loader test's runtime increases by 50-60% (Daniel) - internal graphics test's runtime on Tegra234 increases by 35% (Jon)
They are fixed by this change.
Suggested-by: Vlastimil Babka vbabka@suse.cz Fixes: ec66e0d59952 ("slab: add sheaf support for batching kfree_rcu() operations") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-mm/1bda09da-93be-4737-aef0-d47f8c5c9301@suse.c... Reported-and-tested-by: Daniel Gomez da.gomez@samsung.com Closes: https://lore.kernel.org/linux-mm/0406562e-2066-4cf8-9902-b2b0616dd742@kernel... Reported-and-tested-by: Jon Hunter jonathanh@nvidia.com Closes: https://lore.kernel.org/linux-mm/e988eff6-1287-425e-a06c-805af5bbf262@nvidia... Signed-off-by: Harry Yoo harry.yoo@oracle.com Link: https://patch.msgid.link/20251207154148.117723-1-harry.yoo@oracle.com Signed-off-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/slab.h | 7 ++++++ mm/slab.h | 1 mm/slab_common.c | 52 +++++++++++++++++++++++++++++++++------------- mm/slub.c | 57 +++++++++++++++++++++++++++------------------------ 4 files changed, 76 insertions(+), 41 deletions(-)
--- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -1150,10 +1150,17 @@ static inline void kvfree_rcu_barrier(vo rcu_barrier(); }
+static inline void kvfree_rcu_barrier_on_cache(struct kmem_cache *s) +{ + rcu_barrier(); +} + static inline void kfree_rcu_scheduler_running(void) { } #else void kvfree_rcu_barrier(void);
+void kvfree_rcu_barrier_on_cache(struct kmem_cache *s); + void kfree_rcu_scheduler_running(void); #endif
--- a/mm/slab.h +++ b/mm/slab.h @@ -442,6 +442,7 @@ static inline bool is_kmalloc_normal(str
bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj); void flush_all_rcu_sheaves(void); +void flush_rcu_sheaves_on_cache(struct kmem_cache *s);
#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \ SLAB_CACHE_DMA32 | SLAB_PANIC | \ --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -492,7 +492,7 @@ void kmem_cache_destroy(struct kmem_cach return;
/* in-flight kfree_rcu()'s may include objects from our cache */ - kvfree_rcu_barrier(); + kvfree_rcu_barrier_on_cache(s);
if (IS_ENABLED(CONFIG_SLUB_RCU_DEBUG) && (s->flags & SLAB_TYPESAFE_BY_RCU)) { @@ -2039,25 +2039,13 @@ unlock_return: } EXPORT_SYMBOL_GPL(kvfree_call_rcu);
-/** - * kvfree_rcu_barrier - Wait until all in-flight kvfree_rcu() complete. - * - * Note that a single argument of kvfree_rcu() call has a slow path that - * triggers synchronize_rcu() following by freeing a pointer. It is done - * before the return from the function. Therefore for any single-argument - * call that will result in a kfree() to a cache that is to be destroyed - * during module exit, it is developer's responsibility to ensure that all - * such calls have returned before the call to kmem_cache_destroy(). - */ -void kvfree_rcu_barrier(void) +static inline void __kvfree_rcu_barrier(void) { struct kfree_rcu_cpu_work *krwp; struct kfree_rcu_cpu *krcp; bool queued; int i, cpu;
- flush_all_rcu_sheaves(); - /* * Firstly we detach objects and queue them over an RCU-batch * for all CPUs. Finally queued works are flushed for each CPU. @@ -2119,8 +2107,43 @@ void kvfree_rcu_barrier(void) } } } + +/** + * kvfree_rcu_barrier - Wait until all in-flight kvfree_rcu() complete. + * + * Note that a single argument of kvfree_rcu() call has a slow path that + * triggers synchronize_rcu() following by freeing a pointer. It is done + * before the return from the function. Therefore for any single-argument + * call that will result in a kfree() to a cache that is to be destroyed + * during module exit, it is developer's responsibility to ensure that all + * such calls have returned before the call to kmem_cache_destroy(). + */ +void kvfree_rcu_barrier(void) +{ + flush_all_rcu_sheaves(); + __kvfree_rcu_barrier(); +} EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
+/** + * kvfree_rcu_barrier_on_cache - Wait for in-flight kvfree_rcu() calls on a + * specific slab cache. + * @s: slab cache to wait for + * + * See the description of kvfree_rcu_barrier() for details. + */ +void kvfree_rcu_barrier_on_cache(struct kmem_cache *s) +{ + if (s->cpu_sheaves) + flush_rcu_sheaves_on_cache(s); + /* + * TODO: Introduce a version of __kvfree_rcu_barrier() that works + * on a specific slab cache. + */ + __kvfree_rcu_barrier(); +} +EXPORT_SYMBOL_GPL(kvfree_rcu_barrier_on_cache); + static unsigned long kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc) { @@ -2216,4 +2239,3 @@ void __init kvfree_rcu_init(void) }
#endif /* CONFIG_KVFREE_RCU_BATCHED */ - --- a/mm/slub.c +++ b/mm/slub.c @@ -4118,42 +4118,47 @@ static void flush_rcu_sheaf(struct work_
/* needed for kvfree_rcu_barrier() */ -void flush_all_rcu_sheaves(void) +void flush_rcu_sheaves_on_cache(struct kmem_cache *s) { struct slub_flush_work *sfw; - struct kmem_cache *s; unsigned int cpu;
+ mutex_lock(&flush_lock); + + for_each_online_cpu(cpu) { + sfw = &per_cpu(slub_flush, cpu); + + /* + * we don't check if rcu_free sheaf exists - racing + * __kfree_rcu_sheaf() might have just removed it. + * by executing flush_rcu_sheaf() on the cpu we make + * sure the __kfree_rcu_sheaf() finished its call_rcu() + */ + + INIT_WORK(&sfw->work, flush_rcu_sheaf); + sfw->s = s; + queue_work_on(cpu, flushwq, &sfw->work); + } + + for_each_online_cpu(cpu) { + sfw = &per_cpu(slub_flush, cpu); + flush_work(&sfw->work); + } + + mutex_unlock(&flush_lock); +} + +void flush_all_rcu_sheaves(void) +{ + struct kmem_cache *s; + cpus_read_lock(); mutex_lock(&slab_mutex);
list_for_each_entry(s, &slab_caches, list) { if (!s->cpu_sheaves) continue; - - mutex_lock(&flush_lock); - - for_each_online_cpu(cpu) { - sfw = &per_cpu(slub_flush, cpu); - - /* - * we don't check if rcu_free sheaf exists - racing - * __kfree_rcu_sheaf() might have just removed it. - * by executing flush_rcu_sheaf() on the cpu we make - * sure the __kfree_rcu_sheaf() finished its call_rcu() - */ - - INIT_WORK(&sfw->work, flush_rcu_sheaf); - sfw->s = s; - queue_work_on(cpu, flushwq, &sfw->work); - } - - for_each_online_cpu(cpu) { - sfw = &per_cpu(slub_flush, cpu); - flush_work(&sfw->work); - } - - mutex_unlock(&flush_lock); + flush_rcu_sheaves_on_cache(s); }
mutex_unlock(&slab_mutex);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit 0ace3297a7301911e52d8195cb1006414897c859 upstream.
Before this patch, the kernel was saving any flags set by the userspace, even unknown ones. This doesn't cause critical issues because the kernel is only looking at specific ones. But on the other hand, endpoints dumps could tell the userspace some recent flags seem to be supported on older kernel versions.
Instead, ignore all unknown flags when parsing them. By doing that, the userspace can continue to set unsupported flags, but it has a way to verify what is supported by the kernel.
Note that it sounds better to continue accepting unsupported flags not to change the behaviour, but also that eases things on the userspace side by adding "optional" endpoint types only supported by newer kernel versions without having to deal with the different kernel versions.
A note for the backports: there will be conflicts in mptcp.h on older versions not having the mentioned flags, the new line should still be added last, and the '5' needs to be adapted to have the same value as the last entry.
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-1-9e4781a... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/uapi/linux/mptcp.h | 1 + net/mptcp/pm_netlink.c | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-)
--- a/include/uapi/linux/mptcp.h +++ b/include/uapi/linux/mptcp.h @@ -40,6 +40,7 @@ #define MPTCP_PM_ADDR_FLAG_FULLMESH _BITUL(3) #define MPTCP_PM_ADDR_FLAG_IMPLICIT _BITUL(4) #define MPTCP_PM_ADDR_FLAG_LAMINAR _BITUL(5) +#define MPTCP_PM_ADDR_FLAGS_MASK GENMASK(5, 0)
struct mptcp_info { __u8 mptcpi_subflows; --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -119,7 +119,8 @@ int mptcp_pm_parse_entry(struct nlattr * }
if (tb[MPTCP_PM_ADDR_ATTR_FLAGS]) - entry->flags = nla_get_u32(tb[MPTCP_PM_ADDR_ATTR_FLAGS]); + entry->flags = nla_get_u32(tb[MPTCP_PM_ADDR_ATTR_FLAGS]) & + MPTCP_PM_ADDR_FLAGS_MASK;
if (tb[MPTCP_PM_ADDR_ATTR_PORT]) entry->addr.port = htons(nla_get_u16(tb[MPTCP_PM_ADDR_ATTR_PORT]));
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit 29f4801e9c8dfd12bdcb33b61a6ac479c7162bd7 upstream.
This validates the previous commit: the userspace can set unknown flags -- the 7th bit is currently unused -- without errors, but only the supported ones are printed in the endpoints dumps.
The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID.
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-2-9e4781a... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/pm_netlink.sh | 4 ++++ tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 11 +++++++++++ 2 files changed, 15 insertions(+)
--- a/tools/testing/selftests/net/mptcp/pm_netlink.sh +++ b/tools/testing/selftests/net/mptcp/pm_netlink.sh @@ -192,6 +192,10 @@ check "show_endpoints" \ flush_endpoint check "show_endpoints" "" "flush addrs"
+add_endpoint 10.0.1.1 flags unknown +check "show_endpoints" "$(format_endpoints "1,10.0.1.1")" "ignore unknown flags" +flush_endpoint + set_limits 9 1 2>/dev/null check "get_limits" "${default_limits}" "rcv addrs above hard limit"
--- a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c +++ b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c @@ -24,6 +24,8 @@ #define IPPROTO_MPTCP 262 #endif
+#define MPTCP_PM_ADDR_FLAG_UNKNOWN _BITUL(7) + static void syntax(char *argv[]) { fprintf(stderr, "%s add|ann|rem|csf|dsf|get|set|del|flush|dump|events|listen|accept [<args>]\n", argv[0]); @@ -836,6 +838,8 @@ int add_addr(int fd, int pm_family, int flags |= MPTCP_PM_ADDR_FLAG_BACKUP; else if (!strcmp(tok, "fullmesh")) flags |= MPTCP_PM_ADDR_FLAG_FULLMESH; + else if (!strcmp(tok, "unknown")) + flags |= MPTCP_PM_ADDR_FLAG_UNKNOWN; else error(1, errno, "unknown flag %s", argv[arg]); @@ -1047,6 +1051,13 @@ static void print_addr(struct rtattr *at if (flags) printf(","); } + + if (flags & MPTCP_PM_ADDR_FLAG_UNKNOWN) { + printf("unknown"); + flags &= ~MPTCP_PM_ADDR_FLAG_UNKNOWN; + if (flags) + printf(","); + }
/* bump unknown flags, if any */ if (flags)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit 2ea6190f42d0416a4310e60a7fcb0b49fcbbd4fb upstream.
The MPTCP protocol usually schedule the retransmission timer only when there is some chances for such retransmissions to happen.
With a notable exception: __mptcp_push_pending() currently schedule such timer unconditionally, potentially leading to unnecessary rtx timer expiration.
The issue is present since the blamed commit below but become easily reproducible after commit 27b0e701d387 ("mptcp: drop bogus optimization in __mptcp_check_push()")
Fixes: 33d41c9cd74c ("mptcp: more accurate timeout") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-3-9e4781a... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1597,7 +1597,7 @@ void __mptcp_push_pending(struct sock *s struct mptcp_sendmsg_info info = { .flags = flags, }; - bool do_check_data_fin = false; + bool copied = false; int push_count = 1;
while (mptcp_send_head(sk) && (push_count > 0)) { @@ -1639,7 +1639,7 @@ void __mptcp_push_pending(struct sock *s push_count--; continue; } - do_check_data_fin = true; + copied = true; } } } @@ -1648,11 +1648,14 @@ void __mptcp_push_pending(struct sock *s if (ssk) mptcp_push_release(ssk, &info);
- /* ensure the rtx timer is running */ - if (!mptcp_rtx_timer_pending(sk)) - mptcp_reset_rtx_timer(sk); - if (do_check_data_fin) + /* Avoid scheduling the rtx timer if no data has been pushed; the timer + * will be updated on positive acks by __mptcp_cleanup_una(). + */ + if (copied) { + if (!mptcp_rtx_timer_pending(sk)) + mptcp_reset_rtx_timer(sk); mptcp_check_send_data_fin(sk); + } }
static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool first)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit ffb8c27b0539dd90262d1021488e7817fae57c42 upstream.
Jakub reported an MPTCP deadlock at fallback time:
WARNING: possible recursive locking detected 6.18.0-rc7-virtme #1 Not tainted -------------------------------------------- mptcp_connect/20858 is trying to acquire lock: ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_try_fallback+0xd8/0x280
but task is already holding lock: ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_retrans+0x352/0xaa0
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(&msk->fallback_lock); lock(&msk->fallback_lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by mptcp_connect/20858: #0: ff1100001da18290 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg+0x114/0x1bc0 #1: ff1100001db40fd0 (k-sk_lock-AF_INET#2){+.+.}-{0:0}, at: __mptcp_retrans+0x2cb/0xaa0 #2: ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_retrans+0x352/0xaa0
stack backtrace: CPU: 0 UID: 0 PID: 20858 Comm: mptcp_connect Not tainted 6.18.0-rc7-virtme #1 PREEMPT(full) Hardware name: Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0x6f/0xa0 print_deadlock_bug.cold+0xc0/0xcd validate_chain+0x2ff/0x5f0 __lock_acquire+0x34c/0x740 lock_acquire.part.0+0xbc/0x260 _raw_spin_lock_bh+0x38/0x50 __mptcp_try_fallback+0xd8/0x280 mptcp_sendmsg_frag+0x16c2/0x3050 __mptcp_retrans+0x421/0xaa0 mptcp_release_cb+0x5aa/0xa70 release_sock+0xab/0x1d0 mptcp_sendmsg+0xd5b/0x1bc0 sock_write_iter+0x281/0x4d0 new_sync_write+0x3c5/0x6f0 vfs_write+0x65e/0xbb0 ksys_write+0x17e/0x200 do_syscall_64+0xbb/0xfd0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fa5627cbc5e Code: 4d 89 d8 e8 14 bd 00 00 4c 8b 5d f8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f 05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 13 ff ff ff 0f 1f 00 f3 0f 1e fa RSP: 002b:00007fff1fe14700 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa5627cbc5e RDX: 0000000000001f9c RSI: 00007fff1fe16984 RDI: 0000000000000005 RBP: 00007fff1fe14710 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff1fe16920 R13: 0000000000002000 R14: 0000000000001f9c R15: 0000000000001f9c
The packet scheduler could attempt a reinjection after receiving an MP_FAIL and before the infinite map has been transmitted, causing a deadlock since MPTCP needs to do the reinjection atomically from WRT fallback.
Address the issue explicitly avoiding the reinjection in the critical scenario. Note that this is the only fallback critical section that could potentially send packets and hit the double-lock.
Reported-by: Jakub Kicinski kuba@kernel.org Closes: https://netdev-ctrl.bots.linux.dev/logs/vmksft/mptcp-dbg/results/412720/1-mp... Fixes: f8a1d9b18c5e ("mptcp: make fallback action and fallback decision atomic") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-4-9e4781a... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2693,10 +2693,13 @@ static void __mptcp_retrans(struct sock
/* * make the whole retrans decision, xmit, disallow - * fallback atomic + * fallback atomic, note that we can't retrans even + * when an infinite fallback is in progress, i.e. new + * subflows are disallowed. */ spin_lock_bh(&msk->fallback_lock); - if (__mptcp_check_fallback(msk)) { + if (__mptcp_check_fallback(msk) || + !msk->allow_subflows) { spin_unlock_bh(&msk->fallback_lock); release_sock(ssk); goto clear_scheduled;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Changcheng chenchangcheng@kylinos.cn
commit 0831269b5f71594882accfceb02638124f88955d upstream.
We cannot determine which models require the NO_ATA_1X and IGNORE_RESIDUE quirks aside from the EL-R12 optical drive device.
Fixes: 955a48a5353f ("usb: usb-storage: No additional quirks need to be added to the EL-R12 optical drive.") Signed-off-by: Chen Changcheng chenchangcheng@kylinos.cn Link: https://patch.msgid.link/20251218012318.15978-1-chenchangcheng@kylinos.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/storage/unusual_uas.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/storage/unusual_uas.h +++ b/drivers/usb/storage/unusual_uas.h @@ -98,7 +98,7 @@ UNUSUAL_DEV(0x125f, 0xa94a, 0x0160, 0x01 US_FL_NO_ATA_1X),
/* Reported-by: Benjamin Tissoires benjamin.tissoires@redhat.com */ -UNUSUAL_DEV(0x13fd, 0x3940, 0x0309, 0x0309, +UNUSUAL_DEV(0x13fd, 0x3940, 0x0000, 0x0309, "Initio Corporation", "INIC-3069", USB_SC_DEVICE, USB_PR_DEVICE, NULL,
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeongjun Park aha310510@gmail.com
commit b91e6aafe8d356086cc621bc03e35ba2299e4788 upstream.
rlen value is a user-controlled value, but dtv5100_i2c_msg() does not check the size of the rlen value. Therefore, if it is set to a value larger than sizeof(st->data), an out-of-bounds vuln occurs for st->data.
Therefore, we need to add proper range checking to prevent this vuln.
Fixes: 60688d5e6e6e ("V4L/DVB (8735): dtv5100: replace dummy frontend by zl10353") Cc: stable@vger.kernel.org Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/usb/dvb-usb/dtv5100.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/media/usb/dvb-usb/dtv5100.c +++ b/drivers/media/usb/dvb-usb/dtv5100.c @@ -55,6 +55,11 @@ static int dtv5100_i2c_msg(struct dvb_us } index = (addr << 8) + wbuf[0];
+ if (rlen > sizeof(st->data)) { + warn("rlen = %x is too big!\n", rlen); + return -EINVAL; + } + memcpy(st->data, rbuf, rlen); msleep(1); /* avoid I2C errors */ return usb_control_msg(d->udev, pipe, request,
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Colin Ian King colin.i.king@gmail.com
commit be440980eace19c035a0745fd6b6e42707bc4f49 upstream.
The pvr2_trace message is reporting an error about control read transfers, however it is using the incorrect variable write_len instead of read_lean. Fix this by using the correct variable read_len.
Fixes: d855497edbfb ("V4L/DVB (4228a): pvrusb2 to kernel 2.6.18") Cc: stable@vger.kernel.org Signed-off-by: Colin Ian King colin.i.king@gmail.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/usb/pvrusb2/pvrusb2-hdw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/usb/pvrusb2/pvrusb2-hdw.c +++ b/drivers/media/usb/pvrusb2/pvrusb2-hdw.c @@ -3622,7 +3622,7 @@ static int pvr2_send_request_ex(struct p pvr2_trace( PVR2_TRACE_ERROR_LEGS, "Attempted to execute %d byte control-read transfer (limit=%d)", - write_len,PVR2_CTL_BUFFSIZE); + read_len, PVR2_CTL_BUFFSIZE); return -EINVAL; } if ((!write_len) && (!read_len)) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 356d1924b9a6bc2164ce2bf1fad147b0c37ae085 upstream.
Platform drivers can be probed after their init sections have been discarded (e.g. on probe deferral or manual rebind through sysfs) so the probe function and match table must not live in init.
Fixes: 783f6d3dcf35 ("phy: bcm63xx-usbh: Add BCM63xx USBH driver") Cc: stable@vger.kernel.org # 5.9 Cc: Álvaro Fernández Rojas noltari@gmail.com Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patch.msgid.link/20251017054537.6884-1-johan@kernel.org Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/phy/broadcom/phy-bcm63xx-usbh.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/phy/broadcom/phy-bcm63xx-usbh.c +++ b/drivers/phy/broadcom/phy-bcm63xx-usbh.c @@ -375,7 +375,7 @@ static struct phy *bcm63xx_usbh_phy_xlat return of_phy_simple_xlate(dev, args); }
-static int __init bcm63xx_usbh_phy_probe(struct platform_device *pdev) +static int bcm63xx_usbh_phy_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; struct bcm63xx_usbh_phy *usbh; @@ -432,7 +432,7 @@ static int __init bcm63xx_usbh_phy_probe return 0; }
-static const struct of_device_id bcm63xx_usbh_phy_ids[] __initconst = { +static const struct of_device_id bcm63xx_usbh_phy_ids[] = { { .compatible = "brcm,bcm6318-usbh-phy", .data = &usbh_bcm6318 }, { .compatible = "brcm,bcm6328-usbh-phy", .data = &usbh_bcm6328 }, { .compatible = "brcm,bcm6358-usbh-phy", .data = &usbh_bcm6358 }, @@ -443,7 +443,7 @@ static const struct of_device_id bcm63xx }; MODULE_DEVICE_TABLE(of, bcm63xx_usbh_phy_ids);
-static struct platform_driver bcm63xx_usbh_phy_driver __refdata = { +static struct platform_driver bcm63xx_usbh_phy_driver = { .driver = { .name = "bcm63xx-usbh-phy", .of_match_table = bcm63xx_usbh_phy_ids,
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit b4c61e542faf8c9131d69ecfc3ad6de96d1b2ab8 upstream.
Make sure to drop the reference taken when looking up the PHY I2C device during probe on probe failure (e.g. probe deferral) and on driver unbind.
Fixes: 73108aa90cbf ("USB: ohci-nxp: Use isp1301 driver") Cc: stable@vger.kernel.org # 3.5 Reported-by: Ma Ke make24@iscas.ac.cn Link: https://lore.kernel.org/lkml/20251117013428.21840-1-make24@iscas.ac.cn/ Signed-off-by: Johan Hovold johan@kernel.org Acked-by: Alan Stern stern@rowland.harvard.edu Reviewed-by: Vladimir Zapolskiy vz@mleia.com Link: https://patch.msgid.link/20251218153519.19453-4-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/ohci-nxp.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/host/ohci-nxp.c +++ b/drivers/usb/host/ohci-nxp.c @@ -223,6 +223,7 @@ static int ohci_hcd_nxp_probe(struct pla fail_resource: usb_put_hcd(hcd); fail_disable: + put_device(&isp1301_i2c_client->dev); isp1301_i2c_client = NULL; return ret; } @@ -234,6 +235,7 @@ static void ohci_hcd_nxp_remove(struct p usb_remove_hcd(hcd); ohci_nxp_stop_hc(); usb_put_hcd(hcd); + put_device(&isp1301_i2c_client->dev); isp1301_i2c_client = NULL; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit d14cd998e67ba8f1cca52a260a1ce1a60954fd8b upstream.
Selecting DRM_AUX_HPD_BRIDGE is not possible from a built-in driver when CONFIG_DRM=m:
WARNING: unmet direct dependencies detected for DRM_AUX_HPD_BRIDGE Depends on [m]: HAS_IOMEM [=y] && DRM [=m] && DRM_BRIDGE [=y] && OF [=y] Selected by [y]: - UCSI_HUAWEI_GAOKUN [=y] && USB_SUPPORT [=y] && TYPEC [=y] && TYPEC_UCSI [=y] && EC_HUAWEI_GAOKUN [=y] && DRM_BRIDGE [=y] && OF [=y]
Add the same dependency we have in similar drivers to work around this.
Fixes: 00327d7f2c8c ("usb: typec: ucsi: add Huawei Matebook E Go ucsi driver") Cc: stable stable@kernel.org Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://patch.msgid.link/20251204101111.1035975-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/ucsi/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/typec/ucsi/Kconfig +++ b/drivers/usb/typec/ucsi/Kconfig @@ -96,6 +96,7 @@ config UCSI_LENOVO_YOGA_C630 config UCSI_HUAWEI_GAOKUN tristate "UCSI Interface Driver for Huawei Matebook E Go" depends on EC_HUAWEI_GAOKUN + depends on DRM || !DRM select DRM_AUX_HPD_BRIDGE if DRM_BRIDGE && OF help This driver enables UCSI support on the Huawei Matebook E Go tablet,
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haoxiang Li lihaoxiang@isrc.iscas.ac.cn
commit 128bb7fab342546352603bde8b49ff54e3af0529 upstream.
In error paths, call typec_altmode_put_plug() to drop the device reference obtained by typec_altmode_get_plug().
Fixes: 71ba4fe56656 ("usb: typec: altmodes/displayport: add SOP' support") Cc: stable stable@kernel.org Signed-off-by: Haoxiang Li lihaoxiang@isrc.iscas.ac.cn Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://patch.msgid.link/20251206070445.190770-1-lihaoxiang@isrc.iscas.ac.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/altmodes/displayport.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/usb/typec/altmodes/displayport.c +++ b/drivers/usb/typec/altmodes/displayport.c @@ -764,12 +764,16 @@ int dp_altmode_probe(struct typec_altmod if (!(DP_CAP_PIN_ASSIGN_DFP_D(port->vdo) & DP_CAP_PIN_ASSIGN_UFP_D(alt->vdo)) && !(DP_CAP_PIN_ASSIGN_UFP_D(port->vdo) & - DP_CAP_PIN_ASSIGN_DFP_D(alt->vdo))) + DP_CAP_PIN_ASSIGN_DFP_D(alt->vdo))) { + typec_altmode_put_plug(plug); return -ENODEV; + }
dp = devm_kzalloc(&alt->dev, sizeof(*dp), GFP_KERNEL); - if (!dp) + if (!dp) { + typec_altmode_put_plug(plug); return -ENOMEM; + }
INIT_WORK(&dp->work, dp_altmode_work); mutex_init(&dp->lock);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
commit c84117912bddd9e5d87e68daf182410c98181407 upstream.
lpc32xx_udc_probe() acquires an i2c_client reference through isp1301_get_client() but fails to release it in both error handling paths and the normal removal path. This could result in a reference count leak for the I2C device, preventing proper cleanup and potentially leading to resource exhaustion. Add put_device() to release the reference in the probe failure path and in the remove function.
Calling path: isp1301_get_client() -> of_find_i2c_device_by_node() -> i2c_find_device_by_fwnode(). As comments of i2c_find_device_by_fwnode() says, 'The user must call put_device(&client->dev) once done with the i2c client.'
Found by code review.
Cc: stable stable@kernel.org Fixes: 24a28e428351 ("USB: gadget driver for LPC32xx") Signed-off-by: Ma Ke make24@iscas.ac.cn Link: https://patch.msgid.link/20251215020931.15324-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/lpc32xx_udc.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-)
--- a/drivers/usb/gadget/udc/lpc32xx_udc.c +++ b/drivers/usb/gadget/udc/lpc32xx_udc.c @@ -3020,7 +3020,7 @@ static int lpc32xx_udc_probe(struct plat pdev->dev.dma_mask = &lpc32xx_usbd_dmamask; retval = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)); if (retval) - return retval; + goto i2c_fail;
udc->board = &lpc32xx_usbddata;
@@ -3038,28 +3038,32 @@ static int lpc32xx_udc_probe(struct plat /* Get IRQs */ for (i = 0; i < 4; i++) { udc->udp_irq[i] = platform_get_irq(pdev, i); - if (udc->udp_irq[i] < 0) - return udc->udp_irq[i]; + if (udc->udp_irq[i] < 0) { + retval = udc->udp_irq[i]; + goto i2c_fail; + } }
udc->udp_baseaddr = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(udc->udp_baseaddr)) { dev_err(udc->dev, "IO map failure\n"); - return PTR_ERR(udc->udp_baseaddr); + retval = PTR_ERR(udc->udp_baseaddr); + goto i2c_fail; }
/* Get USB device clock */ udc->usb_slv_clk = devm_clk_get(&pdev->dev, NULL); if (IS_ERR(udc->usb_slv_clk)) { dev_err(udc->dev, "failed to acquire USB device clock\n"); - return PTR_ERR(udc->usb_slv_clk); + retval = PTR_ERR(udc->usb_slv_clk); + goto i2c_fail; }
/* Enable USB device clock */ retval = clk_prepare_enable(udc->usb_slv_clk); if (retval < 0) { dev_err(udc->dev, "failed to start USB device clock\n"); - return retval; + goto i2c_fail; }
/* Setup deferred workqueue data */ @@ -3161,6 +3165,8 @@ dma_alloc_fail: dma_free_coherent(&pdev->dev, UDCA_BUFF_SIZE, udc->udca_v_base, udc->udca_p_base); i2c_fail: + if (udc->isp1301_i2c_client) + put_device(&udc->isp1301_i2c_client->dev); clk_disable_unprepare(udc->usb_slv_clk); dev_err(udc->dev, "%s probe failed, %d\n", driver_name, retval);
@@ -3189,6 +3195,9 @@ static void lpc32xx_udc_remove(struct pl dma_free_coherent(&pdev->dev, UDCA_BUFF_SIZE, udc->udca_v_base, udc->udca_p_base);
+ if (udc->isp1301_i2c_client) + put_device(&udc->isp1301_i2c_client->dev); + clk_disable_unprepare(udc->usb_slv_clk); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Duoming Zhou duoming@zju.edu.cn
commit 41ca62e3e21e48c2903b3b45e232cf4f2ff7434f upstream.
The delayed work item otg_event is initialized in fsl_otg_conf() and scheduled under two conditions: 1. When a host controller binds to the OTG controller. 2. When the USB ID pin state changes (cable insertion/removal).
A race condition occurs when the device is removed via fsl_otg_remove(): the fsl_otg instance may be freed while the delayed work is still pending or executing. This leads to use-after-free when the work function fsl_otg_event() accesses the already freed memory.
The problematic scenario:
(detach thread) | (delayed work) fsl_otg_remove() | kfree(fsl_otg_dev) //FREE| fsl_otg_event() | og = container_of(...) //USE | og-> //USE
Fix this by calling disable_delayed_work_sync() in fsl_otg_remove() before deallocating the fsl_otg structure. This ensures the delayed work is properly canceled and completes execution prior to memory deallocation.
This bug was identified through static analysis.
Fixes: 0807c500a1a6 ("USB: add Freescale USB OTG Transceiver driver") Cc: stable stable@kernel.org Signed-off-by: Duoming Zhou duoming@zju.edu.cn Link: https://patch.msgid.link/20251205034831.12846-1-duoming@zju.edu.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/phy/phy-fsl-usb.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/phy/phy-fsl-usb.c +++ b/drivers/usb/phy/phy-fsl-usb.c @@ -988,6 +988,7 @@ static void fsl_otg_remove(struct platfo { struct fsl_usb2_platform_data *pdata = dev_get_platdata(&pdev->dev);
+ disable_delayed_work_sync(&fsl_otg_dev->otg_event); usb_remove_phy(&fsl_otg_dev->phy); free_irq(fsl_otg_dev->irq, fsl_otg_dev);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit b4b64fda4d30a83a7f00e92a0c8a1d47699609f3 upstream.
A recent change fixing a device reference leak in a UDC driver introduced a potential use-after-free in the non-OF case as the isp1301_get_client() helper only increases the reference count for the returned I2C device in the OF case.
Increment the reference count also for non-OF so that the caller can decrement it unconditionally.
Note that this is inherently racy just as using the returned I2C device is since nothing is preventing the PHY driver from being unbound while in use.
Fixes: c84117912bdd ("USB: lpc32xx_udc: Fix error handling in probe") Cc: stable@vger.kernel.org Cc: Ma Ke make24@iscas.ac.cn Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Vladimir Zapolskiy vz@mleia.com Link: https://patch.msgid.link/20251218153519.19453-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/phy/phy-isp1301.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/usb/phy/phy-isp1301.c +++ b/drivers/usb/phy/phy-isp1301.c @@ -149,7 +149,12 @@ struct i2c_client *isp1301_get_client(st return client;
/* non-DT: only one ISP1301 chip supported */ - return isp1301_i2c_client; + if (isp1301_i2c_client) { + get_device(&isp1301_i2c_client->dev); + return isp1301_i2c_client; + } + + return NULL; } EXPORT_SYMBOL_GPL(isp1301_get_client);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 782be79e4551550d7a82b1957fc0f7347e6d461f upstream.
A recent change fixing a device reference leak introduced a clock imbalance by reusing an error path so that the clock may be disabled before having been enabled.
Note that the clock framework allows for passing in NULL clocks so there is no risk for a NULL pointer dereference.
Also drop the bogus I2C client NULL check added by the offending commit as the pointer has already been verified to be non-NULL.
Fixes: c84117912bdd ("USB: lpc32xx_udc: Fix error handling in probe") Cc: stable@vger.kernel.org Cc: Ma Ke make24@iscas.ac.cn Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Vladimir Zapolskiy vz@mleia.com Link: https://patch.msgid.link/20251218153519.19453-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/lpc32xx_udc.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/drivers/usb/gadget/udc/lpc32xx_udc.c +++ b/drivers/usb/gadget/udc/lpc32xx_udc.c @@ -3020,7 +3020,7 @@ static int lpc32xx_udc_probe(struct plat pdev->dev.dma_mask = &lpc32xx_usbd_dmamask; retval = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)); if (retval) - goto i2c_fail; + goto err_put_client;
udc->board = &lpc32xx_usbddata;
@@ -3040,7 +3040,7 @@ static int lpc32xx_udc_probe(struct plat udc->udp_irq[i] = platform_get_irq(pdev, i); if (udc->udp_irq[i] < 0) { retval = udc->udp_irq[i]; - goto i2c_fail; + goto err_put_client; } }
@@ -3048,7 +3048,7 @@ static int lpc32xx_udc_probe(struct plat if (IS_ERR(udc->udp_baseaddr)) { dev_err(udc->dev, "IO map failure\n"); retval = PTR_ERR(udc->udp_baseaddr); - goto i2c_fail; + goto err_put_client; }
/* Get USB device clock */ @@ -3056,14 +3056,14 @@ static int lpc32xx_udc_probe(struct plat if (IS_ERR(udc->usb_slv_clk)) { dev_err(udc->dev, "failed to acquire USB device clock\n"); retval = PTR_ERR(udc->usb_slv_clk); - goto i2c_fail; + goto err_put_client; }
/* Enable USB device clock */ retval = clk_prepare_enable(udc->usb_slv_clk); if (retval < 0) { dev_err(udc->dev, "failed to start USB device clock\n"); - goto i2c_fail; + goto err_put_client; }
/* Setup deferred workqueue data */ @@ -3165,9 +3165,10 @@ dma_alloc_fail: dma_free_coherent(&pdev->dev, UDCA_BUFF_SIZE, udc->udca_v_base, udc->udca_p_base); i2c_fail: - if (udc->isp1301_i2c_client) - put_device(&udc->isp1301_i2c_client->dev); clk_disable_unprepare(udc->usb_slv_clk); +err_put_client: + put_device(&udc->isp1301_i2c_client->dev); + dev_err(udc->dev, "%s probe failed, %d\n", driver_name, retval);
return retval; @@ -3195,10 +3196,9 @@ static void lpc32xx_udc_remove(struct pl dma_free_coherent(&pdev->dev, UDCA_BUFF_SIZE, udc->udca_v_base, udc->udca_p_base);
- if (udc->isp1301_i2c_client) - put_device(&udc->isp1301_i2c_client->dev); - clk_disable_unprepare(udc->usb_slv_clk); + + put_device(&udc->isp1301_i2c_client->dev); }
#ifdef CONFIG_PM
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaoqian Lin linmq006@gmail.com
commit 3b4961313d31e200c9e974bb1536cdea217f78b5 upstream.
When clk_bulk_prepare_enable() fails, the error path jumps to err_resetc_assert, skipping clk_bulk_put_all() and leaking the clock references acquired by clk_bulk_get_all().
Add err_clk_put_all label to properly release clock resources in all error paths.
Found via static analysis and code review.
Fixes: c0c61471ef86 ("usb: dwc3: of-simple: Convert to bulk clk API") Cc: stable stable@kernel.org Signed-off-by: Miaoqian Lin linmq006@gmail.com Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://patch.msgid.link/20251211064937.2360510-1-linmq006@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/dwc3-of-simple.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/usb/dwc3/dwc3-of-simple.c +++ b/drivers/usb/dwc3/dwc3-of-simple.c @@ -70,11 +70,11 @@ static int dwc3_of_simple_probe(struct p simple->num_clocks = ret; ret = clk_bulk_prepare_enable(simple->num_clocks, simple->clks); if (ret) - goto err_resetc_assert; + goto err_clk_put_all;
ret = of_platform_populate(np, NULL, NULL, dev); if (ret) - goto err_clk_put; + goto err_clk_disable;
pm_runtime_set_active(dev); pm_runtime_enable(dev); @@ -82,8 +82,9 @@ static int dwc3_of_simple_probe(struct p
return 0;
-err_clk_put: +err_clk_disable: clk_bulk_disable_unprepare(simple->num_clocks, simple->clks); +err_clk_put_all: clk_bulk_put_all(simple->num_clocks, simple->clks);
err_resetc_assert:
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Udipto Goswami udipto.goswami@oss.qualcomm.com
commit e1003aa7ec9eccdde4c926bd64ef42816ad55f25 upstream.
On some platforms, switching USB roles from host to device can trigger controller faults due to premature PHY power-down. This occurs when the PHY is disabled too early during teardown, causing synchronization issues between the PHY and controller.
Keep susphy enabled during dwc3_host_exit() and dwc3_gadget_exit() ensures the PHY remains in a low-power state capable of handling required commands during role switch.
Cc: stable stable@kernel.org Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init") Suggested-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Signed-off-by: Udipto Goswami udipto.goswami@oss.qualcomm.com Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://patch.msgid.link/20251126054221.120638-1-udipto.goswami@oss.qualcomm... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/gadget.c | 2 +- drivers/usb/dwc3/host.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -4825,7 +4825,7 @@ void dwc3_gadget_exit(struct dwc3 *dwc) if (!dwc->gadget) return;
- dwc3_enable_susphy(dwc, false); + dwc3_enable_susphy(dwc, true); usb_del_gadget(dwc->gadget); dwc3_gadget_free_endpoints(dwc); usb_put_gadget(dwc->gadget); --- a/drivers/usb/dwc3/host.c +++ b/drivers/usb/dwc3/host.c @@ -226,7 +226,7 @@ void dwc3_host_exit(struct dwc3 *dwc) if (dwc->sys_wakeup) device_init_wakeup(&dwc->xhci->dev, false);
- dwc3_enable_susphy(dwc, false); + dwc3_enable_susphy(dwc, true); platform_device_unregister(dwc->xhci); dwc->xhci = NULL; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haoxiang Li haoxiang_li2024@163.com
commit 36cc7e09df9e43db21b46519b740145410dd9f4a upstream.
usbhsp_get_pipe() set pipe's flags to IS_USED. In error paths, usbhsp_put_pipe() is required to clear pipe's flags to prevent pipe exhaustion.
Fixes: f1407d5c6624 ("usb: renesas_usbhs: Add Renesas USBHS common code") Cc: stable stable@kernel.org Signed-off-by: Haoxiang Li haoxiang_li2024@163.com Link: https://patch.msgid.link/20251204132129.109234-1-haoxiang_li2024@163.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/renesas_usbhs/pipe.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/renesas_usbhs/pipe.c +++ b/drivers/usb/renesas_usbhs/pipe.c @@ -713,11 +713,13 @@ struct usbhs_pipe *usbhs_pipe_malloc(str /* make sure pipe is not busy */ ret = usbhsp_pipe_barrier(pipe); if (ret < 0) { + usbhsp_put_pipe(pipe); dev_err(dev, "pipe setup failed %d\n", usbhs_pipe_number(pipe)); return NULL; }
if (usbhsp_setup_pipecfg(pipe, is_host, dir_in, &pipecfg)) { + usbhsp_put_pipe(pipe); dev_err(dev, "can't setup pipe\n"); return NULL; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Łukasz Bartosik ukaszb@chromium.org
commit 74098cc06e753d3ffd8398b040a3a1dfb65260c0 upstream.
This fixup replaces tty_vhangup() call with call to tty_port_tty_vhangup(). Both calls hangup tty device synchronously however tty_port_tty_vhangup() increases reference count during the hangup operation using scoped_guard(tty_port_tty).
Cc: stable stable@kernel.org Fixes: 1f73b8b56cf3 ("xhci: dbgtty: fix device unregister") Signed-off-by: Łukasz Bartosik ukaszb@chromium.org Link: https://patch.msgid.link/20251127111644.3161386-1-ukaszb@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-dbgtty.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/host/xhci-dbgtty.c +++ b/drivers/usb/host/xhci-dbgtty.c @@ -554,7 +554,7 @@ static void xhci_dbc_tty_unregister_devi * Hang up the TTY. This wakes up any blocked * writers and causes subsequent writes to fail. */ - tty_vhangup(port->port.tty); + tty_port_tty_vhangup(&port->port);
tty_unregister_device(dbc_tty_driver, port->minor); xhci_dbc_tty_exit_port(port);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tianchu Chen flynnnchen@tencent.com
commit 82d12088c297fa1cef670e1718b3d24f414c23f7 upstream.
Discovered by Atuin - Automated Vulnerability Discovery Engine.
In ac_ioctl, the validation of IndexCard and the check for a valid RamIO pointer are skipped when cmd is 6. However, the function unconditionally executes readb(apbs[IndexCard].RamIO + VERS) at the end.
If cmd is 6, IndexCard may reference a board that does not exist (where RamIO is NULL), leading to a NULL pointer dereference.
Fix this by skipping the readb access when cmd is 6, as this command is a global information query and does not target a specific board context.
Signed-off-by: Tianchu Chen flynnnchen@tencent.com Acked-by: Arnd Bergmann arnd@arndb.de Cc: stable stable@kernel.org Link: https://patch.msgid.link/20251128155323.a786fde92ebb926cbe96fcb1@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/applicom.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/char/applicom.c +++ b/drivers/char/applicom.c @@ -835,7 +835,10 @@ static long ac_ioctl(struct file *file, ret = -ENOTTY; break; } - Dummy = readb(apbs[IndexCard].RamIO + VERS); + + if (cmd != 6) + Dummy = readb(apbs[IndexCard].RamIO + VERS); + kfree(adgl); mutex_unlock(&ac_mutex); return ret;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Srinivas Kandagatla srinivas.kandagatla@oss.qualcomm.com
commit 4d4e746aa9f0f07261dcb41e4f51edb98723dcaa upstream.
Fix below warnings generated dt_bindings_check for examples in the bindings.
Documentation/devicetree/bindings/slimbus/slimbus.example.dtb: slim@28080000 (qcom,slim-ngd-v1.5.0): 'audio-codec@1,0' does not match any of the regexes: '^pinctrl-[0-9]+$', '^slim@[0-9a-f]+$' from schema $id: http://devicetree.org/schemas/slimbus/qcom,slim-ngd.yaml# Documentation/devicetree/bindings/slimbus/slimbus.example.dtb: slim@28080000 (qcom,slim-ngd-v1.5.0): #address-cells: 1 was expected from schema $id: http://devicetree.org/schemas/slimbus/qcom,slim-ngd.yaml# Documentation/devicetree/bindings/slimbus/slimbus.example.dtb: slim@28080000 (qcom,slim-ngd-v1.5.0): 'dmas' is a required property from schema $id: http://devicetree.org/schemas/slimbus/qcom,slim-ngd.yaml# Documentation/devicetree/bindings/slimbus/slimbus.example.dtb: slim@28080000 (qcom,slim-ngd-v1.5.0): 'dma-names' is a required property from schema $id: http://devicetree.org/schemas/slimbus/qcom,slim-ngd.yaml#
Fixes: 7cbba32a2d62 ("slimbus: qcom: remove unused qcom controller driver") Cc: stable stable@kernel.org Reported-by: Rob Herring robh@kernel.org Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@oss.qualcomm.com Acked-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Srinivas Kandagatla srini@kernel.org Link: https://patch.msgid.link/20251114110505.143105-1-srini@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- .../devicetree/bindings/slimbus/slimbus.yaml | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/Documentation/devicetree/bindings/slimbus/slimbus.yaml b/Documentation/devicetree/bindings/slimbus/slimbus.yaml index 89017d9cda10..5a941610ce4e 100644 --- a/Documentation/devicetree/bindings/slimbus/slimbus.yaml +++ b/Documentation/devicetree/bindings/slimbus/slimbus.yaml @@ -75,16 +75,22 @@ examples: #size-cells = <1>; ranges;
- slim@28080000 { + controller@28080000 { compatible = "qcom,slim-ngd-v1.5.0"; reg = <0x091c0000 0x2c000>; interrupts = <GIC_SPI 163 IRQ_TYPE_LEVEL_HIGH>; - #address-cells = <2>; + dmas = <&slimbam 3>, <&slimbam 4>; + dma-names = "rx", "tx"; + #address-cells = <1>; #size-cells = <0>; - - audio-codec@1,0 { + slim@1 { + reg = <1>; + #address-cells = <2>; + #size-cells = <0>; + codec@1,0 { compatible = "slim217,1a0"; reg = <1 0>; + }; }; + }; }; - };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
commit 6d5925b667e4ed9e77c8278cc215191d29454a3f upstream.
intel_th_output_open() calls bus_find_device_by_devt() which internally increments the device reference count via get_device(), but this reference is not properly released in several error paths. When device driver is unavailable, file operations cannot be obtained, or the driver's open method fails, the function returns without calling put_device(), leading to a permanent device reference count leak. This prevents the device from being properly released and could cause resource exhaustion over time.
Found by code review.
Cc: stable stable@kernel.org Fixes: 39f4034693b7 ("intel_th: Add driver infrastructure for Intel(R) Trace Hub devices") Signed-off-by: Ma Ke make24@iscas.ac.cn Link: https://patch.msgid.link/20251112091723.35963-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwtracing/intel_th/core.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-)
--- a/drivers/hwtracing/intel_th/core.c +++ b/drivers/hwtracing/intel_th/core.c @@ -810,13 +810,17 @@ static int intel_th_output_open(struct i int err;
dev = bus_find_device_by_devt(&intel_th_bus, inode->i_rdev); - if (!dev || !dev->driver) - return -ENODEV; + if (!dev || !dev->driver) { + err = -ENODEV; + goto out_no_device; + }
thdrv = to_intel_th_driver(dev->driver); fops = fops_get(thdrv->fops); - if (!fops) - return -ENODEV; + if (!fops) { + err = -ENODEV; + goto out_put_device; + }
replace_fops(file, fops);
@@ -824,10 +828,16 @@ static int intel_th_output_open(struct i
if (file->f_op->open) { err = file->f_op->open(inode, file); - return err; + if (err) + goto out_put_device; }
return 0; + +out_put_device: + put_device(dev); +out_no_device: + return err; }
static const struct file_operations intel_th_output_fops = {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
commit a6dab2f61d23c1eb32f1d08fa7b4919a2478950b upstream.
mei_register() fails to release the device reference in error paths after device_initialize(). During normal device registration, the reference is properly handled through mei_deregister() which calls device_destroy(). However, in error handling paths (such as cdev_alloc failure, cdev_add failure, etc.), missing put_device() calls cause reference count leaks, preventing the device's release function (mei_device_release) from being called and resulting in memory leaks of mei_device.
Found by code review.
Cc: stable stable@kernel.org Fixes: 7704e6be4ed2 ("mei: hook mei_device on class device") Signed-off-by: Ma Ke make24@iscas.ac.cn Acked-by: Alexander Usyskin alexander.usyskin@intel.com Link: https://patch.msgid.link/20251104020133.5017-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mei/main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/misc/mei/main.c b/drivers/misc/mei/main.c index 86a73684a373..6f26d5160788 100644 --- a/drivers/misc/mei/main.c +++ b/drivers/misc/mei/main.c @@ -1307,6 +1307,7 @@ int mei_register(struct mei_device *dev, struct device *parent) err_del_cdev: cdev_del(dev->cdev); err: + put_device(&dev->dev); mei_minor_free(minor); return ret; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junxiao Chang junxiao.chang@intel.com
commit 5d92c3b41f0bddfa416130c6e1b424414f3d2acf upstream.
INTEL_MEI_GSC depends on either i915 or Xe and can be present when either of above is present.
Cc: stable stable@kernel.org Fixes: 87a4c85d3a3e ("drm/xe/gsc: add gsc device support") Tested-by: Baoli Zhang baoli.zhang@intel.com Signed-off-by: Junxiao Chang junxiao.chang@intel.com Signed-off-by: Alexander Usyskin alexander.usyskin@intel.com Link: https://patch.msgid.link/20251109153533.3179787-1-alexander.usyskin@intel.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mei/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/misc/mei/Kconfig +++ b/drivers/misc/mei/Kconfig @@ -49,7 +49,7 @@ config INTEL_MEI_TXE config INTEL_MEI_GSC tristate "Intel MEI GSC embedded device" depends on INTEL_MEI_ME - depends on DRM_I915 + depends on DRM_I915 || DRM_XE help Intel auxiliary driver for GSC devices embedded in Intel graphics devices.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit 24ec03cc55126b7b3adf102f4b3d9f716532b329 upstream.
The change that restores sysfs fwnode information does it only for OF cases. Update the fix to cover all possible types of fwnodes.
Fixes: d36f0e9a0002 ("serial: core: restore of_node information in sysfs") Cc: stable stable@kernel.org Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://patch.msgid.link/20251127163650.2942075-1-andriy.shevchenko@linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/serial_base_bus.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/tty/serial/serial_base_bus.c +++ b/drivers/tty/serial/serial_base_bus.c @@ -13,7 +13,7 @@ #include <linux/device.h> #include <linux/idr.h> #include <linux/module.h> -#include <linux/of.h> +#include <linux/property.h> #include <linux/serial_core.h> #include <linux/slab.h> #include <linux/spinlock.h> @@ -60,6 +60,7 @@ void serial_base_driver_unregister(struc driver_unregister(driver); }
+/* On failure the caller must put device @dev with put_device() */ static int serial_base_device_init(struct uart_port *port, struct device *dev, struct device *parent_dev, @@ -73,7 +74,8 @@ static int serial_base_device_init(struc dev->parent = parent_dev; dev->bus = &serial_base_bus_type; dev->release = release; - device_set_of_node_from_dev(dev, parent_dev); + + device_set_node(dev, fwnode_handle_get(dev_fwnode(parent_dev)));
if (!serial_base_initialized) { dev_dbg(port->dev, "uart_add_one_port() called before arch_initcall()?\n"); @@ -94,7 +96,7 @@ static void serial_base_ctrl_release(str { struct serial_ctrl_device *ctrl_dev = to_serial_base_ctrl_device(dev);
- of_node_put(dev->of_node); + fwnode_handle_put(dev_fwnode(dev)); kfree(ctrl_dev); }
@@ -142,7 +144,7 @@ static void serial_base_port_release(str { struct serial_port_device *port_dev = to_serial_base_port_device(dev);
- of_node_put(dev->of_node); + fwnode_handle_put(dev_fwnode(dev)); kfree(port_dev); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Stein alexander.stein@ew.tq-group.com
commit f54151148b969fb4b62bec8093d255306d20df30 upstream.
During restoring sysfs fwnode information the information of_node_reused was dropped. This was previously set by device_set_of_node_from_dev(). Add it back manually
Fixes: 24ec03cc5512 ("serial: core: Restore sysfs fwnode information") Cc: stable stable@kernel.org Suggested-by: Cosmin Tanislav demonsingur@gmail.com Signed-off-by: Alexander Stein alexander.stein@ew.tq-group.com Tested-by: Michael Walle mwalle@kernel.org Tested-by: Marek Szyprowski m.szyprowski@samsung.com Tested-by: Cosmin Tanislav demonsingur@gmail.com Link: https://patch.msgid.link/20251219152813.1893982-1-alexander.stein@ew.tq-grou... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/serial_base_bus.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/tty/serial/serial_base_bus.c +++ b/drivers/tty/serial/serial_base_bus.c @@ -74,6 +74,7 @@ static int serial_base_device_init(struc dev->parent = parent_dev; dev->bus = &serial_base_bus_type; dev->release = release; + dev->of_node_reused = true;
device_set_node(dev, fwnode_handle_get(dev_fwnode(parent_dev)));
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: j.turek jakub.turek@elsta.tech
commit 267ee93c417e685d9f8e079e41c70ba6ee4df5a5 upstream.
RTS line control with delay should be triggered when there is no more bytes in kfifo and hardware buffer is empty. Without this patch RTS control is scheduled right after feeding hardware buffer and this is too early.
RTS line may change state before hardware buffer is empty.
With this patch delayed RTS state change is triggered when function cdns_uart_handle_tx is called from cdns_uart_isr on CDNS_UART_IXR_TXEMPTY exactly when hardware completed transmission
Fixes: fccc9d9233f9 ("tty: serial: uartps: Add rs485 support to uartps driver") Cc: stable stable@kernel.org Link: https://patch.msgid.link/20251221103221.1971125-1-jakub.turek@elsta.tech Signed-off-by: Jakub Turek jakub.turek@elsta.tech Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/xilinx_uartps.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/drivers/tty/serial/xilinx_uartps.c +++ b/drivers/tty/serial/xilinx_uartps.c @@ -430,10 +430,17 @@ static void cdns_uart_handle_tx(void *de struct tty_port *tport = &port->state->port; unsigned int numbytes; unsigned char ch; + ktime_t rts_delay;
if (kfifo_is_empty(&tport->xmit_fifo) || uart_tx_stopped(port)) { /* Disable the TX Empty interrupt */ writel(CDNS_UART_IXR_TXEMPTY, port->membase + CDNS_UART_IDR); + /* Set RTS line after delay */ + if (cdns_uart->port->rs485.flags & SER_RS485_ENABLED) { + cdns_uart->tx_timer.function = &cdns_rs485_rx_callback; + rts_delay = ns_to_ktime(cdns_calc_after_tx_delay(cdns_uart)); + hrtimer_start(&cdns_uart->tx_timer, rts_delay, HRTIMER_MODE_REL); + } return; }
@@ -450,13 +457,6 @@ static void cdns_uart_handle_tx(void *de
/* Enable the TX Empty interrupt */ writel(CDNS_UART_IXR_TXEMPTY, cdns_uart->port->membase + CDNS_UART_IER); - - if (cdns_uart->port->rs485.flags & SER_RS485_ENABLED && - (kfifo_is_empty(&tport->xmit_fifo) || uart_tx_stopped(port))) { - hrtimer_update_function(&cdns_uart->tx_timer, cdns_rs485_rx_callback); - hrtimer_start(&cdns_uart->tx_timer, - ns_to_ktime(cdns_calc_after_tx_delay(cdns_uart)), HRTIMER_MODE_REL); - } }
/**
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com
commit c3ca8a0aac832fe8047608bb2ae2cca314c6d717 upstream.
The driver updates struct sci_port::tx_cookie to zero right before the TX work is scheduled, or to -EINVAL when DMA is disabled. dma_async_is_complete(), called through dma_cookie_status() (and possibly through dmaengine_tx_status()), considers cookies valid only if they have values greater than or equal to 1.
Passing zero or -EINVAL to dmaengine_tx_status() before any TX DMA transfer has started leads to an incorrect TX status being reported, as the cookie is invalid for the DMA subsystem. This may cause long wait times when the serial device is opened for configuration before any TX activity has occurred.
Check that the TX cookie is valid before passing it to dmaengine_tx_status().
Fixes: 7cc0e0a43a91 ("serial: sh-sci: Check if TX data was written to device in .tx_empty()") Cc: stable stable@kernel.org Signed-off-by: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com Link: https://patch.msgid.link/20251217135759.402015-1-claudiu.beznea.uj@bp.renesa... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/sh-sci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/serial/sh-sci.c +++ b/drivers/tty/serial/sh-sci.c @@ -1740,7 +1740,7 @@ static void sci_dma_check_tx_occurred(st struct dma_tx_state state; enum dma_status status;
- if (!s->chan_tx) + if (!s->chan_tx || s->cookie_tx <= 0) return;
status = dmaengine_tx_status(s->chan_tx, s->cookie_tx, &state);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit a03b2011808ab02ccb7ab6b573b013b77fbb5921 upstream.
When the target residency of the current candidate idle state is greater than the expected time till the closest timer (the sleep length), it does not matter whether or not the tick has already been stopped or if it is going to be stopped. The closest timer will trigger anyway at its due time, so if an idle state with target residency above the sleep length is selected, energy will be wasted and there may be excess latency.
Of course, if the closest timer were canceled before it could trigger, a deeper idle state would be more suitable, but this is not expected to happen (generally speaking, hrtimers are not expected to be canceled as a rule).
Accordingly, the teo_state_ok() check done in that case causes energy to be wasted more often than it allows any energy to be saved (if it allows any energy to be saved at all), so drop it and let the governor use the teo_find_shallower_state() return value as the new candidate idle state index.
Fixes: 21d28cd2fa5f ("cpuidle: teo: Do not call tick_nohz_get_sleep_length() upfront") Cc: All applicable stable@vger.kernel.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Christian Loehle christian.loehle@arm.com Tested-by: Christian Loehle christian.loehle@arm.com Link: https://patch.msgid.link/5955081.DvuYhMxLoT@rafael.j.wysocki Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cpuidle/governors/teo.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
--- a/drivers/cpuidle/governors/teo.c +++ b/drivers/cpuidle/governors/teo.c @@ -458,11 +458,8 @@ static int teo_select(struct cpuidle_dri * If the closest expected timer is before the target residency of the * candidate state, a shallower one needs to be found. */ - if (drv->states[idx].target_residency_ns > duration_ns) { - i = teo_find_shallower_state(drv, dev, idx, duration_ns, false); - if (teo_state_ok(i, drv)) - idx = i; - } + if (drv->states[idx].target_residency_ns > duration_ns) + idx = teo_find_shallower_state(drv, dev, idx, duration_ns, false);
/* * If the selected state's target residency is below the tick length
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaoqian Lin linmq006@gmail.com
commit 9600156bb99852c216a2128cdf9f114eb67c350f upstream.
There are two reference count leaks in this driver:
1. In nforce2_fsb_read(): pci_get_subsys() increases the reference count of the PCI device, but pci_dev_put() is never called to release it, thus leaking the reference.
2. In nforce2_detect_chipset(): pci_get_subsys() gets a reference to the nforce2_dev which is stored in a global variable, but the reference is never released when the module is unloaded.
Fix both by: - Adding pci_dev_put(nforce2_sub5) in nforce2_fsb_read() after reading the configuration. - Adding pci_dev_put(nforce2_dev) in nforce2_exit() to release the global device reference.
Found via static analysis.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin linmq006@gmail.com Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cpufreq/cpufreq-nforce2.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/cpufreq/cpufreq-nforce2.c +++ b/drivers/cpufreq/cpufreq-nforce2.c @@ -145,6 +145,8 @@ static unsigned int nforce2_fsb_read(int pci_read_config_dword(nforce2_sub5, NFORCE2_BOOTFSB, &fsb); fsb /= 1000000;
+ pci_dev_put(nforce2_sub5); + /* Check if PLL register is already set */ pci_read_config_byte(nforce2_dev, NFORCE2_PLLENABLE, (u8 *)&temp);
@@ -426,6 +428,7 @@ static int __init nforce2_init(void) static void __exit nforce2_exit(void) { cpufreq_unregister_driver(&nforce2_driver); + pci_dev_put(nforce2_dev); }
module_init(nforce2_init);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tony Battersby tonyb@cybernetics.com
commit b57fbc88715b6d18f379463f48a15b560b087ffe upstream.
This reverts commit 0367076b0817d5c75dfb83001ce7ce5c64d803a9.
The commit being reverted added code to __qla2x00_abort_all_cmds() to call sp->done() without holding a spinlock. But unlike the older code below it, this new code failed to check sp->cmd_type and just assumed TYPE_SRB, which results in a jump to an invalid pointer in target-mode with TYPE_TGT_CMD:
qla2xxx [0000:65:00.0]-d034:8: qla24xx_do_nack_work create sess success 0000000009f7a79b qla2xxx [0000:65:00.0]-5003:8: ISP System Error - mbx1=1ff5h mbx2=10h mbx3=0h mbx4=0h mbx5=191h mbx6=0h mbx7=0h. qla2xxx [0000:65:00.0]-d01e:8: -> fwdump no buffer qla2xxx [0000:65:00.0]-f03a:8: qla_target(0): System error async event 0x8002 occurred qla2xxx [0000:65:00.0]-00af:8: Performing ISP error recovery - ha=0000000058183fda. BUG: kernel NULL pointer dereference, address: 0000000000000000 PF: supervisor instruction fetch in kernel mode PF: error_code(0x0010) - not-present page PGD 0 P4D 0 Oops: 0010 [#1] SMP CPU: 2 PID: 9446 Comm: qla2xxx_8_dpc Tainted: G O 6.1.133 #1 Hardware name: Supermicro Super Server/X11SPL-F, BIOS 4.2 12/15/2023 RIP: 0010:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0018:ffffc90001f93dc8 EFLAGS: 00010206 RAX: 0000000000000282 RBX: 0000000000000355 RCX: ffff88810d16a000 RDX: ffff88810dbadaa8 RSI: 0000000000080000 RDI: ffff888169dc38c0 RBP: ffff888169dc38c0 R08: 0000000000000001 R09: 0000000000000045 R10: ffffffffa034bdf0 R11: 0000000000000000 R12: ffff88810800bb40 R13: 0000000000001aa8 R14: ffff888100136610 R15: ffff8881070f7400 FS: 0000000000000000(0000) GS:ffff88bf80080000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 000000010c8ff006 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __die+0x4d/0x8b ? page_fault_oops+0x91/0x180 ? trace_buffer_unlock_commit_regs+0x38/0x1a0 ? exc_page_fault+0x391/0x5e0 ? asm_exc_page_fault+0x22/0x30 __qla2x00_abort_all_cmds+0xcb/0x3e0 [qla2xxx_scst] qla2x00_abort_all_cmds+0x50/0x70 [qla2xxx_scst] qla2x00_abort_isp_cleanup+0x3b7/0x4b0 [qla2xxx_scst] qla2x00_abort_isp+0xfd/0x860 [qla2xxx_scst] qla2x00_do_dpc+0x581/0xa40 [qla2xxx_scst] kthread+0xa8/0xd0 </TASK>
Then commit 4475afa2646d ("scsi: qla2xxx: Complete command early within lock") added the spinlock back, because not having the lock caused a race and a crash. But qla2x00_abort_srb() in the switch below already checks for qla2x00_chip_is_down() and handles it the same way, so the code above the switch is now redundant and still buggy in target-mode. Remove it.
Cc: stable@vger.kernel.org Signed-off-by: Tony Battersby tonyb@cybernetics.com Link: https://patch.msgid.link/3a8022dc-bcfd-4b01-9f9b-7a9ec61fa2a3@cybernetics.co... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/qla2xxx/qla_os.c | 6 ------ 1 file changed, 6 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1862,12 +1862,6 @@ __qla2x00_abort_all_cmds(struct qla_qpai for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) { sp = req->outstanding_cmds[cnt]; if (sp) { - if (qla2x00_chip_is_down(vha)) { - req->outstanding_cmds[cnt] = NULL; - sp->done(sp, res); - continue; - } - switch (sp->cmd_type) { case TYPE_SRB: qla2x00_abort_srb(qp, sp, res, &flags);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junrui Luo moonafterrain@outlook.com
commit f6ab594672d4cba08540919a4e6be2e202b60007 upstream.
The asd_pci_remove() function fails to synchronize with pending tasklets before freeing the asd_ha structure, leading to a potential use-after-free vulnerability.
When a device removal is triggered (via hot-unplug or module unload), race condition can occur.
The fix adds tasklet_kill() before freeing the asd_ha structure, ensuring all scheduled tasklets complete before cleanup proceeds.
Reported-by: Yuhao Jiang danisjiang@gmail.com Reported-by: Junrui Luo moonafterrain@outlook.com Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Cc: stable@vger.kernel.org Signed-off-by: Junrui Luo moonafterrain@outlook.com Link: https://patch.msgid.link/ME2PR01MB3156AB7DCACA206C845FC7E8AFFDA@ME2PR01MB315... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/aic94xx/aic94xx_init.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/scsi/aic94xx/aic94xx_init.c +++ b/drivers/scsi/aic94xx/aic94xx_init.c @@ -882,6 +882,9 @@ static void asd_pci_remove(struct pci_de
asd_disable_ints(asd_ha);
+ /* Ensure all scheduled tasklets complete before freeing resources */ + tasklet_kill(&asd_ha->seq.dl_tasklet); + asd_remove_dev_attrs(asd_ha);
/* XXX more here as needed */
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dai Ngo dai.ngo@oracle.com
commit 6f52063db9aabdaabea929b1e998af98c2e8d917 upstream.
The reservation type argument for the pr_preempt call should match the one used in nfsd4_block_get_device_info_scsi.
Fixes: f99d4fbdae67 ("nfsd: add SCSI layout support") Cc: stable@vger.kernel.org Signed-off-by: Dai Ngo dai.ngo@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/blocklayout.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/nfsd/blocklayout.c +++ b/fs/nfsd/blocklayout.c @@ -344,7 +344,8 @@ nfsd4_scsi_fence_client(struct nfs4_layo struct block_device *bdev = file->nf_file->f_path.mnt->mnt_sb->s_bdev;
bdev->bd_disk->fops->pr_ops->pr_preempt(bdev, NFSD_MDS_PR_KEY, - nfsd4_scsi_pr_key(clp), 0, true); + nfsd4_scsi_pr_key(clp), + PR_EXCLUSIVE_ACCESS_REG_ONLY, true); }
const struct nfsd4_layout_ops scsi_layout_ops = {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Vatoropin a.vatoropin@crpt.ru
commit 5053eab38a4c4543522d0c320c639c56a8b59908 upstream.
If allocation of cmd->t_task_cdb fails, it remains NULL but is later dereferenced in the 'err' path.
In case of error, reset NULL t_task_cdb value to point at the default fixed-size buffer.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 9e95fb805dc0 ("scsi: target: Fix NULL pointer dereference") Cc: stable@vger.kernel.org Signed-off-by: Andrey Vatoropin a.vatoropin@crpt.ru Reviewed-by: Mike Christie michael.christie@oracle.com Link: https://patch.msgid.link/20251118084014.324940-1-a.vatoropin@crpt.ru Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/target/target_core_transport.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/target/target_core_transport.c +++ b/drivers/target/target_core_transport.c @@ -1524,6 +1524,7 @@ target_cmd_init_cdb(struct se_cmd *cmd, if (scsi_command_size(cdb) > sizeof(cmd->__t_task_cdb)) { cmd->t_task_cdb = kzalloc(scsi_command_size(cdb), gfp); if (!cmd->t_task_cdb) { + cmd->t_task_cdb = &cmd->__t_task_cdb[0]; pr_err("Unable to allocate cmd->t_task_cdb" " %u > sizeof(cmd->__t_task_cdb): %lu ops\n", scsi_command_size(cdb),
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chandrakanth Patil chandrakanth.patil@broadcom.com
commit d373163194982f43b92c552c138c29d9f0b79553 upstream.
The driver was not reading the MAX_REQ_PER_REPLY_QUEUE_LIMIT IOCFacts flag, so the reply-queue-full handling was never enabled, even on firmware that supports it. Reading this flag enables the feature and prevents reply queue overflow.
Fixes: f08b24d82749 ("scsi: mpi3mr: Avoid reply queue full condition") Cc: stable@vger.kernel.org Signed-off-by: Chandrakanth Patil chandrakanth.patil@broadcom.com Link: https://patch.msgid.link/20251211002929.22071-1-chandrakanth.patil@broadcom.... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/mpi3mr/mpi/mpi30_ioc.h | 1 + drivers/scsi/mpi3mr/mpi3mr_fw.c | 2 ++ 2 files changed, 3 insertions(+)
--- a/drivers/scsi/mpi3mr/mpi/mpi30_ioc.h +++ b/drivers/scsi/mpi3mr/mpi/mpi30_ioc.h @@ -166,6 +166,7 @@ struct mpi3_ioc_facts_data { #define MPI3_IOCFACTS_FLAGS_SIGNED_NVDATA_REQUIRED (0x00010000) #define MPI3_IOCFACTS_FLAGS_DMA_ADDRESS_WIDTH_MASK (0x0000ff00) #define MPI3_IOCFACTS_FLAGS_DMA_ADDRESS_WIDTH_SHIFT (8) +#define MPI3_IOCFACTS_FLAGS_MAX_REQ_PER_REPLY_QUEUE_LIMIT (0x00000040) #define MPI3_IOCFACTS_FLAGS_INITIAL_PORT_ENABLE_MASK (0x00000030) #define MPI3_IOCFACTS_FLAGS_INITIAL_PORT_ENABLE_SHIFT (4) #define MPI3_IOCFACTS_FLAGS_INITIAL_PORT_ENABLE_NOT_STARTED (0x00000000) --- a/drivers/scsi/mpi3mr/mpi3mr_fw.c +++ b/drivers/scsi/mpi3mr/mpi3mr_fw.c @@ -3158,6 +3158,8 @@ static void mpi3mr_process_factsdata(str mrioc->facts.dma_mask = (facts_flags & MPI3_IOCFACTS_FLAGS_DMA_ADDRESS_WIDTH_MASK) >> MPI3_IOCFACTS_FLAGS_DMA_ADDRESS_WIDTH_SHIFT; + mrioc->facts.max_req_limit = (facts_flags & + MPI3_IOCFACTS_FLAGS_MAX_REQ_PER_REPLY_QUEUE_LIMIT); mrioc->facts.protocol_flags = facts_data->protocol_flags; mrioc->facts.mpi_version = le32_to_cpu(facts_data->mpi_version.word); mrioc->facts.max_reqs = le16_to_cpu(facts_data->max_outstanding_requests);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Seunghwan Baek sh8267.baek@samsung.com
commit c9f36f04a8a2725172cdf2b5e32363e4addcb14c upstream.
If UFS resume fails, the event history is updated in ufshcd_resume(), but there is no code anywhere to record UFS suspend. Therefore, add code to record UFS suspend error event history.
Fixes: dd11376b9f1b ("scsi: ufs: Split the drivers/scsi/ufs directory") Cc: stable@vger.kernel.org Signed-off-by: Seunghwan Baek sh8267.baek@samsung.com Reviewed-by: Peter Wang peter.wang@mediatek.com Link: https://patch.msgid.link/20251210063854.1483899-2-sh8267.baek@samsung.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ufs/core/ufshcd.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -10223,7 +10223,7 @@ static int ufshcd_suspend(struct ufs_hba ret = ufshcd_setup_clocks(hba, false); if (ret) { ufshcd_enable_irq(hba); - return ret; + goto out; } if (ufshcd_is_clkgating_allowed(hba)) { hba->clk_gating.state = CLKS_OFF; @@ -10235,6 +10235,9 @@ static int ufshcd_suspend(struct ufs_hba /* Put the host controller in low power mode if possible */ ufshcd_hba_vreg_set_lpm(hba); ufshcd_pm_qos_update(hba, false); +out: + if (ret) + ufshcd_update_evt_hist(hba, UFS_EVT_SUSPEND_ERR, (u32)ret); return ret; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Prusakowski jprusakowski@google.com
commit 297baa4aa263ff8f5b3d246ee16a660d76aa82c4 upstream.
Xfstests generic/335, generic/336 sometimes crash with the following message:
F2FS-fs (dm-0): detect filesystem reference count leak during umount, type: 9, count: 1 ------------[ cut here ]------------ kernel BUG at fs/f2fs/super.c:1939! Oops: invalid opcode: 0000 [#1] SMP NOPTI CPU: 1 UID: 0 PID: 609351 Comm: umount Tainted: G W 6.17.0-rc5-xfstests-g9dd1835ecda5 #1 PREEMPT(none) Tainted: [W]=WARN Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:f2fs_put_super+0x3b3/0x3c0 Call Trace: <TASK> generic_shutdown_super+0x7e/0x190 kill_block_super+0x1a/0x40 kill_f2fs_super+0x9d/0x190 deactivate_locked_super+0x30/0xb0 cleanup_mnt+0xba/0x150 task_work_run+0x5c/0xa0 exit_to_user_mode_loop+0xb7/0xc0 do_syscall_64+0x1ae/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK> ---[ end trace 0000000000000000 ]---
It appears that sometimes it is possible that f2fs_put_super() is called before all node page reads are completed. Adding a call to f2fs_wait_on_all_pages() for F2FS_RD_NODE fixes the problem.
Cc: stable@kernel.org Fixes: 20872584b8c0b ("f2fs: fix to drop all dirty meta/node pages during umount()") Signed-off-by: Jan Prusakowski jprusakowski@google.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/super.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-)
--- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -1988,14 +1988,6 @@ static void f2fs_put_super(struct super_ truncate_inode_pages_final(META_MAPPING(sbi)); }
- for (i = 0; i < NR_COUNT_TYPE; i++) { - if (!get_pages(sbi, i)) - continue; - f2fs_err(sbi, "detect filesystem reference count leak during " - "umount, type: %d, count: %lld", i, get_pages(sbi, i)); - f2fs_bug_on(sbi, 1); - } - f2fs_bug_on(sbi, sbi->fsync_node_num);
f2fs_destroy_compress_inode(sbi); @@ -2006,6 +1998,15 @@ static void f2fs_put_super(struct super_ iput(sbi->meta_inode); sbi->meta_inode = NULL;
+ /* Should check the page counts after dropping all node/meta pages */ + for (i = 0; i < NR_COUNT_TYPE; i++) { + if (!get_pages(sbi, i)) + continue; + f2fs_err(sbi, "detect filesystem reference count leak during " + "umount, type: %d, count: %lld", i, get_pages(sbi, i)); + f2fs_bug_on(sbi, 1); + } + /* * iput() can update stat information, if f2fs_write_checkpoint() * above failed with error.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu chao@kernel.org
commit 10b591e7fb7cdc8c1e53e9c000dc0ef7069aaa76 upstream.
Bai, Shuangpeng sjb7183@psu.edu reported a bug as below:
Oops: divide error: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 11441 Comm: syz.0.46 Not tainted 6.17.0 #1 PREEMPT(full) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:f2fs_all_cluster_page_ready+0x106/0x550 fs/f2fs/compress.c:857 Call Trace: <TASK> f2fs_write_cache_pages fs/f2fs/data.c:3078 [inline] __f2fs_write_data_pages fs/f2fs/data.c:3290 [inline] f2fs_write_data_pages+0x1c19/0x3600 fs/f2fs/data.c:3317 do_writepages+0x38e/0x640 mm/page-writeback.c:2634 filemap_fdatawrite_wbc mm/filemap.c:386 [inline] __filemap_fdatawrite_range mm/filemap.c:419 [inline] file_write_and_wait_range+0x2ba/0x3e0 mm/filemap.c:794 f2fs_do_sync_file+0x6e6/0x1b00 fs/f2fs/file.c:294 generic_write_sync include/linux/fs.h:3043 [inline] f2fs_file_write_iter+0x76e/0x2700 fs/f2fs/file.c:5259 new_sync_write fs/read_write.c:593 [inline] vfs_write+0x7e9/0xe00 fs/read_write.c:686 ksys_write+0x19d/0x2d0 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xf7/0x470 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f
The bug was triggered w/ below race condition:
fsync setattr ioctl - f2fs_do_sync_file - file_write_and_wait_range - f2fs_write_cache_pages : inode is non-compressed : cc.cluster_size = F2FS_I(inode)->i_cluster_size = 0 - tag_pages_for_writeback - f2fs_setattr - truncate_setsize - f2fs_truncate - f2fs_fileattr_set - f2fs_setflags_common - set_compress_context : F2FS_I(inode)->i_cluster_size = 4 : set_inode_flag(inode, FI_COMPRESSED_FILE) - f2fs_compressed_file : return true - f2fs_all_cluster_page_ready : "pgidx % cc->cluster_size" trigger dividing 0 issue
Let's change as below to fix this issue: - introduce a new atomic type variable .writeback in structure f2fs_inode_info to track the number of threads which calling f2fs_write_cache_pages(). - use .i_sem lock to protect .writeback update. - check .writeback before update compression context in f2fs_setflags_common() to avoid race w/ ->writepages.
Fixes: 4c8ff7095bef ("f2fs: support data compression") Cc: stable@kernel.org Reported-by: Bai, Shuangpeng sjb7183@psu.edu Tested-by: Bai, Shuangpeng sjb7183@psu.edu Closes: https://lore.kernel.org/lkml/44D8F7B3-68AD-425F-9915-65D27591F93F@psu.edu Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/data.c | 17 +++++++++++++++++ fs/f2fs/f2fs.h | 3 ++- fs/f2fs/file.c | 5 +++-- fs/f2fs/super.c | 1 + 4 files changed, 23 insertions(+), 3 deletions(-)
--- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -3224,6 +3224,19 @@ static inline bool __should_serialize_io return false; }
+static inline void account_writeback(struct inode *inode, bool inc) +{ + if (!f2fs_sb_has_compression(F2FS_I_SB(inode))) + return; + + f2fs_down_read(&F2FS_I(inode)->i_sem); + if (inc) + atomic_inc(&F2FS_I(inode)->writeback); + else + atomic_dec(&F2FS_I(inode)->writeback); + f2fs_up_read(&F2FS_I(inode)->i_sem); +} + static int __f2fs_write_data_pages(struct address_space *mapping, struct writeback_control *wbc, enum iostat_type io_type) @@ -3269,10 +3282,14 @@ static int __f2fs_write_data_pages(struc locked = true; }
+ account_writeback(inode, true); + blk_start_plug(&plug); ret = f2fs_write_cache_pages(mapping, wbc, io_type); blk_finish_plug(&plug);
+ account_writeback(inode, false); + if (locked) mutex_unlock(&sbi->writepages);
--- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -948,6 +948,7 @@ struct f2fs_inode_info { unsigned char i_compress_level; /* compress level (lz4hc,zstd) */ unsigned char i_compress_flag; /* compress flag */ unsigned int i_cluster_size; /* cluster size */ + atomic_t writeback; /* count # of writeback thread */
unsigned int atomic_write_cnt; loff_t original_i_size; /* original i_size before atomic write */ @@ -4675,7 +4676,7 @@ static inline bool f2fs_disable_compress f2fs_up_write(&fi->i_sem); return true; } - if (f2fs_is_mmap_file(inode) || + if (f2fs_is_mmap_file(inode) || atomic_read(&fi->writeback) || (S_ISREG(inode->i_mode) && F2FS_HAS_BLOCKS(inode))) { f2fs_up_write(&fi->i_sem); return false; --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -2125,8 +2125,9 @@ static int f2fs_setflags_common(struct i
f2fs_down_write(&fi->i_sem); if (!f2fs_may_compress(inode) || - (S_ISREG(inode->i_mode) && - F2FS_HAS_BLOCKS(inode))) { + atomic_read(&fi->writeback) || + (S_ISREG(inode->i_mode) && + F2FS_HAS_BLOCKS(inode))) { f2fs_up_write(&fi->i_sem); return -EINVAL; } --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -1759,6 +1759,7 @@ static struct inode *f2fs_alloc_inode(st atomic_set(&fi->dirty_pages, 0); atomic_set(&fi->i_compr_blocks, 0); atomic_set(&fi->open_count, 0); + atomic_set(&fi->writeback, 0); init_f2fs_rwsem(&fi->i_sem); spin_lock_init(&fi->i_size_lock); INIT_LIST_HEAD(&fi->dirty_list);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu chao@kernel.org
commit ca8b201f28547e28343a6f00a6e91fa8c09572fe upstream.
As Jiaming Zhang and syzbot reported, there is potential deadlock in f2fs as below:
Chain exists of: &sbi->cp_rwsem --> fs_reclaim --> sb_internal#2
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- rlock(sb_internal#2); lock(fs_reclaim); lock(sb_internal#2); rlock(&sbi->cp_rwsem);
*** DEADLOCK ***
3 locks held by kswapd0/73: #0: ffffffff8e247a40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:7015 [inline] #0: ffffffff8e247a40 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0x951/0x2800 mm/vmscan.c:7389 #1: ffff8880118400e0 (&type->s_umount_key#50){.+.+}-{4:4}, at: super_trylock_shared fs/super.c:562 [inline] #1: ffff8880118400e0 (&type->s_umount_key#50){.+.+}-{4:4}, at: super_cache_scan+0x91/0x4b0 fs/super.c:197 #2: ffff888011840610 (sb_internal#2){.+.+}-{0:0}, at: f2fs_evict_inode+0x8d9/0x1b60 fs/f2fs/inode.c:890
stack backtrace: CPU: 0 UID: 0 PID: 73 Comm: kswapd0 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_circular_bug+0x2ee/0x310 kernel/locking/lockdep.c:2043 check_noncircular+0x134/0x160 kernel/locking/lockdep.c:2175 check_prev_add kernel/locking/lockdep.c:3165 [inline] check_prevs_add kernel/locking/lockdep.c:3284 [inline] validate_chain+0xb9b/0x2140 kernel/locking/lockdep.c:3908 __lock_acquire+0xab9/0xd20 kernel/locking/lockdep.c:5237 lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5868 down_read+0x46/0x2e0 kernel/locking/rwsem.c:1537 f2fs_down_read fs/f2fs/f2fs.h:2278 [inline] f2fs_lock_op fs/f2fs/f2fs.h:2357 [inline] f2fs_do_truncate_blocks+0x21c/0x10c0 fs/f2fs/file.c:791 f2fs_truncate_blocks+0x10a/0x300 fs/f2fs/file.c:867 f2fs_truncate+0x489/0x7c0 fs/f2fs/file.c:925 f2fs_evict_inode+0x9f2/0x1b60 fs/f2fs/inode.c:897 evict+0x504/0x9c0 fs/inode.c:810 f2fs_evict_inode+0x1dc/0x1b60 fs/f2fs/inode.c:853 evict+0x504/0x9c0 fs/inode.c:810 dispose_list fs/inode.c:852 [inline] prune_icache_sb+0x21b/0x2c0 fs/inode.c:1000 super_cache_scan+0x39b/0x4b0 fs/super.c:224 do_shrink_slab+0x6ef/0x1110 mm/shrinker.c:437 shrink_slab_memcg mm/shrinker.c:550 [inline] shrink_slab+0x7ef/0x10d0 mm/shrinker.c:628 shrink_one+0x28a/0x7c0 mm/vmscan.c:4955 shrink_many mm/vmscan.c:5016 [inline] lru_gen_shrink_node mm/vmscan.c:5094 [inline] shrink_node+0x315d/0x3780 mm/vmscan.c:6081 kswapd_shrink_node mm/vmscan.c:6941 [inline] balance_pgdat mm/vmscan.c:7124 [inline] kswapd+0x147c/0x2800 mm/vmscan.c:7389 kthread+0x70e/0x8a0 kernel/kthread.c:463 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK>
The root cause is deadlock among four locks as below:
kswapd - fs_reclaim --- Lock A - shrink_one - evict - f2fs_evict_inode - sb_start_intwrite --- Lock B
- iput - evict - f2fs_evict_inode - sb_start_intwrite --- Lock B - f2fs_truncate - f2fs_truncate_blocks - f2fs_do_truncate_blocks - f2fs_lock_op --- Lock C
ioctl - f2fs_ioc_commit_atomic_write - f2fs_lock_op --- Lock C - __f2fs_commit_atomic_write - __replace_atomic_write_block - f2fs_get_dnode_of_data - __get_node_folio - f2fs_check_nid_range - f2fs_handle_error - f2fs_record_errors - f2fs_down_write --- Lock D
open - do_open - do_truncate - security_inode_need_killpriv - f2fs_getxattr - lookup_all_xattrs - f2fs_handle_error - f2fs_record_errors - f2fs_down_write --- Lock D - f2fs_commit_super - read_mapping_folio - filemap_alloc_folio_noprof - prepare_alloc_pages - fs_reclaim_acquire --- Lock A
In order to avoid such deadlock, we need to avoid grabbing sb_lock in f2fs_handle_error(), so, let's use asynchronous method instead: - remove f2fs_handle_error() implementation - rename f2fs_handle_error_async() to f2fs_handle_error() - spread f2fs_handle_error()
Fixes: 95fa90c9e5a7 ("f2fs: support recording errors into superblock") Cc: stable@kernel.org Reported-by: syzbot+14b90e1156b9f6fc1266@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/68eae49b.050a0220.ac43.0001.GAE@goo... Reported-by: Jiaming Zhang r772577952@gmail.com Closes: https://lore.kernel.org/lkml/CANypQFa-Gy9sD-N35o3PC+FystOWkNuN8pv6S75HLT0ga-... Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/compress.c | 5 +---- fs/f2fs/f2fs.h | 1 - fs/f2fs/super.c | 41 ----------------------------------------- 3 files changed, 1 insertion(+), 46 deletions(-)
--- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -759,10 +759,7 @@ void f2fs_decompress_cluster(struct deco ret = -EFSCORRUPTED;
/* Avoid f2fs_commit_super in irq context */ - if (!in_task) - f2fs_handle_error_async(sbi, ERROR_FAIL_DECOMPRESSION); - else - f2fs_handle_error(sbi, ERROR_FAIL_DECOMPRESSION); + f2fs_handle_error(sbi, ERROR_FAIL_DECOMPRESSION); goto out_release; }
--- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3812,7 +3812,6 @@ void f2fs_quota_off_umount(struct super_ void f2fs_save_errors(struct f2fs_sb_info *sbi, unsigned char flag); void f2fs_handle_critical_error(struct f2fs_sb_info *sbi, unsigned char reason); void f2fs_handle_error(struct f2fs_sb_info *sbi, unsigned char error); -void f2fs_handle_error_async(struct f2fs_sb_info *sbi, unsigned char error); int f2fs_commit_super(struct f2fs_sb_info *sbi, bool recover); int f2fs_sync_fs(struct super_block *sb, int sync); int f2fs_sanity_check_ckpt(struct f2fs_sb_info *sbi); --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -4560,50 +4560,9 @@ void f2fs_save_errors(struct f2fs_sb_inf spin_unlock_irqrestore(&sbi->error_lock, flags); }
-static bool f2fs_update_errors(struct f2fs_sb_info *sbi) -{ - unsigned long flags; - bool need_update = false; - - spin_lock_irqsave(&sbi->error_lock, flags); - if (sbi->error_dirty) { - memcpy(F2FS_RAW_SUPER(sbi)->s_errors, sbi->errors, - MAX_F2FS_ERRORS); - sbi->error_dirty = false; - need_update = true; - } - spin_unlock_irqrestore(&sbi->error_lock, flags); - - return need_update; -} - -static void f2fs_record_errors(struct f2fs_sb_info *sbi, unsigned char error) -{ - int err; - - f2fs_down_write(&sbi->sb_lock); - - if (!f2fs_update_errors(sbi)) - goto out_unlock; - - err = f2fs_commit_super(sbi, false); - if (err) - f2fs_err_ratelimited(sbi, - "f2fs_commit_super fails to record errors:%u, err:%d", - error, err); -out_unlock: - f2fs_up_write(&sbi->sb_lock); -} - void f2fs_handle_error(struct f2fs_sb_info *sbi, unsigned char error) { f2fs_save_errors(sbi, error); - f2fs_record_errors(sbi, error); -} - -void f2fs_handle_error_async(struct f2fs_sb_info *sbi, unsigned char error) -{ - f2fs_save_errors(sbi, error);
if (!sbi->error_dirty) return;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu chao@kernel.org
commit be112e7449a6e1b54aa9feac618825d154b3a5c7 upstream.
In order to let userspace detect such error rather than suffering silent failure.
Fixes: 4354994f097d ("f2fs: checkpoint disabling") Cc: stable@kernel.org Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/super.c | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-)
--- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -2634,10 +2634,11 @@ restore_flag: return err; }
-static void f2fs_enable_checkpoint(struct f2fs_sb_info *sbi) +static int f2fs_enable_checkpoint(struct f2fs_sb_info *sbi) { unsigned int nr_pages = get_pages(sbi, F2FS_DIRTY_DATA) / 16; long long start, writeback, end; + int ret;
f2fs_info(sbi, "f2fs_enable_checkpoint() starts, meta: %lld, node: %lld, data: %lld", get_pages(sbi, F2FS_DIRTY_META), @@ -2671,7 +2672,9 @@ static void f2fs_enable_checkpoint(struc set_sbi_flag(sbi, SBI_IS_DIRTY); f2fs_up_write(&sbi->gc_lock);
- f2fs_sync_fs(sbi->sb, 1); + ret = f2fs_sync_fs(sbi->sb, 1); + if (ret) + f2fs_err(sbi, "%s sync_fs failed, ret: %d", __func__, ret);
/* Let's ensure there's no pending checkpoint anymore */ f2fs_flush_ckpt_thread(sbi); @@ -2681,6 +2684,7 @@ static void f2fs_enable_checkpoint(struc f2fs_info(sbi, "f2fs_enable_checkpoint() finishes, writeback:%llu, sync:%llu", ktime_ms_delta(writeback, start), ktime_ms_delta(end, writeback)); + return ret; }
static int __f2fs_remount(struct fs_context *fc, struct super_block *sb) @@ -2894,7 +2898,9 @@ static int __f2fs_remount(struct fs_cont goto restore_discard; need_enable_checkpoint = true; } else { - f2fs_enable_checkpoint(sbi); + err = f2fs_enable_checkpoint(sbi); + if (err) + goto restore_discard; need_disable_checkpoint = true; } } @@ -2937,7 +2943,8 @@ skip: return 0; restore_checkpoint: if (need_enable_checkpoint) { - f2fs_enable_checkpoint(sbi); + if (f2fs_enable_checkpoint(sbi)) + f2fs_warn(sbi, "checkpoint has not been enabled"); } else if (need_disable_checkpoint) { if (f2fs_disable_checkpoint(sbi)) f2fs_warn(sbi, "checkpoint has not been disabled"); @@ -5232,13 +5239,12 @@ reset_checkpoint: if (err) goto sync_free_meta;
- if (test_opt(sbi, DISABLE_CHECKPOINT)) { + if (test_opt(sbi, DISABLE_CHECKPOINT)) err = f2fs_disable_checkpoint(sbi); - if (err) - goto sync_free_meta; - } else if (is_set_ckpt_flags(sbi, CP_DISABLED_FLAG)) { - f2fs_enable_checkpoint(sbi); - } + else if (is_set_ckpt_flags(sbi, CP_DISABLED_FLAG)) + err = f2fs_enable_checkpoint(sbi); + if (err) + goto sync_free_meta;
/* * If filesystem is not mounted as read-only then
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu chao@kernel.org
commit 7c37c79510329cd951a4dedf3f7bf7e2b18dccec upstream.
As syzbot reported:
F2FS-fs (loop0): __update_extent_tree_range: extent len is zero, type: 0, extent [0, 0, 0], age [0, 0] ------------[ cut here ]------------ kernel BUG at fs/f2fs/extent_cache.c:678! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI CPU: 0 UID: 0 PID: 5336 Comm: syz.0.0 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:__update_extent_tree_range+0x13bc/0x1500 fs/f2fs/extent_cache.c:678 Call Trace: <TASK> f2fs_update_read_extent_cache_range+0x192/0x3e0 fs/f2fs/extent_cache.c:1085 f2fs_do_zero_range fs/f2fs/file.c:1657 [inline] f2fs_zero_range+0x10c1/0x1580 fs/f2fs/file.c:1737 f2fs_fallocate+0x583/0x990 fs/f2fs/file.c:2030 vfs_fallocate+0x669/0x7e0 fs/open.c:342 ioctl_preallocate fs/ioctl.c:289 [inline] file_ioctl+0x611/0x780 fs/ioctl.c:-1 do_vfs_ioctl+0xb33/0x1430 fs/ioctl.c:576 __do_sys_ioctl fs/ioctl.c:595 [inline] __se_sys_ioctl+0x82/0x170 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f07bc58eec9
In error path of f2fs_zero_range(), it may add a zero-sized extent into extent cache, it should be avoided.
Fixes: 6e9619499f53 ("f2fs: support in batch fzero in dnode page") Cc: stable@kernel.org Reported-by: syzbot+24124df3170c3638b35f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/68e5d698.050a0220.256323.0032.GAE@g... Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/file.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1654,8 +1654,11 @@ static int f2fs_do_zero_range(struct dno f2fs_set_data_blkaddr(dn, NEW_ADDR); }
- f2fs_update_read_extent_cache_range(dn, start, 0, index - start); - f2fs_update_age_extent_cache_range(dn, start, index - start); + if (index > start) { + f2fs_update_read_extent_cache_range(dn, start, 0, + index - start); + f2fs_update_age_extent_cache_range(dn, start, index - start); + }
return ret; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Deepanshu Kartikey kartikey406@gmail.com
commit d33f89b34aa313f50f9a512d58dd288999f246b0 upstream.
F2FS can mount filesystems with corrupted directory depth values that get runtime-clamped to MAX_DIR_HASH_DEPTH. When RENAME_WHITEOUT operations are performed on such directories, f2fs_rename performs directory modifications (updating target entry and deleting source entry) before attempting to add the whiteout entry via f2fs_add_link.
If f2fs_add_link fails due to the corrupted directory structure, the function returns an error to VFS, but the partial directory modifications have already been committed to disk. VFS assumes the entire rename operation failed and does not update the dentry cache, leaving stale mappings.
In the error path, VFS does not call d_move() to update the dentry cache. This results in new_dentry still pointing to the old inode (new_inode) which has already had its i_nlink decremented to zero. The stale cache causes subsequent operations to incorrectly reference the freed inode.
This causes subsequent operations to use cached dentry information that no longer matches the on-disk state. When a second rename targets the same entry, VFS attempts to decrement i_nlink on the stale inode, which may already have i_nlink=0, triggering a WARNING in drop_nlink().
Example sequence: 1. First rename (RENAME_WHITEOUT): file2 → file1 - f2fs updates file1 entry on disk (points to inode 8) - f2fs deletes file2 entry on disk - f2fs_add_link(whiteout) fails (corrupted directory) - Returns error to VFS - VFS does not call d_move() due to error - VFS cache still has: file1 → inode 7 (stale!) - inode 7 has i_nlink=0 (already decremented)
2. Second rename: file3 → file1 - VFS uses stale cache: file1 → inode 7 - Tries to drop_nlink on inode 7 (i_nlink already 0) - WARNING in drop_nlink()
Fix this by explicitly invalidating old_dentry and new_dentry when f2fs_add_link fails during whiteout creation. This forces VFS to refresh from disk on subsequent operations, ensuring cache consistency even when the rename partially succeeds.
Reproducer: 1. Mount F2FS image with corrupted i_current_depth 2. renameat2(file2, file1, RENAME_WHITEOUT) 3. renameat2(file3, file1, 0) 4. System triggers WARNING in drop_nlink()
Fixes: 7e01e7ad746b ("f2fs: support RENAME_WHITEOUT") Reported-by: syzbot+632cf32276a9a564188d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=632cf32276a9a564188d Suggested-by: Chao Yu chao@kernel.org Link: https://lore.kernel.org/all/20251022233349.102728-1-kartikey406@gmail.com/ [v1] Cc: stable@vger.kernel.org Signed-off-by: Deepanshu Kartikey kartikey406@gmail.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/namei.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/f2fs/namei.c +++ b/fs/f2fs/namei.c @@ -1053,9 +1053,11 @@ static int f2fs_rename(struct mnt_idmap if (whiteout) { set_inode_flag(whiteout, FI_INC_LINK); err = f2fs_add_link(old_dentry, whiteout); - if (err) + if (err) { + d_invalidate(old_dentry); + d_invalidate(new_dentry); goto put_out_dir; - + } spin_lock(&whiteout->i_lock); whiteout->i_state &= ~I_LINKABLE; spin_unlock(&whiteout->i_lock);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu chao@kernel.org
commit 1f27ef42bb0b7c0740c5616ec577ec188b8a1d05 upstream.
As Hong Yun reported in mailing list:
loop7: detected capacity change from 0 to 131072 ------------[ cut here ]------------ kmem_cache of name 'f2fs_xattr_entry-7:7' already exists WARNING: CPU: 0 PID: 24426 at mm/slab_common.c:110 kmem_cache_sanity_check mm/slab_common.c:109 [inline] WARNING: CPU: 0 PID: 24426 at mm/slab_common.c:110 __kmem_cache_create_args+0xa6/0x320 mm/slab_common.c:307 CPU: 0 UID: 0 PID: 24426 Comm: syz.7.1370 Not tainted 6.17.0-rc4 #1 PREEMPT(full) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:kmem_cache_sanity_check mm/slab_common.c:109 [inline] RIP: 0010:__kmem_cache_create_args+0xa6/0x320 mm/slab_common.c:307 Call Trace: __kmem_cache_create include/linux/slab.h:353 [inline] f2fs_kmem_cache_create fs/f2fs/f2fs.h:2943 [inline] f2fs_init_xattr_caches+0xa5/0xe0 fs/f2fs/xattr.c:843 f2fs_fill_super+0x1645/0x2620 fs/f2fs/super.c:4918 get_tree_bdev_flags+0x1fb/0x260 fs/super.c:1692 vfs_get_tree+0x43/0x140 fs/super.c:1815 do_new_mount+0x201/0x550 fs/namespace.c:3808 do_mount fs/namespace.c:4136 [inline] __do_sys_mount fs/namespace.c:4347 [inline] __se_sys_mount+0x298/0x2f0 fs/namespace.c:4324 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x8e/0x3a0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x76/0x7e
The bug can be reproduced w/ below scripts: - mount /dev/vdb /mnt1 - mount /dev/vdc /mnt2 - umount /mnt1 - mounnt /dev/vdb /mnt1
The reason is if we created two slab caches, named f2fs_xattr_entry-7:3 and f2fs_xattr_entry-7:7, and they have the same slab size. Actually, slab system will only create one slab cache core structure which has slab name of "f2fs_xattr_entry-7:3", and two slab caches share the same structure and cache address.
So, if we destroy f2fs_xattr_entry-7:3 cache w/ cache address, it will decrease reference count of slab cache, rather than release slab cache entirely, since there is one more user has referenced the cache.
Then, if we try to create slab cache w/ name "f2fs_xattr_entry-7:3" again, slab system will find that there is existed cache which has the same name and trigger the warning.
Let's changes to use global inline_xattr_slab instead of per-sb slab cache for fixing.
Fixes: a999150f4fe3 ("f2fs: use kmem_cache pool during inline xattr lookups") Cc: stable@kernel.org Reported-by: Hong Yun yhong@link.cuhk.edu.hk Tested-by: Hong Yun yhong@link.cuhk.edu.hk Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/f2fs.h | 3 --- fs/f2fs/super.c | 17 ++++++++--------- fs/f2fs/xattr.c | 32 +++++++++++--------------------- fs/f2fs/xattr.h | 10 ++++++---- 4 files changed, 25 insertions(+), 37 deletions(-)
--- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -1886,9 +1886,6 @@ struct f2fs_sb_info { spinlock_t error_lock; /* protect errors/stop_reason array */ bool error_dirty; /* errors of sb is dirty */
- struct kmem_cache *inline_xattr_slab; /* inline xattr entry */ - unsigned int inline_xattr_slab_size; /* default inline xattr slab size */ - /* For reclaimed segs statistics per each GC mode */ unsigned int gc_segment_mode; /* GC state for reclaimed segments */ unsigned int gc_reclaimed_segs[MAX_GC_MODE]; /* Reclaimed segs for each mode */ --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -2028,7 +2028,6 @@ static void f2fs_put_super(struct super_ kfree(sbi->raw_super);
f2fs_destroy_page_array_cache(sbi); - f2fs_destroy_xattr_caches(sbi); #ifdef CONFIG_QUOTA for (i = 0; i < MAXQUOTAS; i++) kfree(F2FS_OPTION(sbi).s_qf_names[i]); @@ -4997,13 +4996,9 @@ try_onemore: if (err) goto free_iostat;
- /* init per sbi slab cache */ - err = f2fs_init_xattr_caches(sbi); - if (err) - goto free_percpu; err = f2fs_init_page_array_cache(sbi); if (err) - goto free_xattr_cache; + goto free_percpu;
/* get an inode for meta space */ sbi->meta_inode = f2fs_iget(sb, F2FS_META_INO(sbi)); @@ -5331,8 +5326,6 @@ free_meta_inode: sbi->meta_inode = NULL; free_page_array_cache: f2fs_destroy_page_array_cache(sbi); -free_xattr_cache: - f2fs_destroy_xattr_caches(sbi); free_percpu: destroy_percpu_info(sbi); free_iostat: @@ -5535,10 +5528,15 @@ static int __init init_f2fs_fs(void) err = f2fs_create_casefold_cache(); if (err) goto free_compress_cache; - err = register_filesystem(&f2fs_fs_type); + err = f2fs_init_xattr_cache(); if (err) goto free_casefold_cache; + err = register_filesystem(&f2fs_fs_type); + if (err) + goto free_xattr_cache; return 0; +free_xattr_cache: + f2fs_destroy_xattr_cache(); free_casefold_cache: f2fs_destroy_casefold_cache(); free_compress_cache: @@ -5579,6 +5577,7 @@ fail: static void __exit exit_f2fs_fs(void) { unregister_filesystem(&f2fs_fs_type); + f2fs_destroy_xattr_cache(); f2fs_destroy_casefold_cache(); f2fs_destroy_compress_cache(); f2fs_destroy_compress_mempool(); --- a/fs/f2fs/xattr.c +++ b/fs/f2fs/xattr.c @@ -23,11 +23,12 @@ #include "xattr.h" #include "segment.h"
+static struct kmem_cache *inline_xattr_slab; static void *xattr_alloc(struct f2fs_sb_info *sbi, int size, bool *is_inline) { - if (likely(size == sbi->inline_xattr_slab_size)) { + if (likely(size == DEFAULT_XATTR_SLAB_SIZE)) { *is_inline = true; - return f2fs_kmem_cache_alloc(sbi->inline_xattr_slab, + return f2fs_kmem_cache_alloc(inline_xattr_slab, GFP_F2FS_ZERO, false, sbi); } *is_inline = false; @@ -38,7 +39,7 @@ static void xattr_free(struct f2fs_sb_in bool is_inline) { if (is_inline) - kmem_cache_free(sbi->inline_xattr_slab, xattr_addr); + kmem_cache_free(inline_xattr_slab, xattr_addr); else kfree(xattr_addr); } @@ -830,25 +831,14 @@ int f2fs_setxattr(struct inode *inode, i return err; }
-int f2fs_init_xattr_caches(struct f2fs_sb_info *sbi) +int __init f2fs_init_xattr_cache(void) { - dev_t dev = sbi->sb->s_bdev->bd_dev; - char slab_name[32]; - - sprintf(slab_name, "f2fs_xattr_entry-%u:%u", MAJOR(dev), MINOR(dev)); - - sbi->inline_xattr_slab_size = F2FS_OPTION(sbi).inline_xattr_size * - sizeof(__le32) + XATTR_PADDING_SIZE; - - sbi->inline_xattr_slab = f2fs_kmem_cache_create(slab_name, - sbi->inline_xattr_slab_size); - if (!sbi->inline_xattr_slab) - return -ENOMEM; - - return 0; + inline_xattr_slab = f2fs_kmem_cache_create("f2fs_xattr_entry", + DEFAULT_XATTR_SLAB_SIZE); + return inline_xattr_slab ? 0 : -ENOMEM; }
-void f2fs_destroy_xattr_caches(struct f2fs_sb_info *sbi) +void f2fs_destroy_xattr_cache(void) { - kmem_cache_destroy(sbi->inline_xattr_slab); -} + kmem_cache_destroy(inline_xattr_slab); +} \ No newline at end of file --- a/fs/f2fs/xattr.h +++ b/fs/f2fs/xattr.h @@ -89,6 +89,8 @@ struct f2fs_xattr_entry { F2FS_TOTAL_EXTRA_ATTR_SIZE / sizeof(__le32) - \ DEF_INLINE_RESERVED_SIZE - \ MIN_INLINE_DENTRY_SIZE / sizeof(__le32)) +#define DEFAULT_XATTR_SLAB_SIZE (DEFAULT_INLINE_XATTR_ADDRS * \ + sizeof(__le32) + XATTR_PADDING_SIZE)
/* * On-disk structure of f2fs_xattr @@ -132,8 +134,8 @@ int f2fs_setxattr(struct inode *, int, c int f2fs_getxattr(struct inode *, int, const char *, void *, size_t, struct folio *); ssize_t f2fs_listxattr(struct dentry *, char *, size_t); -int f2fs_init_xattr_caches(struct f2fs_sb_info *); -void f2fs_destroy_xattr_caches(struct f2fs_sb_info *); +int __init f2fs_init_xattr_cache(void); +void f2fs_destroy_xattr_cache(void); #else
#define f2fs_xattr_handlers NULL @@ -150,8 +152,8 @@ static inline int f2fs_getxattr(struct i { return -EOPNOTSUPP; } -static inline int f2fs_init_xattr_caches(struct f2fs_sb_info *sbi) { return 0; } -static inline void f2fs_destroy_xattr_caches(struct f2fs_sb_info *sbi) { } +static inline int __init f2fs_init_xattr_cache(void) { return 0; } +static inline void f2fs_destroy_xattr_cache(void) { } #endif
#ifdef CONFIG_F2FS_FS_SECURITY
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiaole He hexiaole1994@126.com
commit 27bf6a637b7613fc85fa6af468b7d612d78cd5c0 upstream.
The age extent cache uses last_blocks (derived from allocated_data_blocks) to determine data age. However, there's a conflict between the deletion marker (last_blocks=0) and legitimate last_blocks=0 cases when allocated_data_blocks overflows to 0 after reaching ULLONG_MAX.
In this case, valid extents are incorrectly skipped due to the "if (!tei->last_blocks)" check in __update_extent_tree_range().
This patch fixes the issue by: 1. Reserving ULLONG_MAX as an invalid/deletion marker 2. Limiting allocated_data_blocks to range [0, ULLONG_MAX-1] 3. Using F2FS_EXTENT_AGE_INVALID for deletion scenarios 4. Adjusting overflow age calculation from ULLONG_MAX to (ULLONG_MAX-1)
Reproducer (using a patched kernel with allocated_data_blocks initialized to ULLONG_MAX - 3 for quick testing):
Step 1: Mount and check initial state # dd if=/dev/zero of=/tmp/test.img bs=1M count=100 # mkfs.f2fs -f /tmp/test.img # mkdir -p /mnt/f2fs_test # mount -t f2fs -o loop,age_extent_cache /tmp/test.img /mnt/f2fs_test # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age" Allocated Data Blocks: 18446744073709551612 # ULLONG_MAX - 3 Inner Struct Count: tree: 1(0), node: 0
Step 2: Create files and write data to trigger overflow # touch /mnt/f2fs_test/{1,2,3,4}.txt; sync # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age" Allocated Data Blocks: 18446744073709551613 # ULLONG_MAX - 2 Inner Struct Count: tree: 5(0), node: 1
# dd if=/dev/urandom of=/mnt/f2fs_test/1.txt bs=4K count=1; sync # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age" Allocated Data Blocks: 18446744073709551614 # ULLONG_MAX - 1 Inner Struct Count: tree: 5(0), node: 2
# dd if=/dev/urandom of=/mnt/f2fs_test/2.txt bs=4K count=1; sync # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age" Allocated Data Blocks: 18446744073709551615 # ULLONG_MAX Inner Struct Count: tree: 5(0), node: 3
# dd if=/dev/urandom of=/mnt/f2fs_test/3.txt bs=4K count=1; sync # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age" Allocated Data Blocks: 0 # Counter overflowed! Inner Struct Count: tree: 5(0), node: 4
Step 3: Trigger the bug - next write should create node but gets skipped # dd if=/dev/urandom of=/mnt/f2fs_test/4.txt bs=4K count=1; sync # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age" Allocated Data Blocks: 1 Inner Struct Count: tree: 5(0), node: 4
Expected: node: 5 (new extent node for 4.txt) Actual: node: 4 (extent insertion was incorrectly skipped due to last_blocks = allocated_data_blocks = 0 in __get_new_block_age)
After this fix, the extent node is correctly inserted and node count becomes 5 as expected.
Fixes: 71644dff4811 ("f2fs: add block_age-based extent cache") Cc: stable@kernel.org Signed-off-by: Xiaole He hexiaole1994@126.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/extent_cache.c | 5 +++-- fs/f2fs/f2fs.h | 6 ++++++ fs/f2fs/segment.c | 9 +++++++-- 3 files changed, 16 insertions(+), 4 deletions(-)
--- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -808,7 +808,7 @@ static void __update_extent_tree_range(s } goto out_read_extent_cache; update_age_extent_cache: - if (!tei->last_blocks) + if (tei->last_blocks == F2FS_EXTENT_AGE_INVALID) goto out_read_extent_cache;
__set_extent_info(&ei, fofs, len, 0, false, @@ -912,7 +912,7 @@ static int __get_new_block_age(struct in cur_age = cur_blocks - tei.last_blocks; else /* allocated_data_blocks overflow */ - cur_age = ULLONG_MAX - tei.last_blocks + cur_blocks; + cur_age = (ULLONG_MAX - 1) - tei.last_blocks + cur_blocks;
if (tei.age) ei->age = __calculate_block_age(sbi, cur_age, tei.age); @@ -1114,6 +1114,7 @@ void f2fs_update_age_extent_cache_range( struct extent_info ei = { .fofs = fofs, .len = len, + .last_blocks = F2FS_EXTENT_AGE_INVALID, };
if (!__may_extent_tree(dn->inode, EX_BLOCK_AGE)) --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -708,6 +708,12 @@ enum extent_type { NR_EXTENT_CACHES, };
+/* + * Reserved value to mark invalid age extents, hence valid block range + * from 0 to ULLONG_MAX-1 + */ +#define F2FS_EXTENT_AGE_INVALID ULLONG_MAX + struct extent_info { unsigned int fofs; /* start offset in a file */ unsigned int len; /* length of the extent */ --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -3881,8 +3881,13 @@ skip_new_segment: locate_dirty_segment(sbi, GET_SEGNO(sbi, old_blkaddr)); locate_dirty_segment(sbi, GET_SEGNO(sbi, *new_blkaddr));
- if (IS_DATASEG(curseg->seg_type)) - atomic64_inc(&sbi->allocated_data_blocks); + if (IS_DATASEG(curseg->seg_type)) { + unsigned long long new_val; + + new_val = atomic64_inc_return(&sbi->allocated_data_blocks); + if (unlikely(new_val == ULLONG_MAX)) + atomic64_set(&sbi->allocated_data_blocks, 0); + }
up_write(&sit_i->sentry_lock);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiaole He hexiaole1994@126.com
commit 392711ef18bff524a873b9c239a73148c5432262 upstream.
The one_time_gc field in struct victim_sel_policy is conditionally initialized but unconditionally read, leading to undefined behavior that triggers UBSAN warnings.
In f2fs_get_victim() at fs/f2fs/gc.c:774, the victim_sel_policy structure is declared without initialization:
struct victim_sel_policy p;
The field p.one_time_gc is only assigned when the 'one_time' parameter is true (line 789):
if (one_time) { p.one_time_gc = one_time; ... }
However, this field is unconditionally read in subsequent get_gc_cost() at line 395:
if (p->one_time_gc && (valid_thresh_ratio < 100) && ...)
When one_time is false, p.one_time_gc contains uninitialized stack memory. Hence p.one_time_gc is an invalid bool value.
UBSAN detects this invalid bool value:
UBSAN: invalid-load in fs/f2fs/gc.c:395:7 load of value 77 is not a valid value for type '_Bool' CPU: 3 UID: 0 PID: 1297 Comm: f2fs_gc-252:16 Not tainted 6.18.0-rc3 #5 PREEMPT(voluntary) Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x70/0x90 dump_stack+0x14/0x20 __ubsan_handle_load_invalid_value+0xb3/0xf0 ? dl_server_update+0x2e/0x40 ? update_curr+0x147/0x170 f2fs_get_victim.cold+0x66/0x134 [f2fs] ? sched_balance_newidle+0x2ca/0x470 ? finish_task_switch.isra.0+0x8d/0x2a0 f2fs_gc+0x2ba/0x8e0 [f2fs] ? _raw_spin_unlock_irqrestore+0x12/0x40 ? __timer_delete_sync+0x80/0xe0 ? timer_delete_sync+0x14/0x20 ? schedule_timeout+0x82/0x100 gc_thread_func+0x38b/0x860 [f2fs] ? gc_thread_func+0x38b/0x860 [f2fs] ? __pfx_autoremove_wake_function+0x10/0x10 kthread+0x10b/0x220 ? __pfx_gc_thread_func+0x10/0x10 [f2fs] ? _raw_spin_unlock_irq+0x12/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x11a/0x160 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK>
This issue is reliably reproducible with the following steps on a 100GB SSD /dev/vdb:
mkfs.f2fs -f /dev/vdb mount /dev/vdb /mnt/f2fs_test fio --name=gc --directory=/mnt/f2fs_test --rw=randwrite \ --bs=4k --size=8G --numjobs=12 --fsync=4 --runtime=10 \ --time_based echo 1 > /sys/fs/f2fs/vdb/gc_urgent
The uninitialized value causes incorrect GC victim selection, leading to unpredictable garbage collection behavior.
Fix by zero-initializing the entire victim_sel_policy structure to ensure all fields have defined values.
Fixes: e791d00bd06c ("f2fs: add valid block ratio not to do excessive GC for one time GC") Cc: stable@kernel.org Signed-off-by: Xiaole He hexiaole1994@126.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/gc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/f2fs/gc.c +++ b/fs/f2fs/gc.c @@ -774,7 +774,7 @@ int f2fs_get_victim(struct f2fs_sb_info { struct dirty_seglist_info *dirty_i = DIRTY_I(sbi); struct sit_info *sm = SIT_I(sbi); - struct victim_sel_policy p; + struct victim_sel_policy p = {0}; unsigned int secno, last_victim; unsigned int last_segment; unsigned int nsearched;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu chao@kernel.org
commit 68d05693f8c031257a0822464366e1c2a239a512 upstream.
mkfs.f2fs -f /dev/vdd mount /dev/vdd /mnt/f2fs touch /mnt/f2fs/foo sync # avoid CP_UMOUNT_FLAG in last f2fs_checkpoint.ckpt_flags touch /mnt/f2fs/bar f2fs_io fsync /mnt/f2fs/bar f2fs_io shutdown 2 /mnt/f2fs umount /mnt/f2fs blockdev --setro /dev/vdd mount /dev/vdd /mnt/f2fs mount: /mnt/f2fs: WARNING: source write-protected, mounted read-only.
For the case if we create and fsync a new inode before sudden power-cut, without norecovery or disable_roll_forward mount option, the following mount will succeed w/o recovering last fsynced inode.
The problem here is that we only check inode_list list after find_fsync_dnodes() in f2fs_recover_fsync_data() to find out whether there is recoverable data in the iamge, but there is a missed case, if last fsynced inode is not existing in last checkpoint, then, we will fail to get its inode due to nat of inode node is not existing in last checkpoint, so the inode won't be linked in inode_list.
Let's detect such case in dyrun mode to fix this issue.
After this change, mount will fail as expected below: mount: /mnt/f2fs: cannot mount /dev/vdd read-only. dmesg(1) may have more information after failed mount system call. demsg: F2FS-fs (vdd): Need to recover fsync data, but write access unavailable, please try mount w/ disable_roll_forward or norecovery
Cc: stable@kernel.org Fixes: 6781eabba1bd ("f2fs: give -EINVAL for norecovery and rw mount") Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/recovery.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-)
--- a/fs/f2fs/recovery.c +++ b/fs/f2fs/recovery.c @@ -399,7 +399,7 @@ static int sanity_check_node_chain(struc }
static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head, - bool check_only) + bool check_only, bool *new_inode) { struct curseg_info *curseg; block_t blkaddr, blkaddr_fast; @@ -447,16 +447,19 @@ static int find_fsync_dnodes(struct f2fs quota_inode = true; }
- /* - * CP | dnode(F) | inode(DF) - * For this case, we should not give up now. - */ entry = add_fsync_inode(sbi, head, ino_of_node(folio), quota_inode); if (IS_ERR(entry)) { err = PTR_ERR(entry); - if (err == -ENOENT) + /* + * CP | dnode(F) | inode(DF) + * For this case, we should not give up now. + */ + if (err == -ENOENT) { + if (check_only) + *new_inode = true; goto next; + } f2fs_folio_put(folio, true); break; } @@ -875,6 +878,7 @@ int f2fs_recover_fsync_data(struct f2fs_ int ret = 0; unsigned long s_flags = sbi->sb->s_flags; bool need_writecp = false; + bool new_inode = false;
f2fs_notice(sbi, "f2fs_recover_fsync_data: recovery fsync data, " "check_only: %d", check_only); @@ -890,8 +894,8 @@ int f2fs_recover_fsync_data(struct f2fs_ f2fs_down_write(&sbi->cp_global_sem);
/* step #1: find fsynced inode numbers */ - err = find_fsync_dnodes(sbi, &inode_list, check_only); - if (err || list_empty(&inode_list)) + err = find_fsync_dnodes(sbi, &inode_list, check_only, &new_inode); + if (err < 0 || (list_empty(&inode_list) && (!check_only || !new_inode))) goto skip;
if (check_only) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu chao@kernel.org
commit 37345eae9deaa2e4f372eeb98f6594cd0ee0916e upstream.
w/ LFS mode, in get_left_section_blocks(), we should not account the blocks which were used before and now are invalided, otherwise those blocks will be counted as freed one in has_curseg_enough_space(), result in missing to trigger GC in time.
Cc: stable@kernel.org Fixes: 249ad438e1d9 ("f2fs: add a method for calculating the remaining blocks in the current segment in LFS mode.") Fixes: bf34c93d2645 ("f2fs: check curseg space before foreground GC") Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/segment.h | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/fs/f2fs/segment.h +++ b/fs/f2fs/segment.h @@ -607,10 +607,12 @@ static inline int reserved_sections(stru static inline unsigned int get_left_section_blocks(struct f2fs_sb_info *sbi, enum log_type type, unsigned int segno) { - if (f2fs_lfs_mode(sbi) && __is_large_section(sbi)) - return CAP_BLKS_PER_SEC(sbi) - SEGS_TO_BLKS(sbi, - (segno - GET_START_SEG_FROM_SEC(sbi, segno))) - + if (f2fs_lfs_mode(sbi)) { + unsigned int used_blocks = __is_large_section(sbi) ? SEGS_TO_BLKS(sbi, + (segno - GET_START_SEG_FROM_SEC(sbi, segno))) : 0; + return CAP_BLKS_PER_SEC(sbi) - used_blocks - CURSEG_I(sbi, type)->next_blkoff; + } return CAP_BLKS_PER_SEC(sbi) - get_ckpt_valid_blocks(sbi, segno, true); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu chao@kernel.org
commit 01fba45deaddcce0d0b01c411435d1acf6feab7b upstream.
With below scripts, it will trigger panic in f2fs:
mkfs.f2fs -f /dev/vdd mount /dev/vdd /mnt/f2fs touch /mnt/f2fs/foo sync echo 111 >> /mnt/f2fs/foo f2fs_io fsync /mnt/f2fs/foo f2fs_io shutdown 2 /mnt/f2fs umount /mnt/f2fs mount -o ro,norecovery /dev/vdd /mnt/f2fs or mount -o ro,disable_roll_forward /dev/vdd /mnt/f2fs
F2FS-fs (vdd): f2fs_recover_fsync_data: recovery fsync data, check_only: 0 F2FS-fs (vdd): Mounted with checkpoint version = 7f5c361f F2FS-fs (vdd): Stopped filesystem due to reason: 0 F2FS-fs (vdd): f2fs_recover_fsync_data: recovery fsync data, check_only: 1 Filesystem f2fs get_tree() didn't set fc->root, returned 1 ------------[ cut here ]------------ kernel BUG at fs/super.c:1761! Oops: invalid opcode: 0000 [#1] SMP PTI CPU: 3 UID: 0 PID: 722 Comm: mount Not tainted 6.18.0-rc2+ #721 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:vfs_get_tree.cold+0x18/0x1a Call Trace: <TASK> fc_mount+0x13/0xa0 path_mount+0x34e/0xc50 __x64_sys_mount+0x121/0x150 do_syscall_64+0x84/0x800 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fa6cc126cfe
The root cause is we missed to handle error number returned from f2fs_recover_fsync_data() when mounting image w/ ro,norecovery or ro,disable_roll_forward mount option, result in returning a positive error number to vfs_get_tree(), fix it.
Cc: stable@kernel.org Fixes: 6781eabba1bd ("f2fs: give -EINVAL for norecovery and rw mount") Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/super.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
--- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -5203,11 +5203,15 @@ try_onemore: } } else { err = f2fs_recover_fsync_data(sbi, true); - - if (!f2fs_readonly(sb) && err > 0) { - err = -EINVAL; - f2fs_err(sbi, "Need to recover fsync data"); - goto free_meta; + if (err > 0) { + if (!f2fs_readonly(sb)) { + f2fs_err(sbi, "Need to recover fsync data"); + err = -EINVAL; + goto free_meta; + } else { + f2fs_info(sbi, "drop all fsynced data"); + err = 0; + } } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alison Schofield alison.schofield@intel.com
commit f59b701b4674f7955170b54c4167c5590f4714eb upstream.
KASAN reports a global-out-of-bounds access when running these nfit tests: clear.sh, pmem-errors.sh, pfn-meta-errors.sh, btt-errors.sh, daxdev-errors.sh, and inject-error.sh.
[] BUG: KASAN: global-out-of-bounds in nfit_test_ctl+0x769f/0x7840 [nfit_test] [] Read of size 4 at addr ffffffffc03ea01c by task ndctl/1215 [] The buggy address belongs to the variable: [] handle+0x1c/0x1df4 [nfit_test]
nfit_test_search_spa() uses handle[nvdimm->id] to retrieve a device handle and triggers a KASAN error when it reads past the end of the handle array. It should not be indexing the handle array at all.
The correct device handle is stored in per-DIMM test data. Each DIMM has a struct nfit_mem that embeds a struct acpi_nfit_memdev that describes the NFIT device handle. Use that device handle here.
Fixes: 10246dc84dfc ("acpi nfit: nfit_test supports translate SPA") Cc: stable@vger.kernel.org Signed-off-by: Alison Schofield alison.schofield@intel.com Reviewed-by: Dave Jiang dave.jiang@intel.com> --- Link: https://patch.msgid.link/20251031234227.1303113-1-alison.schofield@intel.com Signed-off-by: Ira Weiny ira.weiny@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/nvdimm/test/nfit.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/tools/testing/nvdimm/test/nfit.c +++ b/tools/testing/nvdimm/test/nfit.c @@ -670,6 +670,7 @@ static int nfit_test_search_spa(struct n .addr = spa->spa, .region = NULL, }; + struct nfit_mem *nfit_mem; u64 dpa;
ret = device_for_each_child(&bus->dev, &ctx, @@ -687,8 +688,12 @@ static int nfit_test_search_spa(struct n */ nd_mapping = &nd_region->mapping[nd_region->ndr_mappings - 1]; nvdimm = nd_mapping->nvdimm; + nfit_mem = nvdimm_provider_data(nvdimm); + if (!nfit_mem) + return -EINVAL;
- spa->devices[0].nfit_device_handle = handle[nvdimm->id]; + spa->devices[0].nfit_device_handle = + __to_nfit_memdev(nfit_mem)->device_handle; spa->num_nvdimms = 1; spa->devices[0].dpa = dpa;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com
commit 44bf66122c12ef6d3382a9b84b9be1802e5f0e95 upstream.
Commit 1d2da79708cb ("pinctrl: renesas: rzg2l: Avoid configuring ISEL in gpio_irq_{en,dis}able*()") dropped the configuration of ISEL from struct irq_chip::{irq_enable, irq_disable} APIs and moved it to struct gpio_chip::irq::{child_to_parent_hwirq, child_irq_domain_ops::free} APIs to fix spurious IRQs.
After commit 1d2da79708cb ("pinctrl: renesas: rzg2l: Avoid configuring ISEL in gpio_irq_{en,dis}able*()"), ISEL was no longer configured properly on resume. This is because the pinctrl resume code used struct irq_chip::irq_enable (called from rzg2l_gpio_irq_restore()) to reconfigure the wakeup interrupts. Some drivers (e.g. Ethernet) may also reconfigure non-wakeup interrupts on resume through their own code, eventually calling struct irq_chip::irq_enable.
Fix this by adding ISEL configuration back into the struct irq_chip::irq_enable API and on resume path for wakeup interrupts.
As struct irq_chip::irq_enable needs now to lock to update the ISEL, convert the struct rzg2l_pinctrl::lock to a raw spinlock and replace the locking API calls with the raw variants. Otherwise the lockdep reports invalid wait context when probing the adv7511 module on RZ/G2L:
[ BUG: Invalid wait context ] 6.17.0-rc5-next-20250911-00001-gfcfac22533c9 #18 Not tainted ----------------------------- (udev-worker)/165 is trying to lock: ffff00000e3664a8 (&pctrl->lock){....}-{3:3}, at: rzg2l_gpio_irq_enable+0x38/0x78 other info that might help us debug this: context-{5:5} 3 locks held by (udev-worker)/165: #0: ffff00000e890108 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x90/0x1ac #1: ffff000011c07240 (request_class){+.+.}-{4:4}, at: __setup_irq+0xb4/0x6dc #2: ffff000011c070c8 (lock_class){....}-{2:2}, at: __setup_irq+0xdc/0x6dc stack backtrace: CPU: 1 UID: 0 PID: 165 Comm: (udev-worker) Not tainted 6.17.0-rc5-next-20250911-00001-gfcfac22533c9 #18 PREEMPT Hardware name: Renesas SMARC EVK based on r9a07g044l2 (DT) Call trace: show_stack+0x18/0x24 (C) dump_stack_lvl+0x90/0xd0 dump_stack+0x18/0x24 __lock_acquire+0xa14/0x20b4 lock_acquire+0x1c8/0x354 _raw_spin_lock_irqsave+0x60/0x88 rzg2l_gpio_irq_enable+0x38/0x78 irq_enable+0x40/0x8c __irq_startup+0x78/0xa4 irq_startup+0x108/0x16c __setup_irq+0x3c0/0x6dc request_threaded_irq+0xec/0x1ac devm_request_threaded_irq+0x80/0x134 adv7511_probe+0x928/0x9a4 [adv7511] i2c_device_probe+0x22c/0x3dc really_probe+0xbc/0x2a0 __driver_probe_device+0x78/0x12c driver_probe_device+0x40/0x164 __driver_attach+0x9c/0x1ac bus_for_each_dev+0x74/0xd0 driver_attach+0x24/0x30 bus_add_driver+0xe4/0x208 driver_register+0x60/0x128 i2c_register_driver+0x48/0xd0 adv7511_init+0x5c/0x1000 [adv7511] do_one_initcall+0x64/0x30c do_init_module+0x58/0x23c load_module+0x1bcc/0x1d40 init_module_from_file+0x88/0xc4 idempotent_init_module+0x188/0x27c __arm64_sys_finit_module+0x68/0xac invoke_syscall+0x48/0x110 el0_svc_common.constprop.0+0xc0/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x4c/0x160 el0t_64_sync_handler+0xa0/0xe4 el0t_64_sync+0x198/0x19c
Having ISEL configuration back into the struct irq_chip::irq_enable API should be safe with respect to spurious IRQs, as in the probe case IRQs are enabled anyway in struct gpio_chip::irq::child_to_parent_hwirq. No spurious IRQs were detected on suspend/resume, boot, ethernet link insert/remove tests (executed on RZ/G3S). Boot, ethernet link insert/remove tests were also executed successfully on RZ/G2L.
Fixes: 1d2da79708cb ("pinctrl: renesas: rzg2l: Avoid configuring ISEL in gpio_irq_{en,dis}able*(") Cc: stable@vger.kernel.org Signed-off-by: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://patch.msgid.link/20250912095308.3603704-1-claudiu.beznea.uj@bp.renes... Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/renesas/pinctrl-rzg2l.c | 71 +++++++++++++++++++------------- 1 file changed, 44 insertions(+), 27 deletions(-)
--- a/drivers/pinctrl/renesas/pinctrl-rzg2l.c +++ b/drivers/pinctrl/renesas/pinctrl-rzg2l.c @@ -359,7 +359,7 @@ struct rzg2l_pinctrl { spinlock_t bitmap_lock; /* protect tint_slot bitmap */ unsigned int hwirq[RZG2L_TINT_MAX_INTERRUPT];
- spinlock_t lock; /* lock read/write registers */ + raw_spinlock_t lock; /* lock read/write registers */ struct mutex mutex; /* serialize adding groups and functions */
struct rzg2l_pinctrl_pin_settings *settings; @@ -543,7 +543,7 @@ static void rzg2l_pinctrl_set_pfc_mode(s unsigned long flags; u32 reg;
- spin_lock_irqsave(&pctrl->lock, flags); + raw_spin_lock_irqsave(&pctrl->lock, flags);
/* Set pin to 'Non-use (Hi-Z input protection)' */ reg = readw(pctrl->base + PM(off)); @@ -567,7 +567,7 @@ static void rzg2l_pinctrl_set_pfc_mode(s
pctrl->data->pwpr_pfc_lock_unlock(pctrl, true);
- spin_unlock_irqrestore(&pctrl->lock, flags); + raw_spin_unlock_irqrestore(&pctrl->lock, flags); };
static int rzg2l_pinctrl_set_mux(struct pinctrl_dev *pctldev, @@ -882,10 +882,10 @@ static void rzg2l_rmw_pin_config(struct addr += 4; }
- spin_lock_irqsave(&pctrl->lock, flags); + raw_spin_lock_irqsave(&pctrl->lock, flags); reg = readl(addr) & ~(mask << (bit * 8)); writel(reg | (val << (bit * 8)), addr); - spin_unlock_irqrestore(&pctrl->lock, flags); + raw_spin_unlock_irqrestore(&pctrl->lock, flags); }
static int rzg2l_caps_to_pwr_reg(const struct rzg2l_register_offsets *regs, u32 caps) @@ -1121,7 +1121,7 @@ static int rzg2l_write_oen(struct rzg2l_ if (bit < 0) return -EINVAL;
- spin_lock_irqsave(&pctrl->lock, flags); + raw_spin_lock_irqsave(&pctrl->lock, flags); val = readb(pctrl->base + oen_offset); if (oen) val &= ~BIT(bit); @@ -1134,7 +1134,7 @@ static int rzg2l_write_oen(struct rzg2l_ writeb(val, pctrl->base + oen_offset); if (pctrl->data->hwcfg->oen_pwpr_lock) writeb(pwpr & ~PWPR_REGWE_B, pctrl->base + regs->pwpr); - spin_unlock_irqrestore(&pctrl->lock, flags); + raw_spin_unlock_irqrestore(&pctrl->lock, flags);
return 0; } @@ -1687,14 +1687,14 @@ static int rzg2l_gpio_request(struct gpi if (ret) return ret;
- spin_lock_irqsave(&pctrl->lock, flags); + raw_spin_lock_irqsave(&pctrl->lock, flags);
/* Select GPIO mode in PMC Register */ reg8 = readb(pctrl->base + PMC(off)); reg8 &= ~BIT(bit); pctrl->data->pmc_writeb(pctrl, reg8, PMC(off));
- spin_unlock_irqrestore(&pctrl->lock, flags); + raw_spin_unlock_irqrestore(&pctrl->lock, flags);
return 0; } @@ -1709,7 +1709,7 @@ static void rzg2l_gpio_set_direction(str unsigned long flags; u16 reg16;
- spin_lock_irqsave(&pctrl->lock, flags); + raw_spin_lock_irqsave(&pctrl->lock, flags);
reg16 = readw(pctrl->base + PM(off)); reg16 &= ~(PM_MASK << (bit * 2)); @@ -1717,7 +1717,7 @@ static void rzg2l_gpio_set_direction(str reg16 |= (output ? PM_OUTPUT : PM_INPUT) << (bit * 2); writew(reg16, pctrl->base + PM(off));
- spin_unlock_irqrestore(&pctrl->lock, flags); + raw_spin_unlock_irqrestore(&pctrl->lock, flags); }
static int rzg2l_gpio_get_direction(struct gpio_chip *chip, unsigned int offset) @@ -1761,7 +1761,7 @@ static int rzg2l_gpio_set(struct gpio_ch unsigned long flags; u8 reg8;
- spin_lock_irqsave(&pctrl->lock, flags); + raw_spin_lock_irqsave(&pctrl->lock, flags);
reg8 = readb(pctrl->base + P(off));
@@ -1770,7 +1770,7 @@ static int rzg2l_gpio_set(struct gpio_ch else writeb(reg8 & ~BIT(bit), pctrl->base + P(off));
- spin_unlock_irqrestore(&pctrl->lock, flags); + raw_spin_unlock_irqrestore(&pctrl->lock, flags);
return 0; } @@ -2429,14 +2429,13 @@ static int rzg2l_gpio_get_gpioint(unsign return gpioint; }
-static void rzg2l_gpio_irq_endisable(struct rzg2l_pinctrl *pctrl, - unsigned int hwirq, bool enable) +static void __rzg2l_gpio_irq_endisable(struct rzg2l_pinctrl *pctrl, + unsigned int hwirq, bool enable) { const struct pinctrl_pin_desc *pin_desc = &pctrl->desc.pins[hwirq]; u64 *pin_data = pin_desc->drv_data; u32 off = RZG2L_PIN_CFG_TO_PORT_OFFSET(*pin_data); u8 bit = RZG2L_PIN_ID_TO_PIN(hwirq); - unsigned long flags; void __iomem *addr;
addr = pctrl->base + ISEL(off); @@ -2445,12 +2444,20 @@ static void rzg2l_gpio_irq_endisable(str addr += 4; }
- spin_lock_irqsave(&pctrl->lock, flags); if (enable) writel(readl(addr) | BIT(bit * 8), addr); else writel(readl(addr) & ~BIT(bit * 8), addr); - spin_unlock_irqrestore(&pctrl->lock, flags); +} + +static void rzg2l_gpio_irq_endisable(struct rzg2l_pinctrl *pctrl, + unsigned int hwirq, bool enable) +{ + unsigned long flags; + + raw_spin_lock_irqsave(&pctrl->lock, flags); + __rzg2l_gpio_irq_endisable(pctrl, hwirq, enable); + raw_spin_unlock_irqrestore(&pctrl->lock, flags); }
static void rzg2l_gpio_irq_disable(struct irq_data *d) @@ -2462,15 +2469,25 @@ static void rzg2l_gpio_irq_disable(struc gpiochip_disable_irq(gc, hwirq); }
-static void rzg2l_gpio_irq_enable(struct irq_data *d) +static void __rzg2l_gpio_irq_enable(struct irq_data *d, bool lock) { struct gpio_chip *gc = irq_data_get_irq_chip_data(d); + struct rzg2l_pinctrl *pctrl = container_of(gc, struct rzg2l_pinctrl, gpio_chip); unsigned int hwirq = irqd_to_hwirq(d);
gpiochip_enable_irq(gc, hwirq); + if (lock) + rzg2l_gpio_irq_endisable(pctrl, hwirq, true); + else + __rzg2l_gpio_irq_endisable(pctrl, hwirq, true); irq_chip_enable_parent(d); }
+static void rzg2l_gpio_irq_enable(struct irq_data *d) +{ + __rzg2l_gpio_irq_enable(d, true); +} + static int rzg2l_gpio_irq_set_type(struct irq_data *d, unsigned int type) { return irq_chip_set_type_parent(d, type); @@ -2616,11 +2633,11 @@ static void rzg2l_gpio_irq_restore(struc * This has to be atomically executed to protect against a concurrent * interrupt. */ - spin_lock_irqsave(&pctrl->lock, flags); + raw_spin_lock_irqsave(&pctrl->lock, flags); ret = rzg2l_gpio_irq_set_type(data, irqd_get_trigger_type(data)); if (!ret && !irqd_irq_disabled(data)) - rzg2l_gpio_irq_enable(data); - spin_unlock_irqrestore(&pctrl->lock, flags); + __rzg2l_gpio_irq_enable(data, false); + raw_spin_unlock_irqrestore(&pctrl->lock, flags);
if (ret) dev_crit(pctrl->dev, "Failed to set IRQ type for virq=%u\n", virq); @@ -2950,7 +2967,7 @@ static int rzg2l_pinctrl_probe(struct pl "failed to enable GPIO clk\n"); }
- spin_lock_init(&pctrl->lock); + raw_spin_lock_init(&pctrl->lock); spin_lock_init(&pctrl->bitmap_lock); mutex_init(&pctrl->mutex); atomic_set(&pctrl->wakeup_path, 0); @@ -3097,7 +3114,7 @@ static void rzg2l_pinctrl_pm_setup_pfc(s u32 nports = pctrl->data->n_port_pins / RZG2L_PINS_PER_PORT; unsigned long flags;
- spin_lock_irqsave(&pctrl->lock, flags); + raw_spin_lock_irqsave(&pctrl->lock, flags); pctrl->data->pwpr_pfc_lock_unlock(pctrl, false);
/* Restore port registers. */ @@ -3142,7 +3159,7 @@ static void rzg2l_pinctrl_pm_setup_pfc(s }
pctrl->data->pwpr_pfc_lock_unlock(pctrl, true); - spin_unlock_irqrestore(&pctrl->lock, flags); + raw_spin_unlock_irqrestore(&pctrl->lock, flags); }
static int rzg2l_pinctrl_suspend_noirq(struct device *dev) @@ -3191,14 +3208,14 @@ static int rzg2l_pinctrl_resume_noirq(st
writeb(cache->qspi, pctrl->base + QSPI); if (pctrl->data->hwcfg->oen_pwpr_lock) { - spin_lock_irqsave(&pctrl->lock, flags); + raw_spin_lock_irqsave(&pctrl->lock, flags); pwpr = readb(pctrl->base + regs->pwpr); writeb(pwpr | PWPR_REGWE_B, pctrl->base + regs->pwpr); } writeb(cache->oen, pctrl->base + pctrl->data->hwcfg->regs.oen); if (pctrl->data->hwcfg->oen_pwpr_lock) { writeb(pwpr & ~PWPR_REGWE_B, pctrl->base + regs->pwpr); - spin_unlock_irqrestore(&pctrl->lock, flags); + raw_spin_unlock_irqrestore(&pctrl->lock, flags); } for (u8 i = 0; i < 2; i++) { if (regs->sd_ch)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 9935df5333aa503a18de5071f53762b65c783c4c upstream.
Reject attempts to disable KVM_MEM_GUEST_MEMFD on a memslot that was initially created with a guest_memfd binding, as KVM doesn't support toggling KVM_MEM_GUEST_MEMFD on existing memslots. KVM prevents enabling KVM_MEM_GUEST_MEMFD, but doesn't prevent clearing the flag.
Failure to reject the new memslot results in a use-after-free due to KVM not unbinding from the guest_memfd instance. Unbinding on a FLAGS_ONLY change is easy enough, and can/will be done as a hardening measure (in anticipation of KVM supporting dirty logging on guest_memfd at some point), but fixing the use-after-free would only address the immediate symptom.
================================================================== BUG: KASAN: slab-use-after-free in kvm_gmem_release+0x362/0x400 [kvm] Write of size 8 at addr ffff8881111ae908 by task repro/745
CPU: 7 UID: 1000 PID: 745 Comm: repro Not tainted 6.18.0-rc6-115d5de2eef3-next-kasan #3 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl+0x51/0x60 print_report+0xcb/0x5c0 kasan_report+0xb4/0xe0 kvm_gmem_release+0x362/0x400 [kvm] __fput+0x2fa/0x9d0 task_work_run+0x12c/0x200 do_exit+0x6ae/0x2100 do_group_exit+0xa8/0x230 __x64_sys_exit_group+0x3a/0x50 x64_sys_call+0x737/0x740 do_syscall_64+0x5b/0x900 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f581f2eac31 </TASK>
Allocated by task 745 on cpu 6 at 9.746971s: kasan_save_stack+0x20/0x40 kasan_save_track+0x13/0x50 __kasan_kmalloc+0x77/0x90 kvm_set_memory_region.part.0+0x652/0x1110 [kvm] kvm_vm_ioctl+0x14b0/0x3290 [kvm] __x64_sys_ioctl+0x129/0x1a0 do_syscall_64+0x5b/0x900 entry_SYSCALL_64_after_hwframe+0x4b/0x53
Freed by task 745 on cpu 6 at 9.747467s: kasan_save_stack+0x20/0x40 kasan_save_track+0x13/0x50 __kasan_save_free_info+0x37/0x50 __kasan_slab_free+0x3b/0x60 kfree+0xf5/0x440 kvm_set_memslot+0x3c2/0x1160 [kvm] kvm_set_memory_region.part.0+0x86a/0x1110 [kvm] kvm_vm_ioctl+0x14b0/0x3290 [kvm] __x64_sys_ioctl+0x129/0x1a0 do_syscall_64+0x5b/0x900 entry_SYSCALL_64_after_hwframe+0x4b/0x53
Reported-by: Alexander Potapenko glider@google.com Fixes: a7800aa80ea4 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251202020334.1171351-2-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- virt/kvm/kvm_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2085,7 +2085,7 @@ static int kvm_set_memory_region(struct return -EINVAL; if ((mem->userspace_addr != old->userspace_addr) || (npages != old->npages) || - ((mem->flags ^ old->flags) & KVM_MEM_READONLY)) + ((mem->flags ^ old->flags) & (KVM_MEM_READONLY | KVM_MEM_GUEST_MEMFD))) return -EINVAL;
if (base_gfn != old->base_gfn)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Deepanshu Kartikey kartikey406@gmail.com
commit 53ca00a19d345197a37a1bf552e8d1e7b091666c upstream.
When CONFIG_SLUB_TINY is enabled, kfree_nolock() calls kasan_slab_free() before defer_free(). On ARM64 with MTE (Memory Tagging Extension), kasan_slab_free() poisons the memory and changes the tag from the original (e.g., 0xf3) to a poison tag (0xfe).
When defer_free() then tries to write to the freed object to build the deferred free list via llist_add(), the pointer still has the old tag, causing a tag mismatch and triggering a KASAN use-after-free report:
BUG: KASAN: slab-use-after-free in defer_free+0x3c/0xbc mm/slub.c:6537 Write at addr f3f000000854f020 by task kworker/u8:6/983 Pointer tag: [f3], memory tag: [fe]
Fix this by calling kasan_reset_tag() before accessing the freed memory. This is safe because defer_free() is part of the allocator itself and is expected to manipulate freed memory for bookkeeping purposes.
Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().") Cc: stable@vger.kernel.org Reported-by: syzbot+7a25305a76d872abcfa1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7a25305a76d872abcfa1 Tested-by: syzbot+7a25305a76d872abcfa1@syzkaller.appspotmail.com Signed-off-by: Deepanshu Kartikey kartikey406@gmail.com Acked-by: Alexei Starovoitov ast@kernel.org Link: https://patch.msgid.link/20251210022024.3255826-1-kartikey406@gmail.com Signed-off-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/slub.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/mm/slub.c +++ b/mm/slub.c @@ -6509,6 +6509,8 @@ static void defer_free(struct kmem_cache
guard(preempt)();
+ head = kasan_reset_tag(head); + df = this_cpu_ptr(&defer_free_objects); if (llist_add(head + s->offset, &df->objects)) irq_work_queue(&df->work);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeongjun Park aha310510@gmail.com
commit 98aabfe2d79f74613abc2b0b1cef08f97eaf5322 upstream.
vidtv_channel_si_init() creates a temporary list (program, service, event) and ownership of the memory itself is transferred to the PAT/SDT/EIT tables through vidtv_psi_pat_program_assign(), vidtv_psi_sdt_service_assign(), vidtv_psi_eit_event_assign().
The problem here is that the local pointer where the memory ownership transfer was completed is not initialized to NULL. This causes the vidtv_psi_pmt_create_sec_for_each_pat_entry() function to fail, and in the flow that jumps to free_eit, the memory that was freed by vidtv_psi_*_table_destroy() can be accessed again by vidtv_psi_*_event_destroy() due to the uninitialized local pointer, so it is freed once again.
Therefore, to prevent use-after-free and double-free vulnerability, local pointers must be initialized to NULL when transferring memory ownership.
Cc: stable@vger.kernel.org Reported-by: syzbot+1d9c0edea5907af239e0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1d9c0edea5907af239e0 Fixes: 3be8037960bc ("media: vidtv: add error checks") Signed-off-by: Jeongjun Park aha310510@gmail.com Reviewed-by: Daniel Almeida daniel.almeida@collabora.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/test-drivers/vidtv/vidtv_channel.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/media/test-drivers/vidtv/vidtv_channel.c +++ b/drivers/media/test-drivers/vidtv/vidtv_channel.c @@ -461,12 +461,15 @@ int vidtv_channel_si_init(struct vidtv_m
/* assemble all programs and assign to PAT */ vidtv_psi_pat_program_assign(m->si.pat, programs); + programs = NULL;
/* assemble all services and assign to SDT */ vidtv_psi_sdt_service_assign(m->si.sdt, services); + services = NULL;
/* assemble all events and assign to EIT */ vidtv_psi_eit_event_assign(m->si.eit, events); + events = NULL;
m->si.pmt_secs = vidtv_psi_pmt_create_sec_for_each_pat_entry(m->si.pat, m->pcr_pid);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prithvi Tambewagh activprithvi@gmail.com
commit 039bef30e320827bac8990c9f29d2a68cd8adb5f upstream.
syzbot reported a kernel BUG in ocfs2_find_victim_chain() because the `cl_next_free_rec` field of the allocation chain list (next free slot in the chain list) is 0, triggring the BUG_ON(!cl->cl_next_free_rec) condition in ocfs2_find_victim_chain() and panicking the kernel.
To fix this, an if condition is introduced in ocfs2_claim_suballoc_bits(), just before calling ocfs2_find_victim_chain(), the code block in it being executed when either of the following conditions is true:
1. `cl_next_free_rec` is equal to 0, indicating that there are no free chains in the allocation chain list 2. `cl_next_free_rec` is greater than `cl_count` (the total number of chains in the allocation chain list)
Either of them being true is indicative of the fact that there are no chains left for usage.
This is addressed using ocfs2_error(), which prints the error log for debugging purposes, rather than panicking the kernel.
Link: https://lkml.kernel.org/r/20251201130711.143900-1-activprithvi@gmail.com Signed-off-by: Prithvi Tambewagh activprithvi@gmail.com Reported-by: syzbot+96d38c6e1655c1420a72@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=96d38c6e1655c1420a72 Tested-by: syzbot+96d38c6e1655c1420a72@syzkaller.appspotmail.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Jun Piao piaojun@huawei.com Cc: Heming Zhao heming.zhao@suse.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/suballoc.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/fs/ocfs2/suballoc.c +++ b/fs/ocfs2/suballoc.c @@ -1992,6 +1992,16 @@ static int ocfs2_claim_suballoc_bits(str }
cl = (struct ocfs2_chain_list *) &fe->id2.i_chain; + if (!le16_to_cpu(cl->cl_next_free_rec) || + le16_to_cpu(cl->cl_next_free_rec) > le16_to_cpu(cl->cl_count)) { + status = ocfs2_error(ac->ac_inode->i_sb, + "Chain allocator dinode %llu has invalid next " + "free chain record %u, but only %u total\n", + (unsigned long long)le64_to_cpu(fe->i_blkno), + le16_to_cpu(cl->cl_next_free_rec), + le16_to_cpu(cl->cl_count)); + goto bail; + }
victim = ocfs2_find_victim_chain(cl); ac->ac_chain = victim;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maxim Levitsky mlevitsk@redhat.com
commit ab4e41eb9fabd4607304fa7cfe8ec9c0bd8e1552 upstream.
Fix an interaction between SMM and PV asynchronous #PFs where an #SMI can cause KVM to drop an async #PF ready event, and thus result in guest tasks becoming permanently stuck due to the task that encountered the #PF never being resumed. Specifically, don't clear the completion queue when paging is disabled, and re-check for completed async #PFs if/when paging is enabled.
Prior to commit 2635b5c4a0e4 ("KVM: x86: interrupt based APF 'page ready' event delivery"), flushing the APF queue without notifying the guest of completed APF requests when paging is disabled was "necessary", in that delivering a #PF to the guest when paging is disabled would likely confuse and/or crash the guest. And presumably the original async #PF development assumed that a guest would only disable paging when there was no intent to ever re-enable paging.
That assumption fails in several scenarios, most visibly on an emulated SMI, as entering SMM always disables CR0.PG (i.e. initially runs with paging disabled). When the SMM handler eventually executes RSM, the interrupted paging-enabled is restored, and the async #PF event is lost.
Similarly, invoking firmware, e.g. via EFI runtime calls, might require a transition through paging modes and thus also disable paging with valid entries in the competion queue.
To avoid dropping completion events, drop the "clear" entirely, and handle paging-enable transitions in the same way KVM already handles APIC enable/disable events: if a vCPU's APIC is disabled, APF completion events are not kept pending and not injected while APIC is disabled. Once a vCPU's APIC is re-enabled, KVM raises KVM_REQ_APF_READY so that the vCPU recognizes any pending pending #APF ready events.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251015033258.50974-4-mlevitsk@redhat.com [sean: rework changelog to call out #PF injection, drop "real mode" references, expand the code comment] Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/x86.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1045,6 +1045,13 @@ bool kvm_require_dr(struct kvm_vcpu *vcp } EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_require_dr);
+static bool kvm_pv_async_pf_enabled(struct kvm_vcpu *vcpu) +{ + u64 mask = KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT; + + return (vcpu->arch.apf.msr_en_val & mask) == mask; +} + static inline u64 pdptr_rsvd_bits(struct kvm_vcpu *vcpu) { return vcpu->arch.reserved_gpa_bits | rsvd_bits(5, 8) | rsvd_bits(1, 2); @@ -1137,15 +1144,20 @@ void kvm_post_set_cr0(struct kvm_vcpu *v }
if ((cr0 ^ old_cr0) & X86_CR0_PG) { - kvm_clear_async_pf_completion_queue(vcpu); - kvm_async_pf_hash_reset(vcpu); - /* * Clearing CR0.PG is defined to flush the TLB from the guest's * perspective. */ if (!(cr0 & X86_CR0_PG)) kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu); + /* + * Check for async #PF completion events when enabling paging, + * as the vCPU may have previously encountered async #PFs (it's + * entirely legal for the guest to toggle paging on/off without + * waiting for the async #PF queue to drain). + */ + else if (kvm_pv_async_pf_enabled(vcpu)) + kvm_make_request(KVM_REQ_APF_READY, vcpu); }
if ((cr0 ^ old_cr0) & KVM_MMU_CR0_ROLE_BITS) @@ -3650,13 +3662,6 @@ static int set_msr_mce(struct kvm_vcpu * return 0; }
-static inline bool kvm_pv_async_pf_enabled(struct kvm_vcpu *vcpu) -{ - u64 mask = KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT; - - return (vcpu->arch.apf.msr_en_val & mask) == mask; -} - static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data) { gpa_t gpa = data & ~0x3f;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tzung-Bi Shih tzungbi@kernel.org
commit 944edca81e7aea15f83cf9a13a6ab67f711e8abd upstream.
After unbinding the driver, another kthread `cros_ec_console_log_work` is still accessing the device, resulting an UAF and crash.
The driver doesn't unregister the EC device in .remove() which should shutdown sub-devices synchronously. Fix it.
Fixes: 26a14267aff2 ("platform/chrome: Add ChromeOS EC ISHTP driver") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20251031033900.3577394-1-tzungbi@kernel.org Signed-off-by: Tzung-Bi Shih tzungbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/chrome/cros_ec_ishtp.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/platform/chrome/cros_ec_ishtp.c +++ b/drivers/platform/chrome/cros_ec_ishtp.c @@ -667,6 +667,7 @@ static void cros_ec_ishtp_remove(struct
cancel_work_sync(&client_data->work_ishtp_reset); cancel_work_sync(&client_data->work_ec_evt); + cros_ec_unregister(client_data->ec_dev); cros_ish_deinit(cros_ish_cl); ishtp_put_device(cl_device); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wangao Wang wangao.wang@oss.qualcomm.com
commit ad699fa78b59241c9d71a8cafb51525f3dab04d4 upstream.
Add sanity check in iris_vb2_stop_streaming. If inst->state is already IRIS_INST_ERROR, we should skip the stream_off operation because it would still send packets to the firmware.
In iris_kill_session, inst->state is set to IRIS_INST_ERROR and session_close is executed, which will kfree(inst_hfi_gen2->packet). If stop_streaming is called afterward, it will cause a crash.
Fixes: 11712ce70f8e5 ("media: iris: implement vb2 streaming ops") Cc: stable@vger.kernel.org Reviewed-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Dikshita Agarwal dikshita.agarwal@oss.qualcomm.com Signed-off-by: Wangao Wang wangao.wang@oss.qualcomm.com Reviewed-by: Vikash Garodia vikash.garodia@oss.qualcomm.com [bod: remove qcom from patch title] Signed-off-by: Bryan O'Donoghue bod@kernel.org Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/iris/iris_vb2.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/media/platform/qcom/iris/iris_vb2.c +++ b/drivers/media/platform/qcom/iris/iris_vb2.c @@ -231,6 +231,8 @@ void iris_vb2_stop_streaming(struct vb2_ return;
mutex_lock(&inst->lock); + if (inst->state == IRIS_INST_ERROR) + goto exit;
if (!V4L2_TYPE_IS_OUTPUT(q->type) && !V4L2_TYPE_IS_CAPTURE(q->type)) @@ -241,10 +243,10 @@ void iris_vb2_stop_streaming(struct vb2_ goto exit;
exit: - iris_helper_buffers_done(inst, q->type, VB2_BUF_STATE_ERROR); - if (ret) + if (ret) { + iris_helper_buffers_done(inst, q->type, VB2_BUF_STATE_ERROR); iris_inst_change_state(inst, IRIS_INST_ERROR); - + } mutex_unlock(&inst->lock); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhichi Lin zhichi.lin@vivo.com
commit 08bd4c46d5e63b78e77f2605283874bbe868ab19 upstream.
__scs_magic() needs a 'void *' variable, but a 'struct task_struct *' is given. 'task_scs(tsk)' is the starting address of the task's shadow call stack, and '__scs_magic(task_scs(tsk))' is the end address of the task's shadow call stack. Here should be '__scs_magic(task_scs(tsk))'.
The user-visible effect of this bug is that when CONFIG_DEBUG_STACK_USAGE is enabled, the shadow call stack usage checking function (scs_check_usage) would scan an incorrect memory range. This could lead to:
1. **Inaccurate stack usage reporting**: The function would calculate wrong usage statistics for the shadow call stack, potentially showing incorrect value in kmsg.
2. **Potential kernel crash**: If the value of __scs_magic(tsk)is greater than that of __scs_magic(task_scs(tsk)), the for loop may access unmapped memory, potentially causing a kernel panic. However, this scenario is unlikely because task_struct is allocated via the slab allocator (which typically returns lower addresses), while the shadow call stack returned by task_scs(tsk) is allocated via vmalloc(which typically returns higher addresses).
However, since this is purely a debugging feature (CONFIG_DEBUG_STACK_USAGE), normal production systems should be not unaffected. The bug only impacts developers and testers who are actively debugging stack usage with this configuration enabled.
Link: https://lkml.kernel.org/r/20251011082222.12965-1-zhichi.lin@vivo.com Fixes: 5bbaf9d1fcb9 ("scs: Add support for stack usage debugging") Signed-off-by: Jiyuan Xie xiejiyuan@vivo.com Signed-off-by: Zhichi Lin zhichi.lin@vivo.com Reviewed-by: Sami Tolvanen samitolvanen@google.com Acked-by: Will Deacon will@kernel.org Cc: Andrey Konovalov andreyknvl@gmail.com Cc: Kees Cook keescook@chromium.org Cc: Marco Elver elver@google.com Cc: Will Deacon will@kernel.org Cc: Yee Lee yee.lee@mediatek.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/scs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/scs.c +++ b/kernel/scs.c @@ -135,7 +135,7 @@ static void scs_check_usage(struct task_ if (!IS_ENABLED(CONFIG_DEBUG_STACK_USAGE)) return;
- for (p = task_scs(tsk); p < __scs_magic(tsk); ++p) { + for (p = task_scs(tsk); p < __scs_magic(task_scs(tsk)); ++p) { if (!READ_ONCE_NOCHECK(*p)) break; used += sizeof(*p);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit dca7da244349eef4d78527cafc0bf80816b261f5 upstream.
The ASP chip is a very old variant of the GSP chip and is used e.g. in HP 730 workstations. When trying to reprogram the affinity it will crash with a HPMC as the relevant registers don't seem to be at the usual location. Let's avoid the crash by checking the sversion. Also note, that reprogramming isn't necessary either, as the HP730 is a just a single-CPU machine.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/parisc/gsc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/parisc/gsc.c +++ b/drivers/parisc/gsc.c @@ -154,7 +154,9 @@ static int gsc_set_affinity_irq(struct i gsc_dev->eim = ((u32) gsc_dev->gsc_irq.txn_addr) | gsc_dev->gsc_irq.txn_data;
/* switch IRQ's for devices below LASI/WAX to other CPU */ - gsc_writel(gsc_dev->eim, gsc_dev->hpa + OFFSET_IAR); + /* ASP chip (svers 0x70) does not support reprogramming */ + if (gsc_dev->gsc->id.sversion != 0x70) + gsc_writel(gsc_dev->eim, gsc_dev->hpa + OFFSET_IAR);
irq_data_update_effective_affinity(d, &tmask);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dongsheng Yang dongsheng.yang@linux.dev
commit ebbb90344a7da2421e4b54668b94e81828b8b308 upstream.
In dm-pcache, in order to ensure crash-consistency, a dual-copy scheme is used to alternately update metadata, and there is a slot index that records the current slot. However, in the write path the current implementation writes directly to the current slot indexed by slot index, and then advances the slot — which ends up overwriting the existing slot, violating the crash-consistency guarantee.
This patch fixes that behavior, preventing metadata from being overwritten incorrectly.
In addition, this patch add a missing pmem_wmb() after memcpy_flushcache().
Signed-off-by: Dongsheng Yang dongsheng.yang@linux.dev Signed-off-by: Mikulas Patocka mpatocka@redhat.com Reviewed-by: Zheng Gu cengku@gmail.com Cc: stable@vger.kernel.org # 6.18 Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-pcache/cache.c | 8 ++++---- drivers/md/dm-pcache/cache_segment.c | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-)
--- a/drivers/md/dm-pcache/cache.c +++ b/drivers/md/dm-pcache/cache.c @@ -21,10 +21,10 @@ static void cache_info_write(struct pcac cache_info->header.crc = pcache_meta_crc(&cache_info->header, sizeof(struct pcache_cache_info));
+ cache->info_index = (cache->info_index + 1) % PCACHE_META_INDEX_MAX; memcpy_flushcache(get_cache_info_addr(cache), cache_info, sizeof(struct pcache_cache_info)); - - cache->info_index = (cache->info_index + 1) % PCACHE_META_INDEX_MAX; + pmem_wmb(); }
static void cache_info_init_default(struct pcache_cache *cache); @@ -93,10 +93,10 @@ void cache_pos_encode(struct pcache_cach pos_onmedia.header.seq = seq; pos_onmedia.header.crc = cache_pos_onmedia_crc(&pos_onmedia);
+ *index = (*index + 1) % PCACHE_META_INDEX_MAX; + memcpy_flushcache(pos_onmedia_addr, &pos_onmedia, sizeof(struct pcache_cache_pos_onmedia)); pmem_wmb(); - - *index = (*index + 1) % PCACHE_META_INDEX_MAX; }
int cache_pos_decode(struct pcache_cache *cache, --- a/drivers/md/dm-pcache/cache_segment.c +++ b/drivers/md/dm-pcache/cache_segment.c @@ -26,11 +26,11 @@ static void cache_seg_info_write(struct seg_info->header.seq++; seg_info->header.crc = pcache_meta_crc(&seg_info->header, sizeof(struct pcache_segment_info));
+ cache_seg->info_index = (cache_seg->info_index + 1) % PCACHE_META_INDEX_MAX; + seg_info_addr = get_seg_info_addr(cache_seg); memcpy_flushcache(seg_info_addr, seg_info, sizeof(struct pcache_segment_info)); pmem_wmb(); - - cache_seg->info_index = (cache_seg->info_index + 1) % PCACHE_META_INDEX_MAX; mutex_unlock(&cache_seg->info_lock); }
@@ -129,10 +129,10 @@ static void cache_seg_ctrl_write(struct cache_seg_gen.header.crc = pcache_meta_crc(&cache_seg_gen.header, sizeof(struct pcache_cache_seg_gen));
+ cache_seg->gen_index = (cache_seg->gen_index + 1) % PCACHE_META_INDEX_MAX; + memcpy_flushcache(get_cache_seg_gen_addr(cache_seg), &cache_seg_gen, sizeof(struct pcache_cache_seg_gen)); pmem_wmb(); - - cache_seg->gen_index = (cache_seg->gen_index + 1) % PCACHE_META_INDEX_MAX; }
static void cache_seg_ctrl_init(struct pcache_cache_segment *cache_seg)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vivian Wang wangruikang@iscas.ac.cn
commit 43169328c7b4623b54b7713ec68479cebda5465f upstream.
In chacha_zvkb, avoid using the s0 register, which is the frame pointer, by reallocating KEY0 to t5. This makes stack traces available if e.g. a crash happens in chacha_zvkb.
No frame pointer maintenance is otherwise required since this is a leaf function.
Signed-off-by: Vivian Wang wangruikang@iscas.ac.cn Fixes: bb54668837a0 ("crypto: riscv - add vector crypto accelerated ChaCha20") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20251202-riscv-chacha_zvkb-fp-v2-1-7bd00098c9dc@is... Signed-off-by: Eric Biggers ebiggers@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/crypto/riscv/chacha-riscv64-zvkb.S | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/lib/crypto/riscv/chacha-riscv64-zvkb.S +++ b/lib/crypto/riscv/chacha-riscv64-zvkb.S @@ -60,7 +60,8 @@ #define VL t2 #define STRIDE t3 #define ROUND_CTR t4 -#define KEY0 s0 +#define KEY0 t5 +// Avoid s0/fp to allow for unwinding #define KEY1 s1 #define KEY2 s2 #define KEY3 s3 @@ -143,7 +144,6 @@ // The updated 32-bit counter is written back to state->x[12] before returning. SYM_FUNC_START(chacha_zvkb) addi sp, sp, -96 - sd s0, 0(sp) sd s1, 8(sp) sd s2, 16(sp) sd s3, 24(sp) @@ -280,7 +280,6 @@ SYM_FUNC_START(chacha_zvkb) bnez NBLOCKS, .Lblock_loop
sw COUNTER, 48(STATEP) - ld s0, 0(sp) ld s1, 8(sp) ld s2, 16(sp) ld s3, 24(sp)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Dryomov idryomov@gmail.com
commit 8c738512714e8c0aa18f8a10c072d5b01c83db39 upstream.
If the osdmap is (maliciously) corrupted such that the encoded length of ceph_pg_pool envelope is less than what is expected for a particular encoding version, out-of-bounds reads may ensue because the only bounds check that is there is based on that length value.
This patch adds explicit bounds checks for each field that is decoded or skipped.
Cc: stable@vger.kernel.org Reported-by: ziming zhang ezrakiez@gmail.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Xiubo Li xiubli@redhat.com Tested-by: ziming zhang ezrakiez@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ceph/osdmap.c | 118 ++++++++++++++++++++++++------------------------------ 1 file changed, 53 insertions(+), 65 deletions(-)
--- a/net/ceph/osdmap.c +++ b/net/ceph/osdmap.c @@ -806,51 +806,49 @@ static int decode_pool(void **p, void *e ceph_decode_need(p, end, len, bad); pool_end = *p + len;
+ ceph_decode_need(p, end, 4 + 4 + 4, bad); pi->type = ceph_decode_8(p); pi->size = ceph_decode_8(p); pi->crush_ruleset = ceph_decode_8(p); pi->object_hash = ceph_decode_8(p); - pi->pg_num = ceph_decode_32(p); pi->pgp_num = ceph_decode_32(p);
- *p += 4 + 4; /* skip lpg* */ - *p += 4; /* skip last_change */ - *p += 8 + 4; /* skip snap_seq, snap_epoch */ + /* lpg*, last_change, snap_seq, snap_epoch */ + ceph_decode_skip_n(p, end, 8 + 4 + 8 + 4, bad);
/* skip snaps */ - num = ceph_decode_32(p); + ceph_decode_32_safe(p, end, num, bad); while (num--) { - *p += 8; /* snapid key */ - *p += 1 + 1; /* versions */ - len = ceph_decode_32(p); - *p += len; + /* snapid key, pool snap (with versions) */ + ceph_decode_skip_n(p, end, 8 + 2, bad); + ceph_decode_skip_string(p, end, bad); }
- /* skip removed_snaps */ - num = ceph_decode_32(p); - *p += num * (8 + 8); + /* removed_snaps */ + ceph_decode_skip_map(p, end, 64, 64, bad);
+ ceph_decode_need(p, end, 8 + 8 + 4, bad); *p += 8; /* skip auid */ pi->flags = ceph_decode_64(p); *p += 4; /* skip crash_replay_interval */
if (ev >= 7) - pi->min_size = ceph_decode_8(p); + ceph_decode_8_safe(p, end, pi->min_size, bad); else pi->min_size = pi->size - pi->size / 2;
if (ev >= 8) - *p += 8 + 8; /* skip quota_max_* */ + /* quota_max_* */ + ceph_decode_skip_n(p, end, 8 + 8, bad);
if (ev >= 9) { - /* skip tiers */ - num = ceph_decode_32(p); - *p += num * 8; + /* tiers */ + ceph_decode_skip_set(p, end, 64, bad);
+ ceph_decode_need(p, end, 8 + 1 + 8 + 8, bad); *p += 8; /* skip tier_of */ *p += 1; /* skip cache_mode */ - pi->read_tier = ceph_decode_64(p); pi->write_tier = ceph_decode_64(p); } else { @@ -858,86 +856,76 @@ static int decode_pool(void **p, void *e pi->write_tier = -1; }
- if (ev >= 10) { - /* skip properties */ - num = ceph_decode_32(p); - while (num--) { - len = ceph_decode_32(p); - *p += len; /* key */ - len = ceph_decode_32(p); - *p += len; /* val */ - } - } + if (ev >= 10) + /* properties */ + ceph_decode_skip_map(p, end, string, string, bad);
if (ev >= 11) { - /* skip hit_set_params */ - *p += 1 + 1; /* versions */ - len = ceph_decode_32(p); - *p += len; + /* hit_set_params (with versions) */ + ceph_decode_skip_n(p, end, 2, bad); + ceph_decode_skip_string(p, end, bad);
- *p += 4; /* skip hit_set_period */ - *p += 4; /* skip hit_set_count */ + /* hit_set_period, hit_set_count */ + ceph_decode_skip_n(p, end, 4 + 4, bad); }
if (ev >= 12) - *p += 4; /* skip stripe_width */ + /* stripe_width */ + ceph_decode_skip_32(p, end, bad);
- if (ev >= 13) { - *p += 8; /* skip target_max_bytes */ - *p += 8; /* skip target_max_objects */ - *p += 4; /* skip cache_target_dirty_ratio_micro */ - *p += 4; /* skip cache_target_full_ratio_micro */ - *p += 4; /* skip cache_min_flush_age */ - *p += 4; /* skip cache_min_evict_age */ - } - - if (ev >= 14) { - /* skip erasure_code_profile */ - len = ceph_decode_32(p); - *p += len; - } + if (ev >= 13) + /* target_max_*, cache_target_*, cache_min_* */ + ceph_decode_skip_n(p, end, 16 + 8 + 8, bad); + + if (ev >= 14) + /* erasure_code_profile */ + ceph_decode_skip_string(p, end, bad);
/* * last_force_op_resend_preluminous, will be overridden if the * map was encoded with RESEND_ON_SPLIT */ if (ev >= 15) - pi->last_force_request_resend = ceph_decode_32(p); + ceph_decode_32_safe(p, end, pi->last_force_request_resend, bad); else pi->last_force_request_resend = 0;
if (ev >= 16) - *p += 4; /* skip min_read_recency_for_promote */ + /* min_read_recency_for_promote */ + ceph_decode_skip_32(p, end, bad);
if (ev >= 17) - *p += 8; /* skip expected_num_objects */ + /* expected_num_objects */ + ceph_decode_skip_64(p, end, bad);
if (ev >= 19) - *p += 4; /* skip cache_target_dirty_high_ratio_micro */ + /* cache_target_dirty_high_ratio_micro */ + ceph_decode_skip_32(p, end, bad);
if (ev >= 20) - *p += 4; /* skip min_write_recency_for_promote */ + /* min_write_recency_for_promote */ + ceph_decode_skip_32(p, end, bad);
if (ev >= 21) - *p += 1; /* skip use_gmt_hitset */ + /* use_gmt_hitset */ + ceph_decode_skip_8(p, end, bad);
if (ev >= 22) - *p += 1; /* skip fast_read */ + /* fast_read */ + ceph_decode_skip_8(p, end, bad);
- if (ev >= 23) { - *p += 4; /* skip hit_set_grade_decay_rate */ - *p += 4; /* skip hit_set_search_last_n */ - } + if (ev >= 23) + /* hit_set_grade_decay_rate, hit_set_search_last_n */ + ceph_decode_skip_n(p, end, 4 + 4, bad);
if (ev >= 24) { - /* skip opts */ - *p += 1 + 1; /* versions */ - len = ceph_decode_32(p); - *p += len; + /* opts (with versions) */ + ceph_decode_skip_n(p, end, 2, bad); + ceph_decode_skip_string(p, end, bad); }
if (ev >= 25) - pi->last_force_request_resend = ceph_decode_32(p); + ceph_decode_32_safe(p, end, pi->last_force_request_resend, bad);
/* ignore the rest */
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Finn Thain fthain@linux-m68k.org
commit b94b73567561642323617155bf4ee24ef0d258fe upstream.
Since Linux v6.7, booting using BootX on an Old World PowerMac produces an early crash. Stan Johnson writes, "the symptoms are that the screen goes blank and the backlight stays on, and the system freezes (Linux doesn't boot)."
Further testing revealed that the failure can be avoided by disabling CONFIG_BOOTX_TEXT. Bisection revealed that the regression was caused by a change to the font bitmap pointer that's used when btext_init() begins painting characters on the display, early in the boot process.
Christophe Leroy explains, "before kernel text is relocated to its final location ... data is addressed with an offset which is added to the Global Offset Table (GOT) entries at the start of bootx_init() by function reloc_got2(). But the pointers that are located inside a structure are not referenced in the GOT and are therefore not updated by reloc_got2(). It is therefore needed to apply the offset manually by using PTRRELOC() macro."
Cc: stable@vger.kernel.org Link: https://lists.debian.org/debian-powerpc/2025/10/msg00111.html Link: https://lore.kernel.org/linuxppc-dev/d81ddca8-c5ee-d583-d579-02b19ed95301@ya... Reported-by: Cedar Maxwell cedarmaxwell@mac.com Closes: https://lists.debian.org/debian-powerpc/2025/09/msg00031.html Bisected-by: Stan Johnson userm57@yahoo.com Tested-by: Stan Johnson userm57@yahoo.com Fixes: 0ebc7feae79a ("powerpc: Use shared font data") Suggested-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Finn Thain fthain@linux-m68k.org Reviewed-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Madhavan Srinivasan maddy@linux.ibm.com Link: https://patch.msgid.link/22b3b247425a052b079ab84da926706b3702c2c7.1762731022... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kernel/btext.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/powerpc/kernel/btext.c +++ b/arch/powerpc/kernel/btext.c @@ -20,6 +20,7 @@ #include <asm/io.h> #include <asm/processor.h> #include <asm/udbg.h> +#include <asm/setup.h>
#define NO_SCROLL
@@ -463,7 +464,7 @@ static noinline void draw_byte(unsigned { unsigned char *base = calc_base(locX << 3, locY << 4); unsigned int font_index = c * 16; - const unsigned char *font = font_sun_8x16.data + font_index; + const unsigned char *font = PTRRELOC(font_sun_8x16.data) + font_index; int rb = dispDeviceRowBytes;
rmci_maybe_on();
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 0ea9494be9c931ddbc084ad5e11fda91b554cf47 upstream.
WARN and don't restart the hrtimer if KVM's callback runs with the guest's APIC timer in periodic mode but with a period of '0', as not advancing the hrtimer's deadline would put the CPU into an infinite loop of hrtimer events. Observing a period of '0' should be impossible, even when the hrtimer is running on a different CPU than the vCPU, as KVM is supposed to cancel the hrtimer before changing (or zeroing) the period, e.g. when switching from periodic to one-shot.
Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251113205114.1647493-2-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/lapic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2970,7 +2970,7 @@ static enum hrtimer_restart apic_timer_f
apic_timer_expired(apic, true);
- if (lapic_is_periodic(apic)) { + if (lapic_is_periodic(apic) && !WARN_ON_ONCE(!apic->lapic_timer.period)) { advance_periodic_target_expiration(apic); hrtimer_add_expires_ns(&ktimer->timer, ktimer->period); return HRTIMER_RESTART;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: fuqiang wang fuqiang.wng@gmail.com
commit 9633f180ce994ab293ce4924a9b7aaf4673aa114 upstream.
When restarting an hrtimer to emulate a the guest's APIC timer in periodic mode, explicitly set the expiration using the target expiration computed by advance_periodic_target_expiration() instead of adding the period to the existing timer. This will allow making adjustments to the expiration, e.g. to deal with expirations far in the past, without having to implement the same logic in both advance_periodic_target_expiration() and apic_timer_fn().
Cc: stable@vger.kernel.org Signed-off-by: fuqiang wang fuqiang.wng@gmail.com [sean: split to separate patch, write changelog] Link: https://patch.msgid.link/20251113205114.1647493-3-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/lapic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2972,7 +2972,7 @@ static enum hrtimer_restart apic_timer_f
if (lapic_is_periodic(apic) && !WARN_ON_ONCE(!apic->lapic_timer.period)) { advance_periodic_target_expiration(apic); - hrtimer_add_expires_ns(&ktimer->timer, ktimer->period); + hrtimer_set_expires(&ktimer->timer, ktimer->target_expiration); return HRTIMER_RESTART; } else return HRTIMER_NORESTART;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: fuqiang wang fuqiang.wng@gmail.com
commit 18ab3fc8e880791aa9f7c000261320fc812b5465 upstream.
When advancing the target expiration for the guest's APIC timer in periodic mode, set the expiration to "now" if the target expiration is in the past (similar to what is done in update_target_expiration()). Blindly adding the period to the previous target expiration can result in KVM generating a practically unbounded number of hrtimer IRQs due to programming an expired timer over and over. In extreme scenarios, e.g. if userspace pauses/suspends a VM for an extended duration, this can even cause hard lockups in the host.
Currently, the bug only affects Intel CPUs when using the hypervisor timer (HV timer), a.k.a. the VMX preemption timer. Unlike the software timer, a.k.a. hrtimer, which KVM keeps running even on exits to userspace, the HV timer only runs while the guest is active. As a result, if the vCPU does not run for an extended duration, there will be a huge gap between the target expiration and the current time the vCPU resumes running. Because the target expiration is incremented by only one period on each timer expiration, this leads to a series of timer expirations occurring rapidly after the vCPU/VM resumes.
More critically, when the vCPU first triggers a periodic HV timer expiration after resuming, advancing the expiration by only one period will result in a target expiration in the past. As a result, the delta may be calculated as a negative value. When the delta is converted into an absolute value (tscdeadline is an unsigned u64), the resulting value can overflow what the HV timer is capable of programming. I.e. the large value will exceed the VMX Preemption Timer's maximum bit width of cpu_preemption_timer_multi + 32, and thus cause KVM to switch from the HV timer to the software timer (hrtimers).
After switching to the software timer, periodic timer expiration callbacks may be executed consecutively within a single clock interrupt handler, because hrtimers honors KVM's request for an expiration in the past and immediately re-invokes KVM's callback after reprogramming. And because the interrupt handler runs with IRQs disabled, restarting KVM's hrtimer over and over until the target expiration is advanced to "now" can result in a hard lockup.
E.g. the following hard lockup was triggered in the host when running a Windows VM (only relevant because it used the APIC timer in periodic mode) after resuming the VM from a long suspend (in the host).
NMI watchdog: Watchdog detected hard LOCKUP on cpu 45 ... RIP: 0010:advance_periodic_target_expiration+0x4d/0x80 [kvm] ... RSP: 0018:ff4f88f5d98d8ef0 EFLAGS: 00000046 RAX: fff0103f91be678e RBX: fff0103f91be678e RCX: 00843a7d9e127bcc RDX: 0000000000000002 RSI: 0052ca4003697505 RDI: ff440d5bfbdbd500 RBP: ff440d5956f99200 R08: ff2ff2a42deb6a84 R09: 000000000002a6c0 R10: 0122d794016332b3 R11: 0000000000000000 R12: ff440db1af39cfc0 R13: ff440db1af39cfc0 R14: ffffffffc0d4a560 R15: ff440db1af39d0f8 FS: 00007f04a6ffd700(0000) GS:ff440db1af380000(0000) knlGS:000000e38a3b8000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000d5651feff8 CR3: 000000684e038002 CR4: 0000000000773ee0 PKRU: 55555554 Call Trace: <IRQ> apic_timer_fn+0x31/0x50 [kvm] __hrtimer_run_queues+0x100/0x280 hrtimer_interrupt+0x100/0x210 ? ttwu_do_wakeup+0x19/0x160 smp_apic_timer_interrupt+0x6a/0x130 apic_timer_interrupt+0xf/0x20 </IRQ>
Moreover, if the suspend duration of the virtual machine is not long enough to trigger a hard lockup in this scenario, since commit 98c25ead5eda ("KVM: VMX: Move preemption timer <=> hrtimer dance to common x86"), KVM will continue using the software timer until the guest reprograms the APIC timer in some way. Since the periodic timer does not require frequent APIC timer register programming, the guest may continue to use the software timer in perpetuity.
Fixes: d8f2f498d9ed ("x86/kvm: fix LAPIC timer drift when guest uses periodic mode") Cc: stable@vger.kernel.org Signed-off-by: fuqiang wang fuqiang.wng@gmail.com [sean: massage comments and changelog] Link: https://patch.msgid.link/20251113205114.1647493-4-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/lapic.c | 28 +++++++++++++++++++++++----- 1 file changed, 23 insertions(+), 5 deletions(-)
--- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2131,15 +2131,33 @@ static void advance_periodic_target_expi ktime_t delta;
/* - * Synchronize both deadlines to the same time source or - * differences in the periods (caused by differences in the - * underlying clocks or numerical approximation errors) will - * cause the two to drift apart over time as the errors - * accumulate. + * Use kernel time as the time source for both the hrtimer deadline and + * TSC-based deadline so that they stay synchronized. Computing each + * deadline independently will cause the two deadlines to drift apart + * over time as differences in the periods accumulate, e.g. due to + * differences in the underlying clocks or numerical approximation errors. */ apic->lapic_timer.target_expiration = ktime_add_ns(apic->lapic_timer.target_expiration, apic->lapic_timer.period); + + /* + * If the new expiration is in the past, e.g. because userspace stopped + * running the VM for an extended duration, then force the expiration + * to "now" and don't try to play catch-up with the missed events. KVM + * will only deliver a single interrupt regardless of how many events + * are pending, i.e. restarting the timer with an expiration in the + * past will do nothing more than waste host cycles, and can even lead + * to a hard lockup in extreme cases. + */ + if (ktime_before(apic->lapic_timer.target_expiration, now)) + apic->lapic_timer.target_expiration = now; + + /* + * Note, ensuring the expiration isn't in the past also prevents delta + * from going negative, which could cause the TSC deadline to become + * excessively large due to it an unsigned value. + */ delta = ktime_sub(apic->lapic_timer.target_expiration, now); apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) + nsec_to_cycles(apic->vcpu, delta);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yosry Ahmed yosry.ahmed@linux.dev
commit 3d80f4c93d3d26d0f9a0dd2844961a632eeea634 upstream.
When emulating L2 instructions, svm_check_intercept() checks whether a write to CR0 should trigger a synthesized #VMEXIT with SVM_EXIT_CR0_SEL_WRITE. However, it does not check whether L1 enabled the intercept for SVM_EXIT_WRITE_CR0, which has higher priority according to the APM (24593—Rev. 3.42—March 2024, Table 15-7):
When both selective and non-selective CR0-write intercepts are active at the same time, the non-selective intercept takes priority. With respect to exceptions, the priority of this intercept is the same as the generic CR0-write intercept.
Make sure L1 does NOT intercept SVM_EXIT_WRITE_CR0 before checking if SVM_EXIT_CR0_SEL_WRITE needs to be injected.
Opportunistically tweak the "not CR0" logic to explicitly bail early so that it's more obvious that only CR0 has a selective intercept, and that modifying icpt_info.exit_code is functionally necessary so that the call to nested_svm_exit_handled() checks the correct exit code.
Fixes: cfec82cb7d31 ("KVM: SVM: Add intercept check for emulated cr accesses") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed yosry.ahmed@linux.dev Link: https://patch.msgid.link/20251024192918.3191141-4-yosry.ahmed@linux.dev [sean: isolate non-CR0 write logic, tweak comments accordingly] Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/svm.c | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-)
--- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4528,15 +4528,29 @@ static int svm_check_intercept(struct kv case SVM_EXIT_WRITE_CR0: { unsigned long cr0, val;
- if (info->intercept == x86_intercept_cr_write) + /* + * Adjust the exit code accordingly if a CR other than CR0 is + * being written, and skip straight to the common handling as + * only CR0 has an additional selective intercept. + */ + if (info->intercept == x86_intercept_cr_write && info->modrm_reg) { icpt_info.exit_code += info->modrm_reg; + break; + }
- if (icpt_info.exit_code != SVM_EXIT_WRITE_CR0 || - info->intercept == x86_intercept_clts) + /* + * Convert the exit_code to SVM_EXIT_CR0_SEL_WRITE if a + * selective CR0 intercept is triggered (the common logic will + * treat the selective intercept as being enabled). Note, the + * unconditional intercept has higher priority, i.e. this is + * only relevant if *only* the selective intercept is enabled. + */ + if (vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_CR0_WRITE) || + !(vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_SELECTIVE_CR0))) break;
- if (!(vmcb12_is_intercept(&svm->nested.ctl, - INTERCEPT_SELECTIVE_CR0))) + /* CLTS never triggers INTERCEPT_SELECTIVE_CR0 */ + if (info->intercept == x86_intercept_clts) break;
cr0 = vcpu->arch.cr0 & ~SVM_CR0_SELECTIVE_MASK;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jim Mattson jmattson@google.com
commit 7c8b465a1c91f674655ea9cec5083744ec5f796a upstream.
Mark the VMCB_NPT bit as dirty in nested_vmcb02_prepare_save() on every nested VMRUN.
If L1 changes the PAT MSR between two VMRUN instructions on the same L1 vCPU, the g_pat field in the associated vmcb02 will change, and the VMCB_NPT clean bit should be cleared.
Fixes: 4bb170a5430b ("KVM: nSVM: do not mark all VMCB02 fields dirty on nested vmexit") Cc: stable@vger.kernel.org Signed-off-by: Jim Mattson jmattson@google.com Link: https://lore.kernel.org/r/20250922162935.621409-3-jmattson@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/nested.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -613,6 +613,7 @@ static void nested_vmcb02_prepare_save(s struct kvm_vcpu *vcpu = &svm->vcpu;
nested_vmcb02_compute_g_pat(svm); + vmcb_mark_dirty(vmcb02, VMCB_NPT);
/* Load the nested guest state */ if (svm->nested.vmcb12_gpa != svm->nested.last_vmcb12_gpa) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 17e5a9b77716564540d81f0c1e6082d28cf305c9 upstream.
Forcefully override ARCH from x86_64 to x86 to handle the scenario where the user specifies ARCH=x86_64 on the command line.
Fixes: 9af04539d474 ("KVM: selftests: Override ARCH for x86_64 instead of using ARCH_DIR") Cc: stable@vger.kernel.org Reported-by: David Matlack dmatlack@google.com Closes: https://lore.kernel.org/all/20250724213130.3374922-1-dmatlack@google.com Link: https://lore.kernel.org/r/20251007223057.368082-1-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/kvm/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -6,7 +6,7 @@ ARCH ?= $(SUBARCH) ifeq ($(ARCH),$(filter $(ARCH),arm64 s390 riscv x86 x86_64 loongarch)) # Top-level selftests allows ARCH=x86_64 :-( ifeq ($(ARCH),x86_64) - ARCH := x86 + override ARCH := x86 endif include Makefile.kvm else
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yosry Ahmed yosry.ahmed@linux.dev
commit 5674a76db0213f9db1e4d08e847ff649b46889c0 upstream.
When emulating L2 instructions, svm_check_intercept() checks whether a write to CR0 should trigger a synthesized #VMEXIT with SVM_EXIT_CR0_SEL_WRITE. For MOV-to-CR0, SVM_EXIT_CR0_SEL_WRITE is only triggered if any bit other than CR0.MP and CR0.TS is updated. However, according to the APM (24593—Rev. 3.42—March 2024, Table 15-7):
The LMSW instruction treats the selective CR0-write intercept as a non-selective intercept (i.e., it intercepts regardless of the value being written).
Skip checking the changed bits for x86_intercept_lmsw and always inject SVM_EXIT_CR0_SEL_WRITE.
Fixes: cfec82cb7d31 ("KVM: SVM: Add intercept check for emulated cr accesses") Cc: stable@vger.kernel.org Reported-by: Matteo Rizzo matteorizzo@google.com Signed-off-by: Yosry Ahmed yosry.ahmed@linux.dev Link: https://patch.msgid.link/20251024192918.3191141-3-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/svm.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
--- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4553,20 +4553,20 @@ static int svm_check_intercept(struct kv if (info->intercept == x86_intercept_clts) break;
- cr0 = vcpu->arch.cr0 & ~SVM_CR0_SELECTIVE_MASK; - val = info->src_val & ~SVM_CR0_SELECTIVE_MASK; - + /* LMSW always triggers INTERCEPT_SELECTIVE_CR0 */ if (info->intercept == x86_intercept_lmsw) { - cr0 &= 0xfUL; - val &= 0xfUL; - /* lmsw can't clear PE - catch this here */ - if (cr0 & X86_CR0_PE) - val |= X86_CR0_PE; + icpt_info.exit_code = SVM_EXIT_CR0_SEL_WRITE; + break; }
+ /* + * MOV-to-CR0 only triggers INTERCEPT_SELECTIVE_CR0 if any bit + * other than SVM_CR0_SELECTIVE_MASK is changed. + */ + cr0 = vcpu->arch.cr0 & ~SVM_CR0_SELECTIVE_MASK; + val = info->src_val & ~SVM_CR0_SELECTIVE_MASK; if (cr0 ^ val) icpt_info.exit_code = SVM_EXIT_CR0_SEL_WRITE; - break; } case SVM_EXIT_READ_DR0:
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wanpeng Li wanpengli@tencent.com
commit 32bd348be3fa07b26c5ea6b818a161c142dcc2f2 upstream.
In kvm_vcpu_on_spin(), the loop counter 'i' is incorrectly written to last_boosted_vcpu instead of the actual vCPU index 'idx'. This causes last_boosted_vcpu to store the loop iteration count rather than the vCPU index, leading to incorrect round-robin behavior in subsequent directed yield operations.
Fix this by using 'idx' instead of 'i' in the assignment.
Signed-off-by: Wanpeng Li wanpengli@tencent.com Reviewed-by: Sean Christopherson seanjc@google.com Message-ID: 20251110033232.12538-7-kernellwp@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- virt/kvm/kvm_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4026,7 +4026,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *m
yielded = kvm_vcpu_yield_to(vcpu); if (yielded > 0) { - WRITE_ONCE(kvm->last_boosted_vcpu, i); + WRITE_ONCE(kvm->last_boosted_vcpu, idx); break; } else if (yielded < 0 && !--try) { break;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jim Mattson jmattson@google.com
commit 93c9e107386dbe1243287a5b14ceca894de372b9 upstream.
Mark the VMCB_PERM_MAP bit as dirty in nested_vmcb02_prepare_control() on every nested VMRUN.
If L1 changes MSR interception (INTERCEPT_MSR_PROT) between two VMRUN instructions on the same L1 vCPU, the msrpm_base_pa in the associated vmcb02 will change, and the VMCB_PERM_MAP clean bit should be cleared.
Fixes: 4bb170a5430b ("KVM: nSVM: do not mark all VMCB02 fields dirty on nested vmexit") Reported-by: Matteo Rizzo matteorizzo@google.com Cc: stable@vger.kernel.org Signed-off-by: Jim Mattson jmattson@google.com Link: https://lore.kernel.org/r/20250922162935.621409-2-jmattson@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/nested.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -752,6 +752,7 @@ static void nested_vmcb02_prepare_contro vmcb02->control.nested_ctl = vmcb01->control.nested_ctl; vmcb02->control.iopm_base_pa = vmcb01->control.iopm_base_pa; vmcb02->control.msrpm_base_pa = vmcb01->control.msrpm_base_pa; + vmcb_mark_dirty(vmcb02, VMCB_PERM_MAP);
/* * Stash vmcb02's counter if the guest hasn't moved past the guilty
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit c0711f8c610e1634ed54fb04da1e82252730306a upstream.
Set all user-return MSRs to their post-TD-exit value when preparing to run a TDX vCPU to ensure the value that KVM expects to be loaded after running the vCPU is indeed the value that's loaded in hardware. If the TDX-Module doesn't actually enter the guest, i.e. doesn't do VM-Enter, then it won't "restore" VMM state, i.e. won't clobber user-return MSRs to their expected post-run values, in which case simply updating KVM's "cached" value will effectively corrupt the cache due to hardware still holding the original value.
In theory, KVM could conditionally update the current user-return value if and only if tdh_vp_enter() succeeds, but in practice "success" doesn't guarantee the TDX-Module actually entered the guest, e.g. if the TDX-Module synthesizes an EPT Violation because it suspects a zero-step attack.
Force-load the expected values instead of trying to decipher whether or not the TDX-Module restored/clobbered MSRs, as the risk doesn't justify the benefits. Effectively avoiding four WRMSRs once per run loop (even if the vCPU is scheduled out, user-return MSRs only need to be reloaded if the CPU exits to userspace or runs a non-TDX vCPU) is likely in the noise when amortized over all entries, given the cost of running a TDX vCPU. E.g. the cost of the WRMSRs is somewhere between ~300 and ~500 cycles, whereas the cost of a _single_ roundtrip to/from a TDX guest is thousands of cycles.
Fixes: e0b4f31a3c65 ("KVM: TDX: restore user ret MSRs") Cc: stable@vger.kernel.org Cc: Yan Zhao yan.y.zhao@intel.com Cc: Xiaoyao Li xiaoyao.li@intel.com Cc: Rick Edgecombe rick.p.edgecombe@intel.com Reviewed-by: Xiaoyao Li xiaoyao.li@intel.com Link: https://patch.msgid.link/20251030191528.3380553-2-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/kvm_host.h | 1 arch/x86/kvm/vmx/tdx.c | 56 ++++++++++++++++------------------------ arch/x86/kvm/vmx/tdx.h | 1 arch/x86/kvm/x86.c | 9 ------ 4 files changed, 23 insertions(+), 44 deletions(-)
--- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2387,7 +2387,6 @@ int kvm_pv_send_ipi(struct kvm *kvm, uns int kvm_add_user_return_msr(u32 msr); int kvm_find_user_return_msr(u32 msr); int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask); -void kvm_user_return_msr_update_cache(unsigned int index, u64 val); u64 kvm_get_user_return_msr(unsigned int slot);
static inline bool kvm_is_supported_user_return_msr(u32 msr) --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -763,25 +763,6 @@ static bool tdx_protected_apic_has_inter return tdx_vcpu_state_details_intr_pending(vcpu_state_details); }
-/* - * Compared to vmx_prepare_switch_to_guest(), there is not much to do - * as SEAMCALL/SEAMRET calls take care of most of save and restore. - */ -void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) -{ - struct vcpu_vt *vt = to_vt(vcpu); - - if (vt->guest_state_loaded) - return; - - if (likely(is_64bit_mm(current->mm))) - vt->msr_host_kernel_gs_base = current->thread.gsbase; - else - vt->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE); - - vt->guest_state_loaded = true; -} - struct tdx_uret_msr { u32 msr; unsigned int slot; @@ -795,19 +776,38 @@ static struct tdx_uret_msr tdx_uret_msrs {.msr = MSR_TSC_AUX,}, };
-static void tdx_user_return_msr_update_cache(void) +void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) { + struct vcpu_vt *vt = to_vt(vcpu); int i;
+ if (vt->guest_state_loaded) + return; + + if (likely(is_64bit_mm(current->mm))) + vt->msr_host_kernel_gs_base = current->thread.gsbase; + else + vt->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE); + + vt->guest_state_loaded = true; + + /* + * Explicitly set user-return MSRs that are clobbered by the TDX-Module + * if VP.ENTER succeeds, i.e. on TD-Exit, with the values that would be + * written by the TDX-Module. Don't rely on the TDX-Module to actually + * clobber the MSRs, as the contract is poorly defined and not upheld. + * E.g. the TDX-Module will synthesize an EPT Violation without doing + * VM-Enter if it suspects a zero-step attack, and never "restore" VMM + * state. + */ for (i = 0; i < ARRAY_SIZE(tdx_uret_msrs); i++) - kvm_user_return_msr_update_cache(tdx_uret_msrs[i].slot, - tdx_uret_msrs[i].defval); + kvm_set_user_return_msr(tdx_uret_msrs[i].slot, + tdx_uret_msrs[i].defval, -1ull); }
static void tdx_prepare_switch_to_host(struct kvm_vcpu *vcpu) { struct vcpu_vt *vt = to_vt(vcpu); - struct vcpu_tdx *tdx = to_tdx(vcpu);
if (!vt->guest_state_loaded) return; @@ -815,11 +815,6 @@ static void tdx_prepare_switch_to_host(s ++vcpu->stat.host_state_reload; wrmsrl(MSR_KERNEL_GS_BASE, vt->msr_host_kernel_gs_base);
- if (tdx->guest_entered) { - tdx_user_return_msr_update_cache(); - tdx->guest_entered = false; - } - vt->guest_state_loaded = false; }
@@ -1059,7 +1054,6 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu update_debugctlmsr(vcpu->arch.host_debugctl);
tdx_load_host_xsave_state(vcpu); - tdx->guest_entered = true;
vcpu->arch.regs_avail &= TDX_REGS_AVAIL_SET;
@@ -3447,10 +3441,6 @@ static int __init __tdx_bringup(void) /* * Check if MSRs (tdx_uret_msrs) can be saved/restored * before returning to user space. - * - * this_cpu_ptr(user_return_msrs)->registered isn't checked - * because the registration is done at vcpu runtime by - * tdx_user_return_msr_update_cache(). */ tdx_uret_msrs[i].slot = kvm_find_user_return_msr(tdx_uret_msrs[i].msr); if (tdx_uret_msrs[i].slot == -1) { --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -67,7 +67,6 @@ struct vcpu_tdx { u64 vp_enter_ret;
enum vcpu_tdx_state state; - bool guest_entered;
u64 map_gpa_next; u64 map_gpa_end; --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -681,15 +681,6 @@ int kvm_set_user_return_msr(unsigned slo } EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_set_user_return_msr);
-void kvm_user_return_msr_update_cache(unsigned int slot, u64 value) -{ - struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs); - - msrs->values[slot].curr = value; - kvm_user_return_register_notifier(msrs); -} -EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_user_return_msr_update_cache); - u64 kvm_get_user_return_msr(unsigned int slot) { return this_cpu_ptr(user_return_msrs)->values[slot].curr;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dongli Zhang dongli.zhang@oracle.com
commit 29763138830916f46daaa50e83e7f4f907a3236b upstream.
If an APICv status updated was pended while L2 was active, immediately refresh vmcs01's controls instead of pending KVM_REQ_APICV_UPDATE as kvm_vcpu_update_apicv() only calls into vendor code if a change is necessary.
E.g. if APICv is inhibited, and then activated while L2 is running:
kvm_vcpu_update_apicv() | -> __kvm_vcpu_update_apicv() | -> apic->apicv_active = true | -> vmx_refresh_apicv_exec_ctrl() | -> vmx->nested.update_vmcs01_apicv_status = true | -> return
Then L2 exits to L1:
__nested_vmx_vmexit() | -> kvm_make_request(KVM_REQ_APICV_UPDATE)
vcpu_enter_guest(): KVM_REQ_APICV_UPDATE -> kvm_vcpu_update_apicv() | -> __kvm_vcpu_update_apicv() | -> return // because if (apic->apicv_active == activate)
Reported-by: Chao Gao chao.gao@intel.com Closes: https://lore.kernel.org/all/aQ2jmnN8wUYVEawF@intel.com Fixes: 7c69661e225c ("KVM: nVMX: Defer APICv updates while L2 is active until L1 is active") Cc: stable@vger.kernel.org Signed-off-by: Dongli Zhang dongli.zhang@oracle.com [sean: write changelog] Link: https://patch.msgid.link/20251205231913.441872-3-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/nested.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -19,6 +19,7 @@ #include "trace.h" #include "vmx.h" #include "smm.h" +#include "x86_ops.h"
static bool __read_mostly enable_shadow_vmcs = 1; module_param_named(enable_shadow_vmcs, enable_shadow_vmcs, bool, S_IRUGO); @@ -5216,7 +5217,7 @@ void __nested_vmx_vmexit(struct kvm_vcpu
if (vmx->nested.update_vmcs01_apicv_status) { vmx->nested.update_vmcs01_apicv_status = false; - kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu); + vmx_refresh_apicv_exec_ctrl(vcpu); }
if (vmx->nested.update_vmcs01_hwapic_isr) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit f402ecd7a8b6446547076f4bd24bd5d4dcc94481 upstream.
Set exit_code_hi to -1u as a temporary band-aid to fix a long-standing (effectively since KVM's inception) bug where KVM treats the exit code as a 32-bit value, when in reality it's a 64-bit value. Per the APM, offset 0x70 is a single 64-bit value:
070h 63:0 EXITCODE
And a sane reading of the error values defined in "Table C-1. SVM Intercept Codes" is that negative values use the full 64 bits:
–1 VMEXIT_INVALID Invalid guest state in VMCB. –2 VMEXIT_BUSYBUSY bit was set in the VMSA –3 VMEXIT_IDLE_REQUIREDThe sibling thread is not in an idle state -4 VMEXIT_INVALID_PMC Invalid PMC state
And that interpretation is confirmed by testing on Milan and Turin (by setting bits in CR0[63:32] to generate VMEXIT_INVALID on VMRUN).
Furthermore, Xen has treated exitcode as a 64-bit value since HVM support was adding in 2006 (see Xen commit d1bd157fbc ("Big merge the HVM full-virtualisation abstractions.")).
Cc: Jim Mattson jmattson@google.com Cc: Yosry Ahmed yosry.ahmed@linux.dev Cc: stable@vger.kernel.org Reviewed-by: Yosry Ahmed yosry.ahmed@linux.dev Link: https://patch.msgid.link/20251113225621.1688428-3-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/nested.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -985,7 +985,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vc if (!nested_vmcb_check_save(vcpu) || !nested_vmcb_check_controls(vcpu)) { vmcb12->control.exit_code = SVM_EXIT_ERR; - vmcb12->control.exit_code_hi = 0; + vmcb12->control.exit_code_hi = -1u; vmcb12->control.exit_info_1 = 0; vmcb12->control.exit_info_2 = 0; goto out; @@ -1018,7 +1018,7 @@ out_exit_err: svm->soft_int_injected = false;
svm->vmcb->control.exit_code = SVM_EXIT_ERR; - svm->vmcb->control.exit_code_hi = 0; + svm->vmcb->control.exit_code_hi = -1u; svm->vmcb->control.exit_info_1 = 0; svm->vmcb->control.exit_info_2 = 0;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit da01f64e7470988f8607776aa7afa924208863fb upstream.
Explicitly clear exit_code_hi in the VMCB when synthesizing "normal" nested VM-Exits, as the full exit code is a 64-bit value (spoiler alert), and all exit codes for non-failing VMRUN use only bits 31:0.
Cc: Jim Mattson jmattson@google.com Cc: Yosry Ahmed yosry.ahmed@linux.dev Cc: stable@vger.kernel.org Reviewed-by: Yosry Ahmed yosry.ahmed@linux.dev Link: https://patch.msgid.link/20251113225621.1688428-2-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/svm.c | 2 ++ arch/x86/kvm/svm/svm.h | 7 ++++--- 2 files changed, 6 insertions(+), 3 deletions(-)
--- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -2438,6 +2438,7 @@ static bool check_selective_cr0_intercep
if (cr0 ^ val) { svm->vmcb->control.exit_code = SVM_EXIT_CR0_SEL_WRITE; + svm->vmcb->control.exit_code_hi = 0; ret = (nested_svm_exit_handled(svm) == NESTED_EXIT_DONE); }
@@ -4627,6 +4628,7 @@ static int svm_check_intercept(struct kv if (static_cpu_has(X86_FEATURE_NRIPS)) vmcb->control.next_rip = info->next_rip; vmcb->control.exit_code = icpt_info.exit_code; + vmcb->control.exit_code_hi = 0; vmexit = nested_svm_exit_handled(svm);
ret = (vmexit == NESTED_EXIT_DONE) ? X86EMUL_INTERCEPTED --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -764,9 +764,10 @@ int nested_svm_vmexit(struct vcpu_svm *s
static inline int nested_svm_simple_vmexit(struct vcpu_svm *svm, u32 exit_code) { - svm->vmcb->control.exit_code = exit_code; - svm->vmcb->control.exit_info_1 = 0; - svm->vmcb->control.exit_info_2 = 0; + svm->vmcb->control.exit_code = exit_code; + svm->vmcb->control.exit_code_hi = 0; + svm->vmcb->control.exit_info_1 = 0; + svm->vmcb->control.exit_info_2 = 0; return nested_svm_vmexit(svm); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit e2b43fb25243d502ad36b07bab9de09f4b76fff9 upstream.
When handling KVM_SET_CPUID{,2}, do runtime CPUID updates on the vCPU's current CPUID (and caps) prior to swapping in the incoming CPUID state so that KVM doesn't lose pending updates if the incoming CPUID is rejected, and to prevent a false failure on the equality check.
Note, runtime updates are unconditionally performed on the incoming/new CPUID (and associated caps), i.e. clearing the dirty flag won't negatively affect the new CPUID.
Fixes: 93da6af3ae56 ("KVM: x86: Defer runtime updates of dynamic CPUID bits until CPUID emulation") Reported-by: Igor Mammedov imammedo@redhat.com Closes: https://lore.kernel.org/all/20251128123202.68424a95@imammedo Cc: stable@vger.kernel.org Acked-by: Igor Mammedov imammedo@redhat.com Tested-by: Igor Mammedov imammedo@redhat.com Link: https://patch.msgid.link/20251202015049.1167490-2-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/cpuid.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -510,10 +510,17 @@ static int kvm_set_cpuid(struct kvm_vcpu int r;
/* + * Apply pending runtime CPUID updates to the current CPUID entries to + * avoid false positives due to mismatches on KVM-owned feature flags. + */ + if (vcpu->arch.cpuid_dynamic_bits_dirty) + kvm_update_cpuid_runtime(vcpu); + + /* * Swap the existing (old) entries with the incoming (new) entries in * order to massage the new entries, e.g. to account for dynamic bits - * that KVM controls, without clobbering the current guest CPUID, which - * KVM needs to preserve in order to unwind on failure. + * that KVM controls, without losing the current guest CPUID, which KVM + * needs to preserve in order to unwind on failure. * * Similarly, save the vCPU's current cpu_caps so that the capabilities * can be updated alongside the CPUID entries when performing runtime
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gavin Shan gshan@redhat.com
commit 1b9439c933b500cb24710bbd81fe56e9b0025b6f upstream.
In commit 0297cdc12a87 ("KVM: selftests: Add option to rseq test to override /dev/cpu_dma_latency"), a 'break' is missed before the option 'l' in the argument parsing loop, which leads to an unexpected core dump in atoi_paranoid(). It tries to get the latency from non-existent argument.
host$ ./rseq_test -u Random seed: 0x6b8b4567 Segmentation fault (core dumped)
Add a 'break' before the option 'l' in the argument parsing loop to avoid the unexpected core dump.
Fixes: 0297cdc12a87 ("KVM: selftests: Add option to rseq test to override /dev/cpu_dma_latency") Cc: stable@vger.kernel.org # v6.15+ Signed-off-by: Gavin Shan gshan@redhat.com Link: https://patch.msgid.link/20251124050427.1924591-1-gshan@redhat.com [sean: describe code change in shortlog] Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/kvm/rseq_test.c | 1 + 1 file changed, 1 insertion(+)
--- a/tools/testing/selftests/kvm/rseq_test.c +++ b/tools/testing/selftests/kvm/rseq_test.c @@ -215,6 +215,7 @@ int main(int argc, char *argv[]) switch (opt) { case 'u': skip_sanity_check = true; + break; case 'l': latency = atoi_paranoid(optarg); break;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haoxiang Li lihaoxiang@isrc.iscas.ac.cn
commit fc40459de82543b565ebc839dca8f7987f16f62e upstream.
xfs_buf_item_get_format() may allocate memory for bip->bli_formats, free the memory in the error path.
Fixes: c3d5f0c2fb85 ("xfs: complain if anyone tries to create a too-large buffer log item") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li lihaoxiang@isrc.iscas.ac.cn Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Carlos Maiolino cmaiolino@redhat.com Signed-off-by: Carlos Maiolino cem@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_buf_item.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -896,6 +896,7 @@ xfs_buf_item_init( map_size = DIV_ROUND_UP(chunks, NBWORD);
if (map_size > XFS_BLF_DATAMAP_SIZE) { + xfs_buf_item_free_format(bip); kmem_cache_free(xfs_buf_item_cache, bip); xfs_err(mp, "buffer item dirty bitmap (%u uints) too small to reflect %u bytes!",
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit f06725052098d7b1133ac3846d693c383dc427a2 upstream.
gcc 14.2 warns about:
xfs_attr_item.c: In function ‘xfs_attr_recover_work’: xfs_attr_item.c:785:9: warning: ‘ip’ may be used uninitialized [-Wmaybe-uninitialized] 785 | xfs_trans_ijoin(tp, ip, 0); | ^~~~~~~~~~~~~~~~~~~~~~~~~~ xfs_attr_item.c:740:42: note: ‘ip’ was declared here 740 | struct xfs_inode *ip; | ^~
I think this is bogus since xfs_attri_recover_work either returns a real pointer having initialized ip or an ERR_PTR having not touched it, but the tools are smarter than me so let's just null-init the variable anyway.
Cc: stable@vger.kernel.org # v6.8 Fixes: e70fb328d52772 ("xfs: recreate work items when recovering intent items") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Carlos Maiolino cmaiolino@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Carlos Maiolino cem@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_attr_item.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/xfs/xfs_attr_item.c +++ b/fs/xfs/xfs_attr_item.c @@ -737,7 +737,7 @@ xfs_attr_recover_work( struct xfs_attri_log_item *attrip = ATTRI_ITEM(lip); struct xfs_attr_intent *attr; struct xfs_mount *mp = lip->li_log->l_mp; - struct xfs_inode *ip; + struct xfs_inode *ip = NULL; struct xfs_da_args *args; struct xfs_trans *tp; struct xfs_trans_res resv;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
commit dc68c0f601691010dd5ae53442f8523f41a53131 upstream.
The grofs code for zoned RT subvolums already tries to check for zone alignment, but gets it wrong by using the old instead of the new mount structure.
Fixes: 01b71e64bb87 ("xfs: support growfs on zoned file systems") Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Cc: stable@vger.kernel.org # v6.15 Signed-off-by: Carlos Maiolino cem@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_rtalloc.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 6907e871fa15..e063f4f2f2e6 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -1255,12 +1255,10 @@ xfs_growfs_check_rtgeom( min_logfsbs = min_t(xfs_extlen_t, xfs_log_calc_minimum_size(nmp), nmp->m_rsumblocks * 2);
- kfree(nmp); - trace_xfs_growfs_check_rtgeom(mp, min_logfsbs);
if (min_logfsbs > mp->m_sb.sb_logblocks) - return -EINVAL; + goto out_inval;
if (xfs_has_zoned(mp)) { uint32_t gblocks = mp->m_groups[XG_TYPE_RTG].blocks; @@ -1268,16 +1266,20 @@ xfs_growfs_check_rtgeom(
if (rextsize != 1) return -EINVAL; - div_u64_rem(mp->m_sb.sb_rblocks, gblocks, &rem); + div_u64_rem(nmp->m_sb.sb_rblocks, gblocks, &rem); if (rem) { xfs_warn(mp, "new RT volume size (%lld) not aligned to RT group size (%d)", - mp->m_sb.sb_rblocks, gblocks); - return -EINVAL; + nmp->m_sb.sb_rblocks, gblocks); + goto out_inval; } }
+ kfree(nmp); return 0; +out_inval: + kfree(nmp); + return -EINVAL; }
/*
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 5990fd756943836978ad184aac980e2b36ab7e01 upstream.
The xchk_setup_xattr_buf function can allocate a new value buffer, which means that any reference to ab->value before the call could become a dangling pointer. Fix this by moving an assignment to after the buffer setup.
Cc: stable@vger.kernel.org # v6.10 Fixes: e47dcf113ae348 ("xfs: repair extended attributes") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Carlos Maiolino cem@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/scrub/attr_repair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/xfs/scrub/attr_repair.c +++ b/fs/xfs/scrub/attr_repair.c @@ -333,7 +333,6 @@ xrep_xattr_salvage_remote_attr( .attr_filter = ent->flags & XFS_ATTR_NSP_ONDISK_MASK, .namelen = rentry->namelen, .name = rentry->name, - .value = ab->value, .valuelen = be32_to_cpu(rentry->valuelen), }; unsigned int namesize; @@ -363,6 +362,7 @@ xrep_xattr_salvage_remote_attr( error = -EDEADLOCK; if (error) return error; + args.value = ab->value;
/* Look up the remote value and stash it for reconstruction. */ error = xfs_attr3_leaf_getvalue(leaf_bp, &args);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
commit 982d2616a2906113e433fdc0cfcc122f8d1bb60a upstream.
Garbage collection assumes all zones contain the full amount of blocks. Mkfs already ensures this happens, but make the kernel check it as well to avoid getting into trouble due to fuzzers or mkfs bugs.
Fixes: 2167eaabe2fa ("xfs: define the zoned on-disk format") Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Cc: stable@vger.kernel.org # v6.15 Signed-off-by: Carlos Maiolino cem@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/libxfs/xfs_sb.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index cdd16dd805d7..94c272a2ae26 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -301,6 +301,21 @@ xfs_validate_rt_geometry( sbp->sb_rbmblocks != xfs_expected_rbmblocks(sbp)) return false;
+ if (xfs_sb_is_v5(sbp) && + (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_ZONED)) { + uint32_t mod; + + /* + * Zoned RT devices must be aligned to the RT group size, + * because garbage collection assumes that all zones have the + * same size to avoid insane complexity if that weren't the + * case. + */ + div_u64_rem(sbp->sb_rextents, sbp->sb_rgextents, &mod); + if (mod) + return false; + } + return true; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit ef7f38df890f5dcd2ae62f8dbde191d72f3bebae upstream.
Synthetic events currently do not have a function to register perf events. This leads to calling the tracepoint register functions with a NULL function pointer which triggers:
------------[ cut here ]------------ WARNING: kernel/tracepoint.c:175 at tracepoint_add_func+0x357/0x370, CPU#2: perf/2272 Modules linked in: kvm_intel kvm irqbypass CPU: 2 UID: 0 PID: 2272 Comm: perf Not tainted 6.18.0-ftest-11964-ge022764176fc-dirty #323 PREEMPTLAZY Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014 RIP: 0010:tracepoint_add_func+0x357/0x370 Code: 28 9c e8 4c 0b f5 ff eb 0f 4c 89 f7 48 c7 c6 80 4d 28 9c e8 ab 89 f4 ff 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc <0f> 0b 49 c7 c6 ea ff ff ff e9 ee fe ff ff 0f 0b e9 f9 fe ff ff 0f RSP: 0018:ffffabc0c44d3c40 EFLAGS: 00010246 RAX: 0000000000000001 RBX: ffff9380aa9e4060 RCX: 0000000000000000 RDX: 000000000000000a RSI: ffffffff9e1d4a98 RDI: ffff937fcf5fd6c8 RBP: 0000000000000001 R08: 0000000000000007 R09: ffff937fcf5fc780 R10: 0000000000000003 R11: ffffffff9c193910 R12: 000000000000000a R13: ffffffff9e1e5888 R14: 0000000000000000 R15: ffffabc0c44d3c78 FS: 00007f6202f5f340(0000) GS:ffff93819f00f000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055d3162281a8 CR3: 0000000106a56003 CR4: 0000000000172ef0 Call Trace: <TASK> tracepoint_probe_register+0x5d/0x90 synth_event_reg+0x3c/0x60 perf_trace_event_init+0x204/0x340 perf_trace_init+0x85/0xd0 perf_tp_event_init+0x2e/0x50 perf_try_init_event+0x6f/0x230 ? perf_event_alloc+0x4bb/0xdc0 perf_event_alloc+0x65a/0xdc0 __se_sys_perf_event_open+0x290/0x9f0 do_syscall_64+0x93/0x7b0 ? entry_SYSCALL_64_after_hwframe+0x76/0x7e ? trace_hardirqs_off+0x53/0xc0 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Instead, have the code return -ENODEV, which doesn't warn and has perf error out with:
# perf record -e synthetic:futex_wait Error: The sys_perf_event_open() syscall returned with 19 (No such device) for event (synthetic:futex_wait). "dmesg | grep -i perf" may provide additional information.
Ideally perf should support synthetic events, but for now just fix the warning. The support can come later.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Link: https://patch.msgid.link/20251216182440.147e4453@gandalf.local.home Fixes: 4b147936fa509 ("tracing: Add support for 'synthetic' events") Reported-by: Ian Rogers irogers@google.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -700,6 +700,8 @@ int trace_event_reg(struct trace_event_c
#ifdef CONFIG_PERF_EVENTS case TRACE_REG_PERF_REGISTER: + if (!call->class->perf_probe) + return -ENODEV; return tracepoint_probe_register(call->tp, call->class->perf_probe, call);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit 359afc8eb02a518fbdd0cbd462c8c2827c6cbec2 upstream.
Commit 89d9cec3b1e9 ("PM: runtime: Clear power.needs_force_resume in pm_runtime_reinit()") added provisional clearing of power.needs_force_resume to pm_runtime_reinit(), but it is done unconditionally which is a mistake because pm_runtime_reinit() may race with driver probing and removal [1].
To address this, notice that power.needs_force_resume should never be set when runtime PM is enabled and so it only needs to be cleared when runtime PM is disabled, and update pm_runtime_init() to only clear that flag when runtime PM is disabled.
Fixes: 89d9cec3b1e9 ("PM: runtime: Clear power.needs_force_resume in pm_runtime_reinit()") Reported-by: Ed Tsai ed.tsai@mediatek.com Closes: https://lore.kernel.org/linux-pm/20251215122154.3180001-1-ed.tsai@mediatek.c... [1] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: 6.17+ stable@vger.kernel.org # 6.17+ Reviewed-by: Ulf Hansson ulf.hansson@linaro.org Link: https://patch.msgid.link/12807571.O9o76ZdvQC@rafael.j.wysocki Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/power/runtime.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
--- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -1869,16 +1869,18 @@ void pm_runtime_init(struct device *dev) */ void pm_runtime_reinit(struct device *dev) { - if (!pm_runtime_enabled(dev)) { - if (dev->power.runtime_status == RPM_ACTIVE) - pm_runtime_set_suspended(dev); - if (dev->power.irq_safe) { - spin_lock_irq(&dev->power.lock); - dev->power.irq_safe = 0; - spin_unlock_irq(&dev->power.lock); - if (dev->parent) - pm_runtime_put(dev->parent); - } + if (pm_runtime_enabled(dev)) + return; + + if (dev->power.runtime_status == RPM_ACTIVE) + pm_runtime_set_suspended(dev); + + if (dev->power.irq_safe) { + spin_lock_irq(&dev->power.lock); + dev->power.irq_safe = 0; + spin_unlock_irq(&dev->power.lock); + if (dev->parent) + pm_runtime_put(dev->parent); } /* * Clear power.needs_force_resume in case it has been set by
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Brown broonie@kernel.org
commit 98a97bf41528ef738b06eb07ec2b2eb1cfde6ce6 upstream.
When we exec a new task we forget to flush the set of locked GCS mode bits. Since we do flush the rest of the state this means that if GCS is locked the new task will be unable to enable GCS, it will be locked as being disabled. Add the expected flush.
Fixes: fc84bc5378a8 ("arm64/gcs: Context switch GCS state for EL0") Cc: stable@vger.kernel.org # 6.13.x Reported-by: Yury Khrustalev Yury.Khrustalev@arm.com Signed-off-by: Mark Brown broonie@kernel.org Tested-by: Yury Khrustalev yury.khrustalev@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/process.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -292,6 +292,7 @@ static void flush_gcs(void) current->thread.gcs_base = 0; current->thread.gcs_size = 0; current->thread.gcs_el0_mode = 0; + current->thread.gcs_el0_locked = 0; write_sysreg_s(GCSCRE0_EL1_nTR, SYS_GCSCRE0_EL1); write_sysreg_s(0, SYS_GCSPR_EL0); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Charles Mirabile cmirabil@redhat.com
commit 5a0b1882506858b12cc77f0e2439a5f3c5052761 upstream.
poly1305-core.S is an auto-generated file, so it should be ignored.
Fixes: bef9c7559869 ("lib/crypto: riscv/poly1305: Import OpenSSL/CRYPTOGAMS implementation") Cc: stable@vger.kernel.org Signed-off-by: Charles Mirabile cmirabil@redhat.com Link: https://lore.kernel.org/r/20251212184717.133701-1-cmirabil@redhat.com Signed-off-by: Eric Biggers ebiggers@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/crypto/riscv/.gitignore | 2 ++ 1 file changed, 2 insertions(+) create mode 100644 lib/crypto/riscv/.gitignore
diff --git a/lib/crypto/riscv/.gitignore b/lib/crypto/riscv/.gitignore new file mode 100644 index 000000000000..0d47d4f21c6d --- /dev/null +++ b/lib/crypto/riscv/.gitignore @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0-only +poly1305-core.S
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: René Rebe rene@exactco.de
commit dd75c723ef566f7f009c047f47e0eee95fe348ab upstream.
Wake-on-Lan does currently not work for r8169 in DASH mode, e.g. the ASUS Pro WS X570-ACE with RTL8168fp/RTL8117.
Fix by not returning early in rtl_prepare_power_down when dash_enabled. While this fixes WoL, it still kills the OOB RTL8117 remote management BMC connection. Fix by not calling rtl8168_driver_stop if WoL is enabled.
Fixes: 065c27c184d6 ("r8169: phy power ops") Signed-off-by: René Rebe rene@exactco.de Cc: stable@vger.kernel.org Reviewed-by: Heiner Kallweit hkallweit1@gmail.com Link: https://patch.msgid.link/20251202.194137.1647877804487085954.rene@exactco.de Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -2669,9 +2669,6 @@ static void rtl_wol_enable_rx(struct rtl
static void rtl_prepare_power_down(struct rtl8169_private *tp) { - if (tp->dash_enabled) - return; - if (tp->mac_version == RTL_GIGA_MAC_VER_32 || tp->mac_version == RTL_GIGA_MAC_VER_33) rtl_ephy_write(tp, 0x19, 0xff64); @@ -4807,7 +4804,7 @@ static void rtl8169_down(struct rtl8169_ rtl_disable_exit_l1(tp); rtl_prepare_power_down(tp);
- if (tp->dash_type != RTL_DASH_NONE) + if (tp->dash_type != RTL_DASH_NONE && !tp->saved_wolopts) rtl8168_driver_stop(tp); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thorsten Blum thorsten.blum@linux.dev
commit c4cdf7376271bce5714c06d79ec67759b18910eb upstream.
The local variable 'val' was never clamped to -75000 or 180000 because the return value of clamp_val() was not used. Fix this by assigning the clamped value back to 'val', and use clamp() instead of clamp_val().
Cc: stable@vger.kernel.org Fixes: a557a92e6881 ("net: phy: marvell-88q2xxx: add support for temperature sensor") Signed-off-by: Thorsten Blum thorsten.blum@linux.dev Reviewed-by: Dimitri Fedrau dima.fedrau@gmail.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20251202172743.453055-3-thorsten.blum@linux.dev Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/marvell-88q2xxx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/phy/marvell-88q2xxx.c +++ b/drivers/net/phy/marvell-88q2xxx.c @@ -698,7 +698,7 @@ static int mv88q2xxx_hwmon_write(struct
switch (attr) { case hwmon_temp_max: - clamp_val(val, -75000, 180000); + val = clamp(val, -75000, 180000); val = (val / 1000) + 75; val = FIELD_PREP(MDIO_MMD_PCS_MV_TEMP_SENSOR3_INT_THRESH_MASK, val);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amir Goldstein amir73il@gmail.com
commit 635bc4def026a24e071436f4f356ea08c0eed6ff upstream.
inotify/fanotify do not allow users with no read access to a file to subscribe to events (e.g. IN_ACCESS/IN_MODIFY), but they do allow the same user to subscribe for watching events on children when the user has access to the parent directory (e.g. /dev).
Users with no read access to a file but with read access to its parent directory can still stat the file and see if it was accessed/modified via atime/mtime change.
The same is not true for special files (e.g. /dev/null). Users will not generally observe atime/mtime changes when other users read/write to special files, only when someone sets atime/mtime via utimensat().
Align fsnotify events with this stat behavior and do not generate ACCESS/MODIFY events to parent watchers on read/write of special files. The events are still generated to parent watchers on utimensat(). This closes some side-channels that could be possibly used for information exfiltration [1].
[1] https://snee.la/pdf/pubs/file-notification-attacks.pdf
Reported-by: Sudheendra Raghav Neela sneela@tugraz.at CC: stable@vger.kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/notify/fsnotify.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/fs/notify/fsnotify.c +++ b/fs/notify/fsnotify.c @@ -270,8 +270,15 @@ int __fsnotify_parent(struct dentry *den /* * Include parent/name in notification either if some notification * groups require parent info or the parent is interested in this event. + * The parent interest in ACCESS/MODIFY events does not apply to special + * files, where read/write are not on the filesystem of the parent and + * events can provide an undesirable side-channel for information + * exfiltration. */ - parent_interested = mask & p_mask & ALL_FSNOTIFY_EVENTS; + parent_interested = mask & p_mask & ALL_FSNOTIFY_EVENTS && + !(data_type == FSNOTIFY_EVENT_PATH && + d_is_special(dentry) && + (mask & (FS_ACCESS | FS_MODIFY))); if (parent_needed || parent_interested) { /* When notifying parent, child should be passed as data */ WARN_ON_ONCE(inode != fsnotify_data_inode(data, data_type));
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: caoping caoping@cmss.chinamobile.com
commit 6af2a01d65f89e73c1cbb9267f8880d83a88cee4 upstream.
handshake_req_submit() replaces sk->sk_destruct but never restores it when submission fails before the request is hashed. handshake_sk_destruct() then returns early and the original destructor never runs, leaking the socket. Restore sk_destruct on the error path.
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") Reviewed-by: Chuck Lever chuck.lever@oracle.com Cc: stable@vger.kernel.org Signed-off-by: caoping caoping@cmss.chinamobile.com Link: https://patch.msgid.link/20251204091058.1545151-1-caoping@cmss.chinamobile.c... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/handshake/request.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/handshake/request.c +++ b/net/handshake/request.c @@ -276,6 +276,8 @@ int handshake_req_submit(struct socket * out_unlock: spin_unlock(&hn->hn_lock); out_err: + /* Restore original destructor so socket teardown still runs on failure */ + req->hr_sk->sk_destruct = req->hr_odestruct; trace_handshake_submit_err(net, req, req->hr_sk, ret); handshake_req_destroy(req); return ret;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chuck Lever chuck.lever@oracle.com
commit 27d17641cacfedd816789b75d342430f6b912bd2 upstream.
From RFC 8881:
5.8.1.14. Attribute 75: suppattr_exclcreat
The bit vector that would set all REQUIRED and RECOMMENDED attributes that are supported by the EXCLUSIVE4_1 method of file creation via the OPEN operation. The scope of this attribute applies to all objects with a matching fsid.
There's nothing in RFC 8881 that states that suppattr_exclcreat is or is not allowed to contain bits for attributes that are clear in the reported supported_attrs bitmask. But it doesn't make sense for an NFS server to indicate that it /doesn't/ implement an attribute, but then also indicate that clients /are/ allowed to set that attribute using OPEN(create) with EXCLUSIVE4_1.
Ensure that the SECURITY_LABEL and ACL bits are not set in the suppattr_exclcreat bitmask when they are also not set in the supported_attrs bitmask.
Fixes: 8c18f2052e75 ("nfsd41: SUPPATTR_EXCLCREAT attribute") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfs4xdr.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -3375,6 +3375,11 @@ static __be32 nfsd4_encode_fattr4_suppat u32 supp[3];
memcpy(supp, nfsd_suppattrs[resp->cstate.minorversion], sizeof(supp)); + if (!IS_POSIXACL(d_inode(args->dentry))) + supp[0] &= ~FATTR4_WORD0_ACL; + if (!args->contextsupport) + supp[2] &= ~FATTR4_WORD2_SECURITY_LABEL; + supp[0] &= NFSD_SUPPATTR_EXCLCREAT_WORD0; supp[1] &= NFSD_SUPPATTR_EXCLCREAT_WORD1; supp[2] &= NFSD_SUPPATTR_EXCLCREAT_WORD2;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chuck Lever chuck.lever@oracle.com
commit 913f7cf77bf14c13cfea70e89bcb6d0b22239562 upstream.
An NFSv4 client that sets an ACL with a named principal during file creation retrieves the ACL afterwards, and finds that it is only a default ACL (based on the mode bits) and not the ACL that was requested during file creation. This violates RFC 8881 section 6.4.1.3: "the ACL attribute is set as given".
The issue occurs in nfsd_create_setattr(), which calls nfsd_attrs_valid() to determine whether to call nfsd_setattr(). However, nfsd_attrs_valid() checks only for iattr changes and security labels, but not POSIX ACLs. When only an ACL is present, the function returns false, nfsd_setattr() is skipped, and the POSIX ACL is never applied to the inode.
Subsequently, when the client retrieves the ACL, the server finds no POSIX ACL on the inode and returns one generated from the file's mode bits rather than returning the originally-specified ACL.
Reported-by: Aurélien Couderc aurelien.couderc2002@gmail.com Fixes: c0cbe70742f4 ("NFSD: add posix ACLs to struct nfsd_attrs") Cc: Roland Mainz roland.mainz@nrubsig.org Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/vfs.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -67,7 +67,8 @@ static inline bool nfsd_attrs_valid(stru struct iattr *iap = attrs->na_iattr;
return (iap->ia_valid || (attrs->na_seclabel && - attrs->na_seclabel->len)); + attrs->na_seclabel->len) || + attrs->na_pacl || attrs->na_dpacl); }
__be32 nfserrno (int errno);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
commit 26e455064983e00013c0a63ffe0eed9e9ec2fa89 upstream.
With the introduction of 8-bit formats the DMIC blob lookup also needs to be modified to prefer the 32-bit blob when 8-bit format is used on FE.
At the same time we also need to make sure that in case 8-bit format is used, but only 16-bit blob is available for DMIC then we will not try to look for 8-bit blob (which is invalid) as fallback, but for a 16-bit one.
Fixes: c04c2e829649 ("ASoC: SOF: ipc4-topology: Add support for 8-bit formats") Cc: stable@vger.kernel.org Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Reviewed-by: Seppo Ingalsuo seppo.ingalsuo@linux.intel.com Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Link: https://patch.msgid.link/20251215120648.4827-2-peter.ujfalusi@linux.intel.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/sof/ipc4-topology.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index 221e9d4052b8..47959f182f4b 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -1752,11 +1752,9 @@ snd_sof_get_nhlt_endpoint_data(struct snd_sof_dev *sdev, struct snd_sof_dai *dai channel_count = params_channels(params); sample_rate = params_rate(params); bit_depth = params_width(params); - /* - * Look for 32-bit blob first instead of 16-bit if copier - * supports multiple formats - */ - if (bit_depth == 16 && !single_bitdepth) { + + /* Prefer 32-bit blob if copier supports multiple formats */ + if (bit_depth <= 16 && !single_bitdepth) { dev_dbg(sdev->dev, "Looking for 32-bit blob first for DMIC\n"); format_change = true; bit_depth = 32; @@ -1799,10 +1797,18 @@ snd_sof_get_nhlt_endpoint_data(struct snd_sof_dev *sdev, struct snd_sof_dai *dai if (format_change) { /* * The 32-bit blob was not found in NHLT table, try to - * look for one based on the params + * look for 16-bit for DMIC or based on the params for + * SSP */ - bit_depth = params_width(params); - format_change = false; + if (linktype == SOF_DAI_INTEL_DMIC) { + bit_depth = 16; + if (params_width(params) == 16) + format_change = false; + } else { + bit_depth = params_width(params); + format_change = false; + } + get_new_blob = true; } else if (linktype == SOF_DAI_INTEL_DMIC && !single_bitdepth) { /*
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Antheas Kapenekakis lkml@antheas.dev
commit f7cede182c963720edd1e5fb50ea4f1c7eafa30e upstream.
By default, these devices use the quirk ALC294_FIXUP_ASUS_SPK. Not using it causes the headphone jack to stop working. Therefore, introduce a new quirk ALC287_FIXUP_TXNW2781_I2C_ASUS that binds to the TAS amplifier while using that quirk.
Cc: stable@kernel.org Fixes: 18a4895370a7 ("ALSA: hda/realtek: Add match for ASUS Xbox Ally projects") Signed-off-by: Antheas Kapenekakis lkml@antheas.dev Link: https://patch.msgid.link/20251216211714.1116898-1-lkml@antheas.dev Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/hda/codecs/realtek/alc269.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -3703,6 +3703,7 @@ enum { ALC295_FIXUP_DELL_TAS2781_I2C, ALC245_FIXUP_TAS2781_SPI_2, ALC287_FIXUP_TXNW2781_I2C, + ALC287_FIXUP_TXNW2781_I2C_ASUS, ALC287_FIXUP_YOGA7_14ARB7_I2C, ALC245_FIXUP_HP_MUTE_LED_COEFBIT, ALC245_FIXUP_HP_MUTE_LED_V1_COEFBIT, @@ -5993,6 +5994,12 @@ static const struct hda_fixup alc269_fix .chained = true, .chain_id = ALC285_FIXUP_THINKPAD_HEADSET_JACK, }, + [ALC287_FIXUP_TXNW2781_I2C_ASUS] = { + .type = HDA_FIXUP_FUNC, + .v.func = tas2781_fixup_txnw_i2c, + .chained = true, + .chain_id = ALC294_FIXUP_ASUS_SPK, + }, [ALC287_FIXUP_YOGA7_14ARB7_I2C] = { .type = HDA_FIXUP_FUNC, .v.func = yoga7_14arb7_fixup_i2c, @@ -6736,8 +6743,8 @@ static const struct hda_quirk alc269_fix SND_PCI_QUIRK(0x1043, 0x12f0, "ASUS X541UV", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1313, "Asus K42JZ", ALC269VB_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1314, "ASUS GA605K", ALC285_FIXUP_ASUS_GA605K_HEADSET_MIC), - SND_PCI_QUIRK(0x1043, 0x1384, "ASUS RC73XA", ALC287_FIXUP_TXNW2781_I2C), - SND_PCI_QUIRK(0x1043, 0x1394, "ASUS RC73YA", ALC287_FIXUP_TXNW2781_I2C), + SND_PCI_QUIRK(0x1043, 0x1384, "ASUS RC73XA", ALC287_FIXUP_TXNW2781_I2C_ASUS), + SND_PCI_QUIRK(0x1043, 0x1394, "ASUS RC73YA", ALC287_FIXUP_TXNW2781_I2C_ASUS), SND_PCI_QUIRK(0x1043, 0x13b0, "ASUS Z550SA", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1427, "Asus Zenbook UX31E", ALC269VB_FIXUP_ASUS_ZENBOOK), SND_PCI_QUIRK(0x1043, 0x1433, "ASUS GX650PY/PZ/PV/PU/PYV/PZV/PIV/PVV", ALC285_FIXUP_ASUS_I2C_HEADSET_MIC),
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chuck Lever chuck.lever@oracle.com
commit ad3cbbb0c1892c48919727fcb8dec5965da8bacb upstream.
From RFC 8881:
5.8.1.14. Attribute 75: suppattr_exclcreat
The bit vector that would set all REQUIRED and RECOMMENDED attributes that are supported by the EXCLUSIVE4_1 method of file creation via the OPEN operation. The scope of this attribute applies to all objects with a matching fsid.
There's nothing in RFC 8881 that states that suppattr_exclcreat is or is not allowed to contain bits for attributes that are clear in the reported supported_attrs bitmask. But it doesn't make sense for an NFS server to indicate that it /doesn't/ implement an attribute, but then also indicate that clients /are/ allowed to set that attribute using OPEN(create) with EXCLUSIVE4_1.
The FATTR4_WORD2_TIME_DELEG attributes are also not to be allowed for OPEN(create) with EXCLUSIVE4_1. It doesn't make sense to set a delegated timestamp on a new file.
Fixes: 7e13f4f8d27d ("nfsd: handle delegated timestamps in SETATTR") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfsd.h | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/nfsd/nfsd.h +++ b/fs/nfsd/nfsd.h @@ -547,8 +547,14 @@ static inline bool nfsd_attrs_supported( #define NFSD_SUPPATTR_EXCLCREAT_WORD1 \ (NFSD_WRITEABLE_ATTRS_WORD1 & \ ~(FATTR4_WORD1_TIME_ACCESS_SET | FATTR4_WORD1_TIME_MODIFY_SET)) +/* + * The FATTR4_WORD2_TIME_DELEG attributes are not to be allowed for + * OPEN(create) with EXCLUSIVE4_1. It doesn't make sense to set a + * delegated timestamp on a new file. + */ #define NFSD_SUPPATTR_EXCLCREAT_WORD2 \ - NFSD_WRITEABLE_ATTRS_WORD2 + (NFSD_WRITEABLE_ATTRS_WORD2 & \ + ~(FATTR4_WORD2_TIME_DELEG_ACCESS | FATTR4_WORD2_TIME_DELEG_MODIFY))
extern int nfsd4_is_junction(struct dentry *dentry); extern int register_cld_notifier(void);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
commit 816f291fc23f325d31509d0e97873249ad75ae9a upstream.
SSP/DMIC blobs have no support for FLOAT type, they are using S32 on data bus.
Convert the format from FLOAT_LE to S32_LE to make sure that the correct format is used within the path.
FLOAT conversion will be done on the host side (or within the path).
Fixes: f7c41911ad74 ("ASoC: SOF: ipc4-topology: Add support for float sample type") Cc: stable@vger.kernel.org Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Reviewed-by: Seppo Ingalsuo seppo.ingalsuo@linux.intel.com Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Link: https://patch.msgid.link/20251215120648.4827-3-peter.ujfalusi@linux.intel.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/sof/ipc4-topology.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index 47959f182f4b..32b628e2fe29 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -1843,7 +1843,7 @@ snd_sof_get_nhlt_endpoint_data(struct snd_sof_dev *sdev, struct snd_sof_dai *dai *len = cfg->size >> 2; *dst = (u32 *)cfg->caps;
- if (format_change) { + if (format_change || params_format(params) == SNDRV_PCM_FORMAT_FLOAT_LE) { /* * Update the params to reflect that different blob was loaded * instead of the requested bit depth (16 -> 32 or 32 -> 16).
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joshua Rogers linux@joshua.hu
commit d1bea0ce35b6095544ee82bb54156fc62c067e58 upstream.
svc_rdma_copy_inline_range indexed rqstp->rq_pages[rc_curpage] without verifying rc_curpage stays within the allocated page array. Add guards before the first use and after advancing to a new page.
Fixes: d7cc73972661 ("svcrdma: support multiple Read chunks per RPC") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers linux@joshua.hu Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sunrpc/xprtrdma/svc_rdma_rw.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c +++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c @@ -841,6 +841,9 @@ static int svc_rdma_copy_inline_range(st for (page_no = 0; page_no < numpages; page_no++) { unsigned int page_len;
+ if (head->rc_curpage >= rqstp->rq_maxpages) + return -EINVAL; + page_len = min_t(unsigned int, remaining, PAGE_SIZE - head->rc_pageoff);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit ebae102897e760e9e6bc625f701dd666b2163bd1 upstream.
Clang is not happy about set but (in some cases) unused variable:
fs/nfsd/export.c:1027:17: error: variable 'inode' set but not used [-Werror,-Wunused-but-set-variable]
since it's used as a parameter to dprintk() which might be configured a no-op. To avoid uglifying code with the specific ifdeffery just mark the variable __maybe_unused.
The commit [1], which introduced this behaviour, is quite old and hence the Fixes tag points to the first of the Git era.
Link: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?... [1] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/export.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfsd/export.c +++ b/fs/nfsd/export.c @@ -1024,7 +1024,7 @@ exp_rootfh(struct net *net, struct auth_ { struct svc_export *exp; struct path path; - struct inode *inode; + struct inode *inode __maybe_unused; struct svc_fh fh; int err; struct nfsd_net *nn = net_generic(net, nfsd_net_id);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joshua Rogers linux@joshua.hu
commit 94972027ab55b200e031059fd6c7a649f8248020 upstream.
The function comment specifies 0 on success and -EINVAL on invalid parameters. Make the tail return 0 after a successful copy loop.
Fixes: d7cc73972661 ("svcrdma: support multiple Read chunks per RPC") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers linux@joshua.hu Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sunrpc/xprtrdma/svc_rdma_rw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c +++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c @@ -863,7 +863,7 @@ static int svc_rdma_copy_inline_range(st offset += page_len; }
- return -EINVAL; + return 0; }
/**
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joshua Rogers linux@joshua.hu
commit a8ee9099f30654917aa68f55d707b5627e1dbf77 upstream.
svc_rdma_copy_inline_range added rc_curpage (page index) to the page base instead of the byte offset rc_pageoff. Use rc_pageoff so copies land within the current page.
Found by ZeroPath (https://zeropath.com)
Fixes: 8e122582680c ("svcrdma: Move svc_rdma_read_info::ri_pageno to struct svc_rdma_recv_ctxt") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers linux@joshua.hu Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sunrpc/xprtrdma/svc_rdma_rw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c +++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c @@ -851,7 +851,7 @@ static int svc_rdma_copy_inline_range(st head->rc_page_count++;
dst = page_address(rqstp->rq_pages[head->rc_curpage]); - memcpy(dst + head->rc_curpage, src + offset, page_len); + memcpy((unsigned char *)dst + head->rc_pageoff, src + offset, page_len);
head->rc_readbytes += page_len; head->rc_pageoff += page_len;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joshua Rogers linux@joshua.hu
commit d4b69a6186b215d2dc1ebcab965ed88e8d41768d upstream.
A zero length gss_token results in pages == 0 and in_token->pages[0] is NULL. The code unconditionally evaluates page_address(in_token->pages[0]) for the initial memcpy, which can dereference NULL even when the copy length is 0. Guard the first memcpy so it only runs when length > 0.
Fixes: 5866efa8cbfb ("SUNRPC: Fix svcauth_gss_proxy_init()") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers linux@joshua.hu Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sunrpc/auth_gss/svcauth_gss.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/sunrpc/auth_gss/svcauth_gss.c +++ b/net/sunrpc/auth_gss/svcauth_gss.c @@ -1083,7 +1083,8 @@ static int gss_read_proxy_verf(struct sv }
length = min_t(unsigned int, inlen, (char *)xdr->end - (char *)xdr->p); - memcpy(page_address(in_token->pages[0]), xdr->p, length); + if (length) + memcpy(page_address(in_token->pages[0]), xdr->p, length); inlen -= length;
to_offs = length;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nysal Jan K.A. nysal@linux.ibm.com
commit c2296a1e42418556efbeb5636c4fa6aa6106713a upstream.
If SMT is disabled or a partial SMT state is enabled, when a new kernel image is loaded for kexec, on reboot the following warning is observed:
kexec: Waking offline cpu 228. WARNING: CPU: 0 PID: 9062 at arch/powerpc/kexec/core_64.c:223 kexec_prepare_cpus+0x1b0/0x1bc [snip] NIP kexec_prepare_cpus+0x1b0/0x1bc LR kexec_prepare_cpus+0x1a0/0x1bc Call Trace: kexec_prepare_cpus+0x1a0/0x1bc (unreliable) default_machine_kexec+0x160/0x19c machine_kexec+0x80/0x88 kernel_kexec+0xd0/0x118 __do_sys_reboot+0x210/0x2c4 system_call_exception+0x124/0x320 system_call_vectored_common+0x15c/0x2ec
This occurs as add_cpu() fails due to cpu_bootable() returning false for CPUs that fail the cpu_smt_thread_allowed() check or non primary threads if SMT is disabled.
Fix the issue by enabling SMT and resetting the number of SMT threads to the number of threads per core, before attempting to wake up all present CPUs.
Fixes: 38253464bc82 ("cpu/SMT: Create topology_smt_thread_allowed()") Reported-by: Sachin P Bappalige sachinpb@linux.ibm.com Cc: stable@vger.kernel.org # v6.6+ Reviewed-by: Srikar Dronamraju srikar@linux.ibm.com Signed-off-by: Nysal Jan K.A. nysal@linux.ibm.com Tested-by: Samir M samir@linux.ibm.com Reviewed-by: Sourabh Jain sourabhjain@linux.ibm.com Signed-off-by: Madhavan Srinivasan maddy@linux.ibm.com Link: https://patch.msgid.link/20251028105516.26258-1-nysal@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kexec/core_64.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
--- a/arch/powerpc/kexec/core_64.c +++ b/arch/powerpc/kexec/core_64.c @@ -202,6 +202,23 @@ static void kexec_prepare_cpus_wait(int mb(); }
+ +/* + * The add_cpu() call in wake_offline_cpus() can fail as cpu_bootable() + * returns false for CPUs that fail the cpu_smt_thread_allowed() check + * or non primary threads if SMT is disabled. Re-enable SMT and set the + * number of SMT threads to threads per core. + */ +static void kexec_smt_reenable(void) +{ +#if defined(CONFIG_SMP) && defined(CONFIG_HOTPLUG_SMT) + lock_device_hotplug(); + cpu_smt_num_threads = threads_per_core; + cpu_smt_control = CPU_SMT_ENABLED; + unlock_device_hotplug(); +#endif +} + /* * We need to make sure each present CPU is online. The next kernel will scan * the device tree and assume primary threads are online and query secondary @@ -216,6 +233,8 @@ static void wake_offline_cpus(void) { int cpu = 0;
+ kexec_smt_reenable(); + for_each_present_cpu(cpu) { if (!cpu_online(cpu)) { printk(KERN_INFO "kexec: Waking offline cpu %d.\n",
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit 266273eaf4d99475f1ae57f687b3e42bc71ec6f0 upstream.
We can't log a conflicting inode if it's a directory and it was moved from one parent directory to another parent directory in the current transaction, as this can result an attempt to have a directory with two hard links during log replay, one for the old parent directory and another for the new parent directory.
The following scenario triggers that issue:
1) We have directories "dir1" and "dir2" created in a past transaction. Directory "dir1" has inode A as its parent directory;
2) We move "dir1" to some other directory;
3) We create a file with the name "dir1" in directory inode A;
4) We fsync the new file. This results in logging the inode of the new file and the inode for the directory "dir1" that was previously moved in the current transaction. So the log tree has the INODE_REF item for the new location of "dir1";
5) We move the new file to some other directory. This results in updating the log tree to included the new INODE_REF for the new location of the file and removes the INODE_REF for the old location. This happens during the rename when we call btrfs_log_new_name();
6) We fsync the file, and that persists the log tree changes done in the previous step (btrfs_log_new_name() only updates the log tree in memory);
7) We have a power failure;
8) Next time the fs is mounted, log replay happens and when processing the inode for directory "dir1" we find a new INODE_REF and add that link, but we don't remove the old link of the inode since we have not logged the old parent directory of the directory inode "dir1".
As a result after log replay finishes when we trigger writeback of the subvolume tree's extent buffers, the tree check will detect that we have a directory a hard link count of 2 and we get a mount failure. The errors and stack traces reported in dmesg/syslog are like this:
[ 3845.729764] BTRFS info (device dm-0): start tree-log replay [ 3845.730304] page: refcount:3 mapcount:0 mapping:000000005c8a3027 index:0x1d00 pfn:0x11510c [ 3845.731236] memcg:ffff9264c02f4e00 [ 3845.731751] aops:btree_aops [btrfs] ino:1 [ 3845.732300] flags: 0x17fffc00000400a(uptodate|private|writeback|node=0|zone=2|lastcpupid=0x1ffff) [ 3845.733346] raw: 017fffc00000400a 0000000000000000 dead000000000122 ffff9264d978aea8 [ 3845.734265] raw: 0000000000001d00 ffff92650e6d4738 00000003ffffffff ffff9264c02f4e00 [ 3845.735305] page dumped because: eb page dump [ 3845.735981] BTRFS critical (device dm-0): corrupt leaf: root=5 block=30408704 slot=6 ino=257, invalid nlink: has 2 expect no more than 1 for dir [ 3845.737786] BTRFS info (device dm-0): leaf 30408704 gen 10 total ptrs 17 free space 14881 owner 5 [ 3845.737789] BTRFS info (device dm-0): refs 4 lock_owner 0 current 30701 [ 3845.737792] item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160 [ 3845.737794] inode generation 3 transid 9 size 16 nbytes 16384 [ 3845.737795] block group 0 mode 40755 links 1 uid 0 gid 0 [ 3845.737797] rdev 0 sequence 2 flags 0x0 [ 3845.737798] atime 1764259517.0 [ 3845.737800] ctime 1764259517.572889464 [ 3845.737801] mtime 1764259517.572889464 [ 3845.737802] otime 1764259517.0 [ 3845.737803] item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12 [ 3845.737805] index 0 name_len 2 [ 3845.737807] item 2 key (256 DIR_ITEM 2363071922) itemoff 16077 itemsize 34 [ 3845.737808] location key (257 1 0) type 2 [ 3845.737810] transid 9 data_len 0 name_len 4 [ 3845.737811] item 3 key (256 DIR_ITEM 2676584006) itemoff 16043 itemsize 34 [ 3845.737813] location key (258 1 0) type 2 [ 3845.737814] transid 9 data_len 0 name_len 4 [ 3845.737815] item 4 key (256 DIR_INDEX 2) itemoff 16009 itemsize 34 [ 3845.737816] location key (257 1 0) type 2 [ 3845.737818] transid 9 data_len 0 name_len 4 [ 3845.737819] item 5 key (256 DIR_INDEX 3) itemoff 15975 itemsize 34 [ 3845.737820] location key (258 1 0) type 2 [ 3845.737821] transid 9 data_len 0 name_len 4 [ 3845.737822] item 6 key (257 INODE_ITEM 0) itemoff 15815 itemsize 160 [ 3845.737824] inode generation 9 transid 10 size 6 nbytes 0 [ 3845.737825] block group 0 mode 40755 links 2 uid 0 gid 0 [ 3845.737826] rdev 0 sequence 1 flags 0x0 [ 3845.737827] atime 1764259517.572889464 [ 3845.737828] ctime 1764259517.572889464 [ 3845.737830] mtime 1764259517.572889464 [ 3845.737831] otime 1764259517.572889464 [ 3845.737832] item 7 key (257 INODE_REF 256) itemoff 15801 itemsize 14 [ 3845.737833] index 2 name_len 4 [ 3845.737834] item 8 key (257 INODE_REF 258) itemoff 15787 itemsize 14 [ 3845.737836] index 2 name_len 4 [ 3845.737837] item 9 key (257 DIR_ITEM 2507850652) itemoff 15754 itemsize 33 [ 3845.737838] location key (259 1 0) type 1 [ 3845.737839] transid 10 data_len 0 name_len 3 [ 3845.737840] item 10 key (257 DIR_INDEX 2) itemoff 15721 itemsize 33 [ 3845.737842] location key (259 1 0) type 1 [ 3845.737843] transid 10 data_len 0 name_len 3 [ 3845.737844] item 11 key (258 INODE_ITEM 0) itemoff 15561 itemsize 160 [ 3845.737846] inode generation 9 transid 10 size 8 nbytes 0 [ 3845.737847] block group 0 mode 40755 links 1 uid 0 gid 0 [ 3845.737848] rdev 0 sequence 1 flags 0x0 [ 3845.737849] atime 1764259517.572889464 [ 3845.737850] ctime 1764259517.572889464 [ 3845.737851] mtime 1764259517.572889464 [ 3845.737852] otime 1764259517.572889464 [ 3845.737853] item 12 key (258 INODE_REF 256) itemoff 15547 itemsize 14 [ 3845.737855] index 3 name_len 4 [ 3845.737856] item 13 key (258 DIR_ITEM 1843588421) itemoff 15513 itemsize 34 [ 3845.737857] location key (257 1 0) type 2 [ 3845.737858] transid 10 data_len 0 name_len 4 [ 3845.737860] item 14 key (258 DIR_INDEX 2) itemoff 15479 itemsize 34 [ 3845.737861] location key (257 1 0) type 2 [ 3845.737862] transid 10 data_len 0 name_len 4 [ 3845.737863] item 15 key (259 INODE_ITEM 0) itemoff 15319 itemsize 160 [ 3845.737865] inode generation 10 transid 10 size 0 nbytes 0 [ 3845.737866] block group 0 mode 100600 links 1 uid 0 gid 0 [ 3845.737867] rdev 0 sequence 2 flags 0x0 [ 3845.737868] atime 1764259517.580874966 [ 3845.737869] ctime 1764259517.586121869 [ 3845.737870] mtime 1764259517.580874966 [ 3845.737872] otime 1764259517.580874966 [ 3845.737873] item 16 key (259 INODE_REF 257) itemoff 15306 itemsize 13 [ 3845.737874] index 2 name_len 3 [ 3845.737875] BTRFS error (device dm-0): block=30408704 write time tree block corruption detected [ 3845.739448] ------------[ cut here ]------------ [ 3845.740092] WARNING: CPU: 5 PID: 30701 at fs/btrfs/disk-io.c:335 btree_csum_one_bio+0x25a/0x270 [btrfs] [ 3845.741439] Modules linked in: btrfs dm_flakey crc32c_cryptoapi (...) [ 3845.750626] CPU: 5 UID: 0 PID: 30701 Comm: mount Tainted: G W 6.18.0-rc6-btrfs-next-218+ #1 PREEMPT(full) [ 3845.752414] Tainted: [W]=WARN [ 3845.752828] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [ 3845.754499] RIP: 0010:btree_csum_one_bio+0x25a/0x270 [btrfs] [ 3845.755460] Code: 31 f6 48 89 (...) [ 3845.758685] RSP: 0018:ffffa8d9c5677678 EFLAGS: 00010246 [ 3845.759450] RAX: 0000000000000000 RBX: ffff92650e6d4738 RCX: 0000000000000000 [ 3845.760309] RDX: 0000000000000000 RSI: ffffffff9aab45b9 RDI: ffff9264c4748000 [ 3845.761239] RBP: ffff9264d4324000 R08: 0000000000000000 R09: ffffa8d9c5677468 [ 3845.762607] R10: ffff926bdc1fffa8 R11: 0000000000000003 R12: ffffa8d9c5677680 [ 3845.764099] R13: 0000000000004000 R14: ffff9264dd624000 R15: ffff9264d978aba8 [ 3845.765094] FS: 00007f751fa5a840(0000) GS:ffff926c42a82000(0000) knlGS:0000000000000000 [ 3845.766226] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3845.766970] CR2: 0000558df1815380 CR3: 000000010ed88003 CR4: 0000000000370ef0 [ 3845.768009] Call Trace: [ 3845.768392] <TASK> [ 3845.768714] btrfs_submit_bbio+0x6ee/0x7f0 [btrfs] [ 3845.769640] ? write_one_eb+0x28e/0x340 [btrfs] [ 3845.770588] btree_write_cache_pages+0x2f0/0x550 [btrfs] [ 3845.771286] ? alloc_extent_state+0x19/0x100 [btrfs] [ 3845.771967] ? merge_next_state+0x1a/0x90 [btrfs] [ 3845.772586] ? set_extent_bit+0x233/0x8b0 [btrfs] [ 3845.773198] ? xas_load+0x9/0xc0 [ 3845.773589] ? xas_find+0x14d/0x1a0 [ 3845.773969] do_writepages+0xc6/0x160 [ 3845.774367] filemap_fdatawrite_wbc+0x48/0x60 [ 3845.775003] __filemap_fdatawrite_range+0x5b/0x80 [ 3845.775902] btrfs_write_marked_extents+0x61/0x170 [btrfs] [ 3845.776707] btrfs_write_and_wait_transaction+0x4e/0xc0 [btrfs] [ 3845.777379] ? _raw_spin_unlock_irqrestore+0x23/0x40 [ 3845.777923] btrfs_commit_transaction+0x5ea/0xd20 [btrfs] [ 3845.778551] ? _raw_spin_unlock+0x15/0x30 [ 3845.778986] ? release_extent_buffer+0x34/0x160 [btrfs] [ 3845.779659] btrfs_recover_log_trees+0x7a3/0x7c0 [btrfs] [ 3845.780416] ? __pfx_replay_one_buffer+0x10/0x10 [btrfs] [ 3845.781499] open_ctree+0x10bb/0x15f0 [btrfs] [ 3845.782194] btrfs_get_tree.cold+0xb/0x16c [btrfs] [ 3845.782764] ? fscontext_read+0x15c/0x180 [ 3845.783202] ? rw_verify_area+0x50/0x180 [ 3845.783667] vfs_get_tree+0x25/0xd0 [ 3845.784047] vfs_cmd_create+0x59/0xe0 [ 3845.784458] __do_sys_fsconfig+0x4f6/0x6b0 [ 3845.784914] do_syscall_64+0x50/0x1220 [ 3845.785340] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 3845.785980] RIP: 0033:0x7f751fc7f4aa [ 3845.786759] Code: 73 01 c3 48 (...) [ 3845.789951] RSP: 002b:00007ffcdba45dc8 EFLAGS: 00000246 ORIG_RAX: 00000000000001af [ 3845.791402] RAX: ffffffffffffffda RBX: 000055ccc8291c20 RCX: 00007f751fc7f4aa [ 3845.792688] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003 [ 3845.794308] RBP: 000055ccc8292120 R08: 0000000000000000 R09: 0000000000000000 [ 3845.795829] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 3845.797183] R13: 00007f751fe11580 R14: 00007f751fe1326c R15: 00007f751fdf8a23 [ 3845.798633] </TASK> [ 3845.799067] ---[ end trace 0000000000000000 ]--- [ 3845.800215] BTRFS: error (device dm-0) in btrfs_commit_transaction:2553: errno=-5 IO failure (Error while writing out transaction) [ 3845.801860] BTRFS warning (device dm-0 state E): Skipping commit of aborted transaction. [ 3845.802815] BTRFS error (device dm-0 state EA): Transaction aborted (error -5) [ 3845.803728] BTRFS: error (device dm-0 state EA) in cleanup_transaction:2036: errno=-5 IO failure [ 3845.805374] BTRFS: error (device dm-0 state EA) in btrfs_replay_log:2083: errno=-5 IO failure (Failed to recover log tree) [ 3845.807919] BTRFS error (device dm-0 state EA): open_ctree failed: -5
Fix this by never logging a conflicting inode that is a directory and was moved in the current transaction (its last_unlink_trans equals the current transaction) and instead fallback to a transaction commit.
A test case for fstests will follow soon.
Reported-by: Vyacheslav Kovalevsky slva.kovalevskiy.2014@gmail.com Link: https://lore.kernel.org/linux-btrfs/7bbc9419-5c56-450a-b5a0-efeae7457113@gma... CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-log.c | 38 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+)
--- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -6050,6 +6050,33 @@ static int conflicting_inode_is_dir(stru return ret; }
+static bool can_log_conflicting_inode(const struct btrfs_trans_handle *trans, + const struct btrfs_inode *inode) +{ + if (!S_ISDIR(inode->vfs_inode.i_mode)) + return true; + + if (inode->last_unlink_trans < trans->transid) + return true; + + /* + * If this is a directory and its unlink_trans is not from a past + * transaction then we must fallback to a transaction commit in order + * to avoid getting a directory with 2 hard links after log replay. + * + * This happens if a directory A is renamed, moved from one parent + * directory to another one, a new file is created in the old parent + * directory with the old name of our directory A, the new file is + * fsynced, then we moved the new file to some other parent directory + * and fsync again the new file. This results in a log tree where we + * logged that directory A existed, with the INODE_REF item for the + * new location but without having logged its old parent inode, so + * that on log replay we add a new link for the new location but the + * old link remains, resulting in a link count of 2. + */ + return false; +} + static int add_conflicting_inode(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_path *path, @@ -6153,6 +6180,11 @@ static int add_conflicting_inode(struct return 0; }
+ if (!can_log_conflicting_inode(trans, inode)) { + btrfs_add_delayed_iput(inode); + return BTRFS_LOG_FORCE_COMMIT; + } + btrfs_add_delayed_iput(inode);
ino_elem = kmalloc(sizeof(*ino_elem), GFP_NOFS); @@ -6217,6 +6249,12 @@ static int log_conflicting_inodes(struct break; }
+ if (!can_log_conflicting_inode(trans, inode)) { + btrfs_add_delayed_iput(inode); + ret = BTRFS_LOG_FORCE_COMMIT; + break; + } + /* * Always log the directory, we cannot make this * conditional on need_log_inode() because the directory
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shakeel Butt shakeel.butt@linux.dev
commit 3309b63a2281efb72df7621d60cc1246b6286ad3 upstream.
On x86-64, this_cpu_cmpxchg() uses CMPXCHG without LOCK prefix which means it is only safe for the local CPU and not for multiple CPUs. Recently the commit 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe") make css_rstat_updated lockless and uses lockless list to allow reentrancy. Since css_rstat_updated can invoked from process context, IRQ and NMI, it uses this_cpu_cmpxchg() to select the winner which will inset the lockless lnode into the global per-cpu lockless list.
However the commit missed one case where lockless node of a cgroup can be accessed and modified by another CPU doing the flushing. Basically llist_del_first_init() in css_process_update_tree().
On a cursory look, it can be questioned how css_process_update_tree() can see a lockless node in global lockless list where the updater is at this_cpu_cmpxchg() and before llist_add() call in css_rstat_updated(). This can indeed happen in the presence of IRQs/NMI.
Consider this scenario: Updater for cgroup stat C on CPU A in process context is after llist_on_list() check and before this_cpu_cmpxchg() in css_rstat_updated() where it get interrupted by IRQ/NMI. In the IRQ/NMI context, a new updater calls css_rstat_updated() for same cgroup C and successfully inserts rstatc_pcpu->lnode.
Now concurrently CPU B is running the flusher and it calls llist_del_first_init() for CPU A and got rstatc_pcpu->lnode of cgroup C which was added by the IRQ/NMI updater.
Now imagine CPU B calling init_llist_node() on cgroup C's rstatc_pcpu->lnode of CPU A and on CPU A, the process context updater calling this_cpu_cmpxchg(rstatc_pcpu->lnode) concurrently.
The CMPXCNG without LOCK on CPU A is not safe and thus we need LOCK prefix.
In Meta's fleet running the kernel with the commit 36df6e3dbd7e, we are observing on some machines the memcg stats are getting skewed by more than the actual memory on the system. On close inspection, we noticed that lockless node for a workload for specific CPU was in the bad state and thus all the updates on that CPU for that cgroup was being lost.
To confirm if this skew was indeed due to this CMPXCHG without LOCK in css_rstat_updated(), we created a repro (using AI) at [1] which shows that CMPXCHG without LOCK creates almost the same lnode corruption as seem in Meta's fleet and with LOCK CMPXCHG the issue does not reproduces.
Link: http://lore.kernel.org/efiagdwmzfwpdzps74fvcwq3n4cs36q33ij7eebcpssactv3zu@se... [1] Signed-off-by: Shakeel Butt shakeel.butt@linux.dev Cc: stable@vger.kernel.org # v6.17+ Fixes: 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe") Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/cgroup/rstat.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/kernel/cgroup/rstat.c +++ b/kernel/cgroup/rstat.c @@ -71,7 +71,6 @@ __bpf_kfunc void css_rstat_updated(struc { struct llist_head *lhead; struct css_rstat_cpu *rstatc; - struct css_rstat_cpu __percpu *rstatc_pcpu; struct llist_node *self;
/* @@ -104,18 +103,22 @@ __bpf_kfunc void css_rstat_updated(struc /* * This function can be renentered by irqs and nmis for the same cgroup * and may try to insert the same per-cpu lnode into the llist. Note - * that llist_add() does not protect against such scenarios. + * that llist_add() does not protect against such scenarios. In addition + * this same per-cpu lnode can be modified through init_llist_node() + * from css_rstat_flush() running on a different CPU. * * To protect against such stacked contexts of irqs/nmis, we use the * fact that lnode points to itself when not on a list and then use - * this_cpu_cmpxchg() to atomically set to NULL to select the winner + * try_cmpxchg() to atomically set to NULL to select the winner * which will call llist_add(). The losers can assume the insertion is * successful and the winner will eventually add the per-cpu lnode to * the llist. + * + * Please note that we can not use this_cpu_cmpxchg() here as on some + * archs it is not safe against modifications from multiple CPUs. */ self = &rstatc->lnode; - rstatc_pcpu = css->rstat_cpu; - if (this_cpu_cmpxchg(rstatc_pcpu->lnode.next, self, NULL) != self) + if (!try_cmpxchg(&rstatc->lnode.next, &self, NULL)) return;
lhead = ss_lhead_cpu(css->ss, cpu);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sven Schnelle svens@linux.ibm.com
commit b1aa01d31249bd116b18c7f512d3e46b4b4ad83b upstream.
With z16 a new flag 'search boot program' was introduced for list-directed IPL (SCSI, NVMe, ECKD DASD). If this flag is set, e.g. via selecting the "Automatic" value for the "Boot program selector" control on an HMC load panel, it is copied to the reipl structure from the initial ipl structure. When a user now sets a boot prog via sysfs, the flag is not cleared and the bootloader will again automatically select the boot program, ignoring user configuration.
To avoid that, clear the SBP flag when a bootprog sysfs file is written.
Cc: stable@vger.kernel.org Reviewed-by: Peter Oberparleiter oberpar@linux.ibm.com Reviewed-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/include/uapi/asm/ipl.h | 1 arch/s390/kernel/ipl.c | 48 +++++++++++++++++++++++++++++---------- 2 files changed, 37 insertions(+), 12 deletions(-)
--- a/arch/s390/include/uapi/asm/ipl.h +++ b/arch/s390/include/uapi/asm/ipl.h @@ -15,6 +15,7 @@ struct ipl_pl_hdr { #define IPL_PL_FLAG_IPLPS 0x80 #define IPL_PL_FLAG_SIPL 0x40 #define IPL_PL_FLAG_IPLSR 0x20 +#define IPL_PL_FLAG_SBP 0x10
/* IPL Parameter Block header */ struct ipl_pb_hdr { --- a/arch/s390/kernel/ipl.c +++ b/arch/s390/kernel/ipl.c @@ -262,6 +262,24 @@ static struct kobj_attribute sys_##_pref sys_##_prefix##_##_name##_show, \ sys_##_prefix##_##_name##_store)
+#define DEFINE_IPL_ATTR_BOOTPROG_RW(_prefix, _name, _fmt_out, _fmt_in, _hdr, _value) \ + IPL_ATTR_SHOW_FN(_prefix, _name, _fmt_out, (unsigned long long) _value) \ +static ssize_t sys_##_prefix##_##_name##_store(struct kobject *kobj, \ + struct kobj_attribute *attr, \ + const char *buf, size_t len) \ +{ \ + unsigned long long value; \ + if (sscanf(buf, _fmt_in, &value) != 1) \ + return -EINVAL; \ + (_value) = value; \ + (_hdr).flags &= ~IPL_PL_FLAG_SBP; \ + return len; \ +} \ +static struct kobj_attribute sys_##_prefix##_##_name##_attr = \ + __ATTR(_name, 0644, \ + sys_##_prefix##_##_name##_show, \ + sys_##_prefix##_##_name##_store) + #define DEFINE_IPL_ATTR_STR_RW(_prefix, _name, _fmt_out, _fmt_in, _value)\ IPL_ATTR_SHOW_FN(_prefix, _name, _fmt_out, _value) \ static ssize_t sys_##_prefix##_##_name##_store(struct kobject *kobj, \ @@ -818,12 +836,13 @@ DEFINE_IPL_ATTR_RW(reipl_fcp, wwpn, "0x% reipl_block_fcp->fcp.wwpn); DEFINE_IPL_ATTR_RW(reipl_fcp, lun, "0x%016llx\n", "%llx\n", reipl_block_fcp->fcp.lun); -DEFINE_IPL_ATTR_RW(reipl_fcp, bootprog, "%lld\n", "%lld\n", - reipl_block_fcp->fcp.bootprog); DEFINE_IPL_ATTR_RW(reipl_fcp, br_lba, "%lld\n", "%lld\n", reipl_block_fcp->fcp.br_lba); DEFINE_IPL_ATTR_RW(reipl_fcp, device, "0.0.%04llx\n", "0.0.%llx\n", reipl_block_fcp->fcp.devno); +DEFINE_IPL_ATTR_BOOTPROG_RW(reipl_fcp, bootprog, "%lld\n", "%lld\n", + reipl_block_fcp->hdr, + reipl_block_fcp->fcp.bootprog);
static void reipl_get_ascii_loadparm(char *loadparm, struct ipl_parameter_block *ibp) @@ -942,10 +961,11 @@ DEFINE_IPL_ATTR_RW(reipl_nvme, fid, "0x% reipl_block_nvme->nvme.fid); DEFINE_IPL_ATTR_RW(reipl_nvme, nsid, "0x%08llx\n", "%llx\n", reipl_block_nvme->nvme.nsid); -DEFINE_IPL_ATTR_RW(reipl_nvme, bootprog, "%lld\n", "%lld\n", - reipl_block_nvme->nvme.bootprog); DEFINE_IPL_ATTR_RW(reipl_nvme, br_lba, "%lld\n", "%lld\n", reipl_block_nvme->nvme.br_lba); +DEFINE_IPL_ATTR_BOOTPROG_RW(reipl_nvme, bootprog, "%lld\n", "%lld\n", + reipl_block_nvme->hdr, + reipl_block_nvme->nvme.bootprog);
static struct attribute *reipl_nvme_attrs[] = { &sys_reipl_nvme_fid_attr.attr, @@ -1038,8 +1058,9 @@ static const struct bin_attribute *const };
DEFINE_IPL_CCW_ATTR_RW(reipl_eckd, device, reipl_block_eckd->eckd); -DEFINE_IPL_ATTR_RW(reipl_eckd, bootprog, "%lld\n", "%lld\n", - reipl_block_eckd->eckd.bootprog); +DEFINE_IPL_ATTR_BOOTPROG_RW(reipl_eckd, bootprog, "%lld\n", "%lld\n", + reipl_block_eckd->hdr, + reipl_block_eckd->eckd.bootprog);
static struct attribute *reipl_eckd_attrs[] = { &sys_reipl_eckd_device_attr.attr, @@ -1567,12 +1588,13 @@ DEFINE_IPL_ATTR_RW(dump_fcp, wwpn, "0x%0 dump_block_fcp->fcp.wwpn); DEFINE_IPL_ATTR_RW(dump_fcp, lun, "0x%016llx\n", "%llx\n", dump_block_fcp->fcp.lun); -DEFINE_IPL_ATTR_RW(dump_fcp, bootprog, "%lld\n", "%lld\n", - dump_block_fcp->fcp.bootprog); DEFINE_IPL_ATTR_RW(dump_fcp, br_lba, "%lld\n", "%lld\n", dump_block_fcp->fcp.br_lba); DEFINE_IPL_ATTR_RW(dump_fcp, device, "0.0.%04llx\n", "0.0.%llx\n", dump_block_fcp->fcp.devno); +DEFINE_IPL_ATTR_BOOTPROG_RW(dump_fcp, bootprog, "%lld\n", "%lld\n", + dump_block_fcp->hdr, + dump_block_fcp->fcp.bootprog);
DEFINE_IPL_ATTR_SCP_DATA_RW(dump_fcp, dump_block_fcp->hdr, dump_block_fcp->fcp, @@ -1604,10 +1626,11 @@ DEFINE_IPL_ATTR_RW(dump_nvme, fid, "0x%0 dump_block_nvme->nvme.fid); DEFINE_IPL_ATTR_RW(dump_nvme, nsid, "0x%08llx\n", "%llx\n", dump_block_nvme->nvme.nsid); -DEFINE_IPL_ATTR_RW(dump_nvme, bootprog, "%lld\n", "%llx\n", - dump_block_nvme->nvme.bootprog); DEFINE_IPL_ATTR_RW(dump_nvme, br_lba, "%lld\n", "%llx\n", dump_block_nvme->nvme.br_lba); +DEFINE_IPL_ATTR_BOOTPROG_RW(dump_nvme, bootprog, "%lld\n", "%llx\n", + dump_block_nvme->hdr, + dump_block_nvme->nvme.bootprog);
DEFINE_IPL_ATTR_SCP_DATA_RW(dump_nvme, dump_block_nvme->hdr, dump_block_nvme->nvme, @@ -1635,8 +1658,9 @@ static const struct attribute_group dump
/* ECKD dump device attributes */ DEFINE_IPL_CCW_ATTR_RW(dump_eckd, device, dump_block_eckd->eckd); -DEFINE_IPL_ATTR_RW(dump_eckd, bootprog, "%lld\n", "%llx\n", - dump_block_eckd->eckd.bootprog); +DEFINE_IPL_ATTR_BOOTPROG_RW(dump_eckd, bootprog, "%lld\n", "%llx\n", + dump_block_eckd->hdr, + dump_block_eckd->eckd.bootprog);
IPL_ATTR_BR_CHR_SHOW_FN(dump, dump_block_eckd->eckd); IPL_ATTR_BR_CHR_STORE_FN(dump, dump_block_eckd->eckd);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wentao Guan guanwentao@uniontech.com
commit 52721cfc78c76b09c66e092b52617006390ae96a upstream.
Call gpiochip_remove() to free the resources allocated by gpiochip_add_data() in error path.
Fixes: 553b75d4bfe9 ("gpio: regmap: Allow to allocate regmap-irq device") Fixes: ae495810cffe ("gpio: regmap: add the .fixed_direction_output configuration parameter") CC: stable@vger.kernel.org Co-developed-by: WangYuli wangyl5933@chinaunicom.cn Signed-off-by: WangYuli wangyl5933@chinaunicom.cn Signed-off-by: Wentao Guan guanwentao@uniontech.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20251204101303.30353-1-guanwentao@uniontech.com [Bartosz: reworked the commit message] Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpio-regmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpio/gpio-regmap.c +++ b/drivers/gpio/gpio-regmap.c @@ -328,7 +328,7 @@ struct gpio_regmap *gpio_regmap_register config->regmap_irq_line, config->regmap_irq_flags, 0, config->regmap_irq_chip, &gpio->irq_chip_data); if (ret) - goto err_free_bitmap; + goto err_remove_gpiochip;
irq_domain = regmap_irq_get_domain(gpio->irq_chip_data); } else
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Askar Safin safinaskar@gmail.com
commit 2d967310c49ed93ac11cef408a55ddf15c3dd52e upstream.
Dell Precision 7780 often wakes up on its own from suspend. Sometimes wake up happens immediately (i. e. within 7 seconds), sometimes it happens after, say, 30 minutes.
Fixes: 1796f808e4bb ("HID: i2c-hid: acpi: Stop setting wakeup_capable") Link: https://lore.kernel.org/linux-i2c/197ae95ffd8.dc819e60457077.769212048860909... Cc: stable@vger.kernel.org Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Askar Safin safinaskar@gmail.com Link: https://lore.kernel.org/r/20251206180414.3183334-2-safinaskar@gmail.com Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpiolib-acpi-quirks.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+)
--- a/drivers/gpio/gpiolib-acpi-quirks.c +++ b/drivers/gpio/gpiolib-acpi-quirks.c @@ -370,6 +370,28 @@ static const struct dmi_system_id gpioli .ignore_wake = "ASCP1A00:00@8", }, }, + { + /* + * Spurious wakeups, likely from touchpad controller + * Dell Precision 7780 + * Found in BIOS 1.24.1 + * + * Found in touchpad firmware, installed by Dell Touchpad Firmware Update Utility version 1160.4196.9, A01 + * ( Dell-Touchpad-Firmware-Update-Utility_VYGNN_WIN64_1160.4196.9_A00.EXE ), + * released on 11 Jul 2024 + * + * https://lore.kernel.org/linux-i2c/197ae95ffd8.dc819e60457077.769212048860909... + */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_FAMILY, "Precision"), + DMI_MATCH(DMI_PRODUCT_NAME, "Precision 7780"), + DMI_MATCH(DMI_BOARD_NAME, "0C6JVW"), + }, + .driver_data = &(struct acpi_gpiolib_dmi_quirk) { + .ignore_wake = "VEN_0488:00@355", + }, + }, {} /* Terminating entry */ };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xi Ruoyao xry111@xry111.site
commit dae9750105cf93ac1e156ef91f4beeb53bd64777 upstream.
The manuals of 2K2000 says both BIT_CTRL_MODE and BYTE_CTRL_MODE are supported but the latter is recommended. Also on 2K3000, per the ACPI DSDT the GPIO controller is compatible with 2K2000, but it fails to operate GPIOs 62 and 63 (and maybe others) using BIT_CTRL_MODE. Using BYTE_CTRL_MODE also makes those 2K3000 GPIOs work.
Fixes: 3feb70a61740 ("gpio: loongson: add more gpio chip support") Cc: stable@vger.kernel.org Signed-off-by: Xi Ruoyao xry111@xry111.site Reviewed-by: Huacai Chen chenhuacai@loongson.cn Link: https://lore.kernel.org/r/20251128075033.255821-1-xry111@xry111.site Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpio-loongson-64bit.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/drivers/gpio/gpio-loongson-64bit.c +++ b/drivers/gpio/gpio-loongson-64bit.c @@ -407,11 +407,11 @@ static const struct loongson_gpio_chip_d
static const struct loongson_gpio_chip_data loongson_gpio_ls2k2000_data1 = { .label = "ls2k2000_gpio", - .mode = BIT_CTRL_MODE, - .conf_offset = 0x0, - .in_offset = 0x20, - .out_offset = 0x10, - .inten_offset = 0x30, + .mode = BYTE_CTRL_MODE, + .conf_offset = 0x800, + .in_offset = 0xa00, + .out_offset = 0x900, + .inten_offset = 0xb00, };
static const struct loongson_gpio_chip_data loongson_gpio_ls2k2000_data2 = {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
commit 72e24456a54fe04710d89626cc5a88703e2f6202 upstream.
Deeply daisy chained DP/MST displays are no longer able to light up. This reverts commit e0dec00f3d05 ("drm/amd/display: Fix pbn to kbps Conversion")
Cc: Jerry Zuo jerry.zuo@amd.com Reported-by: nat@nullable.se Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4756 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit e1c94109c76e8a77a21531bd53f6c63356c81158) Cc: stable@vger.kernel.org # 6.17+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 59 +++++++----- 1 file changed, 36 insertions(+), 23 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -884,28 +884,26 @@ struct dsc_mst_fairness_params { };
#if defined(CONFIG_DRM_AMD_DC_FP) -static uint64_t kbps_to_pbn(int kbps, bool is_peak_pbn) +static uint16_t get_fec_overhead_multiplier(struct dc_link *dc_link) { - uint64_t effective_kbps = (uint64_t)kbps; + u8 link_coding_cap; + uint16_t fec_overhead_multiplier_x1000 = PBN_FEC_OVERHEAD_MULTIPLIER_8B_10B;
- if (is_peak_pbn) { // add 0.6% (1006/1000) overhead into effective kbps - effective_kbps *= 1006; - effective_kbps = div_u64(effective_kbps, 1000); - } + link_coding_cap = dc_link_dp_mst_decide_link_encoding_format(dc_link); + if (link_coding_cap == DP_128b_132b_ENCODING) + fec_overhead_multiplier_x1000 = PBN_FEC_OVERHEAD_MULTIPLIER_128B_132B;
- return (uint64_t) DIV64_U64_ROUND_UP(effective_kbps * 64, (54 * 8 * 1000)); + return fec_overhead_multiplier_x1000; }
-static uint32_t pbn_to_kbps(unsigned int pbn, bool with_margin) +static int kbps_to_peak_pbn(int kbps, uint16_t fec_overhead_multiplier_x1000) { - uint64_t pbn_effective = (uint64_t)pbn; - - if (with_margin) // deduct 0.6% (994/1000) overhead from effective pbn - pbn_effective *= (1000000 / PEAK_FACTOR_X1000); - else - pbn_effective *= 1000; + u64 peak_kbps = kbps;
- return DIV_U64_ROUND_UP(pbn_effective * 8 * 54, 64); + peak_kbps *= 1006; + peak_kbps *= fec_overhead_multiplier_x1000; + peak_kbps = div_u64(peak_kbps, 1000 * 1000); + return (int) DIV64_U64_ROUND_UP(peak_kbps * 64, (54 * 8 * 1000)); }
static void set_dsc_configs_from_fairness_vars(struct dsc_mst_fairness_params *params, @@ -976,7 +974,7 @@ static int bpp_x16_from_pbn(struct dsc_m dc_dsc_get_default_config_option(param.sink->ctx->dc, &dsc_options); dsc_options.max_target_bpp_limit_override_x16 = drm_connector->display_info.max_dsc_bpp * 16;
- kbps = pbn_to_kbps(pbn, false); + kbps = div_u64((u64)pbn * 994 * 8 * 54, 64); dc_dsc_compute_config( param.sink->ctx->dc->res_pool->dscs[0], ¶m.sink->dsc_caps.dsc_dec_caps, @@ -1005,11 +1003,12 @@ static int increase_dsc_bpp(struct drm_a int link_timeslots_used; int fair_pbn_alloc; int ret = 0; + uint16_t fec_overhead_multiplier_x1000 = get_fec_overhead_multiplier(dc_link);
for (i = 0; i < count; i++) { if (vars[i + k].dsc_enabled) { initial_slack[i] = - kbps_to_pbn(params[i].bw_range.max_kbps, false) - vars[i + k].pbn; + kbps_to_peak_pbn(params[i].bw_range.max_kbps, fec_overhead_multiplier_x1000) - vars[i + k].pbn; bpp_increased[i] = false; remaining_to_increase += 1; } else { @@ -1105,6 +1104,7 @@ static int try_disable_dsc(struct drm_at int next_index; int remaining_to_try = 0; int ret; + uint16_t fec_overhead_multiplier_x1000 = get_fec_overhead_multiplier(dc_link); int var_pbn;
for (i = 0; i < count; i++) { @@ -1137,7 +1137,7 @@ static int try_disable_dsc(struct drm_at
DRM_DEBUG_DRIVER("MST_DSC index #%d, try no compression\n", next_index); var_pbn = vars[next_index].pbn; - vars[next_index].pbn = kbps_to_pbn(params[next_index].bw_range.stream_kbps, true); + vars[next_index].pbn = kbps_to_peak_pbn(params[next_index].bw_range.stream_kbps, fec_overhead_multiplier_x1000); ret = drm_dp_atomic_find_time_slots(state, params[next_index].port->mgr, params[next_index].port, @@ -1197,6 +1197,7 @@ static int compute_mst_dsc_configs_for_l int count = 0; int i, k, ret; bool debugfs_overwrite = false; + uint16_t fec_overhead_multiplier_x1000 = get_fec_overhead_multiplier(dc_link); struct drm_connector_state *new_conn_state;
memset(params, 0, sizeof(params)); @@ -1277,7 +1278,7 @@ static int compute_mst_dsc_configs_for_l DRM_DEBUG_DRIVER("MST_DSC Try no compression\n"); for (i = 0; i < count; i++) { vars[i + k].aconnector = params[i].aconnector; - vars[i + k].pbn = kbps_to_pbn(params[i].bw_range.stream_kbps, false); + vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.stream_kbps, fec_overhead_multiplier_x1000); vars[i + k].dsc_enabled = false; vars[i + k].bpp_x16 = 0; ret = drm_dp_atomic_find_time_slots(state, params[i].port->mgr, params[i].port, @@ -1299,7 +1300,7 @@ static int compute_mst_dsc_configs_for_l DRM_DEBUG_DRIVER("MST_DSC Try max compression\n"); for (i = 0; i < count; i++) { if (params[i].compression_possible && params[i].clock_force_enable != DSC_CLK_FORCE_DISABLE) { - vars[i + k].pbn = kbps_to_pbn(params[i].bw_range.min_kbps, false); + vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.min_kbps, fec_overhead_multiplier_x1000); vars[i + k].dsc_enabled = true; vars[i + k].bpp_x16 = params[i].bw_range.min_target_bpp_x16; ret = drm_dp_atomic_find_time_slots(state, params[i].port->mgr, @@ -1307,7 +1308,7 @@ static int compute_mst_dsc_configs_for_l if (ret < 0) return ret; } else { - vars[i + k].pbn = kbps_to_pbn(params[i].bw_range.stream_kbps, false); + vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.stream_kbps, fec_overhead_multiplier_x1000); vars[i + k].dsc_enabled = false; vars[i + k].bpp_x16 = 0; ret = drm_dp_atomic_find_time_slots(state, params[i].port->mgr, @@ -1762,6 +1763,18 @@ clean_exit: return ret; }
+static uint32_t kbps_from_pbn(unsigned int pbn) +{ + uint64_t kbps = (uint64_t)pbn; + + kbps *= (1000000 / PEAK_FACTOR_X1000); + kbps *= 8; + kbps *= 54; + kbps /= 64; + + return (uint32_t)kbps; +} + static bool is_dsc_common_config_possible(struct dc_stream_state *stream, struct dc_dsc_bw_range *bw_range) { @@ -1860,7 +1873,7 @@ enum dc_status dm_dp_mst_is_port_support dc_link_get_highest_encoding_format(stream->link)); cur_link_settings = stream->link->verified_link_cap; root_link_bw_in_kbps = dc_link_bandwidth_kbps(aconnector->dc_link, &cur_link_settings); - virtual_channel_bw_in_kbps = pbn_to_kbps(aconnector->mst_output_port->full_pbn, true); + virtual_channel_bw_in_kbps = kbps_from_pbn(aconnector->mst_output_port->full_pbn);
/* pick the end to end bw bottleneck */ end_to_end_bw_in_kbps = min(root_link_bw_in_kbps, virtual_channel_bw_in_kbps); @@ -1913,7 +1926,7 @@ enum dc_status dm_dp_mst_is_port_support immediate_upstream_port = aconnector->mst_output_port->parent->port_parent;
if (immediate_upstream_port) { - virtual_channel_bw_in_kbps = pbn_to_kbps(immediate_upstream_port->full_pbn, true); + virtual_channel_bw_in_kbps = kbps_from_pbn(immediate_upstream_port->full_pbn); virtual_channel_bw_in_kbps = min(root_link_bw_in_kbps, virtual_channel_bw_in_kbps); } else { /* For topology LCT 1 case - only one mstb*/
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 3c41114dcdabb7b25f5bc33273c6db9c7af7f4a7 upstream.
This can get called from an atomic context.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4470 Reviewed-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 8acdad9344cc7b4e7bc01f0dfea80093eb3768db) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/core/dc_surface.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/dc/core/dc_surface.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_surface.c @@ -86,7 +86,7 @@ uint8_t dc_plane_get_pipe_mask(struct d struct dc_plane_state *dc_create_plane_state(const struct dc *dc) { struct dc_plane_state *plane_state = kvzalloc(sizeof(*plane_state), - GFP_KERNEL); + GFP_ATOMIC);
if (NULL == plane_state) return NULL;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ray Wu ray.wu@amd.com
commit 69741d9ccc7222e6b6f138db67b012ecc0d72542 upstream.
[Why] Different platforms use differnet NBIO header files, causing display code to use differnt offset and read wrong accelerated status.
[How] - Unified NBIO offset header file across platform. - Correct scratch registers offsets to proper locations.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667 Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Reviewed-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Ray Wu ray.wu@amd.com Signed-off-by: Chenyu Chen chen-yu.chen@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 49a63bc8eda0304ba307f5ba68305f936174f72d) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c @@ -203,12 +203,12 @@ enum dcn35_clk_src_array_id { NBIO_BASE_INNER(seg)
#define NBIO_SR(reg_name)\ - REG_STRUCT.reg_name = NBIO_BASE(regBIF_BX2_ ## reg_name ## _BASE_IDX) + \ - regBIF_BX2_ ## reg_name + REG_STRUCT.reg_name = NBIO_BASE(regBIF_BX1_ ## reg_name ## _BASE_IDX) + \ + regBIF_BX1_ ## reg_name
#define NBIO_SR_ARR(reg_name, id)\ - REG_STRUCT[id].reg_name = NBIO_BASE(regBIF_BX2_ ## reg_name ## _BASE_IDX) + \ - regBIF_BX2_ ## reg_name + REG_STRUCT[id].reg_name = NBIO_BASE(regBIF_BX1_ ## reg_name ## _BASE_IDX) + \ + regBIF_BX1_ ## reg_name
#define bios_regs_init() \ ( \
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ray Wu ray.wu@amd.com
commit fd62aa13d3ee0f21c756a40a7c2f900f98992d6a upstream.
[Why] Different platforms use different NBIO header files, causing display code to use differnt offset and read wrong accelerated status.
[How] - Unified NBIO offset header file across platform. - Correct scratch registers offsets to proper locations.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667 Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Reviewed-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Ray Wu ray.wu@amd.com Signed-off-by: Chenyu Chen chen-yu.chen@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 576e032e909c8a6bb3d907b4ef5f6abe0f644199) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c @@ -183,12 +183,12 @@ enum dcn351_clk_src_array_id { NBIO_BASE_INNER(seg)
#define NBIO_SR(reg_name)\ - REG_STRUCT.reg_name = NBIO_BASE(regBIF_BX2_ ## reg_name ## _BASE_IDX) + \ - regBIF_BX2_ ## reg_name + REG_STRUCT.reg_name = NBIO_BASE(regBIF_BX1_ ## reg_name ## _BASE_IDX) + \ + regBIF_BX1_ ## reg_name
#define NBIO_SR_ARR(reg_name, id)\ - REG_STRUCT[id].reg_name = NBIO_BASE(regBIF_BX2_ ## reg_name ## _BASE_IDX) + \ - regBIF_BX2_ ## reg_name + REG_STRUCT[id].reg_name = NBIO_BASE(regBIF_BX1_ ## reg_name ## _BASE_IDX) + \ + regBIF_BX1_ ## reg_name
#define bios_regs_init() \ ( \
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
commit 520f37c30992fd0c212a34fbe99c062b7a3dc52e upstream.
It's more convenient to pass iter than a handful of its members to drm_find_displayid_extension(), especially as we're about to add another member.
Rename the function find_next_displayid_extension() while at it, to be more descriptive.
Cc: Tiago Martins Araújo tiago.martins.araujo@gmail.com Acked-by: Alex Deucher alexander.deucher@amd.com Tested-by: Tiago Martins Araújo tiago.martins.araujo@gmail.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/3837ae7f095e77a082ac2422ce2fac96c4f9373d.1761681968... Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_displayid.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-)
--- a/drivers/gpu/drm/drm_displayid.c +++ b/drivers/gpu/drm/drm_displayid.c @@ -48,26 +48,24 @@ validate_displayid(const u8 *displayid, return base; }
-static const u8 *drm_find_displayid_extension(const struct drm_edid *drm_edid, - int *length, int *idx, - int *ext_index) +static const u8 *find_next_displayid_extension(struct displayid_iter *iter) { const struct displayid_header *base; const u8 *displayid;
- displayid = drm_edid_find_extension(drm_edid, DISPLAYID_EXT, ext_index); + displayid = drm_edid_find_extension(iter->drm_edid, DISPLAYID_EXT, &iter->ext_index); if (!displayid) return NULL;
/* EDID extensions block checksum isn't for us */ - *length = EDID_LENGTH - 1; - *idx = 1; + iter->length = EDID_LENGTH - 1; + iter->idx = 1;
- base = validate_displayid(displayid, *length, *idx); + base = validate_displayid(displayid, iter->length, iter->idx); if (IS_ERR(base)) return NULL;
- *length = *idx + sizeof(*base) + base->bytes; + iter->length = iter->idx + sizeof(*base) + base->bytes;
return displayid; } @@ -126,10 +124,7 @@ __displayid_iter_next(struct displayid_i /* The first section we encounter is the base section */ bool base_section = !iter->section;
- iter->section = drm_find_displayid_extension(iter->drm_edid, - &iter->length, - &iter->idx, - &iter->ext_index); + iter->section = find_next_displayid_extension(iter); if (!iter->section) { iter->drm_edid = NULL; return NULL;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit ef99c2efeacac7758cc8c2d00e3200100a4da16c upstream.
Commit 756485bfbb85 ("dt-bindings: PCI: qcom,pcie-sc7280: Move SC7280 to dedicated schema") move the device schema to separate file, but it missed a "if:not:...then:" clause in the original binding which was requiring power-domains and resets for this particular chip.
Fixes: 756485bfbb85 ("dt-bindings: PCI: qcom,pcie-sc7280: Move SC7280 to dedicated schema") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251030-dt-bindings-pci-qcom-fixes-power-domains-v... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/pci/qcom,pcie-sc7280.yaml | 5 +++++ 1 file changed, 5 insertions(+)
--- a/Documentation/devicetree/bindings/pci/qcom,pcie-sc7280.yaml +++ b/Documentation/devicetree/bindings/pci/qcom,pcie-sc7280.yaml @@ -76,6 +76,11 @@ properties: items: - const: pci
+required: + - power-domains + - resets + - reset-names + allOf: - $ref: qcom,pcie-common.yaml#
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit ea551601404d286813aef6819ddf0bf1d7d69a24 upstream.
Commit c007a5505504 ("dt-bindings: PCI: qcom,pcie-sc8280xp: Move SC8280XP to dedicated schema") move the device schema to separate file, but it missed a "if:not:...then:" clause in the original binding which was requiring power-domains and resets for this particular chip.
Fixes: c007a5505504 ("dt-bindings: PCI: qcom,pcie-sc8280xp: Move SC8280XP to dedicated schema") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251030-dt-bindings-pci-qcom-fixes-power-domains-v... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/pci/qcom,pcie-sc8280xp.yaml | 3 +++ 1 file changed, 3 insertions(+)
--- a/Documentation/devicetree/bindings/pci/qcom,pcie-sc8280xp.yaml +++ b/Documentation/devicetree/bindings/pci/qcom,pcie-sc8280xp.yaml @@ -61,6 +61,9 @@ properties: required: - interconnects - interconnect-names + - power-domains + - resets + - reset-names
allOf: - $ref: qcom,pcie-common.yaml#
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit 31cb432b62fb796e0c1084542ba39311d2f716d5 upstream.
Commit 51bc04d5b49d ("dt-bindings: PCI: qcom,pcie-sm8150: Move SM8150 to dedicated schema") move the device schema to separate file, but it missed a "if:not:...then:" clause in the original binding which was requiring power-domains and resets for this particular chip.
Fixes: 51bc04d5b49d ("dt-bindings: PCI: qcom,pcie-sm8150: Move SM8150 to dedicated schema") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251030-dt-bindings-pci-qcom-fixes-power-domains-v... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/pci/qcom,pcie-sm8150.yaml | 5 +++++ 1 file changed, 5 insertions(+)
--- a/Documentation/devicetree/bindings/pci/qcom,pcie-sm8150.yaml +++ b/Documentation/devicetree/bindings/pci/qcom,pcie-sm8150.yaml @@ -74,6 +74,11 @@ properties: items: - const: pci
+required: + - power-domains + - resets + - reset-names + allOf: - $ref: qcom,pcie-common.yaml#
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit 2620c6bcd8c141b79ff2afe95dc814dfab644f63 upstream.
Commit 4891b66185c1 ("dt-bindings: PCI: qcom,pcie-sm8250: Move SM8250 to dedicated schema") move the device schema to separate file, but it missed a "if:not:...then:" clause in the original binding which was requiring power-domains and resets for this particular chip.
Fixes: 4891b66185c1 ("dt-bindings: PCI: qcom,pcie-sm8250: Move SM8250 to dedicated schema") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251030-dt-bindings-pci-qcom-fixes-power-domains-v... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/pci/qcom,pcie-sm8250.yaml | 5 +++++ 1 file changed, 5 insertions(+)
--- a/Documentation/devicetree/bindings/pci/qcom,pcie-sm8250.yaml +++ b/Documentation/devicetree/bindings/pci/qcom,pcie-sm8250.yaml @@ -83,6 +83,11 @@ properties: items: - const: pci
+required: + - power-domains + - resets + - reset-names + allOf: - $ref: qcom,pcie-common.yaml#
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit 012ba0d5f02e1f192eda263b5f9f826e47d607bb upstream.
Commit 2278b8b54773 ("dt-bindings: PCI: qcom,pcie-sm8350: Move SM8350 to dedicated schema") move the device schema to separate file, but it missed a "if:not:...then:" clause in the original binding which was requiring power-domains and resets for this particular chip.
Fixes: 2278b8b54773 ("dt-bindings: PCI: qcom,pcie-sm8350: Move SM8350 to dedicated schema") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251030-dt-bindings-pci-qcom-fixes-power-domains-v... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/pci/qcom,pcie-sm8350.yaml | 5 +++++ 1 file changed, 5 insertions(+)
--- a/Documentation/devicetree/bindings/pci/qcom,pcie-sm8350.yaml +++ b/Documentation/devicetree/bindings/pci/qcom,pcie-sm8350.yaml @@ -73,6 +73,11 @@ properties: items: - const: pci
+required: + - power-domains + - resets + - reset-names + allOf: - $ref: qcom,pcie-common.yaml#
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit 667facc4000c49a7c280097ef6638f133bcb1e59 upstream.
Commit 88c9b3af4e31 ("dt-bindings: PCI: qcom,pcie-sm8450: Move SM8450 to dedicated schema") move the device schema to separate file, but it missed a "if:not:...then:" clause in the original binding which was requiring power-domains and resets for this particular chip.
Fixes: 88c9b3af4e31 ("dt-bindings: PCI: qcom,pcie-sm8450: Move SM8450 to dedicated schema") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251030-dt-bindings-pci-qcom-fixes-power-domains-v... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/pci/qcom,pcie-sm8450.yaml | 5 +++++ 1 file changed, 5 insertions(+)
--- a/Documentation/devicetree/bindings/pci/qcom,pcie-sm8450.yaml +++ b/Documentation/devicetree/bindings/pci/qcom,pcie-sm8450.yaml @@ -77,6 +77,11 @@ properties: items: - const: pci
+required: + - power-domains + - resets + - reset-names + allOf: - $ref: qcom,pcie-common.yaml#
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit e60c6f34b9f3a83f96006243c0ef96c134520257 upstream.
Commit b8d3404058a6 ("dt-bindings: PCI: qcom,pcie-sm8550: Move SM8550 to dedicated schema") move the device schema to separate file, but it missed a "if:not:...then:" clause in the original binding which was requiring power-domains and resets for this particular chip.
Fixes: b8d3404058a6 ("dt-bindings: PCI: qcom,pcie-sm8550: Move SM8550 to dedicated schema") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251030-dt-bindings-pci-qcom-fixes-power-domains-v... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/pci/qcom,pcie-sm8550.yaml | 5 +++++ 1 file changed, 5 insertions(+)
--- a/Documentation/devicetree/bindings/pci/qcom,pcie-sm8550.yaml +++ b/Documentation/devicetree/bindings/pci/qcom,pcie-sm8550.yaml @@ -83,6 +83,11 @@ properties: - const: pci # PCIe core reset - const: link_down # PCIe link down reset
+required: + - power-domains + - resets + - reset-names + allOf: - $ref: qcom,pcie-common.yaml#
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shivani Agarwal shivani.agarwal@broadcom.com
commit 6f6e309328d53a10c0fe1f77dec2db73373179b6 upstream.
Several crypto user API contexts and requests allocated with sock_kmalloc() were left uninitialized, relying on callers to set fields explicitly. This resulted in the use of uninitialized data in certain error paths or when new fields are added in the future.
The ACVP patches also contain two user-space interface files: algif_kpp.c and algif_akcipher.c. These too rely on proper initialization of their context structures.
A particular issue has been observed with the newly added 'inflight' variable introduced in af_alg_ctx by commit:
67b164a871af ("crypto: af_alg - Disallow multiple in-flight AIO requests")
Because the context is not memset to zero after allocation, the inflight variable has contained garbage values. As a result, af_alg_alloc_areq() has incorrectly returned -EBUSY randomly when the garbage value was interpreted as true:
https://github.com/gregkh/linux/blame/master/crypto/af_alg.c#L1209
The check directly tests ctx->inflight without explicitly comparing against true/false. Since inflight is only ever set to true or false later, an uninitialized value has triggered -EBUSY failures. Zero-initializing memory allocated with sock_kmalloc() ensures inflight and other fields start in a known state, removing random issues caused by uninitialized data.
Fixes: fe869cdb89c9 ("crypto: algif_hash - User-space interface for hash operations") Fixes: 5afdfd22e6ba ("crypto: algif_rng - add random number generator support") Fixes: 2d97591ef43d ("crypto: af_alg - consolidation of duplicate code") Fixes: 67b164a871af ("crypto: af_alg - Disallow multiple in-flight AIO requests") Cc: stable@vger.kernel.org Signed-off-by: Shivani Agarwal shivani.agarwal@broadcom.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- crypto/af_alg.c | 5 ++--- crypto/algif_hash.c | 3 +-- crypto/algif_rng.c | 3 +-- 3 files changed, 4 insertions(+), 7 deletions(-)
--- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -1212,15 +1212,14 @@ struct af_alg_async_req *af_alg_alloc_ar if (unlikely(!areq)) return ERR_PTR(-ENOMEM);
+ memset(areq, 0, areqlen); + ctx->inflight = true;
areq->areqlen = areqlen; areq->sk = sk; areq->first_rsgl.sgl.sgt.sgl = areq->first_rsgl.sgl.sgl; - areq->last_rsgl = NULL; INIT_LIST_HEAD(&areq->rsgl_list); - areq->tsgl = NULL; - areq->tsgl_entries = 0;
return areq; } --- a/crypto/algif_hash.c +++ b/crypto/algif_hash.c @@ -416,9 +416,8 @@ static int hash_accept_parent_nokey(void if (!ctx) return -ENOMEM;
- ctx->result = NULL; + memset(ctx, 0, len); ctx->len = len; - ctx->more = false; crypto_init_wait(&ctx->wait);
ask->private = ctx; --- a/crypto/algif_rng.c +++ b/crypto/algif_rng.c @@ -248,9 +248,8 @@ static int rng_accept_parent(void *priva if (!ctx) return -ENOMEM;
+ memset(ctx, 0, len); ctx->len = len; - ctx->addtl = NULL; - ctx->addtl_len = 0;
/* * No seeding done at that point -- if multiple accepts are
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangshuo Li lgs201920130244@gmail.com
commit 7cf6e0b69b0d90ab042163e5bbddda0dfcf8b6a7 upstream.
As kcalloc() may fail, check its return value to avoid a NULL pointer dereference when passing the buffer to rng->read(). On allocation failure, log the error and return since test_len() returns void.
Fixes: 2be0d806e25e ("crypto: caam - add a test for the RNG") Cc: stable@vger.kernel.org Signed-off-by: Guangshuo Li lgs201920130244@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/crypto/caam/caamrng.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c index b3d14a7f4dd1..0eb43c862516 100644 --- a/drivers/crypto/caam/caamrng.c +++ b/drivers/crypto/caam/caamrng.c @@ -181,7 +181,9 @@ static inline void test_len(struct hwrng *rng, size_t len, bool wait) struct device *dev = ctx->ctrldev;
buf = kcalloc(CAAM_RNG_MAX_FIFO_STORE_SIZE, sizeof(u8), GFP_KERNEL); - + if (!buf) { + return; + } while (len > 0) { read_len = rng->read(rng, buf, len, wait);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@kernel.org
commit f6a458746f905adb7d70e50e8b9383dc9e3fd75f upstream.
Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") made ghash_finup() pass the wrong buffer to ghash_do_simd_update(). As a result, ghash-neon now produces incorrect outputs when the message length isn't divisible by 16 bytes. Fix this.
(I didn't notice this earlier because this code is reached only on CPUs that support NEON but not PMULL. I haven't yet found a way to get qemu-system-aarch64 to emulate that configuration.)
Fixes: 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") Cc: stable@vger.kernel.org Reported-by: Diederik de Haas diederik@cknow-tech.com Closes: https://lore.kernel.org/linux-crypto/DETXT7QI62KE.F3CGH2VWX1SC@cknow-tech.co... Tested-by: Diederik de Haas diederik@cknow-tech.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Link: https://lore.kernel.org/r/20251209223417.112294-1-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/crypto/ghash-ce-glue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/crypto/ghash-ce-glue.c +++ b/arch/arm64/crypto/ghash-ce-glue.c @@ -133,7 +133,7 @@ static int ghash_finup(struct shash_desc u8 buf[GHASH_BLOCK_SIZE] = {};
memcpy(buf, src, len); - ghash_do_simd_update(1, ctx->digest, src, key, NULL, + ghash_do_simd_update(1, ctx->digest, buf, key, NULL, pmull_ghash_update_p8); memzero_explicit(buf, sizeof(buf)); }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 500e1368e46928f4b2259612dcabb6999afae2a6 upstream.
Make sure to drop the reference taken to the AHB platform device when looking up its driver data while enabling the SMMU.
Note that holding a reference to a device does not prevent its driver data from going away.
Fixes: 89c788bab1f0 ("ARM: tegra: Add SMMU enabler in AHB") Cc: stable@vger.kernel.org # 3.5 Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Thierry Reding treding@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/amba/tegra-ahb.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/amba/tegra-ahb.c +++ b/drivers/amba/tegra-ahb.c @@ -144,6 +144,7 @@ int tegra_ahb_enable_smmu(struct device_ if (!dev) return -EPROBE_DEFER; ahb = dev_get_drvdata(dev); + put_device(dev); val = gizmo_readl(ahb, AHB_ARBITRATION_XBAR_CTRL); val |= AHB_ARBITRATION_XBAR_CTRL_SMMU_INIT_DONE; gizmo_writel(ahb, val, AHB_ARBITRATION_XBAR_CTRL);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal dlemoal@kernel.org
commit cf28f6f923cb1dd2765b5c3d7697bb4dcf2096a0 upstream.
zloop_rw() will fail any regular write operation that targets a full sequential zone. The check for this is indirect and achieved by checking the write pointer alignment of the write operation. But this check is ineffective for zone append operations since these are alwasy automatically directed at a zone write pointer.
Prevent zone append operations from being executed in a full zone with an explicit check of the zone condition.
Fixes: eb0570c7df23 ("block: new zoned loop block device driver") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/zloop.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/block/zloop.c +++ b/drivers/block/zloop.c @@ -407,6 +407,10 @@ static void zloop_rw(struct zloop_cmd *c mutex_lock(&zone->lock);
if (is_append) { + if (zone->cond == BLK_ZONE_COND_FULL) { + ret = -EIO; + goto unlock; + } sector = zone->wp; cmd->sector = sector; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal dlemoal@kernel.org
commit 866d65745b635927c3d1343ab67e6fd4a99d116d upstream.
The write pointer of zones that are in the full condition is always invalid. Reflect that fact by setting the write pointer of full zones to ULLONG_MAX.
Fixes: eb0570c7df23 ("block: new zoned loop block device driver") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/zloop.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/block/zloop.c +++ b/drivers/block/zloop.c @@ -177,7 +177,7 @@ static int zloop_update_seq_zone(struct zone->wp = zone->start; } else if (file_sectors == zlo->zone_capacity) { zone->cond = BLK_ZONE_COND_FULL; - zone->wp = zone->start + zlo->zone_size; + zone->wp = ULLONG_MAX; } else { zone->cond = BLK_ZONE_COND_CLOSED; zone->wp = zone->start + file_sectors; @@ -326,7 +326,7 @@ static int zloop_finish_zone(struct zloo }
zone->cond = BLK_ZONE_COND_FULL; - zone->wp = zone->start + zlo->zone_size; + zone->wp = ULLONG_MAX; clear_bit(ZLOOP_ZONE_SEQ_ERROR, &zone->flags);
unlock: @@ -437,8 +437,10 @@ static void zloop_rw(struct zloop_cmd *c * copmpletes. */ zone->wp += nr_sectors; - if (zone->wp == zone_end) + if (zone->wp == zone_end) { zone->cond = BLK_ZONE_COND_FULL; + zone->wp = ULLONG_MAX; + } }
rq_for_each_bvec(tmp, rq, rq_iter)
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaoqian Lin linmq006@gmail.com
commit b41ca62c0019de1321d75f2b2f274a28784a41ed upstream.
pci_get_device() will increase the reference count for the returned pci_dev, and also decrease the reference count for the input parameter from if it is not NULL.
If we break the loop in with 'vf_pdev' not NULL. We need to call pci_dev_put() to decrease the reference count.
Found via static anlaysis and this is similar to commit c508eb042d97 ("perf/x86/intel/uncore: Fix reference count leak in sad_cfg_iio_topology()")
Fixes: 8b6c724cdab8 ("virtio: vdpa: vDPA driver for Marvell OCTEON DPU devices") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin linmq006@gmail.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Message-Id: 20251027060737.33815-1-linmq006@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vdpa/octeon_ep/octep_vdpa_main.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/vdpa/octeon_ep/octep_vdpa_main.c +++ b/drivers/vdpa/octeon_ep/octep_vdpa_main.c @@ -736,6 +736,7 @@ static int octep_sriov_enable(struct pci octep_vdpa_assign_barspace(vf_pdev, pdev, index); if (++index == num_vfs) { done = true; + pci_dev_put(vf_pdev); break; } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Raghavendra Rao Ananta rananta@google.com
commit 2f03f21fe7516902283b135de272d3c7b10672de upstream.
For the cases where user includes a non-zero value in 'token_uuid_ptr' field of 'struct vfio_device_bind_iommufd', the copy_struct_from_user() in vfio_df_ioctl_bind_iommufd() fails with -E2BIG. For the 'minsz' passed, copy_struct_from_user() expects the newly introduced field to be zero-ed, which would be incorrect in this case.
Fix this by passing the actual size of the kernel struct. If working with a newer userspace, copy_struct_from_user() would copy the 'token_uuid_ptr' field, and if working with an old userspace, it would zero out this field, thus still retaining backward compatibility.
Fixes: 86624ba3b522 ("vfio/pci: Do vf_token checks for VFIO_DEVICE_BIND_IOMMUFD") Cc: stable@vger.kernel.org Signed-off-by: Raghavendra Rao Ananta rananta@google.com Reviewed-by: Jason Gunthorpe jgg@nvidia.com Link: https://lore.kernel.org/r/20251031170603.2260022-2-rananta@google.com Signed-off-by: Alex Williamson alex@shazbot.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vfio/device_cdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/vfio/device_cdev.c +++ b/drivers/vfio/device_cdev.c @@ -99,7 +99,7 @@ long vfio_df_ioctl_bind_iommufd(struct v return ret; if (user_size < minsz) return -EINVAL; - ret = copy_struct_from_user(&bind, minsz, arg, user_size); + ret = copy_struct_from_user(&bind, sizeof(bind), arg, user_size); if (ret) return ret;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit 47ef834209e5981f443240d8a8b45bf680df22aa upstream.
The commit 4d38328eb442d ("tracing: Fix synth event printk format for str fields") replaced "%.*s" with "%s" but missed removing the number size of the dynamic and static strings. The commit e1a453a57bc7 ("tracing: Do not add length to print format in synthetic events") fixed the dynamic part but did not fix the static part. That is, with the commands:
# echo 's:wake_lat char[] wakee; u64 delta;' >> /sys/kernel/tracing/dynamic_events # echo 'hist:keys=pid:ts=common_timestamp.usecs if !(common_flags & 0x18)' > /sys/kernel/tracing/events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:delta=common_timestamp.usecs-$ts:onmatch(sched.sched_waking).trace(wake_lat,next_comm,$delta)' > /sys/kernel/tracing/events/sched/sched_switch/trigger
That caused the output of:
<idle>-0 [001] d..5. 193.428167: wake_lat: wakee=(efault)sshd-sessiondelta=155 sshd-session-879 [001] d..5. 193.811080: wake_lat: wakee=(efault)kworker/u34:5delta=58 <idle>-0 [002] d..5. 193.811198: wake_lat: wakee=(efault)bashdelta=91
The commit e1a453a57bc7 fixed the part where the synthetic event had "char[] wakee". But if one were to replace that with a static size string:
# echo 's:wake_lat char[16] wakee; u64 delta;' >> /sys/kernel/tracing/dynamic_events
Where "wakee" is defined as "char[16]" and not "char[]" making it a static size, the code triggered the "(efaul)" again.
Remove the added STR_VAR_LEN_MAX size as the string is still going to be nul terminated.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Douglas Raillard douglas.raillard@arm.com Link: https://patch.msgid.link/20251204151935.5fa30355@gandalf.local.home Fixes: e1a453a57bc7 ("tracing: Do not add length to print format in synthetic events") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_synth.c | 1 - 1 file changed, 1 deletion(-)
--- a/kernel/trace/trace_events_synth.c +++ b/kernel/trace/trace_events_synth.c @@ -375,7 +375,6 @@ static enum print_line_t print_synth_eve n_u64++; } else { trace_seq_printf(s, print_fmt, se->fields[i]->name, - STR_VAR_LEN_MAX, (char *)&entry->fields[n_u64].as_u64, i == se->n_fields - 1 ? "" : " "); n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 990eb9a8eb4540ab90c7b34bb07b87ff13881cad upstream.
Make sure to drop the reference taken when looking up the PMU device and its regmap.
Note that holding a reference to a device does not prevent its regmap from going away so there is no point in keeping the reference.
Fixes: 0b7c6075022c ("soc: samsung: exynos-pmu: Add regmap support for SoCs that protect PMU regs") Cc: stable@vger.kernel.org # 6.9 Cc: Peter Griffin peter.griffin@linaro.org Signed-off-by: Johan Hovold johan@kernel.org Link: https://patch.msgid.link/20251121121852.16825-1-johan@kernel.org Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/soc/samsung/exynos-pmu.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/soc/samsung/exynos-pmu.c +++ b/drivers/soc/samsung/exynos-pmu.c @@ -346,6 +346,8 @@ struct regmap *exynos_get_pmu_regmap_by_ if (!dev) return ERR_PTR(-EPROBE_DEFER);
+ put_device(dev); + return syscon_node_to_regmap(pmu_np); } EXPORT_SYMBOL_GPL(exynos_get_pmu_regmap_by_phandle);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 94124bf253d24b13e89c45618a168d5a1d8a61e7 upstream.
Make sure to drop the reference taken to the pbs platform device when looking up its driver data.
Note that holding a reference to a device does not prevent its driver data from going away so there is no point in keeping the reference.
Fixes: 5b2dd77be1d8 ("soc: qcom: add QCOM PBS driver") Cc: stable@vger.kernel.org # 6.9 Cc: Anjelique Melendez quic_amelende@quicinc.com Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20250926143511.6715-3-johan@kernel.org Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/soc/qcom/qcom-pbs.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/soc/qcom/qcom-pbs.c +++ b/drivers/soc/qcom/qcom-pbs.c @@ -173,6 +173,8 @@ struct pbs_dev *get_pbs_client_device(st return ERR_PTR(-EINVAL); }
+ platform_device_put(pdev); + return pbs; } EXPORT_SYMBOL_GPL(get_pbs_client_device);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit b5c16ea57b030b8e9428ec726e26219dfe05c3d9 upstream.
Make sure to drop the reference taken to the ocmem platform device when looking up its driver data.
Note that holding a reference to a device does not prevent its driver data from going away so there is no point in keeping the reference.
Also note that commit 0ff027027e05 ("soc: qcom: ocmem: Fix missing put_device() call in of_get_ocmem") fixed the leak in a lookup error path, but the reference is still leaking on success.
Fixes: 88c1e9404f1d ("soc: qcom: add OCMEM driver") Cc: stable@vger.kernel.org # 5.5: 0ff027027e05 Cc: Brian Masney bmasney@redhat.com Cc: Miaoqian Lin linmq006@gmail.com Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Brian Masney bmasney@redhat.com Link: https://lore.kernel.org/r/20250926143511.6715-2-johan@kernel.org Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/soc/qcom/ocmem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/soc/qcom/ocmem.c +++ b/drivers/soc/qcom/ocmem.c @@ -202,9 +202,9 @@ struct ocmem *of_get_ocmem(struct device }
ocmem = platform_get_drvdata(pdev); + put_device(&pdev->dev); if (!ocmem) { dev_err(dev, "Cannot get ocmem\n"); - put_device(&pdev->dev); return ERR_PTR(-ENODEV); } return ocmem;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit f401671e90ccc26b3022f177c4156a429c024f6c upstream.
Make sure to drop the reference taken to the mbox platform device when looking up its driver data.
Note that holding a reference to a device does not prevent its driver data from going away so there is no point in keeping the reference.
Fixes: 6e1457fcad3f ("soc: apple: mailbox: Add ASC/M3 mailbox driver") Cc: stable@vger.kernel.org # 6.8 Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Neal Gompa neal@gompa.dev Signed-off-by: Sven Peter sven@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/soc/apple/mailbox.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
--- a/drivers/soc/apple/mailbox.c +++ b/drivers/soc/apple/mailbox.c @@ -302,11 +302,18 @@ struct apple_mbox *apple_mbox_get(struct return ERR_PTR(-EPROBE_DEFER);
mbox = platform_get_drvdata(pdev); - if (!mbox) - return ERR_PTR(-EPROBE_DEFER); + if (!mbox) { + mbox = ERR_PTR(-EPROBE_DEFER); + goto out_put_pdev; + } + + if (!device_link_add(dev, &pdev->dev, DL_FLAG_AUTOREMOVE_CONSUMER)) { + mbox = ERR_PTR(-ENODEV); + goto out_put_pdev; + }
- if (!device_link_add(dev, &pdev->dev, DL_FLAG_AUTOREMOVE_CONSUMER)) - return ERR_PTR(-ENODEV); +out_put_pdev: + put_device(&pdev->dev);
return mbox; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 32200f4828de9d7e6db379909898e718747f4e18 upstream.
Make sure to drop the reference taken to the canvas platform device when looking up its driver data.
Note that holding a reference to a device does not prevent its driver data from going away so there is no point in keeping the reference.
Also note that commit 28f851e6afa8 ("soc: amlogic: canvas: add missing put_device() call in meson_canvas_get()") fixed the leak in a lookup error path, but the reference is still leaking on success.
Fixes: d4983983d987 ("soc: amlogic: add meson-canvas driver") Cc: stable@vger.kernel.org # 4.20: 28f851e6afa8 Cc: Yu Kuai yukuai3@huawei.com Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Link: https://patch.msgid.link/20250926142454.5929-2-johan@kernel.org Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/soc/amlogic/meson-canvas.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/soc/amlogic/meson-canvas.c +++ b/drivers/soc/amlogic/meson-canvas.c @@ -73,10 +73,9 @@ struct meson_canvas *meson_canvas_get(st * current state, this driver probe cannot return -EPROBE_DEFER */ canvas = dev_get_drvdata(&canvas_pdev->dev); - if (!canvas) { - put_device(&canvas_pdev->dev); + put_device(&canvas_pdev->dev); + if (!canvas) return ERR_PTR(-EINVAL); - }
return canvas; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomas Glozar tglozar@redhat.com
commit e4240db9336c25826a2d6634adcca86d5ee01bde upstream.
rtla-timerlat allows a *thread* latency threshold to be set via the -T/--thread option. However, the timerlat tracer calls this *total* latency (stop_tracing_total_us), and stops tracing also when the return-to-user latency is over the threshold.
Change the behavior of the timerlat BPF program to reflect what the timerlat tracer is doing, to avoid discrepancy between stopping collecting data in the BPF program and stopping tracing in the timerlat tracer.
Cc: stable@vger.kernel.org Fixes: e34293ddcebd ("rtla/timerlat: Add BPF skeleton to collect samples") Reviewed-by: Wander Lairson Costa wander@redhat.com Link: https://lore.kernel.org/r/20251006143100.137255-1-tglozar@redhat.com Signed-off-by: Tomas Glozar tglozar@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/tracing/rtla/src/timerlat.bpf.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/tools/tracing/rtla/src/timerlat.bpf.c b/tools/tracing/rtla/src/timerlat.bpf.c index 084cd10c21fc..e2265b5d6491 100644 --- a/tools/tracing/rtla/src/timerlat.bpf.c +++ b/tools/tracing/rtla/src/timerlat.bpf.c @@ -148,6 +148,9 @@ int handle_timerlat_sample(struct trace_event_raw_timerlat_sample *tp_args) } else { update_main_hist(&hist_user, bucket); update_summary(&summary_user, latency, bucket); + + if (thread_threshold != 0 && latency_us >= thread_threshold) + set_stop_tracing(); }
return 0;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Srinivas Kandagatla srinivas.kandagatla@oss.qualcomm.com
commit a53e356df548f6b0e82529ef3cc6070f42622189 upstream.
While testing rpmsg-char interface it was noticed that duplicate sysfs entries are getting created and below warning is noticed.
Reason for this is that we are leaking rpmsg device pointer, setting it null without actually unregistering device. Any further attempts to unregister fail because rpdev is NULL, resulting in a leak.
Fix this by unregistering rpmsg device before removing its reference from rpmsg channel.
sysfs: cannot create duplicate filename '/devices/platform/soc@0/3700000.remot eproc/remoteproc/remoteproc1/3700000.remoteproc:glink-edge/3700000.remoteproc: glink-edge.adsp_apps.-1.-1' [ 114.115347] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.16.0-rc4 #7 PREEMPT [ 114.115355] Hardware name: Qualcomm Technologies, Inc. Robotics RB3gen2 (DT) [ 114.115358] Workqueue: events qcom_glink_work [ 114.115371] Call trace:8 [ 114.115374] show_stack+0x18/0x24 (C) [ 114.115382] dump_stack_lvl+0x60/0x80 [ 114.115388] dump_stack+0x18/0x24 [ 114.115393] sysfs_warn_dup+0x64/0x80 [ 114.115402] sysfs_create_dir_ns+0xf4/0x120 [ 114.115409] kobject_add_internal+0x98/0x260 [ 114.115416] kobject_add+0x9c/0x108 [ 114.115421] device_add+0xc4/0x7a0 [ 114.115429] rpmsg_register_device+0x5c/0xb0 [ 114.115434] qcom_glink_work+0x4bc/0x820 [ 114.115438] process_one_work+0x148/0x284 [ 114.115446] worker_thread+0x2c4/0x3e0 [ 114.115452] kthread+0x12c/0x204 [ 114.115457] ret_from_fork+0x10/0x20 [ 114.115464] kobject: kobject_add_internal failed for 3700000.remoteproc: glink-edge.adsp_apps.-1.-1 with -EEXIST, don't try to register things with the same name in the same directory. [ 114.250045] rpmsg 3700000.remoteproc:glink-edge.adsp_apps.-1.-1: device_add failed: -17
Fixes: 835764ddd9af ("rpmsg: glink: Move the common glink protocol implementation to glink_native.c") Cc: Stable@vger.kernel.org Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@oss.qualcomm.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com Link: https://lore.kernel.org/r/20250822100043.2604794-2-srinivas.kandagatla@oss.q... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/rpmsg/qcom_glink_native.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/rpmsg/qcom_glink_native.c +++ b/drivers/rpmsg/qcom_glink_native.c @@ -1399,6 +1399,7 @@ static void qcom_glink_destroy_ept(struc { struct glink_channel *channel = to_glink_channel(ept); struct qcom_glink *glink = channel->glink; + struct rpmsg_channel_info chinfo; unsigned long flags;
spin_lock_irqsave(&channel->recv_lock, flags); @@ -1406,6 +1407,13 @@ static void qcom_glink_destroy_ept(struc spin_unlock_irqrestore(&channel->recv_lock, flags);
/* Decouple the potential rpdev from the channel */ + if (channel->rpdev) { + strscpy_pad(chinfo.name, channel->name, sizeof(chinfo.name)); + chinfo.src = RPMSG_ADDR_ANY; + chinfo.dst = RPMSG_ADDR_ANY; + + rpmsg_unregister_device(glink->dev, &chinfo); + } channel->rpdev = NULL;
qcom_glink_send_close_req(glink, channel);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Biju Das biju.das.jz@bp.renesas.com
commit fae00ea9f00367771003ace78f29549dead58fc7 upstream.
The rzg2l_gpt_config() tests the rzg2l_gpt->period_tick variable when both channels of a hardware channel are in use. This check is not valid if rzg2l_gpt_config() is called after disabling all the channels, as it tests against the cached value. Hence, allow checking and setting the cached value only if the sibling channel is enabled.
While at it, drop else after return statement to fix the check patch warning.
Cc: stable@kernel.org Fixes: 061f087f5d0b ("pwm: Add support for RZ/G2L GPT") Signed-off-by: Biju Das biju.das.jz@bp.renesas.com Link: https://patch.msgid.link/20251126104308.142302-1-biju.das.jz@bp.renesas.com Signed-off-by: Uwe Kleine-König ukleinek@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pwm/pwm-rzg2l-gpt.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
--- a/drivers/pwm/pwm-rzg2l-gpt.c +++ b/drivers/pwm/pwm-rzg2l-gpt.c @@ -96,6 +96,11 @@ static inline unsigned int rzg2l_gpt_sub return hwpwm & 0x1; }
+static inline unsigned int rzg2l_gpt_sibling(unsigned int hwpwm) +{ + return hwpwm ^ 0x1; +} + static void rzg2l_gpt_write(struct rzg2l_gpt_chip *rzg2l_gpt, u32 reg, u32 data) { writel(data, rzg2l_gpt->mmio + reg); @@ -271,10 +276,14 @@ static int rzg2l_gpt_config(struct pwm_c * in use with different settings. */ if (rzg2l_gpt->channel_request_count[ch] > 1) { - if (period_ticks < rzg2l_gpt->period_ticks[ch]) - return -EBUSY; - else + u8 sibling_ch = rzg2l_gpt_sibling(pwm->hwpwm); + + if (rzg2l_gpt_is_ch_enabled(rzg2l_gpt, sibling_ch)) { + if (period_ticks < rzg2l_gpt->period_ticks[ch]) + return -EBUSY; + period_ticks = rzg2l_gpt->period_ticks[ch]; + } }
prescale = rzg2l_gpt_calculate_prescale(rzg2l_gpt, period_ticks);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bartosz Golaszewski bartosz.golaszewski@linaro.org
commit 527250cd9092461f1beac3e4180a4481bffa01b5 upstream.
Members of struct software_node_ref_args should not be dereferenced directly but set using the provided macros. Commit d7cdbbc93c56 ("software node: allow referencing firmware nodes") changed the name of the software node member and caused a build failure. Remove all direct dereferences of the ref struct as a fix.
However, this driver also seems to abuse the software node interface by waiting for a node with an arbitrary name "intel-xhci-usb-sw" to appear in the system before setting up the reference for the I2C device, while the actual software node already exists in the intel-xhci-usb-role-switch module and should be used to set up a static reference. Add a FIXME for a future improvement.
Fixes: d7cdbbc93c56 ("software node: allow referencing firmware nodes") Fixes: 53c24c2932e5 ("platform/x86: intel_cht_int33fe: use inline reference properties") Cc: stable@vger.kernel.org Reported-by: Stephen Rothwell sfr@canb.auug.org.au Closes: https://lore.kernel.org/all/20251121111534.7cdbfe5c@canb.auug.org.au/ Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Reviewed-by: Hans de Goede johannes.goede@oss.qualcomm.com Acked-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/intel/chtwc_int33fe.c | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-)
--- a/drivers/platform/x86/intel/chtwc_int33fe.c +++ b/drivers/platform/x86/intel/chtwc_int33fe.c @@ -77,7 +77,7 @@ static const struct software_node max170 * software node. */ static struct software_node_ref_args fusb302_mux_refs[] = { - { .node = NULL }, + SOFTWARE_NODE_REFERENCE(NULL), };
static const struct property_entry fusb302_properties[] = { @@ -190,11 +190,6 @@ static void cht_int33fe_remove_nodes(str { software_node_unregister_node_group(node_group);
- if (fusb302_mux_refs[0].node) { - fwnode_handle_put(software_node_fwnode(fusb302_mux_refs[0].node)); - fusb302_mux_refs[0].node = NULL; - } - if (data->dp) { data->dp->secondary = NULL; fwnode_handle_put(data->dp); @@ -202,7 +197,15 @@ static void cht_int33fe_remove_nodes(str } }
-static int cht_int33fe_add_nodes(struct cht_int33fe_data *data) +static void cht_int33fe_put_swnode(void *data) +{ + struct fwnode_handle *fwnode = data; + + fwnode_handle_put(fwnode); + fusb302_mux_refs[0] = SOFTWARE_NODE_REFERENCE(NULL); +} + +static int cht_int33fe_add_nodes(struct device *dev, struct cht_int33fe_data *data) { const struct software_node *mux_ref_node; int ret; @@ -212,17 +215,25 @@ static int cht_int33fe_add_nodes(struct * until the mux driver has created software node for the mux device. * It means we depend on the mux driver. This function will return * -EPROBE_DEFER until the mux device is registered. + * + * FIXME: the relevant software node exists in intel-xhci-usb-role-switch + * and - if exported - could be used to set up a static reference. */ mux_ref_node = software_node_find_by_name(NULL, "intel-xhci-usb-sw"); if (!mux_ref_node) return -EPROBE_DEFER;
+ ret = devm_add_action_or_reset(dev, cht_int33fe_put_swnode, + software_node_fwnode(mux_ref_node)); + if (ret) + return ret; + /* * Update node used in "usb-role-switch" property. Note that we * rely on software_node_register_node_group() to use the original * instance of properties instead of copying them. */ - fusb302_mux_refs[0].node = mux_ref_node; + fusb302_mux_refs[0] = SOFTWARE_NODE_REFERENCE(mux_ref_node);
ret = software_node_register_node_group(node_group); if (ret) @@ -345,7 +356,7 @@ static int cht_int33fe_typec_probe(struc return fusb302_irq; }
- ret = cht_int33fe_add_nodes(data); + ret = cht_int33fe_add_nodes(dev, data); if (ret) return ret;
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@kernel.org
commit 1cd5bb6e9e027bab33aafd58fe8340124869ba62 upstream.
Replace the RISCV_ISA_V dependency of the RISC-V crypto code with RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS, which implies RISCV_ISA_V as well as vector unaligned accesses being efficient.
This is necessary because this code assumes that vector unaligned accesses are supported and are efficient. (It does so to avoid having to use lots of extra vsetvli instructions to switch the element width back and forth between 8 and either 32 or 64.)
This was omitted from the code originally just because the RISC-V kernel support for detecting this feature didn't exist yet. Support has now been added, but it's fragmented into per-CPU runtime detection, a command-line parameter, and a kconfig option. The kconfig option is the only reasonable way to do it, though, so let's just rely on that.
Fixes: eb24af5d7a05 ("crypto: riscv - add vector crypto accelerated AES-{ECB,CBC,CTR,XTS}") Fixes: bb54668837a0 ("crypto: riscv - add vector crypto accelerated ChaCha20") Fixes: 600a3853dfa0 ("crypto: riscv - add vector crypto accelerated GHASH") Fixes: 8c8e40470ffe ("crypto: riscv - add vector crypto accelerated SHA-{256,224}") Fixes: b3415925a08b ("crypto: riscv - add vector crypto accelerated SHA-{512,384}") Fixes: 563a5255afa2 ("crypto: riscv - add vector crypto accelerated SM3") Fixes: b8d06352bbf3 ("crypto: riscv - add vector crypto accelerated SM4") Cc: stable@vger.kernel.org Reported-by: Vivian Wang wangruikang@iscas.ac.cn Closes: https://lore.kernel.org/r/b3cfcdac-0337-4db0-a611-258f2868855f@iscas.ac.cn/ Reviewed-by: Jerry Shih jerry.shih@sifive.com Link: https://lore.kernel.org/r/20251206213750.81474-1-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/crypto/Kconfig | 12 ++++++++---- lib/crypto/Kconfig | 9 ++++++--- 2 files changed, 14 insertions(+), 7 deletions(-)
--- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -4,7 +4,8 @@ menu "Accelerated Cryptographic Algorith
config CRYPTO_AES_RISCV64 tristate "Ciphers: AES, modes: ECB, CBC, CTS, CTR, XTS" - depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO + depends on 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ + RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS select CRYPTO_ALGAPI select CRYPTO_LIB_AES select CRYPTO_SKCIPHER @@ -20,7 +21,8 @@ config CRYPTO_AES_RISCV64
config CRYPTO_GHASH_RISCV64 tristate "Hash functions: GHASH" - depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO + depends on 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ + RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS select CRYPTO_GCM help GCM GHASH function (NIST SP 800-38D) @@ -30,7 +32,8 @@ config CRYPTO_GHASH_RISCV64
config CRYPTO_SM3_RISCV64 tristate "Hash functions: SM3 (ShangMi 3)" - depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO + depends on 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ + RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS select CRYPTO_HASH select CRYPTO_LIB_SM3 help @@ -42,7 +45,8 @@ config CRYPTO_SM3_RISCV64
config CRYPTO_SM4_RISCV64 tristate "Ciphers: SM4 (ShangMi 4)" - depends on 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO + depends on 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ + RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS select CRYPTO_ALGAPI select CRYPTO_SM4 help --- a/lib/crypto/Kconfig +++ b/lib/crypto/Kconfig @@ -50,7 +50,8 @@ config CRYPTO_LIB_CHACHA_ARCH default y if ARM64 && KERNEL_MODE_NEON default y if MIPS && CPU_MIPS32_R2 default y if PPC64 && CPU_LITTLE_ENDIAN && VSX - default y if RISCV && 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO + default y if RISCV && 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ + RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS default y if S390 default y if X86_64
@@ -161,7 +162,8 @@ config CRYPTO_LIB_SHA256_ARCH default y if ARM64 default y if MIPS && CPU_CAVIUM_OCTEON default y if PPC && SPE - default y if RISCV && 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO + default y if RISCV && 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ + RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS default y if S390 default y if SPARC64 default y if X86_64 @@ -179,7 +181,8 @@ config CRYPTO_LIB_SHA512_ARCH default y if ARM && !CPU_V7M default y if ARM64 default y if MIPS && CPU_CAVIUM_OCTEON - default y if RISCV && 64BIT && RISCV_ISA_V && TOOLCHAIN_HAS_VECTOR_CRYPTO + default y if RISCV && 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ + RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS default y if S390 default y if SPARC64 default y if X86_64
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
commit a6ee6aac66fb394b7f6e6187c73bdcd873f2d139 upstream.
In i2c_amd_probe(), amd_mp2_find_device() utilizes driver_find_next_device() which internally calls driver_find_device() to locate the matching device. driver_find_device() increments the reference count of the found device by calling get_device(), but amd_mp2_find_device() fails to call put_device() to decrement the reference count before returning. This results in a reference count leak of the PCI device each time i2c_amd_probe() is executed, which may prevent the device from being properly released and cause a memory leak.
Found by code review.
Cc: stable@vger.kernel.org Fixes: 529766e0a011 ("i2c: Add drivers for the AMD PCIe MP2 I2C controller") Signed-off-by: Ma Ke make24@iscas.ac.cn Signed-off-by: Andi Shyti andi.shyti@kernel.org Link: https://lore.kernel.org/r/20251022095402.8846-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-amd-mp2-pci.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/i2c/busses/i2c-amd-mp2-pci.c +++ b/drivers/i2c/busses/i2c-amd-mp2-pci.c @@ -458,13 +458,16 @@ struct amd_mp2_dev *amd_mp2_find_device( { struct device *dev; struct pci_dev *pci_dev; + struct amd_mp2_dev *mp2_dev;
dev = driver_find_next_device(&amd_mp2_pci_driver.driver, NULL); if (!dev) return NULL;
pci_dev = to_pci_dev(dev); - return (struct amd_mp2_dev *)pci_get_drvdata(pci_dev); + mp2_dev = (struct amd_mp2_dev *)pci_get_drvdata(pci_dev); + put_device(dev); + return mp2_dev; } EXPORT_SYMBOL_GPL(amd_mp2_find_device);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Raviteja Laggyshetty quic_rlaggysh@quicinc.com
commit 295f58fdccd05b2d6da1f4a4f81952ccb565c4dc upstream.
As like other SDX SoCs, SDX75 SoC's QPIC BCM resource was modeled as a RPMh clock in clk-rpmh driver. However, for SDX75, this resource was also described as an interconnect and BCM node mistakenly. It is incorrect to describe the same resource in two different providers, as it will lead to votes from clients overriding each other.
Hence, drop the QPIC interconnect and BCM nodes and let the clients use clk-rpmh driver to vote for this resource.
Without this change, the NAND driver fails to probe on SDX75, as the interconnect sync state disables the QPIC nodes as there were no clients voting for this ICC resource. However, the NAND driver had already voted for this BCM resource through the clk-rpmh driver. Since both votes come from Linux, RPMh was unable to distinguish between these two and ends up disabling the QPIC resource during sync state.
Cc: stable@vger.kernel.org Fixes: 3642b4e5cbfe ("interconnect: qcom: Add SDX75 interconnect provider driver") Signed-off-by: Raviteja Laggyshetty quic_rlaggysh@quicinc.com [mani: dropped the reference to bcm_qp0, reworded description] Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@oss.qualcomm.com Reviewed-by: Konrad Dybcio konrad.dybcio@oss.qualcomm.com Tested-by: Lakshmi Sowjanya D quic_laksd@quicinc.com # on SDX75 Link: https://lore.kernel.org/r/20250926-sdx75-icc-v2-1-20d6820e455c@oss.qualcomm.... Signed-off-by: Georgi Djakov djakov@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/interconnect/qcom/sdx75.c | 26 -------------------------- drivers/interconnect/qcom/sdx75.h | 2 -- 2 files changed, 28 deletions(-)
--- a/drivers/interconnect/qcom/sdx75.c +++ b/drivers/interconnect/qcom/sdx75.c @@ -16,15 +16,6 @@ #include "icc-rpmh.h" #include "sdx75.h"
-static struct qcom_icc_node qpic_core_master = { - .name = "qpic_core_master", - .id = SDX75_MASTER_QPIC_CORE, - .channels = 1, - .buswidth = 4, - .num_links = 1, - .links = { SDX75_SLAVE_QPIC_CORE }, -}; - static struct qcom_icc_node qup0_core_master = { .name = "qup0_core_master", .id = SDX75_MASTER_QUP_CORE_0, @@ -375,14 +366,6 @@ static struct qcom_icc_node xm_usb3 = { .links = { SDX75_SLAVE_A1NOC_CFG }, };
-static struct qcom_icc_node qpic_core_slave = { - .name = "qpic_core_slave", - .id = SDX75_SLAVE_QPIC_CORE, - .channels = 1, - .buswidth = 4, - .num_links = 0, -}; - static struct qcom_icc_node qup0_core_slave = { .name = "qup0_core_slave", .id = SDX75_SLAVE_QUP_CORE_0, @@ -831,12 +814,6 @@ static struct qcom_icc_bcm bcm_mc0 = { .nodes = { &ebi }, };
-static struct qcom_icc_bcm bcm_qp0 = { - .name = "QP0", - .num_nodes = 1, - .nodes = { &qpic_core_slave }, -}; - static struct qcom_icc_bcm bcm_qup0 = { .name = "QUP0", .keepalive = true, @@ -898,14 +875,11 @@ static struct qcom_icc_bcm bcm_sn4 = { };
static struct qcom_icc_bcm * const clk_virt_bcms[] = { - &bcm_qp0, &bcm_qup0, };
static struct qcom_icc_node * const clk_virt_nodes[] = { - [MASTER_QPIC_CORE] = &qpic_core_master, [MASTER_QUP_CORE_0] = &qup0_core_master, - [SLAVE_QPIC_CORE] = &qpic_core_slave, [SLAVE_QUP_CORE_0] = &qup0_core_slave, };
--- a/drivers/interconnect/qcom/sdx75.h +++ b/drivers/interconnect/qcom/sdx75.h @@ -33,7 +33,6 @@ #define SDX75_MASTER_QDSS_ETR 24 #define SDX75_MASTER_QDSS_ETR_1 25 #define SDX75_MASTER_QPIC 26 -#define SDX75_MASTER_QPIC_CORE 27 #define SDX75_MASTER_QUP_0 28 #define SDX75_MASTER_QUP_CORE_0 29 #define SDX75_MASTER_SDCC_1 30 @@ -76,7 +75,6 @@ #define SDX75_SLAVE_QDSS_CFG 67 #define SDX75_SLAVE_QDSS_STM 68 #define SDX75_SLAVE_QPIC 69 -#define SDX75_SLAVE_QPIC_CORE 70 #define SDX75_SLAVE_QUP_0 71 #define SDX75_SLAVE_QUP_CORE_0 72 #define SDX75_SLAVE_SDCC_1 73
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joanne Koong joannelkoong@gmail.com
commit 525916ce496615f531091855604eab9ca573b195 upstream.
When cloning with node replacements (IORING_REGISTER_DST_REPLACE), destination entries after the cloned range are not copied over.
Add logic to copy them over to the new destination table.
Fixes: c1329532d5aa ("io_uring/rsrc: allow cloning with node replacements") Cc: stable@vger.kernel.org Signed-off-by: Joanne Koong joannelkoong@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/rsrc.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -1200,7 +1200,7 @@ static int io_clone_buffers(struct io_ri if (ret) return ret;
- /* Fill entries in data from dst that won't overlap with src */ + /* Copy original dst nodes from before the cloned range */ for (i = 0; i < min(arg->dst_off, ctx->buf_table.nr); i++) { struct io_rsrc_node *src_node = ctx->buf_table.nodes[i];
@@ -1248,6 +1248,16 @@ static int io_clone_buffers(struct io_ri i++; }
+ /* Copy original dst nodes from after the cloned range */ + for (i = nbufs; i < ctx->buf_table.nr; i++) { + struct io_rsrc_node *node = ctx->buf_table.nodes[i]; + + if (node) { + data.nodes[i] = node; + node->refs++; + } + } + /* * If asked for replace, put the old table. data->nodes[] holds both * old and new nodes at this point.
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gui-Dong Han hanguidong02@gmail.com
commit b8d5acdcf525f44e521ca4ef51dce4dac403dab4 upstream.
In max16065_current_show, data->curr_sense is read twice: once for the error check and again for the calculation. Since i2c_smbus_read_byte_data returns negative error codes on failure, if the data changes to an error code between the check and the use, ADC_TO_CURR results in an incorrect calculation.
Read data->curr_sense into a local variable to ensure consistency. Note that data->curr_gain is constant and safe to access directly.
This aligns max16065_current_show with max16065_input_show, which already uses a local variable for the same reason.
Link: https://lore.kernel.org/all/CALbr=LYJ_ehtp53HXEVkSpYoub+XYSTU8Rg=o1xxMJ8=5z8... Fixes: f5bae2642e3d ("hwmon: Driver for MAX16065 System Manager and compatibles") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han hanguidong02@gmail.com Link: https://lore.kernel.org/r/20251128124709.3876-1-hanguidong02@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/max16065.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/hwmon/max16065.c +++ b/drivers/hwmon/max16065.c @@ -216,12 +216,13 @@ static ssize_t max16065_current_show(str struct device_attribute *da, char *buf) { struct max16065_data *data = max16065_update_device(dev); + int curr_sense = data->curr_sense;
- if (unlikely(data->curr_sense < 0)) - return data->curr_sense; + if (unlikely(curr_sense < 0)) + return curr_sense;
return sysfs_emit(buf, "%d\n", - ADC_TO_CURR(data->curr_sense, data->curr_gain)); + ADC_TO_CURR(curr_sense, data->curr_gain)); }
static ssize_t max16065_limit_store(struct device *dev,
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 02f0ad8e8de8cf5344f8f0fa26d9529b8339da47 upstream.
The i2c regmap allocated during probe is never freed.
Switch to using the device managed allocator so that the regmap is released on probe failures (e.g. probe deferral) and on driver unbind.
Fixes: 3a2a8cc3fe24 ("hwmon: (max6697) Convert to use regmap") Cc: stable@vger.kernel.org # 6.12 Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20251127134351.1585-1-johan@kernel.org Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/max6697.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/hwmon/max6697.c +++ b/drivers/hwmon/max6697.c @@ -548,7 +548,7 @@ static int max6697_probe(struct i2c_clie struct regmap *regmap; int err;
- regmap = regmap_init_i2c(client, &max6697_regmap_config); + regmap = devm_regmap_init_i2c(client, &max6697_regmap_config); if (IS_ERR(regmap)) return PTR_ERR(regmap);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gui-Dong Han hanguidong02@gmail.com
commit 670d7ef945d3a84683594429aea6ab2cdfa5ceb4 upstream.
The macro FAN_FROM_REG evaluates its arguments multiple times. When used in lockless contexts involving shared driver data, this leads to Time-of-Check to Time-of-Use (TOCTOU) race conditions, potentially causing divide-by-zero errors.
Convert the macro to a static function. This guarantees that arguments are evaluated only once (pass-by-value), preventing the race conditions.
Additionally, in store_fan_div, move the calculation of the minimum limit inside the update lock. This ensures that the read-modify-write sequence operates on consistent data.
Adhere to the principle of minimal changes by only converting macros that evaluate arguments multiple times and are used in lockless contexts.
Link: https://lore.kernel.org/all/CALbr=LYJ_ehtp53HXEVkSpYoub+XYSTU8Rg=o1xxMJ8=5z8... Fixes: 9873964d6eb2 ("[PATCH] HWMON: w83791d: New hardware monitoring driver for the Winbond W83791D") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han hanguidong02@gmail.com Link: https://lore.kernel.org/r/20251202180105.12842-1-hanguidong02@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/w83791d.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
--- a/drivers/hwmon/w83791d.c +++ b/drivers/hwmon/w83791d.c @@ -218,9 +218,14 @@ static u8 fan_to_reg(long rpm, int div) return clamp_val((1350000 + rpm * div / 2) / (rpm * div), 1, 254); }
-#define FAN_FROM_REG(val, div) ((val) == 0 ? -1 : \ - ((val) == 255 ? 0 : \ - 1350000 / ((val) * (div)))) +static int fan_from_reg(int val, int div) +{ + if (val == 0) + return -1; + if (val == 255) + return 0; + return 1350000 / (val * div); +}
/* for temp1 which is 8-bit resolution, LSB = 1 degree Celsius */ #define TEMP1_FROM_REG(val) ((val) * 1000) @@ -521,7 +526,7 @@ static ssize_t show_##reg(struct device struct w83791d_data *data = w83791d_update_device(dev); \ int nr = sensor_attr->index; \ return sprintf(buf, "%d\n", \ - FAN_FROM_REG(data->reg[nr], DIV_FROM_REG(data->fan_div[nr]))); \ + fan_from_reg(data->reg[nr], DIV_FROM_REG(data->fan_div[nr]))); \ }
show_fan_reg(fan); @@ -585,10 +590,10 @@ static ssize_t store_fan_div(struct devi if (err) return err;
+ mutex_lock(&data->update_lock); /* Save fan_min */ - min = FAN_FROM_REG(data->fan_min[nr], DIV_FROM_REG(data->fan_div[nr])); + min = fan_from_reg(data->fan_min[nr], DIV_FROM_REG(data->fan_div[nr]));
- mutex_lock(&data->update_lock); data->fan_div[nr] = div_to_reg(nr, val);
switch (nr) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gui-Dong Han hanguidong02@gmail.com
commit 07272e883fc61574b8367d44de48917f622cdd83 upstream.
The macros FAN_FROM_REG and TEMP_FROM_REG evaluate their arguments multiple times. When used in lockless contexts involving shared driver data, this causes Time-of-Check to Time-of-Use (TOCTOU) race conditions.
Convert the macros to static functions. This guarantees that arguments are evaluated only once (pass-by-value), preventing the race conditions.
Adhere to the principle of minimal changes by only converting macros that evaluate arguments multiple times and are used in lockless contexts.
Link: https://lore.kernel.org/all/CALbr=LYJ_ehtp53HXEVkSpYoub+XYSTU8Rg=o1xxMJ8=5z8... Fixes: 85f03bccd6e0 ("hwmon: Add support for Winbond W83L786NG/NR") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han hanguidong02@gmail.com Link: https://lore.kernel.org/r/20251128123816.3670-1-hanguidong02@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/w83l786ng.c | 26 ++++++++++++++++++-------- 1 file changed, 18 insertions(+), 8 deletions(-)
--- a/drivers/hwmon/w83l786ng.c +++ b/drivers/hwmon/w83l786ng.c @@ -76,15 +76,25 @@ FAN_TO_REG(long rpm, int div) return clamp_val((1350000 + rpm * div / 2) / (rpm * div), 1, 254); }
-#define FAN_FROM_REG(val, div) ((val) == 0 ? -1 : \ - ((val) == 255 ? 0 : \ - 1350000 / ((val) * (div)))) +static int fan_from_reg(int val, int div) +{ + if (val == 0) + return -1; + if (val == 255) + return 0; + return 1350000 / (val * div); +}
/* for temp */ #define TEMP_TO_REG(val) (clamp_val(((val) < 0 ? (val) + 0x100 * 1000 \ : (val)) / 1000, 0, 0xff)) -#define TEMP_FROM_REG(val) (((val) & 0x80 ? \ - (val) - 0x100 : (val)) * 1000) + +static int temp_from_reg(int val) +{ + if (val & 0x80) + return (val - 0x100) * 1000; + return val * 1000; +}
/* * The analog voltage inputs have 8mV LSB. Since the sysfs output is @@ -280,7 +290,7 @@ static ssize_t show_##reg(struct device int nr = to_sensor_dev_attr(attr)->index; \ struct w83l786ng_data *data = w83l786ng_update_device(dev); \ return sprintf(buf, "%d\n", \ - FAN_FROM_REG(data->reg[nr], DIV_FROM_REG(data->fan_div[nr]))); \ + fan_from_reg(data->reg[nr], DIV_FROM_REG(data->fan_div[nr]))); \ }
show_fan_reg(fan); @@ -347,7 +357,7 @@ store_fan_div(struct device *dev, struct
/* Save fan_min */ mutex_lock(&data->update_lock); - min = FAN_FROM_REG(data->fan_min[nr], DIV_FROM_REG(data->fan_div[nr])); + min = fan_from_reg(data->fan_min[nr], DIV_FROM_REG(data->fan_div[nr]));
data->fan_div[nr] = DIV_TO_REG(val);
@@ -409,7 +419,7 @@ show_temp(struct device *dev, struct dev int nr = sensor_attr->nr; int index = sensor_attr->index; struct w83l786ng_data *data = w83l786ng_update_device(dev); - return sprintf(buf, "%d\n", TEMP_FROM_REG(data->temp[nr][index])); + return sprintf(buf, "%d\n", temp_from_reg(data->temp[nr][index])); }
static ssize_t
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Ferre nicolas.ferre@microchip.com
commit 7d5864dc5d5ea6a35983dd05295fb17f2f2f44ce upstream.
Unlike standalone spi peripherals, on sama5d2, the flexcom spi have fifo size of 32 data. Fix flexcom/spi nodes where this property is wrong.
Fixes: 6b9a3584c7ed ("ARM: dts: at91: sama5d2: Add missing flexcom definitions") Cc: stable@vger.kernel.org # 5.8+ Signed-off-by: Nicolas Ferre nicolas.ferre@microchip.com Link: https://lore.kernel.org/r/20251114140225.30372-1-nicolas.ferre@microchip.com Signed-off-by: Claudiu Beznea claudiu.beznea@tuxon.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/microchip/sama5d2.dtsi | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/arch/arm/boot/dts/microchip/sama5d2.dtsi +++ b/arch/arm/boot/dts/microchip/sama5d2.dtsi @@ -571,7 +571,7 @@ AT91_XDMAC_DT_PER_IF(1) | AT91_XDMAC_DT_PERID(12))>; dma-names = "tx", "rx"; - atmel,fifo-size = <16>; + atmel,fifo-size = <32>; status = "disabled"; };
@@ -642,7 +642,7 @@ AT91_XDMAC_DT_PER_IF(1) | AT91_XDMAC_DT_PERID(14))>; dma-names = "tx", "rx"; - atmel,fifo-size = <16>; + atmel,fifo-size = <32>; status = "disabled"; };
@@ -854,7 +854,7 @@ AT91_XDMAC_DT_PER_IF(1) | AT91_XDMAC_DT_PERID(16))>; dma-names = "tx", "rx"; - atmel,fifo-size = <16>; + atmel,fifo-size = <32>; status = "disabled"; };
@@ -925,7 +925,7 @@ AT91_XDMAC_DT_PER_IF(1) | AT91_XDMAC_DT_PERID(18))>; dma-names = "tx", "rx"; - atmel,fifo-size = <16>; + atmel,fifo-size = <32>; status = "disabled"; };
@@ -997,7 +997,7 @@ AT91_XDMAC_DT_PER_IF(1) | AT91_XDMAC_DT_PERID(20))>; dma-names = "tx", "rx"; - atmel,fifo-size = <16>; + atmel,fifo-size = <32>; status = "disabled"; };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Ferre nicolas.ferre@microchip.com
commit 1f591be0a02c697f65a21be35f1d74117bbf4be2 upstream.
On some flexcom nodes related to uart, the fifo sizes were wrong: fix them to 32 data. Note that product datasheet is being reviewed to fix inconsistency, but this value is validated by product's designers.
Fixes: 261dcfad1b59 ("ARM: dts: microchip: add sama7d65 SoC DT") Fixes: b51e4aea3ecf ("ARM: dts: microchip: sama7d65: Add FLEXCOMs to sama7d65 SoC") Cc: stable@vger.kernel.org # 6.16+ Signed-off-by: Nicolas Ferre nicolas.ferre@microchip.com Link: https://lore.kernel.org/r/20251114103313.20220-1-nicolas.ferre@microchip.com Signed-off-by: Claudiu Beznea claudiu.beznea@tuxon.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/microchip/sama7d65.dtsi | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/arm/boot/dts/microchip/sama7d65.dtsi +++ b/arch/arm/boot/dts/microchip/sama7d65.dtsi @@ -557,7 +557,7 @@ dma-names = "tx", "rx"; atmel,use-dma-rx; atmel,use-dma-tx; - atmel,fifo-size = <16>; + atmel,fifo-size = <32>; atmel,usart-mode = <AT91_USART_MODE_SERIAL>; status = "disabled"; }; @@ -618,7 +618,7 @@ clocks = <&pmc PMC_TYPE_PERIPHERAL 40>; clock-names = "usart"; atmel,usart-mode = <AT91_USART_MODE_SERIAL>; - atmel,fifo-size = <16>; + atmel,fifo-size = <32>; status = "disabled"; }; }; @@ -643,7 +643,7 @@ dma-names = "tx", "rx"; atmel,use-dma-rx; atmel,use-dma-tx; - atmel,fifo-size = <16>; + atmel,fifo-size = <32>; atmel,usart-mode = <AT91_USART_MODE_SERIAL>; status = "disabled"; };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Ferre nicolas.ferre@microchip.com
commit 5654889a94b0de5ad6ceae3793e7f5e0b61b50b6 upstream.
On some flexcom nodes related to uart, the fifo sizes were wrong: fix them to 32 data.
Fixes: 7540629e2fc7 ("ARM: dts: at91: add sama7g5 SoC DT and sama7g5-ek") Cc: stable@vger.kernel.org # 5.15+ Signed-off-by: Nicolas Ferre nicolas.ferre@microchip.com Link: https://lore.kernel.org/r/20251114103313.20220-2-nicolas.ferre@microchip.com Signed-off-by: Claudiu Beznea claudiu.beznea@tuxon.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/microchip/sama7g5.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm/boot/dts/microchip/sama7g5.dtsi +++ b/arch/arm/boot/dts/microchip/sama7g5.dtsi @@ -824,7 +824,7 @@ dma-names = "tx", "rx"; atmel,use-dma-rx; atmel,use-dma-tx; - atmel,fifo-size = <16>; + atmel,fifo-size = <32>; status = "disabled"; }; }; @@ -850,7 +850,7 @@ dma-names = "tx", "rx"; atmel,use-dma-rx; atmel,use-dma-tx; - atmel,fifo-size = <16>; + atmel,fifo-size = <32>; status = "disabled"; }; };
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal dlemoal@kernel.org
commit bba4322e3f303b2d656e748be758320b567f046f upstream.
Modify disk_update_zone_resources() to freeze the device queue before updating the number of zones, zone capacity and other zone related resources. The locking order resulting from the call to queue_limits_commit_update_frozen() is preserved, that is, the queue limits lock is first taken by calling queue_limits_start_update() before freezing the queue, and the queue is unfrozen after executing queue_limits_commit_update(), which replaces the call to queue_limits_commit_update_frozen().
This change ensures that there are no in-flights I/Os when the zone resources are updated due to a zone revalidation. In case of error when the limits are applied, directly call disk_free_zone_resources() from disk_update_zone_resources() while the disk queue is still frozen to avoid needing to freeze & unfreeze the queue again in blk_revalidate_disk_zones(), thus simplifying that function code a little.
Fixes: 0b83c86b444a ("block: Prevent potential deadlock in blk_revalidate_disk_zones()") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal dlemoal@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Reviewed-by: Hannes Reinecke hare@suse.de Reviewed-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/blk-zoned.c | 42 ++++++++++++++++++++++++------------------ 1 file changed, 24 insertions(+), 18 deletions(-)
--- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -1516,8 +1516,13 @@ static int disk_update_zone_resources(st { struct request_queue *q = disk->queue; unsigned int nr_seq_zones, nr_conv_zones; - unsigned int pool_size; + unsigned int pool_size, memflags; struct queue_limits lim; + int ret = 0; + + lim = queue_limits_start_update(q); + + memflags = blk_mq_freeze_queue(q);
disk->nr_zones = args->nr_zones; disk->zone_capacity = args->zone_capacity; @@ -1527,11 +1532,10 @@ static int disk_update_zone_resources(st if (nr_conv_zones >= disk->nr_zones) { pr_warn("%s: Invalid number of conventional zones %u / %u\n", disk->disk_name, nr_conv_zones, disk->nr_zones); - return -ENODEV; + ret = -ENODEV; + goto unfreeze; }
- lim = queue_limits_start_update(q); - /* * Some devices can advertize zone resource limits that are larger than * the number of sequential zones of the zoned block device, e.g. a @@ -1568,7 +1572,15 @@ static int disk_update_zone_resources(st }
commit: - return queue_limits_commit_update_frozen(q, &lim); + ret = queue_limits_commit_update(q, &lim); + +unfreeze: + if (ret) + disk_free_zone_resources(disk); + + blk_mq_unfreeze_queue(q, memflags); + + return ret; }
static int blk_revalidate_conv_zone(struct blk_zone *zone, unsigned int idx, @@ -1733,7 +1745,7 @@ int blk_revalidate_disk_zones(struct gen sector_t zone_sectors = q->limits.chunk_sectors; sector_t capacity = get_capacity(disk); struct blk_revalidate_zone_args args = { }; - unsigned int noio_flag; + unsigned int memflags, noio_flag; int ret = -ENOMEM;
if (WARN_ON_ONCE(!blk_queue_is_zoned(q))) @@ -1783,20 +1795,14 @@ int blk_revalidate_disk_zones(struct gen ret = -ENODEV; }
- /* - * Set the new disk zone parameters only once the queue is frozen and - * all I/Os are completed. - */ if (ret > 0) - ret = disk_update_zone_resources(disk, &args); - else - pr_warn("%s: failed to revalidate zones\n", disk->disk_name); - if (ret) { - unsigned int memflags = blk_mq_freeze_queue(q); + return disk_update_zone_resources(disk, &args);
- disk_free_zone_resources(disk); - blk_mq_unfreeze_queue(q, memflags); - } + pr_warn("%s: failed to revalidate zones\n", disk->disk_name); + + memflags = blk_mq_freeze_queue(q); + disk_free_zone_resources(disk); + blk_mq_unfreeze_queue(q, memflags);
return ret; }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit de83d4617f9fe059623e97acf7e1e10d209625b5 upstream.
The driver is dropping the references taken to the larb devices during probe after successful lookup as well as on errors. This can potentially lead to a use-after-free in case a larb device has not yet been bound to its driver so that the iommu driver probe defers.
Fix this by keeping the references as expected while the iommu driver is bound.
Fixes: 26593928564c ("iommu/mediatek: Add error path for loop of mm_dts_parse") Cc: stable@vger.kernel.org Cc: Yong Wu yong.wu@mediatek.com Acked-by: Robin Murphy robin.murphy@arm.com Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Yong Wu yong.wu@mediatek.com Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Signed-off-by: Joerg Roedel joerg.roedel@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iommu/mtk_iommu.c | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-)
--- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -1211,16 +1211,19 @@ static int mtk_iommu_mm_dts_parse(struct }
component_match_add(dev, match, component_compare_dev, &plarbdev->dev); - platform_device_put(plarbdev); }
- if (!frst_avail_smicomm_node) - return -EINVAL; + if (!frst_avail_smicomm_node) { + ret = -EINVAL; + goto err_larbdev_put; + }
pcommdev = of_find_device_by_node(frst_avail_smicomm_node); of_node_put(frst_avail_smicomm_node); - if (!pcommdev) - return -ENODEV; + if (!pcommdev) { + ret = -ENODEV; + goto err_larbdev_put; + } data->smicomm_dev = &pcommdev->dev;
link = device_link_add(data->smicomm_dev, dev, @@ -1228,7 +1231,8 @@ static int mtk_iommu_mm_dts_parse(struct platform_device_put(pcommdev); if (!link) { dev_err(dev, "Unable to link %s.\n", dev_name(data->smicomm_dev)); - return -EINVAL; + ret = -EINVAL; + goto err_larbdev_put; } return 0;
@@ -1400,8 +1404,12 @@ out_sysfs_remove: iommu_device_sysfs_remove(&data->iommu); out_list_del: list_del(&data->list); - if (MTK_IOMMU_IS_TYPE(data->plat_data, MTK_IOMMU_TYPE_MM)) + if (MTK_IOMMU_IS_TYPE(data->plat_data, MTK_IOMMU_TYPE_MM)) { device_link_remove(data->smicomm_dev, dev); + + for (i = 0; i < MTK_LARB_NR_MAX; i++) + put_device(data->larb_imu[i].dev); + } out_runtime_disable: pm_runtime_disable(dev); return ret; @@ -1421,6 +1429,9 @@ static void mtk_iommu_remove(struct plat if (MTK_IOMMU_IS_TYPE(data->plat_data, MTK_IOMMU_TYPE_MM)) { device_link_remove(data->smicomm_dev, &pdev->dev); component_master_del(&pdev->dev, &mtk_iommu_com_ops); + + for (i = 0; i < MTK_LARB_NR_MAX; i++) + put_device(data->larb_imu[i].dev); } pm_runtime_disable(&pdev->dev); for (i = 0; i < data->plat_data->banks_num; i++) {
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joanne Koong joannelkoong@gmail.com
commit 95c39eef7c2b666026c69ab5b30471da94ea2874 upstream.
When a request is terminated before it has been committed, the request is not removed from the queue's list. This leaves a dangling list entry that leads to list corruption and use-after-free issues.
Remove the request from the queue's list for terminated non-committed requests.
Signed-off-by: Joanne Koong joannelkoong@gmail.com Fixes: c090c8abae4b ("fuse: Add io-uring sqe commit and fetch support") Cc: stable@vger.kernel.org Reviewed-by: Bernd Schubert bschubert@ddn.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/dev_uring.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c index 0066c9c0a5d5..7760fe4e1f9e 100644 --- a/fs/fuse/dev_uring.c +++ b/fs/fuse/dev_uring.c @@ -86,6 +86,7 @@ static void fuse_uring_req_end(struct fuse_ring_ent *ent, struct fuse_req *req, lockdep_assert_not_held(&queue->lock); spin_lock(&queue->lock); ent->fuse_req = NULL; + list_del_init(&req->list); if (test_bit(FR_BACKGROUND, &req->flags)) { queue->active_background--; spin_lock(&fc->bg_lock);
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joanne Koong joannelkoong@gmail.com
commit bd5603eaae0aabf527bfb3ce1bb07e979ce5bd50 upstream.
Commit e26ee4efbc79 ("fuse: allocate ff->release_args only if release is needed") skips allocating ff->release_args if the server does not implement open. However in doing so, fuse_prepare_release() now skips grabbing the reference on the inode, which makes it possible for an inode to be evicted from the dcache while there are inflight readahead requests. This causes a deadlock if the server triggers reclaim while servicing the readahead request and reclaim attempts to evict the inode of the file being read ahead. Since the folio is locked during readahead, when reclaim evicts the fuse inode and fuse_evict_inode() attempts to remove all folios associated with the inode from the page cache (truncate_inode_pages_range()), reclaim will block forever waiting for the lock since readahead cannot relinquish the lock because it is itself blocked in reclaim:
stack_trace(1504735)
folio_wait_bit_common (mm/filemap.c:1308:4) folio_lock (./include/linux/pagemap.h:1052:3) truncate_inode_pages_range (mm/truncate.c:336:10) fuse_evict_inode (fs/fuse/inode.c:161:2) evict (fs/inode.c:704:3) dentry_unlink_inode (fs/dcache.c:412:3) __dentry_kill (fs/dcache.c:615:3) shrink_kill (fs/dcache.c:1060:12) shrink_dentry_list (fs/dcache.c:1087:3) prune_dcache_sb (fs/dcache.c:1168:2) super_cache_scan (fs/super.c:221:10) do_shrink_slab (mm/shrinker.c:435:9) shrink_slab (mm/shrinker.c:626:10) shrink_node (mm/vmscan.c:5951:2) shrink_zones (mm/vmscan.c:6195:3) do_try_to_free_pages (mm/vmscan.c:6257:3) do_swap_page (mm/memory.c:4136:11) handle_pte_fault (mm/memory.c:5562:10) handle_mm_fault (mm/memory.c:5870:9) do_user_addr_fault (arch/x86/mm/fault.c:1338:10) handle_page_fault (arch/x86/mm/fault.c:1481:3) exc_page_fault (arch/x86/mm/fault.c:1539:2) asm_exc_page_fault+0x22/0x27
Fix this deadlock by allocating ff->release_args and grabbing the reference on the inode when preparing the file for release even if the server does not implement open. The inode reference will be dropped when the last reference on the fuse file is dropped (see fuse_file_put() -> fuse_release_end()).
Fixes: e26ee4efbc79 ("fuse: allocate ff->release_args only if release is needed") Cc: stable@vger.kernel.org Signed-off-by: Joanne Koong joannelkoong@gmail.com Reported-by: Omar Sandoval osandov@fb.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/file.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -110,7 +110,9 @@ static void fuse_file_put(struct fuse_fi fuse_file_io_release(ff, ra->inode);
if (!args) { - /* Do nothing when server does not implement 'open' */ + /* Do nothing when server does not implement 'opendir' */ + } else if (args->opcode == FUSE_RELEASE && ff->fm->fc->no_open) { + fuse_release_end(ff->fm, args, 0); } else if (sync) { fuse_simple_request(ff->fm, args); fuse_release_end(ff->fm, args, 0); @@ -131,8 +133,17 @@ struct fuse_file *fuse_file_open(struct struct fuse_file *ff; int opcode = isdir ? FUSE_OPENDIR : FUSE_OPEN; bool open = isdir ? !fc->no_opendir : !fc->no_open; + bool release = !isdir || open;
- ff = fuse_file_alloc(fm, open); + /* + * ff->args->release_args still needs to be allocated (so we can hold an + * inode reference while there are pending inflight file operations when + * ->release() is called, see fuse_prepare_release()) even if + * fc->no_open is set else it becomes possible for reclaim to deadlock + * if while servicing the readahead request the server triggers reclaim + * and reclaim evicts the inode of the file being read ahead. + */ + ff = fuse_file_alloc(fm, release); if (!ff) return ERR_PTR(-ENOMEM);
@@ -152,13 +163,14 @@ struct fuse_file *fuse_file_open(struct fuse_file_free(ff); return ERR_PTR(err); } else { - /* No release needed */ - kfree(ff->args); - ff->args = NULL; - if (isdir) + if (isdir) { + /* No release needed */ + kfree(ff->args); + ff->args = NULL; fc->no_opendir = 1; - else + } else { fc->no_open = 1; + } } }
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cheng Ding cding@ddn.com
commit 6e0d7f7f4a43ac8868e98c87ecf48805aa8c24dd upstream.
Fix a possible reference count leak of payload pages during fuse argument copies.
[Joanne: simplified error cleanup]
Fixes: c090c8abae4b ("fuse: Add io-uring sqe commit and fetch support") Cc: stable@vger.kernel.org # v6.14 Signed-off-by: Cheng Ding cding@ddn.com Signed-off-by: Bernd Schubert bschubert@ddn.com Reviewed-by: Joanne Koong joannelkoong@gmail.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/dev.c | 2 +- fs/fuse/dev_uring.c | 5 ++++- fs/fuse/fuse_dev_i.h | 1 + 3 files changed, 6 insertions(+), 2 deletions(-)
--- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -846,7 +846,7 @@ void fuse_copy_init(struct fuse_copy_sta }
/* Unmap and put previous page of userspace buffer */ -static void fuse_copy_finish(struct fuse_copy_state *cs) +void fuse_copy_finish(struct fuse_copy_state *cs) { if (cs->currbuf) { struct pipe_buffer *buf = cs->currbuf; --- a/fs/fuse/dev_uring.c +++ b/fs/fuse/dev_uring.c @@ -599,7 +599,9 @@ static int fuse_uring_copy_from_ring(str cs.is_uring = true; cs.req = req;
- return fuse_copy_out_args(&cs, args, ring_in_out.payload_sz); + err = fuse_copy_out_args(&cs, args, ring_in_out.payload_sz); + fuse_copy_finish(&cs); + return err; }
/* @@ -650,6 +652,7 @@ static int fuse_uring_args_to_ring(struc /* copy the payload */ err = fuse_copy_args(&cs, num_args, args->in_pages, (struct fuse_arg *)in_args, 0); + fuse_copy_finish(&cs); if (err) { pr_info_ratelimited("%s fuse_copy_args failed\n", __func__); return err; --- a/fs/fuse/fuse_dev_i.h +++ b/fs/fuse/fuse_dev_i.h @@ -62,6 +62,7 @@ void fuse_dev_end_requests(struct list_h
void fuse_copy_init(struct fuse_copy_state *cs, bool write, struct iov_iter *iter); +void fuse_copy_finish(struct fuse_copy_state *cs); int fuse_copy_args(struct fuse_copy_state *cs, unsigned int numargs, unsigned int argpages, struct fuse_arg *args, int zeroing);
# Librecast Test Results
020/020 [ OK ] liblcrq 010/010 [ OK ] libmld 120/120 [ OK ] liblibrecast
CPU/kernel: Linux auntie 6.18.3-rc1-g685a8a2ad649 #1 SMP PREEMPT_DYNAMIC Mon Dec 29 16:21:54 -00 2025 x86_64 AMD Ryzen 9 9950X 16-Core Processor AuthenticAMD GNU/Linux
Tested-by: Brett A C Sheffield bacs@librecast.net
Hi
no regressions here on x86_64 (RKL, Intel 11th Gen. CPU)
Thanks
Tested-by: Ronald Warsow rwarsow@gmx.de
Hi Greg
On Tue, Dec 30, 2025 at 1:16 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.18.3 release. There are 430 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 31 Dec 2025 16:06:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.18.3-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.18.y and the diffstat can be found below.
thanks,
greg k-h
6.18.3-rc1 tested.
Build successfully completed. Boot successfully completed. No dmesg regressions. Video output normal. Sound output normal.
Lenovo ThinkPad X1 Carbon Gen10(Intel i7-1260P(x86_64) arch linux)
[ 0.000000] Linux version 6.18.3-rc1rv-g685a8a2ad649 (takeshi@ThinkPadX1Gen10J0764) (gcc (GCC) 15.2.1 20251112, GNU ld (GNU Binutils) 2.45.1) #1 SMP PREEMPT_DYNAMIC Tue Dec 30 09:19:52 JST 2025
Thanks
Tested-by: Takeshi Ogasawara takeshi.ogasawara@futuring-girl.com
linux-stable-mirror@lists.linaro.org