This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series; all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.14.7-rc1....
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.14.7-rc1
Pawan Gupta pawan.kumar.gupta@linux.intel.com selftest/x86/bugs: Add selftests for ITS
Peter Zijlstra peterz@infradead.org x86/its: Use dynamic thunks for indirect branches
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/ibt: Keep IBT disabled during alternative patching
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Align RETs in BHB clear sequence to avoid thunking
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Add support for RSB stuffing mitigation
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Add "vmexit" option to skip mitigation on some CPUs
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Enable Indirect Target Selection mitigation
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Add support for ITS-safe return thunk
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Add support for ITS-safe indirect thunk
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Enumerate Indirect Target Selection (ITS) bug
Pawan Gupta pawan.kumar.gupta@linux.intel.com Documentation: x86/bugs/its: Add ITS documentation
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/bhi: Do not set BHI_DIS_S in 32-bit mode
Daniel Sneddon daniel.sneddon@linux.intel.com x86/bpf: Add IBHF call at end of classic BPF
Daniel Sneddon daniel.sneddon@linux.intel.com x86/bpf: Call branch history clearing sequence on exit
James Morse james.morse@arm.com arm64: proton-pack: Add new CPUs 'k' values for branch mitigation
James Morse james.morse@arm.com arm64: bpf: Only mitigate cBPF programs loaded by unprivileged users
James Morse james.morse@arm.com arm64: bpf: Add BHB mitigation to the epilogue for cBPF programs
James Morse james.morse@arm.com arm64: proton-pack: Expose whether the branchy loop k value
James Morse james.morse@arm.com arm64: proton-pack: Expose whether the platform is mitigated by firmware
James Morse james.morse@arm.com arm64: insn: Add support for encoding DSB
Johannes Weiner hannes@cmpxchg.org mm: page_alloc: speed up fallbacks in rmqueue_bulk()
Johannes Weiner hannes@cmpxchg.org mm: page_alloc: don't steal single pages from biggest buddy
Hao Qin hao.qin@mediatek.com Bluetooth: btmtk: Remove the resetting step before downloading the fw
Jens Axboe axboe@kernel.dk io_uring: always arm linked timeouts prior to issue
Miguel Ojeda ojeda@kernel.org rust: clean Rust 1.88.0's `clippy::uninlined_format_args` lint
Miguel Ojeda ojeda@kernel.org rust: allow Rust 1.87.0's `clippy::ptr_eq` lint
Al Viro viro@zeniv.linux.org.uk do_umount(): add missing barrier before refcount checks in sync case
Gabriel Krisman Bertazi krisman@suse.de io_uring/sqpoll: Increase task_work submission batch size
Shuicheng Lin shuicheng.lin@intel.com drm/xe: Release force wake first then runtime power
Tejas Upadhyay tejas.upadhyay@intel.com drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regs
Samuel Holland samuel.holland@sifive.com riscv: Disallow PR_GET_TAGGED_ADDR_CTRL without Supm
Clément Léger cleger@rivosinc.com riscv: misaligned: enable IRQs while handling misaligned accesses
Clément Léger cleger@rivosinc.com riscv: misaligned: factorize trap handling
Daniel Wagner wagi@kernel.org nvme: unblock ctrl state transition for firmware update
Kevin Baker kevinb@ventureresearch.com drm/panel: simple: Update timings for AUO G101EVN010
Lizhi Xu lizhi.xu@windriver.com loop: Add sanity check for read/write_iter
Christoph Hellwig hch@lst.de loop: factor out a loop_assign_backing_file helper
Nylon Chen nylon.chen@sifive.com riscv: misaligned: Add handling for ZCB instructions
Thorsten Blum thorsten.blum@linux.dev MIPS: Fix MAX_REG_OFFSET
Karol Wachowski karol.wachowski@intel.com accel/ivpu: Correct mutex unlock order in job submission
Karol Wachowski karol.wachowski@intel.com accel/ivpu: Separate DB ID and CMDQ ID allocations from CMDQ allocation
Thomas Gleixner tglx@linutronix.de timekeeping: Prevent coarse clocks going backwards
Marco Crivellari marco.crivellari@suse.com MIPS: Move r4k_wait() to .cpuidle.text section
Marco Crivellari marco.crivellari@suse.com MIPS: Fix idle VS timer enqueue
Jonathan Cameron Jonathan.Cameron@huawei.com iio: adc: dln2: Use aligned_s64 for timestamp
Jonathan Cameron Jonathan.Cameron@huawei.com iio: accel: adxl355: Make timestamp 64-bit aligned using aligned_s64
Jonathan Cameron Jonathan.Cameron@huawei.com iio: temp: maxim-thermocouple: Fix potential lack of DMA safe buffer.
Lothar Rubusch l.rubusch@gmail.com iio: accel: adxl367: fix setting odr for activity time update
Gustavo Silva gustavograzs@gmail.com iio: imu: bmi270: fix initial sampling frequency configuration
Dave Penkler dpenkler@gmail.com usb: usbtmc: Fix erroneous generic_read ioctl return
Dave Penkler dpenkler@gmail.com usb: usbtmc: Fix erroneous wait_srq ioctl return
Dave Penkler dpenkler@gmail.com usb: usbtmc: Fix erroneous get_stb ioctl error returns
Oliver Neukum oneukum@suse.com USB: usbtmc: use interruptible sleep in usbtmc_read
Andrei Kuchynski akuchynski@chromium.org usb: typec: ucsi: displayport: Fix NULL pointer access
Andrei Kuchynski akuchynski@chromium.org usb: typec: ucsi: displayport: Fix deadlock
RD Babiera rdbabiera@google.com usb: typec: tcpm: delay SNK_TRY_WAIT_DEBOUNCE to SRC_TRYWAIT transition
Lukasz Czechowski lukasz.czechowski@thaumatec.com usb: misc: onboard_usb_dev: fix support for Cypress HX3 hubs
Jim Lin jilin@nvidia.com usb: host: tegra: Prevent host controller crash when OTG port is used
Prashanth K prashanth.k@oss.qualcomm.com usb: gadget: Use get_status callback to set remote wakeup capability
Wayne Chang waynec@nvidia.com usb: gadget: tegra-xudc: ACK ST_RC after clearing CTRL_RUN
Prashanth K prashanth.k@oss.qualcomm.com usb: gadget: f_ecm: Add get_status callback
Pawel Laszczak pawell@cadence.com usb: cdnsp: fix L1 resume issue for RTL_REVISION_NEW_LPM version
Pawel Laszczak pawell@cadence.com usb: cdnsp: Fix issue with resuming from L1
Prashanth K prashanth.k@oss.qualcomm.com usb: dwc3: gadget: Make gadget_wakeup asynchronous
Jan Kara jack@suse.cz ocfs2: stop quota recovery before disabling quotas
Jan Kara jack@suse.cz ocfs2: implement handshaking with ocfs2 recovery thread
Jan Kara jack@suse.cz ocfs2: switch osb->disable_recovery to enum
Heming Zhao heming.zhao@suse.com ocfs2: fix the issue with discontiguous allocation in the global_bitmap
Mark Tinguely mark.tinguely@oracle.com ocfs2: fix panic in failed foilio allocation
Borislav Petkov (AMD) bp@alien8.de x86/microcode: Consolidate the loader enablement checking
Dmitry Antipov dmantipov@yandex.ru module: ensure that kobject_put() is safe for module type kobjects
Tom Lendacky thomas.lendacky@amd.com memblock: Accept allocated memory before use in memblock_double_array()
Sebastian Ott sebott@redhat.com KVM: arm64: Fix uninitialized memcache pointer in user_mem_abort()
Sebastian Andrzej Siewior bigeasy@linutronix.de clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable()
Yeoreum Yun yeoreum.yun@arm.com arm64: cpufeature: Move arm64_use_ng_mappings to the .data section to prevent wrong idmap generation
Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com accel/ivpu: Increase state dump msg timeout
Jason Andryuk jason.andryuk@amd.com xenbus: Use kref to track req lifetime
John Ernberg john.ernberg@actia.se xen: swiotlb: Use swiotlb bouncing if kmalloc allocation demands it
Paul Aurich paul@darkrain42.org smb: client: Avoid race in open_cached_dir with lease breaks
Alexey Charkov alchark@gmail.com usb: uhci-platform: Make the clock really optional
Mathias Nyman mathias.nyman@linux.intel.com xhci: dbc: Avoid event polling busyloop if pending rx transfers are inactive.
Alex Deucher alexander.deucher@amd.com drm/amdgpu/hdp7: use memcfg register to post the write for HDP flush
Alex Deucher alexander.deucher@amd.com drm/amdgpu/hdp6: use memcfg register to post the write for HDP flush
Alex Deucher alexander.deucher@amd.com drm/amdgpu/hdp5: use memcfg register to post the write for HDP flush
Alex Deucher alexander.deucher@amd.com drm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flush
Alex Deucher alexander.deucher@amd.com drm/amdgpu/hdp4: use memcfg register to post the write for HDP flush
Wayne Lin Wayne.Lin@amd.com drm/amd/display: Copy AUX read reply data whenever length > 0
Wayne Lin Wayne.Lin@amd.com drm/amd/display: Fix wrong handling for AUX_DEFER case
Wayne Lin Wayne.Lin@amd.com drm/amd/display: Remove incorrect checking in dmub aux handler
Wayne Lin Wayne.Lin@amd.com drm/amd/display: Fix the checking condition in dmub aux handling
Aurabindo Pillai aurabindo.pillai@amd.com drm/amd/display: more liberal vmin/vmax update for freesync
Roman Li Roman.Li@amd.com drm/amd/display: Fix invalid context error in dml helper
Ruijing Dong ruijing.dong@amd.com drm/amdgpu/vcn: using separate VCN1_AON_SOC offset
Alex Deucher alexander.deucher@amd.com drm/amdgpu: fix pm notifier handling
Matthew Brost matthew.brost@intel.com drm/xe: Add page queue multiplier
Maíra Canal mcanal@igalia.com drm/v3d: Add job to pending list if the reset was skipped
Alex Deucher alexander.deucher@amd.com Revert "drm/amd: Stop evicting resources on APUs in suspend"
David Lechner dlechner@baylibre.com iio: pressure: mprls0025pa: use aligned_s64 for timestamp
Luca Ceresoli luca.ceresoli@bootlin.com iio: light: opt3001: fix deadlock due to concurrent flag access
Silvano Seva s.seva@4sigma.it iio: imu: st_lsm6dsx: fix possible lockup in st_lsm6dsx_read_tagged_fifo
Silvano Seva s.seva@4sigma.it iio: imu: st_lsm6dsx: fix possible lockup in st_lsm6dsx_read_fifo
David Lechner dlechner@baylibre.com iio: imu: inv_mpu6050: align buffer for timestamp
Zhang Lixu lixu.zhang@intel.com iio: hid-sensor-prox: Fix incorrect OFFSET calculation
Zhang Lixu lixu.zhang@intel.com iio: hid-sensor-prox: support multi-channel SCALE calculation
Zhang Lixu lixu.zhang@intel.com iio: hid-sensor-prox: Restore lost scale assignments
David Lechner dlechner@baylibre.com iio: chemical: pms7003: use aligned_s64 for timestamp
David Lechner dlechner@baylibre.com iio: chemical: sps30: use aligned_s64 for timestamp
Gabriel Shahrouzi gshahrouzi@gmail.com iio: adis16201: Correct inclinometer channel resolution
Simon Xue xxm@rock-chips.com iio: adc: rockchip: Fix clock initialization sequence
Angelo Dureghello adureghello@baylibre.com iio: adc: ad7606: fix serial register access
Jonathan Cameron Jonathan.Cameron@huawei.com iio: adc: ad7266: Fix potential timestamp alignment issue.
Jonathan Cameron Jonathan.Cameron@huawei.com iio: adc: ad7768-1: Fix insufficient alignment of timestamp.
Jens Axboe axboe@kernel.dk io_uring: ensure deferred completions are flushed for multishot
Nam Cao namcao@linutronix.de riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
Wayne Lin Wayne.Lin@amd.com drm/amd/display: Shift DMUB AUX reply command if necessary
Mikhail Lobanov m.lobanov@rosa.ru KVM: SVM: Forcibly leave SMM mode on SHUTDOWN interception
Sean Christopherson seanjc@google.com KVM: x86/mmu: Prevent installing hugepages when mem attributes are changing
Madhavan Srinivasan maddy@linux.ibm.com selftests/mm: fix build break when compiling pkey_util.c
Nysal Jan K.A. nysal@linux.ibm.com selftests/mm: fix a build failure on powerpc
Feng Tang feng.tang@linux.alibaba.com selftests/mm: compaction_test: support platform with huge mount of memory
Peter Xu peterx@redhat.com mm/userfaultfd: fix uninitialized output field for -EAGAIN race
Gavin Guo gavinguo@igalia.com mm/huge_memory: fix dereferencing invalid pmd migration entry
Kees Cook kees@kernel.org mm: vmalloc: support more granular vrealloc() sizing
Petr Vaněk arkamar@atlas.cz mm: fix folio_pte_batch() on XEN PV
Dave Hansen dave.hansen@linux.intel.com x86/mm: Eliminate window where TLB flushes may be inadvertently skipped
Gabriel Shahrouzi gshahrouzi@gmail.com staging: axis-fifo: Correct handling of tx_fifo_depth for size validation
Gabriel Shahrouzi gshahrouzi@gmail.com staging: axis-fifo: Remove hardware resets for user errors
Dave Stevenson dave.stevenson@raspberrypi.com staging: bcm2835-camera: Initialise dev in v4l2_dev
Gabriel Shahrouzi gshahrouzi@gmail.com staging: iio: adc: ad7816: Correct conditional logic for store mode
Naman Jain namjain@linux.microsoft.com uio_hv_generic: Fix sysfs creation path for ring buffer
Miguel Ojeda ojeda@kernel.org rust: clean Rust 1.88.0's warning about `clippy::disallowed_macros` configuration
Miguel Ojeda ojeda@kernel.org objtool/rust: add one more `noreturn` Rust function for Rust 1.87.0
Miguel Ojeda ojeda@kernel.org rust: clean Rust 1.88.0's `unnecessary_transmutes` lint
Aditya Garg gargaditya08@live.com Input: synaptics - enable InterTouch on TUXEDO InfinityBook Pro 14 v5
Dmitry Torokhov dmitry.torokhov@gmail.com Input: synaptics - enable SMBus for HP Elitebook 850 G1
Aditya Garg gargaditya08@live.com Input: synaptics - enable InterTouch on Dell Precision M3800
Aditya Garg gargaditya08@live.com Input: synaptics - enable InterTouch on Dynabook Portege X30L-G
Manuel Fombuena fombuena@outlook.com Input: synaptics - enable InterTouch on Dynabook Portege X30-D
Vicki Pfau vi@endrift.com Input: xpad - fix two controller table values
Lode Willems me@lodewillems.com Input: xpad - add support for 8BitDo Ultimate 2 Wireless Controller
Vicki Pfau vi@endrift.com Input: xpad - fix Share button on Xbox One controllers
Gary Bisson bisson.gary@gmail.com Input: mtk-pmic-keys - fix possible null pointer dereference
Mikael Gonella-Bolduc mgonellabolduc@dimonoff.com Input: cyttsp5 - fix power control issue on wakeup
Hugo Villeneuve hvilleneuve@dimonoff.com Input: cyttsp5 - ensure minimum reset pulse width
Jakub Kicinski kuba@kernel.org virtio-net: fix total qstat values
Jakub Kicinski kuba@kernel.org net: export a helper for adding up queue stats
Alexander Duyck alexanderduyck@fb.com fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready
Alexander Duyck alexanderduyck@fb.com fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
Alexander Duyck alexanderduyck@fb.com fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready
Alexander Duyck alexanderduyck@fb.com fbnic: Cleanup handling of completions
Alexander Duyck alexanderduyck@fb.com fbnic: Actually flush_tx instead of stalling out
Alexander Duyck alexanderduyck@fb.com fbnic: Gate AXI read/write enabling on FW mailbox
Alexander Duyck alexanderduyck@fb.com fbnic: Fix initialization of mailbox descriptor rings
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: do not set learning and unicast/multicast on up
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: fix learning on VLAN unaware bridges
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: fix toggling vlan_filtering
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: do not program vlans when vlan filtering is off
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: do not allow to configure VLAN 0
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: always rejoin default untagged VLAN on bridge leave
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: fix flushing old pvid VLAN on pvid change
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: fix clearing PVID of a port
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: keep CPU port always tagged again
Jonas Gorski jonas.gorski@gmail.com net: dsa: b53: allow leaky reserved multicast
Paul Chaignon paul.chaignon@gmail.com bpf: Scrub packet on bpf_redirect_peer
Jozsef Kadlecsik kadlec@netfilter.org netfilter: ipset: fix region locking in hash types
Julian Anastasov ja@ssi.bg ipvs: fix uninit-value for saddr in do_output_route4
Gao Xiang xiang@kernel.org erofs: ensure the extra temporary copy is valid for shortened bvecs
Przemek Kitszel przemyslaw.kitszel@intel.com ice: use DSN instead of PCI BDF for ice_adapter index
Michael-CY Lee michael-cy.lee@mediatek.com wifi: mac80211: fix the type of status_code for negotiated TID to Link Mapping
Oliver Hartkopp socketcan@hartkopp.net can: gw: fix RCU/BH usage in cgw_create_job()
Kelsey Maes kelsey@vpprocess.com can: mcp251xfd: fix TDC setting for low data bit rates
Antonios Salios antonios@mwa.re can: m_can: m_can_class_allocate_dev(): initialize spin lock on device probe
Frank Wunderlich frank-w@public-files.de net: ethernet: mtk_eth_soc: do not reset PSE when setting FE
Daniel Golle daniel@makrotopia.org net: ethernet: mtk_eth_soc: reset all TX queues on DMA free
Guillaume Nault gnault@redhat.com gre: Fix again IPv6 link-local address generation.
Jakub Kicinski kuba@kernel.org virtio-net: free xsk_buffs on error in virtnet_xsk_pool_enable()
Jakub Kicinski kuba@kernel.org virtio-net: don't re-enable refill work too early when NAPI is disabled
Cong Wang xiyou.wangcong@gmail.com sch_htb: make htb_deactivate() idempotent
Heiko Carstens hca@linux.ibm.com s390/entry: Fix last breaking event handling in case of stack corruption
Wang Zhaolong wangzhaolong1@huawei.com ksmbd: fix memory leak in parse_lease_state()
Eelco Chaudron echaudro@redhat.com openvswitch: Fix unsafe attribute parsing in output_userspace()
Sean Heelan seanheelan@gmail.com ksmbd: Fix UAF in __close_file_table_ids
Norbert Szetei norbert@doyensec.com ksmbd: prevent out-of-bounds stream writes by validating *pos
Namjae Jeon linkinjeon@kernel.org ksmbd: prevent rename with empty string
Marc Kleine-Budde mkl@pengutronix.de can: rockchip_canfd: rkcanfd_remove(): fix order of unregistration calls
Marc Kleine-Budde mkl@pengutronix.de can: mcp251xfd: mcp251xfd_remove(): fix order of unregistration calls
Niklas Schnelle schnelle@linux.ibm.com s390/pci: Fix duplicate pci_dev_put() in disable_slot() when PF has child VFs
Alex Williamson alex.williamson@redhat.com vfio/pci: Align huge faults to order
Veerendranath Jakkam quic_vjakkam@quicinc.com wifi: cfg80211: fix out-of-bounds access during multi-link element defragmentation
Niklas Schnelle schnelle@linux.ibm.com s390/pci: Fix missing check for zpci_create_device() error return
Marc Kleine-Budde mkl@pengutronix.de can: mcan: m_can_class_unregister(): fix order of unregistration calls
Cristian Marussi cristian.marussi@arm.com firmware: arm_scmi: Fix timeout checks on polling path
Wojciech Dubowik Wojciech.Dubowik@mt.com arm64: dts: imx8mm-verdin: Link reg_usdhc2_vqmmc to usdhc2
Qu Wenruo wqu@suse.com Revert "btrfs: canonicalize the device path before adding it"
Max Kellermann max.kellermann@ionos.com fs/erofs/fileio: call erofs_onlinefolio_split() after bio_add_folio()
Dan Carpenter dan.carpenter@linaro.org dm: add missing unlock on in dm_keyslot_evict()
-------------
Diffstat:
 .clippy.toml | 2 +-
 Documentation/ABI/testing/sysfs-devices-system-cpu | 1 +
 Documentation/admin-guide/hw-vuln/index.rst | 1 +
 .../hw-vuln/indirect-target-selection.rst | 168 ++++++++++++++++
 Documentation/admin-guide/kernel-parameters.txt | 18 ++
 Makefile | 4 +-
 arch/arm64/boot/dts/freescale/imx8mm-verdin.dtsi | 25 ++-
 arch/arm64/include/asm/cputype.h | 2 +
 arch/arm64/include/asm/insn.h | 1 +
 arch/arm64/include/asm/spectre.h | 3 +
 arch/arm64/kernel/cpufeature.c | 9 +-
 arch/arm64/kernel/proton-pack.c | 13 +-
 arch/arm64/kvm/mmu.c | 13 +-
 arch/arm64/lib/insn.c | 76 +++++---
 arch/arm64/net/bpf_jit_comp.c | 57 +++++-
 arch/mips/include/asm/idle.h | 3 +-
 arch/mips/include/asm/ptrace.h | 3 +-
 arch/mips/kernel/genex.S | 63 +++---
 arch/mips/kernel/idle.c | 7 -
 arch/riscv/kernel/process.c | 6 +
 arch/riscv/kernel/traps.c | 64 ++++---
 arch/riscv/kernel/traps_misaligned.c | 17 ++
 arch/s390/kernel/entry.S | 3 +-
 arch/s390/pci/pci_clp.c | 2 +
 arch/x86/Kconfig | 12 ++
 arch/x86/entry/entry_64.S | 20 +-
 arch/x86/include/asm/alternative.h | 24 +++
 arch/x86/include/asm/cpufeatures.h | 3 +
 arch/x86/include/asm/microcode.h | 2 +
 arch/x86/include/asm/msr-index.h | 8 +
 arch/x86/include/asm/nospec-branch.h | 10 +
 arch/x86/kernel/alternative.c | 195 ++++++++++++++++++-
 arch/x86/kernel/cpu/bugs.c | 176 ++++++++++++++++-
 arch/x86/kernel/cpu/common.c | 72 +++++--
 arch/x86/kernel/cpu/microcode/amd.c | 6 +-
 arch/x86/kernel/cpu/microcode/core.c | 60 +++---
 arch/x86/kernel/cpu/microcode/intel.c | 2 +-
 arch/x86/kernel/cpu/microcode/internal.h | 1 -
 arch/x86/kernel/ftrace.c | 2 +-
 arch/x86/kernel/head32.c | 4 -
 arch/x86/kernel/module.c | 6 +
 arch/x86/kernel/static_call.c | 4 +-
 arch/x86/kernel/vmlinux.lds.S | 10 +
 arch/x86/kvm/mmu/mmu.c | 89 ++++++---
 arch/x86/kvm/smm.c | 1 +
 arch/x86/kvm/svm/svm.c | 4 +
 arch/x86/kvm/x86.c | 4 +-
 arch/x86/lib/retpoline.S | 39 ++++
 arch/x86/mm/tlb.c | 23 ++-
 arch/x86/net/bpf_jit_comp.c | 58 +++++-
 drivers/accel/ivpu/ivpu_hw.c | 2 +-
 drivers/accel/ivpu/ivpu_job.c | 90 ++++++---
 drivers/base/cpu.c | 3 +
 drivers/block/loop.c | 43 ++++-
 drivers/bluetooth/btmtk.c | 10 -
 drivers/clocksource/i8253.c | 4 +-
 drivers/firmware/arm_scmi/driver.c | 13 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 18 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 29 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h | 1 -
 drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c | 7 +-
 drivers/gpu/drm/amd/amdgpu/hdp_v5_0.c | 7 +-
 drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c | 12 +-
 drivers/gpu/drm/amd/amdgpu/hdp_v6_0.c | 7 +-
 drivers/gpu/drm/amd/amdgpu/hdp_v7_0.c | 7 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 1 +
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 1 +
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 1 +
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 4 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 1 +
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c | 1 +
 drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c | 3 +-
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++--
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 28 ++-
 .../amd/display/dc/dml2/dml2_translation_helper.c | 14 +-
 drivers/gpu/drm/panel/panel-simple.c | 25 +--
 drivers/gpu/drm/v3d/v3d_sched.c | 28 ++-
 drivers/gpu/drm/xe/tests/xe_mocs.c | 7 +-
 drivers/gpu/drm/xe/xe_gt_debugfs.c | 9 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c | 11 +-
 drivers/hv/hyperv_vmbus.h | 6 +
 drivers/hv/vmbus_drv.c | 100 +++++++++-
 drivers/iio/accel/adis16201.c | 4 +-
 drivers/iio/accel/adxl355_core.c | 2 +-
 drivers/iio/accel/adxl367.c | 10 +-
 drivers/iio/adc/ad7266.c | 2 +-
 drivers/iio/adc/ad7606_spi.c | 2 +-
 drivers/iio/adc/ad7768-1.c | 2 +-
 drivers/iio/adc/dln2-adc.c | 2 +-
 drivers/iio/adc/rockchip_saradc.c | 17 +-
 drivers/iio/chemical/pms7003.c | 5 +-
 drivers/iio/chemical/sps30.c | 2 +-
 .../iio/common/hid-sensors/hid-sensor-attributes.c | 4 +
 drivers/iio/imu/bmi270/bmi270_core.c | 6 +-
 drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c | 2 +-
 drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c | 6 +
 drivers/iio/light/hid-sensor-prox.c | 22 ++-
 drivers/iio/light/opt3001.c | 5 +-
 drivers/iio/pressure/mprls0025pa.h | 17 +-
 drivers/iio/temperature/maxim_thermocouple.c | 2 +-
 drivers/input/joystick/xpad.c | 40 ++--
 drivers/input/keyboard/mtk-pmic-keys.c | 4 +-
 drivers/input/mouse/synaptics.c | 5 +
 drivers/input/touchscreen/cyttsp5.c | 7 +-
 drivers/md/dm-table.c | 3 +-
 drivers/net/can/m_can/m_can.c | 3 +-
 drivers/net/can/rockchip/rockchip_canfd-core.c | 2 +-
 drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c | 42 +++-
 drivers/net/dsa/b53/b53_common.c | 213 +++++++++++++++------
 drivers/net/dsa/b53/b53_priv.h | 3 +
 drivers/net/dsa/bcm_sf2.c | 1 +
 drivers/net/ethernet/intel/ice/ice_adapter.c | 47 ++---
 drivers/net/ethernet/intel/ice/ice_adapter.h | 6 +-
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 19 +-
 drivers/net/ethernet/meta/fbnic/fbnic_csr.h | 2 +
 drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 197 +++++++++++--------
 drivers/net/ethernet/meta/fbnic/fbnic_mac.c | 6 -
 drivers/net/virtio_net.c | 23 ++-
 drivers/nvme/host/core.c | 3 +-
 drivers/pci/hotplug/s390_pci_hpc.c | 1 -
 drivers/staging/axis-fifo/axis-fifo.c | 14 +-
 drivers/staging/iio/adc/ad7816.c | 2 +-
 .../vc04_services/bcm2835-camera/bcm2835-camera.c | 1 +
 drivers/uio/uio_hv_generic.c | 39 ++--
 drivers/usb/cdns3/cdnsp-gadget.c | 31 +++
 drivers/usb/cdns3/cdnsp-gadget.h | 6 +
 drivers/usb/cdns3/cdnsp-pci.c | 12 +-
 drivers/usb/cdns3/cdnsp-ring.c | 3 +-
 drivers/usb/cdns3/core.h | 3 +
 drivers/usb/class/usbtmc.c | 59 +++---
 drivers/usb/dwc3/core.h | 4 +
 drivers/usb/dwc3/gadget.c | 60 +++---
 drivers/usb/gadget/composite.c | 12 +-
 drivers/usb/gadget/function/f_ecm.c | 7 +
 drivers/usb/gadget/udc/tegra-xudc.c | 4 +
 drivers/usb/host/uhci-platform.c | 2 +-
 drivers/usb/host/xhci-dbgcap.c | 19 +-
 drivers/usb/host/xhci-dbgcap.h | 3 +
 drivers/usb/host/xhci-tegra.c | 3 +
 drivers/usb/misc/onboard_usb_dev.c | 10 +-
 drivers/usb/typec/tcpm/tcpm.c | 2 +-
 drivers/usb/typec/ucsi/displayport.c | 21 +-
 drivers/usb/typec/ucsi/ucsi.c | 34 ++++
 drivers/usb/typec/ucsi/ucsi.h | 2 +
 drivers/vfio/pci/vfio_pci_core.c | 12 +-
 drivers/xen/swiotlb-xen.c | 1 +
 drivers/xen/xenbus/xenbus.h | 2 +
 drivers/xen/xenbus/xenbus_comms.c | 9 +-
 drivers/xen/xenbus/xenbus_dev_frontend.c | 2 +-
 drivers/xen/xenbus/xenbus_xs.c | 18 +-
 fs/btrfs/volumes.c | 91 +--------
 fs/erofs/fileio.c | 4 +-
 fs/erofs/zdata.c | 29 ++-
 fs/namespace.c | 3 +-
 fs/ocfs2/alloc.c | 1 +
 fs/ocfs2/journal.c | 80 +++++---
 fs/ocfs2/journal.h | 1 +
 fs/ocfs2/ocfs2.h | 17 +-
 fs/ocfs2/quota_local.c | 9 +-
 fs/ocfs2/suballoc.c | 38 +++-
 fs/ocfs2/suballoc.h | 1 +
 fs/ocfs2/super.c | 3 +
 fs/smb/client/cached_dir.c | 10 +-
 fs/smb/server/oplock.c | 7 +-
 fs/smb/server/smb2pdu.c | 5 +
 fs/smb/server/vfs.c | 7 +
 fs/smb/server/vfs_cache.c | 33 +++-
 fs/userfaultfd.c | 28 ++-
 include/linux/cpu.h | 2 +
 include/linux/execmem.h | 3 +
 include/linux/hyperv.h | 6 +
 include/linux/ieee80211.h | 2 +-
 include/linux/module.h | 5 +
 include/linux/timekeeper_internal.h | 8 +-
 include/linux/vmalloc.h | 1 +
 include/net/netdev_queues.h | 6 +
 init/Kconfig | 3 +
 io_uring/io_uring.c | 58 +++---
 io_uring/sqpoll.c | 2 +-
 kernel/params.c | 4 +-
 kernel/time/timekeeping.c | 50 ++++-
 kernel/time/vsyscall.c | 4 +-
 mm/huge_memory.c | 11 +-
 mm/internal.h | 27 ++-
 mm/memblock.c | 9 +-
 mm/page_alloc.c | 159 +++++++++------
 mm/vmalloc.c | 31 ++-
 net/can/gw.c | 151 +++++++++------
 net/core/filter.c | 1 +
 net/core/netdev-genl.c | 69 +++++--
 net/ipv6/addrconf.c | 15 +-
 net/mac80211/mlme.c | 12 +-
 net/netfilter/ipset/ip_set_hash_gen.h | 2 +-
 net/netfilter/ipvs/ip_vs_xmit.c | 27 +--
 net/openvswitch/actions.c | 3 +-
 net/sched/sch_htb.c | 15 +-
 net/wireless/scan.c | 2 +-
 rust/bindings/lib.rs | 1 +
 rust/kernel/alloc/kvec.rs | 3 +
 rust/kernel/list.rs | 3 +
 rust/kernel/str.rs | 46 ++---
 rust/macros/module.rs | 19 +-
 rust/macros/paste.rs | 2 +-
 rust/macros/pinned_drop.rs | 3 +-
 rust/uapi/lib.rs | 1 +
 tools/objtool/check.c | 1 +
 tools/testing/selftests/Makefile | 1 +
 tools/testing/selftests/mm/compaction_test.c | 19 +-
 tools/testing/selftests/mm/pkey-powerpc.h | 14 +-
 tools/testing/selftests/mm/pkey_util.c | 1 +
 tools/testing/selftests/x86/bugs/Makefile | 3 +
 tools/testing/selftests/x86/bugs/common.py | 164 ++++++++++++++++
 .../selftests/x86/bugs/its_indirect_alignment.py | 150 +++++++++++++++
 .../testing/selftests/x86/bugs/its_permutations.py | 109 +++++++++++
 .../selftests/x86/bugs/its_ret_alignment.py | 139 ++++++++++++++
 tools/testing/selftests/x86/bugs/its_sysfs.py | 65 +++++++
 218 files changed, 3603 insertions(+), 1269 deletions(-)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit 650266ac4c7230c89bcd1307acf5c9c92cfa85e2 upstream.
We need to call dm_put_live_table() even if dm_get_live_table() returns NULL.
Fixes: 9355a9eb21a5 ("dm: support key eviction from keyslot managers of underlying devices")
Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Signed-off-by: Mikulas Patocka mpatocka@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/md/dm-table.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1178,7 +1178,7 @@ static int dm_keyslot_evict(struct blk_c

 	t = dm_get_live_table(md, &srcu_idx);
 	if (!t)
-		return 0;
+		goto put_live_table;

 	for (unsigned int i = 0; i < t->num_targets; i++) {
 		struct dm_target *ti = dm_table_get_target(t, i);
@@ -1189,6 +1189,7 @@ static int dm_keyslot_evict(struct blk_c
 					  (void *)key);
 	}

+put_live_table:
 	dm_put_live_table(md, srcu_idx);
 	return 0;
 }
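
For readers who want to see why the early return matters, here is a minimal userspace sketch of the get/put imbalance. It assumes, as the commit message implies, that the get takes a read-side lock even when it returns NULL. Every name in it (mock_md, mock_get_live_table(), mock_put_live_table(), evict_buggy(), evict_fixed(), leaked_depth()) is hypothetical and only models the shape of the kernel code; none of it is the dm API itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mapped device and its read-side lock depth. */
struct mock_md {
	int read_lock_depth;	/* > 0 while a reader still holds the lock */
	const int *live_table;	/* may be NULL when no table is loaded */
};

/* Models the get: always takes the lock, may still return NULL. */
const int *mock_get_live_table(struct mock_md *md)
{
	md->read_lock_depth++;
	return md->live_table;
}

/* Models the put: drops the lock taken by the matching get. */
void mock_put_live_table(struct mock_md *md)
{
	assert(md->read_lock_depth > 0);
	md->read_lock_depth--;
}

/* Pre-patch shape: the early return skips the put when there is no table. */
int evict_buggy(struct mock_md *md)
{
	const int *t = mock_get_live_table(md);

	if (!t)
		return 0;
	mock_put_live_table(md);
	return 0;
}

/* Post-patch shape: every exit path goes through the put. */
int evict_fixed(struct mock_md *md)
{
	const int *t = mock_get_live_table(md);

	if (!t)
		goto put_live_table;
	/* ... the real code walks the targets of t here ... */
put_live_table:
	mock_put_live_table(md);
	return 0;
}

/* Lock depth left behind after one call on a device with no live table. */
int leaked_depth(int (*evict)(struct mock_md *))
{
	struct mock_md md = { .read_lock_depth = 0, .live_table = NULL };

	evict(&md);
	return md.read_lock_depth;
}
```

In this model leaked_depth(evict_buggy) leaves the lock held once, while leaked_depth(evict_fixed) returns to zero.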
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Kellermann max.kellermann@ionos.com
commit bbfe756dc3062c1e934f06e5ba39c239aa953b92 upstream.
If bio_add_folio() fails (because it is full), erofs_fileio_scan_folio() needs to submit the I/O request via erofs_fileio_rq_submit() and allocate a new I/O request with an empty `struct bio`. Then it retries the bio_add_folio() call.
However, at this point, erofs_onlinefolio_split() has already been called, which increments `folio->private`; the retry will call erofs_onlinefolio_split() again, but there will never be a matching erofs_onlinefolio_end() call. This leaves the folio locked forever and all waiters will be stuck in folio_wait_bit_common().
This bug was introduced by commit ce63cb62d794 ("erofs: support unencoded inodes for fileio"), but was practically unreachable because there was room for 256 folios in the `struct bio`, until commit 9f74ae8c9ac9 ("erofs: shorten bvecs[] for file-backed mounts") reduced the array capacity to 16 folios.
It is now trivial to trigger the bug by manually invoking readahead from userspace, e.g.:
posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);
This should be fixed by invoking erofs_onlinefolio_split() only after bio_add_folio() has succeeded. This is safe: asynchronous completions invoking erofs_onlinefolio_end() will not unlock the folio because erofs_fileio_scan_folio() is still holding a reference to be released by erofs_onlinefolio_end() at the end.
Fixes: ce63cb62d794 ("erofs: support unencoded inodes for fileio")
Fixes: 9f74ae8c9ac9 ("erofs: shorten bvecs[] for file-backed mounts")
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann max.kellermann@ionos.com
Reviewed-by: Gao Xiang xiang@kernel.org
Tested-by: Hongbo Li lihongbo22@huawei.com
Link: https://lore.kernel.org/r/20250428230933.3422273-1-max.kellermann@ionos.com
Signed-off-by: Gao Xiang xiang@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/erofs/fileio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/erofs/fileio.c
+++ b/fs/erofs/fileio.c
@@ -150,10 +150,10 @@ io_retry:
 				io->rq->bio.bi_iter.bi_sector = io->dev.m_pa >> 9;
 				attached = 0;
 			}
-			if (!attached++)
-				erofs_onlinefolio_split(folio);
 			if (!bio_add_folio(&io->rq->bio, folio, len, cur))
 				goto io_retry;
+			if (!attached++)
+				erofs_onlinefolio_split(folio);
 			io->dev.m_pa += len;
 		}
 		cur += len;
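
The effect of the reordering can be sketched with a small userspace model that counts split and end calls for one folio. Everything here is hypothetical (mock_bio, mock_bio_add(), unmatched_splits(), FRESH_BIO_SLOTS); it ignores the extra reference that the scan function itself holds, and it assumes that completing a bio ends the folio once per time the folio was added to it.

```c
#include <assert.h>
#include <stdbool.h>

#define FRESH_BIO_SLOTS 16	/* capacity quoted in the commit message */

/* Hypothetical stand-in for a bio with a fixed number of free slots. */
struct mock_bio {
	int free_slots;
	int folio_refs;	/* how many times our folio was added to this bio */
};

/* Models the add: fails when the bio is already full. */
bool mock_bio_add(struct mock_bio *bio)
{
	assert(bio->free_slots >= 0);
	if (bio->free_slots == 0)
		return false;
	bio->free_slots--;
	bio->folio_refs++;
	return true;
}

/*
 * Add one folio to a bio that may already be full. Returns the number of
 * split calls that never get a matching end; non-zero means the folio
 * would stay locked.
 */
int unmatched_splits(int free_slots_in_first_bio, bool split_before_add)
{
	struct mock_bio bio = { .free_slots = free_slots_in_first_bio };
	int splits = 0, ends = 0, attached = 0;

	for (;;) {
		if (split_before_add && !attached++)
			splits++;		/* pre-patch ordering */
		if (mock_bio_add(&bio)) {
			if (!split_before_add && !attached++)
				splits++;	/* post-patch ordering */
			break;
		}
		/* Bio full: "submit" it, then retry with a fresh one. */
		ends += bio.folio_refs;	/* our folio is not in it, so 0 */
		bio = (struct mock_bio){ .free_slots = FRESH_BIO_SLOTS };
		attached = 0;
	}
	ends += bio.folio_refs;	/* the final bio completes: one end */
	return splits - ends;
}
```

With a full first bio, the pre-patch ordering leaves one split unmatched, and the post-patch ordering leaves none. When the first bio has room, both orderings balance.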
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit 8fb1dcbbcc1ffe6ed7cf3f0f96d2737491dd1fbf upstream.
This reverts commit 7e06de7c83a746e58d4701e013182af133395188.
Commit 7e06de7c83a7 ("btrfs: canonicalize the device path before adding it") tries to make btrfs use the "/dev/mapper/*" name first, then any filename inside "/dev/", as the device path.
This is mostly fine when only the root namespace is involved, but when multiple namespaces are involved, things can easily go wrong for the d_path() usage.
As d_path() returns a file path that is namespace dependent, the resulting string may not make any sense in another namespace.
Furthermore, the "/dev/" prefix check itself is not reliable: one can still make a valid initramfs without devtmpfs and fill in all needed device nodes manually.
Overall, userspace is free to pass whatever device path it likes for mount, and we are not going to win the war trying to cover every corner case.
So just revert that commit, and do no extra d_path() based file path sanity check.
CC: stable@vger.kernel.org # 6.12+ Link: https://lore.kernel.org/linux-fsdevel/20250115185608.GA2223535@zen.localdoma... Reviewed-by: Boris Burkov boris@bur.io Signed-off-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/volumes.c | 91 ----------------------------------------------------- 1 file changed, 1 insertion(+), 90 deletions(-)
--- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -733,82 +733,6 @@ const u8 *btrfs_sb_fsid_ptr(const struct return has_metadata_uuid ? sb->metadata_uuid : sb->fsid; }
-/* - * We can have very weird soft links passed in. - * One example is "/proc/self/fd/<fd>", which can be a soft link to - * a block device. - * - * But it's never a good idea to use those weird names. - * Here we check if the path (not following symlinks) is a good one inside - * "/dev/". - */ -static bool is_good_dev_path(const char *dev_path) -{ - struct path path = { .mnt = NULL, .dentry = NULL }; - char *path_buf = NULL; - char *resolved_path; - bool is_good = false; - int ret; - - if (!dev_path) - goto out; - - path_buf = kmalloc(PATH_MAX, GFP_KERNEL); - if (!path_buf) - goto out; - - /* - * Do not follow soft link, just check if the original path is inside - * "/dev/". - */ - ret = kern_path(dev_path, 0, &path); - if (ret) - goto out; - resolved_path = d_path(&path, path_buf, PATH_MAX); - if (IS_ERR(resolved_path)) - goto out; - if (strncmp(resolved_path, "/dev/", strlen("/dev/"))) - goto out; - is_good = true; -out: - kfree(path_buf); - path_put(&path); - return is_good; -} - -static int get_canonical_dev_path(const char *dev_path, char *canonical) -{ - struct path path = { .mnt = NULL, .dentry = NULL }; - char *path_buf = NULL; - char *resolved_path; - int ret; - - if (!dev_path) { - ret = -EINVAL; - goto out; - } - - path_buf = kmalloc(PATH_MAX, GFP_KERNEL); - if (!path_buf) { - ret = -ENOMEM; - goto out; - } - - ret = kern_path(dev_path, LOOKUP_FOLLOW, &path); - if (ret) - goto out; - resolved_path = d_path(&path, path_buf, PATH_MAX); - if (IS_ERR(resolved_path)) { - ret = PTR_ERR(resolved_path); - goto out; - } - ret = strscpy(canonical, resolved_path, PATH_MAX); -out: - kfree(path_buf); - path_put(&path); - return ret; -} - static bool is_same_device(struct btrfs_device *device, const char *new_path) { struct path old = { .mnt = NULL, .dentry = NULL }; @@ -1513,23 +1437,12 @@ struct btrfs_device *btrfs_scan_one_devi bool new_device_added = false; struct btrfs_device *device = NULL; struct file *bdev_file; - char *canonical_path = NULL; u64 bytenr; 
dev_t devt; int ret;
lockdep_assert_held(&uuid_mutex);
- if (!is_good_dev_path(path)) { - canonical_path = kmalloc(PATH_MAX, GFP_KERNEL); - if (canonical_path) { - ret = get_canonical_dev_path(path, canonical_path); - if (ret < 0) { - kfree(canonical_path); - canonical_path = NULL; - } - } - } /* * Avoid an exclusive open here, as the systemd-udev may initiate the * device scan which may race with the user's mount or mkfs command, @@ -1574,8 +1487,7 @@ struct btrfs_device *btrfs_scan_one_devi goto free_disk_super; }
- device = device_list_add(canonical_path ? : path, disk_super, - &new_device_added); + device = device_list_add(path, disk_super, &new_device_added); if (!IS_ERR(device) && new_device_added) btrfs_free_stale_devices(device->devt, device);
@@ -1584,7 +1496,6 @@ free_disk_super:
error_bdev_put: fput(bdev_file); - kfree(canonical_path);
return device; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wojciech Dubowik Wojciech.Dubowik@mt.com
commit 5591ce0069ddda97cdbbea596bed53e698f399c2 upstream.
Define vqmmc regulator-gpio for usdhc2 with vin-supply coming from LDO5.
Without this definition LDO5 will be powered down, disabling SD card after bootup. This has been introduced in commit f5aab0438ef1 ("regulator: pca9450: Fix enable register for LDO5").
Fixes: 6a57f224f734 ("arm64: dts: freescale: add initial support for verdin imx8m mini") Fixes: f5aab0438ef1 ("regulator: pca9450: Fix enable register for LDO5") Tested-by: Manuel Traut manuel.traut@mt.com Reviewed-by: Philippe Schenker philippe.schenker@impulsing.ch Tested-by: Francesco Dolcini francesco.dolcini@toradex.com Reviewed-by: Francesco Dolcini francesco.dolcini@toradex.com Cc: stable@vger.kernel.org Signed-off-by: Wojciech Dubowik Wojciech.Dubowik@mt.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/freescale/imx8mm-verdin.dtsi | 25 ++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-)
--- a/arch/arm64/boot/dts/freescale/imx8mm-verdin.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8mm-verdin.dtsi @@ -165,6 +165,19 @@ startup-delay-us = <20000>; };
+ reg_usdhc2_vqmmc: regulator-usdhc2-vqmmc { + compatible = "regulator-gpio"; + pinctrl-names = "default"; + pinctrl-0 = <&pinctrl_usdhc2_vsel>; + gpios = <&gpio1 4 GPIO_ACTIVE_HIGH>; + regulator-max-microvolt = <3300000>; + regulator-min-microvolt = <1800000>; + states = <1800000 0x1>, + <3300000 0x0>; + regulator-name = "PMIC_USDHC_VSELECT"; + vin-supply = <&reg_nvcc_sd>; + }; + reserved-memory { #address-cells = <2>; #size-cells = <2>; @@ -290,7 +303,7 @@ "SODIMM_19", "", "", - "", + "PMIC_USDHC_VSELECT", "", "", "", @@ -806,6 +819,7 @@ pinctrl-2 = <&pinctrl_usdhc2_200mhz>, <&pinctrl_usdhc2_cd>; pinctrl-3 = <&pinctrl_usdhc2_sleep>, <&pinctrl_usdhc2_cd_sleep>; vmmc-supply = <&reg_usdhc2_vmmc>; + vqmmc-supply = <&reg_usdhc2_vqmmc>; };
&wdog1 { @@ -1227,13 +1241,17 @@ <MX8MM_IOMUXC_NAND_CLE_GPIO3_IO5 0x6>; /* SODIMM 76 */ };
+ pinctrl_usdhc2_vsel: usdhc2vselgrp { + fsl,pins = + <MX8MM_IOMUXC_GPIO1_IO04_GPIO1_IO4 0x10>; /* PMIC_USDHC_VSELECT */ + }; + /* * Note: Due to ERR050080 we use discrete external on-module resistors pulling-up to the * on-module +V3.3_1.8_SD (LDO5) rail and explicitly disable the internal pull-ups here. */ pinctrl_usdhc2: usdhc2grp { fsl,pins = - <MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x10>, <MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x90>, /* SODIMM 78 */ <MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x90>, /* SODIMM 74 */ <MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x90>, /* SODIMM 80 */ @@ -1244,7 +1262,6 @@
pinctrl_usdhc2_100mhz: usdhc2-100mhzgrp { fsl,pins = - <MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x10>, <MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x94>, <MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x94>, <MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x94>, @@ -1255,7 +1272,6 @@
pinctrl_usdhc2_200mhz: usdhc2-200mhzgrp { fsl,pins = - <MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x10>, <MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x96>, <MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x96>, <MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x96>, @@ -1267,7 +1283,6 @@ /* Avoid backfeeding with removed card power */ pinctrl_usdhc2_sleep: usdhc2slpgrp { fsl,pins = - <MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x0>, <MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x0>, <MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x0>, <MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x0>,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cristian Marussi cristian.marussi@arm.com
commit c23c03bf1faa1e76be1eba35bad6da6a2a7c95ee upstream.
Polling mode transactions wait for a reply busy-looping without holding a spinlock, but currently the timeout checks are based only on elapsed time: as a result we could hit a false positive whenever our busy-looping thread is pre-empted and scheduled out for a time greater than the polling timeout.
Change the checks at the end of the busy-loop so that a timeout is reported only if the polling did not actually succeed and no out-of-order reply caused the polling to be forcibly terminated.
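The idea can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the driver code: `struct fake_xfer`, the integer fake clock, and the helper names are hypothetical stand-ins for the transport's poll_done() result, the completion, and ktime. The point it shows is that the final decision looks at whether a reply arrived, not at the clock.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the state the real code reads from the transport. */
struct fake_xfer {
	bool hw_done;	/* what a transport poll_done() would report */
	bool ooo_done;	/* completion signalled by an out-of-order reply */
};

/* Returns true when spinning should stop; records an out-of-order completion. */
static bool xfer_done_no_timeout(const struct fake_xfer *x, uint64_t now,
				 uint64_t stop, bool *ooo)
{
	return x->hw_done || (*ooo = x->ooo_done) || now > stop;
}

/* Returns 0 if a reply was seen, -ETIMEDOUT otherwise. */
static int wait_for_reply(const struct fake_xfer *x, uint64_t now, uint64_t stop)
{
	bool ooo = false;

	/* Stand-in for the busy-loop: each iteration advances the fake clock. */
	while (!xfer_done_no_timeout(x, now, stop, &ooo))
		now++;

	/*
	 * Decide based on the reply state, not on elapsed time: a thread that
	 * was scheduled out past the deadline may still find a valid reply.
	 */
	if (!ooo && !x->hw_done)
		return -ETIMEDOUT;
	return 0;
}
```

With this shape, a caller that resumes after the deadline with `hw_done` already set gets 0 rather than a spurious timeout.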
Fixes: 31d2f803c19c ("firmware: arm_scmi: Add sync_cmds_completed_on_ret transport flag") Reported-by: Huangjie huangjie1663@phytium.com.cn Closes: https://lore.kernel.org/arm-scmi/20250123083323.2363749-1-jackhuang021@gmail... Signed-off-by: Cristian Marussi cristian.marussi@arm.com Cc: stable@vger.kernel.org # 5.18.x Message-Id: 20250310175800.1444293-1-cristian.marussi@arm.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/arm_scmi/driver.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/drivers/firmware/arm_scmi/driver.c +++ b/drivers/firmware/arm_scmi/driver.c @@ -1248,7 +1248,8 @@ static void xfer_put(const struct scmi_p }
static bool scmi_xfer_done_no_timeout(struct scmi_chan_info *cinfo, - struct scmi_xfer *xfer, ktime_t stop) + struct scmi_xfer *xfer, ktime_t stop, + bool *ooo) { struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
@@ -1257,7 +1258,7 @@ static bool scmi_xfer_done_no_timeout(st * in case of out-of-order receptions of delayed responses */ return info->desc->ops->poll_done(cinfo, xfer) || - try_wait_for_completion(&xfer->done) || + (*ooo = try_wait_for_completion(&xfer->done)) || ktime_after(ktime_get(), stop); }
@@ -1274,15 +1275,17 @@ static int scmi_wait_for_reply(struct de * itself to support synchronous commands replies. */ if (!desc->sync_cmds_completed_on_ret) { + bool ooo = false; + /* * Poll on xfer using transport provided .poll_done(); * assumes no completion interrupt was available. */ ktime_t stop = ktime_add_ms(ktime_get(), timeout_ms);
- spin_until_cond(scmi_xfer_done_no_timeout(cinfo, - xfer, stop)); - if (ktime_after(ktime_get(), stop)) { + spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, + stop, &ooo)); + if (!ooo && !info->desc->ops->poll_done(cinfo, xfer)) { dev_err(dev, "timed out in resp(caller: %pS) - polling\n", (void *)_RET_IP_);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
commit 0713a1b3276b98c7dafbeefef00d7bc3a9119a84 upstream.
If a driver is removed, the driver framework invokes the driver's remove callback. A CAN driver's remove function calls unregister_candev(), which calls net_device_ops::ndo_stop further down in the call stack for interfaces which are in the "up" state.
The removal of the module causes a warning, as can_rx_offload_del() deletes the NAPI, while it is still active, because the interface is still up.
To fix the warning, first unregister the network interface, which calls net_device_ops::ndo_stop, which disables the NAPI, and then call can_rx_offload_del().
Fixes: 1be37d3b0414 ("can: m_can: fix periph RX path: use rx-offload to ensure skbs are sent from softirq context") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250502-can-rx-offload-del-v1-3-59a9b131589d@pengu... Reviewed-by: Markus Schneider-Pargmann msp@baylibre.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/m_can/m_can.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -2463,9 +2463,9 @@ EXPORT_SYMBOL_GPL(m_can_class_register);
void m_can_class_unregister(struct m_can_classdev *cdev) { + unregister_candev(cdev->net); if (cdev->is_peripheral) can_rx_offload_del(&cdev->offload); - unregister_candev(cdev->net); } EXPORT_SYMBOL_GPL(m_can_class_unregister);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niklas Schnelle schnelle@linux.ibm.com
commit 42420c50c68f3e95e90de2479464f420602229fc upstream.
The zpci_create_device() function returns an error pointer that needs to be checked before dereferencing it as a struct zpci_dev pointer. Add the missing check in __clp_add() where it was missed when adding the scan_list in the fixed commit. Simply not adding the device to the scan list results in the previous behavior.
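For readers less familiar with the error-pointer convention, the following is a minimal user-space sketch of the check being added. The helpers `err_ptr()` and `is_err()` and the `struct dev` type are hypothetical re-creations for illustration; the kernel's own ERR_PTR()/IS_ERR() live in its headers and are not reproduced here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (illustrative helper). */
static inline void *err_ptr(long error)
{
	return (void *)error;
}

/* True if the pointer falls in the top SKETCH_MAX_ERRNO addresses. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct dev {
	int id;
	struct dev *next;
};

/* Hypothetical constructor: fails with an error pointer for a negative id. */
static struct dev *create_dev(struct dev *storage, int id)
{
	if (id < 0)
		return err_ptr(-EINVAL);
	storage->id = id;
	storage->next = NULL;
	return storage;
}

/* Only link the device in after checking for an error pointer. */
static bool add_to_list(struct dev **head, struct dev *d)
{
	if (is_err(d))
		return false;	/* skip it; never dereference an error pointer */
	d->next = *head;
	*head = d;
	return true;
}
```

Skipping the failed device leaves the list exactly as it was, which mirrors the "previous behavior" the commit message refers to.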
Cc: stable@vger.kernel.org Fixes: 0467cdde8c43 ("s390/pci: Sort PCI functions prior to creating virtual busses") Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com Reviewed-by: Gerd Bayer gbayer@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/pci/pci_clp.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/s390/pci/pci_clp.c +++ b/arch/s390/pci/pci_clp.c @@ -427,6 +427,8 @@ static void __clp_add(struct clp_fh_list return; } zdev = zpci_create_device(entry->fid, entry->fh, entry->config_state); + if (IS_ERR(zdev)) + return; list_add_tail(&zdev->entry, scan_list); }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Veerendranath Jakkam quic_vjakkam@quicinc.com
commit 023c1f2f0609218103cbcb48e0104b144d4a16dc upstream.
Currently, during the multi-link element defragmentation process, the multi-link element length is added to the total IEs length when calculating the length of the remaining IEs after the multi-link element in cfg80211_defrag_mle(). This could lead to out-of-bounds access if the multi-link element or its corresponding fragment elements are the last elements in the IEs buffer.
To address this issue, correctly calculate the remaining IEs length by deducting the multi-link element end offset from total IEs end offset.
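The arithmetic can be checked with a small user-space sketch. The `struct element` layout below (one byte id, one byte length, then data) and both helper names are assumptions made for illustration; the two functions simply transcribe the old and new length expressions from the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed element layout: id, length, then 'datalen' bytes of data. */
struct element {
	uint8_t id;
	uint8_t datalen;
	uint8_t data[];
};

/* Old expression: grows with the element length instead of shrinking. */
static size_t remaining_len_old(size_t ielen, const struct element *mle)
{
	return ielen - sizeof(*mle) + mle->datalen;
}

/* New expression: bytes between the end of the element and the end of the buffer. */
static size_t remaining_len_fixed(const uint8_t *ie, size_t ielen,
				  const struct element *mle)
{
	return (size_t)(ie + ielen - mle->data - mle->datalen);
}
```

For an 11-byte buffer whose multi-link element starts at offset 3 with 4 data bytes, the fixed expression yields 2, while the old one yields 13, which is more than the whole buffer holds.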
Cc: stable@vger.kernel.org Fixes: 2481b5da9c6b ("wifi: cfg80211: handle BSS data contained in ML probe responses") Signed-off-by: Veerendranath Jakkam quic_vjakkam@quicinc.com Link: https://patch.msgid.link/20250424-fix_mle_defragmentation_oob_access-v1-1-84... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/wireless/scan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/wireless/scan.c +++ b/net/wireless/scan.c @@ -2681,7 +2681,7 @@ cfg80211_defrag_mle(const struct element /* Required length for first defragmentation */ buf_len = mle->datalen - 1; for_each_element(elem, mle->data + mle->datalen, - ielen - sizeof(*mle) + mle->datalen) { + ie + ielen - mle->data - mle->datalen) { if (elem->id != WLAN_EID_FRAGMENT) break;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Williamson alex.williamson@redhat.com
commit c1d9dac0db168198b6f63f460665256dedad9b6e upstream.
The vfio-pci huge_fault handler doesn't make any attempt to insert a mapping containing the faulting address, it only inserts mappings if the faulting address and resulting pfn are aligned. This works in a lot of cases, particularly in conjunction with QEMU where DMA mappings linearly fault the mmap. However, there are configurations where we don't get that linear faulting and pages are faulted on-demand.
The scenario reported in the bug below is such a case, where the physical address width of the CPU is greater than that of the IOMMU, resulting in a VM where guest firmware has mapped device MMIO beyond the address width of the IOMMU. In this configuration, the MMIO is faulted on demand and tracing indicates that occasionally the faults generate a VM_FAULT_OOM. Given the use case, this results in an "error: kvm run failed Bad address", killing the VM.
The host is not under memory pressure in this test, therefore it's suspected that VM_FAULT_OOM is actually the result of a NULL return from __pte_offset_map_lock() in the get_locked_pte() path from insert_pfn(). This suggests a potential race inserting a pte concurrent to a pmd, and maybe indicates some deficiency in the mm layer properly handling such a case.
Nevertheless, Peter noted the inconsistency of vfio-pci's huge_fault handler where our mapping granularity depends on the alignment of the faulting address relative to the order rather than aligning the faulting address to the order to more consistently insert huge mappings. This change not only uses the page tables more consistently and efficiently, but as any fault to an aligned page results in the same mapping, the race condition suspected in the VM_FAULT_OOM is avoided.
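A rough user-space sketch of the alignment logic follows. The 4 KiB page size, the `SKETCH_` macros, and the function names are assumptions for illustration; the range check is also done before computing the page offset here, which differs slightly from the ordering in the diff but gives the same fallback decision.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Round the faulting address down to the start of its order-sized block. */
static unsigned long align_fault_addr(unsigned long address, unsigned int order)
{
	return address & ~((SKETCH_PAGE_SIZE << order) - 1);
}

/*
 * Decide whether an order-sized mapping can be inserted for this fault.
 * base_pfn stands in for the pfn backing vm_start.
 */
static bool can_map_huge(unsigned long address, unsigned int order,
			 unsigned long vm_start, unsigned long vm_end,
			 unsigned long base_pfn)
{
	unsigned long addr = align_fault_addr(address, order);
	unsigned long pgoff, pfn;

	/* The aligned block must lie entirely inside the VMA. */
	if (addr < vm_start || addr + (SKETCH_PAGE_SIZE << order) > vm_end)
		return false;

	pgoff = (addr - vm_start) >> SKETCH_PAGE_SHIFT;
	pfn = base_pfn + pgoff;

	/* The backing pfn must be aligned to the same order. */
	return !(pfn & ((1UL << order) - 1));
}
```

Because every address inside the same aligned block maps to the same `addr`, any fault in that block requests the same mapping, which is the consistency the commit message describes.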
Reported-by: Adolfo adolfotregosa@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220057 Fixes: 09dfc8a5f2ce ("vfio/pci: Fallback huge faults for unaligned pfn") Cc: stable@vger.kernel.org Tested-by: Adolfo adolfotregosa@gmail.com Co-developed-by: Peter Xu peterx@redhat.com Signed-off-by: Peter Xu peterx@redhat.com Link: https://lore.kernel.org/r/20250502224035.3183451-1-alex.williamson@redhat.co... Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vfio/pci/vfio_pci_core.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1654,14 +1654,14 @@ static vm_fault_t vfio_pci_mmap_huge_fau { struct vm_area_struct *vma = vmf->vma; struct vfio_pci_core_device *vdev = vma->vm_private_data; - unsigned long pfn, pgoff = vmf->pgoff - vma->vm_pgoff; + unsigned long addr = vmf->address & ~((PAGE_SIZE << order) - 1); + unsigned long pgoff = (addr - vma->vm_start) >> PAGE_SHIFT; + unsigned long pfn = vma_to_pfn(vma) + pgoff; vm_fault_t ret = VM_FAULT_SIGBUS;
- pfn = vma_to_pfn(vma) + pgoff; - - if (order && (pfn & ((1 << order) - 1) || - vmf->address & ((PAGE_SIZE << order) - 1) || - vmf->address + (PAGE_SIZE << order) > vma->vm_end)) { + if (order && (addr < vma->vm_start || + addr + (PAGE_SIZE << order) > vma->vm_end || + pfn & ((1 << order) - 1))) { ret = VM_FAULT_FALLBACK; goto out; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niklas Schnelle schnelle@linux.ibm.com
commit 05a2538f2b48500cf4e8a0a0ce76623cc5bafcf1 upstream.
With commit bcb5d6c76903 ("s390/pci: introduce lock to synchronize state of zpci_dev's") the code to ignore power off of a PF that has child VFs was changed from a direct return to a goto to the unlock and pci_dev_put() section. The change however left the existing pci_dev_put() untouched, resulting in a double put. This can subsequently cause a use after free if the struct pci_dev is released in an unexpected state. Fix this by removing the extra pci_dev_put().
Cc: stable@vger.kernel.org Fixes: bcb5d6c76903 ("s390/pci: introduce lock to synchronize state of zpci_dev's") Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com Reviewed-by: Gerd Bayer gbayer@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/hotplug/s390_pci_hpc.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/pci/hotplug/s390_pci_hpc.c +++ b/drivers/pci/hotplug/s390_pci_hpc.c @@ -59,7 +59,6 @@ static int disable_slot(struct hotplug_s
pdev = pci_get_slot(zdev->zbus->bus, zdev->devfn); if (pdev && pci_num_vf(pdev)) { - pci_dev_put(pdev); rc = -EBUSY; goto out; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
commit 84f5eb833f53ae192baed4cfb8d9eaab43481fc9 upstream.
If a driver is removed, the driver framework invokes the driver's remove callback. A CAN driver's remove function calls unregister_candev(), which calls net_device_ops::ndo_stop further down in the call stack for interfaces which are in the "up" state.
With the mcp251xfd driver the removal of the module causes the following warning:
| WARNING: CPU: 0 PID: 352 at net/core/dev.c:7342 __netif_napi_del_locked+0xc8/0xd8
as can_rx_offload_del() deletes the NAPI, while it is still active, because the interface is still up.
To fix the warning, first unregister the network interface, which calls net_device_ops::ndo_stop, which disables the NAPI, and then call can_rx_offload_del().
Fixes: 55e5b97f003e ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250502-can-rx-offload-del-v1-1-59a9b131589d@pengu... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c @@ -2174,8 +2174,8 @@ static void mcp251xfd_remove(struct spi_ struct mcp251xfd_priv *priv = spi_get_drvdata(spi); struct net_device *ndev = priv->ndev;
- can_rx_offload_del(&priv->offload); mcp251xfd_unregister(priv); + can_rx_offload_del(&priv->offload); spi->max_speed_hz = priv->spi_max_speed_hz_orig; free_candev(ndev); }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
commit 037ada7a3181300218e4fd78bef6a741cfa7f808 upstream.
If a driver is removed, the driver framework invokes the driver's remove callback. A CAN driver's remove function calls unregister_candev(), which calls net_device_ops::ndo_stop further down in the call stack for interfaces which are in the "up" state.
The removal of the module causes a warning, as can_rx_offload_del() deletes the NAPI, while it is still active, because the interface is still up.
To fix the warning, first unregister the network interface, which calls net_device_ops::ndo_stop, which disables the NAPI, and then call can_rx_offload_del().
Fixes: ff60bfbaf67f ("can: rockchip_canfd: add driver for Rockchip CAN-FD controller") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250502-can-rx-offload-del-v1-2-59a9b131589d@pengu... Reviewed-by: Markus Schneider-Pargmann msp@baylibre.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/rockchip/rockchip_canfd-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/can/rockchip/rockchip_canfd-core.c +++ b/drivers/net/can/rockchip/rockchip_canfd-core.c @@ -937,8 +937,8 @@ static void rkcanfd_remove(struct platfo struct rkcanfd_priv *priv = platform_get_drvdata(pdev); struct net_device *ndev = priv->ndev;
- can_rx_offload_del(&priv->offload); rkcanfd_unregister(priv); + can_rx_offload_del(&priv->offload); free_candev(ndev); }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
commit 53e3e5babc0963a92d856a5ec0ce92c59f54bc12 upstream.
A client can send an empty newname string to the ksmbd server, which causes a kernel oops in d_alloc. This patch returns an error when attempting to rename a file or directory with an empty new name string.
Cc: stable@vger.kernel.org Reported-by: Norbert Szetei norbert@doyensec.com Tested-by: Norbert Szetei norbert@doyensec.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/smb2pdu.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -633,6 +633,11 @@ smb2_get_name(const char *src, const int return name; }
+ if (*name == '\0') { + kfree(name); + return ERR_PTR(-EINVAL); + } + if (*name == '\\') { pr_err("not allow directory name included leading slash\n"); kfree(name);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Norbert Szetei norbert@doyensec.com
commit 0ca6df4f40cf4c32487944aaf48319cb6c25accc upstream.
ksmbd_vfs_stream_write() did not validate whether the write offset (*pos) was within the bounds of the existing stream data length (v_len). If *pos was greater than or equal to v_len, this could lead to an out-of-bounds memory write.
This patch adds a check to ensure *pos is less than v_len before proceeding. If the condition fails, -EINVAL is returned.
Cc: stable@vger.kernel.org Signed-off-by: Norbert Szetei norbert@doyensec.com Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/vfs.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/fs/smb/server/vfs.c +++ b/fs/smb/server/vfs.c @@ -443,6 +443,13 @@ static int ksmbd_vfs_stream_write(struct goto out; }
+ if (v_len <= *pos) { + pr_err("stream write position %lld is out of bounds (stream length: %zd)\n", + *pos, v_len); + err = -EINVAL; + goto out; + } + if (v_len < size) { wbuf = kvzalloc(size, KSMBD_DEFAULT_GFP); if (!wbuf) {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Heelan seanheelan@gmail.com
commit 36991c1ccde2d5a521577c448ffe07fcccfe104d upstream.
A use-after-free is possible if one thread destroys the file via __ksmbd_close_fd while another thread holds a reference to it. The existing checks on fp->refcount are not sufficient to prevent this.
The fix takes ft->lock around the section which removes the file from the file table. This prevents two threads acquiring the same file pointer via __close_file_table_ids, as well as the other functions which retrieve a file from the IDR and which already use this same lock.
Cc: stable@vger.kernel.org Signed-off-by: Sean Heelan seanheelan@gmail.com Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/vfs_cache.c | 33 ++++++++++++++++++++++++++------- 1 file changed, 26 insertions(+), 7 deletions(-)
--- a/fs/smb/server/vfs_cache.c +++ b/fs/smb/server/vfs_cache.c @@ -661,21 +661,40 @@ __close_file_table_ids(struct ksmbd_file bool (*skip)(struct ksmbd_tree_connect *tcon, struct ksmbd_file *fp)) { - unsigned int id; - struct ksmbd_file *fp; - int num = 0; + struct ksmbd_file *fp; + unsigned int id = 0; + int num = 0;
- idr_for_each_entry(ft->idr, fp, id) { - if (skip(tcon, fp)) + while (1) { + write_lock(&ft->lock); + fp = idr_get_next(ft->idr, &id); + if (!fp) { + write_unlock(&ft->lock); + break; + } + + if (skip(tcon, fp) || + !atomic_dec_and_test(&fp->refcount)) { + id++; + write_unlock(&ft->lock); continue; + }
set_close_state_blocked_works(fp); + idr_remove(ft->idr, fp->volatile_id); + fp->volatile_id = KSMBD_NO_FID; + write_unlock(&ft->lock); + + down_write(&fp->f_ci->m_lock); + list_del_init(&fp->node); + up_write(&fp->f_ci->m_lock);
- if (!atomic_dec_and_test(&fp->refcount)) - continue; __ksmbd_close_fd(ft, fp); + num++; + id++; } + return num; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eelco Chaudron echaudro@redhat.com
commit 6beb6835c1fbb3f676aebb51a5fee6b77fed9308 upstream.
This patch replaces the manual Netlink attribute iteration in output_userspace() with nla_for_each_nested(), which ensures that only well-formed attributes are processed.
Fixes: ccb1352e76cf ("net: Add Open vSwitch kernel components.") Signed-off-by: Eelco Chaudron echaudro@redhat.com Acked-by: Ilya Maximets i.maximets@ovn.org Acked-by: Aaron Conole aconole@redhat.com Link: https://patch.msgid.link/0bd65949df61591d9171c0dc13e42cea8941da10.1746541734... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/openvswitch/actions.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -975,8 +975,7 @@ static int output_userspace(struct datap upcall.cmd = OVS_PACKET_CMD_ACTION; upcall.mru = OVS_CB(skb)->mru;
- for (a = nla_data(attr), rem = nla_len(attr); rem > 0; - a = nla_next(a, &rem)) { + nla_for_each_nested(a, attr, rem) { switch (nla_type(a)) { case OVS_USERSPACE_ATTR_USERDATA: upcall.userdata = a;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wang Zhaolong wangzhaolong1@huawei.com
[ Upstream commit eb4447bcce915b43b691123118893fca4f372a8f ]
The previous patch that added bounds check for create lease context introduced a memory leak. When the bounds check fails, the function returns NULL without freeing the previously allocated lease_ctx_info structure.
This patch fixes the issue by adding kfree(lreq) before returning NULL in both boundary check cases.
Fixes: bab703ed8472 ("ksmbd: add bounds check for create lease context") Signed-off-by: Wang Zhaolong wangzhaolong1@huawei.com Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/oplock.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/smb/server/oplock.c b/fs/smb/server/oplock.c index 81a29857b1e32..03f606afad93a 100644 --- a/fs/smb/server/oplock.c +++ b/fs/smb/server/oplock.c @@ -1496,7 +1496,7 @@ struct lease_ctx_info *parse_lease_state(void *open_req)
if (le16_to_cpu(cc->DataOffset) + le32_to_cpu(cc->DataLength) < sizeof(struct create_lease_v2) - 4) - return NULL; + goto err_out;
memcpy(lreq->lease_key, lc->lcontext.LeaseKey, SMB2_LEASE_KEY_SIZE); lreq->req_state = lc->lcontext.LeaseState; @@ -1512,7 +1512,7 @@ struct lease_ctx_info *parse_lease_state(void *open_req)
if (le16_to_cpu(cc->DataOffset) + le32_to_cpu(cc->DataLength) < sizeof(struct create_lease)) - return NULL; + goto err_out;
memcpy(lreq->lease_key, lc->lcontext.LeaseKey, SMB2_LEASE_KEY_SIZE); lreq->req_state = lc->lcontext.LeaseState; @@ -1521,6 +1521,9 @@ struct lease_ctx_info *parse_lease_state(void *open_req) lreq->version = 1; } return lreq; +err_out: + kfree(lreq); + return NULL; }
/**
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
[ Upstream commit ae952eea6f4a7e2193f8721a5366049946e012e7 ]
In case of stack corruption stack_invalid() is called and the expectation is that register r10 contains the last breaking event address. This dependency is quite subtle and broke a couple of years ago without anybody noticing.
Fix this by getting rid of the dependency and read the last breaking event address from lowcore.
Fixes: 56e62a737028 ("s390: convert to generic entry") Acked-by: Ilya Leoshkevich iii@linux.ibm.com Reviewed-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/entry.S | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 88e09a650d2df..ce8bac77cbc1b 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -601,7 +601,8 @@ SYM_CODE_START(stack_overflow) stmg %r0,%r7,__PT_R0(%r11) stmg %r8,%r9,__PT_PSW(%r11) mvc __PT_R8(64,%r11),0(%r14) - stg %r10,__PT_ORIG_GPR2(%r11) # store last break to orig_gpr2 + GET_LC %r2 + mvc __PT_ORIG_GPR2(8,%r11),__LC_PGM_LAST_BREAK(%r2) xc __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15) lgr %r2,%r11 # pass pointer to pt_regs jg kernel_stack_overflow
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit 3769478610135e82b262640252d90f6efb05be71 ]
Alan reported a NULL pointer dereference in htb_next_rb_node() after we made htb_qlen_notify() idempotent.
It turns out in the following case it introduced some regression:
htb_dequeue_tree():
  |-> fq_codel_dequeue()
    |-> qdisc_tree_reduce_backlog()
      |-> htb_qlen_notify()
        |-> htb_deactivate()
  |-> htb_next_rb_node()
  |-> htb_deactivate()
For htb_next_rb_node(), after calling the 1st htb_deactivate(), the clprio[prio]->ptr could be already set to NULL, which means htb_next_rb_node() is vulnerable here.
For htb_deactivate(), although we checked qlen before calling it, in case of qlen==0 after qdisc_tree_reduce_backlog(), we may call it again which triggers the warning inside.
To fix the issues here, we need to:
1) Make htb_deactivate() idempotent, that is, simply return if we have already called it before.
2) Make htb_next_rb_node() safe against ptr==NULL.
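The two points above can be sketched in isolation. This is a simplified user-space model, not the qdisc code: `struct node`, `struct cls`, and the `deactivations` counter are hypothetical, with a singly linked list standing in for the rb-tree.

```c
#include <stddef.h>

/* Hypothetical list node standing in for an rb-tree node. */
struct node {
	struct node *next;
};

/* Hypothetical class state; 'deactivations' only exists to observe calls. */
struct cls {
	unsigned int prio_activity;
	int deactivations;
};

/* Safe against a NULL cursor: leave it untouched instead of dereferencing. */
static void next_node(struct node **n)
{
	if (*n)
		*n = (*n)->next;
}

/* Idempotent: a second call on an already inactive class is a no-op. */
static void deactivate(struct cls *cl)
{
	if (!cl->prio_activity)
		return;
	cl->deactivations++;	/* stands in for the real teardown work */
	cl->prio_activity = 0;
}
```

Moving the activity check into the deactivate function itself is what lets callers drop their own guards, as the later hunks in the diff do.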
Many thanks to Alan for testing and for the reproducer.
Fixes: 5ba8b837b522 ("sch_htb: make htb_qlen_notify() idempotent")
Reported-by: Alan J. Wylie alan@wylie.me.uk
Signed-off-by: Cong Wang xiyou.wangcong@gmail.com
Link: https://patch.msgid.link/20250428232955.1740419-2-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/sched/sch_htb.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 4b9a639b642e1..14bf71f570570 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -348,7 +348,8 @@ static void htb_add_to_wait_tree(struct htb_sched *q,
  */
 static inline void htb_next_rb_node(struct rb_node **n)
 {
-	*n = rb_next(*n);
+	if (*n)
+		*n = rb_next(*n);
 }

 /**
@@ -609,8 +610,8 @@ static inline void htb_activate(struct htb_sched *q, struct htb_class *cl)
  */
 static inline void htb_deactivate(struct htb_sched *q, struct htb_class *cl)
 {
-	WARN_ON(!cl->prio_activity);
-
+	if (!cl->prio_activity)
+		return;
 	htb_deactivate_prios(q, cl);
 	cl->prio_activity = 0;
 }
@@ -1485,8 +1486,6 @@ static void htb_qlen_notify(struct Qdisc *sch, unsigned long arg)
 {
 	struct htb_class *cl = (struct htb_class *)arg;

-	if (!cl->prio_activity)
-		return;
 	htb_deactivate(qdisc_priv(sch), cl);
 }

@@ -1740,8 +1739,7 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg,
 	if (cl->parent)
 		cl->parent->children--;

-	if (cl->prio_activity)
-		htb_deactivate(q, cl);
+	htb_deactivate(q, cl);

 	if (cl->cmode != HTB_CAN_SEND)
 		htb_safe_rb_erase(&cl->pq_node,
@@ -1949,8 +1947,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
 			/* turn parent into inner node */
 			qdisc_purge_queue(parent->leaf.q);
 			parent_qdisc = parent->leaf.q;
-			if (parent->prio_activity)
-				htb_deactivate(q, parent);
+			htb_deactivate(q, parent);

 			/* remove from evt list because of level change */
 			if (parent->cmode != HTB_CAN_SEND) {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 1e20324b23f0afba27997434fb978f1e4a1dbcb6 ]
Commit 4bc12818b363 ("virtio-net: disable delayed refill when pausing rx") fixed a deadlock between reconfig paths and refill work trying to disable the same NAPI instance. The refill work can't run in parallel with reconfig because trying to double-disable a NAPI instance causes a stall under the instance lock, which the reconfig path needs to re-enable the NAPI and therefore unblock the stalled thread.
There are two cases where we re-enable refill too early. One is in the virtnet_set_queues() handler. We call it when installing XDP:
  virtnet_rx_pause_all(vi);
  ...
  virtnet_napi_tx_disable(..);
  ...
  virtnet_set_queues(..);
  ...
  virtnet_rx_resume_all(..);
We want the work to be disabled until we call virtnet_rx_resume_all(), but virtnet_set_queues() kicks it before NAPIs were re-enabled.
The other case is a more trivial case of mis-ordering in __virtnet_rx_resume() found by code inspection.
Taking the spin lock in virtnet_set_queues() (requested during review) may be unnecessary as we are under rtnl_lock and so are all paths writing to ->refill_enabled.
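The ordering requirement described above (kick the refill work only after NAPI has been re-enabled) can be illustrated with a small user-space model. Everything here is hypothetical: struct toy_rx and the toy_* helpers only mimic the shape of __virtnet_rx_resume() and are not driver code.

```c
#include <stdbool.h>

/* Hypothetical toy state; not the real struct virtnet_info. */
struct toy_rx {
	bool napi_enabled;
	bool refill_scheduled;
	bool refill_saw_napi_enabled;	/* recorded at the moment refill is kicked */
};

/* Stand-in for schedule_delayed_work(): remembers what it observed. */
static void toy_schedule_refill(struct toy_rx *rx)
{
	rx->refill_scheduled = true;
	rx->refill_saw_napi_enabled = rx->napi_enabled;
}

/* Decide first, re-enable NAPI, and only then kick the refill work. */
static void toy_rx_resume(struct toy_rx *rx, bool fill_failed, bool running)
{
	bool schedule_refill = fill_failed;

	if (running)
		rx->napi_enabled = true;

	if (schedule_refill)
		toy_schedule_refill(rx);
}
```

With this ordering the refill work never observes a NAPI instance that is still disabled on a running device, so it has no reason to try to disable it a second time.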
Acked-by: Michael S. Tsirkin mst@redhat.com
Reviewed-by: Bui Quang Minh minhquangbui99@gmail.com
Fixes: 4bc12818b363 ("virtio-net: disable delayed refill when pausing rx")
Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Link: https://patch.msgid.link/20250430163758.3029367-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/virtio_net.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 3e4896d9537ee..2c3c6e8e3f35b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3359,12 +3359,15 @@ static void __virtnet_rx_resume(struct virtnet_info *vi,
 				bool refill)
 {
 	bool running = netif_running(vi->dev);
+	bool schedule_refill = false;

 	if (refill && !try_fill_recv(vi, rq, GFP_KERNEL))
-		schedule_delayed_work(&vi->refill, 0);
-
+		schedule_refill = true;
 	if (running)
 		virtnet_napi_enable(rq);
+
+	if (schedule_refill)
+		schedule_delayed_work(&vi->refill, 0);
 }

 static void virtnet_rx_resume_all(struct virtnet_info *vi)
@@ -3699,8 +3702,10 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs)
 succ:
 	vi->curr_queue_pairs = queue_pairs;
 	/* virtnet_open() will refill when device is going to up. */
-	if (dev->flags & IFF_UP)
+	spin_lock_bh(&vi->refill_lock);
+	if (dev->flags & IFF_UP && vi->refill_enabled)
 		schedule_delayed_work(&vi->refill, 0);
+	spin_unlock_bh(&vi->refill_lock);

 	return 0;
 }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 4397684a292a71fbc1e815c3e283f7490ddce5ae ]
The selftests recently added to our CI by Bui Quang Minh reveal that there is a memory leak on the error path of virtnet_xsk_pool_enable():
unreferenced object 0xffff88800a68a000 (size 2048):
  comm "xdp_helper", pid 318, jiffies 4294692778
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  backtrace (crc 0):
    __kvmalloc_node_noprof+0x402/0x570
    virtnet_xsk_pool_enable+0x293/0x6a0 (drivers/net/virtio_net.c:5882)
    xp_assign_dev+0x369/0x670 (net/xdp/xsk_buff_pool.c:226)
    xsk_bind+0x6a5/0x1ae0
    __sys_bind+0x15e/0x230
    __x64_sys_bind+0x72/0xb0
    do_syscall_64+0xc1/0x1d0
    entry_SYSCALL_64_after_hwframe+0x77/0x7f
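For readers less familiar with the pattern, a leak of this kind (an early return that skips freeing an earlier allocation) is usually closed with a cleanup label. The following is a hedged user-space sketch under that assumption; toy_map() and toy_pool_enable() are hypothetical names and calloc()/free() merely stand in for the kernel allocators.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical mapping step that can be made to fail on request. */
static int toy_map(int fail)
{
	return fail ? -1 : 0;
}

/* Allocate a buffer array, then run a step that may fail. */
static int toy_pool_enable(void **buffs_out, int fail_map)
{
	void *buffs;
	int err;

	buffs = calloc(16, sizeof(void *));
	if (!buffs)
		return -ENOMEM;

	if (toy_map(fail_map)) {
		/* A bare "return -ENOMEM" here would leak buffs. */
		err = -ENOMEM;
		goto err_free_buffs;
	}

	*buffs_out = buffs;
	return 0;

err_free_buffs:
	free(buffs);
	*buffs_out = NULL;
	return err;
}
```

On failure the caller gets the error code and no allocation is left behind; on success ownership of the buffer passes to the caller.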
Acked-by: Jason Wang jasowang@redhat.com
Fixes: e9f3962441c0 ("virtio_net: xsk: rx: support fill with xsk buffer")
Link: https://patch.msgid.link/20250430163836.3029761-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/virtio_net.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 2c3c6e8e3f35b..54f883c962373 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -5870,8 +5870,10 @@ static int virtnet_xsk_pool_enable(struct net_device *dev,

 	hdr_dma = virtqueue_dma_map_single_attrs(sq->vq, &xsk_hdr, vi->hdr_len,
 						 DMA_TO_DEVICE, 0);
-	if (virtqueue_dma_mapping_error(sq->vq, hdr_dma))
-		return -ENOMEM;
+	if (virtqueue_dma_mapping_error(sq->vq, hdr_dma)) {
+		err = -ENOMEM;
+		goto err_free_buffs;
+	}

 	err = xsk_pool_dma_map(pool, dma_dev, 0);
 	if (err)
@@ -5899,6 +5901,8 @@ static int virtnet_xsk_pool_enable(struct net_device *dev,
 err_xsk_map:
 	virtqueue_dma_unmap_single_attrs(rq->vq, hdr_dma, vi->hdr_len,
 					 DMA_TO_DEVICE, 0);
+err_free_buffs:
+	kvfree(rq->xsk_buffs);
 	return err;
 }

6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guillaume Nault gnault@redhat.com
[ Upstream commit 3e6a0243ff002ddbd7ee18a8974ae61d2e6ed00d ]
Use addrconf_addr_gen() to generate IPv6 link-local addresses on GRE devices in most cases and fall back to using add_v4_addrs() only in case the GRE configuration is incompatible with addrconf_addr_gen().
GRE used to use addrconf_addr_gen() until commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address") restricted this use to gretap and ip6gretap devices, and created add_v4_addrs() (borrowed from SIT) for non-Ethernet GRE ones.
The original problem came when commit 9af28511be10 ("addrconf: refuse isatap eui64 for INADDR_ANY") made __ipv6_isatap_ifid() fail when its addr parameter was 0. The commit says that this would create an invalid address, however, I couldn't find any RFC saying that the generated interface identifier would be wrong. Anyway, since gre over IPv4 devices pass their local tunnel address to __ipv6_isatap_ifid(), that commit broke their IPv6 link-local address generation when the local address was unspecified.
Then commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address") tried to fix that case by defining add_v4_addrs() and calling it to generate the IPv6 link-local address instead of using addrconf_addr_gen() (apart from gretap and ip6gretap devices, which would still use the regular addrconf_addr_gen(), since they have a MAC address).
That broke several use cases because add_v4_addrs() isn't properly integrated into the rest of IPv6 Neighbor Discovery code. Several of these shortcomings have been fixed over time, but add_v4_addrs() remains broken in several aspects. In particular, it doesn't send any Router Solicitations, so the SLAAC process doesn't start until the interface receives a Router Advertisement. Also, add_v4_addrs() mostly ignores the address generation mode of the interface (/proc/sys/net/ipv6/conf/*/addr_gen_mode), thus breaking the IN6_ADDR_GEN_MODE_RANDOM and IN6_ADDR_GEN_MODE_STABLE_PRIVACY cases.
Fix the situation by using add_v4_addrs() only in the specific scenario where the normal method would fail. That is, for interfaces that have all of the following characteristics:
  * run over IPv4,
  * transport IP packets directly, not Ethernet (that is, not gretap interfaces),
  * tunnel endpoint is INADDR_ANY (that is, 0),
  * device address generation mode is EUI64.
In all other cases, revert back to the regular addrconf_addr_gen().
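The selection rule above boils down to a single predicate. The sketch below restates it in stand-alone C for illustration; the enums and toy_needs_v4_fallback() are hypothetical and only model the kernel's ARPHRD_* device types and IN6_ADDR_GEN_MODE_* values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device types; the kernel distinguishes these via ARPHRD_*. */
enum toy_dev_type { TOY_IPGRE, TOY_GRETAP, TOY_IP6GRE };

/* Hypothetical mirror of the interface address generation modes. */
enum toy_gen_mode { TOY_EUI64, TOY_NONE, TOY_STABLE_PRIVACY, TOY_RANDOM };

/*
 * True only for an IPv4 GRE device carrying IP directly, whose local tunnel
 * endpoint is unspecified (0) and whose address generation mode is EUI64.
 * Every other combination would use the regular generation path.
 */
static bool toy_needs_v4_fallback(enum toy_dev_type type, uint32_t local_addr,
				  enum toy_gen_mode mode)
{
	return type == TOY_IPGRE && local_addr == 0 && mode == TOY_EUI64;
}
```

For example, an unbound IPv4 GRE device in stable-privacy mode does not take the fallback, which is what lets that mode work again.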
Also, remove the special case for ip6gre interfaces in add_v4_addrs(), since ip6gre devices now always use addrconf_addr_gen() instead.
Note: This patch was originally applied as commit 183185a18ff9 ("gre: Fix IPv6 link-local address generation."). However, it was then reverted by commit fc486c2d060f ("Revert "gre: Fix IPv6 link-local address generation."") because it uncovered another bug that ended up breaking net/forwarding/ip6gre_custom_multipath_hash.sh. That other bug has now been fixed by commit 4d0ab3a6885e ("ipv6: Start path selection from the first nexthop"). Therefore we can now revive this GRE patch (no changes since original commit 183185a18ff9 ("gre: Fix IPv6 link-local address generation.")).
Fixes: e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address")
Signed-off-by: Guillaume Nault gnault@redhat.com
Reviewed-by: Ido Schimmel idosch@nvidia.com
Link: https://patch.msgid.link/a88cc5c4811af36007645d610c95102dccb360a6.1746225214...
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/ipv6/addrconf.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 54a8ea004da28..943ba80c9e4ff 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3209,16 +3209,13 @@ static void add_v4_addrs(struct inet6_dev *idev)
 	struct in6_addr addr;
 	struct net_device *dev;
 	struct net *net = dev_net(idev->dev);
-	int scope, plen, offset = 0;
+	int scope, plen;
 	u32 pflags = 0;

 	ASSERT_RTNL();

 	memset(&addr, 0, sizeof(struct in6_addr));
-	/* in case of IP6GRE the dev_addr is an IPv6 and therefore we use only the last 4 bytes */
-	if (idev->dev->addr_len == sizeof(struct in6_addr))
-		offset = sizeof(struct in6_addr) - 4;
-	memcpy(&addr.s6_addr32[3], idev->dev->dev_addr + offset, 4);
+	memcpy(&addr.s6_addr32[3], idev->dev->dev_addr, 4);

 	if (!(idev->dev->flags & IFF_POINTOPOINT) && idev->dev->type == ARPHRD_SIT) {
 		scope = IPV6_ADDR_COMPATv4;
@@ -3529,7 +3526,13 @@ static void addrconf_gre_config(struct net_device *dev)
 		return;
 	}

-	if (dev->type == ARPHRD_ETHER) {
+	/* Generate the IPv6 link-local address using addrconf_addr_gen(),
+	 * unless we have an IPv4 GRE device not bound to an IP address and
+	 * which is in EUI64 mode (as __ipv6_isatap_ifid() would fail in this
+	 * case). Such devices fall back to add_v4_addrs() instead.
+	 */
+	if (!(dev->type == ARPHRD_IPGRE && *(__be32 *)dev->dev_addr == 0 &&
+	      idev->cnf.addr_gen_mode == IN6_ADDR_GEN_MODE_EUI64)) {
 		addrconf_addr_gen(idev, true);
 		return;
 	}
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Golle daniel@makrotopia.org
[ Upstream commit 4db6c75124d871fbabf8243f947d34cc7e0697fc ]
The purpose of resetting the TX queue is to reset the byte and packet count as well as to clear the software flow control XOFF bit.
MediaTek developers pointed out that netdev_reset_queue() only resets queue 0 of the network device.
Queues that are not reset may cause unexpected issues.
Packets may stop being sent after reset and a "transmit timeout" log message may be displayed.
Import fix from MediaTek's SDK to resolve this issue.
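To make the difference concrete, here is a minimal user-space model of resetting every TX queue rather than only queue 0. The types, TOY_NUM_QUEUES and both helpers are hypothetical; they only mirror the idea of calling a per-subqueue reset in a loop.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_QUEUES 4	/* hypothetical queue count for this sketch */

/* Per-queue state a reset is expected to clear. */
struct toy_txq {
	unsigned long bytes;
	unsigned long packets;
	bool xoff;		/* software flow control: queue stopped */
};

struct toy_dev {
	struct toy_txq txq[TOY_NUM_QUEUES];
};

/* Reset one queue: clear the counters and the XOFF bit. */
static void toy_reset_subqueue(struct toy_dev *dev, size_t q)
{
	dev->txq[q].bytes = 0;
	dev->txq[q].packets = 0;
	dev->txq[q].xoff = false;
}

/* Reset the first nqueues queues, not just queue 0. */
static void toy_reset_all_queues(struct toy_dev *dev, size_t nqueues)
{
	size_t q;

	for (q = 0; q < nqueues && q < TOY_NUM_QUEUES; q++)
		toy_reset_subqueue(dev, q);
}
```

A queue that was left with its XOFF bit set would otherwise stay stopped after the reset, which matches the stalled-transmit symptom described above.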
Link: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+...
Fixes: f63959c7eec31 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues")
Signed-off-by: Daniel Golle daniel@makrotopia.org
Link: https://patch.msgid.link/c9ff9adceac4f152239a0f65c397f13547639175.1746406763...
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index c6d60f1d4f77a..bf6e572762413 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -3140,11 +3140,19 @@ static int mtk_dma_init(struct mtk_eth *eth)
 static void mtk_dma_free(struct mtk_eth *eth)
 {
 	const struct mtk_soc_data *soc = eth->soc;
-	int i;
+	int i, j, txqs = 1;
+
+	if (MTK_HAS_CAPS(eth->soc->caps, MTK_QDMA))
+		txqs = MTK_QDMA_NUM_QUEUES;
+
+	for (i = 0; i < MTK_MAX_DEVS; i++) {
+		if (!eth->netdev[i])
+			continue;
+
+		for (j = 0; j < txqs; j++)
+			netdev_tx_reset_subqueue(eth->netdev[i], j);
+	}

-	for (i = 0; i < MTK_MAX_DEVS; i++)
-		if (eth->netdev[i])
-			netdev_reset_queue(eth->netdev[i]);
 	if (!MTK_HAS_CAPS(soc->caps, MTK_SRAM) && eth->scratch_ring) {
 		dma_free_coherent(eth->dma_dev,
 				  MTK_QDMA_RING_SIZE * soc->tx.desc_size,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Wunderlich frank-w@public-files.de
[ Upstream commit e8716b5b0dff1b3d523b4a83fd5e94d57b887c5c ]
Remove the redundant PSE reset. When setting the FE register there is no need to reset the PSE; doing so may cause the FE to behave abnormally.
Link: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+...
Fixes: dee4dd10c79aa ("net: ethernet: mtk_eth_soc: ppe: add support for multiple PPEs")
Signed-off-by: Frank Wunderlich frank-w@public-files.de
Link: https://patch.msgid.link/18f0ac7d83f82defa3342c11ef0d1362f6b81e88.1746406763...
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index bf6e572762413..341def2bf1d35 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -3427,9 +3427,6 @@ static int mtk_open(struct net_device *dev)
 			}
 			mtk_gdm_config(eth, target_mac->id, gdm_config);
 		}
-		/* Reset and enable PSE */
-		mtk_w32(eth, RST_GL_PSE, MTK_RST_GL);
-		mtk_w32(eth, 0, MTK_RST_GL);

 		napi_enable(&eth->tx_napi);
 		napi_enable(&eth->rx_napi);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Antonios Salios antonios@mwa.re
[ Upstream commit dcaeeb8ae84c5506ebc574732838264f3887738c ]
The spin lock tx_handling_spinlock in struct m_can_classdev is not being initialized. This leads to the following spinlock bad magic complaint from the kernel, e.g. when trying to send CAN frames with cansend from can-utils:
| BUG: spinlock bad magic on CPU#0, cansend/95
| lock: 0xff60000002ec1010, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
| CPU: 0 UID: 0 PID: 95 Comm: cansend Not tainted 6.15.0-rc3-00032-ga79be02bba5c #5 NONE
| Hardware name: MachineWare SIM-V (DT)
| Call Trace:
| [<ffffffff800133e0>] dump_backtrace+0x1c/0x24
| [<ffffffff800022f2>] show_stack+0x28/0x34
| [<ffffffff8000de3e>] dump_stack_lvl+0x4a/0x68
| [<ffffffff8000de70>] dump_stack+0x14/0x1c
| [<ffffffff80003134>] spin_dump+0x62/0x6e
| [<ffffffff800883ba>] do_raw_spin_lock+0xd0/0x142
| [<ffffffff807a6fcc>] _raw_spin_lock_irqsave+0x20/0x2c
| [<ffffffff80536dba>] m_can_start_xmit+0x90/0x34a
| [<ffffffff806148b0>] dev_hard_start_xmit+0xa6/0xee
| [<ffffffff8065b730>] sch_direct_xmit+0x114/0x292
| [<ffffffff80614e2a>] __dev_queue_xmit+0x3b0/0xaa8
| [<ffffffff8073b8fa>] can_send+0xc6/0x242
| [<ffffffff8073d1c0>] raw_sendmsg+0x1a8/0x36c
| [<ffffffff805ebf06>] sock_write_iter+0x9a/0xee
| [<ffffffff801d06ea>] vfs_write+0x184/0x3a6
| [<ffffffff801d0a88>] ksys_write+0xa0/0xc0
| [<ffffffff801d0abc>] __riscv_sys_write+0x14/0x1c
| [<ffffffff8079ebf8>] do_trap_ecall_u+0x168/0x212
| [<ffffffff807a830a>] handle_exception+0x146/0x152
Initializing the spin lock in m_can_class_allocate_dev solves that problem.
Fixes: 1fa80e23c150 ("can: m_can: Introduce a tx_fifo_in_flight counter")
Signed-off-by: Antonios Salios antonios@mwa.re
Reviewed-by: Vincent Mailhol mailhol.vincent@wanadoo.fr
Link: https://patch.msgid.link/20250425111744.37604-2-antonios@mwa.re
Reviewed-by: Markus Schneider-Pargmann msp@baylibre.com
Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/can/m_can/m_can.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 3766b0f558288..39ad4442cb813 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -2379,6 +2379,7 @@ struct m_can_classdev *m_can_class_allocate_dev(struct device *dev,
 	SET_NETDEV_DEV(net_dev, dev);

 	m_can_of_parse_mram(class_dev, mram_config_vals);
+	spin_lock_init(&class_dev->tx_handling_spinlock);
 out:
 	return class_dev;
 }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kelsey Maes kelsey@vpprocess.com
[ Upstream commit 5e1663810e11c64956aa7e280cf74b2f3284d816 ]
The TDC is currently hardcoded enabled. This means that even for lower CAN-FD data bitrates (with a DBRP (data bitrate prescaler) > 2) a TDC is configured. This leads to a bus-off condition.
ISO 11898-1 section 11.3.3 says "Transmitter delay compensation" (TDC) is only applicable if DBRP is 1 or 2.
To fix the problem, switch the driver to use the TDC calculation provided by the CAN driver framework (which respects ISO 11898-1 section 11.3.3). This has the positive side effect that userspace can control TDC as needed.
Demonstration of the feature in action:
| $ ip link set can0 up type can bitrate 125000 dbitrate 500000 fd on
| $ ip -details link show can0
| 3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
| link/can promiscuity 0 allmulti 0 minmtu 0 maxmtu 0
| can <FD> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
| bitrate 125000 sample-point 0.875
| tq 50 prop-seg 69 phase-seg1 70 phase-seg2 20 sjw 10 brp 2
| mcp251xfd: tseg1 2..256 tseg2 1..128 sjw 1..128 brp 1..256 brp_inc 1
| dbitrate 500000 dsample-point 0.875
| dtq 125 dprop-seg 6 dphase-seg1 7 dphase-seg2 2 dsjw 1 dbrp 5
| mcp251xfd: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..256 dbrp_inc 1
| tdcv 0..63 tdco 0..63
| clock 40000000 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 parentbus spi parentdev spi0.0
| $ ip link set can0 up type can bitrate 1000000 dbitrate 4000000 fd on
| $ ip -details link show can0
| 3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
| link/can promiscuity 0 allmulti 0 minmtu 0 maxmtu 0
| can <FD,TDC-AUTO> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
| bitrate 1000000 sample-point 0.750
| tq 25 prop-seg 14 phase-seg1 15 phase-seg2 10 sjw 5 brp 1
| mcp251xfd: tseg1 2..256 tseg2 1..128 sjw 1..128 brp 1..256 brp_inc 1
| dbitrate 4000000 dsample-point 0.700
| dtq 25 dprop-seg 3 dphase-seg1 3 dphase-seg2 3 dsjw 1 dbrp 1
| tdco 7
| mcp251xfd: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..256 dbrp_inc 1
| tdcv 0..63 tdco 0..63
| clock 40000000 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 parentbus spi parentdev spi0.0
There has been some confusion about the MCP2518FD using a relative or absolute TDCO due to the datasheet specifying a range of [-64,63]. I have a custom board with a 40 MHz clock and an estimated loop delay of 100 to 216 ns. During testing at a data bit rate of 4 Mbit/s I found that using can_get_relative_tdco() resulted in bus-off errors. The final TDCO value was 1 which corresponds to a 10% SSP in an absolute configuration. This behavior is expected if the TDCO value is really absolute and not relative. Using priv->can.tdc.tdco instead results in a final TDCO of 8, setting the SSP at exactly 80%. This configuration works.
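The percentages quoted above follow from simple arithmetic if TDCO is read as an absolute offset counted in clock periods from the start of the bit. That reading is the assumption of this sketch, and toy_ssp_percent() is a hypothetical helper, not part of the driver.

```c
#include <stdint.h>

/*
 * Secondary sample point as a percentage of the data bit time, assuming
 * TDCO is an absolute offset in clock periods and that clock_hz is an
 * integer multiple of dbitrate.
 */
static uint32_t toy_ssp_percent(uint32_t tdco, uint32_t clock_hz,
				uint32_t dbitrate)
{
	uint32_t clocks_per_bit = clock_hz / dbitrate;

	return 100 * tdco / clocks_per_bit;
}
```

With a 40 MHz clock and 4 Mbit/s there are 10 clock periods per bit, so toy_ssp_percent(8, 40000000, 4000000) gives 80 and toy_ssp_percent(1, 40000000, 4000000) gives 10, matching the 80% and 10% figures reported from testing.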
The automatic, manual, and off TDC modes were tested at speeds up to, and including, 8 Mbit/s on real hardware and behave as expected.
Fixes: 55e5b97f003e ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Reported-by: Kelsey Maes kelsey@vpprocess.com
Closes: https://lore.kernel.org/all/C2121586-C87F-4B23-A933-845362C29CA1@vpprocess.c...
Reviewed-by: Vincent Mailhol mailhol.vincent@wanadoo.fr
Signed-off-by: Kelsey Maes kelsey@vpprocess.com
Link: https://patch.msgid.link/20250430161501.79370-1-kelsey@vpprocess.com
[mkl: add comment]
Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 .../net/can/spi/mcp251xfd/mcp251xfd-core.c | 40 +++++++++++++++----
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
index dd0b3fb42f1b9..c30b04f8fc0df 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
@@ -75,6 +75,24 @@ static const struct can_bittiming_const mcp251xfd_data_bittiming_const = {
 	.brp_inc = 1,
 };

+/* The datasheet of the mcp2518fd (DS20006027B) specifies a range of
+ * [-64,63] for TDCO, indicating a relative TDCO.
+ *
+ * Manual tests have shown, that using a relative TDCO configuration
+ * results in bus off, while an absolute configuration works.
+ *
+ * For TDCO use the max value (63) from the data sheet, but 0 as the
+ * minimum.
+ */
+static const struct can_tdc_const mcp251xfd_tdc_const = {
+	.tdcv_min = 0,
+	.tdcv_max = 63,
+	.tdco_min = 0,
+	.tdco_max = 63,
+	.tdcf_min = 0,
+	.tdcf_max = 0,
+};
+
 static const char *__mcp251xfd_get_model_str(enum mcp251xfd_model model)
 {
 	switch (model) {
@@ -510,8 +528,7 @@ static int mcp251xfd_set_bittiming(const struct mcp251xfd_priv *priv)
 {
 	const struct can_bittiming *bt = &priv->can.bittiming;
 	const struct can_bittiming *dbt = &priv->can.data_bittiming;
-	u32 val = 0;
-	s8 tdco;
+	u32 tdcmod, val = 0;
 	int err;

 	/* CAN Control Register
@@ -575,11 +592,16 @@ static int mcp251xfd_set_bittiming(const struct mcp251xfd_priv *priv)
 		return err;

 	/* Transmitter Delay Compensation */
-	tdco = clamp_t(int, dbt->brp * (dbt->prop_seg + dbt->phase_seg1),
-		       -64, 63);
-	val = FIELD_PREP(MCP251XFD_REG_TDC_TDCMOD_MASK,
-			 MCP251XFD_REG_TDC_TDCMOD_AUTO) |
-		FIELD_PREP(MCP251XFD_REG_TDC_TDCO_MASK, tdco);
+	if (priv->can.ctrlmode & CAN_CTRLMODE_TDC_AUTO)
+		tdcmod = MCP251XFD_REG_TDC_TDCMOD_AUTO;
+	else if (priv->can.ctrlmode & CAN_CTRLMODE_TDC_MANUAL)
+		tdcmod = MCP251XFD_REG_TDC_TDCMOD_MANUAL;
+	else
+		tdcmod = MCP251XFD_REG_TDC_TDCMOD_DISABLED;
+
+	val = FIELD_PREP(MCP251XFD_REG_TDC_TDCMOD_MASK, tdcmod) |
+		FIELD_PREP(MCP251XFD_REG_TDC_TDCV_MASK, priv->can.tdc.tdcv) |
+		FIELD_PREP(MCP251XFD_REG_TDC_TDCO_MASK, priv->can.tdc.tdco);

 	return regmap_write(priv->map_reg, MCP251XFD_REG_TDC, val);
 }
@@ -2083,10 +2105,12 @@ static int mcp251xfd_probe(struct spi_device *spi)
 	priv->can.do_get_berr_counter = mcp251xfd_get_berr_counter;
 	priv->can.bittiming_const = &mcp251xfd_bittiming_const;
 	priv->can.data_bittiming_const = &mcp251xfd_data_bittiming_const;
+	priv->can.tdc_const = &mcp251xfd_tdc_const;
 	priv->can.ctrlmode_supported = CAN_CTRLMODE_LOOPBACK |
 		CAN_CTRLMODE_LISTENONLY | CAN_CTRLMODE_BERR_REPORTING |
 		CAN_CTRLMODE_FD | CAN_CTRLMODE_FD_NON_ISO |
-		CAN_CTRLMODE_CC_LEN8_DLC;
+		CAN_CTRLMODE_CC_LEN8_DLC | CAN_CTRLMODE_TDC_AUTO |
+		CAN_CTRLMODE_TDC_MANUAL;
 	set_bit(MCP251XFD_FLAGS_DOWN, priv->flags);
 	priv->ndev = ndev;
 	priv->spi = spi;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oliver Hartkopp socketcan@hartkopp.net
[ Upstream commit 511e64e13d8cc72853275832e3f372607466c18c ]
As reported by Sebastian Andrzej Siewior, the use of local_bh_disable() to update the modification rules is only feasible on uniprocessor systems. The usual use case is to update the data of the modifications, not the modification types (AND/OR/XOR/SET) or the checksum functions themselves.
To omit additional memory allocations to maintain fast modification switching times, the modification description space is doubled at gw-job creation time so that only the reference to the active modification description is changed under rcu protection.
Rename cgw_job::mod to cf_mod and make it an RCU pointer. Allocate it in cgw_create_job() and free it together with cgw_job in cgw_job_free_rcu(). Update all users to dereference cgw_job::cf_mod with an RCU accessor and, where possible, only once.
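The publish-by-pointer idea can be sketched in user space with C11 atomics. This is only an analogy under stated assumptions: struct toy_mod, struct toy_job and both helpers are hypothetical, and the sketch has no grace period, so unlike the kernel code it does not show when the old object may safely be freed while readers are running.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical modification description. */
struct toy_mod {
	unsigned int uid;
	unsigned char data[8];
};

/* Hypothetical job holding an atomically replaceable pointer. */
struct toy_job {
	_Atomic(struct toy_mod *) mod;
};

/* Reader: load the pointer once and use only that snapshot. */
static unsigned int toy_read_uid(struct toy_job *job)
{
	struct toy_mod *mod;

	mod = atomic_load_explicit(&job->mod, memory_order_acquire);
	return mod->uid;
}

/* Writer: publish a fully built object and hand back the old one. */
static struct toy_mod *toy_replace_mod(struct toy_job *job,
				       struct toy_mod *new_mod)
{
	return atomic_exchange_explicit(&job->mod, new_mod,
					memory_order_acq_rel);
}
```

A reader therefore sees either the complete old description or the complete new one, never a half-copied mix, which is the property the in-place memcpy() update could not provide on SMP. The caller of toy_replace_mod() owns the returned pointer and must defer freeing it until no reader can still hold it.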
[bigeasy: Replace mod1/mod2 from Oliver's original patch with dynamic allocation, use RCU annotation and accessor]
Reported-by: Sebastian Andrzej Siewior bigeasy@linutronix.de
Closes: https://lore.kernel.org/linux-can/20231031112349.y0aLoBrz@linutronix.de/
Fixes: dd895d7f21b2 ("can: cangw: introduce optional uid to reference created routing jobs")
Tested-by: Oliver Hartkopp socketcan@hartkopp.net
Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net
Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de
Link: https://patch.msgid.link/20250429070555.cs-7b_eZ@linutronix.de
Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/can/gw.c | 149 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 90 insertions(+), 59 deletions(-)
diff --git a/net/can/gw.c b/net/can/gw.c index ef93293c1fae3..55eccb1c7620c 100644 --- a/net/can/gw.c +++ b/net/can/gw.c @@ -130,7 +130,7 @@ struct cgw_job { u32 handled_frames; u32 dropped_frames; u32 deleted_frames; - struct cf_mod mod; + struct cf_mod __rcu *cf_mod; union { /* CAN frame data source */ struct net_device *dev; @@ -459,6 +459,7 @@ static void can_can_gw_rcv(struct sk_buff *skb, void *data) struct cgw_job *gwj = (struct cgw_job *)data; struct canfd_frame *cf; struct sk_buff *nskb; + struct cf_mod *mod; int modidx = 0;
/* process strictly Classic CAN or CAN FD frames */ @@ -506,7 +507,8 @@ static void can_can_gw_rcv(struct sk_buff *skb, void *data) * When there is at least one modification function activated, * we need to copy the skb as we want to modify skb->data. */ - if (gwj->mod.modfunc[0]) + mod = rcu_dereference(gwj->cf_mod); + if (mod->modfunc[0]) nskb = skb_copy(skb, GFP_ATOMIC); else nskb = skb_clone(skb, GFP_ATOMIC); @@ -529,8 +531,8 @@ static void can_can_gw_rcv(struct sk_buff *skb, void *data) cf = (struct canfd_frame *)nskb->data;
/* perform preprocessed modification functions if there are any */ - while (modidx < MAX_MODFUNCTIONS && gwj->mod.modfunc[modidx]) - (*gwj->mod.modfunc[modidx++])(cf, &gwj->mod); + while (modidx < MAX_MODFUNCTIONS && mod->modfunc[modidx]) + (*mod->modfunc[modidx++])(cf, mod);
/* Has the CAN frame been modified? */ if (modidx) { @@ -546,11 +548,11 @@ static void can_can_gw_rcv(struct sk_buff *skb, void *data) }
/* check for checksum updates */ - if (gwj->mod.csumfunc.crc8) - (*gwj->mod.csumfunc.crc8)(cf, &gwj->mod.csum.crc8); + if (mod->csumfunc.crc8) + (*mod->csumfunc.crc8)(cf, &mod->csum.crc8);
- if (gwj->mod.csumfunc.xor) - (*gwj->mod.csumfunc.xor)(cf, &gwj->mod.csum.xor); + if (mod->csumfunc.xor) + (*mod->csumfunc.xor)(cf, &mod->csum.xor); }
/* clear the skb timestamp if not configured the other way */ @@ -581,9 +583,20 @@ static void cgw_job_free_rcu(struct rcu_head *rcu_head) { struct cgw_job *gwj = container_of(rcu_head, struct cgw_job, rcu);
+ /* cgw_job::cf_mod is always accessed from the same cgw_job object within + * the same RCU read section. Once cgw_job is scheduled for removal, + * cf_mod can also be removed without mandating an additional grace period. + */ + kfree(rcu_access_pointer(gwj->cf_mod)); kmem_cache_free(cgw_cache, gwj); }
+/* Return cgw_job::cf_mod with RTNL protected section */ +static struct cf_mod *cgw_job_cf_mod(struct cgw_job *gwj) +{ + return rcu_dereference_protected(gwj->cf_mod, rtnl_is_locked()); +} + static int cgw_notifier(struct notifier_block *nb, unsigned long msg, void *ptr) { @@ -616,6 +629,7 @@ static int cgw_put_job(struct sk_buff *skb, struct cgw_job *gwj, int type, { struct rtcanmsg *rtcan; struct nlmsghdr *nlh; + struct cf_mod *mod;
nlh = nlmsg_put(skb, pid, seq, type, sizeof(*rtcan), flags); if (!nlh) @@ -650,82 +664,83 @@ static int cgw_put_job(struct sk_buff *skb, struct cgw_job *gwj, int type, goto cancel; }
+ mod = cgw_job_cf_mod(gwj); if (gwj->flags & CGW_FLAGS_CAN_FD) { struct cgw_fdframe_mod mb;
- if (gwj->mod.modtype.and) { - memcpy(&mb.cf, &gwj->mod.modframe.and, sizeof(mb.cf)); - mb.modtype = gwj->mod.modtype.and; + if (mod->modtype.and) { + memcpy(&mb.cf, &mod->modframe.and, sizeof(mb.cf)); + mb.modtype = mod->modtype.and; if (nla_put(skb, CGW_FDMOD_AND, sizeof(mb), &mb) < 0) goto cancel; }
- if (gwj->mod.modtype.or) { - memcpy(&mb.cf, &gwj->mod.modframe.or, sizeof(mb.cf)); - mb.modtype = gwj->mod.modtype.or; + if (mod->modtype.or) { + memcpy(&mb.cf, &mod->modframe.or, sizeof(mb.cf)); + mb.modtype = mod->modtype.or; if (nla_put(skb, CGW_FDMOD_OR, sizeof(mb), &mb) < 0) goto cancel; }
- if (gwj->mod.modtype.xor) { - memcpy(&mb.cf, &gwj->mod.modframe.xor, sizeof(mb.cf)); - mb.modtype = gwj->mod.modtype.xor; + if (mod->modtype.xor) { + memcpy(&mb.cf, &mod->modframe.xor, sizeof(mb.cf)); + mb.modtype = mod->modtype.xor; if (nla_put(skb, CGW_FDMOD_XOR, sizeof(mb), &mb) < 0) goto cancel; }
- if (gwj->mod.modtype.set) { - memcpy(&mb.cf, &gwj->mod.modframe.set, sizeof(mb.cf)); - mb.modtype = gwj->mod.modtype.set; + if (mod->modtype.set) { + memcpy(&mb.cf, &mod->modframe.set, sizeof(mb.cf)); + mb.modtype = mod->modtype.set; if (nla_put(skb, CGW_FDMOD_SET, sizeof(mb), &mb) < 0) goto cancel; } } else { struct cgw_frame_mod mb;
- if (gwj->mod.modtype.and) { - memcpy(&mb.cf, &gwj->mod.modframe.and, sizeof(mb.cf)); - mb.modtype = gwj->mod.modtype.and; + if (mod->modtype.and) { + memcpy(&mb.cf, &mod->modframe.and, sizeof(mb.cf)); + mb.modtype = mod->modtype.and; if (nla_put(skb, CGW_MOD_AND, sizeof(mb), &mb) < 0) goto cancel; }
- if (gwj->mod.modtype.or) { - memcpy(&mb.cf, &gwj->mod.modframe.or, sizeof(mb.cf)); - mb.modtype = gwj->mod.modtype.or; + if (mod->modtype.or) { + memcpy(&mb.cf, &mod->modframe.or, sizeof(mb.cf)); + mb.modtype = mod->modtype.or; if (nla_put(skb, CGW_MOD_OR, sizeof(mb), &mb) < 0) goto cancel; }
- if (gwj->mod.modtype.xor) { - memcpy(&mb.cf, &gwj->mod.modframe.xor, sizeof(mb.cf)); - mb.modtype = gwj->mod.modtype.xor; + if (mod->modtype.xor) { + memcpy(&mb.cf, &mod->modframe.xor, sizeof(mb.cf)); + mb.modtype = mod->modtype.xor; if (nla_put(skb, CGW_MOD_XOR, sizeof(mb), &mb) < 0) goto cancel; }
- if (gwj->mod.modtype.set) { - memcpy(&mb.cf, &gwj->mod.modframe.set, sizeof(mb.cf)); - mb.modtype = gwj->mod.modtype.set; + if (mod->modtype.set) { + memcpy(&mb.cf, &mod->modframe.set, sizeof(mb.cf)); + mb.modtype = mod->modtype.set; if (nla_put(skb, CGW_MOD_SET, sizeof(mb), &mb) < 0) goto cancel; } }
- if (gwj->mod.uid) { - if (nla_put_u32(skb, CGW_MOD_UID, gwj->mod.uid) < 0) + if (mod->uid) { + if (nla_put_u32(skb, CGW_MOD_UID, mod->uid) < 0) goto cancel; }
- if (gwj->mod.csumfunc.crc8) { + if (mod->csumfunc.crc8) { if (nla_put(skb, CGW_CS_CRC8, CGW_CS_CRC8_LEN, - &gwj->mod.csum.crc8) < 0) + &mod->csum.crc8) < 0) goto cancel; }
- if (gwj->mod.csumfunc.xor) { + if (mod->csumfunc.xor) { if (nla_put(skb, CGW_CS_XOR, CGW_CS_XOR_LEN, - &gwj->mod.csum.xor) < 0) + &mod->csum.xor) < 0) goto cancel; }
@@ -1059,7 +1074,7 @@ static int cgw_create_job(struct sk_buff *skb, struct nlmsghdr *nlh, struct net *net = sock_net(skb->sk); struct rtcanmsg *r; struct cgw_job *gwj; - struct cf_mod mod; + struct cf_mod *mod; struct can_can_gw ccgw; u8 limhops = 0; int err = 0; @@ -1078,37 +1093,48 @@ static int cgw_create_job(struct sk_buff *skb, struct nlmsghdr *nlh, if (r->gwtype != CGW_TYPE_CAN_CAN) return -EINVAL;
- err = cgw_parse_attr(nlh, &mod, CGW_TYPE_CAN_CAN, &ccgw, &limhops); + mod = kmalloc(sizeof(*mod), GFP_KERNEL); + if (!mod) + return -ENOMEM; + + err = cgw_parse_attr(nlh, mod, CGW_TYPE_CAN_CAN, &ccgw, &limhops); if (err < 0) - return err; + goto out_free_cf;
- if (mod.uid) { + if (mod->uid) { ASSERT_RTNL();
/* check for updating an existing job with identical uid */ hlist_for_each_entry(gwj, &net->can.cgw_list, list) { - if (gwj->mod.uid != mod.uid) + struct cf_mod *old_cf; + + old_cf = cgw_job_cf_mod(gwj); + if (old_cf->uid != mod->uid) continue;
/* interfaces & filters must be identical */ - if (memcmp(&gwj->ccgw, &ccgw, sizeof(ccgw))) - return -EINVAL; + if (memcmp(&gwj->ccgw, &ccgw, sizeof(ccgw))) { + err = -EINVAL; + goto out_free_cf; + }
- /* update modifications with disabled softirq & quit */ - local_bh_disable(); - memcpy(&gwj->mod, &mod, sizeof(mod)); - local_bh_enable(); + rcu_assign_pointer(gwj->cf_mod, mod); + kfree_rcu_mightsleep(old_cf); return 0; } }
/* ifindex == 0 is not allowed for job creation */ - if (!ccgw.src_idx || !ccgw.dst_idx) - return -ENODEV; + if (!ccgw.src_idx || !ccgw.dst_idx) { + err = -ENODEV; + goto out_free_cf; + }
gwj = kmem_cache_alloc(cgw_cache, GFP_KERNEL); - if (!gwj) - return -ENOMEM; + if (!gwj) { + err = -ENOMEM; + goto out_free_cf; + }
gwj->handled_frames = 0; gwj->dropped_frames = 0; @@ -1118,7 +1144,7 @@ static int cgw_create_job(struct sk_buff *skb, struct nlmsghdr *nlh, gwj->limit_hops = limhops;
/* insert already parsed information */ - memcpy(&gwj->mod, &mod, sizeof(mod)); + RCU_INIT_POINTER(gwj->cf_mod, mod); memcpy(&gwj->ccgw, &ccgw, sizeof(ccgw));
err = -ENODEV; @@ -1152,9 +1178,11 @@ static int cgw_create_job(struct sk_buff *skb, struct nlmsghdr *nlh, if (!err) hlist_add_head_rcu(&gwj->list, &net->can.cgw_list); out: - if (err) + if (err) { kmem_cache_free(cgw_cache, gwj); - +out_free_cf: + kfree(mod); + } return err; }
@@ -1214,19 +1242,22 @@ static int cgw_remove_job(struct sk_buff *skb, struct nlmsghdr *nlh,
/* remove only the first matching entry */ hlist_for_each_entry_safe(gwj, nx, &net->can.cgw_list, list) { + struct cf_mod *cf_mod; + if (gwj->flags != r->flags) continue;
if (gwj->limit_hops != limhops) continue;
+ cf_mod = cgw_job_cf_mod(gwj); /* we have a match when uid is enabled and identical */ - if (gwj->mod.uid || mod.uid) { - if (gwj->mod.uid != mod.uid) + if (cf_mod->uid || mod.uid) { + if (cf_mod->uid != mod.uid) continue; } else { /* no uid => check for identical modifications */ - if (memcmp(&gwj->mod, &mod, sizeof(mod))) + if (memcmp(cf_mod, &mod, sizeof(mod))) continue; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael-CY Lee michael-cy.lee@mediatek.com
[ Upstream commit e12a42f64fc3d74872b349eedd47f90c6676b78a ]
The status code should be of type __le16.

Fixes: 83e897a961b8 ("wifi: ieee80211: add definitions for negotiated TID to Link map")
Fixes: 8f500fbc6c65 ("wifi: mac80211: process and save negotiated TID to Link mapping request")
Signed-off-by: Michael-CY Lee michael-cy.lee@mediatek.com
Link: https://patch.msgid.link/20250505081946.3927214-1-michael-cy.lee@mediatek.co...
Signed-off-by: Johannes Berg johannes.berg@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 include/linux/ieee80211.h |  2 +-
 net/mac80211/mlme.c       | 12 ++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/include/linux/ieee80211.h b/include/linux/ieee80211.h index 16741e542e81c..07dcd80f3310c 100644 --- a/include/linux/ieee80211.h +++ b/include/linux/ieee80211.h @@ -1526,7 +1526,7 @@ struct ieee80211_mgmt { struct { u8 action_code; u8 dialog_token; - u8 status_code; + __le16 status_code; u8 variable[]; } __packed ttlm_res; struct { diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 99e9b03d7fe19..e3deb89674b23 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -7412,6 +7412,7 @@ ieee80211_send_neg_ttlm_res(struct ieee80211_sub_if_data *sdata, int hdr_len = offsetofend(struct ieee80211_mgmt, u.action.u.ttlm_res); int ttlm_max_len = 2 + 1 + sizeof(struct ieee80211_ttlm_elem) + 1 + 2 * 2 * IEEE80211_TTLM_NUM_TIDS; + u16 status_code;
skb = dev_alloc_skb(local->tx_headroom + hdr_len + ttlm_max_len); if (!skb) @@ -7434,19 +7435,18 @@ ieee80211_send_neg_ttlm_res(struct ieee80211_sub_if_data *sdata, WARN_ON(1); fallthrough; case NEG_TTLM_RES_REJECT: - mgmt->u.action.u.ttlm_res.status_code = - WLAN_STATUS_DENIED_TID_TO_LINK_MAPPING; + status_code = WLAN_STATUS_DENIED_TID_TO_LINK_MAPPING; break; case NEG_TTLM_RES_ACCEPT: - mgmt->u.action.u.ttlm_res.status_code = WLAN_STATUS_SUCCESS; + status_code = WLAN_STATUS_SUCCESS; break; case NEG_TTLM_RES_SUGGEST_PREFERRED: - mgmt->u.action.u.ttlm_res.status_code = - WLAN_STATUS_PREF_TID_TO_LINK_MAPPING_SUGGESTED; + status_code = WLAN_STATUS_PREF_TID_TO_LINK_MAPPING_SUGGESTED; ieee80211_neg_ttlm_add_suggested_map(skb, neg_ttlm); break; }
+ mgmt->u.action.u.ttlm_res.status_code = cpu_to_le16(status_code); ieee80211_tx_skb(sdata, skb); }
@@ -7612,7 +7612,7 @@ void ieee80211_process_neg_ttlm_res(struct ieee80211_sub_if_data *sdata, * This can be better implemented in the future, to handle request * rejections. */ - if (mgmt->u.action.u.ttlm_res.status_code != WLAN_STATUS_SUCCESS) + if (le16_to_cpu(mgmt->u.action.u.ttlm_res.status_code) != WLAN_STATUS_SUCCESS) __ieee80211_disconnect(sdata); }
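The effect of the __le16 change above can be illustrated in userspace. This is only a sketch under stated assumptions: put_le16() and get_le16() are hypothetical stand-ins for the kernel's cpu_to_le16()/le16_to_cpu() applied to a 2-byte on-wire field, and the value 0x0123 is a placeholder, not a real status code.

```c
#include <stdint.h>

/* Hypothetical stand-in for storing cpu_to_le16(v) into a 2-byte wire field:
 * the least significant byte goes first, regardless of host endianness. */
static void put_le16(uint8_t *dst, uint16_t v)
{
	dst[0] = (uint8_t)(v & 0xff);
	dst[1] = (uint8_t)(v >> 8);
}

/* Hypothetical stand-in for le16_to_cpu() on a 2-byte wire field. */
static uint16_t get_le16(const uint8_t *src)
{
	return (uint16_t)(src[0] | ((uint16_t)src[1] << 8));
}
```

With a one-byte field only the low byte of the status code would have been stored, and the data following it would start one byte early. The two-byte little-endian encoding round-trips on both little-endian and big-endian hosts.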
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Przemek Kitszel przemyslaw.kitszel@intel.com
[ Upstream commit 0093cb194a7511d1e68865fa35b763c72e44c2f0 ]
Use Device Serial Number instead of PCI bus/device/function for the index of struct ice_adapter.
Functions on the same physical device should point to the very same ice_adapter instance, but with two PFs, when at least one of them is PCI-e passed-through to a VM, it is no longer the case - PFs will get seemingly random PCI BDF values, and thus indices, which finally leads to each of them being on its own instance of ice_adapter. As a result, they do not attempt any synchronization of the PTP HW clock usage, or of any other future resources.
DSN works nicely in place of the index, as it is "immutable" in terms of virtualization.
Fixes: 0e2bddf9e5f9 ("ice: add ice_adapter for shared data across PFs on the same NIC")
Suggested-by: Jacob Keller jacob.e.keller@intel.com
Suggested-by: Jakub Kicinski kuba@kernel.org
Suggested-by: Jiri Pirko jiri@resnulli.us
Reviewed-by: Aleksandr Loktionov aleksandr.loktionov@intel.com
Signed-off-by: Przemek Kitszel przemyslaw.kitszel@intel.com
Reviewed-by: Simon Horman horms@kernel.org
Tested-by: Rinitha S sx.rinitha@intel.com (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Reviewed-by: Jiri Pirko jiri@nvidia.com
Link: https://patch.msgid.link/20250505161939.2083581-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/ice/ice_adapter.c | 47 ++++++++------------
 drivers/net/ethernet/intel/ice/ice_adapter.h |  6 ++-
 2 files changed, 22 insertions(+), 31 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c index 01a08cfd0090a..66e070095d1bb 100644 --- a/drivers/net/ethernet/intel/ice/ice_adapter.c +++ b/drivers/net/ethernet/intel/ice/ice_adapter.c @@ -1,7 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-only // SPDX-FileCopyrightText: Copyright Red Hat
-#include <linux/bitfield.h> #include <linux/cleanup.h> #include <linux/mutex.h> #include <linux/pci.h> @@ -14,32 +13,16 @@ static DEFINE_XARRAY(ice_adapters); static DEFINE_MUTEX(ice_adapters_mutex);
-/* PCI bus number is 8 bits. Slot is 5 bits. Domain can have the rest. */ -#define INDEX_FIELD_DOMAIN GENMASK(BITS_PER_LONG - 1, 13) -#define INDEX_FIELD_DEV GENMASK(31, 16) -#define INDEX_FIELD_BUS GENMASK(12, 5) -#define INDEX_FIELD_SLOT GENMASK(4, 0) - -static unsigned long ice_adapter_index(const struct pci_dev *pdev) +static unsigned long ice_adapter_index(u64 dsn) { - unsigned int domain = pci_domain_nr(pdev->bus); - - WARN_ON(domain > FIELD_MAX(INDEX_FIELD_DOMAIN)); - - switch (pdev->device) { - case ICE_DEV_ID_E825C_BACKPLANE: - case ICE_DEV_ID_E825C_QSFP: - case ICE_DEV_ID_E825C_SFP: - case ICE_DEV_ID_E825C_SGMII: - return FIELD_PREP(INDEX_FIELD_DEV, pdev->device); - default: - return FIELD_PREP(INDEX_FIELD_DOMAIN, domain) | - FIELD_PREP(INDEX_FIELD_BUS, pdev->bus->number) | - FIELD_PREP(INDEX_FIELD_SLOT, PCI_SLOT(pdev->devfn)); - } +#if BITS_PER_LONG == 64 + return dsn; +#else + return (u32)dsn ^ (u32)(dsn >> 32); +#endif }
-static struct ice_adapter *ice_adapter_new(void) +static struct ice_adapter *ice_adapter_new(u64 dsn) { struct ice_adapter *adapter;
@@ -47,6 +30,7 @@ static struct ice_adapter *ice_adapter_new(void) if (!adapter) return NULL;
+ adapter->device_serial_number = dsn; spin_lock_init(&adapter->ptp_gltsyn_time_lock); refcount_set(&adapter->refcount, 1);
@@ -77,23 +61,26 @@ static void ice_adapter_free(struct ice_adapter *adapter) * Return: Pointer to ice_adapter on success. * ERR_PTR() on error. -ENOMEM is the only possible error. */ -struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev) +struct ice_adapter *ice_adapter_get(struct pci_dev *pdev) { - unsigned long index = ice_adapter_index(pdev); + u64 dsn = pci_get_dsn(pdev); struct ice_adapter *adapter; + unsigned long index; int err;
+ index = ice_adapter_index(dsn); scoped_guard(mutex, &ice_adapters_mutex) { err = xa_insert(&ice_adapters, index, NULL, GFP_KERNEL); if (err == -EBUSY) { adapter = xa_load(&ice_adapters, index); refcount_inc(&adapter->refcount); + WARN_ON_ONCE(adapter->device_serial_number != dsn); return adapter; } if (err) return ERR_PTR(err);
- adapter = ice_adapter_new(); + adapter = ice_adapter_new(dsn); if (!adapter) return ERR_PTR(-ENOMEM); xa_store(&ice_adapters, index, adapter, GFP_KERNEL); @@ -110,11 +97,13 @@ struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev) * * Context: Process, may sleep. */ -void ice_adapter_put(const struct pci_dev *pdev) +void ice_adapter_put(struct pci_dev *pdev) { - unsigned long index = ice_adapter_index(pdev); + u64 dsn = pci_get_dsn(pdev); struct ice_adapter *adapter; + unsigned long index;
+ index = ice_adapter_index(dsn); scoped_guard(mutex, &ice_adapters_mutex) { adapter = xa_load(&ice_adapters, index); if (WARN_ON(!adapter)) diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h index e233225848b38..ac15c0d2bc1a4 100644 --- a/drivers/net/ethernet/intel/ice/ice_adapter.h +++ b/drivers/net/ethernet/intel/ice/ice_adapter.h @@ -32,6 +32,7 @@ struct ice_port_list { * @refcount: Reference count. struct ice_pf objects hold the references. * @ctrl_pf: Control PF of the adapter * @ports: Ports list + * @device_serial_number: DSN cached for collision detection on 32bit systems */ struct ice_adapter { refcount_t refcount; @@ -40,9 +41,10 @@ struct ice_adapter {
struct ice_pf *ctrl_pf; struct ice_port_list ports; + u64 device_serial_number; };
-struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev); -void ice_adapter_put(const struct pci_dev *pdev); +struct ice_adapter *ice_adapter_get(struct pci_dev *pdev); +void ice_adapter_put(struct pci_dev *pdev);
#endif /* _ICE_ADAPTER_H */
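The 32-bit branch of ice_adapter_index() folds the 64-bit DSN with XOR, which is why the patch caches the full DSN for collision detection. A minimal userspace sketch of that folding, assuming nothing beyond the expression shown in the diff (fold_dsn32() is an illustrative name, and the DSN values used to exercise it are made up):

```c
#include <stdint.h>

/* Same expression as the BITS_PER_LONG != 64 branch in the patch:
 * XOR the low and high 32-bit halves of the serial number. */
static uint32_t fold_dsn32(uint64_t dsn)
{
	return (uint32_t)dsn ^ (uint32_t)(dsn >> 32);
}
```

Two distinct serial numbers such as 0x0000000100000002 and 0x0000000200000001 both fold to 3, so on 32-bit systems the index alone cannot prove that two PFs share a device. Comparing the stored device_serial_number against the caller's DSN, as the WARN_ON_ONCE() in ice_adapter_get() does, makes such a collision visible.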
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gao Xiang hsiangkao@linux.alibaba.com
[ Upstream commit 35076d2223c731f7be75af61e67f90807384d030 ]
When compressed data deduplication is enabled, multiple logical extents may reference the same compressed physical cluster.
The previous commit 94c43de73521 ("erofs: fix wrong primary bvec selection on deduplicated extents") already avoids using shortened bvecs. However, in such cases, the extra temporary buffers also need to be preserved for later use in z_erofs_fill_other_copies() to prevent data corruption.
IOWs, extra temporary buffers have to be retained not only due to varying start relative offsets (`pageofs_out`, as indicated by `pcl->multibases`) but also because of shortened bvecs.
 android.hardware.graphics.composer@2.1.so : 270696 bytes
      0:        0..  204185 |  204185 :  628019200..  628084736 |   65536
 ->   1:   204185..  225536 |   21351 :  544063488..  544129024 |   65536
      2:   225536..  270696 |   45160 :          0..          0 |       0

 com.android.vndk.v28.apex : 93814897 bytes
 ...
    364: 53869896..54095257 |  225361 :  543997952..  544063488 |   65536
 -> 365: 54095257..54309344 |  214087 :  544063488..  544129024 |   65536
    366: 54309344..54514557 |  205213 :  544129024..  544194560 |   65536
 ...
Both 204185 and 54095257 have the same start relative offset of 3481, but the logical page 55 of `android.hardware.graphics.composer@2.1.so` ranges from 225280 to 229632, forming a shortened bvec [225280, 225536) that cannot be used for decompressing the range from 54095257 to 54309344 of `com.android.vndk.v28.apex`.
Since `pcl->multibases` is already meaningless, just mark `be->keepxcpy` on demand for simplicity.
Again, this issue can only lead to data corruption if `-Ededupe` is on.
Fixes: 94c43de73521 ("erofs: fix wrong primary bvec selection on deduplicated extents")
Reviewed-by: Hongbo Li lihongbo22@huawei.com
Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com
Link: https://lore.kernel.org/r/20250506101850.191506-1-hsiangkao@linux.alibaba.co...
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/erofs/zdata.c | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index d771e06db7386..67acef591646c 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -76,9 +76,6 @@ struct z_erofs_pcluster { /* L: whether partial decompression or not */ bool partial;
- /* L: indicate several pageofs_outs or not */ - bool multibases; - /* L: whether extra buffer allocations are best-effort */ bool besteffort;
@@ -1050,8 +1047,6 @@ static int z_erofs_scan_folio(struct z_erofs_frontend *f, break;
erofs_onlinefolio_split(folio); - if (f->pcl->pageofs_out != (map->m_la & ~PAGE_MASK)) - f->pcl->multibases = true; if (f->pcl->length < offset + end - map->m_la) { f->pcl->length = offset + end - map->m_la; f->pcl->pageofs_out = map->m_la & ~PAGE_MASK; @@ -1097,7 +1092,6 @@ struct z_erofs_backend { struct page *onstack_pages[Z_EROFS_ONSTACK_PAGES]; struct super_block *sb; struct z_erofs_pcluster *pcl; - /* pages with the longest decompressed length for deduplication */ struct page **decompressed_pages; /* pages to keep the compressed data */ @@ -1106,6 +1100,8 @@ struct z_erofs_backend { struct list_head decompressed_secondary_bvecs; struct page **pagepool; unsigned int onstack_used, nr_pages; + /* indicate if temporary copies should be preserved for later use */ + bool keepxcpy; };
struct z_erofs_bvec_item { @@ -1116,18 +1112,20 @@ struct z_erofs_bvec_item { static void z_erofs_do_decompressed_bvec(struct z_erofs_backend *be, struct z_erofs_bvec *bvec) { + int poff = bvec->offset + be->pcl->pageofs_out; struct z_erofs_bvec_item *item; - unsigned int pgnr; - - if (!((bvec->offset + be->pcl->pageofs_out) & ~PAGE_MASK) && - (bvec->end == PAGE_SIZE || - bvec->offset + bvec->end == be->pcl->length)) { - pgnr = (bvec->offset + be->pcl->pageofs_out) >> PAGE_SHIFT; - DBG_BUGON(pgnr >= be->nr_pages); - if (!be->decompressed_pages[pgnr]) { - be->decompressed_pages[pgnr] = bvec->page; + struct page **page; + + if (!(poff & ~PAGE_MASK) && (bvec->end == PAGE_SIZE || + bvec->offset + bvec->end == be->pcl->length)) { + DBG_BUGON((poff >> PAGE_SHIFT) >= be->nr_pages); + page = be->decompressed_pages + (poff >> PAGE_SHIFT); + if (!*page) { + *page = bvec->page; return; } + } else { + be->keepxcpy = true; }
/* (cold path) one pcluster is requested multiple times */ @@ -1291,7 +1289,7 @@ static int z_erofs_decompress_pcluster(struct z_erofs_backend *be, int err) .alg = pcl->algorithmformat, .inplace_io = overlapped, .partial_decoding = pcl->partial, - .fillgaps = pcl->multibases, + .fillgaps = be->keepxcpy, .gfp = pcl->besteffort ? GFP_KERNEL : GFP_NOWAIT | __GFP_NORETRY }, be->pagepool); @@ -1348,7 +1346,6 @@ static int z_erofs_decompress_pcluster(struct z_erofs_backend *be, int err)
pcl->length = 0; pcl->partial = true; - pcl->multibases = false; pcl->besteffort = false; pcl->bvset.nextpage = NULL; pcl->vcnt = 0;
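The offsets quoted in the commit message can be checked with a little arithmetic. The sketch below assumes 4 KiB pages (an assumption, though it is consistent with the quoted start offset of 3481); SKETCH_PAGE_SIZE, page_offset() and page_index() are illustrative names and not part of erofs.

```c
#include <stdint.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096u

/* Offset of a logical address within its page (the "start relative offset"). */
static uint32_t page_offset(uint64_t la)
{
	return (uint32_t)(la % SKETCH_PAGE_SIZE);
}

/* Index of the logical page that contains a logical address. */
static uint64_t page_index(uint64_t la)
{
	return la / SKETCH_PAGE_SIZE;
}
```

Both extent starts, 204185 and 54095257, have page_offset() equal to 3481, so the old pageofs_out comparison saw no difference between them. The first file's extent ends at 225536, which is only 256 bytes into logical page 55, so the bvec for that page is the shortened range [225280, 225536) and cannot serve as the decompression target for the longer extent.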
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Julian Anastasov ja@ssi.bg
[ Upstream commit e34090d7214e0516eb8722aee295cb2507317c07 ]
syzbot reports an uninit-value for the saddr argument [1]. Commit 4754957f04f5 ("ipvs: do not use random local source address for tunnels") already implies that the input value of saddr should be ignored, but the code still reads it, which can prevent the route from being connected. Fix it by changing the argument to ret_saddr.
[1] BUG: KMSAN: uninit-value in do_output_route4+0x42c/0x4d0 net/netfilter/ipvs/ip_vs_xmit.c:147 do_output_route4+0x42c/0x4d0 net/netfilter/ipvs/ip_vs_xmit.c:147 __ip_vs_get_out_rt+0x403/0x21d0 net/netfilter/ipvs/ip_vs_xmit.c:330 ip_vs_tunnel_xmit+0x205/0x2380 net/netfilter/ipvs/ip_vs_xmit.c:1136 ip_vs_in_hook+0x1aa5/0x35b0 net/netfilter/ipvs/ip_vs_core.c:2063 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xf7/0x400 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] __ip_local_out+0x758/0x7e0 net/ipv4/ip_output.c:118 ip_local_out net/ipv4/ip_output.c:127 [inline] ip_send_skb+0x6a/0x3c0 net/ipv4/ip_output.c:1501 udp_send_skb+0xfda/0x1b70 net/ipv4/udp.c:1195 udp_sendmsg+0x2fe3/0x33c0 net/ipv4/udp.c:1483 inet_sendmsg+0x1fc/0x280 net/ipv4/af_inet.c:851 sock_sendmsg_nosec net/socket.c:712 [inline] __sock_sendmsg+0x267/0x380 net/socket.c:727 ____sys_sendmsg+0x91b/0xda0 net/socket.c:2566 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2620 __sys_sendmmsg+0x41d/0x880 net/socket.c:2702 __compat_sys_sendmmsg net/compat.c:360 [inline] __do_compat_sys_sendmmsg net/compat.c:367 [inline] __se_compat_sys_sendmmsg net/compat.c:364 [inline] __ia32_compat_sys_sendmmsg+0xc8/0x140 net/compat.c:364 ia32_sys_call+0x3ffa/0x41f0 arch/x86/include/generated/asm/syscalls_32.h:346 do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline] __do_fast_syscall_32+0xb0/0x110 arch/x86/entry/syscall_32.c:306 do_fast_syscall_32+0x38/0x80 arch/x86/entry/syscall_32.c:331 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/syscall_32.c:369 entry_SYSENTER_compat_after_hwframe+0x84/0x8e
Uninit was created at: slab_post_alloc_hook mm/slub.c:4167 [inline] slab_alloc_node mm/slub.c:4210 [inline] __kmalloc_cache_noprof+0x8fa/0xe00 mm/slub.c:4367 kmalloc_noprof include/linux/slab.h:905 [inline] ip_vs_dest_dst_alloc net/netfilter/ipvs/ip_vs_xmit.c:61 [inline] __ip_vs_get_out_rt+0x35d/0x21d0 net/netfilter/ipvs/ip_vs_xmit.c:323 ip_vs_tunnel_xmit+0x205/0x2380 net/netfilter/ipvs/ip_vs_xmit.c:1136 ip_vs_in_hook+0x1aa5/0x35b0 net/netfilter/ipvs/ip_vs_core.c:2063 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xf7/0x400 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] __ip_local_out+0x758/0x7e0 net/ipv4/ip_output.c:118 ip_local_out net/ipv4/ip_output.c:127 [inline] ip_send_skb+0x6a/0x3c0 net/ipv4/ip_output.c:1501 udp_send_skb+0xfda/0x1b70 net/ipv4/udp.c:1195 udp_sendmsg+0x2fe3/0x33c0 net/ipv4/udp.c:1483 inet_sendmsg+0x1fc/0x280 net/ipv4/af_inet.c:851 sock_sendmsg_nosec net/socket.c:712 [inline] __sock_sendmsg+0x267/0x380 net/socket.c:727 ____sys_sendmsg+0x91b/0xda0 net/socket.c:2566 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2620 __sys_sendmmsg+0x41d/0x880 net/socket.c:2702 __compat_sys_sendmmsg net/compat.c:360 [inline] __do_compat_sys_sendmmsg net/compat.c:367 [inline] __se_compat_sys_sendmmsg net/compat.c:364 [inline] __ia32_compat_sys_sendmmsg+0xc8/0x140 net/compat.c:364 ia32_sys_call+0x3ffa/0x41f0 arch/x86/include/generated/asm/syscalls_32.h:346 do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline] __do_fast_syscall_32+0xb0/0x110 arch/x86/entry/syscall_32.c:306 do_fast_syscall_32+0x38/0x80 arch/x86/entry/syscall_32.c:331 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/syscall_32.c:369 entry_SYSENTER_compat_after_hwframe+0x84/0x8e
CPU: 0 UID: 0 PID: 22408 Comm: syz.4.5165 Not tainted 6.15.0-rc3-syzkaller-00019-gbc3372351d0c #0 PREEMPT(undef) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Reported-by: syzbot+04b9a82855c8aed20860@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/68138dfa.050a0220.14dd7d.0017.GAE@google.com/
Fixes: 4754957f04f5 ("ipvs: do not use random local source address for tunnels")
Signed-off-by: Julian Anastasov ja@ssi.bg
Acked-by: Simon Horman horms@kernel.org
Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/netfilter/ipvs/ip_vs_xmit.c | 27 ++++++++-------------------
 1 file changed, 8 insertions(+), 19 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c index 3313bceb6cc99..014f077403695 100644 --- a/net/netfilter/ipvs/ip_vs_xmit.c +++ b/net/netfilter/ipvs/ip_vs_xmit.c @@ -119,13 +119,12 @@ __mtu_check_toobig_v6(const struct sk_buff *skb, u32 mtu) return false; }
-/* Get route to daddr, update *saddr, optionally bind route to saddr */ +/* Get route to daddr, optionally bind route to saddr */ static struct rtable *do_output_route4(struct net *net, __be32 daddr, - int rt_mode, __be32 *saddr) + int rt_mode, __be32 *ret_saddr) { struct flowi4 fl4; struct rtable *rt; - bool loop = false;
memset(&fl4, 0, sizeof(fl4)); fl4.daddr = daddr; @@ -135,23 +134,17 @@ static struct rtable *do_output_route4(struct net *net, __be32 daddr, retry: rt = ip_route_output_key(net, &fl4); if (IS_ERR(rt)) { - /* Invalid saddr ? */ - if (PTR_ERR(rt) == -EINVAL && *saddr && - rt_mode & IP_VS_RT_MODE_CONNECT && !loop) { - *saddr = 0; - flowi4_update_output(&fl4, 0, daddr, 0); - goto retry; - } IP_VS_DBG_RL("ip_route_output error, dest: %pI4\n", &daddr); return NULL; - } else if (!*saddr && rt_mode & IP_VS_RT_MODE_CONNECT && fl4.saddr) { + } + if (rt_mode & IP_VS_RT_MODE_CONNECT && fl4.saddr) { ip_rt_put(rt); - *saddr = fl4.saddr; flowi4_update_output(&fl4, 0, daddr, fl4.saddr); - loop = true; + rt_mode = 0; goto retry; } - *saddr = fl4.saddr; + if (ret_saddr) + *ret_saddr = fl4.saddr; return rt; }
@@ -344,19 +337,15 @@ __ip_vs_get_out_rt(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb, if (ret_saddr) *ret_saddr = dest_dst->dst_saddr.ip; } else { - __be32 saddr = htonl(INADDR_ANY); - noref = 0;
/* For such unconfigured boxes avoid many route lookups * for performance reasons because we do not remember saddr */ rt_mode &= ~IP_VS_RT_MODE_CONNECT; - rt = do_output_route4(net, daddr, rt_mode, &saddr); + rt = do_output_route4(net, daddr, rt_mode, ret_saddr); if (!rt) goto err_unreach; - if (ret_saddr) - *ret_saddr = saddr; }
local = (rt->rt_flags & RTCF_LOCAL) ? 1 : 0;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jozsef Kadlecsik kadlec@netfilter.org
[ Upstream commit 8478a729c0462273188263136880480729e9efca ]
Region locking introduced in v5.6-rc4 contained three macros to handle the region locks: ahash_bucket_start(), ahash_bucket_end() which gave back the start and end hash bucket values belonging to a given region lock and ahash_region() which should give back the region lock belonging to a given hash bucket. The latter was incorrect which can lead to a race condition between the garbage collector and adding new elements when a hash type of set is defined with timeouts.
Fixes: f66ee0410b1c ("netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports")
Reported-by: Kota Toda kota.toda@gmo-cybersecurity.com
Signed-off-by: Jozsef Kadlecsik kadlec@netfilter.org
Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/netfilter/ipset/ip_set_hash_gen.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index cf3ce72c3de64..5251524b96afa 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -64,7 +64,7 @@ struct hbucket { #define ahash_sizeof_regions(htable_bits) \ (ahash_numof_locks(htable_bits) * sizeof(struct ip_set_region)) #define ahash_region(n, htable_bits) \ - ((n) % ahash_numof_locks(htable_bits)) + ((n) / jhash_size(HTABLE_REGION_BITS)) #define ahash_bucket_start(h, htable_bits) \ ((htable_bits) < HTABLE_REGION_BITS ? 0 \ : (h) * jhash_size(HTABLE_REGION_BITS))
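The mismatch between the old ahash_region() and ahash_bucket_start() can be reproduced in userspace. This is a sketch under stated assumptions: REGION_BITS = 10 and the body of numof_locks() are assumed here for illustration and are not shown in the diff; only bucket_start(), region_old() and region_new() mirror expressions that appear in the patch.

```c
#include <stdint.h>

/* Assumed value of HTABLE_REGION_BITS, for illustration only. */
#define REGION_BITS 10u

/* Mirrors jhash_size(n): 2^n. */
static uint32_t jsize(uint32_t bits)
{
	return 1u << bits;
}

/* Assumed shape of ahash_numof_locks(): one lock per 2^REGION_BITS buckets. */
static uint32_t numof_locks(uint32_t htable_bits)
{
	return htable_bits < REGION_BITS ? 1u : jsize(htable_bits - REGION_BITS);
}

/* Mirrors ahash_bucket_start() from the diff context. */
static uint32_t bucket_start(uint32_t h, uint32_t htable_bits)
{
	return htable_bits < REGION_BITS ? 0u : h * jsize(REGION_BITS);
}

/* Old mapping: bucket number modulo the number of locks. */
static uint32_t region_old(uint32_t n, uint32_t htable_bits)
{
	return n % numof_locks(htable_bits);
}

/* New mapping: bucket number divided by the region size. */
static uint32_t region_new(uint32_t n)
{
	return n / jsize(REGION_BITS);
}
```

With htable_bits = 12 there are four regions of 1024 buckets each under these assumptions. Bucket 5 lies in the range that starts at bucket_start(0, 12), yet the old mapping returns region 1 for it, so a writer adding to bucket 5 and the garbage collector walking region 0 would take different locks. The division-based mapping returns region 0, which agrees with the bucket ranges.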
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Chaignon paul.chaignon@gmail.com
[ Upstream commit c4327229948879814229b46aa26a750718888503 ]
When bpf_redirect_peer is used to redirect packets to a device in another network namespace, the skb isn't scrubbed. That can lead to skb information from one namespace being "misused" in another namespace.
As one example, this is causing Cilium to drop traffic when using bpf_redirect_peer to redirect packets that just went through IPsec decryption to a container namespace. The following pwru trace shows (1) the packet path from the host's XFRM layer to the container's XFRM layer where it's dropped and (2) the number of active skb extensions at each function.
NETNS MARK IFACE TUPLE FUNC 4026533547 d00 eth0 10.244.3.124:35473->10.244.2.158:53 xfrm_rcv_cb .active_extensions = (__u8)2, 4026533547 d00 eth0 10.244.3.124:35473->10.244.2.158:53 xfrm4_rcv_cb .active_extensions = (__u8)2, 4026533547 d00 eth0 10.244.3.124:35473->10.244.2.158:53 gro_cells_receive .active_extensions = (__u8)2, [...] 4026533547 0 eth0 10.244.3.124:35473->10.244.2.158:53 skb_do_redirect .active_extensions = (__u8)2, 4026534999 0 eth0 10.244.3.124:35473->10.244.2.158:53 ip_rcv .active_extensions = (__u8)2, 4026534999 0 eth0 10.244.3.124:35473->10.244.2.158:53 ip_rcv_core .active_extensions = (__u8)2, [...] 4026534999 0 eth0 10.244.3.124:35473->10.244.2.158:53 udp_queue_rcv_one_skb .active_extensions = (__u8)2, 4026534999 0 eth0 10.244.3.124:35473->10.244.2.158:53 __xfrm_policy_check .active_extensions = (__u8)2, 4026534999 0 eth0 10.244.3.124:35473->10.244.2.158:53 __xfrm_decode_session .active_extensions = (__u8)2, 4026534999 0 eth0 10.244.3.124:35473->10.244.2.158:53 security_xfrm_decode_session .active_extensions = (__u8)2, 4026534999 0 eth0 10.244.3.124:35473->10.244.2.158:53 kfree_skb_reason(SKB_DROP_REASON_XFRM_POLICY) .active_extensions = (__u8)2,
In this case, there are no XFRM policies in the container's network namespace so the drop is unexpected. When we decrypt the IPsec packet, the XFRM state used for decryption is set in the skb extensions. This information is preserved across the netns switch. When we reach the XFRM policy check in the container's netns, __xfrm_policy_check drops the packet with LINUX_MIB_XFRMINNOPOLS because a (container-side) XFRM policy can't be found that matches the (host-side) XFRM state used for decryption.
This patch fixes this by scrubbing the packet when using bpf_redirect_peer, as is done on typical netns switches via veth devices except skb->mark and skb->tstamp are not zeroed.
Fixes: 9aa1206e8f482 ("bpf: Add redirect_peer helper")
Signed-off-by: Paul Chaignon paul.chaignon@gmail.com
Acked-by: Daniel Borkmann daniel@iogearbox.net
Acked-by: Martin KaFai Lau martin.lau@kernel.org
Link: https://patch.msgid.link/1728ead5e0fe45e7a6542c36bd4e3ca07a73b7d6.1746460653...
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/core/filter.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/net/core/filter.c b/net/core/filter.c index b0df9b7d16d3f..6c8fbc96b14a3 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2509,6 +2509,7 @@ int skb_do_redirect(struct sk_buff *skb) goto out_drop; skb->dev = dev; dev_sw_netstats_rx_add(dev, skb->len); + skb_scrub_packet(skb, false); return -EAGAIN; } return flags & BPF_F_NEIGH ?
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 5f93185a757ff38b36f849c659aeef368db15a68 ]
Allow reserved multicast to ignore VLAN membership so STP and other management protocols work without a PVID VLAN configured when using a vlan aware bridge.
Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Jonas Gorski jonas.gorski@gmail.com
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com
Link: https://patch.msgid.link/20250429201710.330937-2-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/dsa/b53/b53_common.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 3b49e87e8ef72..a152f632a290e 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -373,9 +373,11 @@ static void b53_enable_vlan(struct b53_device *dev, int port, bool enable, b53_read8(dev, B53_VLAN_PAGE, B53_VLAN_CTRL5, &vc5); }
+ vc1 &= ~VC1_RX_MCST_FWD_EN; + if (enable) { vc0 |= VC0_VLAN_EN | VC0_VID_CHK_EN | VC0_VID_HASH_VID; - vc1 |= VC1_RX_MCST_UNTAG_EN | VC1_RX_MCST_FWD_EN; + vc1 |= VC1_RX_MCST_UNTAG_EN; vc4 &= ~VC4_ING_VID_CHECK_MASK; if (enable_filtering) { vc4 |= VC4_ING_VID_VIO_DROP << VC4_ING_VID_CHECK_S; @@ -393,7 +395,7 @@ static void b53_enable_vlan(struct b53_device *dev, int port, bool enable,
} else { vc0 &= ~(VC0_VLAN_EN | VC0_VID_CHK_EN | VC0_VID_HASH_VID); - vc1 &= ~(VC1_RX_MCST_UNTAG_EN | VC1_RX_MCST_FWD_EN); + vc1 &= ~VC1_RX_MCST_UNTAG_EN; vc4 &= ~VC4_ING_VID_CHECK_MASK; vc5 &= ~VC5_DROP_VTABLE_MISS;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 425f11d4cc9bd9e97e6825d9abb2c51a068ca7b5 ]
The Broadcom management header does not carry the original VLAN tag state information, just the ingress port, so for untagged frames we do not know from which VLAN they originated.
Therefore keep the CPU port always tagged except for VLAN 0.
Fixes the following setup:
$ ip link add br0 type bridge vlan_filtering 1
$ ip link set sw1p1 master br0
$ bridge vlan add dev br0 pvid untagged self
$ ip link add sw1p2.10 link sw1p2 type vlan id 10
Where VID 10 would stay untagged on the CPU port.
Fixes: 2c32a3d3c233 ("net: dsa: b53: Do not force CPU to be always tagged")
Signed-off-by: Jonas Gorski jonas.gorski@gmail.com
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com
Link: https://patch.msgid.link/20250429201710.330937-3-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/dsa/b53/b53_common.c | 8 ++++++++
 1 file changed, 8 insertions(+)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index a152f632a290e..772f8954ddf43 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1135,6 +1135,11 @@ static int b53_setup(struct dsa_switch *ds) */ ds->untag_bridge_pvid = dev->tag_protocol == DSA_TAG_PROTO_NONE;
+ /* The switch does not tell us the original VLAN for untagged + * packets, so keep the CPU port always tagged. + */ + ds->untag_vlan_aware_bridge_pvid = true; + ret = b53_reset_switch(dev); if (ret) { dev_err(ds->dev, "failed to reset switch\n"); @@ -1545,6 +1550,9 @@ int b53_vlan_add(struct dsa_switch *ds, int port, if (vlan->vid == 0 && vlan->vid == b53_default_pvid(dev)) untagged = true;
+ if (vlan->vid > 0 && dsa_is_cpu_port(ds, port)) + untagged = false; + vl->members |= BIT(port); if (untagged && !b53_vlan_port_needs_forced_tagged(ds, port)) vl->untag |= BIT(port);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit f480851981043d9bb6447ca9883ade9247b9a0ad ]
Currently the PVID of a port is only set when adding/updating VLANs with the PVID flag set or when removing VLANs, but not when clearing the PVID flag of a VLAN.
E.g. the following flow
$ ip link add br0 type bridge vlan_filtering 1
$ ip link set sw1p1 master bridge
$ bridge vlan add dev sw1p1 vid 10 pvid untagged
$ bridge vlan add dev sw1p1 vid 10 untagged
Would keep the PVID set as 10, despite the flag being cleared. Fix this by checking if we need to unset the PVID on vlan updates.
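The PVID decision this patch introduces can be read as a small pure function. The sketch below mirrors the three branches in the diff; the function name and the way the inputs are passed are assumptions made for illustration, since the driver reads them from the switch and from the VLAN object.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute the PVID a port should have after a
 * VLAN add/update, given the PVID it had before.
 */
uint16_t sketch_next_pvid(bool pvid_flag, uint16_t vid,
			  uint16_t old_pvid, uint16_t default_pvid)
{
	if (pvid_flag)
		return vid;		/* VLAN becomes the new PVID */

	if (vid == old_pvid)
		return default_pvid;	/* PVID flag was cleared on the current PVID */

	return old_pvid;		/* unrelated VLAN, PVID unchanged */
}
```

In the flow above, the second command updates VID 10 without the pvid flag while the port's PVID is 10, so the result falls back to the default PVID.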
Fixes: a2482d2ce349 ("net: dsa: b53: Plug in VLAN support") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Tested-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20250429201710.330937-4-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/b53/b53_common.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 772f8954ddf43..fb7560201d7a9 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1537,12 +1537,21 @@ int b53_vlan_add(struct dsa_switch *ds, int port, bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED; bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID; struct b53_vlan *vl; + u16 old_pvid, new_pvid; int err;
err = b53_vlan_prepare(ds, port, vlan); if (err) return err;
+ b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &old_pvid); + if (pvid) + new_pvid = vlan->vid; + else if (!pvid && vlan->vid == old_pvid) + new_pvid = b53_default_pvid(dev); + else + new_pvid = old_pvid; + vl = &dev->vlans[vlan->vid];
b53_get_vlan_entry(dev, vlan->vid, vl); @@ -1562,9 +1571,9 @@ int b53_vlan_add(struct dsa_switch *ds, int port, b53_set_vlan_entry(dev, vlan->vid, vl); b53_fast_age_vlan(dev, vlan->vid);
- if (pvid && !dsa_is_cpu_port(ds, port)) { + if (!dsa_is_cpu_port(ds, port) && new_pvid != old_pvid) { b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), - vlan->vid); + new_pvid); b53_fast_age_vlan(dev, vlan->vid); }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 083c6b28c0cbcd83b6af1a10f2c82937129b3438 ]
Presumably the intention here was to flush the VLAN of the old pvid, not the added VLAN again, which we already flushed before.
Fixes: a2482d2ce349 ("net: dsa: b53: Plug in VLAN support") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Tested-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20250429201710.330937-5-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/b53/b53_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index fb7560201d7a9..e75afba8b080a 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1574,7 +1574,7 @@ int b53_vlan_add(struct dsa_switch *ds, int port, if (!dsa_is_cpu_port(ds, port) && new_pvid != old_pvid) { b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), new_pvid); - b53_fast_age_vlan(dev, vlan->vid); + b53_fast_age_vlan(dev, old_pvid); }
return 0;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit a1c1901c5cc881425cc45992ab6c5418174e9e5a ]
The untagged default VLAN is added to the default vlan, which may be one, but we modify the VLAN 0 entry on bridge leave.
Fix this to use the correct VLAN entry for the default pvid.
Fixes: fea83353177a ("net: dsa: b53: Fix default VLAN ID") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Tested-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20250429201710.330937-6-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/b53/b53_common.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index e75afba8b080a..9745713e0b10d 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1986,7 +1986,7 @@ EXPORT_SYMBOL(b53_br_join); void b53_br_leave(struct dsa_switch *ds, int port, struct dsa_bridge bridge) { struct b53_device *dev = ds->priv; - struct b53_vlan *vl = &dev->vlans[0]; + struct b53_vlan *vl; s8 cpu_port = dsa_to_port(ds, port)->cpu_dp->index; unsigned int i; u16 pvlan, reg, pvid; @@ -2012,6 +2012,7 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct dsa_bridge bridge) dev->ports[port].vlan_ctl_mask = pvlan;
pvid = b53_default_pvid(dev); + vl = &dev->vlans[pvid];
/* Make this port join all VLANs without VLAN entries */ if (is58xx(dev)) {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 13b152ae40495966501697693f048f47430c50fd ]
While JOIN_ALL_VLAN allows a port to join all VLANs, we still need to keep the default VLAN enabled so that untagged traffic stays untagged.
So rejoin the default VLAN even for switches with JOIN_ALL_VLAN support.
Fixes: 48aea33a77ab ("net: dsa: b53: Add JOIN_ALL_VLAN support") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Tested-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20250429201710.330937-7-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/b53/b53_common.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 9745713e0b10d..305e3b5c804a2 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -2021,12 +2021,12 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct dsa_bridge bridge) if (!(reg & BIT(cpu_port))) reg |= BIT(cpu_port); b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg); - } else { - b53_get_vlan_entry(dev, pvid, vl); - vl->members |= BIT(port) | BIT(cpu_port); - vl->untag |= BIT(port) | BIT(cpu_port); - b53_set_vlan_entry(dev, pvid, vl); } + + b53_get_vlan_entry(dev, pvid, vl); + vl->members |= BIT(port) | BIT(cpu_port); + vl->untag |= BIT(port) | BIT(cpu_port); + b53_set_vlan_entry(dev, pvid, vl); } EXPORT_SYMBOL(b53_br_leave);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 45e9d59d39503bb3e6ab4d258caea4ba6496e2dc ]
Since we cannot set forwarding destinations per VLAN, we should not have a VLAN 0 configured, as it would allow untagged traffic to work across ports on VLAN aware bridges regardless of whether a PVID untagged VLAN exists.
So remove the VLAN 0 on join, and re-add it on leave. But only do so if we have a VLAN aware bridge, as without it, untagged traffic would become tagged with VID 0 on a VLAN unaware bridge.
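The membership bookkeeping on join and leave can be sketched with plain bitmasks. The struct and function names below are hypothetical stand-ins for the driver's VLAN entry handling, and the hardware read/write around each update is omitted.

```c
#include <stdint.h>

#define SKETCH_BIT(n) ((uint16_t)(1u << (n)))

/* Hypothetical stand-in for a VLAN table entry. */
struct sketch_vlan {
	uint16_t members;
	uint16_t untag;
};

/* On bridge join: take the port out of the default VLAN. If only the
 * CPU port would remain, drop it as well so the entry becomes empty.
 */
void sketch_join_remove_default(struct sketch_vlan *vl, unsigned int port,
				unsigned int cpu_port)
{
	vl->members &= (uint16_t)~SKETCH_BIT(port);
	if (vl->members == SKETCH_BIT(cpu_port))
		vl->members &= (uint16_t)~SKETCH_BIT(cpu_port);
	vl->untag = vl->members;
}

/* On bridge leave: put the port and the CPU port back, untagged. */
void sketch_leave_readd_default(struct sketch_vlan *vl, unsigned int port,
				unsigned int cpu_port)
{
	vl->members |= SKETCH_BIT(port) | SKETCH_BIT(cpu_port);
	vl->untag |= SKETCH_BIT(port) | SKETCH_BIT(cpu_port);
}
```

In the patch, both operations are only performed when the bridge is VLAN aware.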
Fixes: a2482d2ce349 ("net: dsa: b53: Plug in VLAN support") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Tested-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20250429201710.330937-8-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/b53/b53_common.c | 36 ++++++++++++++++++++++++-------- 1 file changed, 27 insertions(+), 9 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 305e3b5c804a2..24d3d693086b2 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1544,6 +1544,9 @@ int b53_vlan_add(struct dsa_switch *ds, int port, if (err) return err;
+ if (vlan->vid == 0) + return 0; + b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &old_pvid); if (pvid) new_pvid = vlan->vid; @@ -1556,10 +1559,7 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
b53_get_vlan_entry(dev, vlan->vid, vl);
- if (vlan->vid == 0 && vlan->vid == b53_default_pvid(dev)) - untagged = true; - - if (vlan->vid > 0 && dsa_is_cpu_port(ds, port)) + if (dsa_is_cpu_port(ds, port)) untagged = false;
vl->members |= BIT(port); @@ -1589,6 +1589,9 @@ int b53_vlan_del(struct dsa_switch *ds, int port, struct b53_vlan *vl; u16 pvid;
+ if (vlan->vid == 0) + return 0; + b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &pvid);
vl = &dev->vlans[vlan->vid]; @@ -1935,8 +1938,9 @@ int b53_br_join(struct dsa_switch *ds, int port, struct dsa_bridge bridge, bool *tx_fwd_offload, struct netlink_ext_ack *extack) { struct b53_device *dev = ds->priv; + struct b53_vlan *vl; s8 cpu_port = dsa_to_port(ds, port)->cpu_dp->index; - u16 pvlan, reg; + u16 pvlan, reg, pvid; unsigned int i;
/* On 7278, port 7 which connects to the ASP should only receive @@ -1945,6 +1949,9 @@ int b53_br_join(struct dsa_switch *ds, int port, struct dsa_bridge bridge, if (dev->chip_id == BCM7278_DEVICE_ID && port == 7) return -EINVAL;
+ pvid = b53_default_pvid(dev); + vl = &dev->vlans[pvid]; + /* Make this port leave the all VLANs join since we will have proper * VLAN entries from now on */ @@ -1956,6 +1963,15 @@ int b53_br_join(struct dsa_switch *ds, int port, struct dsa_bridge bridge, b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg); }
+ if (ds->vlan_filtering) { + b53_get_vlan_entry(dev, pvid, vl); + vl->members &= ~BIT(port); + if (vl->members == BIT(cpu_port)) + vl->members &= ~BIT(cpu_port); + vl->untag = vl->members; + b53_set_vlan_entry(dev, pvid, vl); + } + b53_read16(dev, B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(port), &pvlan);
b53_for_each_port(dev, i) { @@ -2023,10 +2039,12 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct dsa_bridge bridge) b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg); }
- b53_get_vlan_entry(dev, pvid, vl); - vl->members |= BIT(port) | BIT(cpu_port); - vl->untag |= BIT(port) | BIT(cpu_port); - b53_set_vlan_entry(dev, pvid, vl); + if (ds->vlan_filtering) { + b53_get_vlan_entry(dev, pvid, vl); + vl->members |= BIT(port) | BIT(cpu_port); + vl->untag |= BIT(port) | BIT(cpu_port); + b53_set_vlan_entry(dev, pvid, vl); + } } EXPORT_SYMBOL(b53_br_leave);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit f089652b6b16452535dcc5cbaa6e2bb05acd3f93 ]
Documentation/networking/switchdev.rst says:
- with VLAN filtering turned off: the bridge is strictly VLAN unaware and its data path will process all Ethernet frames as if they are VLAN-untagged. The bridge VLAN database can still be modified, but the modifications should have no effect while VLAN filtering is turned off.
This breaks if we immediately apply the VLAN configuration, so skip writing it when vlan_filtering is off.
Fixes: 0ee2af4ebbe3 ("net: dsa: set configure_vlan_while_not_filtering to true by default") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Tested-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20250429201710.330937-9-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/b53/b53_common.c | 48 +++++++++++++++++++------------- 1 file changed, 28 insertions(+), 20 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 24d3d693086b2..bc51c9d807768 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1547,6 +1547,9 @@ int b53_vlan_add(struct dsa_switch *ds, int port, if (vlan->vid == 0) return 0;
+ if (!ds->vlan_filtering) + return 0; + b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &old_pvid); if (pvid) new_pvid = vlan->vid; @@ -1592,6 +1595,9 @@ int b53_vlan_del(struct dsa_switch *ds, int port, if (vlan->vid == 0) return 0;
+ if (!ds->vlan_filtering) + return 0; + b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &pvid);
vl = &dev->vlans[vlan->vid]; @@ -1952,18 +1958,20 @@ int b53_br_join(struct dsa_switch *ds, int port, struct dsa_bridge bridge, pvid = b53_default_pvid(dev); vl = &dev->vlans[pvid];
- /* Make this port leave the all VLANs join since we will have proper - * VLAN entries from now on - */ - if (is58xx(dev)) { - b53_read16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, &reg); - reg &= ~BIT(port); - if ((reg & BIT(cpu_port)) == BIT(cpu_port)) - reg &= ~BIT(cpu_port); - b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg); - } - if (ds->vlan_filtering) { + /* Make this port leave the all VLANs join since we will have + * proper VLAN entries from now on + */ + if (is58xx(dev)) { + b53_read16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, + &reg); + reg &= ~BIT(port); + if ((reg & BIT(cpu_port)) == BIT(cpu_port)) + reg &= ~BIT(cpu_port); + b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, + reg); + } + b53_get_vlan_entry(dev, pvid, vl); vl->members &= ~BIT(port); if (vl->members == BIT(cpu_port)) @@ -2030,16 +2038,16 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct dsa_bridge bridge) pvid = b53_default_pvid(dev); vl = &dev->vlans[pvid];
- /* Make this port join all VLANs without VLAN entries */ - if (is58xx(dev)) { - b53_read16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, &reg); - reg |= BIT(port); - if (!(reg & BIT(cpu_port))) - reg |= BIT(cpu_port); - b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg); - } - if (ds->vlan_filtering) { + /* Make this port join all VLANs without VLAN entries */ + if (is58xx(dev)) { + b53_read16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, &reg); + reg |= BIT(port); + if (!(reg & BIT(cpu_port))) + reg |= BIT(cpu_port); + b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg); + } + b53_get_vlan_entry(dev, pvid, vl); vl->members |= BIT(port) | BIT(cpu_port); vl->untag |= BIT(port) | BIT(cpu_port);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 2dc2bd57111582895e10f54ea380329c89873f1c ]
To allow runtime switching between vlan aware and vlan non-aware mode, we need to properly keep track of any bridge VLAN configuration. Likewise, we need to know when we actually switch between both modes, to not have to rewrite the full VLAN table every time we update the VLANs.
So keep track of the current vlan_filtering mode, and on changes, apply the appropriate VLAN configuration.
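The "apply only on change" idea can be sketched with a cached mode and a counter standing in for the expensive table rewrite. All names here are hypothetical; the counter exists only to make the effect observable and has no counterpart in the driver.

```c
#include <stdbool.h>

/* Hypothetical device state: the mode last applied to hardware and a
 * counter that stands in for a full VLAN table rewrite.
 */
struct sketch_switch {
	bool vlan_filtering;
	unsigned int table_rewrites;
};

/* Stand-in for reapplying the whole configuration. */
static void sketch_apply_config(struct sketch_switch *sw)
{
	sw->table_rewrites++;
}

/* Only rewrite the table when the requested mode differs from the
 * cached one; repeated requests for the same mode are no-ops.
 */
int sketch_set_vlan_filtering(struct sketch_switch *sw, bool vlan_filtering)
{
	if (sw->vlan_filtering != vlan_filtering) {
		sw->vlan_filtering = vlan_filtering;
		sketch_apply_config(sw);
	}

	return 0;
}
```

Keeping the mode in driver state, rather than consulting the DSA core's flag at each call site, is what lets VLAN add/delete update the software table unconditionally and defer hardware writes until filtering is actually on.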
Fixes: 0ee2af4ebbe3 ("net: dsa: set configure_vlan_while_not_filtering to true by default") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Tested-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20250429201710.330937-10-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/b53/b53_common.c | 104 ++++++++++++++++++++++--------- drivers/net/dsa/b53/b53_priv.h | 2 + 2 files changed, 75 insertions(+), 31 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index bc51c9d807768..118457e28e717 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -763,6 +763,22 @@ static bool b53_vlan_port_needs_forced_tagged(struct dsa_switch *ds, int port) return dev->tag_protocol == DSA_TAG_PROTO_NONE && dsa_is_cpu_port(ds, port); }
+static bool b53_vlan_port_may_join_untagged(struct dsa_switch *ds, int port) +{ + struct b53_device *dev = ds->priv; + struct dsa_port *dp; + + if (!dev->vlan_filtering) + return true; + + dp = dsa_to_port(ds, port); + + if (dsa_port_is_cpu(dp)) + return true; + + return dp->bridge == NULL; +} + int b53_configure_vlan(struct dsa_switch *ds) { struct b53_device *dev = ds->priv; @@ -781,7 +797,7 @@ int b53_configure_vlan(struct dsa_switch *ds) b53_do_vlan_op(dev, VTA_CMD_CLEAR); }
- b53_enable_vlan(dev, -1, dev->vlan_enabled, ds->vlan_filtering); + b53_enable_vlan(dev, -1, dev->vlan_enabled, dev->vlan_filtering);
/* Create an untagged VLAN entry for the default PVID in case * CONFIG_VLAN_8021Q is disabled and there are no calls to @@ -789,26 +805,39 @@ int b53_configure_vlan(struct dsa_switch *ds) * entry. Do this only when the tagging protocol is not * DSA_TAG_PROTO_NONE */ + v = &dev->vlans[def_vid]; b53_for_each_port(dev, i) { - v = &dev->vlans[def_vid]; - v->members |= BIT(i); + if (!b53_vlan_port_may_join_untagged(ds, i)) + continue; + + vl.members |= BIT(i); if (!b53_vlan_port_needs_forced_tagged(ds, i)) - v->untag = v->members; - b53_write16(dev, B53_VLAN_PAGE, - B53_VLAN_PORT_DEF_TAG(i), def_vid); + vl.untag = vl.members; + b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(i), + def_vid); } + b53_set_vlan_entry(dev, def_vid, &vl);
- /* Upon initial call we have not set-up any VLANs, but upon - * system resume, we need to restore all VLAN entries. - */ - for (vid = def_vid; vid < dev->num_vlans; vid++) { - v = &dev->vlans[vid]; + if (dev->vlan_filtering) { + /* Upon initial call we have not set-up any VLANs, but upon + * system resume, we need to restore all VLAN entries. + */ + for (vid = def_vid + 1; vid < dev->num_vlans; vid++) { + v = &dev->vlans[vid];
- if (!v->members) - continue; + if (!v->members) + continue; + + b53_set_vlan_entry(dev, vid, v); + b53_fast_age_vlan(dev, vid); + }
- b53_set_vlan_entry(dev, vid, v); - b53_fast_age_vlan(dev, vid); + b53_for_each_port(dev, i) { + if (!dsa_is_cpu_port(ds, i)) + b53_write16(dev, B53_VLAN_PAGE, + B53_VLAN_PORT_DEF_TAG(i), + dev->ports[i].pvid); + } }
return 0; @@ -1127,7 +1156,9 @@ EXPORT_SYMBOL(b53_setup_devlink_resources); static int b53_setup(struct dsa_switch *ds) { struct b53_device *dev = ds->priv; + struct b53_vlan *vl; unsigned int port; + u16 pvid; int ret;
/* Request bridge PVID untagged when DSA_TAG_PROTO_NONE is set @@ -1146,6 +1177,15 @@ static int b53_setup(struct dsa_switch *ds) return ret; }
+ /* setup default vlan for filtering mode */ + pvid = b53_default_pvid(dev); + vl = &dev->vlans[pvid]; + b53_for_each_port(dev, port) { + vl->members |= BIT(port); + if (!b53_vlan_port_needs_forced_tagged(ds, port)) + vl->untag |= BIT(port); + } + b53_reset_mib(dev);
ret = b53_apply_config(dev); @@ -1499,7 +1539,10 @@ int b53_vlan_filtering(struct dsa_switch *ds, int port, bool vlan_filtering, { struct b53_device *dev = ds->priv;
- b53_enable_vlan(dev, port, dev->vlan_enabled, vlan_filtering); + if (dev->vlan_filtering != vlan_filtering) { + dev->vlan_filtering = vlan_filtering; + b53_apply_config(dev); + }
return 0; } @@ -1524,7 +1567,7 @@ static int b53_vlan_prepare(struct dsa_switch *ds, int port, if (vlan->vid >= dev->num_vlans) return -ERANGE;
- b53_enable_vlan(dev, port, true, ds->vlan_filtering); + b53_enable_vlan(dev, port, true, dev->vlan_filtering);
return 0; } @@ -1547,21 +1590,17 @@ int b53_vlan_add(struct dsa_switch *ds, int port, if (vlan->vid == 0) return 0;
- if (!ds->vlan_filtering) - return 0; - - b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &old_pvid); + old_pvid = dev->ports[port].pvid; if (pvid) new_pvid = vlan->vid; else if (!pvid && vlan->vid == old_pvid) new_pvid = b53_default_pvid(dev); else new_pvid = old_pvid; + dev->ports[port].pvid = new_pvid;
vl = &dev->vlans[vlan->vid];
- b53_get_vlan_entry(dev, vlan->vid, vl); - if (dsa_is_cpu_port(ds, port)) untagged = false;
@@ -1571,6 +1610,9 @@ int b53_vlan_add(struct dsa_switch *ds, int port, else vl->untag &= ~BIT(port);
+ if (!dev->vlan_filtering) + return 0; + b53_set_vlan_entry(dev, vlan->vid, vl); b53_fast_age_vlan(dev, vlan->vid);
@@ -1595,23 +1637,22 @@ int b53_vlan_del(struct dsa_switch *ds, int port, if (vlan->vid == 0) return 0;
- if (!ds->vlan_filtering) - return 0; - - b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &pvid); + pvid = dev->ports[port].pvid;
vl = &dev->vlans[vlan->vid];
- b53_get_vlan_entry(dev, vlan->vid, vl); - vl->members &= ~BIT(port);
if (pvid == vlan->vid) pvid = b53_default_pvid(dev); + dev->ports[port].pvid = pvid;
if (untagged && !b53_vlan_port_needs_forced_tagged(ds, port)) vl->untag &= ~(BIT(port));
+ if (!dev->vlan_filtering) + return 0; + b53_set_vlan_entry(dev, vlan->vid, vl); b53_fast_age_vlan(dev, vlan->vid);
@@ -1958,7 +1999,7 @@ int b53_br_join(struct dsa_switch *ds, int port, struct dsa_bridge bridge, pvid = b53_default_pvid(dev); vl = &dev->vlans[pvid];
- if (ds->vlan_filtering) { + if (dev->vlan_filtering) { /* Make this port leave the all VLANs join since we will have * proper VLAN entries from now on */ @@ -2038,7 +2079,7 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct dsa_bridge bridge) pvid = b53_default_pvid(dev); vl = &dev->vlans[pvid];
- if (ds->vlan_filtering) { + if (dev->vlan_filtering) { /* Make this port join all VLANs without VLAN entries */ if (is58xx(dev)) { b53_read16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, &reg); @@ -2790,6 +2831,7 @@ struct b53_device *b53_switch_alloc(struct device *base, ds->ops = &b53_switch_ops; ds->phylink_mac_ops = &b53_phylink_mac_ops; dev->vlan_enabled = true; + dev->vlan_filtering = false; /* Let DSA handle the case were multiple bridges span the same switch * device and different VLAN awareness settings are requested, which * would be breaking filtering semantics for any of the other bridge diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h index 9e9b5bc0c5d6a..982d1867f76b5 100644 --- a/drivers/net/dsa/b53/b53_priv.h +++ b/drivers/net/dsa/b53/b53_priv.h @@ -95,6 +95,7 @@ struct b53_pcs {
struct b53_port { u16 vlan_ctl_mask; + u16 pvid; struct ethtool_keee eee; };
@@ -146,6 +147,7 @@ struct b53_device { unsigned int num_vlans; struct b53_vlan *vlans; bool vlan_enabled; + bool vlan_filtering; unsigned int num_ports; struct b53_port *ports;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 9f34ad89bcf0e6df6f8b01f1bdab211493fc66d1 ]
When VLAN filtering is off, we configure the switch to forward, but not learn on VLAN table misses. This effectively disables learning while not filtering.
Fix this by switching to forward and learn. Setting the learning disable register will still control whether learning actually happens.
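As a rough sketch of the register update, the ingress VID check mode is a small field that is cleared and then set according to the filtering state. The shift, mask, and mode values below are assumptions chosen for illustration; the authoritative definitions are in the driver's register header and may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define SKETCH_ING_VID_CHECK_S		6
#define SKETCH_ING_VID_CHECK_MASK	(0x3u << SKETCH_ING_VID_CHECK_S)
#define SKETCH_ING_VID_VIO_FWD		0u	/* forward, but do not learn */
#define SKETCH_ING_VID_VIO_DROP		1u	/* drop VID violations */
#define SKETCH_NO_ING_VID_CHK		2u	/* no check: forward and learn */

/* Hypothetical helper: return the control value with the ingress check
 * field set for the given filtering mode, leaving other bits untouched.
 */
uint16_t sketch_vc4_ingress_check(uint16_t vc4, bool filtering)
{
	vc4 &= (uint16_t)~SKETCH_ING_VID_CHECK_MASK;

	if (filtering)
		vc4 |= SKETCH_ING_VID_VIO_DROP << SKETCH_ING_VID_CHECK_S;
	else
		vc4 |= SKETCH_NO_ING_VID_CHK << SKETCH_ING_VID_CHECK_S;

	return vc4;
}
```

Before this patch the non-filtering branch selected the forward-without-learning mode, which is why learning was effectively off regardless of the learning disable register.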
Fixes: dad8d7c6452b ("net: dsa: b53: Properly account for VLAN filtering") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Tested-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20250429201710.330937-11-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/b53/b53_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 118457e28e717..0c79c6c0a9187 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -383,7 +383,7 @@ static void b53_enable_vlan(struct b53_device *dev, int port, bool enable, vc4 |= VC4_ING_VID_VIO_DROP << VC4_ING_VID_CHECK_S; vc5 |= VC5_DROP_VTABLE_MISS; } else { - vc4 |= VC4_ING_VID_VIO_FWD << VC4_ING_VID_CHECK_S; + vc4 |= VC4_NO_ING_VID_CHK << VC4_ING_VID_CHECK_S; vc5 &= ~VC5_DROP_VTABLE_MISS; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 2e7179c628d3cb9aee75e412473813b099e11ed4 ]
When a port gets set up, b53 disables learning and enables the port for flooding. This can undo any bridge configuration on the port.
E.g. the following flow would disable learning on a port:
$ ip link add br0 type bridge
$ ip link set sw1p1 master br0 <- enables learning for sw1p1
$ ip link set br0 up
$ ip link set sw1p1 up <- disables learning again
Fix this by populating dsa_switch_ops::port_setup(), and set up initial config there.
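The split between one-time setup and per-up enable can be sketched with a toy port model. The struct and function names are hypothetical; they only model which step owns which flags.

```c
#include <stdbool.h>

/* Hypothetical port state. */
struct sketch_port {
	bool ucast_flood;
	bool mcast_flood;
	bool learning;
	bool up;
};

/* One-time defaults, analogous to the port_setup callback. */
void sketch_port_setup(struct sketch_port *p)
{
	p->ucast_flood = true;
	p->mcast_flood = true;
	p->learning = false;
}

/* Runs whenever the port is brought up, analogous to port_enable.
 * It must not touch flags that the bridge may have configured.
 */
void sketch_port_enable(struct sketch_port *p)
{
	p->up = true;
}

/* Stand-in for the bridge changing the learning flag on join. */
void sketch_bridge_set_learning(struct sketch_port *p, bool on)
{
	p->learning = on;
}
```

Running setup, then the bridge update, then enable leaves learning on, which is the ordering the flow above needs.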
Fixes: f9b3827ee66c ("net: dsa: b53: Support setting learning on port") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Tested-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20250429201710.330937-12-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/b53/b53_common.c | 21 +++++++++++++-------- drivers/net/dsa/b53/b53_priv.h | 1 + drivers/net/dsa/bcm_sf2.c | 1 + 3 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 0c79c6c0a9187..e3b5b450ee932 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -578,6 +578,18 @@ static void b53_eee_enable_set(struct dsa_switch *ds, int port, bool enable) b53_write16(dev, B53_EEE_PAGE, B53_EEE_EN_CTRL, reg); }
+int b53_setup_port(struct dsa_switch *ds, int port) +{ + struct b53_device *dev = ds->priv; + + b53_port_set_ucast_flood(dev, port, true); + b53_port_set_mcast_flood(dev, port, true); + b53_port_set_learning(dev, port, false); + + return 0; +} +EXPORT_SYMBOL(b53_setup_port); + int b53_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy) { struct b53_device *dev = ds->priv; @@ -590,10 +602,6 @@ int b53_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy)
cpu_port = dsa_to_port(ds, port)->cpu_dp->index;
- b53_port_set_ucast_flood(dev, port, true); - b53_port_set_mcast_flood(dev, port, true); - b53_port_set_learning(dev, port, false); - if (dev->ops->irq_enable) ret = dev->ops->irq_enable(dev, port); if (ret) @@ -724,10 +732,6 @@ static void b53_enable_cpu_port(struct b53_device *dev, int port) b53_write8(dev, B53_CTRL_PAGE, B53_PORT_CTRL(port), port_ctrl);
b53_brcm_hdr_setup(dev->ds, port); - - b53_port_set_ucast_flood(dev, port, true); - b53_port_set_mcast_flood(dev, port, true); - b53_port_set_learning(dev, port, false); }
static void b53_enable_mib(struct b53_device *dev) @@ -2387,6 +2391,7 @@ static const struct dsa_switch_ops b53_switch_ops = { .phy_read = b53_phy_read16, .phy_write = b53_phy_write16, .phylink_get_caps = b53_phylink_get_caps, + .port_setup = b53_setup_port, .port_enable = b53_enable_port, .port_disable = b53_disable_port, .support_eee = b53_support_eee, diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h index 982d1867f76b5..cc86aa777df56 100644 --- a/drivers/net/dsa/b53/b53_priv.h +++ b/drivers/net/dsa/b53/b53_priv.h @@ -382,6 +382,7 @@ enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds, int port, enum dsa_tag_protocol mprot); void b53_mirror_del(struct dsa_switch *ds, int port, struct dsa_mall_mirror_tc_entry *mirror); +int b53_setup_port(struct dsa_switch *ds, int port); int b53_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy); void b53_disable_port(struct dsa_switch *ds, int port); void b53_brcm_hdr_setup(struct dsa_switch *ds, int port); diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c index fa2bf3fa90191..454a8c7fd7eea 100644 --- a/drivers/net/dsa/bcm_sf2.c +++ b/drivers/net/dsa/bcm_sf2.c @@ -1230,6 +1230,7 @@ static const struct dsa_switch_ops bcm_sf2_ops = { .resume = bcm_sf2_sw_resume, .get_wol = bcm_sf2_sw_get_wol, .set_wol = bcm_sf2_sw_set_wol, + .port_setup = b53_setup_port, .port_enable = bcm_sf2_port_setup, .port_disable = bcm_sf2_port_disable, .support_eee = b53_support_eee,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Duyck alexanderduyck@fb.com
[ Upstream commit f34343cc11afc7bb1f881c3492bee3484016bf71 ]
Address two issues with the FW mailbox descriptor initialization.
We need to reverse the order of accesses when we invalidate an entry versus writing an entry. When writing an entry we write upper and then lower as the lower 32b contain the valid bit that makes the entire address valid. However for invalidation we should write it in the reverse order so that the upper is marked invalid before we update it.
Without this change we may see FW attempt to access pages with the upper 32b of the address set to 0 which will likely result in DMAR faults due to write access failures on mailbox shutdown.
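The two write orders can be sketched against a fake two-register descriptor that records the order of accesses. Everything here is a hypothetical model: the struct, the logging, and the function names are for illustration, and the flush the driver issues between the two writes is omitted.

```c
#include <stdint.h>

#define SKETCH_LOG_LEN 8

/* Hypothetical descriptor slot: regs[0] is the lower 32b (holds the
 * valid bit), regs[1] is the upper 32b. order[] logs write order.
 */
struct sketch_desc {
	uint32_t regs[2];
	int order[SKETCH_LOG_LEN];
	int n;
};

static void sketch_wr32(struct sketch_desc *d, int idx, uint32_t val)
{
	d->regs[idx] = val;
	if (d->n < SKETCH_LOG_LEN)
		d->order[d->n++] = idx;
}

/* Writing an entry: upper first, then lower, so the address is complete
 * before the valid bit becomes visible.
 */
void sketch_write_desc(struct sketch_desc *d, uint64_t desc)
{
	sketch_wr32(d, 1, (uint32_t)(desc >> 32));
	sketch_wr32(d, 0, (uint32_t)desc);
}

/* Invalidating an entry: lower first, so the entry is marked invalid
 * before the upper half of the address is cleared.
 */
void sketch_invalidate_desc(struct sketch_desc *d, uint32_t desc)
{
	sketch_wr32(d, 0, desc);
	sketch_wr32(d, 1, 0);
}
```

In both directions the half that carries the valid bit is written at the point where the other half is in a consistent state.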
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck alexanderduyck@fb.com Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/174654717972.499179.8083789731819297034.stgit@ahduy... Reviewed-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 32 ++++++++++++++++------ 1 file changed, 23 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c index bbc7c1c0c37ef..9996a70a1f872 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c @@ -17,11 +17,29 @@ static void __fbnic_mbx_wr_desc(struct fbnic_dev *fbd, int mbx_idx, { u32 desc_offset = FBNIC_IPC_MBX(mbx_idx, desc_idx);
+ /* Write the upper 32b and then the lower 32b. Doing this the + * FW can then read lower, upper, lower to verify that the state + * of the descriptor wasn't changed mid-transaction. + */ fw_wr32(fbd, desc_offset + 1, upper_32_bits(desc)); fw_wrfl(fbd); fw_wr32(fbd, desc_offset, lower_32_bits(desc)); }
+static void __fbnic_mbx_invalidate_desc(struct fbnic_dev *fbd, int mbx_idx, + int desc_idx, u32 desc) +{ + u32 desc_offset = FBNIC_IPC_MBX(mbx_idx, desc_idx); + + /* For initialization we write the lower 32b of the descriptor first. + * This way we can set the state to mark it invalid before we clear the + * upper 32b. + */ + fw_wr32(fbd, desc_offset, desc); + fw_wrfl(fbd); + fw_wr32(fbd, desc_offset + 1, 0); +} + static u64 __fbnic_mbx_rd_desc(struct fbnic_dev *fbd, int mbx_idx, int desc_idx) { u32 desc_offset = FBNIC_IPC_MBX(mbx_idx, desc_idx); @@ -41,21 +59,17 @@ static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx) * solid stop for the firmware to hit when it is done looping * through the ring. */ - __fbnic_mbx_wr_desc(fbd, mbx_idx, 0, 0); - - fw_wrfl(fbd); + __fbnic_mbx_invalidate_desc(fbd, mbx_idx, 0, 0);
/* We then fill the rest of the ring starting at the end and moving * back toward descriptor 0 with skip descriptors that have no * length nor address, and tell the firmware that they can skip * them and just move past them to the one we initialized to 0. */ - for (desc_idx = FBNIC_IPC_MBX_DESC_LEN; --desc_idx;) { - __fbnic_mbx_wr_desc(fbd, mbx_idx, desc_idx, - FBNIC_IPC_MBX_DESC_FW_CMPL | - FBNIC_IPC_MBX_DESC_HOST_CMPL); - fw_wrfl(fbd); - } + for (desc_idx = FBNIC_IPC_MBX_DESC_LEN; --desc_idx;) + __fbnic_mbx_invalidate_desc(fbd, mbx_idx, desc_idx, + FBNIC_IPC_MBX_DESC_FW_CMPL | + FBNIC_IPC_MBX_DESC_HOST_CMPL); }
void fbnic_mbx_init(struct fbnic_dev *fbd)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Duyck alexanderduyck@fb.com
[ Upstream commit 3b12f00ddd08e888273b2ac0488d396d90a836fc ]
In order to prevent the device from throwing spurious writes and/or reads at us we need to gate the AXI fabric interface to the PCIe until such time as we know the FW is in a known good state.
To accomplish this we use the mailbox as a mechanism for us to recognize that the FW has acknowledged our presence and is no longer sending any stale message data to us.
We start in fbnic_mbx_init by calling fbnic_mbx_reset_desc_ring function, disabling the DMA in both directions, and then invalidating all the descriptors in each ring.
We then poll the mailbox in fbnic_mbx_poll_tx_ready and when the interrupt is set by the FW we pick it up and mark the mailboxes as ready, while also enabling the DMA.
Once we have completed all the transactions and need to shut down we call into fbnic_mbx_clean which will in turn call fbnic_mbx_reset_desc_ring for each ring and shut down the DMA and once again invalidate the descriptors.
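The lifecycle described above (reset gates DMA, ready enables it, clean gates it again) can be sketched in userspace. This is only an illustration, not the driver code: struct fake_dev, mbx_reset_ring, mbx_init_ring and dma_enabled are hypothetical names, and the register is modelled as a plain integer. The two bit positions are the ones added in the fbnic_csr.h hunk below; reading BME as "bus master enable" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as they appear in the fbnic_csr.h hunk of this patch. */
#define CFG_BME   (1u << 18)	/* assumed meaning: bus master enable */
#define CFG_FLUSH (1u << 19)	/* flush outstanding transactions */

enum { MBX_RX, MBX_TX, MBX_COUNT };

/* Hypothetical stand-in for the AW (Rx mailbox) and AR (Tx mailbox) config registers. */
struct fake_dev {
	uint32_t cfg[MBX_COUNT];
};

/* Reset path: gate DMA for this direction by writing only the flush bit. */
static void mbx_reset_ring(struct fake_dev *d, int idx)
{
	assert(idx >= 0 && idx < MBX_COUNT);
	d->cfg[idx] = CFG_FLUSH;
}

/* Ready path: the firmware has responded, so open DMA for this direction. */
static void mbx_init_ring(struct fake_dev *d, int idx)
{
	assert(idx >= 0 && idx < MBX_COUNT);
	d->cfg[idx] = CFG_BME;
}

static bool dma_enabled(const struct fake_dev *d, int idx)
{
	return (d->cfg[idx] & CFG_BME) != 0;
}
```

In this model a ring only has DMA enabled between a successful ready transition and the next reset, which is the gating property the commit message describes.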
Fixes: 3646153161f1 ("eth: fbnic: Add register init to set PCIe/Ethernet device config")
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck alexanderduyck@fb.com
Reviewed-by: Simon Horman horms@kernel.org
Reviewed-by: Jacob Keller jacob.e.keller@intel.com
Link: https://patch.msgid.link/174654718623.499179.7445197308109347982.stgit@ahduy...
Reviewed-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/meta/fbnic/fbnic_csr.h |  2 ++
 drivers/net/ethernet/meta/fbnic/fbnic_fw.c  | 38 +++++++++++++++++----
 drivers/net/ethernet/meta/fbnic/fbnic_mac.c |  6 ----
 3 files changed, 33 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_csr.h b/drivers/net/ethernet/meta/fbnic/fbnic_csr.h index 02bb81b3c5063..bf1655edeed2a 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_csr.h +++ b/drivers/net/ethernet/meta/fbnic/fbnic_csr.h @@ -785,8 +785,10 @@ enum { /* PUL User Registers */ #define FBNIC_CSR_START_PUL_USER 0x31000 /* CSR section delimiter */ #define FBNIC_PUL_OB_TLP_HDR_AW_CFG 0x3103d /* 0xc40f4 */ +#define FBNIC_PUL_OB_TLP_HDR_AW_CFG_FLUSH CSR_BIT(19) #define FBNIC_PUL_OB_TLP_HDR_AW_CFG_BME CSR_BIT(18) #define FBNIC_PUL_OB_TLP_HDR_AR_CFG 0x3103e /* 0xc40f8 */ +#define FBNIC_PUL_OB_TLP_HDR_AR_CFG_FLUSH CSR_BIT(19) #define FBNIC_PUL_OB_TLP_HDR_AR_CFG_BME CSR_BIT(18) #define FBNIC_CSR_END_PUL_USER 0x31080 /* CSR section delimiter */
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c index 9996a70a1f872..dc90df287c0a8 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c @@ -51,10 +51,26 @@ static u64 __fbnic_mbx_rd_desc(struct fbnic_dev *fbd, int mbx_idx, int desc_idx) return desc; }
-static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx) +static void fbnic_mbx_reset_desc_ring(struct fbnic_dev *fbd, int mbx_idx) { int desc_idx;
+ /* Disable DMA transactions from the device, + * and flush any transactions triggered during cleaning + */ + switch (mbx_idx) { + case FBNIC_IPC_MBX_RX_IDX: + wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AW_CFG, + FBNIC_PUL_OB_TLP_HDR_AW_CFG_FLUSH); + break; + case FBNIC_IPC_MBX_TX_IDX: + wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AR_CFG, + FBNIC_PUL_OB_TLP_HDR_AR_CFG_FLUSH); + break; + } + + wrfl(fbd); + /* Initialize first descriptor to all 0s. Doing this gives us a * solid stop for the firmware to hit when it is done looping * through the ring. @@ -90,7 +106,7 @@ void fbnic_mbx_init(struct fbnic_dev *fbd) wr32(fbd, FBNIC_INTR_CLEAR(0), 1u << FBNIC_FW_MSIX_ENTRY);
for (i = 0; i < FBNIC_IPC_MBX_INDICES; i++) - fbnic_mbx_init_desc_ring(fbd, i); + fbnic_mbx_reset_desc_ring(fbd, i); }
static int fbnic_mbx_map_msg(struct fbnic_dev *fbd, int mbx_idx, @@ -155,7 +171,7 @@ static void fbnic_mbx_clean_desc_ring(struct fbnic_dev *fbd, int mbx_idx) { int i;
- fbnic_mbx_init_desc_ring(fbd, mbx_idx); + fbnic_mbx_reset_desc_ring(fbd, mbx_idx);
for (i = FBNIC_IPC_MBX_DESC_LEN; i--;) fbnic_mbx_unmap_and_free_msg(fbd, mbx_idx, i); @@ -354,7 +370,7 @@ static int fbnic_fw_xmit_cap_msg(struct fbnic_dev *fbd) return (err == -EOPNOTSUPP) ? 0 : err; }
-static void fbnic_mbx_postinit_desc_ring(struct fbnic_dev *fbd, int mbx_idx) +static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx) { struct fbnic_fw_mbx *mbx = &fbd->mbx[mbx_idx];
@@ -366,10 +382,18 @@ static void fbnic_mbx_postinit_desc_ring(struct fbnic_dev *fbd, int mbx_idx)
switch (mbx_idx) { case FBNIC_IPC_MBX_RX_IDX: + /* Enable DMA writes from the device */ + wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AW_CFG, + FBNIC_PUL_OB_TLP_HDR_AW_CFG_BME); + /* Make sure we have a page for the FW to write to */ fbnic_mbx_alloc_rx_msgs(fbd); break; case FBNIC_IPC_MBX_TX_IDX: + /* Enable DMA reads from the device */ + wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AR_CFG, + FBNIC_PUL_OB_TLP_HDR_AR_CFG_BME); + /* Force version to 1 if we successfully requested an update * from the firmware. This should be overwritten once we get * the actual version from the firmware in the capabilities @@ -386,7 +410,7 @@ static void fbnic_mbx_postinit(struct fbnic_dev *fbd) { int i;
- /* We only need to do this on the first interrupt following init. + /* We only need to do this on the first interrupt following reset. * this primes the mailbox so that we will have cleared all the * skip descriptors. */ @@ -396,7 +420,7 @@ static void fbnic_mbx_postinit(struct fbnic_dev *fbd) wr32(fbd, FBNIC_INTR_CLEAR(0), 1u << FBNIC_FW_MSIX_ENTRY);
for (i = 0; i < FBNIC_IPC_MBX_INDICES; i++) - fbnic_mbx_postinit_desc_ring(fbd, i); + fbnic_mbx_init_desc_ring(fbd, i); }
/** @@ -899,7 +923,7 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd) * avoid the mailbox getting stuck closed if the interrupt * is reset. */ - fbnic_mbx_init_desc_ring(fbd, FBNIC_IPC_MBX_TX_IDX); + fbnic_mbx_reset_desc_ring(fbd, FBNIC_IPC_MBX_TX_IDX);
msleep(200);
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c index 14291401f4632..dde4a37116e20 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c @@ -79,12 +79,6 @@ static void fbnic_mac_init_axi(struct fbnic_dev *fbd) fbnic_init_readrq(fbd, FBNIC_QM_RNI_RBP_CTL, cls, readrq); fbnic_init_mps(fbd, FBNIC_QM_RNI_RDE_CTL, cls, mps); fbnic_init_mps(fbd, FBNIC_QM_RNI_RCM_CTL, cls, mps); - - /* Enable XALI AR/AW outbound */ - wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AW_CFG, - FBNIC_PUL_OB_TLP_HDR_AW_CFG_BME); - wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AR_CFG, - FBNIC_PUL_OB_TLP_HDR_AR_CFG_BME); }
static void fbnic_mac_init_qm(struct fbnic_dev *fbd)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Duyck alexanderduyck@fb.com
[ Upstream commit 0f9a959a0addd9bbc47e5d16c36b3a7f97981915 ]
The fbnic_mbx_flush_tx function had a number of issues.
First, we were waiting 200ms for the firmware to process the packets. We can drop this to 20ms and in almost all cases this should be more than enough time. So by changing this we can significantly reduce shutdown time.
Second, we were not making sure that the Tx path was actually shut off. As such we could still have packets added while we were flushing the mailbox. To prevent that we can now clear the ready flag for the Tx side and it should stay down since the interrupt is disabled.
Third, we kept re-reading the tail due to the second issue. The tail should not move after we have started the flush so we can just read it once while we are holding the mailbox Tx lock. By doing that we are guaranteed that the value should be consistent.
Fourth, we were keeping a count of descriptors cleaned due to the second and third issues called out. That count is not a valid reason to be exiting the cleanup, and with the tail only being read once we shouldn't see any cases where the tail moves after the disable so the tracking of count can be dropped.
Fifth, we were using attempts * sleep time to determine how long we would wait in our polling loop to flush out the Tx. This can be very imprecise. In order to tighten up the timing we are shifting over to using a jiffies value of jiffies + 10 * HZ + 1 to determine the jiffies value we should stop polling at, as this should be accurate to within one sleep cycle for the total amount of time spent polling.
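The third and fifth points can be illustrated with a small userspace model. It is a sketch under stated assumptions rather than the kernel code: struct fake_clock, struct ring and flush_with_deadline are hypothetical, time is a simulated millisecond counter instead of jiffies, and the function returns a bool where the driver function returns nothing. It shows the tail being sampled once and the loop being bounded by an absolute deadline, so that time spent processing counts against the budget.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated monotonic clock in milliseconds (stand-in for jiffies). */
struct fake_clock {
	uint64_t now_ms;
};

/* Minimal ring state: firmware consumes entries by advancing head. */
struct ring {
	uint8_t head;
	uint8_t tail;
};

/*
 * Poll until head reaches the tail sampled at entry, or until the
 * deadline passes. fw_alive controls whether head makes progress.
 * Returns true if the ring drained.
 */
static bool flush_with_deadline(struct ring *r, struct fake_clock *clk,
				uint64_t budget_ms, uint64_t sleep_ms,
				uint64_t work_ms, bool fw_alive)
{
	uint64_t deadline = clk->now_ms + budget_ms;
	uint8_t tail = r->tail;	/* read once; it must not move afterwards */

	assert(sleep_ms > 0);

	do {
		if (r->head == tail)
			return true;

		clk->now_ms += sleep_ms;	/* models msleep() */
		clk->now_ms += work_ms;		/* processing also takes time */
		if (fw_alive)
			r->head++;
	} while (clk->now_ms < deadline);

	return r->head == tail;
}
```

With an attempt counter the total wait would grow by work_ms on every iteration; with a deadline the overshoot is limited to roughly one sleep-plus-work cycle.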
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck alexanderduyck@fb.com
Reviewed-by: Simon Horman horms@kernel.org
Reviewed-by: Jacob Keller jacob.e.keller@intel.com
Link: https://patch.msgid.link/174654719929.499179.16406653096197423749.stgit@ahdu...
Reviewed-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 31 +++++++++++-----------
 1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c index dc90df287c0a8..73e08c8c41630 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c @@ -935,35 +935,36 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd)
void fbnic_mbx_flush_tx(struct fbnic_dev *fbd) { + unsigned long timeout = jiffies + 10 * HZ + 1; struct fbnic_fw_mbx *tx_mbx; - int attempts = 50; - u8 count = 0; - - /* Nothing to do if there is no mailbox */ - if (!fbnic_fw_present(fbd)) - return; + u8 tail;
/* Record current Rx stats */ tx_mbx = &fbd->mbx[FBNIC_IPC_MBX_TX_IDX];
- /* Nothing to do if mailbox never got to ready */ - if (!tx_mbx->ready) - return; + spin_lock_irq(&fbd->fw_tx_lock); + + /* Clear ready to prevent any further attempts to transmit */ + tx_mbx->ready = false; + + /* Read tail to determine the last tail state for the ring */ + tail = tx_mbx->tail; + + spin_unlock_irq(&fbd->fw_tx_lock);
/* Give firmware time to process packet, - * we will wait up to 10 seconds which is 50 waits of 200ms. + * we will wait up to 10 seconds which is 500 waits of 20ms. */ do { u8 head = tx_mbx->head;
- if (head == tx_mbx->tail) + /* Tx ring is empty once head == tail */ + if (head == tail) break;
- msleep(200); + msleep(20); fbnic_mbx_process_tx_msgs(fbd); - - count += (tx_mbx->head - head) % FBNIC_IPC_MBX_DESC_LEN; - } while (count < FBNIC_IPC_MBX_DESC_LEN && --attempts); + } while (time_is_after_jiffies(timeout)); }
void fbnic_get_fw_ver_commit_str(struct fbnic_dev *fbd, char *fw_version,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Duyck alexanderduyck@fb.com
[ Upstream commit cdbb2dc3996a60ed3d7431c1239a8ca98c778e04 ]
There was an issue in that if we were to shut down we could be left with a completion in flight as the mailbox went away. To address that I have added an fbnic_mbx_evict_all_cmpl function that is meant to essentially create a "broken pipe" type response so that all callers will receive an error indicating that the connection has been broken as a result of us shutting down the mailbox.
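From the waiter's point of view, the "broken pipe" response might look like the following userspace sketch. It is illustrative only: the kernel completion is replaced by a plain done flag, and struct fw_completion, struct fake_dev, evict_all_cmpl and wait_result are hypothetical names, not the driver's types.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a pending firmware request and its result. */
struct fw_completion {
	int result;
	bool done;
};

struct fake_dev {
	struct fw_completion *cmpl_data;	/* at most one request in flight */
};

/* Fail the in-flight request, if any, and detach it from the device. */
static void evict_all_cmpl(struct fake_dev *d)
{
	assert(d != NULL);

	if (d->cmpl_data) {
		d->cmpl_data->result = -EPIPE;
		d->cmpl_data->done = true;
		d->cmpl_data = NULL;
	}
}

/* What a caller sees: its own result once done, -EINPROGRESS before that. */
static int wait_result(const struct fw_completion *c)
{
	return c->done ? c->result : -EINPROGRESS;
}
```

The point of the design is that a caller blocked on a request is woken with a definite error instead of waiting on a mailbox that no longer exists.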
Fixes: 378e5cc1c6c6 ("eth: fbnic: hwmon: Add completion infrastructure for firmware requests")
Signed-off-by: Alexander Duyck alexanderduyck@fb.com
Reviewed-by: Jacob Keller jacob.e.keller@intel.com
Link: https://patch.msgid.link/174654720578.499179.380252598204530873.stgit@ahduyc...
Reviewed-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c index 73e08c8c41630..e9b63755cdc52 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c @@ -933,6 +933,20 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd) return attempts ? 0 : -ETIMEDOUT; }
+static void __fbnic_fw_evict_cmpl(struct fbnic_fw_completion *cmpl_data) +{ + cmpl_data->result = -EPIPE; + complete(&cmpl_data->done); +} + +static void fbnic_mbx_evict_all_cmpl(struct fbnic_dev *fbd) +{ + if (fbd->cmpl_data) { + __fbnic_fw_evict_cmpl(fbd->cmpl_data); + fbd->cmpl_data = NULL; + } +} + void fbnic_mbx_flush_tx(struct fbnic_dev *fbd) { unsigned long timeout = jiffies + 10 * HZ + 1; @@ -950,6 +964,9 @@ void fbnic_mbx_flush_tx(struct fbnic_dev *fbd) /* Read tail to determine the last tail state for the ring */ tail = tx_mbx->tail;
+ /* Flush any completions as we are no longer processing Rx */ + fbnic_mbx_evict_all_cmpl(fbd); + spin_unlock_irq(&fbd->fw_tx_lock);
/* Give firmware time to process packet,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Duyck alexanderduyck@fb.com
[ Upstream commit ab064f6005973d456f95ae99cd9ea0d8ab676cce ]
There were a couple different issues found in fbnic_mbx_poll_tx_ready. Among them were the fact that we were sleeping much longer than we actually needed to as the actual FW could respond in under 20ms. The other issue was that we would just keep polling the mailbox even if the device itself had gone away.
To address the responsiveness issues we can decrease the sleeps to 20ms and use a jiffies based timeout value rather than just counting the number of times we slept and then polled.
To address the hardware going away we can move the check for the firmware BAR being present from where it was and place it inside the loop after the mailbox descriptor ring is initialized and before we sleep so that we just abort and return an error if the device went away during initialization.
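The two exit conditions can be modelled as follows. This is a hedged userspace sketch, not the driver loop: struct fake_dev and poll_ready are hypothetical, readiness and device removal are expressed as simulated timestamps, and the descriptor ring reset done on each iteration in the driver is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simulated device: becomes ready at ready_at_ms, disappears at gone_at_ms. */
struct fake_dev {
	uint64_t now_ms;
	uint64_t ready_at_ms;
	uint64_t gone_at_ms;
};

/*
 * Wait for readiness with short sleeps. On each pass the deadline is
 * checked first, then device presence, and only then do we sleep.
 */
static int poll_ready(struct fake_dev *d, uint64_t budget_ms, uint64_t sleep_ms)
{
	uint64_t deadline = d->now_ms + budget_ms;

	assert(sleep_ms > 0);

	while (d->now_ms < d->ready_at_ms) {
		if (d->now_ms >= deadline)
			return -ETIMEDOUT;

		/* Abort instead of sleeping if the device went away */
		if (d->now_ms >= d->gone_at_ms)
			return -ENODEV;

		d->now_ms += sleep_ms;	/* models msleep(20) */
	}

	return 0;
}
```

A device that answers quickly costs only a few short sleeps, and a device that vanishes is reported as -ENODEV on the next pass rather than after the full timeout.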
With these two changes we see a significant improvement in boot times for the driver.
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck alexanderduyck@fb.com
Reviewed-by: Jacob Keller jacob.e.keller@intel.com
Link: https://patch.msgid.link/174654721224.499179.2698616208976624755.stgit@ahduy...
Reviewed-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c index e9b63755cdc52..da6e5ba5acaee 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c @@ -910,27 +910,30 @@ void fbnic_mbx_poll(struct fbnic_dev *fbd)
int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd) { + unsigned long timeout = jiffies + 10 * HZ + 1; struct fbnic_fw_mbx *tx_mbx; - int attempts = 50; - - /* Immediate fail if BAR4 isn't there */ - if (!fbnic_fw_present(fbd)) - return -ENODEV;
tx_mbx = &fbd->mbx[FBNIC_IPC_MBX_TX_IDX]; - while (!tx_mbx->ready && --attempts) { + while (!tx_mbx->ready) { + if (!time_is_after_jiffies(timeout)) + return -ETIMEDOUT; + /* Force the firmware to trigger an interrupt response to * avoid the mailbox getting stuck closed if the interrupt * is reset. */ fbnic_mbx_reset_desc_ring(fbd, FBNIC_IPC_MBX_TX_IDX);
- msleep(200); + /* Immediate fail if BAR4 went away */ + if (!fbnic_fw_present(fbd)) + return -ENODEV; + + msleep(20);
fbnic_mbx_poll(fbd); }
- return attempts ? 0 : -ETIMEDOUT; + return 0; }
static void __fbnic_fw_evict_cmpl(struct fbnic_fw_completion *cmpl_data)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Duyck alexanderduyck@fb.com
[ Upstream commit 1b34d1c1dc8384884febd83140c9afbc7c4b9eb8 ]
This change pulls the call to fbnic_fw_xmit_cap_msg out of fbnic_mbx_init_desc_ring and instead places it in the polling function for getting the Tx ready. Doing that we can avoid the potential issue with an interrupt coming in later from the firmware that causes it to get fired in interrupt context.
Fixes: 20d2e88cc746 ("eth: fbnic: Add initial messaging to notify FW of our presence")
Signed-off-by: Alexander Duyck alexanderduyck@fb.com
Reviewed-by: Jacob Keller jacob.e.keller@intel.com
Link: https://patch.msgid.link/174654721876.499179.9839651602256668493.stgit@ahduy...
Reviewed-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 43 ++++++++--------------
 1 file changed, 16 insertions(+), 27 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c index da6e5ba5acaee..b804b5480db97 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c @@ -352,24 +352,6 @@ static int fbnic_fw_xmit_simple_msg(struct fbnic_dev *fbd, u32 msg_type) return err; }
-/** - * fbnic_fw_xmit_cap_msg - Allocate and populate a FW capabilities message - * @fbd: FBNIC device structure - * - * Return: NULL on failure to allocate, error pointer on error, or pointer - * to new TLV test message. - * - * Sends a single TLV header indicating the host wants the firmware to - * confirm the capabilities and version. - **/ -static int fbnic_fw_xmit_cap_msg(struct fbnic_dev *fbd) -{ - int err = fbnic_fw_xmit_simple_msg(fbd, FBNIC_TLV_MSG_ID_HOST_CAP_REQ); - - /* Return 0 if we are not calling this on ASIC */ - return (err == -EOPNOTSUPP) ? 0 : err; -} - static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx) { struct fbnic_fw_mbx *mbx = &fbd->mbx[mbx_idx]; @@ -393,15 +375,6 @@ static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx) /* Enable DMA reads from the device */ wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AR_CFG, FBNIC_PUL_OB_TLP_HDR_AR_CFG_BME); - - /* Force version to 1 if we successfully requested an update - * from the firmware. This should be overwritten once we get - * the actual version from the firmware in the capabilities - * request message. - */ - if (!fbnic_fw_xmit_cap_msg(fbd) && - !fbd->fw_cap.running.mgmt.version) - fbd->fw_cap.running.mgmt.version = 1; break; } } @@ -912,6 +885,7 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd) { unsigned long timeout = jiffies + 10 * HZ + 1; struct fbnic_fw_mbx *tx_mbx; + int err;
tx_mbx = &fbd->mbx[FBNIC_IPC_MBX_TX_IDX]; while (!tx_mbx->ready) { @@ -933,7 +907,22 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd) fbnic_mbx_poll(fbd); }
+ /* Request an update from the firmware. This should overwrite + * mgmt.version once we get the actual version from the firmware + * in the capabilities request message. + */ + err = fbnic_fw_xmit_simple_msg(fbd, FBNIC_TLV_MSG_ID_HOST_CAP_REQ); + if (err) + goto clean_mbx; + + /* Use "1" to indicate we entered the state waiting for a response */ + fbd->fw_cap.running.mgmt.version = 1; + return 0; +clean_mbx: + /* Cleanup Rx buffers and disable mailbox */ + fbnic_mbx_clean(fbd); + return err; }
static void __fbnic_fw_evict_cmpl(struct fbnic_fw_completion *cmpl_data)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Duyck alexanderduyck@fb.com
[ Upstream commit ce2fa1dba204c761582674cf2eb9cbe0b949b5c7 ]
We had originally thought to have the mailbox go to ready in the background while we were doing other things. One issue with this though is that we can't disable it by clearing the ready state without also blocking interrupts or calls to mbx_poll as it will just pop back to life during an interrupt.
In order to prevent that from happening we can pull the code for toggling to ready out of the interrupt path and instead place it in the fbnic_mbx_poll_tx_ready path so that it becomes the only spot where the Rx/Tx can toggle to the ready state. By doing this we can prevent races where we disable the DMA and/or free buffers only to have an interrupt fire and undo what we have done.
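The race being closed here can be shown with a small model. It is a sketch with hypothetical names (struct fake_dev, mbx_event, mbx_poll, mbx_poll_tx_ready, mbx_shutdown) and a poll counter in place of a timeout; the only property it is meant to demonstrate is that the interrupt path acknowledges events but never sets the ready flag.

```c
#include <assert.h>
#include <stdbool.h>

#define MBX_INDICES 2

struct fake_dev {
	bool irq_pending;		/* firmware raised the mailbox interrupt */
	bool ready[MBX_INDICES];
};

/* Acknowledge a pending event. Reports it, but does not touch ready. */
static bool mbx_event(struct fake_dev *d)
{
	if (!d->irq_pending)
		return false;

	d->irq_pending = false;
	return true;
}

/* Interrupt/poll path: message processing would follow the ack. */
static void mbx_poll(struct fake_dev *d)
{
	mbx_event(d);
}

/* The single place where the mailboxes may transition to ready. */
static bool mbx_poll_tx_ready(struct fake_dev *d, int max_polls)
{
	assert(max_polls > 0);

	while (max_polls-- > 0) {
		if (mbx_event(d)) {
			for (int i = 0; i < MBX_INDICES; i++)
				d->ready[i] = true;
			return true;
		}
	}

	return false;
}

static void mbx_shutdown(struct fake_dev *d)
{
	for (int i = 0; i < MBX_INDICES; i++)
		d->ready[i] = false;
}
```

After mbx_shutdown, a late interrupt handled through mbx_poll leaves both rings not ready, so nothing re-enables DMA or touches freed buffers behind the caller's back.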
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck alexanderduyck@fb.com
Reviewed-by: Jacob Keller jacob.e.keller@intel.com
Link: https://patch.msgid.link/174654722518.499179.11612865740376848478.stgit@ahdu...
Reviewed-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 27 ++++++++--------------
 1 file changed, 10 insertions(+), 17 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c index b804b5480db97..9351a874689f8 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c @@ -356,10 +356,6 @@ static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx) { struct fbnic_fw_mbx *mbx = &fbd->mbx[mbx_idx];
- /* This is a one time init, so just exit if it is completed */ - if (mbx->ready) - return; - mbx->ready = true;
switch (mbx_idx) { @@ -379,21 +375,18 @@ static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx) } }
-static void fbnic_mbx_postinit(struct fbnic_dev *fbd) +static bool fbnic_mbx_event(struct fbnic_dev *fbd) { - int i; - /* We only need to do this on the first interrupt following reset. * this primes the mailbox so that we will have cleared all the * skip descriptors. */ if (!(rd32(fbd, FBNIC_INTR_STATUS(0)) & (1u << FBNIC_FW_MSIX_ENTRY))) - return; + return false;
wr32(fbd, FBNIC_INTR_CLEAR(0), 1u << FBNIC_FW_MSIX_ENTRY);
- for (i = 0; i < FBNIC_IPC_MBX_INDICES; i++) - fbnic_mbx_init_desc_ring(fbd, i); + return true; }
/** @@ -875,7 +868,7 @@ static void fbnic_mbx_process_rx_msgs(struct fbnic_dev *fbd)
void fbnic_mbx_poll(struct fbnic_dev *fbd) { - fbnic_mbx_postinit(fbd); + fbnic_mbx_event(fbd);
fbnic_mbx_process_tx_msgs(fbd); fbnic_mbx_process_rx_msgs(fbd); @@ -884,11 +877,9 @@ void fbnic_mbx_poll(struct fbnic_dev *fbd) int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd) { unsigned long timeout = jiffies + 10 * HZ + 1; - struct fbnic_fw_mbx *tx_mbx; - int err; + int err, i;
- tx_mbx = &fbd->mbx[FBNIC_IPC_MBX_TX_IDX]; - while (!tx_mbx->ready) { + do { if (!time_is_after_jiffies(timeout)) return -ETIMEDOUT;
@@ -903,9 +894,11 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd) return -ENODEV;
msleep(20); + } while (!fbnic_mbx_event(fbd));
- fbnic_mbx_poll(fbd); - } + /* FW has shown signs of life. Enable DMA and start Tx/Rx */ + for (i = 0; i < FBNIC_IPC_MBX_INDICES; i++) + fbnic_mbx_init_desc_ring(fbd, i);
/* Request an update from the firmware. This should overwrite * mgmt.version once we get the actual version from the firmware
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 23fa6a23d97182d36ca3c71e43c804fa91e46a03 ]
Older drivers and drivers with lower queue counts often have a static array of queues, rather than allocating structs for each queue on demand. Add a helper for adding up qstats from a queue range. Expectation is that driver will pass a queue range [netdev->real_num_*x_queues, MAX). It was tempting to always use num_*x_queues as the end, but virtio seems to clamp its queue count after allocating the netdev. And this way we can trivially reuse the helper for [0, real_..).
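The half-open range convention can be sketched in plain C. This is not the kernel helper: struct qstats, stats_add and queue_sum are hypothetical, the per-queue stats come from an array instead of driver callbacks, and the rule that an all-ones value means "not reported" and is skipped is an assumption about how unset counters are treated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel: an all-ones counter means "not reported". */
#define STAT_NOT_SET UINT64_MAX

struct qstats {
	uint64_t packets;
	uint64_t bytes;
};

/* Add one queue's counters, skipping any field that is not set on either side. */
static void stats_add(struct qstats *sum, const struct qstats *add)
{
	if (sum->packets != STAT_NOT_SET && add->packets != STAT_NOT_SET)
		sum->packets += add->packets;
	if (sum->bytes != STAT_NOT_SET && add->bytes != STAT_NOT_SET)
		sum->bytes += add->bytes;
}

/* Accumulate queues in [start, end) into an already initialized sum. */
static void queue_sum(const struct qstats *queues, size_t start, size_t end,
		      struct qstats *sum)
{
	assert(start <= end);

	for (size_t i = start; i < end; i++)
		stats_add(sum, &queues[i]);
}
```

With four queues of which two are active, queue_sum(q, 2, 4, &base) covers the disabled queues and queue_sum(q, 0, 2, &base) covers the active ones, so the same helper serves both callers.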
Signed-off-by: Jakub Kicinski kuba@kernel.org
Link: https://patch.msgid.link/20250507003221.823267-2-kuba@kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Stable-dep-of: 001160ec8c59 ("virtio-net: fix total qstat values")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 include/net/netdev_queues.h |  6 ++++
 net/core/netdev-genl.c      | 69 +++++++++++++++++++++++++++----------
 2 files changed, 56 insertions(+), 19 deletions(-)
diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h index b02bb9f109d5e..88598e14ecfa4 100644 --- a/include/net/netdev_queues.h +++ b/include/net/netdev_queues.h @@ -102,6 +102,12 @@ struct netdev_stat_ops { struct netdev_queue_stats_tx *tx); };
+void netdev_stat_queue_sum(struct net_device *netdev, + int rx_start, int rx_end, + struct netdev_queue_stats_rx *rx_sum, + int tx_start, int tx_end, + struct netdev_queue_stats_tx *tx_sum); + /** * struct netdev_queue_mgmt_ops - netdev ops for queue management * diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 7832abc5ca6e2..9be2bdd2dca89 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -690,25 +690,66 @@ netdev_nl_stats_by_queue(struct net_device *netdev, struct sk_buff *rsp, return 0; }
+/** + * netdev_stat_queue_sum() - add up queue stats from range of queues + * @netdev: net_device + * @rx_start: index of the first Rx queue to query + * @rx_end: index after the last Rx queue (first *not* to query) + * @rx_sum: output Rx stats, should be already initialized + * @tx_start: index of the first Tx queue to query + * @tx_end: index after the last Tx queue (first *not* to query) + * @tx_sum: output Tx stats, should be already initialized + * + * Add stats from [start, end) range of queue IDs to *x_sum structs. + * The sum structs must be already initialized. Usually this + * helper is invoked from the .get_base_stats callbacks of drivers + * to account for stats of disabled queues. In that case the ranges + * are usually [netdev->real_num_*x_queues, netdev->num_*x_queues). + */ +void netdev_stat_queue_sum(struct net_device *netdev, + int rx_start, int rx_end, + struct netdev_queue_stats_rx *rx_sum, + int tx_start, int tx_end, + struct netdev_queue_stats_tx *tx_sum) +{ + const struct netdev_stat_ops *ops; + struct netdev_queue_stats_rx rx; + struct netdev_queue_stats_tx tx; + int i; + + ops = netdev->stat_ops; + + for (i = rx_start; i < rx_end; i++) { + memset(&rx, 0xff, sizeof(rx)); + if (ops->get_queue_stats_rx) + ops->get_queue_stats_rx(netdev, i, &rx); + netdev_nl_stats_add(rx_sum, &rx, sizeof(rx)); + } + for (i = tx_start; i < tx_end; i++) { + memset(&tx, 0xff, sizeof(tx)); + if (ops->get_queue_stats_tx) + ops->get_queue_stats_tx(netdev, i, &tx); + netdev_nl_stats_add(tx_sum, &tx, sizeof(tx)); + } +} +EXPORT_SYMBOL(netdev_stat_queue_sum); + static int netdev_nl_stats_by_netdev(struct net_device *netdev, struct sk_buff *rsp, const struct genl_info *info) { - struct netdev_queue_stats_rx rx_sum, rx; - struct netdev_queue_stats_tx tx_sum, tx; - const struct netdev_stat_ops *ops; + struct netdev_queue_stats_rx rx_sum; + struct netdev_queue_stats_tx tx_sum; void *hdr; - int i;
- ops = netdev->stat_ops; /* Netdev can't guarantee any complete counters */ - if (!ops->get_base_stats) + if (!netdev->stat_ops->get_base_stats) return 0;
memset(&rx_sum, 0xff, sizeof(rx_sum)); memset(&tx_sum, 0xff, sizeof(tx_sum));
- ops->get_base_stats(netdev, &rx_sum, &tx_sum); + netdev->stat_ops->get_base_stats(netdev, &rx_sum, &tx_sum);
/* The op was there, but nothing reported, don't bother */ if (!memchr_inv(&rx_sum, 0xff, sizeof(rx_sum)) && @@ -721,18 +762,8 @@ netdev_nl_stats_by_netdev(struct net_device *netdev, struct sk_buff *rsp, if (nla_put_u32(rsp, NETDEV_A_QSTATS_IFINDEX, netdev->ifindex)) goto nla_put_failure;
- for (i = 0; i < netdev->real_num_rx_queues; i++) { - memset(&rx, 0xff, sizeof(rx)); - if (ops->get_queue_stats_rx) - ops->get_queue_stats_rx(netdev, i, &rx); - netdev_nl_stats_add(&rx_sum, &rx, sizeof(rx)); - } - for (i = 0; i < netdev->real_num_tx_queues; i++) { - memset(&tx, 0xff, sizeof(tx)); - if (ops->get_queue_stats_tx) - ops->get_queue_stats_tx(netdev, i, &tx); - netdev_nl_stats_add(&tx_sum, &tx, sizeof(tx)); - } + netdev_stat_queue_sum(netdev, 0, netdev->real_num_rx_queues, &rx_sum, + 0, netdev->real_num_tx_queues, &tx_sum);
if (netdev_nl_stats_write_rx(rsp, &rx_sum) || netdev_nl_stats_write_tx(rsp, &tx_sum))
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 001160ec8c59115efc39e197d40829bdafd4d7f5 ]
NIPA tests report that the interface statistics reported via qstat are lower than those reported via ip link. Looks like this is because some tests flip the queue count up and down, and we end up with some of the traffic accounted on disabled queues.
Add up counters from disabled queues.
Fixes: d888f04c09bb ("virtio-net: support queue stat")
Signed-off-by: Jakub Kicinski kuba@kernel.org
Link: https://patch.msgid.link/20250507003221.823267-3-kuba@kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/virtio_net.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 54f883c962373..8879af5292b49 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -5663,6 +5663,10 @@ static void virtnet_get_base_stats(struct net_device *dev,
if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_SPEED) tx->hw_drop_ratelimits = 0; + + netdev_stat_queue_sum(dev, + dev->real_num_rx_queues, vi->max_queue_pairs, rx, + dev->real_num_tx_queues, vi->max_queue_pairs, tx); }
static const struct netdev_stat_ops virtnet_stat_ops = {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hugo Villeneuve hvilleneuve@dimonoff.com
commit c6cb8bf79466ae66bd0d07338c7c505ce758e9d7 upstream.
The current reset pulse width is measured to be 5us on a Renesas RZ/G2L SOM. The manufacturer's minimum reset pulse width is specified as 10us.
Extend reset pulse width to make sure it is long enough on all platforms.
Also reword confusing comments about reset pin assertion.
Fixes: 5b0c03e24a06 ("Input: Add driver for Cypress Generation 5 touchscreen")
Cc: stable@vger.kernel.org
Acked-by: Alistair Francis alistair@alistair23.me
Signed-off-by: Hugo Villeneuve hvilleneuve@dimonoff.com
Link: https://lore.kernel.org/r/20250410184633.1164837-1-hugo@hugovil.com
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/touchscreen/cyttsp5.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/input/touchscreen/cyttsp5.c +++ b/drivers/input/touchscreen/cyttsp5.c @@ -870,13 +870,16 @@ static int cyttsp5_probe(struct device * ts->input->phys = ts->phys; input_set_drvdata(ts->input, ts);
- /* Reset the gpio to be in a reset state */ + /* Assert gpio to be in a reset state */ ts->reset_gpio = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_HIGH); if (IS_ERR(ts->reset_gpio)) { error = PTR_ERR(ts->reset_gpio); dev_err(dev, "Failed to request reset gpio, error %d\n", error); return error; } + + fsleep(10); /* Ensure long-enough reset pulse (minimum 10us). */ + gpiod_set_value_cansleep(ts->reset_gpio, 0);
/* Need a delay to have device up */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikael Gonella-Bolduc mgonellabolduc@dimonoff.com
commit 7675b5efd81fe6d524e29d5a541f43201e98afa8 upstream.
The power control function ignores the "on" argument when setting the report ID, and thus is always sending HID_POWER_SLEEP. This causes a problem when trying to wakeup.
Fix by sending the state variable, which contains the proper HID_POWER_ON or HID_POWER_SLEEP based on the "on" argument.
Fixes: 3c98b8dbdced ("Input: cyttsp5 - implement proper sleep and wakeup procedures")
Cc: stable@vger.kernel.org
Signed-off-by: Mikael Gonella-Bolduc mgonellabolduc@dimonoff.com
Signed-off-by: Hugo Villeneuve hvilleneuve@dimonoff.com
Reviewed-by: Alistair Francis alistair@alistair23.me
Link: https://lore.kernel.org/r/20250423135243.1261460-1-hugo@hugovil.com
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/touchscreen/cyttsp5.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/input/touchscreen/cyttsp5.c +++ b/drivers/input/touchscreen/cyttsp5.c @@ -580,7 +580,7 @@ static int cyttsp5_power_control(struct int rc;
SET_CMD_REPORT_TYPE(cmd[0], 0); - SET_CMD_REPORT_ID(cmd[0], HID_POWER_SLEEP); + SET_CMD_REPORT_ID(cmd[0], state); SET_CMD_OPCODE(cmd[1], HID_CMD_SET_POWER);
rc = cyttsp5_write(ts, HID_COMMAND_REG, cmd, sizeof(cmd));
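The fix above can be illustrated with a minimal stand-alone sketch that derives the command byte from the requested state instead of a fixed sleep value. The constant values, the bit layout, and the helper name `build_power_cmd` are assumptions made for illustration; they are not the driver's actual macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; the real driver defines its own. */
enum { DEMO_POWER_ON = 0x0, DEMO_POWER_SLEEP = 0x1 };
#define DEMO_CMD_SET_POWER 0x08

/*
 * Fill a two-byte command. The low bits of cmd[0] carry the requested
 * power state, so the value must be derived from "on" rather than being
 * fixed to DEMO_POWER_SLEEP (the kind of bug the patch above fixes).
 */
static void build_power_cmd(bool on, uint8_t cmd[2])
{
	uint8_t state = on ? DEMO_POWER_ON : DEMO_POWER_SLEEP;

	assert(cmd != NULL);
	cmd[0] = (uint8_t)(state & 0x0f);
	cmd[1] = DEMO_CMD_SET_POWER;
}
```

Calling `build_power_cmd(true, cmd)` and `build_power_cmd(false, cmd)` yields different first bytes, which is the property the original code lacked.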
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gary Bisson bisson.gary@gmail.com
commit 11cdb506d0fbf5ac05bf55f5afcb3a215c316490 upstream.
In mtk_pmic_keys_probe, the regs parameter is only set if the button is parsed in the device tree. However, on hardware where the button is left floating, that node will most likely be removed so that the input is not enabled. In that case the code will try to dereference a null pointer.
Let's use the regs struct instead, as it is defined for all supported platforms. Note that it is OK to set the key reg even if the key is disabled, as the interrupt won't be enabled anyway.
Fixes: b581acb49aec ("Input: mtk-pmic-keys - transfer per-key bit in mtk_pmic_keys_regs")
Signed-off-by: Gary Bisson bisson.gary@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/keyboard/mtk-pmic-keys.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/input/keyboard/mtk-pmic-keys.c
+++ b/drivers/input/keyboard/mtk-pmic-keys.c
@@ -147,8 +147,8 @@ static void mtk_pmic_keys_lp_reset_setup
 	u32 value, mask;
 	int error;
 
-	kregs_home = keys->keys[MTK_PMIC_HOMEKEY_INDEX].regs;
-	kregs_pwr = keys->keys[MTK_PMIC_PWRKEY_INDEX].regs;
+	kregs_home = &regs->keys_regs[MTK_PMIC_HOMEKEY_INDEX];
+	kregs_pwr = &regs->keys_regs[MTK_PMIC_PWRKEY_INDEX];
 
 	error = of_property_read_u32(keys->dev->of_node, "power-off-time-sec",
 				     &long_press_debounce);
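A small sketch of the idea behind this change, under simplified assumptions: a per-key pointer that is only filled in when a device-tree node was parsed can be NULL, while a per-platform table always has an entry for every key. All type and function names below are hypothetical stand-ins, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DEMO_PWRKEY, DEMO_HOMEKEY, DEMO_KEY_COUNT };

struct demo_key_regs {
	uint32_t rst_en_mask;
};

/* Per-platform table: has an entry for every key, wired or not. */
struct demo_platform_regs {
	struct demo_key_regs keys_regs[DEMO_KEY_COUNT];
};

/* Per-key runtime state: regs is only set when the key node was parsed. */
struct demo_key_info {
	const struct demo_key_regs *regs;
};

/* Safe lookup: index the platform table instead of the optional pointer. */
static uint32_t demo_reset_mask(const struct demo_platform_regs *regs, int key)
{
	assert(regs != NULL && key >= 0 && key < DEMO_KEY_COUNT);
	return regs->keys_regs[key].rst_en_mask;
}
```

With this layout, a key whose `demo_key_info.regs` was never set can still be configured, because the lookup never touches that pointer.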
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vicki Pfau vi@endrift.com
commit 4ef46367073b107ec22f46fe5f12176e87c238e8 upstream.
The Share button, if present, is always at one of two offsets from the end of the packet, depending on the presence of a specific interface. As we lack parsing for the identify packet we can't automatically determine the presence of that interface, but we can hardcode which of these offsets is correct for a given controller.
More controllers are probably fixable by adding the MAP_SHARE_BUTTON in the future, but for now I only added the ones that I have the ability to test directly.
Signed-off-by: Vicki Pfau vi@endrift.com
Link: https://lore.kernel.org/r/20250328234345.989761-2-vi@endrift.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/joystick/xpad.c | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)
--- a/drivers/input/joystick/xpad.c +++ b/drivers/input/joystick/xpad.c @@ -77,12 +77,13 @@ * xbox d-pads should map to buttons, as is required for DDR pads * but we map them to axes when possible to simplify things */ -#define MAP_DPAD_TO_BUTTONS (1 << 0) -#define MAP_TRIGGERS_TO_BUTTONS (1 << 1) -#define MAP_STICKS_TO_NULL (1 << 2) -#define MAP_SELECT_BUTTON (1 << 3) -#define MAP_PADDLES (1 << 4) -#define MAP_PROFILE_BUTTON (1 << 5) +#define MAP_DPAD_TO_BUTTONS BIT(0) +#define MAP_TRIGGERS_TO_BUTTONS BIT(1) +#define MAP_STICKS_TO_NULL BIT(2) +#define MAP_SHARE_BUTTON BIT(3) +#define MAP_PADDLES BIT(4) +#define MAP_PROFILE_BUTTON BIT(5) +#define MAP_SHARE_OFFSET BIT(6)
#define DANCEPAD_MAP_CONFIG (MAP_DPAD_TO_BUTTONS | \ MAP_TRIGGERS_TO_BUTTONS | MAP_STICKS_TO_NULL) @@ -135,7 +136,7 @@ static const struct xpad_device { { 0x03f0, 0x048D, "HyperX Clutch", 0, XTYPE_XBOX360 }, /* wireless */ { 0x03f0, 0x0495, "HyperX Clutch Gladiate", 0, XTYPE_XBOXONE }, { 0x03f0, 0x07A0, "HyperX Clutch Gladiate RGB", 0, XTYPE_XBOXONE }, - { 0x03f0, 0x08B6, "HyperX Clutch Gladiate", 0, XTYPE_XBOXONE }, /* v2 */ + { 0x03f0, 0x08B6, "HyperX Clutch Gladiate", MAP_SHARE_BUTTON, XTYPE_XBOXONE }, /* v2 */ { 0x03f0, 0x09B4, "HyperX Clutch Tanto", 0, XTYPE_XBOXONE }, { 0x044f, 0x0f00, "Thrustmaster Wheel", 0, XTYPE_XBOX }, { 0x044f, 0x0f03, "Thrustmaster Wheel", 0, XTYPE_XBOX }, @@ -159,7 +160,7 @@ static const struct xpad_device { { 0x045e, 0x0719, "Xbox 360 Wireless Receiver", MAP_DPAD_TO_BUTTONS, XTYPE_XBOX360W }, { 0x045e, 0x0b00, "Microsoft X-Box One Elite 2 pad", MAP_PADDLES, XTYPE_XBOXONE }, { 0x045e, 0x0b0a, "Microsoft X-Box Adaptive Controller", MAP_PROFILE_BUTTON, XTYPE_XBOXONE }, - { 0x045e, 0x0b12, "Microsoft Xbox Series S|X Controller", MAP_SELECT_BUTTON, XTYPE_XBOXONE }, + { 0x045e, 0x0b12, "Microsoft Xbox Series S|X Controller", MAP_SHARE_BUTTON | MAP_SHARE_OFFSET, XTYPE_XBOXONE }, { 0x046d, 0xc21d, "Logitech Gamepad F310", 0, XTYPE_XBOX360 }, { 0x046d, 0xc21e, "Logitech Gamepad F510", 0, XTYPE_XBOX360 }, { 0x046d, 0xc21f, "Logitech Gamepad F710", 0, XTYPE_XBOX360 }, @@ -211,7 +212,7 @@ static const struct xpad_device { { 0x0738, 0xcb29, "Saitek Aviator Stick AV8R02", 0, XTYPE_XBOX360 }, { 0x0738, 0xf738, "Super SFIV FightStick TE S", 0, XTYPE_XBOX360 }, { 0x07ff, 0xffff, "Mad Catz GamePad", 0, XTYPE_XBOX360 }, - { 0x0b05, 0x1a38, "ASUS ROG RAIKIRI", 0, XTYPE_XBOXONE }, + { 0x0b05, 0x1a38, "ASUS ROG RAIKIRI", MAP_SHARE_BUTTON, XTYPE_XBOXONE }, { 0x0b05, 0x1abb, "ASUS ROG RAIKIRI PRO", 0, XTYPE_XBOXONE }, { 0x0c12, 0x0005, "Intec wireless", 0, XTYPE_XBOX }, { 0x0c12, 0x8801, "Nyko Xbox Controller", 0, XTYPE_XBOX }, @@ -390,7 +391,7 @@ static 
const struct xpad_device { { 0x2dc8, 0x6001, "8BitDo SN30 Pro", 0, XTYPE_XBOX360 }, { 0x2e24, 0x0652, "Hyperkin Duke X-Box One pad", 0, XTYPE_XBOXONE }, { 0x2e24, 0x1688, "Hyperkin X91 X-Box One pad", 0, XTYPE_XBOXONE }, - { 0x2e95, 0x0504, "SCUF Gaming Controller", MAP_SELECT_BUTTON, XTYPE_XBOXONE }, + { 0x2e95, 0x0504, "SCUF Gaming Controller", MAP_SHARE_BUTTON, XTYPE_XBOXONE }, { 0x31e3, 0x1100, "Wooting One", 0, XTYPE_XBOX360 }, { 0x31e3, 0x1200, "Wooting Two", 0, XTYPE_XBOX360 }, { 0x31e3, 0x1210, "Wooting Lekker", 0, XTYPE_XBOX360 }, @@ -1027,7 +1028,7 @@ static void xpad360w_process_packet(stru * The report format was gleaned from * https://github.com/kylelemons/xbox/blob/master/xbox.go */ -static void xpadone_process_packet(struct usb_xpad *xpad, u16 cmd, unsigned char *data) +static void xpadone_process_packet(struct usb_xpad *xpad, u16 cmd, unsigned char *data, u32 len) { struct input_dev *dev = xpad->dev; bool do_sync = false; @@ -1068,8 +1069,12 @@ static void xpadone_process_packet(struc /* menu/view buttons */ input_report_key(dev, BTN_START, data[4] & BIT(2)); input_report_key(dev, BTN_SELECT, data[4] & BIT(3)); - if (xpad->mapping & MAP_SELECT_BUTTON) - input_report_key(dev, KEY_RECORD, data[22] & BIT(0)); + if (xpad->mapping & MAP_SHARE_BUTTON) { + if (xpad->mapping & MAP_SHARE_OFFSET) + input_report_key(dev, KEY_RECORD, data[len - 26] & BIT(0)); + else + input_report_key(dev, KEY_RECORD, data[len - 18] & BIT(0)); + }
/* buttons A,B,X,Y */ input_report_key(dev, BTN_A, data[4] & BIT(4)); @@ -1217,7 +1222,7 @@ static void xpad_irq_in(struct urb *urb) xpad360w_process_packet(xpad, 0, xpad->idata); break; case XTYPE_XBOXONE: - xpadone_process_packet(xpad, 0, xpad->idata); + xpadone_process_packet(xpad, 0, xpad->idata, urb->actual_length); break; default: xpad_process_packet(xpad, 0, xpad->idata); @@ -1944,7 +1949,7 @@ static int xpad_init_input(struct usb_xp xpad->xtype == XTYPE_XBOXONE) { for (i = 0; xpad360_btn[i] >= 0; i++) input_set_capability(input_dev, EV_KEY, xpad360_btn[i]); - if (xpad->mapping & MAP_SELECT_BUTTON) + if (xpad->mapping & MAP_SHARE_BUTTON) input_set_capability(input_dev, EV_KEY, KEY_RECORD); } else { for (i = 0; xpad_btn[i] >= 0; i++)
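The offset selection in this patch can be sketched as a stand-alone function. The flag bits mirror `BIT(3)` and `BIT(6)` from the diff, but the `DEMO_` names and the explicit length check are assumptions added for this illustration; the patch itself indexes `data[len - 26]` or `data[len - 18]` directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAP_SHARE_BUTTON (1u << 3)
#define DEMO_MAP_SHARE_OFFSET (1u << 6)

/*
 * Return true if the Share bit is set. The byte is located relative to
 * the end of the packet: len - 26 when DEMO_MAP_SHARE_OFFSET is set,
 * otherwise len - 18. Packets that are too short report "not pressed".
 */
static bool demo_share_pressed(uint32_t mapping, const uint8_t *data, size_t len)
{
	size_t back = (mapping & DEMO_MAP_SHARE_OFFSET) ? 26 : 18;

	assert(data != NULL);
	if (!(mapping & DEMO_MAP_SHARE_BUTTON) || len < back)
		return false;
	return (data[len - back] & 1u) != 0;
}
```

Measuring from the end of the packet rather than from the start is what lets one code path serve controllers whose reports differ in length.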
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lode Willems me@lodewillems.com
commit 22cd66a5db56a07d9e621367cb4d16ff0f6baf56 upstream.
This patch adds support for the 8BitDo Ultimate 2 Wireless Controller. Tested using the wireless dongle and plugged in.
Signed-off-by: Lode Willems me@lodewillems.com
Link: https://lore.kernel.org/r/20250422112457.6728-1-me@lodewillems.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/joystick/xpad.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/input/joystick/xpad.c
+++ b/drivers/input/joystick/xpad.c
@@ -388,6 +388,7 @@ static const struct xpad_device {
 	{ 0x2dc8, 0x3106, "8BitDo Ultimate Wireless / Pro 2 Wired Controller", 0, XTYPE_XBOX360 },
 	{ 0x2dc8, 0x3109, "8BitDo Ultimate Wireless Bluetooth", 0, XTYPE_XBOX360 },
 	{ 0x2dc8, 0x310a, "8BitDo Ultimate 2C Wireless Controller", 0, XTYPE_XBOX360 },
+	{ 0x2dc8, 0x310b, "8BitDo Ultimate 2 Wireless Controller", 0, XTYPE_XBOX360 },
 	{ 0x2dc8, 0x6001, "8BitDo SN30 Pro", 0, XTYPE_XBOX360 },
 	{ 0x2e24, 0x0652, "Hyperkin Duke X-Box One pad", 0, XTYPE_XBOXONE },
 	{ 0x2e24, 0x1688, "Hyperkin X91 X-Box One pad", 0, XTYPE_XBOXONE },
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vicki Pfau vi@endrift.com
commit d05a424bea9aa3435009d5c462055008cc1545d8 upstream.
Two controllers -- Mad Catz JOYTECH NEO SE Advanced and PDP Mirror's Edge Official -- were missing the value of the mapping field, and thus would not be detected properly.
Signed-off-by: Vicki Pfau vi@endrift.com
Link: https://lore.kernel.org/r/20250328234345.989761-1-vi@endrift.com
Fixes: 540602a43ae5 ("Input: xpad - add a few new VID/PID combinations")
Fixes: 3492321e2e60 ("Input: xpad - add multiple supported devices")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/joystick/xpad.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/input/joystick/xpad.c
+++ b/drivers/input/joystick/xpad.c
@@ -206,7 +206,7 @@ static const struct xpad_device {
 	{ 0x0738, 0x9871, "Mad Catz Portable Drum", 0, XTYPE_XBOX360 },
 	{ 0x0738, 0xb726, "Mad Catz Xbox controller - MW2", 0, XTYPE_XBOX360 },
 	{ 0x0738, 0xb738, "Mad Catz MVC2TE Stick 2", MAP_TRIGGERS_TO_BUTTONS, XTYPE_XBOX360 },
-	{ 0x0738, 0xbeef, "Mad Catz JOYTECH NEO SE Advanced GamePad", XTYPE_XBOX360 },
+	{ 0x0738, 0xbeef, "Mad Catz JOYTECH NEO SE Advanced GamePad", 0, XTYPE_XBOX360 },
 	{ 0x0738, 0xcb02, "Saitek Cyborg Rumble Pad - PC/Xbox 360", 0, XTYPE_XBOX360 },
 	{ 0x0738, 0xcb03, "Saitek P3200 Rumble Pad - PC/Xbox 360", 0, XTYPE_XBOX360 },
 	{ 0x0738, 0xcb29, "Saitek Aviator Stick AV8R02", 0, XTYPE_XBOX360 },
@@ -241,7 +241,7 @@ static const struct xpad_device {
 	{ 0x0e6f, 0x0146, "Rock Candy Wired Controller for Xbox One", 0, XTYPE_XBOXONE },
 	{ 0x0e6f, 0x0147, "PDP Marvel Xbox One Controller", 0, XTYPE_XBOXONE },
 	{ 0x0e6f, 0x015c, "PDP Xbox One Arcade Stick", MAP_TRIGGERS_TO_BUTTONS, XTYPE_XBOXONE },
-	{ 0x0e6f, 0x015d, "PDP Mirror's Edge Official Wired Controller for Xbox One", XTYPE_XBOXONE },
+	{ 0x0e6f, 0x015d, "PDP Mirror's Edge Official Wired Controller for Xbox One", 0, XTYPE_XBOXONE },
 	{ 0x0e6f, 0x0161, "PDP Xbox One Controller", 0, XTYPE_XBOXONE },
 	{ 0x0e6f, 0x0162, "PDP Xbox One Controller", 0, XTYPE_XBOXONE },
 	{ 0x0e6f, 0x0163, "PDP Xbox One Controller", 0, XTYPE_XBOXONE },
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Manuel Fombuena fombuena@outlook.com
commit 6d7ea0881000966607772451b789b5fb5766f11d upstream.
[ 5.989588] psmouse serio1: synaptics: Your touchpad (PNP: TOS0213 PNP0f03) says it can support a different bus. If i2c-hid and hid-rmi are not used, you might want to try setting psmouse.synaptics_intertouch to 1 and report this to linux-input@vger.kernel.org.
[ 6.039923] psmouse serio1: synaptics: Touchpad model: 1, fw: 9.32, id: 0x1e2a1, caps: 0xf00223/0x840300/0x12e800/0x52d884, board id: 3322, fw id: 2658004
The board is labelled TM3322.
Present on the Toshiba / Dynabook Portege X30-D and possibly others.
Confirmed working well with psmouse.synaptics_intertouch=1 and local build.
Signed-off-by: Manuel Fombuena fombuena@outlook.com
Signed-off-by: Aditya Garg gargaditya08@live.com
Link: https://lore.kernel.org/r/PN3PR01MB9597711E7933A08389FEC31DB888A@PN3PR01MB95...
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/mouse/synaptics.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -194,6 +194,7 @@ static const char * const smbus_pnp_ids[
 	"SYN3221", /* HP 15-ay000 */
 	"SYN323d", /* HP Spectre X360 13-w013dx */
 	"SYN3257", /* HP Envy 13-ad105ng */
+	"TOS0213", /* Dynabook Portege X30-D */
 	NULL
 };
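The table touched by this patch is a NULL-terminated list of PNP IDs. As a simplified sketch of how such a list can be searched, the following uses an exact string comparison; the function name and the two-entry excerpt are assumptions for illustration, and the in-kernel matching logic may differ (for example in how it parses the firmware ID string).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Excerpt of IDs named in the patch above; the list is NULL-terminated. */
static const char * const demo_smbus_ids[] = {
	"SYN3257",
	"TOS0213",
	NULL
};

/* Walk the table until the NULL sentinel, looking for an exact match. */
static bool demo_id_listed(const char * const ids[], const char *id)
{
	assert(ids != NULL && id != NULL);
	for (size_t i = 0; ids[i] != NULL; i++) {
		if (strcmp(ids[i], id) == 0)
			return true;
	}
	return false;
}
```

Because the sentinel marks the end of the list, adding a device is a one-line change to the table with no length constant to update.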
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aditya Garg gargaditya08@live.com
commit 47d768b32e644b56901bb4bbbdb1feb01ea86c85 upstream.
Enable InterTouch mode on Dynabook Portege X30L-G by adding "TOS01f6" to the list of SMBus-enabled variants.
Reported-by: Xuntao Chi chotaotao1qaz2wsx@gmail.com
Tested-by: Xuntao Chi chotaotao1qaz2wsx@gmail.com
Signed-off-by: Aditya Garg gargaditya08@live.com
Link: https://lore.kernel.org/r/PN3PR01MB959786E4AC797160CDA93012B888A@PN3PR01MB95...
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/mouse/synaptics.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -194,6 +194,7 @@ static const char * const smbus_pnp_ids[
 	"SYN3221", /* HP 15-ay000 */
 	"SYN323d", /* HP Spectre X360 13-w013dx */
 	"SYN3257", /* HP Envy 13-ad105ng */
+	"TOS01f6", /* Dynabook Portege X30L-G */
 	"TOS0213", /* Dynabook Portege X30-D */
 	NULL
 };
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aditya Garg gargaditya08@live.com
commit a609cb4cc07aa9ab8f50466622814356c06f2c17 upstream.
Enable InterTouch mode on Dell Precision M3800 by adding "DLL060d" to the list of SMBus-enabled variants.
Reported-by: Markus Rathgeb maggu2810@gmail.com
Signed-off-by: Aditya Garg gargaditya08@live.com
Link: https://lore.kernel.org/r/PN3PR01MB959789DD6D574E16141E5DC4B888A@PN3PR01MB95...
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/mouse/synaptics.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -163,6 +163,7 @@ static const char * const topbuttonpad_p
 
 static const char * const smbus_pnp_ids[] = {
 	/* all of the topbuttonpad_pnp_ids are valid, we just add some extras */
+	"DLL060d", /* Dell Precision M3800 */
 	"LEN0048", /* X1 Carbon 3 */
 	"LEN0046", /* X250 */
 	"LEN0049", /* Yoga 11e */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Torokhov dmitry.torokhov@gmail.com
commit f04f03d3e99bc8f89b6af5debf07ff67d961bc23 upstream.
The kernel reports that the touchpad for this device can support SMBus mode.
Reported-by: jt enopatch@gmail.com
Link: https://lore.kernel.org/r/iys5dbv3ldddsgobfkxldazxyp54kay4bozzmagga6emy45jop...
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/mouse/synaptics.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -190,6 +190,7 @@ static const char * const smbus_pnp_ids[
 	"LEN2054", /* E480 */
 	"LEN2055", /* E580 */
 	"LEN2068", /* T14 Gen 1 */
+	"SYN3003", /* HP EliteBook 850 G1 */
 	"SYN3015", /* HP EliteBook 840 G2 */
 	"SYN3052", /* HP EliteBook 840 G4 */
 	"SYN3221", /* HP 15-ay000 */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aditya Garg gargaditya08@live.com
commit 2abc698ac77314e0de5b33a6d96a39c5159d88e4 upstream.
Enable InterTouch mode on TUXEDO InfinityBook Pro 14 v5 by adding "SYN1221" to the list of SMBus-enabled variants.
Add support for InterTouch on SYN1221 by adding it to the list of SMBus-enabled variants.
Reported-by: Matthias Eilert kernel.hias@eilert.tech
Tested-by: Matthias Eilert kernel.hias@eilert.tech
Signed-off-by: Aditya Garg gargaditya08@live.com
Link: https://lore.kernel.org/r/PN3PR01MB9597C033C4BC20EE2A0C4543B888A@PN3PR01MB95...
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/input/mouse/synaptics.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -190,6 +190,7 @@ static const char * const smbus_pnp_ids[
 	"LEN2054", /* E480 */
 	"LEN2055", /* E580 */
 	"LEN2068", /* T14 Gen 1 */
+	"SYN1221", /* TUXEDO InfinityBook Pro 14 v5 */
 	"SYN3003", /* HP EliteBook 850 G1 */
 	"SYN3015", /* HP EliteBook 840 G2 */
 	"SYN3052", /* HP EliteBook 840 G4 */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miguel Ojeda ojeda@kernel.org
commit 7129ea6e242b00938532537da41ddf5fa3e21471 upstream.
Starting with Rust 1.88.0 (expected 2025-06-26) [1][2], `rustc` may introduce a new lint that catches unnecessary transmutes, e.g.:
    error: unnecessary transmute
         --> rust/uapi/uapi_generated.rs:23242:18
          |
    23242 |         unsafe { ::core::mem::transmute(self._bitfield_1.get(0usize, 1u8) as u8) }
          |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: replace this with: `(self._bitfield_1.get(0usize, 1u8) as u8 == 1)`
          |
          = note: `-D unnecessary-transmutes` implied by `-D warnings`
          = help: to override `-D warnings` add `#[allow(unnecessary_transmutes)]`
There are a lot of them (at least 300), but luckily they are all in `bindgen`-generated code.
Thus clean them all up by allowing the lint there.
Since unknown lints trigger a lint itself in older compilers, do it conditionally so that we can keep the `unknown_lints` lint enabled.
Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust/pull/136083 [1]
Link: https://github.com/rust-lang/rust/issues/136067 [2]
Reviewed-by: Alice Ryhl aliceryhl@google.com
Link: https://lore.kernel.org/r/20250502140237.1659624-4-ojeda@kernel.org
Signed-off-by: Miguel Ojeda ojeda@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 init/Kconfig         | 3 +++
 rust/bindings/lib.rs | 1 +
 rust/uapi/lib.rs     | 1 +
 3 files changed, 5 insertions(+)

--- a/init/Kconfig
+++ b/init/Kconfig
@@ -137,6 +137,9 @@ config LD_CAN_USE_KEEP_IN_OVERLAY
 config RUSTC_HAS_COERCE_POINTEE
 	def_bool RUSTC_VERSION >= 108400
 
+config RUSTC_HAS_UNNECESSARY_TRANSMUTES
+	def_bool RUSTC_VERSION >= 108800
+
 config PAHOLE_VERSION
 	int
 	default $(shell,$(srctree)/scripts/pahole-version.sh $(PAHOLE))
--- a/rust/bindings/lib.rs
+++ b/rust/bindings/lib.rs
@@ -26,6 +26,7 @@
 
 #[allow(dead_code)]
 #[allow(clippy::undocumented_unsafe_blocks)]
+#[cfg_attr(CONFIG_RUSTC_HAS_UNNECESSARY_TRANSMUTES, allow(unnecessary_transmutes))]
 mod bindings_raw {
     // Manual definition for blocklisted types.
     type __kernel_size_t = usize;
--- a/rust/uapi/lib.rs
+++ b/rust/uapi/lib.rs
@@ -24,6 +24,7 @@
     unreachable_pub,
     unsafe_op_in_unsafe_fn
 )]
+#![cfg_attr(CONFIG_RUSTC_HAS_UNNECESSARY_TRANSMUTES, allow(unnecessary_transmutes))]
 
 // Manual definition of blocklisted types.
 type __kernel_size_t = usize;
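The Kconfig thresholds in this patch (108400 and 108800) are consistent with packing a compiler version as major * 100000 + minor * 100 + patch. The sketch below assumes that encoding; the function names are hypothetical and exist only to show how the `RUSTC_VERSION >= 108800` comparison selects Rust 1.88.0 and later.

```c
#include <assert.h>

/* Assumed encoding: 1.88.0 -> 108800 (major * 100000 + minor * 100 + patch). */
static long demo_rustc_version(int major, int minor, int patch)
{
	assert(major >= 0 && minor >= 0 && minor < 1000);
	assert(patch >= 0 && patch < 100);
	return major * 100000L + minor * 100L + patch;
}

/* Mirrors "def_bool RUSTC_VERSION >= 108800" from the Kconfig hunk. */
static int demo_has_unnecessary_transmutes(long rustc_version)
{
	return rustc_version >= 108800;
}
```

Gating the `allow` on a version check is what keeps older compilers from tripping over a lint name they do not know.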
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miguel Ojeda ojeda@kernel.org
commit 19f5ca461d5fc09bdf93a9f8e4bd78ed3a49dc71 upstream.
Starting with Rust 1.87.0 (expected 2025-05-15), `objtool` may report:
rust/core.o: warning: objtool: _R..._4core9panicking9panic_fmt() falls through to next function _R..._4core9panicking18panic_nounwind_fmt()
rust/core.o: warning: objtool: _R..._4core9panicking18panic_nounwind_fmt() falls through to next function _R..._4core9panicking5panic()
The reason is that `rust_begin_unwind` is now mangled:
_R..._7___rustc17rust_begin_unwind
Thus add the mangled one to the list so that `objtool` knows it is actually `noreturn`.
See commit 56d680dd23c3 ("objtool/rust: list `noreturn` Rust functions") for more details.
Alternatively, we could remove the fixed one in `noreturn.h` and relax this test to cover both, but it seems best to be strict as long as we can.
Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Cc: Josh Poimboeuf jpoimboe@kernel.org
Cc: Peter Zijlstra peterz@infradead.org
Reviewed-by: Alice Ryhl aliceryhl@google.com
Link: https://lore.kernel.org/r/20250502140237.1659624-2-ojeda@kernel.org
Signed-off-by: Miguel Ojeda ojeda@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 tools/objtool/check.c | 1 +
 1 file changed, 1 insertion(+)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -228,6 +228,7 @@ static bool is_rust_noreturn(const struc
 	       str_ends_with(func->name, "_4core9panicking19assert_failed_inner") ||
 	       str_ends_with(func->name, "_4core9panicking30panic_null_pointer_dereference") ||
 	       str_ends_with(func->name, "_4core9panicking36panic_misaligned_pointer_dereference") ||
+	       str_ends_with(func->name, "_7___rustc17rust_begin_unwind") ||
 	       strstr(func->name, "_4core9panicking13assert_failed") ||
 	       strstr(func->name, "_4core9panicking11panic_const24panic_const_") ||
 	       (strstr(func->name, "_4core5slice5index24slice_") &&
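The check added here is a suffix match on the mangled symbol name. A minimal stand-in for such a helper is sketched below; `demo_str_ends_with` is a hypothetical name and is not claimed to be objtool's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if s ends with suffix; an empty suffix always matches. */
static bool demo_str_ends_with(const char *s, const char *suffix)
{
	size_t slen, xlen;

	assert(s != NULL && suffix != NULL);
	slen = strlen(s);
	xlen = strlen(suffix);
	return slen >= xlen && memcmp(s + slen - xlen, suffix, xlen) == 0;
}
```

Matching on the suffix lets the prefix of the mangled name (shown as `_R...` in the commit message) vary without affecting the result, while the plain unmangled name `rust_begin_unwind` does not match the new mangled suffix.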
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miguel Ojeda ojeda@kernel.org
commit c016722fd57551f8a6fcf472c9d2bcf2130ea0ec upstream.
Starting with Rust 1.88.0 (expected 2025-06-26) [1], Clippy may start warning about paths that do not resolve in the `disallowed_macros` configuration:
    warning: `kernel::dbg` does not refer to an existing macro
      --> .clippy.toml:10:5
       |
    10 |     { path = "kernel::dbg", reason = "the `dbg!` macro is intended as a debugging tool" },
       |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is a lint we requested at [2], because false negatives (e.g. [3]) made it hard to debug the `disallowed_macros` configuration, which we use to emulate `clippy::dbg_macro` [4]. See commit 8577c9dca799 ("rust: replace `clippy::dbg_macro` with `disallowed_macros`") for more details.
Given the false negatives are not resolved yet, it is expected that Clippy complains about not finding this macro.
Thus, until the false negatives are fixed (and, even then, probably we will need to wait for the MSRV to raise enough), use the escape hatch to allow an invalid path.
Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust-clippy/pull/14397 [1]
Link: https://github.com/rust-lang/rust-clippy/issues/11432 [2]
Link: https://github.com/rust-lang/rust-clippy/issues/11431 [3]
Link: https://github.com/rust-lang/rust-clippy/issues/11303 [4]
Reviewed-by: Alice Ryhl aliceryhl@google.com
Link: https://lore.kernel.org/r/20250502140237.1659624-5-ojeda@kernel.org
Signed-off-by: Miguel Ojeda ojeda@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 .clippy.toml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/.clippy.toml
+++ b/.clippy.toml
@@ -7,5 +7,5 @@ check-private-items = true
 disallowed-macros = [
     # The `clippy::dbg_macro` lint only works with `std::dbg!`, thus we simulate
     # it here, see: https://github.com/rust-lang/rust-clippy/issues/11303.
-    { path = "kernel::dbg", reason = "the `dbg!` macro is intended as a debugging tool" },
+    { path = "kernel::dbg", reason = "the `dbg!` macro is intended as a debugging tool", allow-invalid = true },
 ]
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Naman Jain namjain@linux.microsoft.com
commit f31fe8165d365379d858c53bef43254c7d6d1cfd upstream.
On regular bootup, devices are registered with VMBus first, so by the time the uio_hv_generic driver is probed for a particular device type, the device is already initialized and added, and sysfs creation in hv_uio_probe() works fine. However, when the device is removed and brought back, the channel gets rescinded and the device is registered with VMBus again. This time the uio_hv_generic driver is already registered to probe for that device, so sysfs creation is attempted before the device's kobject is completely initialized.
Fix this by moving the core logic of sysfs creation of ring buffer, from uio_hv_generic to HyperV's VMBus driver, where the rest of the sysfs attributes for the channels are defined. While doing that, make use of attribute groups and macros, instead of creating sysfs directly, to ensure better error handling and code flow.
Problematic path:
vmbus_process_offer (A new offer comes for the VMBus device)
  vmbus_add_channel_work
    vmbus_device_register
      |-> device_register
      |     |...
      |     |-> hv_uio_probe
      |           |...
      |           |-> sysfs_create_bin_file (leads to a warning as
      |                 the primary channel's kobject, which is used to
      |                 create the sysfs file, is not yet initialized)
      |-> kset_create_and_add
      |-> vmbus_add_channel_kobj (initialization of the primary
            channel's kobject happens later)
Above code flow is sequential and the warning is always reproducible in this path.
Fixes: 9ab877a6ccf8 ("uio_hv_generic: make ring buffer attribute for primary channel")
Cc: stable@kernel.org
Suggested-by: Saurabh Sengar ssengar@linux.microsoft.com
Suggested-by: Michael Kelley mhklinux@outlook.com
Reviewed-by: Michael Kelley mhklinux@outlook.com
Tested-by: Michael Kelley mhklinux@outlook.com
Reviewed-by: Dexuan Cui decui@microsoft.com
Signed-off-by: Naman Jain namjain@linux.microsoft.com
Link: https://lore.kernel.org/r/20250502074811.2022-2-namjain@linux.microsoft.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/hv/hyperv_vmbus.h    |   6 ++
 drivers/hv/vmbus_drv.c       | 100 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/uio/uio_hv_generic.c |  39 +++++++---------
 include/linux/hyperv.h       |   6 ++
 4 files changed, 128 insertions(+), 23 deletions(-)
--- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -477,4 +477,10 @@ static inline int hv_debug_add_dev_dir(s
#endif /* CONFIG_HYPERV_TESTING */
+/* Create and remove sysfs entry for memory mapped ring buffers for a channel */ +int hv_create_ring_sysfs(struct vmbus_channel *channel, + int (*hv_mmap_ring_buffer)(struct vmbus_channel *channel, + struct vm_area_struct *vma)); +int hv_remove_ring_sysfs(struct vmbus_channel *channel); + #endif /* _HYPERV_VMBUS_H */ --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1792,6 +1792,27 @@ static ssize_t subchannel_id_show(struct } static VMBUS_CHAN_ATTR_RO(subchannel_id);
+static int hv_mmap_ring_buffer_wrapper(struct file *filp, struct kobject *kobj, + const struct bin_attribute *attr, + struct vm_area_struct *vma) +{ + struct vmbus_channel *channel = container_of(kobj, struct vmbus_channel, kobj); + + /* + * hv_(create|remove)_ring_sysfs implementation ensures that mmap_ring_buffer + * is not NULL. + */ + return channel->mmap_ring_buffer(channel, vma); +} + +static struct bin_attribute chan_attr_ring_buffer = { + .attr = { + .name = "ring", + .mode = 0600, + }, + .size = 2 * SZ_2M, + .mmap = hv_mmap_ring_buffer_wrapper, +}; static struct attribute *vmbus_chan_attrs[] = { &chan_attr_out_mask.attr, &chan_attr_in_mask.attr, @@ -1811,6 +1832,11 @@ static struct attribute *vmbus_chan_attr NULL };
+static struct bin_attribute *vmbus_chan_bin_attrs[] = { + &chan_attr_ring_buffer, + NULL +}; + /* * Channel-level attribute_group callback function. Returns the permission for * each attribute, and returns 0 if an attribute is not visible. @@ -1831,9 +1857,24 @@ static umode_t vmbus_chan_attr_is_visibl return attr->mode; }
+static umode_t vmbus_chan_bin_attr_is_visible(struct kobject *kobj, + const struct bin_attribute *attr, int idx) +{ + const struct vmbus_channel *channel = + container_of(kobj, struct vmbus_channel, kobj); + + /* Hide ring attribute if channel's ring_sysfs_visible is set to false */ + if (attr == &chan_attr_ring_buffer && !channel->ring_sysfs_visible) + return 0; + + return attr->attr.mode; +} + static const struct attribute_group vmbus_chan_group = { .attrs = vmbus_chan_attrs, - .is_visible = vmbus_chan_attr_is_visible + .bin_attrs = vmbus_chan_bin_attrs, + .is_visible = vmbus_chan_attr_is_visible, + .is_bin_visible = vmbus_chan_bin_attr_is_visible, };
static const struct kobj_type vmbus_chan_ktype = { @@ -1841,6 +1882,63 @@ static const struct kobj_type vmbus_chan .release = vmbus_chan_release, };
+/**
+ * hv_create_ring_sysfs() - create "ring" sysfs entry corresponding to ring buffers for a channel.
+ * @channel: Pointer to vmbus_channel structure
+ * @hv_mmap_ring_buffer: function pointer for initializing the function to be called on mmap of
+ * channel's "ring" sysfs node, which is for the ring buffer of that channel.
+ * Function pointer is of below type:
+ * int (*hv_mmap_ring_buffer)(struct vmbus_channel *channel,
+ * struct vm_area_struct *vma))
+ * This has a pointer to the channel and a pointer to vm_area_struct,
+ * used for mmap, as arguments.
+ *
+ * Sysfs node for ring buffer of a channel is created along with other fields, however its
+ * visibility is disabled by default. Sysfs creation needs to be controlled when the use-case
+ * is running.
+ * For example, HV_NIC device is used either by uio_hv_generic or hv_netvsc at any given point of
+ * time, and "ring" sysfs is needed only when uio_hv_generic is bound to that device. To avoid
+ * exposing the ring buffer by default, this function is reponsible to enable visibility of
+ * ring for userspace to use.
+ * Note: Race conditions can happen with userspace and it is not encouraged to create new
+ * use-cases for this. This was added to maintain backward compatibility, while solving
+ * one of the race conditions in uio_hv_generic while creating sysfs.
+ *
+ * Returns 0 on success or error code on failure.
+ */
+int hv_create_ring_sysfs(struct vmbus_channel *channel,
+			 int (*hv_mmap_ring_buffer)(struct vmbus_channel *channel,
+						    struct vm_area_struct *vma))
+{
+	struct kobject *kobj = &channel->kobj;
+
+	channel->mmap_ring_buffer = hv_mmap_ring_buffer;
+	channel->ring_sysfs_visible = true;
+
+	return sysfs_update_group(kobj, &vmbus_chan_group);
+}
+EXPORT_SYMBOL_GPL(hv_create_ring_sysfs);
+
+/**
+ * hv_remove_ring_sysfs() - remove ring sysfs entry corresponding to ring buffers for a channel.
+ * @channel: Pointer to vmbus_channel structure
+ *
+ * Hide "ring" sysfs for a channel by changing its is_visible attribute and updating sysfs group.
+ *
+ * Returns 0 on success or error code on failure.
+ */
+int hv_remove_ring_sysfs(struct vmbus_channel *channel)
+{
+	struct kobject *kobj = &channel->kobj;
+	int ret;
+
+	channel->ring_sysfs_visible = false;
+	ret = sysfs_update_group(kobj, &vmbus_chan_group);
+	channel->mmap_ring_buffer = NULL;
+	return ret;
+}
+EXPORT_SYMBOL_GPL(hv_remove_ring_sysfs);
+
 /*
  * vmbus_add_channel_kobj - setup a sub-directory under device/channels
  */
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -131,15 +131,12 @@ static void hv_uio_rescind(struct vmbus_
 	vmbus_device_unregister(channel->device_obj);
 }

-/* Sysfs API to allow mmap of the ring buffers
+/* Function used for mmap of ring buffer sysfs interface.
  * The ring buffer is allocated as contiguous memory by vmbus_open
  */
-static int hv_uio_ring_mmap(struct file *filp, struct kobject *kobj,
-			    const struct bin_attribute *attr,
-			    struct vm_area_struct *vma)
+static int
+hv_uio_ring_mmap(struct vmbus_channel *channel, struct vm_area_struct *vma)
 {
-	struct vmbus_channel *channel
-		= container_of(kobj, struct vmbus_channel, kobj);
 	void *ring_buffer = page_address(channel->ringbuffer_page);

 	if (channel->state != CHANNEL_OPENED_STATE)
@@ -149,15 +146,6 @@ static int hv_uio_ring_mmap(struct file
 			       channel->ringbuffer_pagecount << PAGE_SHIFT);
 }

-static const struct bin_attribute ring_buffer_bin_attr = {
-	.attr = {
-		.name = "ring",
-		.mode = 0600,
-	},
-	.size = 2 * SZ_2M,
-	.mmap = hv_uio_ring_mmap,
-};
-
 /* Callback from VMBUS subsystem when new channel created. */
 static void
 hv_uio_new_channel(struct vmbus_channel *new_sc)
@@ -178,8 +166,7 @@ hv_uio_new_channel(struct vmbus_channel
 	/* Disable interrupts on sub channel */
 	new_sc->inbound.ring_buffer->interrupt_mask = 1;
 	set_channel_read_mode(new_sc, HV_CALL_ISR);
-
-	ret = sysfs_create_bin_file(&new_sc->kobj, &ring_buffer_bin_attr);
+	ret = hv_create_ring_sysfs(new_sc, hv_uio_ring_mmap);
 	if (ret) {
 		dev_err(device, "sysfs create ring bin file failed; %d\n", ret);
 		vmbus_close(new_sc);
@@ -350,10 +337,18 @@ hv_uio_probe(struct hv_device *dev,
 		goto fail_close;
 	}

-	ret = sysfs_create_bin_file(&channel->kobj, &ring_buffer_bin_attr);
-	if (ret)
-		dev_notice(&dev->device,
-			   "sysfs create ring bin file failed; %d\n", ret);
+	/*
+	 * This internally calls sysfs_update_group, which returns a non-zero value if it executes
+	 * before sysfs_create_group. This is expected as the 'ring' will be created later in
+	 * vmbus_device_register() -> vmbus_add_channel_kobj(). Thus, no need to check the return
+	 * value and print warning.
+	 *
+	 * Creating/exposing sysfs in driver probe is not encouraged as it can lead to race
+	 * conditions with userspace. For backward compatibility, "ring" sysfs could not be removed
+	 * or decoupled from uio_hv_generic probe. Userspace programs can make use of inotify
+	 * APIs to make sure that ring is created.
+	 */
+	hv_create_ring_sysfs(channel, hv_uio_ring_mmap);

 	hv_set_drvdata(dev, pdata);

@@ -375,7 +370,7 @@ hv_uio_remove(struct hv_device *dev)
 	if (!pdata)
 		return;

-	sysfs_remove_bin_file(&dev->channel->kobj, &ring_buffer_bin_attr);
+	hv_remove_ring_sysfs(dev->channel);
 	uio_unregister_device(&pdata->info);
 	hv_uio_cleanup(dev, pdata);

--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1058,6 +1058,12 @@ struct vmbus_channel {

 	/* The max size of a packet on this channel */
 	u32 max_pkt_size;
+
+	/* function to mmap ring buffer memory to the channel's sysfs ring attribute */
+	int (*mmap_ring_buffer)(struct vmbus_channel *channel, struct vm_area_struct *vma);
+
+	/* boolean to control visibility of sysfs for ring buffer */
+	bool ring_sysfs_visible;
 };

 #define lock_requestor(channel, flags) \
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabriel Shahrouzi gshahrouzi@gmail.com
commit 2e922956277187655ed9bedf7b5c28906e51708f upstream.
The mode setting logic in ad7816_store_mode was reversed due to incorrect handling of the strcmp return value. strcmp returns 0 on match, so the `if (strcmp(buf, "full"))` block executed when the input was not "full".
This resulted in "full" setting the mode to AD7816_PD (power-down) and other inputs setting it to AD7816_FULL.
Fix this by checking it against 0 to correctly check for "full" and "power-down", mapping them to AD7816_FULL and AD7816_PD respectively.
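The inverted test is easy to reproduce outside the driver. The sketch below is a userspace model, not the driver code: parse_mode_buggy(), parse_mode_fixed() and the MODE_* constants are hypothetical stand-ins for ad7816_store_mode() and AD7816_FULL/AD7816_PD.

```c
#include <string.h>

/* Hypothetical stand-ins for AD7816_FULL and AD7816_PD. */
enum mode { MODE_FULL, MODE_PD };

/* Buggy form: strcmp() returns 0 on a match, so this branch is taken
 * for every input except "full". */
static enum mode parse_mode_buggy(const char *buf)
{
	if (strcmp(buf, "full"))
		return MODE_FULL;
	return MODE_PD;
}

/* Fixed form: compare the result against 0 explicitly. */
static enum mode parse_mode_fixed(const char *buf)
{
	if (strcmp(buf, "full") == 0)
		return MODE_FULL;
	return MODE_PD;
}
```

With the buggy form, "full" maps to MODE_PD and any other string maps to MODE_FULL; the fixed form restores the intended mapping.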
Fixes: 7924425db04a ("staging: iio: adc: new driver for AD7816 devices")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi gshahrouzi@gmail.com
Acked-by: Nuno Sá nuno.sa@analog.com
Link: https://lore.kernel.org/stable/20250414152920.467505-1-gshahrouzi%40gmail.co...
Link: https://patch.msgid.link/20250414154050.469482-1-gshahrouzi@gmail.com
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/staging/iio/adc/ad7816.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/staging/iio/adc/ad7816.c
+++ b/drivers/staging/iio/adc/ad7816.c
@@ -136,7 +136,7 @@ static ssize_t ad7816_store_mode(struct
 	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
 	struct ad7816_chip_info *chip = iio_priv(indio_dev);

-	if (strcmp(buf, "full")) {
+	if (strcmp(buf, "full") == 0) {
 		gpiod_set_value(chip->rdwr_pin, 1);
 		chip->mode = AD7816_FULL;
 	} else {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Stevenson dave.stevenson@raspberrypi.com
commit 98698ca0e58734bc5c1c24e5bbc7429f981cd186 upstream.
Commit 42a2f6664e18 ("staging: vc04_services: Move global g_state to vchiq_state") changed mmal_init to pass dev->v4l2_dev.dev to vchiq_mmal_init, however nothing initialised dev->v4l2_dev, so we got a NULL pointer dereference.
Set dev->v4l2_dev.dev during bcm2835_mmal_probe. The device pointer could be passed into v4l2_device_register to set it, however that also has other effects that would need additional changes.
Fixes: 42a2f6664e18 ("staging: vc04_services: Move global g_state to vchiq_state")
Cc: stable@vger.kernel.org
Signed-off-by: Dave Stevenson dave.stevenson@raspberrypi.com
Reviewed-by: Stefan Wahren wahrenst@gmx.net
Link: https://lore.kernel.org/r/20250423-staging-bcm2835-v4l2-fix-v2-1-3227f0ba470...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/staging/vc04_services/bcm2835-camera/bcm2835-camera.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/staging/vc04_services/bcm2835-camera/bcm2835-camera.c
+++ b/drivers/staging/vc04_services/bcm2835-camera/bcm2835-camera.c
@@ -1902,6 +1902,7 @@ static int bcm2835_mmal_probe(struct vch
 			  __func__, ret);
 		goto free_dev;
 	}
+	dev->v4l2_dev.dev = &device->dev;

 	/* setup v4l controls */
 	ret = bcm2835_mmal_init_controls(dev, &dev->ctrl_handler);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabriel Shahrouzi gshahrouzi@gmail.com
commit c6e8d85fafa7193613db37da29c0e8d6e2515b13 upstream.
The axis-fifo driver performs a full hardware reset (via reset_ip_core()) in several error paths within the read and write functions. This reset flushes both TX and RX FIFOs and resets the AXI-Stream links.
Allow the user to handle the error without causing hardware disruption or data loss in other FIFO paths.
Fixes: 4a965c5f89de ("staging: add driver for Xilinx AXI-Stream FIFO v4.1 IP core")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi gshahrouzi@gmail.com
Link: https://lore.kernel.org/r/20250419004306.669605-1-gshahrouzi@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/staging/axis-fifo/axis-fifo.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

--- a/drivers/staging/axis-fifo/axis-fifo.c
+++ b/drivers/staging/axis-fifo/axis-fifo.c
@@ -393,16 +393,14 @@ static ssize_t axis_fifo_read(struct fil

 	bytes_available = ioread32(fifo->base_addr + XLLF_RLR_OFFSET);
 	if (!bytes_available) {
-		dev_err(fifo->dt_device, "received a packet of length 0 - fifo core will be reset\n");
-		reset_ip_core(fifo);
+		dev_err(fifo->dt_device, "received a packet of length 0\n");
 		ret = -EIO;
 		goto end_unlock;
 	}

 	if (bytes_available > len) {
-		dev_err(fifo->dt_device, "user read buffer too small (available bytes=%zu user buffer bytes=%zu) - fifo core will be reset\n",
+		dev_err(fifo->dt_device, "user read buffer too small (available bytes=%zu user buffer bytes=%zu)\n",
 			bytes_available, len);
-		reset_ip_core(fifo);
 		ret = -EINVAL;
 		goto end_unlock;
 	}
@@ -411,8 +409,7 @@ static ssize_t axis_fifo_read(struct fil
 		/* this probably can't happen unless IP
 		 * registers were previously mishandled
 		 */
-		dev_err(fifo->dt_device, "received a packet that isn't word-aligned - fifo core will be reset\n");
-		reset_ip_core(fifo);
+		dev_err(fifo->dt_device, "received a packet that isn't word-aligned\n");
 		ret = -EIO;
 		goto end_unlock;
 	}
@@ -433,7 +430,6 @@ static ssize_t axis_fifo_read(struct fil

 		if (copy_to_user(buf + copied * sizeof(u32), tmp_buf,
 				 copy * sizeof(u32))) {
-			reset_ip_core(fifo);
 			ret = -EFAULT;
 			goto end_unlock;
 		}
@@ -542,7 +538,6 @@ static ssize_t axis_fifo_write(struct fi

 		if (copy_from_user(tmp_buf, buf + copied * sizeof(u32),
 				   copy * sizeof(u32))) {
-			reset_ip_core(fifo);
 			ret = -EFAULT;
 			goto end_unlock;
 		}
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabriel Shahrouzi gshahrouzi@gmail.com
commit 2ca34b508774aaa590fc3698a54204706ecca4ba upstream.
Remove erroneous subtraction of 4 from the total FIFO depth read from device tree. The stored depth is for checking against total capacity, not initial vacancy. This prevented writes near the FIFO's full size.
The check performed just before data transfer, which uses live reads of the TDFV register to determine current vacancy, correctly handles the initial Depth - 4 hardware state and subsequent FIFO fullness.
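The difference between the two checks can be modelled in userspace. This is only a sketch: fits_capacity() and fits_vacancy() are hypothetical helpers, and the `<=` comparisons are an assumption about how the driver compares sizes, not something taken from the hunk below.

```c
#include <stdbool.h>

/* Static capacity check: could a packet of this size ever fit?
 * tx_fifo_depth stands for the depth read from the device tree. */
static bool fits_capacity(unsigned int words, unsigned int tx_fifo_depth)
{
	return words <= tx_fifo_depth;
}

/* Live vacancy check: is there room right now? vacancy stands for
 * the value read from the TDFV register just before the transfer. */
static bool fits_vacancy(unsigned int words, unsigned int vacancy)
{
	return words <= vacancy;
}
```

For a 512-word FIFO, a 510-word write fails the capacity check when the stored depth is 512 - 4 but passes once the subtraction is removed; whether it can proceed at a given moment is then left to the vacancy check.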
Fixes: 4a965c5f89de ("staging: add driver for Xilinx AXI-Stream FIFO v4.1 IP core")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi gshahrouzi@gmail.com
Link: https://lore.kernel.org/r/20250419012937.674924-1-gshahrouzi@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/staging/axis-fifo/axis-fifo.c | 3 ---
 1 file changed, 3 deletions(-)

--- a/drivers/staging/axis-fifo/axis-fifo.c
+++ b/drivers/staging/axis-fifo/axis-fifo.c
@@ -770,9 +770,6 @@ static int axis_fifo_parse_dt(struct axi
 		goto end;
 	}

-	/* IP sets TDFV to fifo depth - 4 so we will do the same */
-	fifo->tx_fifo_depth -= 4;
-
 	ret = get_dts_property(fifo, "xlnx,use-rx-data", &fifo->has_rx_fifo);
 	if (ret) {
 		dev_err(fifo->dt_device, "missing xlnx,use-rx-data property\n");
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Hansen dave.hansen@linux.intel.com
commit fea4e317f9e7e1f449ce90dedc27a2d2a95bee5a upstream.
tl;dr: There is a window in the mm switching code where the new CR3 is set and the CPU should be getting TLB flushes for the new mm. But should_flush_tlb() has a bug and suppresses the flush. Fix it by widening the window where should_flush_tlb() sends an IPI.
Long Version:
=== History ===
There were a few things leading up to this.
First, updating mm_cpumask() was observed to be too expensive, so it was made lazier. But being lazy caused too many unnecessary IPIs to CPUs due to the now-lazy mm_cpumask(). So code was added to cull mm_cpumask() periodically[2]. But that culling was a bit too aggressive and skipped sending TLB flushes to CPUs that need them. So here we are again.
=== Problem ===
The too-aggressive code in should_flush_tlb() strikes in this window:
	// Turn on IPIs for this CPU/mm combination, but only
	// if should_flush_tlb() agrees:
	cpumask_set_cpu(cpu, mm_cpumask(next));

	next_tlb_gen = atomic64_read(&next->context.tlb_gen);
	choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
	load_new_mm_cr3(need_flush);
	// ^ After 'need_flush' is set to false, IPIs *MUST*
	// be sent to this CPU and not be ignored.

	this_cpu_write(cpu_tlbstate.loaded_mm, next);
	// ^ Not until this point does should_flush_tlb()
	// become true!
should_flush_tlb() will suppress TLB flushes between load_new_mm_cr3() and writing to 'loaded_mm', which is a window where they should not be suppressed. Whoops.
=== Solution ===
Thankfully, the fuzzy "just about to write CR3" window is already marked with loaded_mm==LOADED_MM_SWITCHING. Simply checking for that state in should_flush_tlb() is sufficient to ensure that the CPU is targeted with an IPI.
This will cause more TLB flush IPIs. But the window is relatively small and I do not expect this to cause any kind of measurable performance impact.
Update the comment where LOADED_MM_SWITCHING is written since it grew yet another user.
Peter Z also raised a concern that should_flush_tlb() might not observe 'loaded_mm' and 'is_lazy' in the same order that switch_mm_irqs_off() writes them. Add a barrier to ensure that they are observed in the order they are written.
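The resulting decision order can be sketched as a small userspace model. Everything here is illustrative: struct mm, struct cpu_state, MODEL_MM_SWITCHING and model_should_flush() are hypothetical names, the memory barrier is not modelled, and the periodic mm_cpumask() trimming at the end of the real function is reduced to "do not flush".

```c
#include <stdbool.h>
#include <stddef.h>

struct mm { int id; };

/* Hypothetical sentinel modelling LOADED_MM_SWITCHING. */
static struct mm switching_marker;
#define MODEL_MM_SWITCHING (&switching_marker)

/* Simplified per-CPU state; field names follow the commit text. */
struct cpu_state {
	struct mm *loaded_mm;
	bool is_lazy;
};

/* Model of the check order after the fix. */
static bool model_should_flush(const struct cpu_state *cpu,
			       const struct mm *target)
{
	if (cpu->is_lazy)
		return false;	/* lazy TLB is flushed at the next switch */
	if (!target)
		return true;	/* no mm given: always flush */
	if (cpu->loaded_mm == MODEL_MM_SWITCHING)
		return true;	/* mid-switch: assume the worst */
	return cpu->loaded_mm == target;
}
```

The third test is the new one: a CPU caught between the CR3 write and the 'loaded_mm' update is now treated as a flush target instead of being skipped.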
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Acked-by: Rik van Riel riel@surriel.com
Link: https://lore.kernel.org/oe-lkp/202411282207.6bd28eae-lkp@intel.com/ [1]
Fixes: 6db2526c1d69 ("x86/mm/tlb: Only trim the mm_cpumask once a second") [2]
Reported-by: Stephen Dolan sdolan@janestreet.com
Cc: stable@vger.kernel.org
Acked-by: Ingo Molnar mingo@kernel.org
Acked-by: Peter Zijlstra (Intel) peterz@infradead.org
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/mm/tlb.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -621,7 +621,11 @@ void switch_mm_irqs_off(struct mm_struct

 		choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);

-		/* Let nmi_uaccess_okay() know that we're changing CR3. */
+		/*
+		 * Indicate that CR3 is about to change. nmi_uaccess_okay()
+		 * and others are sensitive to the window where mm_cpumask(),
+		 * CR3 and cpu_tlbstate.loaded_mm are not all in sync.
+		 */
 		this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
 		barrier();
 	}
@@ -895,8 +899,16 @@ done:

 static bool should_flush_tlb(int cpu, void *data)
 {
+	struct mm_struct *loaded_mm = per_cpu(cpu_tlbstate.loaded_mm, cpu);
 	struct flush_tlb_info *info = data;

+	/*
+	 * Order the 'loaded_mm' and 'is_lazy' against their
+	 * write ordering in switch_mm_irqs_off(). Ensure
+	 * 'is_lazy' is at least as new as 'loaded_mm'.
+	 */
+	smp_rmb();
+
 	/* Lazy TLB will get flushed at the next context switch. */
 	if (per_cpu(cpu_tlbstate_shared.is_lazy, cpu))
 		return false;
@@ -905,8 +917,15 @@ static bool should_flush_tlb(int cpu, vo
 	if (!info->mm)
 		return true;

+	/*
+	 * While switching, the remote CPU could have state from
+	 * either the prev or next mm. Assume the worst and flush.
+	 */
+	if (loaded_mm == LOADED_MM_SWITCHING)
+		return true;
+
 	/* The target mm is loaded, and the CPU is not lazy. */
-	if (per_cpu(cpu_tlbstate.loaded_mm, cpu) == info->mm)
+	if (loaded_mm == info->mm)
 		return true;

 	/* In cpumask, but not the loaded mm? Periodically remove by flushing. */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Vaněk arkamar@atlas.cz
commit 7b08b74f3d99f6b801250683c751d391128799ec upstream.
On XEN PV, folio_pte_batch() can incorrectly batch beyond the end of a folio due to a corner case in pte_advance_pfn(). Specifically, when the PFN following the folio maps to an invalidated MFN,
expected_pte = pte_advance_pfn(expected_pte, nr);
produces a pte_none(). If the actual next PTE in memory is also pte_none(), the pte_same() succeeds,
if (!pte_same(pte, expected_pte)) break;
the loop is not broken, and batching continues into unrelated memory.
For example, with a 4-page folio, the PTE layout might look like this:
[ 53.465673] [ T2552] folio_pte_batch: printing PTE values at addr=0x7f1ac9dc5000
[ 53.465674] [ T2552] PTE[453] = 000000010085c125
[ 53.465679] [ T2552] PTE[454] = 000000010085d125
[ 53.465682] [ T2552] PTE[455] = 000000010085e125
[ 53.465684] [ T2552] PTE[456] = 000000010085f125
[ 53.465686] [ T2552] PTE[457] = 0000000000000000 <-- not present
[ 53.465689] [ T2552] PTE[458] = 0000000101da7125
pte_advance_pfn(PTE[456]) returns a pte_none() due to invalid PFN->MFN mapping. The next actual PTE (PTE[457]) is also pte_none(), so the loop continues and includes PTE[457] in the batch, resulting in 5 batched entries for a 4-page folio. This triggers the following warning:
[ 53.465751] [ T2552] page: refcount:85 mapcount:20 mapping:ffff88813ff4f6a8 index:0x110 pfn:0x10085c
[ 53.465754] [ T2552] head: order:2 mapcount:80 entire_mapcount:0 nr_pages_mapped:4 pincount:0
[ 53.465756] [ T2552] memcg:ffff888003573000
[ 53.465758] [ T2552] aops:0xffffffff8226fd20 ino:82467c dentry name(?):"libc.so.6"
[ 53.465761] [ T2552] flags: 0x2000000000416c(referenced|uptodate|lru|active|private|head|node=0|zone=2)
[ 53.465764] [ T2552] raw: 002000000000416c ffffea0004021f08 ffffea0004021908 ffff88813ff4f6a8
[ 53.465767] [ T2552] raw: 0000000000000110 ffff888133d8bd40 0000005500000013 ffff888003573000
[ 53.465768] [ T2552] head: 002000000000416c ffffea0004021f08 ffffea0004021908 ffff88813ff4f6a8
[ 53.465770] [ T2552] head: 0000000000000110 ffff888133d8bd40 0000005500000013 ffff888003573000
[ 53.465772] [ T2552] head: 0020000000000202 ffffea0004021701 000000040000004f 00000000ffffffff
[ 53.465774] [ T2552] head: 0000000300000003 8000000300000002 0000000000000013 0000000000000004
[ 53.465775] [ T2552] page dumped because: VM_WARN_ON_FOLIO((_Generic((page + nr_pages - 1), const struct page *: (const struct folio *)_compound_head(page + nr_pages - 1), struct page *: (struct folio *)_compound_head(page + nr_pages - 1))) != folio)
The original code works as expected everywhere except on XEN PV, where pte_advance_pfn() can yield a pte_none() after balloon inflation because the MFNs are invalidated. On XEN, pte_advance_pfn() ends up calling __pte()->xen_make_pte()->pte_pfn_to_mfn(), which returns pte_none() when mfn == INVALID_P2M_ENTRY.
The pte_pfn_to_mfn() documents that nastiness:
If there's no mfn for the pfn, then just create an empty non-present pte. Unfortunately this loses information about the original pfn, so pte_mfn_to_pfn is asymmetric.
While such hacks should certainly be removed, we can do better in folio_pte_batch() and simply check ahead of time how many PTEs we can possibly batch in our folio.
This way, we not only fix the issue but also clean up the code: the pte_pfn() check inside the loop body goes away, as do the end_ptr comparison and arithmetic.
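The up-front clamp is plain arithmetic and can be checked in isolation. The helper below is a hypothetical userspace restatement of the min_t() expression in the patch; clamp_batch() and its parameter names are illustrative, and the sample values reuse pfn:0x10085c and order:2 from the page dump above.

```c
#include <assert.h>

/* Number of PTEs that may still belong to the folio, counted from the
 * PFN of the first PTE; mirrors the min_t() clamp in the patch. */
static unsigned long clamp_batch(unsigned long max_nr,
				 unsigned long folio_pfn,
				 unsigned long folio_nr_pages,
				 unsigned long pte_pfn)
{
	unsigned long remaining;

	/* The first PTE must map a page of this folio. */
	assert(pte_pfn >= folio_pfn && pte_pfn < folio_pfn + folio_nr_pages);

	remaining = folio_pfn + folio_nr_pages - pte_pfn;
	return max_nr < remaining ? max_nr : remaining;
}
```

For the 4-page folio at PFN 0x10085c, a caller asking for up to 16 entries is limited to 4, so the loop can never reach the fifth PTE regardless of what pte_advance_pfn() returns.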
Link: https://lkml.kernel.org/r/20250502215019.822-2-arkamar@atlas.cz
Fixes: f8d937761d65 ("mm/memory: optimize fork() with PTE-mapped THP")
Co-developed-by: David Hildenbrand david@redhat.com
Signed-off-by: David Hildenbrand david@redhat.com
Signed-off-by: Petr Vaněk arkamar@atlas.cz
Cc: Ryan Roberts ryan.roberts@arm.com
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/internal.h | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

--- a/mm/internal.h
+++ b/mm/internal.h
@@ -205,11 +205,9 @@ static inline int folio_pte_batch(struct
 		pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags,
 		bool *any_writable, bool *any_young, bool *any_dirty)
 {
-	unsigned long folio_end_pfn = folio_pfn(folio) + folio_nr_pages(folio);
-	const pte_t *end_ptep = start_ptep + max_nr;
 	pte_t expected_pte, *ptep;
 	bool writable, young, dirty;
-	int nr;
+	int nr, cur_nr;

 	if (any_writable)
 		*any_writable = false;
@@ -222,11 +220,15 @@ static inline int folio_pte_batch(struct
 	VM_WARN_ON_FOLIO(!folio_test_large(folio) || max_nr < 1, folio);
 	VM_WARN_ON_FOLIO(page_folio(pfn_to_page(pte_pfn(pte))) != folio, folio);

+	/* Limit max_nr to the actual remaining PFNs in the folio we could batch. */
+	max_nr = min_t(unsigned long, max_nr,
+		       folio_pfn(folio) + folio_nr_pages(folio) - pte_pfn(pte));
+
 	nr = pte_batch_hint(start_ptep, pte);
 	expected_pte = __pte_batch_clear_ignored(pte_advance_pfn(pte, nr), flags);
 	ptep = start_ptep + nr;

-	while (ptep < end_ptep) {
+	while (nr < max_nr) {
 		pte = ptep_get(ptep);
 		if (any_writable)
 			writable = !!pte_write(pte);
@@ -239,14 +241,6 @@ static inline int folio_pte_batch(struct
 		if (!pte_same(pte, expected_pte))
 			break;

-		/*
-		 * Stop immediately once we reached the end of the folio. In
-		 * corner cases the next PFN might fall into a different
-		 * folio.
-		 */
-		if (pte_pfn(pte) >= folio_end_pfn)
-			break;
-
 		if (any_writable)
 			*any_writable |= writable;
 		if (any_young)
@@ -254,12 +248,13 @@ static inline int folio_pte_batch(struct
 		if (any_dirty)
 			*any_dirty |= dirty;

-		nr = pte_batch_hint(ptep, pte);
-		expected_pte = pte_advance_pfn(expected_pte, nr);
-		ptep += nr;
+		cur_nr = pte_batch_hint(ptep, pte);
+		expected_pte = pte_advance_pfn(expected_pte, cur_nr);
+		ptep += cur_nr;
+		nr += cur_nr;
 	}

-	return min(ptep - start_ptep, max_nr);
+	return min(nr, max_nr);
 }

/**
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kees Cook kees@kernel.org
commit a0309faf1cb0622cac7c820150b7abf2024acff5 upstream.
Introduce struct vm_struct::requested_size so that the requested (re)allocation size is retained separately from the allocated area size. This means that KASAN will correctly poison the correct spans of requested bytes. This also means we can support growing the usable portion of an allocation that can already be supported by the existing area's existing allocation.
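The three resize cases implied by tracking both sizes can be sketched as a userspace model. It is not the kernel code: struct area_model, enum realloc_path and classify_resize() are hypothetical names, and KASAN poisoning and zeroing are left out. The model follows the description above; the quoted hunk shows no return statement inside its `size <= alloced_size` branch, so the diff's exact control flow may differ from this sketch.

```c
#include <stddef.h>

/* Hypothetical model of the two sizes tracked per area. */
struct area_model {
	size_t alloced_size;	/* bytes backed by the existing area */
	size_t requested_size;	/* bytes the caller last asked for */
};

enum realloc_path {
	PATH_SHRINK,		/* new size fits within the requested size */
	PATH_GROW_IN_PLACE,	/* fits within the bytes already allocated */
	PATH_NEW_ALLOCATION	/* needs a fresh, larger allocation */
};

/* Classify a resize request; only the in-place paths update
 * requested_size, since the area itself is reused. */
static enum realloc_path classify_resize(struct area_model *a, size_t size)
{
	if (size <= a->requested_size) {
		a->requested_size = size;
		return PATH_SHRINK;
	}
	if (size <= a->alloced_size) {
		a->requested_size = size;
		return PATH_GROW_IN_PLACE;
	}
	return PATH_NEW_ALLOCATION;
}
```

Keeping requested_size separate is what makes the middle case visible at all: with only the area size, a grow from 50 to 4096 bytes inside an 8192-byte area could not be told apart from a no-op.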
Link: https://lkml.kernel.org/r/20250426001105.it.679-kees@kernel.org
Fixes: 3ddc2fefe6f3 ("mm: vmalloc: implement vrealloc()")
Signed-off-by: Kees Cook kees@kernel.org
Reported-by: Erhard Furtner erhard_f@mailbox.org
Closes: https://lore.kernel.org/all/20250408192503.6149a816@outsider.home/
Reviewed-by: Danilo Krummrich dakr@kernel.org
Cc: Michal Hocko mhocko@suse.com
Cc: "Uladzislau Rezki (Sony)" urezki@gmail.com
Cc: Vlastimil Babka vbabka@suse.cz
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/linux/vmalloc.h | 1 +
 mm/vmalloc.c | 31 ++++++++++++++++++++++++-------
 2 files changed, 25 insertions(+), 7 deletions(-)

--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -61,6 +61,7 @@ struct vm_struct {
 	unsigned int nr_pages;
 	phys_addr_t phys_addr;
 	const void *caller;
+	unsigned long requested_size;
 };

 struct vmap_area {
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1940,7 +1940,7 @@ static inline void setup_vmalloc_vm(stru
 {
 	vm->flags = flags;
 	vm->addr = (void *)va->va_start;
-	vm->size = va_size(va);
+	vm->size = vm->requested_size = va_size(va);
 	vm->caller = caller;
 	va->vm = vm;
 }
@@ -3133,6 +3133,7 @@ struct vm_struct *__get_vm_area_node(uns

 	area->flags = flags;
 	area->caller = caller;
+	area->requested_size = requested_size;

 	va = alloc_vmap_area(size, align, start, end, node, gfp_mask, 0, area);
 	if (IS_ERR(va)) {
@@ -4067,6 +4068,8 @@ EXPORT_SYMBOL(vzalloc_node_noprof);
  */
 void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
 {
+	struct vm_struct *vm = NULL;
+	size_t alloced_size = 0;
 	size_t old_size = 0;
 	void *n;

@@ -4076,15 +4079,17 @@ void *vrealloc_noprof(const void *p, siz
 	}

 	if (p) {
-		struct vm_struct *vm;
-
 		vm = find_vm_area(p);
 		if (unlikely(!vm)) {
 			WARN(1, "Trying to vrealloc() nonexistent vm area (%p)\n", p);
 			return NULL;
 		}

-		old_size = get_vm_area_size(vm);
+		alloced_size = get_vm_area_size(vm);
+		old_size = vm->requested_size;
+		if (WARN(alloced_size < old_size,
+			 "vrealloc() has mismatched area vs requested sizes (%p)\n", p))
+			return NULL;
 	}

 	/*
@@ -4092,14 +4097,26 @@ void *vrealloc_noprof(const void *p, siz
 	 * would be a good heuristic for when to shrink the vm_area?
 	 */
 	if (size <= old_size) {
-		/* Zero out spare memory. */
-		if (want_init_on_alloc(flags))
+		/* Zero out "freed" memory. */
+		if (want_init_on_free())
 			memset((void *)p + size, 0, old_size - size);
+		vm->requested_size = size;
 		kasan_poison_vmalloc(p + size, old_size - size);
-		kasan_unpoison_vmalloc(p, size, KASAN_VMALLOC_PROT_NORMAL);
 		return (void *)p;
 	}

+	/*
+	 * We already have the bytes available in the allocation; use them.
+	 */
+	if (size <= alloced_size) {
+		kasan_unpoison_vmalloc(p + old_size, size - old_size,
+				       KASAN_VMALLOC_PROT_NORMAL);
+		/* Zero out "alloced" memory. */
+		if (want_init_on_alloc(flags))
+			memset((void *)p + old_size, 0, size - old_size);
+		vm->requested_size = size;
+	}
+
 	/* TODO: Grow the vm_area, i.e. allocate and map additional pages. */
 	n = __vmalloc_noprof(size, flags);
 	if (!n)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gavin Guo gavinguo@igalia.com
commit be6e843fc51a584672dfd9c4a6a24c8cb81d5fb7 upstream.
When migrating a THP, concurrent access to the PMD migration entry during a deferred split scan can lead to an invalid address access, as illustrated below. To prevent this invalid access, it is necessary to check the PMD migration entry and return early. In this context, there is no need to use pmd_to_swp_entry and pfn_swap_entry_to_page to verify the equality of the target folio. Since the PMD migration entry is locked, it cannot serve as the target.
Mailing list discussion and explanation from Hugh Dickins: "An anon_vma lookup points to a location which may contain the folio of interest, but might instead contain another folio: and weeding out those other folios is precisely what the "folio != pmd_folio((*pmd)" check (and the "risk of replacing the wrong folio" comment a few lines above it) is for."
BUG: unable to handle page fault for address: ffffea60001db008
CPU: 0 UID: 0 PID: 2199114 Comm: tee Not tainted 6.14.0+ #4 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:split_huge_pmd_locked+0x3b5/0x2b60
Call Trace:
 <TASK>
 try_to_migrate_one+0x28c/0x3730
 rmap_walk_anon+0x4f6/0x770
 unmap_folio+0x196/0x1f0
 split_huge_page_to_list_to_order+0x9f6/0x1560
 deferred_split_scan+0xac5/0x12a0
 shrinker_debugfs_scan_write+0x376/0x470
 full_proxy_write+0x15c/0x220
 vfs_write+0x2fc/0xcb0
 ksys_write+0x146/0x250
 do_syscall_64+0x6a/0x120
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The bug was found by syzkaller on an internal kernel, then confirmed on upstream.
Link: https://lkml.kernel.org/r/20250421113536.3682201-1-gavinguo@igalia.com
Link: https://lore.kernel.org/all/20250414072737.1698513-1-gavinguo@igalia.com/
Link: https://lore.kernel.org/all/20250418085802.2973519-1-gavinguo@igalia.com/
Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Gavin Guo gavinguo@igalia.com
Acked-by: David Hildenbrand david@redhat.com
Acked-by: Hugh Dickins hughd@google.com
Acked-by: Zi Yan ziy@nvidia.com
Reviewed-by: Gavin Shan gshan@redhat.com
Cc: Florent Revest revest@google.com
Cc: Matthew Wilcox (Oracle) willy@infradead.org
Cc: Miaohe Lin linmiaohe@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/huge_memory.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2959,6 +2959,8 @@ static void __split_huge_pmd_locked(stru
 void split_huge_pmd_locked(struct vm_area_struct *vma, unsigned long address,
 			   pmd_t *pmd, bool freeze, struct folio *folio)
 {
+	bool pmd_migration = is_pmd_migration_entry(*pmd);
+
 	VM_WARN_ON_ONCE(folio && !folio_test_pmd_mappable(folio));
 	VM_WARN_ON_ONCE(!IS_ALIGNED(address, HPAGE_PMD_SIZE));
 	VM_WARN_ON_ONCE(folio && !folio_test_locked(folio));
@@ -2969,9 +2971,12 @@ void split_huge_pmd_locked(struct vm_are
 	 * require a folio to check the PMD against. Otherwise, there
 	 * is a risk of replacing the wrong folio.
 	 */
-	if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) ||
-	    is_pmd_migration_entry(*pmd)) {
-		if (folio && folio != pmd_folio(*pmd))
+	if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) || pmd_migration) {
+		/*
+		 * Do not apply pmd_folio() to a migration entry; and folio lock
+		 * guarantees that it must be of the wrong folio anyway.
+		 */
+		if (folio && (pmd_migration || folio != pmd_folio(*pmd)))
 			return;
 		__split_huge_pmd_locked(vma, pmd, address, freeze);
 	}
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Xu peterx@redhat.com
commit 95567729173e62e0e60a1f8ad9eb2e1320a8ccac upstream.
While discussing some userfaultfd relevant issues recently, Andrea noticed a potential ABI breakage with -EAGAIN on almost all userfaultfd ioctl()s.
Quote from Andrea, explaining how -EAGAIN was processed, and how this should fix it (taking example of UFFDIO_COPY ioctl):
The "mmap_changing" and "stale pmd" conditions are already reported as -EAGAIN written in the copy field, this does not change it. This change removes the subnormal case that left copy.copy uninitialized and required apps to explicitly set the copy field to get deterministic behavior (which is a requirement contrary to the documentation in both the manpage and source code). In turn there's no alteration to backwards compatibility as result of this change because userland will find the copy field consistently set to -EAGAIN, and not anymore sometime -EAGAIN and sometime uninitialized.
Even then the change only can make a difference to non cooperative users of userfaultfd, so when UFFD_FEATURE_EVENT_* is enabled, which is not true for the vast majority of apps using userfaultfd or this unintended uninitialized field may have been noticed sooner.
Meanwhile, since this bug has existed for years, it also affects almost all ioctl()s that were introduced later. Besides UFFDIO_ZEROPAGE, these are affected in the same way:

- UFFDIO_CONTINUE
- UFFDIO_POISON
- UFFDIO_MOVE
This patch should have fixed all of them.
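The pattern applied to each ioctl can be sketched as a userspace model. It is an illustration only: struct copy_req_model is a hypothetical, reduced stand-in for struct uffdio_copy, model_copy_ioctl() is not a kernel function, and the put_user() failure path that returns -EFAULT is not modelled.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced version of struct uffdio_copy: only the
 * result field that the kernel writes back is modelled. */
struct copy_req_model {
	long long copy;
};

/* Model of the patched early exit: when the address space is changing,
 * store -EAGAIN in the result field before returning it, so the caller
 * never sees a stale or uninitialized value there. */
static int model_copy_ioctl(struct copy_req_model *req, bool mmap_changing)
{
	if (mmap_changing) {
		req->copy = -EAGAIN;
		return -EAGAIN;
	}
	req->copy = 0;	/* placeholder for the normal-path result */
	return 0;
}
```

A caller that initialised the field to an arbitrary value now reads -EAGAIN back on the early-exit path, which matches what the other -EAGAIN paths already report.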
Link: https://lkml.kernel.org/r/20250424215729.194656-2-peterx@redhat.com
Fixes: df2cc96e7701 ("userfaultfd: prevent non-cooperative events vs mcopy_atomic races")
Fixes: f619147104c8 ("userfaultfd: add UFFDIO_CONTINUE ioctl")
Fixes: fc71884a5f59 ("mm: userfaultfd: add new UFFDIO_POISON ioctl")
Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Peter Xu peterx@redhat.com
Reported-by: Andrea Arcangeli aarcange@redhat.com
Suggested-by: Andrea Arcangeli aarcange@redhat.com
Reviewed-by: David Hildenbrand david@redhat.com
Cc: Mike Rapoport rppt@kernel.org
Cc: Axel Rasmussen axelrasmussen@google.com
Cc: Suren Baghdasaryan surenb@google.com
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/userfaultfd.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1585,8 +1585,11 @@ static int userfaultfd_copy(struct userf
 	user_uffdio_copy = (struct uffdio_copy __user *) arg;

 	ret = -EAGAIN;
-	if (atomic_read(&ctx->mmap_changing))
+	if (unlikely(atomic_read(&ctx->mmap_changing))) {
+		if (unlikely(put_user(ret, &user_uffdio_copy->copy)))
+			return -EFAULT;
 		goto out;
+	}

 	ret = -EFAULT;
 	if (copy_from_user(&uffdio_copy, user_uffdio_copy,
@@ -1641,8 +1644,11 @@ static int userfaultfd_zeropage(struct u
 	user_uffdio_zeropage = (struct uffdio_zeropage __user *) arg;

 	ret = -EAGAIN;
-	if (atomic_read(&ctx->mmap_changing))
+	if (unlikely(atomic_read(&ctx->mmap_changing))) {
+		if (unlikely(put_user(ret, &user_uffdio_zeropage->zeropage)))
+			return -EFAULT;
 		goto out;
+	}

 	ret = -EFAULT;
 	if (copy_from_user(&uffdio_zeropage, user_uffdio_zeropage,
@@ -1744,8 +1750,11 @@ static int userfaultfd_continue(struct u
 	user_uffdio_continue = (struct uffdio_continue __user *)arg;

 	ret = -EAGAIN;
-	if (atomic_read(&ctx->mmap_changing))
+	if (unlikely(atomic_read(&ctx->mmap_changing))) {
+		if (unlikely(put_user(ret, &user_uffdio_continue->mapped)))
+			return -EFAULT;
 		goto out;
+	}

 	ret = -EFAULT;
 	if (copy_from_user(&uffdio_continue, user_uffdio_continue,
@@ -1801,8 +1810,11 @@ static inline int userfaultfd_poison(str
 	user_uffdio_poison = (struct uffdio_poison __user *)arg;

 	ret = -EAGAIN;
-	if (atomic_read(&ctx->mmap_changing))
+	if (unlikely(atomic_read(&ctx->mmap_changing))) {
+		if (unlikely(put_user(ret, &user_uffdio_poison->updated)))
+			return -EFAULT;
 		goto out;
+	}

 	ret = -EFAULT;
 	if (copy_from_user(&uffdio_poison, user_uffdio_poison,
@@ -1870,8 +1882,12 @@ static int userfaultfd_move(struct userf

 	user_uffdio_move = (struct uffdio_move __user *) arg;

-	if (atomic_read(&ctx->mmap_changing))
-		return -EAGAIN;
+	ret = -EAGAIN;
+	if (unlikely(atomic_read(&ctx->mmap_changing))) {
+		if (unlikely(put_user(ret, &user_uffdio_move->move)))
+			return -EFAULT;
+		goto out;
+	}

 	if (copy_from_user(&uffdio_move, user_uffdio_move,
 			   /* don't copy "move" last field */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Feng Tang feng.tang@linux.alibaba.com
commit ab00ddd802f80e31fc9639c652d736fe3913feae upstream.
When running the mm selftests to verify mm patches, the 'compaction_test' case failed on an x86 server with 1TB memory. The root cause is that the server has more free memory than the test supports.
The test case tries to allocate 100000 huge pages, which is about 200 GB for that x86 server, and when it succeeds, it expects that amount to be larger than 1/3 of 80% of the free memory in the system. This logic only works for platforms with 750 GB ( 200 / (1/3) / 80% ) or less free memory, and may raise false alarms on others.
Fix it by replacing the fixed page count with a self-adjusting one derived from the actual amount of free memory.
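The arithmetic in this changelog can be sketched in plain C. This is only an illustration under assumptions, not the selftest's code: the helper names old_limit_mb() and target_nr_hugepages() are hypothetical, and only the halving formula is taken from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the largest amount of free memory (in MB) for which
 * a fixed request of fixed_pages huge pages still reaches 1/3 of 80% of
 * the free memory.
 */
static uint64_t old_limit_mb(uint64_t fixed_pages, uint64_t hugepage_mb)
{
	uint64_t requested_mb = fixed_pages * hugepage_mb;

	/* requested >= free * 0.8 / 3  <=>  free <= requested * 3 / 0.8 */
	return requested_mb * 3 * 10 / 8;
}

/* Mirrors the patched formula: nr_hugepages_ul = mem_free / hugepage_size / 2 */
static uint64_t target_nr_hugepages(uint64_t mem_free, uint64_t hugepage_size)
{
	assert(hugepage_size != 0);
	return mem_free / hugepage_size / 2;
}
```

With 100000 pages of 2 MB each, old_limit_mb() gives the 750 GB ceiling quoted above, while target_nr_hugepages() scales the request with whatever amount of free memory the machine reports.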
Link: https://lkml.kernel.org/r/20250423103645.2758-1-feng.tang@linux.alibaba.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Feng Tang feng.tang@linux.alibaba.com Acked-by: Dev Jain dev.jain@arm.com Reviewed-by: Baolin Wang baolin.wang@linux.alibaba.com Tested-by: Baolin Wang baolin.wang@inux.alibaba.com Cc: Shuah Khan shuah@kernel.org Cc: Sri Jayaramappa sjayaram@akamai.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/mm/compaction_test.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
--- a/tools/testing/selftests/mm/compaction_test.c +++ b/tools/testing/selftests/mm/compaction_test.c @@ -90,6 +90,8 @@ int check_compaction(unsigned long mem_f int compaction_index = 0; char nr_hugepages[20] = {0}; char init_nr_hugepages[24] = {0}; + char target_nr_hugepages[24] = {0}; + int slen;
snprintf(init_nr_hugepages, sizeof(init_nr_hugepages), "%lu", initial_nr_hugepages); @@ -106,11 +108,18 @@ int check_compaction(unsigned long mem_f goto out; }
- /* Request a large number of huge pages. The Kernel will allocate - as much as it can */ - if (write(fd, "100000", (6*sizeof(char))) != (6*sizeof(char))) { - ksft_print_msg("Failed to write 100000 to /proc/sys/vm/nr_hugepages: %s\n", - strerror(errno)); + /* + * Request huge pages for about half of the free memory. The Kernel + * will allocate as much as it can, and we expect it will get at least 1/3 + */ + nr_hugepages_ul = mem_free / hugepage_size / 2; + snprintf(target_nr_hugepages, sizeof(target_nr_hugepages), + "%lu", nr_hugepages_ul); + + slen = strlen(target_nr_hugepages); + if (write(fd, target_nr_hugepages, slen) != slen) { + ksft_print_msg("Failed to write %lu to /proc/sys/vm/nr_hugepages: %s\n", + nr_hugepages_ul, strerror(errno)); goto close_fd; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nysal Jan K.A. nysal@linux.ibm.com
commit 8cf6ecb18baac867585fe1cba5dde6dbf3b6d29a upstream.
The compiler is unaware of the size of code generated by the ".rept" assembler directive. This results in the compiler emitting branch instructions where the offset to branch to exceeds the maximum allowed value, resulting in build failures like the following:
CC protection_keys /tmp/ccypKWAE.s: Assembler messages: /tmp/ccypKWAE.s:2073: Error: operand out of range (0x0000000000020158 is not between 0xffffffffffff8000 and 0x0000000000007ffc) /tmp/ccypKWAE.s:2509: Error: operand out of range (0x0000000000020130 is not between 0xffffffffffff8000 and 0x0000000000007ffc)
Fix the issue by manually adding nop instructions using the preprocessor.
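As a rough sketch of why the preprocessor approach works: adjacent string literals are concatenated by the compiler, so each REPEAT_* level multiplies the number of copies by eight and the compiler sees the full text of the asm body. The example below reuses the macro shape from the patch at a smaller scale; the array names and count_lines() are hypothetical and exist only for the demonstration.

```c
#include <assert.h>
#include <stddef.h>

#define REPEAT_8(s)   s s s s s s s s
#define REPEAT_64(s)  REPEAT_8(s) REPEAT_8(s) REPEAT_8(s) REPEAT_8(s) \
		      REPEAT_8(s) REPEAT_8(s) REPEAT_8(s) REPEAT_8(s)
#define REPEAT_512(s) REPEAT_64(s) REPEAT_64(s) REPEAT_64(s) REPEAT_64(s) \
		      REPEAT_64(s) REPEAT_64(s) REPEAT_64(s) REPEAT_64(s)

/* Each array holds one long literal built from concatenated "nop\n" copies. */
static const char nops_64[] = REPEAT_64("nop\n");
static const char nops_512[] = REPEAT_512("nop\n");

/* "nop\n" is 4 characters, so 64 copies give 256 characters plus the NUL. */
_Static_assert(sizeof(nops_64) - 1 == 64 * 4, "unexpected literal length");

/* Hypothetical helper: counts newline-terminated lines in a string. */
static size_t count_lines(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	for (; *s; s++)
		if (*s == '\n')
			n++;
	return n;
}
```

Because the repetition happens before the compiler proper runs, the generated assembly contains every nop explicitly instead of a single .rept directive whose expanded size is invisible to the compiler.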
Link: https://lkml.kernel.org/r/20250428131937.641989-2-nysal@linux.ibm.com Fixes: 46036188ea1f ("selftests/mm: build with -O2") Reported-by: Madhavan Srinivasan maddy@linux.ibm.com Signed-off-by: Nysal Jan K.A. nysal@linux.ibm.com Tested-by: Venkat Rao Bagalkote venkat88@linux.ibm.com Reviewed-by: Donet Tom donettom@linux.ibm.com Tested-by: Donet Tom donettom@linux.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/mm/pkey-powerpc.h | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/mm/pkey-powerpc.h +++ b/tools/testing/selftests/mm/pkey-powerpc.h @@ -102,8 +102,18 @@ static inline void expect_fault_on_read_ return; }
+#define REPEAT_8(s) s s s s s s s s +#define REPEAT_64(s) REPEAT_8(s) REPEAT_8(s) REPEAT_8(s) REPEAT_8(s) \ + REPEAT_8(s) REPEAT_8(s) REPEAT_8(s) REPEAT_8(s) +#define REPEAT_512(s) REPEAT_64(s) REPEAT_64(s) REPEAT_64(s) REPEAT_64(s) \ + REPEAT_64(s) REPEAT_64(s) REPEAT_64(s) REPEAT_64(s) +#define REPEAT_4096(s) REPEAT_512(s) REPEAT_512(s) REPEAT_512(s) REPEAT_512(s) \ + REPEAT_512(s) REPEAT_512(s) REPEAT_512(s) REPEAT_512(s) +#define REPEAT_16384(s) REPEAT_4096(s) REPEAT_4096(s) \ + REPEAT_4096(s) REPEAT_4096(s) + /* 4-byte instructions * 16384 = 64K page */ -#define __page_o_noops() asm(".rept 16384 ; nop; .endr") +#define __page_o_noops() asm(REPEAT_16384("nop\n"))
static inline void *malloc_pkey_with_mprotect_subpage(long size, int prot, u16 pkey) {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Madhavan Srinivasan maddy@linux.ibm.com
commit 22adb528621ddc92f887882a658507fbf88a5214 upstream.
Commit 50910acd6f615 ("selftests/mm: use sys_pkey helpers consistently") added a pkey_util.c to refactor some of the protection_keys functions accessible by other tests. But this broke the build on powerpc in two ways. Firstly,
pkey-powerpc.h: In function `arch_is_powervm': pkey-powerpc.h:73:21: error: storage size of `buf' isn't known 73 | struct stat buf; | ^~~ pkey-powerpc.h:75:14: error: implicit declaration of function `stat'; did you mean `strcat'? [-Wimplicit-function-declaration] 75 | if ((stat("/sys/firmware/devicetree/base/ibm,partition-name", &buf) == 0) && | ^~~~ | strcat
Since pkey_util.c includes pkey-helpers.h, which in turn includes pkey-powerpc.h, the sys/stat.h include needed for "struct stat" is missing. This is fixed by adding "sys/stat.h" to pkey-powerpc.h.
Secondly,
pkey-powerpc.h:55:18: warning: format `%llx' expects argument of type `long long unsigned int', but argument 3 has type `u64' {aka `long unsigned int'} [-Wformat=] 55 | dprintf4("%s() changing %016llx to %016llx\n", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __func__, __read_pkey_reg(), pkey_reg); | ~~~~~~~~~~~~~~~~~ | | | u64 {aka long unsigned int} pkey-helpers.h:63:32: note: in definition of macro `dprintf_level' 63 | sigsafe_printf(args); \ | ^~~~
These format-specifier warnings are removed by adding "__SANE_USERSPACE_TYPES__" to pkey_util.c.
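A minimal userspace sketch of the second fix, assuming the selftest's u64 ultimately resolves to __u64 from <linux/types.h>: defining __SANE_USERSPACE_TYPES__ before the first include asks the UAPI headers for the "long long" flavour of the 64-bit types, which is what "%llx" expects. The helper format_pkey_reg() is hypothetical.

```c
/* Must come before anything that pulls in <linux/types.h>. */
#define __SANE_USERSPACE_TYPES__
#include <linux/types.h>

#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* With the define above, __u64 is unsigned long long, matching %llx. */
_Static_assert(_Generic((__u64)0, unsigned long long: 1, default: 0),
	       "__u64 is not unsigned long long");

/* Hypothetical helper: formats a register value as 16 hex digits. */
static int format_pkey_reg(char *buf, size_t len, __u64 reg)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%016llx", reg);
}
```

On architectures where __u64 is already unsigned long long the define changes nothing, so it is harmless to set unconditionally in a source file shared between architectures.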
Link: https://lkml.kernel.org/r/20250428131937.641989-1-nysal@linux.ibm.com Fixes: 50910acd6f61 ("selftests/mm: use sys_pkey helpers consistently") Signed-off-by: Madhavan Srinivasan maddy@linux.ibm.com Signed-off-by: Nysal Jan K.A. nysal@linux.ibm.com Tested-by: Venkat Rao Bagalkote venkat88@linux.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/mm/pkey-powerpc.h | 2 ++ tools/testing/selftests/mm/pkey_util.c | 1 + 2 files changed, 3 insertions(+)
--- a/tools/testing/selftests/mm/pkey-powerpc.h +++ b/tools/testing/selftests/mm/pkey-powerpc.h @@ -3,6 +3,8 @@ #ifndef _PKEYS_POWERPC_H #define _PKEYS_POWERPC_H
+#include <sys/stat.h> + #ifndef SYS_pkey_alloc # define SYS_pkey_alloc 384 # define SYS_pkey_free 385 --- a/tools/testing/selftests/mm/pkey_util.c +++ b/tools/testing/selftests/mm/pkey_util.c @@ -1,4 +1,5 @@ // SPDX-License-Identifier: GPL-2.0-only +#define __SANE_USERSPACE_TYPES__ #include <sys/syscall.h> #include <unistd.h>
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 9129633d568edd36aa22bf703b12835153cec985 upstream.
When changing memory attributes on a subset of a potential hugepage, add the hugepage to the invalidation range tracking to prevent installing a hugepage until the attributes are fully updated. Like the actual hugepage tracking updates in kvm_arch_post_set_memory_attributes(), process only the head and tail pages, as any potential hugepages that are entirely covered by the range will already be tracked.
Note, only hugepage chunks whose current attributes are NOT mixed need to be added to the invalidation set, as mixed attributes already prevent installing a hugepage, and it's perfectly safe to install a smaller mapping for a gfn whose attributes aren't changing.
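The head/tail selection can be modelled in userspace roughly as follows. This is a simplified sketch, not KVM code: it handles a single hugepage size, ignores the "mixed attributes" flag, and the names gfn_span, round_to_hugepage() and partial_hugepages() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

struct gfn_span {
	bool valid;	/* true if this hugepage must join the invalidation */
	gfn_t start;
	gfn_t end;	/* exclusive */
};

/* nr_pages must be a power of two, e.g. 512 for a 2M page of 4K frames. */
static gfn_t round_to_hugepage(gfn_t gfn, gfn_t nr_pages)
{
	return gfn & ~(nr_pages - 1);
}

/*
 * Report the hugepages that contain the first and last gfn of
 * [range_start, range_end) but are only partially covered by that range
 * and lie fully inside the slot.
 */
static void partial_hugepages(gfn_t range_start, gfn_t range_end,
			      gfn_t slot_base, gfn_t slot_npages,
			      gfn_t nr_pages,
			      struct gfn_span *head, struct gfn_span *tail)
{
	gfn_t slot_end = slot_base + slot_npages;
	gfn_t start, end;

	assert(range_end > range_start);
	start = round_to_hugepage(range_start, nr_pages);
	end = round_to_hugepage(range_end - 1, nr_pages);

	head->start = start;
	head->end = start + nr_pages;
	head->valid = (start != range_start || start + nr_pages > range_end) &&
		      start >= slot_base && start + nr_pages <= slot_end;

	tail->start = end;
	tail->end = end + nr_pages;
	tail->valid = end != start && end + nr_pages > range_end &&
		      end + nr_pages <= slot_end;
}
```

A range that starts and ends on hugepage boundaries produces no head or tail entry, which matches the observation that fully covered hugepages are already tracked.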
Fixes: 8dd2eee9d526 ("KVM: x86/mmu: Handle page fault for private memory") Cc: stable@vger.kernel.org Reported-by: Michael Roth michael.roth@amd.com Tested-by: Michael Roth michael.roth@amd.com Link: https://lore.kernel.org/r/20250430220954.522672-1-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/mmu/mmu.c | 69 +++++++++++++++++++++++++++++++++++++------------ 1 file changed, 53 insertions(+), 16 deletions(-)
--- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -7496,9 +7496,30 @@ void kvm_mmu_pre_destroy_vm(struct kvm * }
#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES +static bool hugepage_test_mixed(struct kvm_memory_slot *slot, gfn_t gfn, + int level) +{ + return lpage_info_slot(gfn, slot, level)->disallow_lpage & KVM_LPAGE_MIXED_FLAG; +} + +static void hugepage_clear_mixed(struct kvm_memory_slot *slot, gfn_t gfn, + int level) +{ + lpage_info_slot(gfn, slot, level)->disallow_lpage &= ~KVM_LPAGE_MIXED_FLAG; +} + +static void hugepage_set_mixed(struct kvm_memory_slot *slot, gfn_t gfn, + int level) +{ + lpage_info_slot(gfn, slot, level)->disallow_lpage |= KVM_LPAGE_MIXED_FLAG; +} + bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm, struct kvm_gfn_range *range) { + struct kvm_memory_slot *slot = range->slot; + int level; + /* * Zap SPTEs even if the slot can't be mapped PRIVATE. KVM x86 only * supports KVM_MEMORY_ATTRIBUTE_PRIVATE, and so it *seems* like KVM @@ -7513,6 +7534,38 @@ bool kvm_arch_pre_set_memory_attributes( if (WARN_ON_ONCE(!kvm_arch_has_private_mem(kvm))) return false;
+ if (WARN_ON_ONCE(range->end <= range->start)) + return false; + + /* + * If the head and tail pages of the range currently allow a hugepage, + * i.e. reside fully in the slot and don't have mixed attributes, then + * add each corresponding hugepage range to the ongoing invalidation, + * e.g. to prevent KVM from creating a hugepage in response to a fault + * for a gfn whose attributes aren't changing. Note, only the range + * of gfns whose attributes are being modified needs to be explicitly + * unmapped, as that will unmap any existing hugepages. + */ + for (level = PG_LEVEL_2M; level <= KVM_MAX_HUGEPAGE_LEVEL; level++) { + gfn_t start = gfn_round_for_level(range->start, level); + gfn_t end = gfn_round_for_level(range->end - 1, level); + gfn_t nr_pages = KVM_PAGES_PER_HPAGE(level); + + if ((start != range->start || start + nr_pages > range->end) && + start >= slot->base_gfn && + start + nr_pages <= slot->base_gfn + slot->npages && + !hugepage_test_mixed(slot, start, level)) + kvm_mmu_invalidate_range_add(kvm, start, start + nr_pages); + + if (end == start) + continue; + + if ((end + nr_pages) > range->end && + (end + nr_pages) <= (slot->base_gfn + slot->npages) && + !hugepage_test_mixed(slot, end, level)) + kvm_mmu_invalidate_range_add(kvm, end, end + nr_pages); + } + /* Unmap the old attribute page. */ if (range->arg.attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE) range->attr_filter = KVM_FILTER_SHARED; @@ -7522,23 +7575,7 @@ bool kvm_arch_pre_set_memory_attributes( return kvm_unmap_gfn_range(kvm, range); }
-static bool hugepage_test_mixed(struct kvm_memory_slot *slot, gfn_t gfn, - int level) -{ - return lpage_info_slot(gfn, slot, level)->disallow_lpage & KVM_LPAGE_MIXED_FLAG; -} - -static void hugepage_clear_mixed(struct kvm_memory_slot *slot, gfn_t gfn, - int level) -{ - lpage_info_slot(gfn, slot, level)->disallow_lpage &= ~KVM_LPAGE_MIXED_FLAG; -}
-static void hugepage_set_mixed(struct kvm_memory_slot *slot, gfn_t gfn, - int level) -{ - lpage_info_slot(gfn, slot, level)->disallow_lpage |= KVM_LPAGE_MIXED_FLAG; -}
static bool hugepage_has_attrs(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn, int level, unsigned long attrs)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikhail Lobanov m.lobanov@rosa.ru
commit a2620f8932fa9fdabc3d78ed6efb004ca409019f upstream.
Previously, commit ed129ec9057f ("KVM: x86: forcibly leave nested mode on vCPU reset") addressed an issue where a triple fault occurring in nested mode could lead to use-after-free scenarios. However, the commit did not handle the analogous situation for System Management Mode (SMM).
This omission results in triggering a WARN when KVM forces a vCPU INIT after SHUTDOWN interception while the vCPU is in SMM. This situation was reproduced using Syzkaller by:
1) Creating a KVM VM and vCPU 2) Sending a KVM_SMI ioctl to explicitly enter SMM 3) Executing invalid instructions causing consecutive exceptions and eventually a triple fault
The issue manifests as follows:
WARNING: CPU: 0 PID: 25506 at arch/x86/kvm/x86.c:12112 kvm_vcpu_reset+0x1d2/0x1530 arch/x86/kvm/x86.c:12112 Modules linked in: CPU: 0 PID: 25506 Comm: syz-executor.0 Not tainted 6.1.130-syzkaller-00157-g164fe5dde9b6 #0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:kvm_vcpu_reset+0x1d2/0x1530 arch/x86/kvm/x86.c:12112 Call Trace: <TASK> shutdown_interception+0x66/0xb0 arch/x86/kvm/svm/svm.c:2136 svm_invoke_exit_handler+0x110/0x530 arch/x86/kvm/svm/svm.c:3395 svm_handle_exit+0x424/0x920 arch/x86/kvm/svm/svm.c:3457 vcpu_enter_guest arch/x86/kvm/x86.c:10959 [inline] vcpu_run+0x2c43/0x5a90 arch/x86/kvm/x86.c:11062 kvm_arch_vcpu_ioctl_run+0x50f/0x1cf0 arch/x86/kvm/x86.c:11283 kvm_vcpu_ioctl+0x570/0xf00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4122 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl fs/ioctl.c:856 [inline] __x64_sys_ioctl+0x19a/0x210 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Architecturally, INIT is blocked when the CPU is in SMM, hence KVM's WARN() in kvm_vcpu_reset() to guard against KVM bugs, e.g. to detect improper emulation of INIT. SHUTDOWN on SVM is a weird edge case where KVM needs to do _something_ sane with the VMCB, since it's technically undefined, and INIT is the least awful choice given KVM's ABI.
So, double down on stuffing INIT on SHUTDOWN, and force the vCPU out of SMM to avoid any weirdness (and the WARN).
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: ed129ec9057f ("KVM: x86: forcibly leave nested mode on vCPU reset") Cc: stable@vger.kernel.org Suggested-by: Sean Christopherson seanjc@google.com Signed-off-by: Mikhail Lobanov m.lobanov@rosa.ru Link: https://lore.kernel.org/r/20250414171207.155121-1-m.lobanov@rosa.ru [sean: massage changelog, make it clear this isn't architectural behavior] Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/smm.c | 1 + arch/x86/kvm/svm/svm.c | 4 ++++ 2 files changed, 5 insertions(+)
--- a/arch/x86/kvm/smm.c +++ b/arch/x86/kvm/smm.c @@ -131,6 +131,7 @@ void kvm_smm_changed(struct kvm_vcpu *vc
kvm_mmu_reset_context(vcpu); } +EXPORT_SYMBOL_GPL(kvm_smm_changed);
void process_smi(struct kvm_vcpu *vcpu) { --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -2220,6 +2220,10 @@ static int shutdown_interception(struct */ if (!sev_es_guest(vcpu->kvm)) { clear_page(svm->vmcb); +#ifdef CONFIG_KVM_SMM + if (is_smm(vcpu)) + kvm_smm_changed(vcpu, false); +#endif kvm_vcpu_reset(vcpu, true); }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wayne Lin Wayne.Lin@amd.com
commit 5a3846648c0523fd850b7f0aec78c0139453ab8b upstream.
[Why] The defined value of the dmub AUX reply command field was updated, but the dm receiving side was not adjusted accordingly.
[How] Check the received reply command value to see whether it uses the updated format, and adjust it if necessary.
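The decoding rule added by this patch can be shown in isolation. The sketch below follows the logic in the diff (low nibble by default, high nibble when it is non-zero); the function names are hypothetical and the self-check values are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-alone version of the reply decoding in the patch. */
static uint8_t aux_reply_from_command(uint8_t command)
{
	uint8_t reply = command & 0xF;

	/* Updated format: the reply is stored in the top nibble. */
	if (command & 0xF0)
		reply = (command >> 4) & 0xF;

	return reply;
}

/* Illustrative self-check: both encodings of reply value 2 decode alike. */
static void aux_reply_selfcheck(void)
{
	assert(aux_reply_from_command(0x02) == 0x2);
	assert(aux_reply_from_command(0x20) == 0x2);
}
```

Note that a reply value of 0 decodes to 0 in either format, so the two encodings cannot be confused for that case.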
Fixes: ead08b95fa50 ("drm/amd/display: Fix race condition in DPIA AUX transfer") Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Reviewed-by: Ray Wu ray.wu@amd.com Signed-off-by: Wayne Lin Wayne.Lin@amd.com Signed-off-by: Ray Wu ray.wu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit d5c9ade755a9afa210840708a12a8f44c0d532f4) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -12610,8 +12610,11 @@ int amdgpu_dm_process_dmub_aux_transfer_ goto out; }
+ payload->reply[0] = adev->dm.dmub_notify->aux_reply.command & 0xF; + if (adev->dm.dmub_notify->aux_reply.command & 0xF0) + /* The reply is stored in the top nibble of the command. */ + payload->reply[0] = (adev->dm.dmub_notify->aux_reply.command >> 4) & 0xF;
- payload->reply[0] = adev->dm.dmub_notify->aux_reply.command; if (!payload->write && p_notify->aux_reply.length && (payload->reply[0] == AUX_TRANSACTION_REPLY_AUX_ACK)) {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcao@linutronix.de
commit ae08d55807c099357c047dba17624b09414635dd upstream.
When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not available, the kernel crashes:
Oops - illegal instruction [#1] [snip] epc : set_tagged_addr_ctrl+0x112/0x15a ra : set_tagged_addr_ctrl+0x74/0x15a epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10 [snip] status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002 set_tagged_addr_ctrl+0x112/0x15a __riscv_sys_prctl+0x352/0x73c do_trap_ecall_u+0x17c/0x20c handle_exception+0x150/0x15c
Fix it by checking if Supm is available.
Fixes: 09d6775f503b ("riscv: Add support for userspace pointer masking") Signed-off-by: Nam Cao namcao@linutronix.de Cc: stable@vger.kernel.org Reviewed-by: Samuel Holland samuel.holland@sifive.com Link: https://lore.kernel.org/r/20250504101920.3393053-1-namcao@linutronix.de Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/kernel/process.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index 7c244de77180..3db2c0c07acd 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -275,6 +275,9 @@ long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg) unsigned long pmm; u8 pmlen;
+ if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM)) + return -EINVAL; + if (is_compat_thread(ti)) return -EINVAL;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
commit 687b2bae0efff9b25e071737d6af5004e6e35af5 upstream.
Multishot normally uses io_req_post_cqe() to post completions, but when stopping it, it may finish up with a deferred completion. This is fine, except if another multishot event triggers before the deferred completions get flushed. If this occurs, then CQEs may get reordered in the CQ ring, as new multishot completions get posted before the deferred ones are flushed. This can cause confusion on the application side, if strict ordering is required for the use case.
When multishot posting via io_req_post_cqe(), flush any pending deferred completions first, if any.
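The ordering problem can be illustrated with a toy model. This is not io_uring code: toy_ctx and its helpers are hypothetical, and the flush_first flag merely stands in for the check added to io_req_post_cqe().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_CAP 8

struct toy_ctx {
	int cq[TOY_RING_CAP];		/* completion ring, in posting order */
	size_t cq_len;
	int deferred[TOY_RING_CAP];	/* queued, not yet visible in the ring */
	size_t deferred_len;
};

static void toy_push(struct toy_ctx *ctx, int id)
{
	assert(ctx->cq_len < TOY_RING_CAP);
	ctx->cq[ctx->cq_len++] = id;
}

static void toy_defer(struct toy_ctx *ctx, int id)
{
	assert(ctx->deferred_len < TOY_RING_CAP);
	ctx->deferred[ctx->deferred_len++] = id;
}

static void toy_flush(struct toy_ctx *ctx)
{
	for (size_t i = 0; i < ctx->deferred_len; i++)
		toy_push(ctx, ctx->deferred[i]);
	ctx->deferred_len = 0;
}

/* Direct post; with flush_first set, pending deferred entries go first. */
static void toy_post_cqe(struct toy_ctx *ctx, int id, int flush_first)
{
	if (flush_first && ctx->deferred_len)
		toy_flush(ctx);
	toy_push(ctx, id);
}
```

Deferring completion 1 and then posting completion 2 directly yields the ring order 2, 1 without the flush and 1, 2 with it, which is the reordering the patch is meant to prevent.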
Cc: stable@vger.kernel.org # 6.1+ Reported-by: Norman Maurer norman_maurer@apple.com Reported-by: Christian Mazakas christian.mazakas@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -874,6 +874,14 @@ bool io_req_post_cqe(struct io_kiocb *re struct io_ring_ctx *ctx = req->ctx; bool posted;
+ /* + * If multishot has already posted deferred completions, ensure that + * those are flushed first before posting this one. If not, CQEs + * could get reordered. + */ + if (!wq_list_empty(&ctx->submit_state.compl_reqs)) + __io_submit_flush_completions(ctx); + lockdep_assert(!io_wq_current_is_worker()); lockdep_assert_held(&ctx->uring_lock);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Cameron Jonathan.Cameron@huawei.com
commit ffbc26bc91c1f1eb3dcf5d8776e74cbae21ee13a upstream.
On architectures where an s64 is not 64-bit aligned, this may result in insufficient alignment of the timestamp and the structure being too small. Use aligned_s64 to force the alignment.
Fixes: a1caeebab07e ("iio: adc: ad7768-1: Fix too small buffer passed to iio_push_to_buffers_with_timestamp()") # aligned_s64 newer Reported-by: David Lechner dlechner@baylibre.com Reviewed-by: Nuno Sá nuno.sa@analog.com Reviewed-by: David Lechner dlechner@baylibre.com Link: https://patch.msgid.link/20250413103443.2420727-3-jic23@kernel.org Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/ad7768-1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/adc/ad7768-1.c +++ b/drivers/iio/adc/ad7768-1.c @@ -169,7 +169,7 @@ struct ad7768_state { union { struct { __be32 chan; - s64 timestamp; + aligned_s64 timestamp; } scan; __be32 d32; u8 d8[2];
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Cameron Jonathan.Cameron@huawei.com
commit 52d349884738c346961e153f195f4c7fe186fcf4 upstream.
On architectures where an s64 is only 32-bit aligned, insufficient padding would be left between the earlier elements and the timestamp. Use aligned_s64 to enforce the correct placement and ensure the storage is large enough.
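The layout effect can be checked in userspace with a stand-in type. The sketch assumes a GCC-compatible compiler; demo_aligned_s64 and demo_scan are hypothetical names that only imitate the kernel's aligned_s64 and the driver's scan structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's aligned_s64: an s64 forced to 8-byte alignment. */
typedef int64_t demo_aligned_s64 __attribute__((aligned(8)));

/* Same shape as the scan buffer in the patch: two 16-bit samples + timestamp. */
struct demo_scan {
	uint16_t sample[2];
	demo_aligned_s64 timestamp;
};

/* With forced alignment these hold on every architecture, not just 64-bit. */
_Static_assert(offsetof(struct demo_scan, timestamp) == 8,
	       "timestamp must start on an 8-byte boundary");
_Static_assert(sizeof(struct demo_scan) == 16,
	       "buffer must have room for samples, padding and timestamp");

/* Hypothetical accessor, only here to show the type is used like an s64. */
static int64_t demo_scan_timestamp(const struct demo_scan *scan)
{
	assert(scan != NULL);
	return scan->timestamp;
}
```

With a plain int64_t member, a 32-bit ABI that aligns 64-bit integers to 4 bytes would place the timestamp at offset 4 and shrink the structure to 12 bytes, which is the mismatch described above.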
Fixes: 54e018da3141 ("iio:ad7266: Mark transfer buffer as __be16") # aligned_s64 is much newer. Reported-by: David Lechner dlechner@baylibre.com Reviewed-by: Nuno Sá nuno.sa@analog.com Reviewed-by: David Lechner dlechner@baylibre.com Link: https://patch.msgid.link/20250413103443.2420727-2-jic23@kernel.org Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/ad7266.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/adc/ad7266.c +++ b/drivers/iio/adc/ad7266.c @@ -45,7 +45,7 @@ struct ad7266_state { */ struct { __be16 sample[2]; - s64 timestamp; + aligned_s64 timestamp; } data __aligned(IIO_DMA_MINALIGN); };
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Angelo Dureghello adureghello@baylibre.com
commit f083f8a21cc785ebe3a33f756a3fa3660611f8db upstream.
Fix register read/write routine as per datasheet.
When reading multiple consecutive registers, only the first one is read properly. This is due to a missing chip-select deassert and reassert between the first and second 16-bit transfers, as shown in the AD7606C-16 datasheet, rev 0, figure 110.
Fixes: f2a22e1e172f ("iio: adc: ad7606: Add support for software mode for ad7616") Reviewed-by: David Lechner dlechner@baylibre.com Signed-off-by: Angelo Dureghello adureghello@baylibre.com Link: https://patch.msgid.link/20250418-wip-bl-ad7606-fix-reg-access-v3-1-d5eeb440... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/ad7606_spi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/adc/ad7606_spi.c +++ b/drivers/iio/adc/ad7606_spi.c @@ -165,7 +165,7 @@ static int ad7606_spi_reg_read(struct ad { .tx_buf = &st->d16[0], .len = 2, - .cs_change = 0, + .cs_change = 1, }, { .rx_buf = &st->d16[1], .len = 2,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Simon Xue xxm@rock-chips.com
commit 839f81de397019f55161c5982d670ac19d836173 upstream.
clk_set_rate() should be called after devm_clk_get_enabled().
Fixes: 97ad10bb2901 ("iio: adc: rockchip_saradc: Make use of devm_clk_get_enabled") Signed-off-by: Simon Xue xxm@rock-chips.com Reviewed-by: Heiko Stuebner heiko@sntech.de Link: https://patch.msgid.link/20250312062016.137821-1-xxm@rock-chips.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/rockchip_saradc.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-)
--- a/drivers/iio/adc/rockchip_saradc.c +++ b/drivers/iio/adc/rockchip_saradc.c @@ -480,15 +480,6 @@ static int rockchip_saradc_probe(struct if (info->reset) rockchip_saradc_reset_controller(info->reset);
- /* - * Use a default value for the converter clock. - * This may become user-configurable in the future. - */ - ret = clk_set_rate(info->clk, info->data->clk_rate); - if (ret < 0) - return dev_err_probe(&pdev->dev, ret, - "failed to set adc clk rate\n"); - ret = regulator_enable(info->vref); if (ret < 0) return dev_err_probe(&pdev->dev, ret, @@ -515,6 +506,14 @@ static int rockchip_saradc_probe(struct if (IS_ERR(info->clk)) return dev_err_probe(&pdev->dev, PTR_ERR(info->clk), "failed to get adc clock\n"); + /* + * Use a default value for the converter clock. + * This may become user-configurable in the future. + */ + ret = clk_set_rate(info->clk, info->data->clk_rate); + if (ret < 0) + return dev_err_probe(&pdev->dev, ret, + "failed to set adc clk rate\n");
platform_set_drvdata(pdev, indio_dev);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabriel Shahrouzi gshahrouzi@gmail.com
commit 609bc31eca06c7408e6860d8b46311ebe45c1fef upstream.
The inclinometer channels were previously defined with 14 realbits. However, the ADIS16201 datasheet states the resolution for these output channels is 12 bits (Page 14, text description; Page 15, table 7).
Correct the realbits value to 12 to accurately reflect the hardware.
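To illustrate why the advertised width matters, here is a generic sketch of how a consumer might turn a 16-bit storage word into a signed sample using realbits. It is not the driver's or the IIO core's code, and sample_from_raw() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: keep the low `realbits` bits of a 16-bit storage
 * word and sign-extend them as a two's complement value.
 */
static int32_t sample_from_raw(uint16_t raw, unsigned int realbits)
{
	uint32_t mask, value, sign;

	assert(realbits >= 1 && realbits <= 16);
	mask = (1u << realbits) - 1u;
	sign = 1u << (realbits - 1);
	value = raw & mask;

	/* Flip the sign bit, then subtract its weight to sign-extend. */
	return (int32_t)(value ^ sign) - (int32_t)sign;
}
```

If bits above the 12 data bits happen to be set in the register word, interpreting the value with 14 realbits pulls them into the sample, whereas 12 realbits masks them off.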
Fixes: f7fe1d1dd5a5 ("staging: iio: new adis16201 driver") Cc: stable@vger.kernel.org Signed-off-by: Gabriel Shahrouzi gshahrouzi@gmail.com Reviewed-by: Marcelo Schmitt marcelo.schmitt1@gmail.com Link: https://patch.msgid.link/20250421131539.912966-1-gshahrouzi@gmail.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/accel/adis16201.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/iio/accel/adis16201.c +++ b/drivers/iio/accel/adis16201.c @@ -211,9 +211,9 @@ static const struct iio_chan_spec adis16 BIT(IIO_CHAN_INFO_CALIBBIAS), 0, 14), ADIS_AUX_ADC_CHAN(ADIS16201_AUX_ADC_REG, ADIS16201_SCAN_AUX_ADC, 0, 12), ADIS_INCLI_CHAN(X, ADIS16201_XINCL_OUT_REG, ADIS16201_SCAN_INCLI_X, - BIT(IIO_CHAN_INFO_CALIBBIAS), 0, 14), + BIT(IIO_CHAN_INFO_CALIBBIAS), 0, 12), ADIS_INCLI_CHAN(Y, ADIS16201_YINCL_OUT_REG, ADIS16201_SCAN_INCLI_Y, - BIT(IIO_CHAN_INFO_CALIBBIAS), 0, 14), + BIT(IIO_CHAN_INFO_CALIBBIAS), 0, 12), IIO_CHAN_SOFT_TIMESTAMP(7) };
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lechner dlechner@baylibre.com
commit bb49d940344bcb8e2b19e69d7ac86f567887ea9a upstream.
Follow the pattern of other drivers and use aligned_s64 for the timestamp. This will ensure that the timestamp is correctly aligned on all architectures.
Fixes: a5bf6fdd19c3 ("iio:chemical:sps30: Fix timestamp alignment") Signed-off-by: David Lechner dlechner@baylibre.com Reviewed-by: Nuno Sá nuno.sa@analog.com Link: https://patch.msgid.link/20250417-iio-more-timestamp-alignment-v1-5-eafac1e2... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/chemical/sps30.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/chemical/sps30.c +++ b/drivers/iio/chemical/sps30.c @@ -108,7 +108,7 @@ static irqreturn_t sps30_trigger_handler int ret; struct { s32 data[4]; /* PM1, PM2P5, PM4, PM10 */ - s64 ts; + aligned_s64 ts; } scan;
mutex_lock(&state->lock);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lechner dlechner@baylibre.com
commit 6ffa698674053e82e811520642db2650d00d2c01 upstream.
Follow the pattern of other drivers and use aligned_s64 for the timestamp. This will ensure that the timestamp is correctly aligned on all architectures.
Also move the unaligned.h header while touching this since it was the only one not in alphabetical order.
Fixes: 13e945631c2f ("iio:chemical:pms7003: Fix timestamp alignment and prevent data leak.") Signed-off-by: David Lechner dlechner@baylibre.com Reviewed-by: Nuno Sá nuno.sa@analog.com Link: https://patch.msgid.link/20250417-iio-more-timestamp-alignment-v1-4-eafac1e2... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/chemical/pms7003.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/iio/chemical/pms7003.c +++ b/drivers/iio/chemical/pms7003.c @@ -5,7 +5,6 @@ * Copyright (c) Tomasz Duszynski tduszyns@gmail.com */
-#include <linux/unaligned.h> #include <linux/completion.h> #include <linux/device.h> #include <linux/errno.h> @@ -19,6 +18,8 @@ #include <linux/module.h> #include <linux/mutex.h> #include <linux/serdev.h> +#include <linux/types.h> +#include <linux/unaligned.h>
#define PMS7003_DRIVER_NAME "pms7003"
@@ -76,7 +77,7 @@ struct pms7003_state { /* Used to construct scan to push to the IIO buffer */ struct { u16 data[3]; /* PM1, PM2P5, PM10 */ - s64 ts; + aligned_s64 ts; } scan; };
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Lixu lixu.zhang@intel.com
commit 83ded7cfaccccd2f4041769c313b58b4c9e265ad upstream.
The variables `scale_pre_decml`, `scale_post_decml`, and `scale_precision` were assigned in commit d68c592e02f6 ("iio: hid-sensor-prox: Fix scale not correct issue"), but due to a merge conflict in commit 9c15db92a8e5 ("Merge tag 'iio-for-5.13a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next"), these assignments were lost.
Add back lost assignments and replace `st->prox_attr` with `st->prox_attr[0]` because commit 596ef5cf654b ("iio: hid-sensor-prox: Add support for more channels") changed `prox_attr` to an array.
Cc: stable@vger.kernel.org # 5.13+ Fixes: 9c15db92a8e5 ("Merge tag 'iio-for-5.13a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next") Signed-off-by: Zhang Lixu lixu.zhang@intel.com Acked-by: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Link: https://patch.msgid.link/20250331055022.1149736-2-lixu.zhang@intel.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/light/hid-sensor-prox.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/iio/light/hid-sensor-prox.c +++ b/drivers/iio/light/hid-sensor-prox.c @@ -257,6 +257,11 @@ static int prox_parse_report(struct plat
st->num_channels = index;
+ st->scale_precision = hid_sensor_format_scale(hsdev->usage, + &st->prox_attr[0], + &st->scale_pre_decml, + &st->scale_post_decml); + return 0; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Lixu lixu.zhang@intel.com
commit 8b518cdb03f5f6e06d635cbfd9583d1fdbb39bfd upstream.
With the introduction of multi-channel support in commit 596ef5cf654b ("iio: hid-sensor-prox: Add support for more channels"), each channel requires an independent SCALE calculation, but the existing code only calculates SCALE for a single channel.
Address the problem by modifying the driver to perform independent SCALE calculations for each channel.
Cc: stable@vger.kernel.org Fixes: 596ef5cf654b ("iio: hid-sensor-prox: Add support for more channels") Signed-off-by: Zhang Lixu lixu.zhang@intel.com Acked-by: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Link: https://patch.msgid.link/20250331055022.1149736-3-lixu.zhang@intel.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/common/hid-sensors/hid-sensor-attributes.c | 4 ++ drivers/iio/light/hid-sensor-prox.c | 24 +++++++++-------- 2 files changed, 17 insertions(+), 11 deletions(-)
--- a/drivers/iio/common/hid-sensors/hid-sensor-attributes.c +++ b/drivers/iio/common/hid-sensors/hid-sensor-attributes.c @@ -66,6 +66,10 @@ static struct { {HID_USAGE_SENSOR_HUMIDITY, 0, 1000, 0}, {HID_USAGE_SENSOR_HINGE, 0, 0, 17453293}, {HID_USAGE_SENSOR_HINGE, HID_USAGE_SENSOR_UNITS_DEGREES, 0, 17453293}, + + {HID_USAGE_SENSOR_HUMAN_PRESENCE, 0, 1, 0}, + {HID_USAGE_SENSOR_HUMAN_PROXIMITY, 0, 1, 0}, + {HID_USAGE_SENSOR_HUMAN_ATTENTION, 0, 1, 0}, };
static void simple_div(int dividend, int divisor, int *whole, --- a/drivers/iio/light/hid-sensor-prox.c +++ b/drivers/iio/light/hid-sensor-prox.c @@ -34,9 +34,9 @@ struct prox_state { struct iio_chan_spec channels[MAX_CHANNELS]; u32 channel2usage[MAX_CHANNELS]; u32 human_presence[MAX_CHANNELS]; - int scale_pre_decml; - int scale_post_decml; - int scale_precision; + int scale_pre_decml[MAX_CHANNELS]; + int scale_post_decml[MAX_CHANNELS]; + int scale_precision[MAX_CHANNELS]; unsigned long scan_mask[2]; /* One entry plus one terminator. */ int num_channels; }; @@ -116,9 +116,12 @@ static int prox_read_raw(struct iio_dev ret_type = IIO_VAL_INT; break; case IIO_CHAN_INFO_SCALE: - *val = prox_state->scale_pre_decml; - *val2 = prox_state->scale_post_decml; - ret_type = prox_state->scale_precision; + if (chan->scan_index >= prox_state->num_channels) + return -EINVAL; + + *val = prox_state->scale_pre_decml[chan->scan_index]; + *val2 = prox_state->scale_post_decml[chan->scan_index]; + ret_type = prox_state->scale_precision[chan->scan_index]; break; case IIO_CHAN_INFO_OFFSET: *val = hid_sensor_convert_exponent( @@ -249,6 +252,10 @@ static int prox_parse_report(struct plat st->prox_attr[index].size); dev_dbg(&pdev->dev, "prox %x:%x\n", st->prox_attr[index].index, st->prox_attr[index].report_id); + st->scale_precision[index] = + hid_sensor_format_scale(usage_id, &st->prox_attr[index], + &st->scale_pre_decml[index], + &st->scale_post_decml[index]); index++; }
@@ -257,11 +264,6 @@ static int prox_parse_report(struct plat
st->num_channels = index;
- st->scale_precision = hid_sensor_format_scale(hsdev->usage, - &st->prox_attr[0], - &st->scale_pre_decml, - &st->scale_post_decml); - return 0; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Lixu lixu.zhang@intel.com
commit 79dabbd505210e41c88060806c92c052496dd61c upstream.
The OFFSET calculation in prox_read_raw() was incorrectly using the unit exponent, which is intended for SCALE calculations.
Remove the incorrect OFFSET calculation and set it to a fixed value of 0.
Cc: stable@vger.kernel.org Fixes: 39a3a0138f61 ("iio: hid-sensors: Added Proximity Sensor Driver") Signed-off-by: Zhang Lixu lixu.zhang@intel.com Acked-by: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Link: https://patch.msgid.link/20250331055022.1149736-4-lixu.zhang@intel.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/light/hid-sensor-prox.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/iio/light/hid-sensor-prox.c +++ b/drivers/iio/light/hid-sensor-prox.c @@ -124,8 +124,7 @@ static int prox_read_raw(struct iio_dev ret_type = prox_state->scale_precision[chan->scan_index]; break; case IIO_CHAN_INFO_OFFSET: - *val = hid_sensor_convert_exponent( - prox_state->prox_attr[chan->scan_index].unit_expo); + *val = 0; ret_type = IIO_VAL_INT; break; case IIO_CHAN_INFO_SAMP_FREQ:
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lechner dlechner@baylibre.com
commit 1d2d8524eaffc4d9a116213520d2c650e07c9cc6 upstream.
Align the buffer used with iio_push_to_buffers_with_timestamp() to ensure the s64 timestamp is aligned to 8 bytes.
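A userspace sketch of why the alignment matters, assuming a hypothetical 16-byte scan (8 data bytes followed by an 8-byte timestamp). `SAMPLE_BYTES`, `is_aligned_8()` and `push_with_timestamp()` are names invented for the example; C11 `alignas(8)` stands in for the kernel's `__aligned(8)`.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

#define SAMPLE_BYTES 16 /* hypothetical scan size: 8 data + 8 timestamp bytes */

/* Return nonzero when p sits on an 8-byte boundary. */
int is_aligned_8(const void *p)
{
	return ((uintptr_t)p % 8) == 0;
}

/* Store the timestamp in the last 8 bytes of an 8-byte-aligned buffer. */
int64_t push_with_timestamp(int64_t ts)
{
	alignas(8) uint8_t data[SAMPLE_BYTES] = { 0 };
	int64_t out;

	/* Both the buffer and the timestamp slot are 8-byte aligned. */
	assert(is_aligned_8(data));
	assert(is_aligned_8(data + SAMPLE_BYTES - sizeof(ts)));

	memcpy(data + SAMPLE_BYTES - sizeof(ts), &ts, sizeof(ts));
	memcpy(&out, data + SAMPLE_BYTES - sizeof(out), sizeof(out));
	return out;
}
```

Without the alignment specifier a plain byte array carries no alignment guarantee, so the timestamp slot could land on an odd address.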
Fixes: 0829edc43e0a ("iio: imu: inv_mpu6050: read the full fifo when processing data") Signed-off-by: David Lechner dlechner@baylibre.com Link: https://patch.msgid.link/20250417-iio-more-timestamp-alignment-v1-7-eafac1e2... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c +++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c @@ -50,7 +50,7 @@ irqreturn_t inv_mpu6050_read_fifo(int ir u16 fifo_count; u32 fifo_period; s64 timestamp; - u8 data[INV_MPU6050_OUTPUT_DATA_SIZE]; + u8 data[INV_MPU6050_OUTPUT_DATA_SIZE] __aligned(8); size_t i, nb;
mutex_lock(&st->lock);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Silvano Seva s.seva@4sigma.it
commit 159ca7f18129834b6f4c7eae67de48e96c752fc9 upstream.
Prevent st_lsm6dsx_read_fifo() from falling into an infinite loop when pattern_len is zero and the device FIFO is not empty.
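The failure mode can be shown with a small standalone sketch. It is not the driver loop: `count_fifo_blocks()` and `DEFAULT_SAMPLE_SIZE` are hypothetical, and only the "fall back to a nonzero step before looping" idea is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_SAMPLE_SIZE 6 /* hypothetical fallback, in bytes */

/*
 * Count how many blocks a FIFO drain loop would read. With a zero
 * pattern_len the loop variable never advances, so fall back to a
 * nonzero default before entering the loop.
 */
size_t count_fifo_blocks(size_t fifo_len, size_t pattern_len)
{
	size_t read_len, blocks = 0;

	if (!pattern_len)
		pattern_len = DEFAULT_SAMPLE_SIZE;

	for (read_len = 0; read_len < fifo_len; read_len += pattern_len)
		blocks++;

	return blocks;
}
```

With the guard removed, a call such as `count_fifo_blocks(12, 0)` would spin forever because `read_len` stays at zero.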
Fixes: 290a6ce11d93 ("iio: imu: add support to lsm6dsx driver") Signed-off-by: Silvano Seva s.seva@4sigma.it Acked-by: Lorenzo Bianconi lorenzo@kernel.org Link: https://patch.msgid.link/20250311085030.3593-2-s.seva@4sigma.it Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c +++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c @@ -392,6 +392,9 @@ int st_lsm6dsx_read_fifo(struct st_lsm6d if (fifo_status & cpu_to_le16(ST_LSM6DSX_FIFO_EMPTY_MASK)) return 0;
+ if (!pattern_len) + pattern_len = ST_LSM6DSX_SAMPLE_SIZE; + fifo_len = (le16_to_cpu(fifo_status) & fifo_diff_mask) * ST_LSM6DSX_CHAN_SIZE; fifo_len = (fifo_len / pattern_len) * pattern_len;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Silvano Seva s.seva@4sigma.it
commit 8114ef86e2058e2554111b793596f17bee23fa15 upstream.
Prevent st_lsm6dsx_read_tagged_fifo() from falling into an infinite loop when pattern_len is zero and the device FIFO is not empty.
Fixes: 801a6e0af0c6 ("iio: imu: st_lsm6dsx: add support to LSM6DSO") Signed-off-by: Silvano Seva s.seva@4sigma.it Acked-by: Lorenzo Bianconi lorenzo@kernel.org Link: https://patch.msgid.link/20250311085030.3593-4-s.seva@4sigma.it Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c +++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c @@ -626,6 +626,9 @@ int st_lsm6dsx_read_tagged_fifo(struct s if (!fifo_len) return 0;
+ if (!pattern_len) + pattern_len = ST_LSM6DSX_TAGGED_SAMPLE_SIZE; + for (read_len = 0; read_len < fifo_len; read_len += pattern_len) { err = st_lsm6dsx_read_block(hw, ST_LSM6DSX_REG_FIFO_OUT_TAG_ADDR,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luca Ceresoli luca.ceresoli@bootlin.com
commit f063a28002e3350088b4577c5640882bf4ea17ea upstream.
The threaded IRQ function in this driver is reading the flag twice: once to lock a mutex and once to unlock it. Even though the code setting the flag is designed to prevent it, there are subtle cases where the flag could be true at the mutex_lock stage and false at the mutex_unlock stage. This results in the mutex not being unlocked, resulting in a deadlock.
Fix it by making the opt3001_irq() code generally more robust, reading the flag into a variable and using the variable value at both stages.
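A minimal model of the fix, assuming a counter in place of the real mutex. `opt_sketch`, `irq_body()` and the `flip_midway` parameter are invented for illustration; the point is only that lock and unlock decisions use the same snapshot of the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the device state; lock_depth models the mutex. */
struct opt_sketch {
	bool ok_to_ignore_lock;
	int lock_depth;
};

/*
 * Handler body with the flag read exactly once. flip_midway simulates
 * another context changing the flag while the handler runs.
 */
void irq_body(struct opt_sketch *opt, bool flip_midway)
{
	bool ok_to_ignore_lock = opt->ok_to_ignore_lock;

	if (!ok_to_ignore_lock)
		opt->lock_depth++;	/* stands in for mutex_lock() */

	if (flip_midway)
		opt->ok_to_ignore_lock = !opt->ok_to_ignore_lock;

	if (!ok_to_ignore_lock)
		opt->lock_depth--;	/* stands in for mutex_unlock() */
}
```

Because both branches test the local copy, the lock count stays balanced even when the shared flag changes between the two checks.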
Fixes: 94a9b7b1809f ("iio: light: add support for TI's opt3001 light sensor") Cc: stable@vger.kernel.org Signed-off-by: Luca Ceresoli luca.ceresoli@bootlin.com Link: https://patch.msgid.link/20250321-opt3001-irq-fix-v1-1-6c520d851562@bootlin.... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/light/opt3001.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/iio/light/opt3001.c +++ b/drivers/iio/light/opt3001.c @@ -788,8 +788,9 @@ static irqreturn_t opt3001_irq(int irq, int ret; bool wake_result_ready_queue = false; enum iio_chan_type chan_type = opt->chip_info->chan_type; + bool ok_to_ignore_lock = opt->ok_to_ignore_lock;
- if (!opt->ok_to_ignore_lock) + if (!ok_to_ignore_lock) mutex_lock(&opt->lock);
ret = i2c_smbus_read_word_swapped(opt->client, OPT3001_CONFIGURATION); @@ -826,7 +827,7 @@ static irqreturn_t opt3001_irq(int irq, }
out: - if (!opt->ok_to_ignore_lock) + if (!ok_to_ignore_lock) mutex_unlock(&opt->lock);
if (wake_result_ready_queue)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lechner dlechner@baylibre.com
commit ffcd19e9f4cca0c8f9e23e88f968711acefbb37b upstream.
Follow the pattern of other drivers and use aligned_s64 for the timestamp. This will ensure the struct itself is also 8-byte aligned.
While touching this, convert struct mpr_chan to an anonymous struct to consolidate the code a bit to make it easier for future readers.
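The layout effect of an 8-byte-aligned timestamp can be checked in plain C11. This sketch uses `alignas(8) int64_t` as a stand-in for the kernel's `aligned_s64`; `chan_sketch` and `layout_ok()` are hypothetical names.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Scan layout sketch: 32-bit sample followed by an 8-byte-aligned timestamp. */
struct chan_sketch {
	int32_t pres;
	alignas(8) int64_t ts;
};

/* Return nonzero when the struct has the layout the alignment implies. */
int layout_ok(void)
{
	return alignof(struct chan_sketch) == 8 &&
	       offsetof(struct chan_sketch, ts) == 8 &&
	       sizeof(struct chan_sketch) == 16;
}
```

On ABIs where a bare 64-bit integer only needs 4-byte alignment, the explicit specifier is what forces the 4 bytes of padding after `pres`.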
Fixes: 713337d9143e ("iio: pressure: Honeywell mprls0025pa pressure sensor") Signed-off-by: David Lechner dlechner@baylibre.com Link: https://patch.msgid.link/20250418-iio-more-timestamp-alignment-v2-2-d6a5d2b1... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/pressure/mprls0025pa.h | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-)
--- a/drivers/iio/pressure/mprls0025pa.h +++ b/drivers/iio/pressure/mprls0025pa.h @@ -34,16 +34,6 @@ struct iio_dev; struct mpr_data; struct mpr_ops;
-/** - * struct mpr_chan - * @pres: pressure value - * @ts: timestamp - */ -struct mpr_chan { - s32 pres; - s64 ts; -}; - enum mpr_func_id { MPR_FUNCTION_A, MPR_FUNCTION_B, @@ -69,6 +59,8 @@ enum mpr_func_id { * reading in a loop until data is ready * @completion: handshake from irq to read * @chan: channel values for buffered mode + * @chan.pres: pressure value + * @chan.ts: timestamp * @buffer: raw conversion data */ struct mpr_data { @@ -87,7 +79,10 @@ struct mpr_data { struct gpio_desc *gpiod_reset; int irq; struct completion completion; - struct mpr_chan chan; + struct { + s32 pres; + aligned_s64 ts; + } chan; u8 buffer[MPR_MEASUREMENT_RD_SIZE] __aligned(IIO_DMA_MINALIGN); };
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit d0ce1aaa8531a4a4707711cab5721374751c51b0 upstream.
This reverts commit 3a9626c816db901def438dc2513622e281186d39.
This breaks S4 because we end up setting the s3/s0ix flags even when we are entering s4, since prepare is used by both flows. This causes both the s3/s0ix and s4 flags to be set, which breaks several checks in the driver that assume they are mutually exclusive.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3634 Cc: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit ce8f7d95899c2869b47ea6ce0b3e5bf304b2fff4) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 -- drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 18 ------------------ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++--------- 3 files changed, 2 insertions(+), 29 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1593,11 +1593,9 @@ static inline void amdgpu_acpi_get_backl #if defined(CONFIG_ACPI) && defined(CONFIG_SUSPEND) bool amdgpu_acpi_is_s3_active(struct amdgpu_device *adev); bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev); -void amdgpu_choose_low_power_state(struct amdgpu_device *adev); #else static inline bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev) { return false; } static inline bool amdgpu_acpi_is_s3_active(struct amdgpu_device *adev) { return false; } -static inline void amdgpu_choose_low_power_state(struct amdgpu_device *adev) { } #endif
void amdgpu_register_gpu_instance(struct amdgpu_device *adev); --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c @@ -1533,22 +1533,4 @@ bool amdgpu_acpi_is_s0ix_active(struct a #endif /* CONFIG_AMD_PMC */ }
-/** - * amdgpu_choose_low_power_state - * - * @adev: amdgpu_device_pointer - * - * Choose the target low power state for the GPU - */ -void amdgpu_choose_low_power_state(struct amdgpu_device *adev) -{ - if (adev->in_runpm) - return; - - if (amdgpu_acpi_is_s0ix_active(adev)) - adev->in_s0ix = true; - else if (amdgpu_acpi_is_s3_active(adev)) - adev->in_s3 = true; -} - #endif /* CONFIG_SUSPEND */ --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4861,15 +4861,13 @@ int amdgpu_device_prepare(struct drm_dev struct amdgpu_device *adev = drm_to_adev(dev); int i, r;
- amdgpu_choose_low_power_state(adev); - if (dev->switch_power_state == DRM_SWITCH_POWER_OFF) return 0;
/* Evict the majority of BOs before starting suspend sequence */ r = amdgpu_device_evict_resources(adev); if (r) - goto unprepare; + return r;
flush_delayed_work(&adev->gfx.gfx_off_delay_work);
@@ -4880,15 +4878,10 @@ int amdgpu_device_prepare(struct drm_dev continue; r = adev->ip_blocks[i].version->funcs->prepare_suspend(&adev->ip_blocks[i]); if (r) - goto unprepare; + return r; }
return 0; - -unprepare: - adev->in_s0ix = adev->in_s3 = adev->in_s4 = false; - - return r; }
/**
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maíra Canal mcanal@igalia.com
commit 35e4079bf1a2570abffce6ababa631afcf8ea0e5 upstream.
When a CL/CSD job times out, we check if the GPU has made any progress since the last timeout. If so, instead of resetting the hardware, we skip the reset and let the timer get rearmed. This gives long-running jobs a chance to complete.
However, when `timedout_job()` is called, the job in question is removed from the pending list, which means it won't be automatically freed through `free_job()`. Consequently, when we skip the reset and keep the job running, the job won't be freed when it finally completes.
This situation leads to a memory leak, as exposed in [1] and [2].
Similarly to commit 704d3d60fec4 ("drm/etnaviv: don't block scheduler when GPU is still active"), this patch ensures the job is put back on the pending list when extending the timeout.
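The bookkeeping can be pictured with a toy pending list. This is a sketch under simplifying assumptions: a singly linked list, no locking (the real code takes the scheduler's list lock), and invented names `job_sketch`, `sched_sketch`, `take_timed_out_job()` and `skip_reset()`.

```c
#include <assert.h>
#include <stddef.h>

struct job_sketch {
	struct job_sketch *next;
	int id;
};

struct sched_sketch {
	struct job_sketch *pending; /* head of the pending list */
};

/* Remove and return the first pending job, as the timeout path does. */
struct job_sketch *take_timed_out_job(struct sched_sketch *s)
{
	struct job_sketch *job = s->pending;

	if (job) {
		s->pending = job->next;
		job->next = NULL;
	}
	return job;
}

/* Put the job back at the head so the completion path can still find it. */
void skip_reset(struct sched_sketch *s, struct job_sketch *job)
{
	job->next = s->pending;
	s->pending = job;
}
```

If the timeout handler returns without calling `skip_reset()`, the job is on no list at all, which is the leak the commit message describes.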
Cc: stable@vger.kernel.org # 6.0 Reported-by: Daivik Bhatia dtgs1208@gmail.com Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12227 [1] Closes: https://github.com/raspberrypi/linux/issues/6817 [2] Reviewed-by: Iago Toral Quiroga itoral@igalia.com Acked-by: Tvrtko Ursulin tvrtko.ursulin@igalia.com Link: https://lore.kernel.org/r/20250430210643.57924-1-mcanal@igalia.com Signed-off-by: Maíra Canal mcanal@igalia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/v3d/v3d_sched.c | 28 +++++++++++++++++++++------- 1 file changed, 21 insertions(+), 7 deletions(-)
--- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -746,11 +746,16 @@ v3d_gpu_reset_for_timeout(struct v3d_dev return DRM_GPU_SCHED_STAT_NOMINAL; }
-/* If the current address or return address have changed, then the GPU - * has probably made progress and we should delay the reset. This - * could fail if the GPU got in an infinite loop in the CL, but that - * is pretty unlikely outside of an i-g-t testcase. - */ +static void +v3d_sched_skip_reset(struct drm_sched_job *sched_job) +{ + struct drm_gpu_scheduler *sched = sched_job->sched; + + spin_lock(&sched->job_list_lock); + list_add(&sched_job->list, &sched->pending_list); + spin_unlock(&sched->job_list_lock); +} + static enum drm_gpu_sched_stat v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q, u32 *timedout_ctca, u32 *timedout_ctra) @@ -760,9 +765,16 @@ v3d_cl_job_timedout(struct drm_sched_job u32 ctca = V3D_CORE_READ(0, V3D_CLE_CTNCA(q)); u32 ctra = V3D_CORE_READ(0, V3D_CLE_CTNRA(q));
+ /* If the current address or return address have changed, then the GPU + * has probably made progress and we should delay the reset. This + * could fail if the GPU got in an infinite loop in the CL, but that + * is pretty unlikely outside of an i-g-t testcase. + */ if (*timedout_ctca != ctca || *timedout_ctra != ctra) { *timedout_ctca = ctca; *timedout_ctra = ctra; + + v3d_sched_skip_reset(sched_job); return DRM_GPU_SCHED_STAT_NOMINAL; }
@@ -802,11 +814,13 @@ v3d_csd_job_timedout(struct drm_sched_jo struct v3d_dev *v3d = job->base.v3d; u32 batches = V3D_CORE_READ(0, V3D_CSD_CURRENT_CFG4(v3d->ver));
- /* If we've made progress, skip reset and let the timer get - * rearmed. + /* If we've made progress, skip reset, add the job to the pending + * list, and let the timer get rearmed. */ if (job->timedout_batches != batches) { job->timedout_batches = batches; + + v3d_sched_skip_reset(sched_job); return DRM_GPU_SCHED_STAT_NOMINAL; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Brost matthew.brost@intel.com
commit 391008f34e711253c5983b0bf52277cc43723127 upstream.
For an unknown reason, the math that determines the PF queue size is not correct: compute UMD applications are overflowing the PF queue, which is fatal. A multiplier of 8 fixes the problem.
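As a worked example of the sizing formula, the sketch below reproduces the arithmetic with placeholder constants. `SKETCH_NUM_HW_ENGINES` and `SKETCH_PF_MSG_LEN_DW` are assumed values chosen for illustration, not the driver's definitions; only the multiplier of 8 comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder constants for this sketch; not the driver's values. */
#define SKETCH_NUM_HW_ENGINES 64
#define SKETCH_PF_MSG_LEN_DW 4

/* Multiplier introduced by the patch. */
#define PF_MULTIPLIER 8

/* Queue size in dwords: one message slot per EU and per engine, scaled. */
uint32_t pf_queue_num_dw(uint32_t num_eus)
{
	return (num_eus + SKETCH_NUM_HW_ENGINES) * SKETCH_PF_MSG_LEN_DW *
	       PF_MULTIPLIER;
}
```

With these placeholder values and 512 EUs, the queue grows from (512 + 64) * 4 = 2304 dwords to 18432 dwords.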
Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost matthew.brost@intel.com Reviewed-by: Jagmeet Randhawa jagmeet.randhawa@intel.com Link: https://lore.kernel.org/r/20250408155915.78770-1-matthew.brost@intel.com (cherry picked from commit 29582e0ea75c95668d168b12406e3c56cf5a73c4) Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/xe/xe_gt_pagefault.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c @@ -422,9 +422,16 @@ static int xe_alloc_pf_queue(struct xe_g num_eus = bitmap_weight(gt->fuse_topo.eu_mask_per_dss, XE_MAX_EU_FUSE_BITS) * num_dss;
- /* user can issue separate page faults per EU and per CS */ + /* + * user can issue separate page faults per EU and per CS + * + * XXX: Multiplier required as compute UMD are getting PF queue errors + * without it. Follow on why this multiplier is required. + */ +#define PF_MULTIPLIER 8 pf_queue->num_dw = - (num_eus + XE_NUM_HW_ENGINES) * PF_MSG_LEN_DW; + (num_eus + XE_NUM_HW_ENGINES) * PF_MSG_LEN_DW * PF_MULTIPLIER; +#undef PF_MULTIPLIER
pf_queue->gt = gt; pf_queue->data = devm_kcalloc(xe->drm.dev, pf_queue->num_dw,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 4aaffc85751da5722e858e4333e8cf0aa4b6c78f upstream.
Set the s3/s0ix and s4 flags in the pm notifier so that we can skip the resource evictions properly in pm prepare based on whether we are suspending or hibernating. Drop the eviction: processes are not frozen at this time, so we can end up getting stuck trying to evict VRAM while applications continue to submit work, which causes the buffers to get pulled back into VRAM.
v2: Move suspend flags out of pm notifier (Mario)
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178 Fixes: 2965e6355dcd ("drm/amd: Add Suspend/Hibernate notification callback support") Cc: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 06f2dcc241e7e5c681f81fbc46cacdf4bfd7d6d7) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 18 +++++------------- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 +--------- 2 files changed, 6 insertions(+), 22 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4819,28 +4819,20 @@ static int amdgpu_device_evict_resources * @data: data * * This function is called when the system is about to suspend or hibernate. - * It is used to evict resources from the device before the system goes to - * sleep while there is still access to swap. + * It is used to set the appropriate flags so that eviction can be optimized + * in the pm prepare callback. */ static int amdgpu_device_pm_notifier(struct notifier_block *nb, unsigned long mode, void *data) { struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, pm_nb); - int r;
switch (mode) { case PM_HIBERNATION_PREPARE: adev->in_s4 = true; - fallthrough; - case PM_SUSPEND_PREPARE: - r = amdgpu_device_evict_resources(adev); - /* - * This is considered non-fatal at this time because - * amdgpu_device_prepare() will also fatally evict resources. - * See https://gitlab.freedesktop.org/drm/amd/-/issues/3781 - */ - if (r) - drm_warn(adev_to_drm(adev), "Failed to evict resources, freeze active processes if problems occur: %d\n", r); + break; + case PM_POST_HIBERNATION: + adev->in_s4 = false; break; }
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2582,13 +2582,8 @@ static int amdgpu_pmops_freeze(struct de static int amdgpu_pmops_thaw(struct device *dev) { struct drm_device *drm_dev = dev_get_drvdata(dev); - struct amdgpu_device *adev = drm_to_adev(drm_dev); - int r;
- r = amdgpu_device_resume(drm_dev, true); - adev->in_s4 = false; - - return r; + return amdgpu_device_resume(drm_dev, true); }
static int amdgpu_pmops_poweroff(struct device *dev) @@ -2601,9 +2596,6 @@ static int amdgpu_pmops_poweroff(struct static int amdgpu_pmops_restore(struct device *dev) { struct drm_device *drm_dev = dev_get_drvdata(dev); - struct amdgpu_device *adev = drm_to_adev(drm_dev); - - adev->in_s4 = false;
return amdgpu_device_resume(drm_dev, true); }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ruijing Dong ruijing.dong@amd.com
commit b7e84fb708392b37e5dbb2a95db9b94a0e3f0aa2 upstream.
The VCN1_AON_SOC_ADDRESS_3_0 offset varies across VCN generations; the issue in vcn4.0.5 is caused by a different VCN1_AON_SOC_ADDRESS_3_0 offset.
This patch does the following:
1. use the same offset for other VCN generations.
2. use the vcn4.0.5 special offset
3. update vcn_4_0 and vcn_5_0
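The offset split can be summarized in a few lines. Note the difference from the patch: the patch defines the macro separately in each vcn_v*.c file, while this sketch uses a runtime selector purely for illustration. `vcn_gen_sketch` and `vcn1_aon_soc_address()` are hypothetical; the two offset values are the ones in the diff.

```c
#include <assert.h>
#include <stdint.h>

enum vcn_gen_sketch { VCN_GEN_OTHER, VCN_GEN_4_0_5 };

/* Return the VCN1 AON SOC address for a generation, per the diff's values. */
uint32_t vcn1_aon_soc_address(enum vcn_gen_sketch gen)
{
	const uint32_t base = 0x48000;

	/* vcn4.0.5 sits 0x38000 above the offset the other generations use. */
	return gen == VCN_GEN_4_0_5 ? base + 0x38000 : base;
}
```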
Acked-by: Saleemkhan Jamadar saleemkhan.jamadar@amd.com Reviewed-by: Leo Liu leo.liu@amd.com Signed-off-by: Ruijing Dong ruijing.dong@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 5c89ceda9984498b28716944633a9a01cbb2c90d) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h | 1 - drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 1 + drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 1 + drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 1 + drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 4 +++- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 1 + drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c | 1 + drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c | 3 ++- 8 files changed, 10 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h @@ -66,7 +66,6 @@ #define VCN_ENC_CMD_REG_WAIT 0x0000000c
#define VCN_AON_SOC_ADDRESS_2_0 0x1f800 -#define VCN1_AON_SOC_ADDRESS_3_0 0x48000 #define VCN_VID_IP_ADDRESS_2_0 0x0 #define VCN_AON_IP_ADDRESS_2_0 0x30000
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c @@ -39,6 +39,7 @@
#define VCN_VID_SOC_ADDRESS_2_0 0x1fa00 #define VCN1_VID_SOC_ADDRESS_3_0 0x48200 +#define VCN1_AON_SOC_ADDRESS_3_0 0x48000
#define mmUVD_CONTEXT_ID_INTERNAL_OFFSET 0x1fd #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET 0x503 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c @@ -39,6 +39,7 @@
#define VCN_VID_SOC_ADDRESS_2_0 0x1fa00 #define VCN1_VID_SOC_ADDRESS_3_0 0x48200 +#define VCN1_AON_SOC_ADDRESS_3_0 0x48000
#define mmUVD_CONTEXT_ID_INTERNAL_OFFSET 0x27 #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET 0x0f --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c @@ -40,6 +40,7 @@
#define VCN_VID_SOC_ADDRESS_2_0 0x1fa00 #define VCN1_VID_SOC_ADDRESS_3_0 0x48200 +#define VCN1_AON_SOC_ADDRESS_3_0 0x48000
#define mmUVD_CONTEXT_ID_INTERNAL_OFFSET 0x27 #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET 0x0f --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c @@ -46,6 +46,7 @@
#define VCN_VID_SOC_ADDRESS_2_0 0x1fb00 #define VCN1_VID_SOC_ADDRESS_3_0 0x48300 +#define VCN1_AON_SOC_ADDRESS_3_0 0x48000
#define VCN_HARVEST_MMSCH 0
@@ -582,7 +583,8 @@ static void vcn_v4_0_mc_resume_dpg_mode(
/* VCN global tiling registers */ WREG32_SOC15_DPG_MODE(inst_idx, SOC15_DPG_MODE_OFFSET( - VCN, 0, regUVD_GFX10_ADDR_CONFIG), adev->gfx.config.gb_addr_config, 0, indirect); + VCN, inst_idx, regUVD_GFX10_ADDR_CONFIG), + adev->gfx.config.gb_addr_config, 0, indirect); }
/** --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c @@ -44,6 +44,7 @@
#define VCN_VID_SOC_ADDRESS_2_0 0x1fb00 #define VCN1_VID_SOC_ADDRESS_3_0 0x48300 +#define VCN1_AON_SOC_ADDRESS_3_0 0x48000
static const struct amdgpu_hwip_reg_entry vcn_reg_list_4_0_3[] = { SOC15_REG_ENTRY_STR(VCN, 0, regUVD_POWER_STATUS), --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c @@ -46,6 +46,7 @@
#define VCN_VID_SOC_ADDRESS_2_0 0x1fb00 #define VCN1_VID_SOC_ADDRESS_3_0 (0x48300 + 0x38000) +#define VCN1_AON_SOC_ADDRESS_3_0 (0x48000 + 0x38000)
#define VCN_HARVEST_MMSCH 0
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c @@ -502,7 +502,8 @@ static void vcn_v5_0_0_mc_resume_dpg_mod
/* VCN global tiling registers */ WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( - VCN, 0, regUVD_GFX10_ADDR_CONFIG), adev->gfx.config.gb_addr_config, 0, indirect); + VCN, inst_idx, regUVD_GFX10_ADDR_CONFIG), + adev->gfx.config.gb_addr_config, 0, indirect);
return; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Li Roman.Li@amd.com
commit 9984db63742099ee3f3cff35cf71306d10e64356 upstream.
[Why] A "BUG: sleeping function called from invalid context" error occurs after "drm/amd/display: Protect FPU in dml2_validate()/dml21_validate()".
The populate_dml_plane_cfg_from_plane_state() uses the GFP_KERNEL flag for memory allocation, which shouldn't be used in atomic contexts.
The allocation is needed only for using another helper function get_scaler_data_for_plane().
[How] Modify helpers to pass a pointer to scaler_data within existing context, eliminating the need for dynamic memory allocation/deallocation and copying.
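The shape of the change, reduced to its essentials: the helper returns a pointer into storage the context already owns, so the caller neither allocates nor copies. The types and `get_scaler_data()` below are invented stand-ins, not the display code.

```c
#include <assert.h>

struct scaler_data_sketch {
	int taps;
};

/* The context already carries a scratch slot for scaler data. */
struct context_sketch {
	struct scaler_data_sketch temp_scl_data;
};

/* Fill the scratch slot owned by the context and return a pointer into it. */
struct scaler_data_sketch *get_scaler_data(struct context_sketch *ctx, int taps)
{
	ctx->temp_scl_data.taps = taps;
	return &ctx->temp_scl_data;
}
```

Since nothing is allocated, there is no sleeping allocation in an atomic context and no failure path to handle. The returned pointer is only valid while the context lives and until the scratch slot is reused.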
Fixes: 366e77cd4923 ("drm/amd/display: Protect FPU in dml2_validate()/dml21_validate()") Reviewed-by: Aurabindo Pillai aurabindo.pillai@amd.com Signed-off-by: Roman Li Roman.Li@amd.com Signed-off-by: Ray Wu ray.wu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit bd3e84bc98f81b44f2c43936bdadc3241d654259) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c | 14 +++------- 1 file changed, 5 insertions(+), 9 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c @@ -969,7 +969,9 @@ static void populate_dml_surface_cfg_fro } }
-static void get_scaler_data_for_plane(const struct dc_plane_state *in, struct dc_state *context, struct scaler_data *out) +static struct scaler_data *get_scaler_data_for_plane( + const struct dc_plane_state *in, + struct dc_state *context) { int i; struct pipe_ctx *temp_pipe = &context->res_ctx.temp_pipe; @@ -990,7 +992,7 @@ static void get_scaler_data_for_plane(co }
ASSERT(i < MAX_PIPES); - memcpy(out, &temp_pipe->plane_res.scl_data, sizeof(*out)); + return &temp_pipe->plane_res.scl_data; }
static void populate_dummy_dml_plane_cfg(struct dml_plane_cfg_st *out, unsigned int location, @@ -1053,11 +1055,7 @@ static void populate_dml_plane_cfg_from_ const struct dc_plane_state *in, struct dc_state *context, const struct soc_bounding_box_st *soc) { - struct scaler_data *scaler_data = kzalloc(sizeof(*scaler_data), GFP_KERNEL); - if (!scaler_data) - return; - - get_scaler_data_for_plane(in, context, scaler_data); + struct scaler_data *scaler_data = get_scaler_data_for_plane(in, context);
out->CursorBPP[location] = dml_cur_32bit; out->CursorWidth[location] = 256; @@ -1122,8 +1120,6 @@ static void populate_dml_plane_cfg_from_ out->DynamicMetadataTransmittedBytes[location] = 0;
out->NumberOfCursors[location] = 1; - - kfree(scaler_data); }
static unsigned int map_stream_to_dml_display_cfg(const struct dml2_context *dml2,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aurabindo Pillai aurabindo.pillai@amd.com
commit f1c6be3999d2be2673a51a9be0caf9348e254e52 upstream.
[Why] FAMS2 expects vmin/vmax to be updated in the case when freesync is off, but supported. But we only update it when freesync is enabled.
[How] Change the vsync handler such that dc_stream_adjust_vmin_vmax() is called irrespective of whether freesync is enabled. If freesync is supported, then there is no harm in updating vmin/vmax registers.
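The update condition in the diff reduces to a small predicate. The sketch below assumes the outer "stream present and VRR supported" check has already passed; `should_adjust_vmin_vmax()` is a hypothetical helper, and the parameter names follow the locals in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Update vmin/vmax when freesync is in the active-variable state, or
 * when neither Replay nor PSR is enabled. This is logically equivalent
 * to fs || (!fs && !replay && !psr) as written in the patch.
 */
bool should_adjust_vmin_vmax(bool fs_active_var_en, bool replay_en,
			     bool psr_en)
{
	return fs_active_var_en || (!replay_en && !psr_en);
}
```

The `!fs_active_var_en` term in the patch's expression is redundant once the first operand is false, so the shorter form selects the same cases.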
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3546 Reviewed-by: ChiaHsuan Chung chiahsuan.chung@amd.com Signed-off-by: Aurabindo Pillai aurabindo.pillai@amd.com Signed-off-by: Ray Wu ray.wu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit cfb2d41831ee5647a4ae0ea7c24971a92d5dfa0d) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -668,15 +668,21 @@ static void dm_crtc_high_irq(void *inter spin_lock_irqsave(&adev_to_drm(adev)->event_lock, flags);
if (acrtc->dm_irq_params.stream && - acrtc->dm_irq_params.vrr_params.supported && - acrtc->dm_irq_params.freesync_config.state == - VRR_STATE_ACTIVE_VARIABLE) { + acrtc->dm_irq_params.vrr_params.supported) { + bool replay_en = acrtc->dm_irq_params.stream->link->replay_settings.replay_feature_enabled; + bool psr_en = acrtc->dm_irq_params.stream->link->psr_settings.psr_feature_enabled; + bool fs_active_var_en = acrtc->dm_irq_params.freesync_config.state == VRR_STATE_ACTIVE_VARIABLE; + mod_freesync_handle_v_update(adev->dm.freesync_module, acrtc->dm_irq_params.stream, &acrtc->dm_irq_params.vrr_params);
- dc_stream_adjust_vmin_vmax(adev->dm.dc, acrtc->dm_irq_params.stream, - &acrtc->dm_irq_params.vrr_params.adjust); + /* update vmin_vmax only if freesync is enabled, or only if PSR and REPLAY are disabled */ + if (fs_active_var_en || (!fs_active_var_en && !replay_en && !psr_en)) { + dc_stream_adjust_vmin_vmax(adev->dm.dc, + acrtc->dm_irq_params.stream, + &acrtc->dm_irq_params.vrr_params.adjust); + } }
/*
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wayne Lin Wayne.Lin@amd.com
commit bc70e11b550d37fbd9eaed0f113ba560894f1609 upstream.
[Why & How] Fix the condition for detecting AUX_RET_ERROR_PROTOCOL_ERROR. The check wrongly used "not equal to".
Reviewed-by: Ray Wu ray.wu@amd.com Signed-off-by: Wayne Lin Wayne.Lin@amd.com Signed-off-by: Ray Wu ray.wu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 1db6c9e9b62e1a8912f0a281c941099fca678da3) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -12607,7 +12607,7 @@ int amdgpu_dm_process_dmub_aux_transfer_
 		 * Transient states before tunneling is enabled could
 		 * lead to this error. We can ignore this for now.
 		 */
-		if (p_notify->result != AUX_RET_ERROR_PROTOCOL_ERROR) {
+		if (p_notify->result == AUX_RET_ERROR_PROTOCOL_ERROR) {
 			DRM_WARN("DPIA AUX failed on 0x%x(%d), error %d\n",
 					payload->address, payload->length,
 					p_notify->result);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wayne Lin Wayne.Lin@amd.com
commit 396dc51b3b7ea524bf8061f478332d0039e96d5d upstream.
[Why & How] "Request length != reply length" is expected behavior defined in the spec; it is not an invalid reply. Besides, the reply-data handling logic is not meant to live in amdgpu_dm_process_dmub_aux_transfer_sync(). Remove the incorrect handling section.
Fixes: ead08b95fa50 ("drm/amd/display: Fix race condition in DPIA AUX transfer") Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Reviewed-by: Ray Wu ray.wu@amd.com Signed-off-by: Wayne Lin Wayne.Lin@amd.com Signed-off-by: Ray Wu ray.wu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 81b5c6fa62af62fe89ae9576f41aae37830b94cb) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -12622,19 +12622,9 @@ int amdgpu_dm_process_dmub_aux_transfer_ payload->reply[0] = (adev->dm.dmub_notify->aux_reply.command >> 4) & 0xF;
if (!payload->write && p_notify->aux_reply.length && - (payload->reply[0] == AUX_TRANSACTION_REPLY_AUX_ACK)) { - - if (payload->length != p_notify->aux_reply.length) { - DRM_WARN("invalid read length %d from DPIA AUX 0x%x(%d)!\n", - p_notify->aux_reply.length, - payload->address, payload->length); - *operation_result = AUX_RET_ERROR_INVALID_REPLY; - goto out; - } - + (payload->reply[0] == AUX_TRANSACTION_REPLY_AUX_ACK)) memcpy(payload->data, p_notify->aux_reply.data, p_notify->aux_reply.length); - }
/* success */ ret = p_notify->aux_reply.length;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wayne Lin Wayne.Lin@amd.com
commit 65924ec69b29296845c7f628112353438e63ea56 upstream.
[Why] We incorrectly report all bytes as written when the reply is actually a defer. A defer means the sink is not ready for the request, so we should retry the request.
[How] Only report all data as written when I2C_ACK|AUX_ACK is received. Otherwise, report the number of bytes actually written, as received from the sink. Add some messages to facilitate debugging as well.
Fixes: ad6756b4d773 ("drm/amd/display: Shift dc link aux to aux_payload") Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Reviewed-by: Ray Wu ray.wu@amd.com Signed-off-by: Wayne Lin Wayne.Lin@amd.com Signed-off-by: Ray Wu ray.wu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 3637e457eb0000bc37d8bbbec95964aad2fb29fd) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 28 ++++++++++-- 1 file changed, 24 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -51,6 +51,9 @@
#define PEAK_FACTOR_X1000 1006
+/* + * This function handles both native AUX and I2C-Over-AUX transactions. + */ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg) { @@ -87,15 +90,25 @@ static ssize_t dm_dp_aux_transfer(struct if (adev->dm.aux_hpd_discon_quirk) { if (msg->address == DP_SIDEBAND_MSG_DOWN_REQ_BASE && operation_result == AUX_RET_ERROR_HPD_DISCON) { - result = 0; + result = msg->size; operation_result = AUX_RET_SUCCESS; } }
- if (payload.write && result >= 0) - result = msg->size; + /* + * result equals to 0 includes the cases of AUX_DEFER/I2C_DEFER + */ + if (payload.write && result >= 0) { + if (result) { + /*one byte indicating partially written bytes. Force 0 to retry*/ + drm_info(adev_to_drm(adev), "amdgpu: AUX partially written\n"); + result = 0; + } else if (!payload.reply[0]) + /*I2C_ACK|AUX_ACK*/ + result = msg->size; + }
- if (result < 0) + if (result < 0) { switch (operation_result) { case AUX_RET_SUCCESS: break; @@ -114,6 +127,13 @@ static ssize_t dm_dp_aux_transfer(struct break; }
+ drm_info(adev_to_drm(adev), "amdgpu: DP AUX transfer fail:%d\n", operation_result); + } + + if (payload.reply[0]) + drm_info(adev_to_drm(adev), "amdgpu: AUX reply command not ACK: 0x%02x.", + payload.reply[0]); + return result; }
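For reviewers who want to reason about the new write-path result handling in dm_dp_aux_transfer() in isolation, it can be modelled in user-space C. This is only a sketch under assumptions: demo_aux_write_result() and its parameters are hypothetical names, and a reply value of 0 is assumed to mean I2C_ACK|AUX_ACK, as the comment in the diff suggests.

```c
#include <stddef.h>

/*
 * Hypothetical user-space model of the write path, not driver code.
 *   result: value returned by the lower-level transfer (< 0 on error)
 *   reply:  AUX reply command; 0 is assumed to mean I2C_ACK|AUX_ACK
 *   size:   number of bytes the caller asked to write
 */
static long demo_aux_write_result(long result, unsigned char reply, size_t size)
{
	if (result < 0)
		return result;		/* error: left for the error-mapping switch */
	if (result > 0)
		return 0;		/* partial write reported: force a retry */
	if (reply == 0)
		return (long)size;	/* ACK: all bytes were written */
	return 0;			/* DEFER/NACK: nothing written, caller retries */
}
```

In this model only an ACK with a zero result reports the full size; every other non-error case reports 0 bytes so that the caller retries.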
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wayne Lin Wayne.Lin@amd.com
commit 3924f45d4de7250a603fd7b50379237a6a0e5adf upstream.
[Why] amdgpu_dm_process_dmub_aux_transfer_sync() should return exactly the data replied by the sink. It should not analyze the reply itself.
[How] Remove unnecessary check condition AUX_TRANSACTION_REPLY_AUX_ACK.
Fixes: ead08b95fa50 ("drm/amd/display: Fix race condition in DPIA AUX transfer") Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Reviewed-by: Ray Wu ray.wu@amd.com Signed-off-by: Wayne Lin Wayne.Lin@amd.com Signed-off-by: Ray Wu ray.wu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 9b540e3fe6796fec4fb1344f3be8952fc2f084d4) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -12621,8 +12621,7 @@ int amdgpu_dm_process_dmub_aux_transfer_
 	/* The reply is stored in the top nibble of the command. */
 	payload->reply[0] = (adev->dm.dmub_notify->aux_reply.command >> 4) & 0xF;

-	if (!payload->write && p_notify->aux_reply.length &&
-			(payload->reply[0] == AUX_TRANSACTION_REPLY_AUX_ACK))
+	if (!payload->write && p_notify->aux_reply.length)
 		memcpy(payload->data, p_notify->aux_reply.data,
 				p_notify->aux_reply.length);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit f690e3974755a650259a45d71456decc9c96a282 upstream.
Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register.
Fixes: c9b8dcabb52a ("drm/amdgpu/hdp4.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov alexey.klimov@linaro.org Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 5c937b4a6050316af37ef214825b6340b5e9e391) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c @@ -42,7 +42,12 @@ static void hdp_v4_0_flush_hdp(struct am { if (!ring || !ring->funcs->emit_wreg) { WREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); - RREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + /* We just need to read back a register to post the write. + * Reading back the remapped register causes problems on + * some platforms so just read back the memory size register. + */ + if (adev->nbio.funcs->get_memsize) + adev->nbio.funcs->get_memsize(adev); } else { amdgpu_ring_emit_wreg(ring, (adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); }
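The pattern used here (write the flush register, then post the write by reading some other, harmless register through an optional callback) can be sketched in user-space C. Everything below is a hypothetical model: struct demo_dev, demo_flush_hdp() and the counters are made-up names, and no real MMIO is involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device; not the amdgpu structures. */
struct demo_dev {
	uint32_t flush_reg;		/* models the remapped HDP flush register */
	unsigned int memsize_reads;	/* how often the "safe" register was read */
	uint32_t (*get_memsize)(struct demo_dev *dev);	/* optional callback */
};

static uint32_t demo_get_memsize(struct demo_dev *dev)
{
	dev->memsize_reads++;
	return 0;
}

static void demo_flush_hdp(struct demo_dev *dev)
{
	dev->flush_reg = 0;		/* the flush write */
	/* Post the write by reading a different register, if one is provided. */
	if (dev->get_memsize)
		(void)dev->get_memsize(dev);
}

/* Helper for experiments: returns how many posting reads one flush issued. */
static unsigned int demo_flush_and_count(int has_callback)
{
	struct demo_dev dev = { 1, 0, has_callback ? demo_get_memsize : NULL };

	demo_flush_hdp(&dev);
	return dev.memsize_reads;
}
```

Note that, as in the patch, a missing callback means no read-back happens at all; the sketch makes that visible through the returned count.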
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit dbc988c689333faeeed44d5561f372ff20395304 upstream.
Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register.
Fixes: f756dbac1ce1 ("drm/amdgpu/hdp5.2: do a posting read when flushing HDP") Reported-by: Alexey Klimov alexey.klimov@linaro.org Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 4a89b7698e771914b4d5b571600c76e2fdcbe2a9) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c @@ -34,7 +34,17 @@ static void hdp_v5_2_flush_hdp(struct am if (!ring || !ring->funcs->emit_wreg) { WREG32_NO_KIQ((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); - RREG32_NO_KIQ((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + if (amdgpu_sriov_vf(adev)) { + /* this is fine because SR_IOV doesn't remap the register */ + RREG32_NO_KIQ((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + } else { + /* We just need to read back a register to post the write. + * Reading back the remapped register causes problems on + * some platforms so just read back the memory size register. + */ + if (adev->nbio.funcs->get_memsize) + adev->nbio.funcs->get_memsize(adev); + } } else { amdgpu_ring_emit_wreg(ring, (adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 0e33e0f339b91eecd9558311449a3d1e728722d4 upstream.
Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register.
Fixes: cf424020e040 ("drm/amdgpu/hdp5.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov alexey.klimov@linaro.org Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit a5cb344033c7598762e89255e8ff52827abb57a4) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/hdp_v5_0.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/hdp_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v5_0.c @@ -33,7 +33,12 @@ static void hdp_v5_0_flush_hdp(struct am { if (!ring || !ring->funcs->emit_wreg) { WREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); - RREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + /* We just need to read back a register to post the write. + * Reading back the remapped register causes problems on + * some platforms so just read back the memory size register. + */ + if (adev->nbio.funcs->get_memsize) + adev->nbio.funcs->get_memsize(adev); } else { amdgpu_ring_emit_wreg(ring, (adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit ca28e80abe4219c8f1a2961ae05102d70af6dc87 upstream.
Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register.
Fixes: abe1cbaec6cf ("drm/amdgpu/hdp6.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov alexey.klimov@linaro.org Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 84141ff615951359c9a99696fd79a36c465ed847) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/hdp_v6_0.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/hdp_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v6_0.c @@ -36,7 +36,12 @@ static void hdp_v6_0_flush_hdp(struct am { if (!ring || !ring->funcs->emit_wreg) { WREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); - RREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + /* We just need to read back a register to post the write. + * Reading back the remapped register causes problems on + * some platforms so just read back the memory size register. + */ + if (adev->nbio.funcs->get_memsize) + adev->nbio.funcs->get_memsize(adev); } else { amdgpu_ring_emit_wreg(ring, (adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 5a11a2767731139bf87e667331aa2209e33a1d19 upstream.
Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register.
Fixes: 689275140cb8 ("drm/amdgpu/hdp7.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov alexey.klimov@linaro.org Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit dbc064adfcf9095e7d895bea87b2f75c1ab23236) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/hdp_v7_0.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/hdp_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v7_0.c @@ -33,7 +33,12 @@ static void hdp_v7_0_flush_hdp(struct am { if (!ring || !ring->funcs->emit_wreg) { WREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); - RREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + /* We just need to read back a register to post the write. + * Reading back the remapped register causes problems on + * some platforms so just read back the memory size register. + */ + if (adev->nbio.funcs->get_memsize) + adev->nbio.funcs->get_memsize(adev); } else { amdgpu_ring_emit_wreg(ring, (adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathias Nyman mathias.nyman@linux.intel.com
commit cab63934c33b12c0d1e9f4da7450928057f2c142 upstream.
Event polling delay is set to 0 if there are any pending requests in either rx or tx requests lists. Checking for pending requests does not work well for "IN" transfers as the tty driver always queues requests to the list and TRBs to the ring, preparing to receive data from the host.
This causes unnecessary busylooping and cpu hogging.
Only set the event polling delay to 0 if there are pending tx "write" transfers, or if less than 10ms have passed since the last active data transfer in any direction.
Cc: Łukasz Bartosik ukaszb@chromium.org Fixes: fb18e5bb9660 ("xhci: dbc: poll at different rate depending on data transfer activity") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20250505125630.561699-3-mathias.nyman@linux.intel.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-dbgcap.c | 19 ++++++++++++++++--- drivers/usb/host/xhci-dbgcap.h | 3 +++ 2 files changed, 19 insertions(+), 3 deletions(-)
--- a/drivers/usb/host/xhci-dbgcap.c +++ b/drivers/usb/host/xhci-dbgcap.c @@ -823,6 +823,7 @@ static enum evtreturn xhci_dbc_do_handle { dma_addr_t deq; union xhci_trb *evt; + enum evtreturn ret = EVT_DONE; u32 ctrl, portsc; bool update_erdp = false;
@@ -909,6 +910,7 @@ static enum evtreturn xhci_dbc_do_handle break; case TRB_TYPE(TRB_TRANSFER): dbc_handle_xfer_event(dbc, evt); + ret = EVT_XFER_DONE; break; default: break; @@ -927,7 +929,7 @@ static enum evtreturn xhci_dbc_do_handle lo_hi_writeq(deq, &dbc->regs->erdp); }
- return EVT_DONE; + return ret; }
static void xhci_dbc_handle_events(struct work_struct *work) @@ -936,6 +938,7 @@ static void xhci_dbc_handle_events(struc struct xhci_dbc *dbc; unsigned long flags; unsigned int poll_interval; + unsigned long busypoll_timelimit;
dbc = container_of(to_delayed_work(work), struct xhci_dbc, event_work); poll_interval = dbc->poll_interval; @@ -954,11 +957,21 @@ static void xhci_dbc_handle_events(struc dbc->driver->disconnect(dbc); break; case EVT_DONE: - /* set fast poll rate if there are pending data transfers */ + /* + * Set fast poll rate if there are pending out transfers, or + * a transfer was recently processed + */ + busypoll_timelimit = dbc->xfer_timestamp + + msecs_to_jiffies(DBC_XFER_INACTIVITY_TIMEOUT); + if (!list_empty(&dbc->eps[BULK_OUT].list_pending) || - !list_empty(&dbc->eps[BULK_IN].list_pending)) + time_is_after_jiffies(busypoll_timelimit)) poll_interval = 0; break; + case EVT_XFER_DONE: + dbc->xfer_timestamp = jiffies; + poll_interval = 0; + break; default: dev_info(dbc->dev, "stop handling dbc events\n"); return; --- a/drivers/usb/host/xhci-dbgcap.h +++ b/drivers/usb/host/xhci-dbgcap.h @@ -96,6 +96,7 @@ struct dbc_ep { #define DBC_WRITE_BUF_SIZE 8192 #define DBC_POLL_INTERVAL_DEFAULT 64 /* milliseconds */ #define DBC_POLL_INTERVAL_MAX 5000 /* milliseconds */ +#define DBC_XFER_INACTIVITY_TIMEOUT 10 /* milliseconds */ /* * Private structure for DbC hardware state: */ @@ -142,6 +143,7 @@ struct xhci_dbc { enum dbc_state state; struct delayed_work event_work; unsigned int poll_interval; /* ms */ + unsigned long xfer_timestamp; unsigned resume_required:1; struct dbc_ep eps[2];
@@ -187,6 +189,7 @@ struct dbc_request { enum evtreturn { EVT_ERR = -1, EVT_DONE, + EVT_XFER_DONE, EVT_GSER, EVT_DISC, };
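The EVT_DONE decision after this patch can be modelled as a small pure function. This is a sketch only: demo_next_poll_interval() and the millisecond timestamps are hypothetical, whereas the driver works in jiffies and uses time_is_after_jiffies(). The unsigned subtraction is used here as an assumption-free way to stay correct across counter wraparound.

```c
#include <stdbool.h>

#define DEMO_XFER_INACTIVITY_TIMEOUT_MS 10UL	/* mirrors DBC_XFER_INACTIVITY_TIMEOUT */

/*
 * Hypothetical model of the EVT_DONE case: poll immediately (0) if there
 * are pending OUT transfers or a transfer completed less than 10 ms ago,
 * otherwise fall back to the regular poll interval.
 */
static unsigned int demo_next_poll_interval(unsigned int default_ms,
					    bool tx_pending,
					    unsigned long now_ms,
					    unsigned long last_xfer_ms)
{
	/* Unsigned subtraction gives the elapsed time even across wraparound. */
	bool recently_active =
		(now_ms - last_xfer_ms) < DEMO_XFER_INACTIVITY_TIMEOUT_MS;

	if (tx_pending || recently_active)
		return 0;
	return default_ms;
}
```

Pending IN requests no longer appear in the decision at all, which is what removes the busy loop described in the commit message.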
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Charkov alchark@gmail.com
commit a5c7973539b010874a37a0e846e62ac6f00553ba upstream.
Device tree bindings state that the clock is optional for UHCI platform controllers, and some existing device trees don't provide those - such as those for VIA/WonderMedia devices.
The driver however fails to probe now if no clock is provided, because devm_clk_get returns an error pointer in such case.
Switch to devm_clk_get_optional instead, so that the driver can probe again on those platforms where no clocks are given.
Cc: stable stable@kernel.org Fixes: 26c502701c52 ("usb: uhci: Add clk support to uhci-platform") Signed-off-by: Alexey Charkov alchark@gmail.com Link: https://lore.kernel.org/r/20250425-uhci-clock-optional-v1-1-a1d462592f29@gma... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/uhci-platform.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/host/uhci-platform.c
+++ b/drivers/usb/host/uhci-platform.c
@@ -121,7 +121,7 @@ static int uhci_hcd_platform_probe(struc
 	}

 	/* Get and enable clock if any specified */
-	uhci->clk = devm_clk_get(&pdev->dev, NULL);
+	uhci->clk = devm_clk_get_optional(&pdev->dev, NULL);
 	if (IS_ERR(uhci->clk)) {
 		ret = PTR_ERR(uhci->clk);
 		goto err_rmr;
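The difference between the two lookups can be illustrated with a user-space model of the error-pointer convention. This is a sketch with hypothetical demo_* names, not the clk framework itself; it assumes only the documented behaviour that the optional variant returns NULL instead of an -ENOENT error pointer when no clock is described.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

struct demo_clk { int id; };
static struct demo_clk demo_the_clk = { 1 };

/* Minimal user-space imitation of ERR_PTR()/IS_ERR()/PTR_ERR(). */
static void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-(intptr_t)DEMO_MAX_ERRNO;
}

/* Models devm_clk_get(): a missing clock is an error. */
static struct demo_clk *demo_clk_get(int present)
{
	return present ? &demo_the_clk : demo_err_ptr(-ENOENT);
}

/* Models devm_clk_get_optional(): a missing clock is NULL, not an error. */
static struct demo_clk *demo_clk_get_optional(int present)
{
	struct demo_clk *clk = demo_clk_get(present);

	if (demo_is_err(clk) && demo_ptr_err(clk) == -ENOENT)
		return NULL;
	return clk;
}

/* Models the probe check: fail only on a real error pointer. */
static int demo_probe(struct demo_clk *(*get)(int), int present)
{
	struct demo_clk *clk = get(present);

	if (demo_is_err(clk))
		return (int)demo_ptr_err(clk);
	return 0;
}
```

With the required lookup, a device tree without a clock makes the probe fail with -ENOENT; with the optional lookup the same probe succeeds and simply carries a NULL clock.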
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Aurich paul@darkrain42.org
commit 3ca02e63edccb78ef3659bebc68579c7224a6ca2 upstream.
A pre-existing valid cfid returned from find_or_create_cached_dir might race with a lease break, meaning open_cached_dir doesn't consider it valid, and thinks it's newly-constructed. This leaks a dentry reference if the allocation occurs before the queued lease break work runs.
Avoid the race by holding cfid_list_lock across the call to find_or_create_cached_dir and the check of its result.
Cc: stable@vger.kernel.org Reviewed-by: Henrique Carvalho henrique.carvalho@suse.com Signed-off-by: Paul Aurich paul@darkrain42.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/cached_dir.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-)
--- a/fs/smb/client/cached_dir.c +++ b/fs/smb/client/cached_dir.c @@ -29,7 +29,6 @@ static struct cached_fid *find_or_create { struct cached_fid *cfid;
- spin_lock(&cfids->cfid_list_lock); list_for_each_entry(cfid, &cfids->entries, entry) { if (!strcmp(cfid->path, path)) { /* @@ -38,25 +37,20 @@ static struct cached_fid *find_or_create * being deleted due to a lease break. */ if (!cfid->time || !cfid->has_lease) { - spin_unlock(&cfids->cfid_list_lock); return NULL; } kref_get(&cfid->refcount); - spin_unlock(&cfids->cfid_list_lock); return cfid; } } if (lookup_only) { - spin_unlock(&cfids->cfid_list_lock); return NULL; } if (cfids->num_entries >= max_cached_dirs) { - spin_unlock(&cfids->cfid_list_lock); return NULL; } cfid = init_cached_dir(path); if (cfid == NULL) { - spin_unlock(&cfids->cfid_list_lock); return NULL; } cfid->cfids = cfids; @@ -74,7 +68,6 @@ static struct cached_fid *find_or_create */ cfid->has_lease = true;
- spin_unlock(&cfids->cfid_list_lock); return cfid; }
@@ -187,8 +180,10 @@ replay_again: if (!utf16_path) return -ENOMEM;
+ spin_lock(&cfids->cfid_list_lock); cfid = find_or_create_cached_dir(cfids, path, lookup_only, tcon->max_cached_dirs); if (cfid == NULL) { + spin_unlock(&cfids->cfid_list_lock); kfree(utf16_path); return -ENOENT; } @@ -197,7 +192,6 @@ replay_again: * Otherwise, it is either a new entry or laundromat worker removed it * from @cfids->entries. Caller will put last reference if the latter. */ - spin_lock(&cfids->cfid_list_lock); if (cfid->has_lease && cfid->time) { spin_unlock(&cfids->cfid_list_lock); *ret_cfid = cfid;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Ernberg john.ernberg@actia.se
commit cd9c058489053e172a6654cad82ee936d1b09fab upstream.
Xen swiotlb support was missed when the patch set starting with 4ab5f8ec7d71 ("mm/slab: decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN") was merged.
When running Xen on iMX8QXP, a SoC without IOMMU, the effect was that USB transfers ended up corrupted when there was more than one URB inflight at the same time.
Add a call to dma_kmalloc_needs_bounce() to make sure that allocations too small for DMA get bounced via swiotlb.
Closes: https://lore.kernel.org/linux-usb/ab2776f0-b838-4cf6-a12a-c208eb6aad59@actia... Fixes: 4ab5f8ec7d71 ("mm/slab: decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN") Cc: stable@kernel.org # v6.5+ Signed-off-by: John Ernberg john.ernberg@actia.se Reviewed-by: Stefano Stabellini sstabellini@kernel.org Signed-off-by: Juergen Gross jgross@suse.com Message-ID: 20250502114043.1968976-2-john.ernberg@actia.se Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/swiotlb-xen.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -217,6 +217,7 @@ static dma_addr_t xen_swiotlb_map_page(s
 	 * buffering it.
 	 */
 	if (dma_capable(dev, dev_addr, size, true) &&
+	    !dma_kmalloc_needs_bounce(dev, size, dir) &&
 	    !range_straddles_page_boundary(phys, size) &&
 	    !xen_arch_need_swiotlb(dev, phys, dev_addr) &&
 	    !is_swiotlb_force_bounce(dev))
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Andryuk jason.andryuk@amd.com
commit 1f0304dfd9d217c2f8b04a9ef4b3258a66eedd27 upstream.
Marek reported seeing a NULL pointer fault in the xenbus_thread callstack:
BUG: kernel NULL pointer dereference, address: 0000000000000000
RIP: e030:__wake_up_common+0x4c/0x180
Call Trace:
 <TASK>
 __wake_up_common_lock+0x82/0xd0
 process_msg+0x18e/0x2f0
 xenbus_thread+0x165/0x1c0
process_msg+0x18e is req->cb(req). req->cb is set to xs_wake_up(), a thin wrapper around wake_up(), or xenbus_dev_queue_reply(). It seems like it was xs_wake_up() in this case.
It seems like the wake-up may have let xs_wait_for_reply() run, which kfree()ed the req. When xenbus_thread resumes, it faults on the zeroed data.
Linux Device Drivers 2nd edition states: "Normally, a wake_up call can cause an immediate reschedule to happen, meaning that other processes might run before wake_up returns." ... which would match the behaviour observed.
Change to keeping two kref references on each request: one for the caller, and one for xenbus_thread. Each will kref_put() when finished, and the last will free it.
This use of kref matches the description in Documentation/core-api/kref.rst
Link: https://lore.kernel.org/xen-devel/ZO0WrR5J0xuwDIxW@mail-itl/ Reported-by: Marek Marczykowski-Górecki marmarek@invisiblethingslab.com Fixes: fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") Cc: stable@vger.kernel.org Signed-off-by: Jason Andryuk jason.andryuk@amd.com Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Message-ID: 20250506210935.5607-1-jason.andryuk@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/xenbus/xenbus.h | 2 ++ drivers/xen/xenbus/xenbus_comms.c | 9 ++++----- drivers/xen/xenbus/xenbus_dev_frontend.c | 2 +- drivers/xen/xenbus/xenbus_xs.c | 18 ++++++++++++++++-- 4 files changed, 23 insertions(+), 8 deletions(-)
--- a/drivers/xen/xenbus/xenbus.h +++ b/drivers/xen/xenbus/xenbus.h @@ -77,6 +77,7 @@ enum xb_req_state { struct xb_req_data { struct list_head list; wait_queue_head_t wq; + struct kref kref; struct xsd_sockmsg msg; uint32_t caller_req_id; enum xsd_sockmsg_type type; @@ -103,6 +104,7 @@ int xb_init_comms(void); void xb_deinit_comms(void); int xs_watch_msg(struct xs_watch_event *event); void xs_request_exit(struct xb_req_data *req); +void xs_free_req(struct kref *kref);
int xenbus_match(struct device *_dev, const struct device_driver *_drv); int xenbus_dev_probe(struct device *_dev); --- a/drivers/xen/xenbus/xenbus_comms.c +++ b/drivers/xen/xenbus/xenbus_comms.c @@ -309,8 +309,8 @@ static int process_msg(void) virt_wmb(); req->state = xb_req_state_got_reply; req->cb(req); - } else - kfree(req); + } + kref_put(&req->kref, xs_free_req); }
mutex_unlock(&xs_response_mutex); @@ -386,14 +386,13 @@ static int process_writes(void) state.req->msg.type = XS_ERROR; state.req->err = err; list_del(&state.req->list); - if (state.req->state == xb_req_state_aborted) - kfree(state.req); - else { + if (state.req->state != xb_req_state_aborted) { /* write err, then update state */ virt_wmb(); state.req->state = xb_req_state_got_reply; wake_up(&state.req->wq); } + kref_put(&state.req->kref, xs_free_req);
mutex_unlock(&xb_write_mutex);
--- a/drivers/xen/xenbus/xenbus_dev_frontend.c +++ b/drivers/xen/xenbus/xenbus_dev_frontend.c @@ -406,7 +406,7 @@ void xenbus_dev_queue_reply(struct xb_re mutex_unlock(&u->reply_mutex);
kfree(req->body); - kfree(req); + kref_put(&req->kref, xs_free_req);
kref_put(&u->kref, xenbus_file_free);
--- a/drivers/xen/xenbus/xenbus_xs.c +++ b/drivers/xen/xenbus/xenbus_xs.c @@ -112,6 +112,12 @@ static void xs_suspend_exit(void) wake_up_all(&xs_state_enter_wq); }
+void xs_free_req(struct kref *kref) +{ + struct xb_req_data *req = container_of(kref, struct xb_req_data, kref); + kfree(req); +} + static uint32_t xs_request_enter(struct xb_req_data *req) { uint32_t rq_id; @@ -237,6 +243,12 @@ static void xs_send(struct xb_req_data * req->caller_req_id = req->msg.req_id; req->msg.req_id = xs_request_enter(req);
+ /* + * Take 2nd ref. One for this thread, and the second for the + * xenbus_thread. + */ + kref_get(&req->kref); + mutex_lock(&xb_write_mutex); list_add_tail(&req->list, &xb_write_list); notify = list_is_singular(&xb_write_list); @@ -261,8 +273,8 @@ static void *xs_wait_for_reply(struct xb if (req->state == xb_req_state_queued || req->state == xb_req_state_wait_reply) req->state = xb_req_state_aborted; - else - kfree(req); + + kref_put(&req->kref, xs_free_req); mutex_unlock(&xb_write_mutex);
return ret; @@ -291,6 +303,7 @@ int xenbus_dev_request_and_reply(struct req->cb = xenbus_dev_queue_reply; req->par = par; req->user_req = true; + kref_init(&req->kref);
xs_send(req, msg);
@@ -319,6 +332,7 @@ static void *xs_talkv(struct xenbus_tran req->num_vecs = num_vecs; req->cb = xs_wake_up; req->user_req = false; + kref_init(&req->kref);
msg.req_id = 0; msg.tx_id = t.id;
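The two-owner lifetime rule introduced here can be tried out in a small user-space model. This is a sketch only: it uses a C11 atomic counter in place of struct kref, runs both "owners" sequentially in one thread, and all demo_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct xb_req_data with an embedded refcount. */
struct demo_req {
	atomic_int refs;
	bool *freed;	/* set when the last reference is dropped */
};

static void demo_req_put(struct demo_req *req)
{
	/* atomic_fetch_sub() returns the old value; 1 means we held the last ref. */
	if (atomic_fetch_sub(&req->refs, 1) == 1) {
		*req->freed = true;
		free(req);
	}
}

/*
 * Returns 1 if the request survived the first put and was freed by the
 * second, 0 otherwise, and -1 if the allocation failed.
 */
static int demo_two_owner_lifetime(void)
{
	bool freed = false;
	bool alive_after_first;
	struct demo_req *req = malloc(sizeof(*req));

	if (!req)
		return -1;
	atomic_init(&req->refs, 1);		/* like kref_init(): caller's ref */
	req->freed = &freed;
	atomic_fetch_add(&req->refs, 1);	/* like kref_get(): worker's ref */

	demo_req_put(req);			/* one owner finishes first */
	alive_after_first = !freed;
	demo_req_put(req);			/* the other owner finishes */

	return (alive_after_first && freed) ? 1 : 0;
}
```

Whichever side finishes first, the object stays valid for the other one, which is the property the wake_up() path in process_msg() was missing.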
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com
commit c4eb2f88d2796ab90c5430e11c48709716181364 upstream.
Increase JMS message state dump command timeout to 100 ms. On some platforms, the FW may take a bit longer than 50 ms to dump its state to the log buffer and we don't want to miss any debug info during TDR.
Fixes: 5e162f872d7a ("accel/ivpu: Add FW state dump on TDR") Cc: stable@vger.kernel.org # v6.13+ Reviewed-by: Jeff Hugo jeff.hugo@oss.qualcomm.com Signed-off-by: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com Link: https://lore.kernel.org/r/20250425092822.2194465-1-jacek.lawrynowicz@linux.i... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/accel/ivpu/ivpu_hw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/accel/ivpu/ivpu_hw.c
+++ b/drivers/accel/ivpu/ivpu_hw.c
@@ -106,7 +106,7 @@ static void timeouts_init(struct ivpu_de
 		else
 			vdev->timeout.autosuspend = 100;
 		vdev->timeout.d0i3_entry_msg = 5;
-		vdev->timeout.state_dump_msg = 10;
+		vdev->timeout.state_dump_msg = 100;
 	}
 }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yeoreum Yun yeoreum.yun@arm.com
commit 363cd2b81cfdf706bbfc9ec78db000c9b1ecc552 upstream.
The PTE_MAYBE_NG macro sets the nG page table bit according to the value of "arm64_use_ng_mappings". This variable is currently placed in the .bss section. create_init_idmap() is called before the .bss section initialisation which is done in early_map_kernel(). Therefore, data/test_prot in create_init_idmap() could be set incorrectly through the PAGE_KERNEL -> PROT_DEFAULT -> PTE_MAYBE_NG macros.
# llvm-objdump-21 --syms vmlinux-gcc | grep arm64_use_ng_mappings
ffff800082f242a8 g O .bss 0000000000000001 arm64_use_ng_mappings
The create_init_idmap() function disassembly compiled with llvm-21:
  // create_init_idmap()
  ffff80008255c058: d10103ff   sub   sp, sp, #0x40
  ffff80008255c05c: a9017bfd   stp   x29, x30, [sp, #0x10]
  ffff80008255c060: a90257f6   stp   x22, x21, [sp, #0x20]
  ffff80008255c064: a9034ff4   stp   x20, x19, [sp, #0x30]
  ffff80008255c068: 910043fd   add   x29, sp, #0x10
  ffff80008255c06c: 90003fc8   adrp  x8, 0xffff800082d54000
  ffff80008255c070: d280e06a   mov   x10, #0x703 // =1795
  ffff80008255c074: 91400409   add   x9, x0, #0x1, lsl #12 // =0x1000
  ffff80008255c078: 394a4108   ldrb  w8, [x8, #0x290] ------------- (1)
  ffff80008255c07c: f2e00d0a   movk  x10, #0x68, lsl #48
  ffff80008255c080: f90007e9   str   x9, [sp, #0x8]
  ffff80008255c084: aa0103f3   mov   x19, x1
  ffff80008255c088: aa0003f4   mov   x20, x0
  ffff80008255c08c: 14000000   b     0xffff80008255c08c <__pi_create_init_idmap+0x34>
  ffff80008255c090: aa082d56   orr   x22, x10, x8, lsl #11 -------- (2)
Note that (1) loads the arm64_use_ng_mappings value into w8 and (2) sets the text or data prot with the w8 value to set the PTE_NG bit. If the .bss section isn't initialized, x8 could hold a garbage value and generate an incorrect mapping.
Annotate arm64_use_ng_mappings as __read_mostly so that it is placed in the .data section.
Fixes: 84b04d3e6bdb ("arm64: kernel: Create initial ID map from C code") Cc: stable@vger.kernel.org # 6.9.x Tested-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Yeoreum Yun yeoreum.yun@arm.com Link: https://lore.kernel.org/r/20250502180412.3774883-1-yeoreum.yun@arm.com [catalin.marinas@arm.com: use __read_mostly instead of __ro_after_init] [catalin.marinas@arm.com: slight tweaking of the code comment] Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/cpufeature.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -113,7 +113,14 @@ static struct arm64_cpu_capabilities con
DECLARE_BITMAP(boot_cpucaps, ARM64_NCAPS);
-bool arm64_use_ng_mappings = false; +/* + * arm64_use_ng_mappings must be placed in the .data section, otherwise it + * ends up in the .bss section where it is initialized in early_map_kernel() + * after the MMU (with the idmap) was enabled. create_init_idmap() - which + * runs before early_map_kernel() and reads the variable via PTE_MAYBE_NG - + * may end up generating an incorrect idmap page table attributes. + */ +bool arm64_use_ng_mappings __read_mostly = false; EXPORT_SYMBOL(arm64_use_ng_mappings);
DEFINE_PER_CPU_READ_MOSTLY(const char *, this_cpu_vector) = vectors;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sebastian Andrzej Siewior bigeasy@linutronix.de
commit 94cff94634e506a4a44684bee1875d2dbf782722 upstream.
On x86 during boot, clockevent_i8253_disable() can be invoked via x86_late_time_init -> hpet_time_init() -> pit_timer_init() which happens with enabled interrupts.
If some of the old i8253 hardware is actually used then lockdep will notice that i8253_lock is used in hard interrupt context. This causes lockdep to complain because it observed the lock being acquired with interrupts enabled and in hard interrupt context.
Make clockevent_i8253_disable() acquire the lock with raw_spinlock_irqsave() to cure this.
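The scope-based locking that guard() provides can be sketched in userspace with the compiler's cleanup attribute. This is only an illustration under stated assumptions: struct fake_lock, fake_lock_irqsave(), fake_unlock_irqrestore(), SCOPED_LOCK_IRQSAVE() and disable_timer() are hypothetical stand-ins invented here, not the kernel's raw spinlock or guard() implementation, and the IRQ state is merely modelled by a flag.

```c
#include <assert.h>

/* Hypothetical stand-in for a raw spinlock plus the saved IRQ state. */
struct fake_lock {
	int held;
	int irqs_off;
};

static void fake_lock_irqsave(struct fake_lock *l)
{
	assert(!l->held);	/* no recursive locking in this sketch */
	l->irqs_off = 1;	/* models disabling local interrupts */
	l->held = 1;
}

/* Cleanup handlers receive a pointer to the guarded variable. */
static void fake_unlock_irqrestore(struct fake_lock **lp)
{
	(*lp)->held = 0;
	(*lp)->irqs_off = 0;	/* models restoring the saved IRQ state */
}

/* Take the lock now, release it when the enclosing scope is left. */
#define SCOPED_LOCK_IRQSAVE(l)						\
	struct fake_lock *scope_guard					\
		__attribute__((cleanup(fake_unlock_irqrestore), unused)) = \
		(fake_lock_irqsave(l), (l))

static int disable_timer(struct fake_lock *l, int bail_early)
{
	SCOPED_LOCK_IRQSAVE(l);

	if (bail_early)
		return -1;	/* lock is dropped on this path ... */
	return 0;		/* ... and on this one, with no explicit unlock */
}

/* Returns 1 if the lock is still held after disable_timer() returns. */
static int lock_leaked(int bail_early)
{
	struct fake_lock l = { 0, 0 };

	disable_timer(&l, bail_early);
	return l.held;
}
```

Because the release is tied to scope exit, the explicit unlock at the end of the function can be removed, which is what the diff below does.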
[ tglx: Massage change log and use guard() ]
Fixes: c8c4076723dac ("x86/timer: Skip PIT initialization on modern chipsets") Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20250404133116.p-XRWJXf@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clocksource/i8253.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/clocksource/i8253.c +++ b/drivers/clocksource/i8253.c @@ -103,7 +103,7 @@ int __init clocksource_i8253_init(void) #ifdef CONFIG_CLKEVT_I8253 void clockevent_i8253_disable(void) { - raw_spin_lock(&i8253_lock); + guard(raw_spinlock_irqsave)(&i8253_lock);
/* * Writing the MODE register should stop the counter, according to @@ -132,8 +132,6 @@ void clockevent_i8253_disable(void) outb_p(0, PIT_CH0);
outb_p(0x30, PIT_MODE); - - raw_spin_unlock(&i8253_lock); }
static int pit_shutdown(struct clock_event_device *evt)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sebastian Ott sebott@redhat.com
commit 157dbc4a321f5bb6f8b6c724d12ba720a90f1a7c upstream.
Commit fce886a60207 ("KVM: arm64: Plumb the pKVM MMU in KVM") made the initialization of the local memcache variable in user_mem_abort() conditional, leaving a codepath where it is used uninitialized via kvm_pgtable_stage2_map().
This can fail on any path that requires a stage-2 allocation without transition via a permission fault or dirty logging.
Fix this by making sure that memcache is always valid.
Fixes: fce886a60207 ("KVM: arm64: Plumb the pKVM MMU in KVM") Signed-off-by: Sebastian Ott sebott@redhat.com Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/kvmarm/3f5db4c7-ccce-fb95-595c-692fa7aad227@redhat.c... Link: https://lore.kernel.org/r/20250505173148.33900-1-sebott@redhat.com Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/mmu.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1489,6 +1489,11 @@ static int user_mem_abort(struct kvm_vcp return -EFAULT; }
+ if (!is_protected_kvm_enabled()) + memcache = &vcpu->arch.mmu_page_cache; + else + memcache = &vcpu->arch.pkvm_memcache; + /* * Permission faults just need to update the existing leaf entry, * and so normally don't require allocations from the memcache. The @@ -1498,13 +1503,11 @@ static int user_mem_abort(struct kvm_vcp if (!fault_is_perm || (logging_active && write_fault)) { int min_pages = kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu);
- if (!is_protected_kvm_enabled()) { - memcache = &vcpu->arch.mmu_page_cache; + if (!is_protected_kvm_enabled()) ret = kvm_mmu_topup_memory_cache(memcache, min_pages); - } else { - memcache = &vcpu->arch.pkvm_memcache; + else ret = topup_hyp_memcache(memcache, min_pages); - } + if (ret) return ret; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit da8bf5daa5e55a6af2b285ecda460d6454712ff4 upstream.
When increasing the array size in memblock_double_array() and the slab is not yet available, a call to memblock_find_in_range() is used to reserve/allocate memory. However, the range returned may not have been accepted, which can result in a crash when booting an SNP guest:
  RIP: 0010:memcpy_orig+0x68/0x130
  Code: ...
  RSP: 0000:ffffffff9cc03ce8 EFLAGS: 00010006
  RAX: ff11001ff83e5000 RBX: 0000000000000000 RCX: fffffffffffff000
  RDX: 0000000000000bc0 RSI: ffffffff9dba8860 RDI: ff11001ff83e5c00
  RBP: 0000000000002000 R08: 0000000000000000 R09: 0000000000002000
  R10: 000000207fffe000 R11: 0000040000000000 R12: ffffffff9d06ef78
  R13: ff11001ff83e5000 R14: ffffffff9dba7c60 R15: 0000000000000c00
  memblock_double_array+0xff/0x310
  memblock_add_range+0x1fb/0x2f0
  memblock_reserve+0x4f/0xa0
  memblock_alloc_range_nid+0xac/0x130
  memblock_alloc_internal+0x53/0xc0
  memblock_alloc_try_nid+0x3d/0xa0
  swiotlb_init_remap+0x149/0x2f0
  mem_init+0xb/0xb0
  mm_core_init+0x8f/0x350
  start_kernel+0x17e/0x5d0
  x86_64_start_reservations+0x14/0x30
  x86_64_start_kernel+0x92/0xa0
  secondary_startup_64_no_verify+0x194/0x19b
Mitigate this by calling accept_memory() on the memory range returned before the slab is available.
Prior to v6.12, the accept_memory() interface used a 'start' and 'end' parameter instead of 'start' and 'size', therefore the accept_memory() call must be adjusted to specify 'start + size' for 'end' when applying to kernels prior to v6.12.
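The argument adjustment described above can be illustrated with a small userspace model. This is a sketch only: accept_start_size() and accept_start_end() are hypothetical stand-ins that just record the range they were asked to accept; they are not the kernel's accept_memory().

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

struct accepted_range {
	phys_addr_t start;
	phys_addr_t end;	/* exclusive */
};

/* Models the v6.12+ calling convention: (start, size). */
static struct accepted_range accept_start_size(phys_addr_t start, uint64_t size)
{
	struct accepted_range r = { start, start + size };

	assert(start + size >= start);	/* no wraparound in this sketch */
	return r;
}

/* Models the pre-v6.12 calling convention: (start, end). */
static struct accepted_range accept_start_end(phys_addr_t start, phys_addr_t end)
{
	struct accepted_range r = { start, end };

	return r;
}

/* A backport passes 'start + size' as the second argument. */
static int backport_matches(phys_addr_t addr, uint64_t size)
{
	struct accepted_range a = accept_start_size(addr, size);
	struct accepted_range b = accept_start_end(addr, addr + size);

	return a.start == b.start && a.end == b.end;
}

/* Passing the size unchanged to the old interface covers the wrong range. */
static int naive_backport_matches(phys_addr_t addr, uint64_t size)
{
	struct accepted_range a = accept_start_size(addr, size);
	struct accepted_range b = accept_start_end(addr, size);

	return a.start == b.start && a.end == b.end;
}
```

For an address of 0x100000 and a size of 0x2000, only the adjusted call describes the same range as the newer interface.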
Cc: stable@vger.kernel.org # see patch description, needs adjustments for <= 6.11 Fixes: dcdfdd40fa82 ("mm: Add support for unaccepted memory") Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Link: https://lore.kernel.org/r/da1ac73bf4ded761e21b4e4bb5178382a580cd73.174672505... Signed-off-by: Mike Rapoport (Microsoft) rppt@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memblock.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/mm/memblock.c +++ b/mm/memblock.c @@ -456,7 +456,14 @@ static int __init_memblock memblock_doub min(new_area_start, memblock.current_limit), new_alloc_size, PAGE_SIZE);
- new_array = addr ? __va(addr) : NULL; + if (addr) { + /* The memory may not have been accepted, yet. */ + accept_memory(addr, new_alloc_size); + + new_array = __va(addr); + } else { + new_array = NULL; + } } if (!addr) { pr_err("memblock: Failed to double %s array from %ld to %ld entries !\n",
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
commit a6aeb739974ec73e5217c75a7c008a688d3d5cf1 upstream.
In 'lookup_or_create_module_kobject()', an internal kobject is created using 'module_ktype'. A call to 'kobject_put()' on the error handling path therefore causes an attempt to use an uninitialized completion pointer in 'module_kobj_release()'. In this scenario, we just want to release the kobject without the extra synchronization required for the regular module unloading process, so adding an extra check whether 'complete()' is actually required makes 'kobject_put()' safe.
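A minimal userspace sketch of the guarded release, assuming hypothetical stand-in types (completion_stub, mkobj_stub) rather than the kernel's struct completion and struct module_kobject:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct completion_stub {
	int done;
};

struct mkobj_stub {
	struct completion_stub *kobj_completion;	/* NULL if nobody waits */
};

static void complete_stub(struct completion_stub *c)
{
	assert(c != NULL);	/* an unset pointer here is the reported bug */
	c->done++;
}

/* Release callback: only signal when a waiter registered a completion. */
static void mkobj_release(struct mkobj_stub *mk)
{
	if (mk->kobj_completion)
		complete_stub(mk->kobj_completion);
}

/* Returns how many times the completion was signalled. */
static int release_signals(int with_waiter)
{
	struct completion_stub c = { 0 };
	struct mkobj_stub mk = { with_waiter ? &c : NULL };

	mkobj_release(&mk);
	return c.done;
}
```

The error path corresponds to release_signals(0): no waiter exists, so the release must not touch the completion at all.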
Reported-by: syzbot+7fb8a372e1f6add936dd@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7fb8a372e1f6add936dd Fixes: 942e443127e9 ("module: Fix mod->mkobj.kobj potentially freed too early") Cc: stable@vger.kernel.org Suggested-by: Petr Pavlu petr.pavlu@suse.com Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Link: https://lore.kernel.org/r/20250507065044.86529-1-dmantipov@yandex.ru Signed-off-by: Petr Pavlu petr.pavlu@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/params.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/kernel/params.c +++ b/kernel/params.c @@ -949,7 +949,9 @@ struct kset *module_kset; static void module_kobj_release(struct kobject *kobj) { struct module_kobject *mk = to_module_kobject(kobj); - complete(mk->kobj_completion); + + if (mk->kobj_completion) + complete(mk->kobj_completion); }
const struct kobj_type module_ktype = {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov (AMD) bp@alien8.de
commit 5214a9f6c0f56644acb9d2cbb58facf1856d322b upstream.
Consolidate the whole logic which determines whether the microcode loader should be enabled or not into a single function and call it everywhere.
Well, almost everywhere - not in mk_early_pgtbl_32() because there the kernel is running without paging enabled and checking dis_ucode_ldr et al would require physical addresses and uglification of the code.
But since this is 32-bit, the easier thing to do is to simply map the initrd unconditionally especially since that mapping is getting removed later anyway by zap_early_initrd_mapping() and avoid the uglification.
In doing so, address the issue of old 486 machines without CPUID support not booting current kernels.
[ mingo: Fix no previous prototype for ‘microcode_loader_disabled’ [-Wmissing-prototypes] ]
Fixes: 4c585af7180c1 ("x86/boot/32: Temporarily map initrd for microcode loading") Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@kernel.org Link: https://lore.kernel.org/r/CANpbe9Wm3z8fy9HbgS8cuhoj0TREYEEkBipDuhgkWFvqX0UoV... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/microcode.h | 2 + arch/x86/kernel/cpu/microcode/amd.c | 6 ++- arch/x86/kernel/cpu/microcode/core.c | 58 ++++++++++++++++++------------- arch/x86/kernel/cpu/microcode/intel.c | 2 - arch/x86/kernel/cpu/microcode/internal.h | 1 arch/x86/kernel/head32.c | 4 -- 6 files changed, 41 insertions(+), 32 deletions(-)
--- a/arch/x86/include/asm/microcode.h +++ b/arch/x86/include/asm/microcode.h @@ -17,10 +17,12 @@ struct ucode_cpu_info { void load_ucode_bsp(void); void load_ucode_ap(void); void microcode_bsp_resume(void); +bool __init microcode_loader_disabled(void); #else static inline void load_ucode_bsp(void) { } static inline void load_ucode_ap(void) { } static inline void microcode_bsp_resume(void) { } +static inline bool __init microcode_loader_disabled(void) { return false; } #endif
extern unsigned long initrd_start_early; --- a/arch/x86/kernel/cpu/microcode/amd.c +++ b/arch/x86/kernel/cpu/microcode/amd.c @@ -1098,15 +1098,17 @@ static enum ucode_state load_microcode_a
static int __init save_microcode_in_initrd(void) { - unsigned int cpuid_1_eax = native_cpuid_eax(1); struct cpuinfo_x86 *c = &boot_cpu_data; struct cont_desc desc = { 0 }; + unsigned int cpuid_1_eax; enum ucode_state ret; struct cpio_data cp;
- if (dis_ucode_ldr || c->x86_vendor != X86_VENDOR_AMD || c->x86 < 0x10) + if (microcode_loader_disabled() || c->x86_vendor != X86_VENDOR_AMD || c->x86 < 0x10) return 0;
+ cpuid_1_eax = native_cpuid_eax(1); + if (!find_blobs_in_containers(&cp)) return -EINVAL;
--- a/arch/x86/kernel/cpu/microcode/core.c +++ b/arch/x86/kernel/cpu/microcode/core.c @@ -41,8 +41,8 @@
#include "internal.h"
-static struct microcode_ops *microcode_ops; -bool dis_ucode_ldr = true; +static struct microcode_ops *microcode_ops; +static bool dis_ucode_ldr = false;
bool force_minrev = IS_ENABLED(CONFIG_MICROCODE_LATE_FORCE_MINREV); module_param(force_minrev, bool, S_IRUSR | S_IWUSR); @@ -84,6 +84,9 @@ static bool amd_check_current_patch_leve u32 lvl, dummy, i; u32 *levels;
+ if (x86_cpuid_vendor() != X86_VENDOR_AMD) + return false; + native_rdmsr(MSR_AMD64_PATCH_LEVEL, lvl, dummy);
levels = final_levels; @@ -95,27 +98,29 @@ static bool amd_check_current_patch_leve return false; }
-static bool __init check_loader_disabled_bsp(void) +bool __init microcode_loader_disabled(void) { - static const char *__dis_opt_str = "dis_ucode_ldr"; - const char *cmdline = boot_command_line; - const char *option = __dis_opt_str; + if (dis_ucode_ldr) + return true;
/* - * CPUID(1).ECX[31]: reserved for hypervisor use. This is still not - * completely accurate as xen pv guests don't see that CPUID bit set but - * that's good enough as they don't land on the BSP path anyway. + * Disable when: + * + * 1) The CPU does not support CPUID. + * + * 2) Bit 31 in CPUID[1]:ECX is clear + * The bit is reserved for hypervisor use. This is still not + * completely accurate as XEN PV guests don't see that CPUID bit + * set, but that's good enough as they don't land on the BSP + * path anyway. + * + * 3) Certain AMD patch levels are not allowed to be + * overwritten. */ - if (native_cpuid_ecx(1) & BIT(31)) - return true; - - if (x86_cpuid_vendor() == X86_VENDOR_AMD) { - if (amd_check_current_patch_level()) - return true; - } - - if (cmdline_find_option_bool(cmdline, option) <= 0) - dis_ucode_ldr = false; + if (!have_cpuid_p() || + native_cpuid_ecx(1) & BIT(31) || + amd_check_current_patch_level()) + dis_ucode_ldr = true;
return dis_ucode_ldr; } @@ -125,7 +130,10 @@ void __init load_ucode_bsp(void) unsigned int cpuid_1_eax; bool intel = true;
- if (!have_cpuid_p()) + if (cmdline_find_option_bool(boot_command_line, "dis_ucode_ldr") > 0) + dis_ucode_ldr = true; + + if (microcode_loader_disabled()) return;
cpuid_1_eax = native_cpuid_eax(1); @@ -146,9 +154,6 @@ void __init load_ucode_bsp(void) return; }
- if (check_loader_disabled_bsp()) - return; - if (intel) load_ucode_intel_bsp(&early_data); else @@ -159,6 +164,11 @@ void load_ucode_ap(void) { unsigned int cpuid_1_eax;
+ /* + * Can't use microcode_loader_disabled() here - .init section + * hell. It doesn't have to either - the BSP variant must've + * parsed cmdline already anyway. + */ if (dis_ucode_ldr) return;
@@ -810,7 +820,7 @@ static int __init microcode_init(void) struct cpuinfo_x86 *c = &boot_cpu_data; int error;
- if (dis_ucode_ldr) + if (microcode_loader_disabled()) return -EINVAL;
if (c->x86_vendor == X86_VENDOR_INTEL) --- a/arch/x86/kernel/cpu/microcode/intel.c +++ b/arch/x86/kernel/cpu/microcode/intel.c @@ -389,7 +389,7 @@ static int __init save_builtin_microcode if (xchg(&ucode_patch_va, NULL) != UCODE_BSP_LOADED) return 0;
- if (dis_ucode_ldr || boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) + if (microcode_loader_disabled() || boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) return 0;
uci.mc = get_microcode_blob(&uci, true); --- a/arch/x86/kernel/cpu/microcode/internal.h +++ b/arch/x86/kernel/cpu/microcode/internal.h @@ -94,7 +94,6 @@ static inline unsigned int x86_cpuid_fam return x86_family(eax); }
-extern bool dis_ucode_ldr; extern bool force_minrev;
#ifdef CONFIG_CPU_SUP_AMD --- a/arch/x86/kernel/head32.c +++ b/arch/x86/kernel/head32.c @@ -145,10 +145,6 @@ void __init __no_stack_protector mk_earl *ptr = (unsigned long)ptep + PAGE_OFFSET;
#ifdef CONFIG_MICROCODE_INITRD32 - /* Running on a hypervisor? */ - if (native_cpuid_ecx(1) & BIT(31)) - return; - params = (struct boot_params *)__pa_nodebug(&boot_params); if (!params->hdr.ramdisk_size || !params->hdr.ramdisk_image) return;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Tinguely mark.tinguely@oracle.com
commit 31d4cd4eb2f8d9b87ebfa6a5e443a59e3b3d7b8c upstream.
commit 7e119cff9d0a ("ocfs2: convert w_pages to w_folios") and commit 9a5e08652dc4b ("ocfs2: use an array of folios instead of an array of pages") save -ENOMEM in the folio array upon allocation failure and call the folio array free code.
The folio array free code expects either valid folio pointers or NULL. Finding -ENOMEM there results in a panic. Fix this by setting the failed folio entry to NULL.
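The failure mode can be modelled in userspace as follows. This is a sketch under stated assumptions: err_ptr() and is_err() imitate the kernel's ERR_PTR()/IS_ERR() convention, the slots are plain malloc() buffers rather than folios, and grab_then_release() with its fail_at parameter is a hypothetical helper used only to simulate an allocation failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace imitation of the ERR_PTR()/IS_ERR() convention. */
static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Free path: expects every slot to be a valid pointer or NULL. */
static int release_all(void **slots, int n)
{
	int freed = 0;

	for (int i = 0; i < n; i++) {
		if (!slots[i])
			continue;
		assert(!is_err(slots[i]));	/* freeing an error value would crash */
		free(slots[i]);
		slots[i] = NULL;
		freed++;
	}
	return freed;
}

/* Fill up to n slots, simulate a failure at index fail_at, then clean up. */
static int grab_then_release(int n, int fail_at)
{
	void *slots[8] = { NULL };

	assert(n >= 0 && n <= 8);
	for (int i = 0; i < n; i++) {
		slots[i] = (i == fail_at) ? err_ptr(-ENOMEM) : malloc(16);
		if (!slots[i])
			break;			/* genuine malloc() failure */
		if (is_err(slots[i])) {
			slots[i] = NULL;	/* the fix: never leave the error value behind */
			break;
		}
	}
	return release_all(slots, 8);
}
```

Without the NULL assignment, release_all() would trip over the encoded error value in the failed slot.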
Link: https://lkml.kernel.org/r/c879a52b-835c-4fa0-902b-8b2e9196dcbd@oracle.com Fixes: 7e119cff9d0a ("ocfs2: convert w_pages to w_folios") Fixes: 9a5e08652dc4b ("ocfs2: use an array of folios instead of an array of pages") Signed-off-by: Mark Tinguely mark.tinguely@oracle.com Reviewed-by: Matthew Wilcox (Oracle) willy@infradead.org Cc: Changwei Ge gechangwei@live.cn Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Mark Fasheh mark@fasheh.com Cc: Nathan Chancellor nathan@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/alloc.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/ocfs2/alloc.c +++ b/fs/ocfs2/alloc.c @@ -6918,6 +6918,7 @@ static int ocfs2_grab_folios(struct inod if (IS_ERR(folios[numfolios])) { ret = PTR_ERR(folios[numfolios]); mlog_errno(ret); + folios[numfolios] = NULL; goto out; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heming Zhao heming.zhao@suse.com
commit bd1261b16d9131d79723d982d54295e7f309797a upstream.
commit 4eb7b93e0310 ("ocfs2: improve write IO performance when fragmentation is high") introduced another regression.
The following ocfs2-test case can trigger this issue:
  discontig_runner.sh => activate_discontig_bg.sh => resv_unwritten:
  ${RESV_UNWRITTEN_BIN} -f ${WORK_PLACE}/large_testfile -s 0 -l \
  $((${FILE_MAJOR_SIZE_M}*1024*1024))
In my env, test disk size (by "fdisk -l <dev>"):
53687091200 bytes, 104857600 sectors.
Above command is:
  /usr/local/ocfs2-test/bin/resv_unwritten -f \
  /mnt/ocfs2/ocfs2-activate-discontig-bg-dir/large_testfile -s 0 -l \
  53187969024
Error log:
  [*] Reserve 50724M space for a LARGE file, reserve 200M space for future test.
  ioctl error 28: "No space left on device"
  resv allocation failed Unknown error -1
  reserve unwritten region from 0 to 53187969024.
Call flow:
__ocfs2_change_file_space //by ioctl OCFS2_IOC_RESVSP64
  ocfs2_allocate_unwritten_extents //start:0 len:53187969024
   while()
    + ocfs2_get_clusters //cpos:0, alloc_size:1623168 (cluster number)
    + ocfs2_extend_allocation
      + ocfs2_lock_allocators
      |  + choose OCFS2_AC_USE_MAIN & ocfs2_cluster_group_search
      |
      + ocfs2_add_inode_data
         ocfs2_add_clusters_in_btree
          __ocfs2_claim_clusters
           ocfs2_claim_suballoc_bits
           + During the allocation of the final part of the large file
             (after ~47GB), no chain had the required contiguous
             bits_wanted. Consequently, the allocation failed.
How to fix: When OCFS2 encounters fragmented allocation, the file system should stop attempting a contiguous allocation of bits_wanted bits and instead provide the largest available run of contiguous free bits from the cluster groups.
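The fallback can be sketched in userspace as a scan for the longest free run followed by a clamp of the request. This is only an illustration: max_contig_free_bits() and clamp_bits_wanted() are hypothetical helpers written for this sketch, and the bitmap layout (bit i set means cluster i is in use, least significant bit first) is an assumption, not the on-disk OCFS2 format. The clamp condition mirrors the one added to ocfs2_search_chain() in the diff below.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the longest run of clear (free) bits in the first total_bits bits. */
static unsigned int max_contig_free_bits(const unsigned char *bitmap,
					 unsigned int total_bits)
{
	unsigned int best = 0, run = 0;

	assert(bitmap != NULL);
	for (unsigned int i = 0; i < total_bits; i++) {
		int used = (bitmap[i / 8] >> (i % 8)) & 1;

		run = used ? 0 : run + 1;
		if (run > best)
			best = run;
	}
	return best;
}

/*
 * Shrink the request to the largest free run, but only if that run still
 * satisfies the caller's minimum; otherwise leave the request untouched
 * so the search reports failure as before.
 */
static unsigned int clamp_bits_wanted(unsigned int bits_wanted,
				      unsigned int min_bits,
				      unsigned int contig_bits)
{
	if (bits_wanted > contig_bits && contig_bits >= min_bits)
		return contig_bits;
	return bits_wanted;
}
```

With a 16-bit group whose bits 0-3 and 15 are in use, the longest free run is 11 bits, so a request for 16 bits with a minimum of 1 is reduced to 11 instead of failing.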
Link: https://lkml.kernel.org/r/20250414060125.19938-2-heming.zhao@suse.com Fixes: 4eb7b93e0310 ("ocfs2: improve write IO performance when fragmentation is high") Signed-off-by: Heming Zhao heming.zhao@suse.com Reported-by: Gautham Ananthakrishna gautham.ananthakrishna@oracle.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Jun Piao piaojun@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/suballoc.c | 38 ++++++++++++++++++++++++++++++++------ fs/ocfs2/suballoc.h | 1 + 2 files changed, 33 insertions(+), 6 deletions(-)
--- a/fs/ocfs2/suballoc.c +++ b/fs/ocfs2/suballoc.c @@ -698,10 +698,12 @@ static int ocfs2_block_group_alloc(struc
bg_bh = ocfs2_block_group_alloc_contig(osb, handle, alloc_inode, ac, cl); - if (PTR_ERR(bg_bh) == -ENOSPC) + if (PTR_ERR(bg_bh) == -ENOSPC) { + ac->ac_which = OCFS2_AC_USE_MAIN_DISCONTIG; bg_bh = ocfs2_block_group_alloc_discontig(handle, alloc_inode, ac, cl); + } if (IS_ERR(bg_bh)) { status = PTR_ERR(bg_bh); bg_bh = NULL; @@ -1794,6 +1796,7 @@ static int ocfs2_search_chain(struct ocf { int status; u16 chain; + u32 contig_bits; u64 next_group; struct inode *alloc_inode = ac->ac_inode; struct buffer_head *group_bh = NULL; @@ -1819,10 +1822,21 @@ static int ocfs2_search_chain(struct ocf status = -ENOSPC; /* for now, the chain search is a bit simplistic. We just use * the 1st group with any empty bits. */ - while ((status = ac->ac_group_search(alloc_inode, group_bh, - bits_wanted, min_bits, - ac->ac_max_block, - res)) == -ENOSPC) { + while (1) { + if (ac->ac_which == OCFS2_AC_USE_MAIN_DISCONTIG) { + contig_bits = le16_to_cpu(bg->bg_contig_free_bits); + if (!contig_bits) + contig_bits = ocfs2_find_max_contig_free_bits(bg->bg_bitmap, + le16_to_cpu(bg->bg_bits), 0); + if (bits_wanted > contig_bits && contig_bits >= min_bits) + bits_wanted = contig_bits; + } + + status = ac->ac_group_search(alloc_inode, group_bh, + bits_wanted, min_bits, + ac->ac_max_block, res); + if (status != -ENOSPC) + break; if (!bg->bg_next_group) break;
@@ -1982,6 +1996,7 @@ static int ocfs2_claim_suballoc_bits(str victim = ocfs2_find_victim_chain(cl); ac->ac_chain = victim;
+search: status = ocfs2_search_chain(ac, handle, bits_wanted, min_bits, res, &bits_left); if (!status) { @@ -2022,6 +2037,16 @@ static int ocfs2_claim_suballoc_bits(str } }
+ /* Chains can't supply the bits_wanted contiguous space. + * We should switch to using every single bit when allocating + * from the global bitmap. */ + if (i == le16_to_cpu(cl->cl_next_free_rec) && + status == -ENOSPC && ac->ac_which == OCFS2_AC_USE_MAIN) { + ac->ac_which = OCFS2_AC_USE_MAIN_DISCONTIG; + ac->ac_chain = victim; + goto search; + } + set_hint: if (status != -ENOSPC) { /* If the next search of this group is not likely to @@ -2365,7 +2390,8 @@ int __ocfs2_claim_clusters(handle_t *han BUG_ON(ac->ac_bits_given >= ac->ac_bits_wanted);
BUG_ON(ac->ac_which != OCFS2_AC_USE_LOCAL - && ac->ac_which != OCFS2_AC_USE_MAIN); + && ac->ac_which != OCFS2_AC_USE_MAIN + && ac->ac_which != OCFS2_AC_USE_MAIN_DISCONTIG);
if (ac->ac_which == OCFS2_AC_USE_LOCAL) { WARN_ON(min_clusters > 1); --- a/fs/ocfs2/suballoc.h +++ b/fs/ocfs2/suballoc.h @@ -29,6 +29,7 @@ struct ocfs2_alloc_context { #define OCFS2_AC_USE_MAIN 2 #define OCFS2_AC_USE_INODE 3 #define OCFS2_AC_USE_META 4 +#define OCFS2_AC_USE_MAIN_DISCONTIG 5 u32 ac_which;
/* these are used by the chain search */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit c0fb83088f0cc4ee4706e0495ee8b06f49daa716 upstream.
Patch series "ocfs2: Fix deadlocks in quota recovery", v3.
This implements another approach to fixing quota recovery deadlocks. We avoid grabbing sb->s_umount semaphore from ocfs2_finish_quota_recovery() and instead stop quota recovery early in ocfs2_dismount_volume().
This patch (of 3):
We will need more recovery states than just pure enable / disable to fix deadlocks with quota recovery. Switch osb->disable_recovery to enum.
Link: https://lkml.kernel.org/r/20250424134301.1392-1-jack@suse.cz Link: https://lkml.kernel.org/r/20250424134515.18933-4-jack@suse.cz Fixes: 5f530de63cfc ("ocfs2: Use s_umount for quota recovery protection") Signed-off-by: Jan Kara jack@suse.cz Reviewed-by: Heming Zhao heming.zhao@suse.com Tested-by: Heming Zhao heming.zhao@suse.com Acked-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Jun Piao piaojun@huawei.com Cc: Murad Masimov m.masimov@mt-integration.ru Cc: Shichangkuo shi.changkuo@h3c.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/journal.c | 14 ++++++++------ fs/ocfs2/ocfs2.h | 7 ++++++- 2 files changed, 14 insertions(+), 7 deletions(-)
--- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -174,7 +174,7 @@ int ocfs2_recovery_init(struct ocfs2_sup struct ocfs2_recovery_map *rm;
mutex_init(&osb->recovery_lock); - osb->disable_recovery = 0; + osb->recovery_state = OCFS2_REC_ENABLED; osb->recovery_thread_task = NULL; init_waitqueue_head(&osb->recovery_event);
@@ -206,7 +206,7 @@ void ocfs2_recovery_exit(struct ocfs2_su /* disable any new recovery threads and wait for any currently * running ones to exit. Do this before setting the vol_state. */ mutex_lock(&osb->recovery_lock); - osb->disable_recovery = 1; + osb->recovery_state = OCFS2_REC_DISABLED; mutex_unlock(&osb->recovery_lock); wait_event(osb->recovery_event, !ocfs2_recovery_thread_running(osb));
@@ -1582,14 +1582,16 @@ bail:
void ocfs2_recovery_thread(struct ocfs2_super *osb, int node_num) { + int was_set = -1; + mutex_lock(&osb->recovery_lock); + if (osb->recovery_state < OCFS2_REC_DISABLED) + was_set = ocfs2_recovery_map_set(osb, node_num);
trace_ocfs2_recovery_thread(node_num, osb->node_num, - osb->disable_recovery, osb->recovery_thread_task, - osb->disable_recovery ? - -1 : ocfs2_recovery_map_set(osb, node_num)); + osb->recovery_state, osb->recovery_thread_task, was_set);
- if (osb->disable_recovery) + if (osb->recovery_state == OCFS2_REC_DISABLED) goto out;
if (osb->recovery_thread_task) --- a/fs/ocfs2/ocfs2.h +++ b/fs/ocfs2/ocfs2.h @@ -308,6 +308,11 @@ enum ocfs2_journal_trigger_type { void ocfs2_initialize_journal_triggers(struct super_block *sb, struct ocfs2_triggers triggers[]);
+enum ocfs2_recovery_state { + OCFS2_REC_ENABLED = 0, + OCFS2_REC_DISABLED, +}; + struct ocfs2_journal; struct ocfs2_slot_info; struct ocfs2_recovery_map; @@ -370,7 +375,7 @@ struct ocfs2_super struct ocfs2_recovery_map *recovery_map; struct ocfs2_replay_map *replay_map; struct task_struct *recovery_thread_task; - int disable_recovery; + enum ocfs2_recovery_state recovery_state; wait_queue_head_t checkpoint_event; struct ocfs2_journal *journal; unsigned long osb_commit_interval;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 8f947e0fd595951460f5a6e1ac29baa82fa02eab upstream.
We will need the ocfs2 recovery thread to acknowledge transitions of recovery_state when disabling particular types of recovery. This is similar to what currently happens when disabling recovery completely, just more general. Implement the handshake and use it for exit from recovery.
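The handshake can be modelled as a small single-threaded state machine. This is a sketch under stated assumptions: struct rec_ctx, rec_disable(), rec_thread_exit() and demo() are hypothetical names, and the blocking wait on the recovery wait queue is replaced by a comment, so the model only shows which side performs which transition.

```c
#include <assert.h>

/* Each "want" state is directly followed by its acknowledged state. */
enum rec_state {
	REC_ENABLED = 0,
	REC_WANT_DISABLE,
	REC_DISABLED,
};

struct rec_ctx {
	enum rec_state state;
	int thread_running;
};

/* Requester side: ask for a transition. */
static void rec_disable(struct rec_ctx *c, enum rec_state want)
{
	assert(want == REC_WANT_DISABLE);
	if (!c->thread_running) {
		/* Nobody needs to acknowledge: go straight to the final state. */
		c->state = want + 1;
		return;
	}
	c->state = want;
	/* The kernel code waits here until the thread acknowledges. */
}

/* Thread side: acknowledge a pending request while exiting. */
static void rec_thread_exit(struct rec_ctx *c)
{
	c->thread_running = 0;
	if (c->state == REC_WANT_DISABLE)
		c->state = REC_DISABLED;
}

/* Runs one scenario and reports the resulting state. */
static enum rec_state demo(int thread_running, int let_thread_exit)
{
	struct rec_ctx c = { REC_ENABLED, thread_running };

	rec_disable(&c, REC_WANT_DISABLE);
	if (let_thread_exit)
		rec_thread_exit(&c);
	return c.state;
}
```

The "state + 1" step is why the enum ordering matters: the acknowledged state has to be the value immediately after the requested one.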
Link: https://lkml.kernel.org/r/20250424134515.18933-5-jack@suse.cz Fixes: 5f530de63cfc ("ocfs2: Use s_umount for quota recovery protection") Signed-off-by: Jan Kara jack@suse.cz Reviewed-by: Heming Zhao heming.zhao@suse.com Tested-by: Heming Zhao heming.zhao@suse.com Acked-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Changwei Ge gechangwei@live.cn Cc: Joel Becker jlbec@evilplan.org Cc: Jun Piao piaojun@huawei.com Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Mark Fasheh mark@fasheh.com Cc: Murad Masimov m.masimov@mt-integration.ru Cc: Shichangkuo shi.changkuo@h3c.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/journal.c | 52 +++++++++++++++++++++++++++++++++++----------------- fs/ocfs2/ocfs2.h | 4 ++++ 2 files changed, 39 insertions(+), 17 deletions(-)
--- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -190,31 +190,48 @@ int ocfs2_recovery_init(struct ocfs2_sup return 0; }
-/* we can't grab the goofy sem lock from inside wait_event, so we use - * memory barriers to make sure that we'll see the null task before - * being woken up */ static int ocfs2_recovery_thread_running(struct ocfs2_super *osb) { - mb(); return osb->recovery_thread_task != NULL; }
-void ocfs2_recovery_exit(struct ocfs2_super *osb) +static void ocfs2_recovery_disable(struct ocfs2_super *osb, + enum ocfs2_recovery_state state) { - struct ocfs2_recovery_map *rm; - - /* disable any new recovery threads and wait for any currently - * running ones to exit. Do this before setting the vol_state. */ mutex_lock(&osb->recovery_lock); - osb->recovery_state = OCFS2_REC_DISABLED; + /* + * If recovery thread is not running, we can directly transition to + * final state. + */ + if (!ocfs2_recovery_thread_running(osb)) { + osb->recovery_state = state + 1; + goto out_lock; + } + osb->recovery_state = state; + /* Wait for recovery thread to acknowledge state transition */ + wait_event_cmd(osb->recovery_event, + !ocfs2_recovery_thread_running(osb) || + osb->recovery_state >= state + 1, + mutex_unlock(&osb->recovery_lock), + mutex_lock(&osb->recovery_lock)); +out_lock: mutex_unlock(&osb->recovery_lock); - wait_event(osb->recovery_event, !ocfs2_recovery_thread_running(osb));
- /* At this point, we know that no more recovery threads can be - * launched, so wait for any recovery completion work to - * complete. */ + /* + * At this point we know that no more recovery work can be queued so + * wait for any recovery completion work to complete. + */ if (osb->ocfs2_wq) flush_workqueue(osb->ocfs2_wq); +} + +void ocfs2_recovery_exit(struct ocfs2_super *osb) +{ + struct ocfs2_recovery_map *rm; + + /* disable any new recovery threads and wait for any currently + * running ones to exit. Do this before setting the vol_state. */ + ocfs2_recovery_disable(osb, OCFS2_REC_WANT_DISABLE);
/* * Now that recovery is shut down, and the osb is about to be @@ -1569,7 +1586,8 @@ bail:
ocfs2_free_replay_slots(osb); osb->recovery_thread_task = NULL; - mb(); /* sync with ocfs2_recovery_thread_running */ + if (osb->recovery_state == OCFS2_REC_WANT_DISABLE) + osb->recovery_state = OCFS2_REC_DISABLED; wake_up(&osb->recovery_event);
mutex_unlock(&osb->recovery_lock); @@ -1585,13 +1603,13 @@ void ocfs2_recovery_thread(struct ocfs2_ int was_set = -1;
mutex_lock(&osb->recovery_lock); - if (osb->recovery_state < OCFS2_REC_DISABLED) + if (osb->recovery_state < OCFS2_REC_WANT_DISABLE) was_set = ocfs2_recovery_map_set(osb, node_num);
trace_ocfs2_recovery_thread(node_num, osb->node_num, osb->recovery_state, osb->recovery_thread_task, was_set);
- if (osb->recovery_state == OCFS2_REC_DISABLED) + if (osb->recovery_state >= OCFS2_REC_WANT_DISABLE) goto out;
if (osb->recovery_thread_task) --- a/fs/ocfs2/ocfs2.h +++ b/fs/ocfs2/ocfs2.h @@ -310,6 +310,10 @@ void ocfs2_initialize_journal_triggers(s
enum ocfs2_recovery_state { OCFS2_REC_ENABLED = 0, + OCFS2_REC_WANT_DISABLE, + /* + * Must be OCFS2_REC_WANT_DISABLE + 1 for ocfs2_recovery_exit() to work + */ OCFS2_REC_DISABLED, };
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit fcaf3b2683b05a9684acdebda706a12025a6927a upstream.
Currently quota recovery is synchronized with unmount using sb->s_umount semaphore. That is however prone to deadlocks because flush_workqueue(osb->ocfs2_wq) called from umount code can wait for quota recovery to complete while ocfs2_finish_quota_recovery() waits for sb->s_umount semaphore.
Grabbing of sb->s_umount semaphore in ocfs2_finish_quota_recovery() is only needed to protect that function from disabling of quotas from ocfs2_dismount_volume(). Handle this problem by disabling quota recovery early during unmount in ocfs2_dismount_volume() instead so that we can drop acquisition of sb->s_umount from ocfs2_finish_quota_recovery().
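The handshake described above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel code: the enum ordering mirrors the diff below, while `struct model`, `request_disable()` and `thread_ack()` are hypothetical names, and the mutex and wait queue are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Same ordering as enum ocfs2_recovery_state after this patch. */
enum rec_state {
	REC_ENABLED = 0,
	REC_QUOTA_WANT_DISABLE,
	REC_QUOTA_DISABLED,	/* must be REC_QUOTA_WANT_DISABLE + 1 */
	REC_WANT_DISABLE,
	REC_DISABLED,		/* must be REC_WANT_DISABLE + 1 */
};

/* Hypothetical stand-in for the ocfs2_super fields used here. */
struct model {
	enum rec_state state;
	bool thread_running;
};

/*
 * Model of ocfs2_recovery_disable(): returns true if the caller would
 * have to wait for the recovery thread to acknowledge the request.
 */
static bool request_disable(struct model *m, enum rec_state want)
{
	if (!m->thread_running) {
		m->state = want + 1;	/* nobody to wait for */
		return false;
	}
	m->state = want;
	return true;
}

/*
 * Model of the check at the "restart:" label in the recovery thread.
 * Returns true if quota recovery may still run.
 */
static bool thread_ack(struct model *m)
{
	if (m->state == REC_QUOTA_WANT_DISABLE)
		m->state = REC_QUOTA_DISABLED;	/* kernel also wakes the waiter */
	return m->state < REC_QUOTA_DISABLED;
}
```

Because each "disabled" value is the matching "want" value plus one, a single helper can serve both the quota-only and the full shutdown request.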
Link: https://lkml.kernel.org/r/20250424134515.18933-6-jack@suse.cz Fixes: 5f530de63cfc ("ocfs2: Use s_umount for quota recovery protection") Signed-off-by: Jan Kara jack@suse.cz Reported-by: Shichangkuo shi.changkuo@h3c.com Reported-by: Murad Masimov m.masimov@mt-integration.ru Reviewed-by: Heming Zhao heming.zhao@suse.com Tested-by: Heming Zhao heming.zhao@suse.com Acked-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Changwei Ge gechangwei@live.cn Cc: Joel Becker jlbec@evilplan.org Cc: Jun Piao piaojun@huawei.com Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Mark Fasheh mark@fasheh.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/journal.c | 20 ++++++++++++++++++-- fs/ocfs2/journal.h | 1 + fs/ocfs2/ocfs2.h | 6 ++++++ fs/ocfs2/quota_local.c | 9 ++------- fs/ocfs2/super.c | 3 +++ 5 files changed, 30 insertions(+), 9 deletions(-)
--- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -225,6 +225,11 @@ out_lock: flush_workqueue(osb->ocfs2_wq); }
+void ocfs2_recovery_disable_quota(struct ocfs2_super *osb) +{ + ocfs2_recovery_disable(osb, OCFS2_REC_QUOTA_WANT_DISABLE); +} + void ocfs2_recovery_exit(struct ocfs2_super *osb) { struct ocfs2_recovery_map *rm; @@ -1489,6 +1494,18 @@ static int __ocfs2_recovery_thread(void } } restart: + if (quota_enabled) { + mutex_lock(&osb->recovery_lock); + /* Confirm that recovery thread will no longer recover quotas */ + if (osb->recovery_state == OCFS2_REC_QUOTA_WANT_DISABLE) { + osb->recovery_state = OCFS2_REC_QUOTA_DISABLED; + wake_up(&osb->recovery_event); + } + if (osb->recovery_state >= OCFS2_REC_QUOTA_DISABLED) + quota_enabled = 0; + mutex_unlock(&osb->recovery_lock); + } + status = ocfs2_super_lock(osb, 1); if (status < 0) { mlog_errno(status); @@ -1592,8 +1609,7 @@ bail:
mutex_unlock(&osb->recovery_lock);
- if (quota_enabled) - kfree(rm_quota); + kfree(rm_quota);
return status; } --- a/fs/ocfs2/journal.h +++ b/fs/ocfs2/journal.h @@ -148,6 +148,7 @@ void ocfs2_wait_for_recovery(struct ocfs
int ocfs2_recovery_init(struct ocfs2_super *osb); void ocfs2_recovery_exit(struct ocfs2_super *osb); +void ocfs2_recovery_disable_quota(struct ocfs2_super *osb);
int ocfs2_compute_replay_slots(struct ocfs2_super *osb); void ocfs2_free_replay_slots(struct ocfs2_super *osb); --- a/fs/ocfs2/ocfs2.h +++ b/fs/ocfs2/ocfs2.h @@ -310,6 +310,12 @@ void ocfs2_initialize_journal_triggers(s
enum ocfs2_recovery_state { OCFS2_REC_ENABLED = 0, + OCFS2_REC_QUOTA_WANT_DISABLE, + /* + * Must be OCFS2_REC_QUOTA_WANT_DISABLE + 1 for + * ocfs2_recovery_disable_quota() to work. + */ + OCFS2_REC_QUOTA_DISABLED, OCFS2_REC_WANT_DISABLE, /* * Must be OCFS2_REC_WANT_DISABLE + 1 for ocfs2_recovery_exit() to work --- a/fs/ocfs2/quota_local.c +++ b/fs/ocfs2/quota_local.c @@ -453,8 +453,7 @@ out:
/* Sync changes in local quota file into global quota file and * reinitialize local quota file. - * The function expects local quota file to be already locked and - * s_umount locked in shared mode. */ + * The function expects local quota file to be already locked. */ static int ocfs2_recover_local_quota_file(struct inode *lqinode, int type, struct ocfs2_quota_recovery *rec) @@ -588,7 +587,6 @@ int ocfs2_finish_quota_recovery(struct o { unsigned int ino[OCFS2_MAXQUOTAS] = { LOCAL_USER_QUOTA_SYSTEM_INODE, LOCAL_GROUP_QUOTA_SYSTEM_INODE }; - struct super_block *sb = osb->sb; struct ocfs2_local_disk_dqinfo *ldinfo; struct buffer_head *bh; handle_t *handle; @@ -600,7 +598,6 @@ int ocfs2_finish_quota_recovery(struct o printk(KERN_NOTICE "ocfs2: Finishing quota recovery on device (%s) for " "slot %u\n", osb->dev_str, slot_num);
- down_read(&sb->s_umount); for (type = 0; type < OCFS2_MAXQUOTAS; type++) { if (list_empty(&(rec->r_list[type]))) continue; @@ -677,7 +674,6 @@ out_put: break; } out: - up_read(&sb->s_umount); kfree(rec); return status; } @@ -843,8 +839,7 @@ static int ocfs2_local_free_info(struct ocfs2_release_local_quota_bitmaps(&oinfo->dqi_chunk);
/* - * s_umount held in exclusive mode protects us against racing with - * recovery thread... + * ocfs2_dismount_volume() has already aborted quota recovery... */ if (oinfo->dqi_rec) { ocfs2_free_quota_recovery(oinfo->dqi_rec); --- a/fs/ocfs2/super.c +++ b/fs/ocfs2/super.c @@ -1812,6 +1812,9 @@ static void ocfs2_dismount_volume(struct /* Orphan scan should be stopped as early as possible */ ocfs2_orphan_scan_stop(osb);
+ /* Stop quota recovery so that we can disable quotas */ + ocfs2_recovery_disable_quota(osb); + ocfs2_disable_quotas(osb);
/* All dquots should be freed by now */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prashanth K prashanth.k@oss.qualcomm.com
commit 2372f1caeca433c4c01c2482f73fbe057f5168ce upstream.
Currently gadget_wakeup() waits for U0 synchronously if it was called from func_wakeup(). This is because we need to send the function wakeup command soon after the link is active. The call is made synchronous by polling DSTS continuously for 20000 times in __dwc3_gadget_wakeup(). But it was observed that sometimes the link is not active even after polling 20K times, leading to remote wakeup failures. Adding a small delay between each poll helps, but that won't guarantee resolution in future. Hence make gadget_wakeup() completely asynchronous.
Since multiple interfaces can issue a function wakeup at once, add a new variable wakeup_pending_funcs, a bitmap that indicates which functions have issued func_wakeup. If the link is in U3, dwc3_gadget_func_wakeup() will set the bit corresponding to interface_id and bail out. Once the link comes back to U0, the linksts_change irq is triggered, where the function wakeup command is sent based on the bitmap.
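The bitmap bookkeeping can be sketched in plain C as follows. This is a user-space illustration only: `mark_pending()`, `drain_pending()` and `ffs32()` are hypothetical helpers, and the point where the kernel would send the DEV_NOTIFICATION command is reduced to recording the interface number.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Portable equivalent of ffs(): 1-based index of the lowest set bit. */
static int ffs32(uint32_t v)
{
	int pos = 1;

	if (!v)
		return 0;
	while (!(v & 1)) {
		v >>= 1;
		pos++;
	}
	return pos;
}

/* Link is in U3: remember that interface intf_id wants a function wakeup. */
static void mark_pending(uint32_t *pending, unsigned int intf_id)
{
	assert(intf_id < 32);
	*pending |= BIT(intf_id);
}

/*
 * Link is back in U0: walk the bitmap from the lowest interface upwards.
 * Each bit is cleared whether or not the command would have succeeded,
 * as in the patch. Returns the number of interfaces processed.
 */
static int drain_pending(uint32_t *pending, int *order, int max)
{
	int n = 0;

	while (*pending && n < max) {
		int intf_id = ffs32(*pending) - 1;

		order[n++] = intf_id;	/* kernel: send DN_FUNC_WAKE here */
		*pending &= ~BIT(intf_id);
	}
	return n;
}
```

Setting the same bit twice is harmless, so repeated wakeup requests from one interface collapse into a single notification.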
Cc: stable stable@kernel.org Fixes: 92c08a84b53e ("usb: dwc3: Add function suspend and function wakeup support") Signed-off-by: Prashanth K prashanth.k@oss.qualcomm.com Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20250422103231.1954387-4-prashanth.k@oss.qualcomm.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/core.h | 4 +++ drivers/usb/dwc3/gadget.c | 60 +++++++++++++++++----------------------------- 2 files changed, 27 insertions(+), 37 deletions(-)
--- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -1164,6 +1164,9 @@ struct dwc3_scratchpad_array { * @gsbuscfg0_reqinfo: store GSBUSCFG0.DATRDREQINFO, DESRDREQINFO, * DATWRREQINFO, and DESWRREQINFO value passed from * glue driver. + * @wakeup_pending_funcs: Indicates whether any interface has requested for + * function wakeup in bitmap format where bit position + * represents interface_id. */ struct dwc3 { struct work_struct drd_work; @@ -1394,6 +1397,7 @@ struct dwc3 { int num_ep_resized; struct dentry *debug_root; u32 gsbuscfg0_reqinfo; + u32 wakeup_pending_funcs; };
#define INCRX_BURST_MODE 0 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -276,8 +276,6 @@ int dwc3_send_gadget_generic_command(str return ret; }
-static int __dwc3_gadget_wakeup(struct dwc3 *dwc, bool async); - /** * dwc3_send_gadget_ep_cmd - issue an endpoint command * @dep: the endpoint to which the command is going to be issued @@ -2359,10 +2357,8 @@ static int dwc3_gadget_get_frame(struct return __dwc3_gadget_get_frame(dwc); }
-static int __dwc3_gadget_wakeup(struct dwc3 *dwc, bool async) +static int __dwc3_gadget_wakeup(struct dwc3 *dwc) { - int retries; - int ret; u32 reg;
@@ -2390,8 +2386,7 @@ static int __dwc3_gadget_wakeup(struct d return -EINVAL; }
- if (async) - dwc3_gadget_enable_linksts_evts(dwc, true); + dwc3_gadget_enable_linksts_evts(dwc, true);
ret = dwc3_gadget_set_link_state(dwc, DWC3_LINK_STATE_RECOV); if (ret < 0) { @@ -2410,27 +2405,8 @@ static int __dwc3_gadget_wakeup(struct d
/* * Since link status change events are enabled we will receive - * an U0 event when wakeup is successful. So bail out. + * an U0 event when wakeup is successful. */ - if (async) - return 0; - - /* poll until Link State changes to ON */ - retries = 20000; - - while (retries--) { - reg = dwc3_readl(dwc->regs, DWC3_DSTS); - - /* in HS, means ON */ - if (DWC3_DSTS_USBLNKST(reg) == DWC3_LINK_STATE_U0) - break; - } - - if (DWC3_DSTS_USBLNKST(reg) != DWC3_LINK_STATE_U0) { - dev_err(dwc->dev, "failed to send remote wakeup\n"); - return -EINVAL; - } - return 0; }
@@ -2451,7 +2427,7 @@ static int dwc3_gadget_wakeup(struct usb spin_unlock_irqrestore(&dwc->lock, flags); return -EINVAL; } - ret = __dwc3_gadget_wakeup(dwc, true); + ret = __dwc3_gadget_wakeup(dwc);
spin_unlock_irqrestore(&dwc->lock, flags);
@@ -2479,14 +2455,10 @@ static int dwc3_gadget_func_wakeup(struc */ link_state = dwc3_gadget_get_link_state(dwc); if (link_state == DWC3_LINK_STATE_U3) { - ret = __dwc3_gadget_wakeup(dwc, false); - if (ret) { - spin_unlock_irqrestore(&dwc->lock, flags); - return -EINVAL; - } - dwc3_resume_gadget(dwc); - dwc->suspended = false; - dwc->link_state = DWC3_LINK_STATE_U0; + dwc->wakeup_pending_funcs |= BIT(intf_id); + ret = __dwc3_gadget_wakeup(dwc); + spin_unlock_irqrestore(&dwc->lock, flags); + return ret; }
ret = dwc3_send_gadget_generic_command(dwc, DWC3_DGCMD_DEV_NOTIFICATION, @@ -4314,6 +4286,8 @@ static void dwc3_gadget_linksts_change_i { enum dwc3_link_state next = evtinfo & DWC3_LINK_STATE_MASK; unsigned int pwropt; + int ret; + int intf_id;
/* * WORKAROUND: DWC3 < 2.50a have an issue when configured without @@ -4389,7 +4363,7 @@ static void dwc3_gadget_linksts_change_i
switch (next) { case DWC3_LINK_STATE_U0: - if (dwc->gadget->wakeup_armed) { + if (dwc->gadget->wakeup_armed || dwc->wakeup_pending_funcs) { dwc3_gadget_enable_linksts_evts(dwc, false); dwc3_resume_gadget(dwc); dwc->suspended = false; @@ -4412,6 +4386,18 @@ static void dwc3_gadget_linksts_change_i }
dwc->link_state = next; + + /* Proceed with func wakeup if any interfaces that has requested */ + while (dwc->wakeup_pending_funcs && (next == DWC3_LINK_STATE_U0)) { + intf_id = ffs(dwc->wakeup_pending_funcs) - 1; + ret = dwc3_send_gadget_generic_command(dwc, DWC3_DGCMD_DEV_NOTIFICATION, + DWC3_DGCMDPAR_DN_FUNC_WAKE | + DWC3_DGCMDPAR_INTF_SEL(intf_id)); + if (ret) + dev_err(dwc->dev, "Failed to send DN wake for intf %d\n", intf_id); + + dwc->wakeup_pending_funcs &= ~BIT(intf_id); + } }
static void dwc3_gadget_suspend_interrupt(struct dwc3 *dwc,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawel Laszczak pawell@cadence.com
commit 241e2ce88e5a494be7a5d44c0697592f1632fbee upstream.
In very rare cases, after resuming the controller from L1 to L0, the driver reads registers before the UTMI clock has been enabled and as a result reads an incorrect value. Most registers are in the APB clock domain, but some of them (e.g. PORTSC) are in the UTMI clock domain. After entering the L1 state the UTMI clock can be disabled. When the controller transitions from L1 to L0, the port status change event is reported and the driver reads PORTSC in the interrupt handler. During this read the controller synchronizes the UTMI and APB domains, but the UTMI clock is still disabled, so the read returns 0xFFFFFFFF. To fix this issue the driver increases the APB timeout value.
The issue is platform specific, and if the default APB timeout value is not sufficient, the timeout should be set individually for each platform.
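The register update itself is a plain read-modify-write of a 22-bit field. A minimal sketch, assuming a user-space stand-in for GENMASK(21, 0) and a hypothetical helper `apply_apb_timeout()`; the macro and the 0x1C20 value are taken from the diff below:

```c
#include <assert.h>
#include <stdint.h>

/* User-space equivalent of the kernel's GENMASK(21, 0). */
#define APB_TIMEOUT_MASK UINT32_C(0x003FFFFF)

/* Mirrors CHICKEN_APB_TIMEOUT_SET() from the patch. */
#define CHICKEN_APB_TIMEOUT_SET(p, val) (((p) & ~APB_TIMEOUT_MASK) | (val))

/* Value the PCI glue stores in override_apb_timeout. */
#define CHICKEN_APB_TIMEOUT_VALUE UINT32_C(0x1C20)

/* Hypothetical helper: compute the new CHICKEN_BITS_3 register value. */
static uint32_t apply_apb_timeout(uint32_t reg, uint32_t override)
{
	if (!override)
		return reg;	/* 0 keeps the hardware default */
	return CHICKEN_APB_TIMEOUT_SET(reg, override);
}
```

The macro does not mask `val`, so a caller has to keep the override within the low 22 bits to avoid disturbing the upper bits of the register.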
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable stable@kernel.org Signed-off-by: Pawel Laszczak pawell@cadence.com Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/PH7PR07MB953846C57973E4DB134CAA71DDBF2@PH7PR07MB95... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/cdns3/cdnsp-gadget.c | 29 +++++++++++++++++++++++++++++ drivers/usb/cdns3/cdnsp-gadget.h | 3 +++ drivers/usb/cdns3/cdnsp-pci.c | 12 ++++++++++-- drivers/usb/cdns3/core.h | 3 +++ 4 files changed, 45 insertions(+), 2 deletions(-)
--- a/drivers/usb/cdns3/cdnsp-gadget.c +++ b/drivers/usb/cdns3/cdnsp-gadget.c @@ -139,6 +139,26 @@ static void cdnsp_clear_port_change_bit( (portsc & PORT_CHANGE_BITS), port_regs); }
+static void cdnsp_set_apb_timeout_value(struct cdnsp_device *pdev) +{ + struct cdns *cdns = dev_get_drvdata(pdev->dev); + __le32 __iomem *reg; + void __iomem *base; + u32 offset = 0; + u32 val; + + if (!cdns->override_apb_timeout) + return; + + base = &pdev->cap_regs->hc_capbase; + offset = cdnsp_find_next_ext_cap(base, offset, D_XEC_PRE_REGS_CAP); + reg = base + offset + REG_CHICKEN_BITS_3_OFFSET; + + val = le32_to_cpu(readl(reg)); + val = CHICKEN_APB_TIMEOUT_SET(val, cdns->override_apb_timeout); + writel(cpu_to_le32(val), reg); +} + static void cdnsp_set_chicken_bits_2(struct cdnsp_device *pdev, u32 bit) { __le32 __iomem *reg; @@ -1798,6 +1818,15 @@ static int cdnsp_gen_setup(struct cdnsp_ pdev->hci_version = HC_VERSION(pdev->hcc_params); pdev->hcc_params = readl(&pdev->cap_regs->hcc_params);
+ /* + * Override the APB timeout value to give the controller more time for + * enabling UTMI clock and synchronizing APB and UTMI clock domains. + * This fix is platform specific and is required to fixes issue with + * reading incorrect value from PORTSC register after resuming + * from L1 state. + */ + cdnsp_set_apb_timeout_value(pdev); + cdnsp_get_rev_cap(pdev);
/* Make sure the Device Controller is halted. */ --- a/drivers/usb/cdns3/cdnsp-gadget.h +++ b/drivers/usb/cdns3/cdnsp-gadget.h @@ -520,6 +520,9 @@ struct cdnsp_rev_cap { #define REG_CHICKEN_BITS_2_OFFSET 0x48 #define CHICKEN_XDMA_2_TP_CACHE_DIS BIT(28)
+#define REG_CHICKEN_BITS_3_OFFSET 0x4C +#define CHICKEN_APB_TIMEOUT_SET(p, val) (((p) & ~GENMASK(21, 0)) | (val)) + /* XBUF Extended Capability ID. */ #define XBUF_CAP_ID 0xCB #define XBUF_RX_TAG_MASK_0_OFFSET 0x1C --- a/drivers/usb/cdns3/cdnsp-pci.c +++ b/drivers/usb/cdns3/cdnsp-pci.c @@ -28,6 +28,8 @@ #define PCI_DRIVER_NAME "cdns-pci-usbssp" #define PLAT_DRIVER_NAME "cdns-usbssp"
+#define CHICKEN_APB_TIMEOUT_VALUE 0x1C20 + static struct pci_dev *cdnsp_get_second_fun(struct pci_dev *pdev) { /* @@ -139,6 +141,14 @@ static int cdnsp_pci_probe(struct pci_de cdnsp->otg_irq = pdev->irq; }
+ /* + * Cadence PCI based platform require some longer timeout for APB + * to fixes domain clock synchronization issue after resuming + * controller from L1 state. + */ + cdnsp->override_apb_timeout = CHICKEN_APB_TIMEOUT_VALUE; + pci_set_drvdata(pdev, cdnsp); + if (pci_is_enabled(func)) { cdnsp->dev = dev; cdnsp->gadget_init = cdnsp_gadget_init; @@ -148,8 +158,6 @@ static int cdnsp_pci_probe(struct pci_de goto free_cdnsp; }
- pci_set_drvdata(pdev, cdnsp); - device_wakeup_enable(&pdev->dev); if (pci_dev_run_wake(pdev)) pm_runtime_put_noidle(&pdev->dev); --- a/drivers/usb/cdns3/core.h +++ b/drivers/usb/cdns3/core.h @@ -79,6 +79,8 @@ struct cdns3_platform_data { * @pdata: platform data from glue layer * @lock: spinlock structure * @xhci_plat_data: xhci private data structure pointer + * @override_apb_timeout: hold value of APB timeout. For value 0 the default + * value in CHICKEN_BITS_3 will be preserved. * @gadget_init: pointer to gadget initialization function */ struct cdns { @@ -117,6 +119,7 @@ struct cdns { struct cdns3_platform_data *pdata; spinlock_t lock; struct xhci_plat_priv *xhci_plat_data; + u32 override_apb_timeout;
int (*gadget_init)(struct cdns *cdns); };
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawel Laszczak pawell@cadence.com
commit 8614ecdb1570e4fffe87ebdc62b613ed66f1f6a6 upstream.
Controllers with an RTL version larger than RTL_REVISION_NEW_LPM (0x00002700) have a bug which causes the controller not to resume from the L1 state. It happens if, after receiving an LPM packet, the controller starts transitioning to L1 and at that moment the driver forces a resume by writing to PORTSC.PLS. It is a corner case and happens when the write to PORTSC occurs during the device delay before transitioning to L1 after transmitting ACK (TL1TokenRetry).
Forcing the L1->L0 transition from the driver is not needed for revisions larger than RTL_REVISION_NEW_LPM, so the driver can fix this issue simply by not calling cdnsp_force_l0_go() on those revisions.
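The revision gate can be written as a one-line predicate. In this sketch `needs_force_l0()` is a hypothetical name; the constant and the comparison are taken from the diff below. Note that the diff compares with `<`, so a controller reporting exactly 0x2700 also skips the forced transition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RTL_REVISION_NEW_LPM UINT32_C(0x2700)

/* True if the driver should still force the L1 -> L0 transition. */
static bool needs_force_l0(uint32_t rtl_revision)
{
	return rtl_revision < RTL_REVISION_NEW_LPM;
}
```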
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable stable@kernel.org Signed-off-by: Pawel Laszczak pawell@cadence.com Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/PH7PR07MB9538B55C3A6E71F9ED29E980DD842@PH7PR07MB95... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/cdns3/cdnsp-gadget.c | 2 ++ drivers/usb/cdns3/cdnsp-gadget.h | 3 +++ drivers/usb/cdns3/cdnsp-ring.c | 3 ++- 3 files changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/usb/cdns3/cdnsp-gadget.c +++ b/drivers/usb/cdns3/cdnsp-gadget.c @@ -1793,6 +1793,8 @@ static void cdnsp_get_rev_cap(struct cdn reg += cdnsp_find_next_ext_cap(reg, 0, RTL_REV_CAP); pdev->rev_cap = reg;
+ pdev->rtl_revision = readl(&pdev->rev_cap->rtl_revision); + dev_info(pdev->dev, "Rev: %08x/%08x, eps: %08x, buff: %08x/%08x\n", readl(&pdev->rev_cap->ctrl_revision), readl(&pdev->rev_cap->rtl_revision), --- a/drivers/usb/cdns3/cdnsp-gadget.h +++ b/drivers/usb/cdns3/cdnsp-gadget.h @@ -1360,6 +1360,7 @@ struct cdnsp_port { * @rev_cap: Controller Capabilities Registers. * @hcs_params1: Cached register copies of read-only HCSPARAMS1 * @hcc_params: Cached register copies of read-only HCCPARAMS1 + * @rtl_revision: Cached controller rtl revision. * @setup: Temporary buffer for setup packet. * @ep0_preq: Internal allocated request used during enumeration. * @ep0_stage: ep0 stage during enumeration process. @@ -1414,6 +1415,8 @@ struct cdnsp_device { __u32 hcs_params1; __u32 hcs_params3; __u32 hcc_params; + #define RTL_REVISION_NEW_LPM 0x2700 + __u32 rtl_revision; /* Lock used in interrupt thread context. */ spinlock_t lock; struct usb_ctrlrequest setup; --- a/drivers/usb/cdns3/cdnsp-ring.c +++ b/drivers/usb/cdns3/cdnsp-ring.c @@ -308,7 +308,8 @@ static bool cdnsp_ring_ep_doorbell(struc
writel(db_value, reg_addr);
- cdnsp_force_l0_go(pdev); + if (pdev->rtl_revision < RTL_REVISION_NEW_LPM) + cdnsp_force_l0_go(pdev);
/* Doorbell was set. */ return true;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prashanth K prashanth.k@oss.qualcomm.com
commit 8e3820271c517ceb89ab7442656ba49fa23ee1d0 upstream.
When the host sends GET_STATUS to the ECM interface, handle the request in the function driver. Since the interface is wakeup capable, set the corresponding bit, and set the RW bit if the function is already armed for wakeup by the host.
Cc: stable stable@kernel.org Fixes: 481c225c4802 ("usb: gadget: Handle function suspend feature selector") Signed-off-by: Prashanth K prashanth.k@oss.qualcomm.com Reviewed-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20250422103231.1954387-2-prashanth.k@oss.qualcomm.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_ecm.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/gadget/function/f_ecm.c +++ b/drivers/usb/gadget/function/f_ecm.c @@ -892,6 +892,12 @@ static void ecm_resume(struct usb_functi gether_resume(&ecm->port); }
+static int ecm_get_status(struct usb_function *f) +{ + return (f->func_wakeup_armed ? USB_INTRF_STAT_FUNC_RW : 0) | + USB_INTRF_STAT_FUNC_RW_CAP; +} + static void ecm_free(struct usb_function *f) { struct f_ecm *ecm; @@ -960,6 +966,7 @@ static struct usb_function *ecm_alloc(st ecm->port.func.disable = ecm_disable; ecm->port.func.free_func = ecm_free; ecm->port.func.suspend = ecm_suspend; + ecm->port.func.get_status = ecm_get_status; ecm->port.func.resume = ecm_resume;
return &ecm->port.func;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wayne Chang waynec@nvidia.com
commit 59820fde001500c167342257650541280c622b73 upstream.
We identified a bug where the ST_RC bit in the status register was not being acknowledged after clearing the CTRL_RUN bit in the control register. This could lead to unexpected behavior in the USB gadget drivers.
This patch resolves the issue by adding the necessary code to explicitly acknowledge ST_RC after clearing CTRL_RUN based on the programming sequence, ensuring proper state transition.
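The acknowledgement step can be illustrated with a simulated register pair. This sketch makes several assumptions that the message above does not state: the status register is modelled as write-1-to-clear, the hardware is modelled as raising ST_RC as soon as CTRL_RUN is cleared, and the bit positions and `struct fake_xudc` are placeholders rather than the real Tegra XUDC layout.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions, not the real register layout. */
#define CTRL_RUN (UINT32_C(1) << 0)
#define ST_RC    (UINT32_C(1) << 0)

/* Hypothetical in-memory stand-in for the CTRL and ST registers. */
struct fake_xudc {
	uint32_t ctrl;
	uint32_t st;
};

/* Assumed write-1-to-clear semantics: each 1 written clears that bit. */
static void st_write(struct fake_xudc *x, uint32_t val)
{
	x->st &= ~val;
}

static void stop_controller(struct fake_xudc *x)
{
	x->ctrl &= ~CTRL_RUN;
	x->st |= ST_RC;		/* model: hardware reports the run change */

	/* The step the patch adds: acknowledge ST_RC if it is pending. */
	if (x->st & ST_RC)
		st_write(x, ST_RC);
}
```

Writing only ST_RC, rather than the value just read, leaves any other pending status bits untouched.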
Fixes: 49db427232fe ("usb: gadget: Add UDC driver for tegra XUSB device mode controller") Cc: stable stable@kernel.org Signed-off-by: Wayne Chang waynec@nvidia.com Link: https://lore.kernel.org/r/20250418081228.1194779-1-waynec@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/tegra-xudc.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/gadget/udc/tegra-xudc.c +++ b/drivers/usb/gadget/udc/tegra-xudc.c @@ -1749,6 +1749,10 @@ static int __tegra_xudc_ep_disable(struc val = xudc_readl(xudc, CTRL); val &= ~CTRL_RUN; xudc_writel(xudc, val, CTRL); + + val = xudc_readl(xudc, ST); + if (val & ST_RC) + xudc_writel(xudc, ST_RC, ST); }
dev_info(xudc->dev, "ep %u disabled\n", ep->index);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prashanth K prashanth.k@oss.qualcomm.com
commit 5977a58dd5a4865198b0204b998adb0f634abe19 upstream.
Currently when the host sends GET_STATUS request for an interface, we use get_status callbacks to set/clear remote wakeup capability of that interface. And if get_status callback isn't present for that interface, then we assume its remote wakeup capability based on bmAttributes.
Now consider a scenario, where we have a USB configuration with multiple interfaces (say ECM + ADB), here ECM is remote wakeup capable and as of now ADB isn't. And bmAttributes will indicate the device as wakeup capable. With the current implementation, when host sends GET_STATUS request for both interfaces, we will set FUNC_RW_CAP for both. This results in USB3 CV Chapter 9.15 (Function Remote Wakeup Test) failures as host expects remote wakeup from both interfaces.
The above scenario is just an example, and the failure can be observed if we use configuration with any interface except ECM. Hence avoid configuring remote wakeup capability from composite driver based on bmAttributes, instead use get_status callbacks and let the function drivers decide this.
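The resulting decision logic can be sketched as follows. This is a trimmed-down user-space model: `struct func`, `func_get_status()` and `intf_status()` are hypothetical, the error path for a negative get_status return is omitted, and the constant values are believed to match include/uapi/linux/usb/ch9.h but should be treated as assumptions here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match include/uapi/linux/usb/ch9.h. */
#define USB_CONFIG_ATT_WAKEUP		(1 << 5)
#define USB_INTRF_STAT_FUNC_RW_CAP	1
#define USB_INTRF_STAT_FUNC_RW		2

/* Hypothetical, reduced view of a function and its configuration. */
struct func {
	bool has_get_status;	/* function driver provides ->get_status() */
	bool wakeup_armed;
	uint8_t bmAttributes;	/* of the owning configuration */
};

/* What a wakeup-capable function such as ECM reports. */
static int func_get_status(const struct func *f)
{
	return (f->wakeup_armed ? USB_INTRF_STAT_FUNC_RW : 0) |
	       USB_INTRF_STAT_FUNC_RW_CAP;
}

/* Interface GET_STATUS value as computed after the patch. */
static int intf_status(const struct func *f)
{
	int status = 0;

	if (f->has_get_status) {
		status = func_get_status(f);
		/* D5 clear in bmAttributes: device is not wakeup capable. */
		if (!(f->bmAttributes & USB_CONFIG_ATT_WAKEUP))
			status &= ~(USB_INTRF_STAT_FUNC_RW_CAP |
				    USB_INTRF_STAT_FUNC_RW);
	}
	return status;
}
```

In the ECM + ADB scenario above, a function without a get_status callback now reports 0 even when bmAttributes advertises remote wakeup.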
Cc: stable stable@kernel.org Fixes: 481c225c4802 ("usb: gadget: Handle function suspend feature selector") Signed-off-by: Prashanth K prashanth.k@oss.qualcomm.com Reviewed-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20250422103231.1954387-3-prashanth.k@oss.qualcomm.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/composite.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
--- a/drivers/usb/gadget/composite.c +++ b/drivers/usb/gadget/composite.c @@ -2011,15 +2011,13 @@ composite_setup(struct usb_gadget *gadge
if (f->get_status) { status = f->get_status(f); + if (status < 0) break; - } else { - /* Set D0 and D1 bits based on func wakeup capability */ - if (f->config->bmAttributes & USB_CONFIG_ATT_WAKEUP) { - status |= USB_INTRF_STAT_FUNC_RW_CAP; - if (f->func_wakeup_armed) - status |= USB_INTRF_STAT_FUNC_RW; - } + + /* if D5 is not set, then device is not wakeup capable */ + if (!(f->config->bmAttributes & USB_CONFIG_ATT_WAKEUP)) + status &= ~(USB_INTRF_STAT_FUNC_RW_CAP | USB_INTRF_STAT_FUNC_RW); }
put_unaligned_le16(status & 0x0000ffff, req->buf);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jim Lin jilin@nvidia.com
commit 732f35cf8bdfece582f6e4a9c659119036577308 upstream.
When a USB device is connected to the OTG port, the tegra_xhci_id_work() routine transitions the PHY to host mode and calls xhci_hub_control() with the SetPortFeature command to enable port power.
In certain cases, the XHCI controller may be in a low-power state when this operation occurs. If xhci_hub_control() is invoked while the controller is suspended, the PORTSC register may return 0xFFFFFFFF, indicating a read failure. This causes xhci_hc_died() to be triggered, leading to host controller shutdown.
Example backtrace: [ 105.445736] Workqueue: events tegra_xhci_id_work [ 105.445747] dump_backtrace+0x0/0x1e8 [ 105.445759] xhci_hc_died.part.48+0x40/0x270 [ 105.445769] tegra_xhci_set_port_power+0xc0/0x240 [ 105.445774] tegra_xhci_id_work+0x130/0x240
To prevent this, ensure the controller is fully resumed before interacting with hardware registers by calling pm_runtime_get_sync() prior to the host mode transition and xhci_hub_control().
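The ordering matters more than the calls themselves, which a toy model can show. Everything here is hypothetical: `struct fake_hc` pretends that registers are readable only while a usage count is held, and the autosuspend delay is ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical controller model: registers are readable only while the
 * runtime-PM usage count is non-zero.
 */
struct fake_hc {
	int usage_count;
	uint32_t portsc;
};

static uint32_t read_portsc(const struct fake_hc *hc)
{
	return hc->usage_count > 0 ? hc->portsc : UINT32_C(0xFFFFFFFF);
}

/* Shape of the work item after the patch: get, touch hardware, put. */
static uint32_t id_work(struct fake_hc *hc)
{
	uint32_t val;

	hc->usage_count++;	/* stands in for pm_runtime_get_sync() */
	val = read_portsc(hc);	/* stands in for xhci_hub_control() */
	hc->usage_count--;	/* stands in for pm_runtime_put_autosuspend() */
	return val;
}
```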
Fixes: f836e7843036 ("usb: xhci-tegra: Add OTG support") Cc: stable stable@kernel.org Signed-off-by: Jim Lin jilin@nvidia.com Signed-off-by: Wayne Chang waynec@nvidia.com Link: https://lore.kernel.org/r/20250422114001.126367-1-waynec@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-tegra.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c index b5c362c2051d..0c7af44d4dae 100644 --- a/drivers/usb/host/xhci-tegra.c +++ b/drivers/usb/host/xhci-tegra.c @@ -1364,6 +1364,7 @@ static void tegra_xhci_id_work(struct work_struct *work) tegra->otg_usb3_port = tegra_xusb_padctl_get_usb3_companion(tegra->padctl, tegra->otg_usb2_port);
+ pm_runtime_get_sync(tegra->dev); if (tegra->host_mode) { /* switch to host mode */ if (tegra->otg_usb3_port >= 0) { @@ -1393,6 +1394,7 @@ static void tegra_xhci_id_work(struct work_struct *work) }
tegra_xhci_set_port_power(tegra, true, true); + pm_runtime_mark_last_busy(tegra->dev);
} else { if (tegra->otg_usb3_port >= 0) @@ -1400,6 +1402,7 @@ static void tegra_xhci_id_work(struct work_struct *work)
tegra_xhci_set_port_power(tegra, true, false); } + pm_runtime_put_autosuspend(tegra->dev); }
#if IS_ENABLED(CONFIG_PM) || IS_ENABLED(CONFIG_PM_SLEEP)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lukasz Czechowski lukasz.czechowski@thaumatec.com
commit 9f657a92805cfc98e11cf5da9e8f4e02ecff2260 upstream.
The Cypress HX3 USB3.0 hubs use different PID values depending on the product variant. The comment in the compatibles table is misleading, as the currently used PIDs (0x6504 and 0x6506 for USB 3.0 and USB 2.0, respectively) are defaults for the CYUSB331x, while the CYUSB330x and CYUSB332x variants use different values. Based on the datasheet [1], update the compatible USB devices table to handle the different types of the hub. The change also includes vendor mode PIDs, which are used by the hub in I2C Master boot mode if the connected EEPROM contains an invalid signature or is blank. This allows the hub to boot correctly even if the EEPROM has broken content. The number of vcc supplies and the timing requirements are the same for all HX variants, so the platform driver's match table does not have to be extended.
[1] https://www.infineon.com/dgdl/Infineon-HX3_USB_3_0_Hub_Consumer_Industrial-D... Table 9. PID Values
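For reference, the PID-to-variant mapping can be expressed as a small lookup. The PIDs and descriptions are copied from the table in the diff below; `struct hx3_id` and `hx3_describe()` are hypothetical, and the numeric value of VENDOR_ID_CYPRESS is an assumption, since this patch does not show it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VENDOR_ID_CYPRESS 0x04b4	/* assumed value of the kernel macro */

struct hx3_id {
	uint16_t pid;
	const char *desc;
};

/* Entries copied from the table added by the patch. */
static const struct hx3_id hx3_ids[] = {
	{ 0x6500, "CYUSB330x 3.0 HUB" },
	{ 0x6502, "CYUSB330x 2.0 HUB" },
	{ 0x6503, "CYUSB33{0,1}x 2.0 HUB, Vendor Mode" },
	{ 0x6504, "CYUSB331x 3.0 HUB" },
	{ 0x6506, "CYUSB331x 2.0 HUB" },
	{ 0x6507, "CYUSB332x 2.0 HUB, Vendor Mode" },
	{ 0x6508, "CYUSB332x 3.0 HUB" },
	{ 0x650a, "CYUSB332x 2.0 HUB" },
};

/* Returns the variant description, or NULL if the device is not an HX3. */
static const char *hx3_describe(uint16_t vid, uint16_t pid)
{
	size_t i;

	if (vid != VENDOR_ID_CYPRESS)
		return NULL;
	for (i = 0; i < sizeof(hx3_ids) / sizeof(hx3_ids[0]); i++) {
		if (hx3_ids[i].pid == pid)
			return hx3_ids[i].desc;
	}
	return NULL;
}
```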
Fixes: b43cd82a1a40 ("usb: misc: onboard-hub: add support for Cypress HX3 USB 3.0 family") Cc: stable stable@kernel.org Signed-off-by: Lukasz Czechowski lukasz.czechowski@thaumatec.com Link: https://lore.kernel.org/r/20250425-onboard_usb_dev-v2-1-4a76a474a010@thaumat... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/misc/onboard_usb_dev.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/usb/misc/onboard_usb_dev.c +++ b/drivers/usb/misc/onboard_usb_dev.c @@ -569,8 +569,14 @@ static void onboard_dev_usbdev_disconnec }
static const struct usb_device_id onboard_dev_id_table[] = { - { USB_DEVICE(VENDOR_ID_CYPRESS, 0x6504) }, /* CYUSB33{0,1,2}x/CYUSB230x 3.0 HUB */ - { USB_DEVICE(VENDOR_ID_CYPRESS, 0x6506) }, /* CYUSB33{0,1,2}x/CYUSB230x 2.0 HUB */ + { USB_DEVICE(VENDOR_ID_CYPRESS, 0x6500) }, /* CYUSB330x 3.0 HUB */ + { USB_DEVICE(VENDOR_ID_CYPRESS, 0x6502) }, /* CYUSB330x 2.0 HUB */ + { USB_DEVICE(VENDOR_ID_CYPRESS, 0x6503) }, /* CYUSB33{0,1}x 2.0 HUB, Vendor Mode */ + { USB_DEVICE(VENDOR_ID_CYPRESS, 0x6504) }, /* CYUSB331x 3.0 HUB */ + { USB_DEVICE(VENDOR_ID_CYPRESS, 0x6506) }, /* CYUSB331x 2.0 HUB */ + { USB_DEVICE(VENDOR_ID_CYPRESS, 0x6507) }, /* CYUSB332x 2.0 HUB, Vendor Mode */ + { USB_DEVICE(VENDOR_ID_CYPRESS, 0x6508) }, /* CYUSB332x 3.0 HUB */ + { USB_DEVICE(VENDOR_ID_CYPRESS, 0x650a) }, /* CYUSB332x 2.0 HUB */ { USB_DEVICE(VENDOR_ID_CYPRESS, 0x6570) }, /* CY7C6563x 2.0 HUB */ { USB_DEVICE(VENDOR_ID_GENESYS, 0x0608) }, /* Genesys Logic GL850G USB 2.0 HUB */ { USB_DEVICE(VENDOR_ID_GENESYS, 0x0610) }, /* Genesys Logic GL852G USB 2.0 HUB */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: RD Babiera rdbabiera@google.com
commit e918d3959b5ae0e793b8f815ce62240e10ba03a4 upstream.
This patch fixes Type-C Compliance Test TD 4.7.6 - Try.SNK DRP Connect SNKAS.
The compliance tester moves into SNK_UNATTACHED during toggling and expects the PUT to apply Rp after tPDDebounce of detection. If the port is in SNK_TRY_WAIT_DEBOUNCE, it will move into SRC_TRYWAIT immediately and apply Rp. This violates TD 4.7.5.V.3, where the tester confirms that the PUT attaches Rp after the transition to Unattached.SNK for tPDDebounce.
Change the tcpm_set_state delay between SNK_TRY_WAIT_DEBOUNCE and SRC_TRYWAIT to tPDDebounce.
Fixes: a0a3e04e6b2c ("staging: typec: tcpm: Check for Rp for tPDDebounce") Cc: stable stable@kernel.org Signed-off-by: RD Babiera rdbabiera@google.com Reviewed-by: Badhri Jagan Sridharan badhri@google.com Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20250429234703.3748506-2-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/tcpm/tcpm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/typec/tcpm/tcpm.c +++ b/drivers/usb/typec/tcpm/tcpm.c @@ -5965,7 +5965,7 @@ static void _tcpm_cc_change(struct tcpm_ case SNK_TRY_WAIT_DEBOUNCE: if (!tcpm_port_is_sink(port)) { port->max_wait = 0; - tcpm_set_state(port, SRC_TRYWAIT, 0); + tcpm_set_state(port, SRC_TRYWAIT, PD_T_PD_DEBOUNCE); } break; case SRC_TRY_WAIT:
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrei Kuchynski akuchynski@chromium.org
commit 364618c89d4c57c85e5fc51a2446cd939bf57802 upstream.
This patch introduces the ucsi_con_mutex_lock / ucsi_con_mutex_unlock functions to the UCSI driver. ucsi_con_mutex_lock ensures the connector mutex is only locked if a connection is established and the partner pointer is valid. This resolves a deadlock scenario where ucsi_displayport_remove_partner holds con->mutex waiting for dp_altmode_work to complete while dp_altmode_work attempts to acquire it.
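The idea of a lock that refuses to be taken once the partner is gone can be sketched in user space. This is not the driver's implementation, which is not shown in this part of the series: `struct fake_con`, `con_lock()` and `con_unlock()` are hypothetical, a C11 `atomic_flag` stands in for the mutex, and giving up while waiting once the connection drops is an assumption about how the deadlock is avoided.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the connector state used by the helpers. */
struct fake_con {
	atomic_flag lock;		/* stands in for con->lock */
	atomic_bool connected;
	_Atomic(void *) partner;
};

/* Returns true with the lock held, or false with the lock not held. */
static bool con_lock(struct fake_con *con)
{
	/* Wait for the lock, but stop waiting once the partner goes away. */
	while (atomic_flag_test_and_set(&con->lock)) {
		if (!atomic_load(&con->connected))
			return false;	/* the holder may be tearing down */
	}

	if (!atomic_load(&con->connected) || !atomic_load(&con->partner)) {
		atomic_flag_clear(&con->lock);
		return false;
	}
	return true;
}

static void con_unlock(struct fake_con *con)
{
	atomic_flag_clear(&con->lock);
}
```

A caller then follows the pattern used in the diff below: return -ENOTCONN when the lock helper fails, and otherwise pair it with the unlock helper on every exit path.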
Cc: stable stable@kernel.org Fixes: af8622f6a585 ("usb: typec: ucsi: Support for DisplayPort alt mode") Signed-off-by: Andrei Kuchynski akuchynski@chromium.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20250424084429.3220757-2-akuchynski@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/ucsi/displayport.c | 19 +++++++++++-------- drivers/usb/typec/ucsi/ucsi.c | 34 ++++++++++++++++++++++++++++++++++ drivers/usb/typec/ucsi/ucsi.h | 2 ++ 3 files changed, 47 insertions(+), 8 deletions(-)
--- a/drivers/usb/typec/ucsi/displayport.c +++ b/drivers/usb/typec/ucsi/displayport.c @@ -54,7 +54,8 @@ static int ucsi_displayport_enter(struct u8 cur = 0; int ret;
- mutex_lock(&dp->con->lock); + if (!ucsi_con_mutex_lock(dp->con)) + return -ENOTCONN;
if (!dp->override && dp->initialized) { const struct typec_altmode *p = typec_altmode_get_partner(alt); @@ -100,7 +101,7 @@ static int ucsi_displayport_enter(struct schedule_work(&dp->work); ret = 0; err_unlock: - mutex_unlock(&dp->con->lock); + ucsi_con_mutex_unlock(dp->con);
return ret; } @@ -112,7 +113,8 @@ static int ucsi_displayport_exit(struct u64 command; int ret = 0;
- mutex_lock(&dp->con->lock); + if (!ucsi_con_mutex_lock(dp->con)) + return -ENOTCONN;
if (!dp->override) { const struct typec_altmode *p = typec_altmode_get_partner(alt); @@ -144,7 +146,7 @@ static int ucsi_displayport_exit(struct schedule_work(&dp->work);
out_unlock: - mutex_unlock(&dp->con->lock); + ucsi_con_mutex_unlock(dp->con);
return ret; } @@ -202,20 +204,21 @@ static int ucsi_displayport_vdm(struct t int cmd = PD_VDO_CMD(header); int svdm_version;
- mutex_lock(&dp->con->lock); + if (!ucsi_con_mutex_lock(dp->con)) + return -ENOTCONN;
if (!dp->override && dp->initialized) { const struct typec_altmode *p = typec_altmode_get_partner(alt);
dev_warn(&p->dev, "firmware doesn't support alternate mode overriding\n"); - mutex_unlock(&dp->con->lock); + ucsi_con_mutex_unlock(dp->con); return -EOPNOTSUPP; }
svdm_version = typec_altmode_get_svdm_version(alt); if (svdm_version < 0) { - mutex_unlock(&dp->con->lock); + ucsi_con_mutex_unlock(dp->con); return svdm_version; }
@@ -259,7 +262,7 @@ static int ucsi_displayport_vdm(struct t break; }
- mutex_unlock(&dp->con->lock); + ucsi_con_mutex_unlock(dp->con);
return 0; } --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -1923,6 +1923,40 @@ void ucsi_set_drvdata(struct ucsi *ucsi, EXPORT_SYMBOL_GPL(ucsi_set_drvdata);
/** + * ucsi_con_mutex_lock - Acquire the connector mutex + * @con: The connector interface to lock + * + * Returns true on success, false if the connector is disconnected + */ +bool ucsi_con_mutex_lock(struct ucsi_connector *con) +{ + bool mutex_locked = false; + bool connected = true; + + while (connected && !mutex_locked) { + mutex_locked = mutex_trylock(&con->lock) != 0; + connected = UCSI_CONSTAT(con, CONNECTED); + if (connected && !mutex_locked) + msleep(20); + } + + connected = connected && con->partner; + if (!connected && mutex_locked) + mutex_unlock(&con->lock); + + return connected; +} + +/** + * ucsi_con_mutex_unlock - Release the connector mutex + * @con: The connector interface to unlock + */ +void ucsi_con_mutex_unlock(struct ucsi_connector *con) +{ + mutex_unlock(&con->lock); +} + +/** * ucsi_create - Allocate UCSI instance * @dev: Device interface to the PPM (Platform Policy Manager) * @ops: I/O routines --- a/drivers/usb/typec/ucsi/ucsi.h +++ b/drivers/usb/typec/ucsi/ucsi.h @@ -94,6 +94,8 @@ int ucsi_register(struct ucsi *ucsi); void ucsi_unregister(struct ucsi *ucsi); void *ucsi_get_drvdata(struct ucsi *ucsi); void ucsi_set_drvdata(struct ucsi *ucsi, void *data); +bool ucsi_con_mutex_lock(struct ucsi_connector *con); +void ucsi_con_mutex_unlock(struct ucsi_connector *con);
void ucsi_connector_change(struct ucsi *ucsi, u8 num);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrei Kuchynski akuchynski@chromium.org
commit 312d79669e71283d05c05cc49a1a31e59e3d9e0e upstream.
This patch ensures that the UCSI driver waits for all pending tasks in the ucsi_displayport_work workqueue to finish executing before proceeding with the partner removal.
Cc: stable stable@kernel.org
Fixes: af8622f6a585 ("usb: typec: ucsi: Support for DisplayPort alt mode")
Signed-off-by: Andrei Kuchynski akuchynski@chromium.org
Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com
Reviewed-by: Benson Leung bleung@chromium.org
Link: https://lore.kernel.org/r/20250424084429.3220757-3-akuchynski@chromium.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/usb/typec/ucsi/displayport.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/usb/typec/ucsi/displayport.c
+++ b/drivers/usb/typec/ucsi/displayport.c
@@ -299,6 +299,8 @@ void ucsi_displayport_remove_partner(str
 	if (!dp)
 		return;
 
+	cancel_work_sync(&dp->work);
+
 	dp->data.conf = 0;
 	dp->data.status = 0;
 	dp->initialized = false;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oliver Neukum oneukum@suse.com
commit 054c5145540e5ad5b80adf23a5e3e2fc281fb8aa upstream.
usbtmc_read() calls usbtmc_generic_read(), which uses interruptible sleep, but usbtmc_read() itself uses uninterruptible sleep for mutual exclusion between threads. That makes no sense. Both should use interruptible sleep.
Fixes: 5b775f672cc99 ("USB: add USB test and measurement class driver")
Cc: stable stable@kernel.org
Signed-off-by: Oliver Neukum oneukum@suse.com
Link: https://lore.kernel.org/r/20250430134810.226015-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/usb/class/usbtmc.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/drivers/usb/class/usbtmc.c
+++ b/drivers/usb/class/usbtmc.c
@@ -1380,7 +1380,10 @@ static ssize_t usbtmc_read(struct file *
 	if (!buffer)
 		return -ENOMEM;
 
-	mutex_lock(&data->io_mutex);
+	retval = mutex_lock_interruptible(&data->io_mutex);
+	if (retval < 0)
+		goto exit_nolock;
+
 	if (data->zombie) {
 		retval = -ENODEV;
 		goto exit;
@@ -1503,6 +1506,7 @@ static ssize_t usbtmc_read(struct file *
 
 exit:
 	mutex_unlock(&data->io_mutex);
+exit_nolock:
 	kfree(buffer);
 	return retval;
 }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Penkler dpenkler@gmail.com
commit cac01bd178d6a2a23727f138d647ce1a0e8a73a1 upstream.
wait_event_interruptible_timeout returns a long. The return value was being assigned to an int, causing an integer overflow when the remaining jiffies > INT_MAX and resulting in random error returns.
Use a long return value and convert to int ioctl return only on error.
When the return value of wait_event_interruptible_timeout was <= INT_MAX the number of remaining jiffies was returned which has no meaning for the user. Return 0 on success.
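The truncation described above can be reproduced in user space. The following is only a sketch, not driver code: the demo_ function names are hypothetical, the wait result is modelled as int64_t so that it behaves the same on 32-bit hosts, and the out-of-range conversion to int is implementation-defined in C (GCC and Clang wrap modulo 2^32).

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Buggy pattern: a "remaining jiffies" value is stored in an int. */
int demo_wait_result_truncated(int64_t remaining_jiffies)
{
	/* Out-of-range conversion: implementation-defined, wraps on GCC/Clang. */
	return (int)remaining_jiffies;
}

/*
 * Fixed pattern: keep the wide value and only narrow it on the error path,
 * where it is a small negative errno. Success is reported as 0.
 */
int demo_wait_result_to_rv(int64_t wait_rv)
{
	if (wait_rv < 0)
		return (int)wait_rv;	/* interrupted: small negative errno */
	if (wait_rv == 0)
		return -ETIMEDOUT;	/* timed out */
	return 0;			/* condition became true in time */
}
```

With a remaining time just above INT_MAX, the first helper yields a negative number that looks like an error code, while the second one reports success.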
Reported-by: Michael Katzmann vk2bea@gmail.com Fixes: dbf3e7f654c0 ("Implement an ioctl to support the USMTMC-USB488 READ_STATUS_BYTE operation.") Cc: stable@vger.kernel.org Signed-off-by: Dave Penkler dpenkler@gmail.com Link: https://lore.kernel.org/r/20250502070941.31819-2-dpenkler@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/usbtmc.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
--- a/drivers/usb/class/usbtmc.c +++ b/drivers/usb/class/usbtmc.c @@ -482,6 +482,7 @@ static int usbtmc_get_stb(struct usbtmc_ u8 *buffer; u8 tag; int rv; + long wait_rv;
dev_dbg(dev, "Enter ioctl_read_stb iin_ep_present: %d\n", data->iin_ep_present); @@ -511,16 +512,17 @@ static int usbtmc_get_stb(struct usbtmc_ }
if (data->iin_ep_present) { - rv = wait_event_interruptible_timeout( + wait_rv = wait_event_interruptible_timeout( data->waitq, atomic_read(&data->iin_data_valid) != 0, file_data->timeout); - if (rv < 0) { - dev_dbg(dev, "wait interrupted %d\n", rv); + if (wait_rv < 0) { + dev_dbg(dev, "wait interrupted %ld\n", wait_rv); + rv = wait_rv; goto exit; }
- if (rv == 0) { + if (wait_rv == 0) { dev_dbg(dev, "wait timed out\n"); rv = -ETIMEDOUT; goto exit; @@ -539,6 +541,8 @@ static int usbtmc_get_stb(struct usbtmc_
dev_dbg(dev, "stb:0x%02x received %d\n", (unsigned int)*stb, rv);
+ rv = 0; + exit: /* bump interrupt bTag */ data->iin_bTag += 1;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Penkler dpenkler@gmail.com
commit a9747c9b8b59ab4207effd20eb91a890acb44e16 upstream.
wait_event_interruptible_timeout returns a long. The return value was being assigned to an int, causing an integer overflow when the remaining jiffies > INT_MAX and resulting in random error returns.
Use a long return value, converting to the int ioctl return only on error.
Fixes: 739240a9f6ac ("usb: usbtmc: Add ioctl USBTMC488_IOCTL_WAIT_SRQ") Cc: stable@vger.kernel.org Signed-off-by: Dave Penkler dpenkler@gmail.com Link: https://lore.kernel.org/r/20250502070941.31819-3-dpenkler@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/usbtmc.c | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-)
--- a/drivers/usb/class/usbtmc.c +++ b/drivers/usb/class/usbtmc.c @@ -606,9 +606,9 @@ static int usbtmc488_ioctl_wait_srq(stru { struct usbtmc_device_data *data = file_data->data; struct device *dev = &data->intf->dev; - int rv; u32 timeout; unsigned long expire; + long wait_rv;
if (!data->iin_ep_present) { dev_dbg(dev, "no interrupt endpoint present\n"); @@ -622,25 +622,24 @@ static int usbtmc488_ioctl_wait_srq(stru
mutex_unlock(&data->io_mutex);
- rv = wait_event_interruptible_timeout( - data->waitq, - atomic_read(&file_data->srq_asserted) != 0 || - atomic_read(&file_data->closing), - expire); + wait_rv = wait_event_interruptible_timeout( + data->waitq, + atomic_read(&file_data->srq_asserted) != 0 || + atomic_read(&file_data->closing), + expire);
mutex_lock(&data->io_mutex);
/* Note! disconnect or close could be called in the meantime */ if (atomic_read(&file_data->closing) || data->zombie) - rv = -ENODEV; + return -ENODEV;
- if (rv < 0) { - /* dev can be invalid now! */ - pr_debug("%s - wait interrupted %d\n", __func__, rv); - return rv; + if (wait_rv < 0) { + dev_dbg(dev, "%s - wait interrupted %ld\n", __func__, wait_rv); + return wait_rv; }
- if (rv == 0) { + if (wait_rv == 0) { dev_dbg(dev, "%s - wait timed out\n", __func__); return -ETIMEDOUT; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Penkler dpenkler@gmail.com
commit 4e77d3ec7c7c0d9535ccf1138827cb9bb5480b9b upstream.
wait_event_interruptible_timeout returns a long. The return value was being assigned to an int, causing an integer overflow when the remaining jiffies > INT_MAX, which resulted in random error returns.
Use a long return value, converting to the int ioctl return only on error.
Fixes: bb99794a4792 ("usb: usbtmc: Add ioctl for vendor specific read") Cc: stable@vger.kernel.org Signed-off-by: Dave Penkler dpenkler@gmail.com Link: https://lore.kernel.org/r/20250502070941.31819-4-dpenkler@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/usbtmc.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
--- a/drivers/usb/class/usbtmc.c +++ b/drivers/usb/class/usbtmc.c @@ -833,6 +833,7 @@ static ssize_t usbtmc_generic_read(struc unsigned long expire; int bufcount = 1; int again = 0; + long wait_rv;
/* mutex already locked */
@@ -945,19 +946,24 @@ static ssize_t usbtmc_generic_read(struc if (!(flags & USBTMC_FLAG_ASYNC)) { dev_dbg(dev, "%s: before wait time %lu\n", __func__, expire); - retval = wait_event_interruptible_timeout( + wait_rv = wait_event_interruptible_timeout( file_data->wait_bulk_in, usbtmc_do_transfer(file_data), expire);
- dev_dbg(dev, "%s: wait returned %d\n", - __func__, retval); + dev_dbg(dev, "%s: wait returned %ld\n", + __func__, wait_rv);
- if (retval <= 0) { - if (retval == 0) - retval = -ETIMEDOUT; + if (wait_rv < 0) { + retval = wait_rv; goto error; } + + if (wait_rv == 0) { + retval = -ETIMEDOUT; + goto error; + } + }
urb = usb_get_from_anchor(&file_data->in_anchor);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo Silva gustavograzs@gmail.com
[ Upstream commit 6d03811d7a99e08d5928f58120acb45b8ba22b08 ]
In the bmi270_configure_imu() function, the accelerometer and gyroscope configuration registers are incorrectly written with the mask BMI270_PWR_CONF_ADV_PWR_SAVE_MSK, which is unrelated to these registers.
As a result, the accelerometer's sampling frequency is set to 200 Hz instead of the intended 100 Hz.
Remove the mask to ensure the correct bits are set in the configuration registers.
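How a stray mask bit can turn 100 Hz into 200 Hz is sketched below. All DEMO_ constants are assumptions for illustration (an ODR field in bits 3:0, codes 0x08 and 0x09 for 100 Hz and 200 Hz, and a power-save flag in bit 0); they are not taken from this patch and should be checked against the BMI270 datasheet and the driver headers.

```c
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define DEMO_ODR_MSK		0x0Fu	/* ODR field, bits 3:0 */
#define DEMO_ODR_100HZ		0x08u
#define DEMO_ODR_200HZ		0x09u
#define DEMO_ADV_PWR_SAVE_MSK	0x01u	/* belongs to a different register */

/* Old behaviour: an unrelated bit 0 is OR-ed into the register value. */
uint8_t demo_acc_conf_buggy(void)
{
	return (uint8_t)((DEMO_ODR_100HZ & DEMO_ODR_MSK) | DEMO_ADV_PWR_SAVE_MSK);
}

/* Fixed behaviour: only fields that belong to this register are set. */
uint8_t demo_acc_conf_fixed(void)
{
	return (uint8_t)(DEMO_ODR_100HZ & DEMO_ODR_MSK);
}

/* Extract the ODR code the device would see. */
uint8_t demo_odr_field(uint8_t reg)
{
	return (uint8_t)(reg & DEMO_ODR_MSK);
}
```

Under these assumed values, the extra bit lands inside the ODR field and changes code 0x08 into 0x09, which matches the 200 Hz symptom described in the commit message.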
Fixes: 3ea51548d6b2 ("iio: imu: Add i2c driver for bmi270 imu") Signed-off-by: Gustavo Silva gustavograzs@gmail.com Reviewed-by: Alex Lanzano lanzano.alex@gmail.com Link: https://patch.msgid.link/20250304-bmi270-odr-fix-v1-1-384dbcd699fb@gmail.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/imu/bmi270/bmi270_core.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/iio/imu/bmi270/bmi270_core.c b/drivers/iio/imu/bmi270/bmi270_core.c index 7fec52e0b4862..950fcacddd40d 100644 --- a/drivers/iio/imu/bmi270/bmi270_core.c +++ b/drivers/iio/imu/bmi270/bmi270_core.c @@ -654,8 +654,7 @@ static int bmi270_configure_imu(struct bmi270_data *bmi270_device) FIELD_PREP(BMI270_ACC_CONF_ODR_MSK, BMI270_ACC_CONF_ODR_100HZ) | FIELD_PREP(BMI270_ACC_CONF_BWP_MSK, - BMI270_ACC_CONF_BWP_NORMAL_MODE) | - BMI270_PWR_CONF_ADV_PWR_SAVE_MSK); + BMI270_ACC_CONF_BWP_NORMAL_MODE)); if (ret) return dev_err_probe(dev, ret, "Failed to configure accelerometer");
@@ -663,8 +662,7 @@ static int bmi270_configure_imu(struct bmi270_data *bmi270_device) FIELD_PREP(BMI270_GYR_CONF_ODR_MSK, BMI270_GYR_CONF_ODR_200HZ) | FIELD_PREP(BMI270_GYR_CONF_BWP_MSK, - BMI270_GYR_CONF_BWP_NORMAL_MODE) | - BMI270_PWR_CONF_ADV_PWR_SAVE_MSK); + BMI270_GYR_CONF_BWP_NORMAL_MODE)); if (ret) return dev_err_probe(dev, ret, "Failed to configure gyroscope");
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lothar Rubusch l.rubusch@gmail.com
[ Upstream commit 38f67d0264929762e54ae5948703a21f841fe706 ]
Fix setting the ODR value so that the activity time is updated based on the frequency derived from the new ODR, and not from the obsolete ODR value.

The [small] bug: When _adxl367_set_odr() is called with a new ODR value, it first writes the new ODR value to the hardware register ADXL367_REG_FILTER_CTL. Second, it calls _adxl367_set_act_time_ms(), which calls adxl367_time_ms_to_samples(). Here st->odr still holds the old ODR value. This st->odr member is used to derive a frequency value, which is applied to update ADXL367_REG_TIME_ACT. Hence, the idea is to update the activity time, based on possibilities and power consumption, by the current ODR rate. Finally, when the function calls return, the new ODR is assigned to st->odr back in _adxl367_set_odr().

The fix: When a new ODR value is written to ADXL367_REG_FILTER_CTL, ADXL367_REG_TIME_ACT should probably also be updated with a frequency based on the new ODR value and not the old one. Changing the location of the assignment to st->odr fixes this.
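The ordering problem can be modelled in a few lines. This is a simplified sketch with hypothetical names: the state struct, the plain Hz field and the integer conversion stand in for the driver's enum, frequency table and rounding, which are not reproduced here.

```c
/* Hypothetical, reduced stand-in for the driver state. */
struct demo_adxl_state {
	unsigned int odr_hz;		/* cached output data rate */
	unsigned int act_time_ms;	/* requested activity time */
	unsigned int time_act_reg;	/* value that would go to TIME_ACT */
};

/* The conversion depends on the cached ODR, as in the commit message. */
unsigned int demo_ms_to_samples(const struct demo_adxl_state *st,
				unsigned int ms)
{
	return ms * st->odr_hz / 1000;
}

/* Old order: the timer is derived from the stale ODR, then ODR is cached. */
void demo_set_odr_buggy(struct demo_adxl_state *st, unsigned int new_odr_hz)
{
	st->time_act_reg = demo_ms_to_samples(st, st->act_time_ms);
	st->odr_hz = new_odr_hz;
}

/* New order: cache the ODR first so the timer uses the new frequency. */
void demo_set_odr_fixed(struct demo_adxl_state *st, unsigned int new_odr_hz)
{
	st->odr_hz = new_odr_hz;
	st->time_act_reg = demo_ms_to_samples(st, st->act_time_ms);
}
```

Switching from 100 Hz to 400 Hz with a 50 ms activity time gives 5 samples in the old order and 20 samples in the new one.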
Fixes: cbab791c5e2a5 ("iio: accel: add ADXL367 driver") Signed-off-by: Lothar Rubusch l.rubusch@gmail.com Reviewed-by: Marcelo Schmitt marcelo.schmitt1@gmail.com Link: https://patch.msgid.link/20250309193515.2974-1-l.rubusch@gmail.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/accel/adxl367.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/drivers/iio/accel/adxl367.c b/drivers/iio/accel/adxl367.c index a48ac0d7bd96b..2ba7d7de47e44 100644 --- a/drivers/iio/accel/adxl367.c +++ b/drivers/iio/accel/adxl367.c @@ -604,18 +604,14 @@ static int _adxl367_set_odr(struct adxl367_state *st, enum adxl367_odr odr) if (ret) return ret;
+ st->odr = odr; + /* Activity timers depend on ODR */ ret = _adxl367_set_act_time_ms(st, st->act_time_ms); if (ret) return ret;
- ret = _adxl367_set_inact_time_ms(st, st->inact_time_ms); - if (ret) - return ret; - - st->odr = odr; - - return 0; + return _adxl367_set_inact_time_ms(st, st->inact_time_ms); }
static int adxl367_set_odr(struct iio_dev *indio_dev, enum adxl367_odr odr)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Cameron Jonathan.Cameron@huawei.com
[ Upstream commit f79aeb6c631b57395f37acbfbe59727e355a714c ]
The trick of using __aligned(IIO_DMA_MINALIGN) ensures that there is no overlap between buffers used for DMA and those used for driver state storage that are before the marking. It doesn't ensure anything about state variables found after the marking. Hence move this particular bit of state earlier in the structure.
Fixes: 10897f34309b ("iio: temp: maxim_thermocouple: Fix alignment for DMA safety")
Reviewed-by: David Lechner dlechner@baylibre.com
Link: https://patch.msgid.link/20250413103443.2420727-14-jic23@kernel.org
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/iio/temperature/maxim_thermocouple.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iio/temperature/maxim_thermocouple.c b/drivers/iio/temperature/maxim_thermocouple.c
index c28a7a6dea5f1..555a61e2f3fdd 100644
--- a/drivers/iio/temperature/maxim_thermocouple.c
+++ b/drivers/iio/temperature/maxim_thermocouple.c
@@ -121,9 +121,9 @@ static const struct maxim_thermocouple_chip maxim_thermocouple_chips[] = {
 struct maxim_thermocouple_data {
 	struct spi_device *spi;
 	const struct maxim_thermocouple_chip *chip;
+	char tc_type;
 
 	u8 buffer[16] __aligned(IIO_DMA_MINALIGN);
-	char tc_type;
 };
 
 static int maxim_thermocouple_read(struct maxim_thermocouple_data *data,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Cameron Jonathan.Cameron@huawei.com
[ Upstream commit 1bb942287e05dc4c304a003ea85e6dd9a5e7db39 ]
The IIO ABI requires 64-bit aligned timestamps. In this case insufficient padding would have been added on architectures where an s64 is only 32-bit aligned. Use aligned_s64 to enforce the correct alignment.
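The padding difference can be shown with offsetof() in user space. This is a sketch under stated assumptions: the demo_ typedefs are hypothetical and rely on the GCC/Clang aligned attribute (which may lower alignment when used in a typedef) to imitate an architecture where s64 is only 32-bit aligned, and a 10-byte buffer is used instead of the driver's 14 bytes so that the two layouts visibly differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Imitates an ABI where a 64-bit integer only needs 4-byte alignment. */
typedef int64_t demo_s64_a4 __attribute__((aligned(4)));
/* Imitates aligned_s64: 8-byte alignment on every architecture. */
typedef int64_t demo_aligned_s64 __attribute__((aligned(8)));

struct demo_scan_weak {
	uint8_t buf[10];
	demo_s64_a4 ts;		/* padded only up to the next multiple of 4 */
};

struct demo_scan_fixed {
	uint8_t buf[10];
	demo_aligned_s64 ts;	/* padded up to the next multiple of 8 */
};

size_t demo_ts_offset_weak(void)
{
	return offsetof(struct demo_scan_weak, ts);
}

size_t demo_ts_offset_fixed(void)
{
	return offsetof(struct demo_scan_fixed, ts);
}
```

In the weakly aligned layout the timestamp sits at a 4-byte boundary that is not a multiple of 8, which is the kind of placement the 64-bit alignment requirement rules out.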
Fixes: 327a0eaf19d5 ("iio: accel: adxl355: Add triggered buffer support")
Reported-by: David Lechner dlechner@baylibre.com
Reviewed-by: Nuno Sá nuno.sa@analog.com
Reviewed-by: David Lechner dlechner@baylibre.com
Link: https://patch.msgid.link/20250413103443.2420727-5-jic23@kernel.org
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/iio/accel/adxl355_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iio/accel/adxl355_core.c b/drivers/iio/accel/adxl355_core.c
index e8cd21fa77a69..cbac622ef8211 100644
--- a/drivers/iio/accel/adxl355_core.c
+++ b/drivers/iio/accel/adxl355_core.c
@@ -231,7 +231,7 @@ struct adxl355_data {
 		u8 transf_buf[3];
 		struct {
 			u8 buf[14];
-			s64 ts;
+			aligned_s64 ts;
 		} buffer;
 	} __aligned(IIO_DMA_MINALIGN);
 };
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Cameron Jonathan.Cameron@huawei.com
[ Upstream commit 5097eaae98e53f9ab9d35801c70da819b92ca907 ]
Here the lack of marking allows the overall structure to not be sufficiently aligned resulting in misplacement of the timestamp in iio_push_to_buffers_with_timestamp(). Use aligned_s64 to force the alignment on all architectures.
Fixes: 7c0299e879dd ("iio: adc: Add support for DLN2 ADC")
Reported-by: David Lechner dlechner@baylibre.com
Reviewed-by: Andy Shevchenko andy@kernel.org
Reviewed-by: Nuno Sá nuno.sa@analog.com
Reviewed-by: David Lechner dlechner@baylibre.com
Link: https://patch.msgid.link/20250413103443.2420727-4-jic23@kernel.org
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/iio/adc/dln2-adc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iio/adc/dln2-adc.c b/drivers/iio/adc/dln2-adc.c
index 221a5fdc1eaac..e416501770855 100644
--- a/drivers/iio/adc/dln2-adc.c
+++ b/drivers/iio/adc/dln2-adc.c
@@ -467,7 +467,7 @@ static irqreturn_t dln2_adc_trigger_h(int irq, void *p)
 	struct iio_dev *indio_dev = pf->indio_dev;
 	struct {
 		__le16 values[DLN2_ADC_MAX_CHANNELS];
-		int64_t timestamp_space;
+		aligned_s64 timestamp_space;
 	} data;
 	struct dln2_adc_get_all_vals dev_data;
 	struct dln2_adc *dln2 = iio_priv(indio_dev);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marco Crivellari marco.crivellari@suse.com
[ Upstream commit 56651128e2fbad80f632f388d6bf1f39c928267a ]
MIPS re-enables interrupts on its idle routine and performs a TIF_NEED_RESCHED check afterwards before putting the CPU to sleep.
The IRQs firing between the check and the 'wait' instruction may set the TIF_NEED_RESCHED flag. In order to deal with this possible race, IRQs interrupting __r4k_wait() rollback their return address to the beginning of __r4k_wait() so that TIF_NEED_RESCHED is checked again before going back to sleep.
However idle IRQs can also queue timers that may require a tick reprogramming through a new generic idle loop iteration but those timers would go unnoticed here because __r4k_wait() only checks TIF_NEED_RESCHED. It doesn't check for pending timers.
Fix this by fast-forwarding the return address of idle IRQs to the end of the idle routine instead of the beginning, so that the generic idle loop handles both TIF_NEED_RESCHED and pending timers.
The CONFIG_CPU_MICROMIPS ifdef has been removed along with the nop instructions. There, NOPs are 2 bytes in size, so replace them with 3 _ssnop, which are always 4 bytes each, and remove the ifdef. Add ehb to make sure the hazard is always cleared.
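The difference between rolling back and fast-forwarding the exception return address can be sketched as plain address arithmetic. This is a simplified model with hypothetical names: it ignores the microMIPS ISA bit handling done by the real assembly and assumes a region that is aligned to its own power-of-two size.

```c
#include <stdint.h>

/*
 * Old scheme: an interrupt taken inside the aligned region restarts the
 * region, so only the TIF_NEED_RESCHED check is repeated before sleeping.
 * The or/xor pair mirrors the "ori/xori 0x1f" masking of the old code.
 */
uintptr_t demo_rollback_epc(uintptr_t epc, uintptr_t region_start,
			    uintptr_t region_size)
{
	uintptr_t mask = region_size - 1;
	uintptr_t base = (epc | mask) ^ mask;	/* round down to region size */

	return base == region_start ? region_start : epc;
}

/*
 * New scheme: an interrupt taken inside the region resumes after the wait
 * instruction, so the generic idle loop re-evaluates everything.
 */
uintptr_t demo_fast_forward_epc(uintptr_t epc, uintptr_t region_start,
				uintptr_t region_size, uintptr_t exit_addr)
{
	uintptr_t base = epc & ~(region_size - 1);

	return base == region_start ? exit_addr : epc;
}
```

In both helpers an address outside the region is returned unchanged, which corresponds to the handler leaving EPC alone for interrupts that did not hit the idle routine.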
Fixes: c65a5480ff29 ("[MIPS] Fix potential latency problem due to non-atomic cpu_wait.") Signed-off-by: Marco Crivellari marco.crivellari@suse.com Signed-off-by: Maciej W. Rozycki macro@orcam.me.uk Acked-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/include/asm/idle.h | 3 +- arch/mips/kernel/genex.S | 62 +++++++++++++++++++++--------------- arch/mips/kernel/idle.c | 7 ---- 3 files changed, 37 insertions(+), 35 deletions(-)
diff --git a/arch/mips/include/asm/idle.h b/arch/mips/include/asm/idle.h index 0992cad9c632e..2bc3678455ed0 100644 --- a/arch/mips/include/asm/idle.h +++ b/arch/mips/include/asm/idle.h @@ -6,8 +6,7 @@ #include <linux/linkage.h>
extern void (*cpu_wait)(void); -extern void r4k_wait(void); -extern asmlinkage void __r4k_wait(void); +extern asmlinkage void r4k_wait(void); extern void r4k_wait_irqoff(void);
static inline int using_rollback_handler(void) diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S index a572ce36a24f2..46d975d00298d 100644 --- a/arch/mips/kernel/genex.S +++ b/arch/mips/kernel/genex.S @@ -104,42 +104,52 @@ handle_vcei:
__FINIT
- .align 5 /* 32 byte rollback region */ -LEAF(__r4k_wait) - .set push - .set noreorder - /* start of rollback region */ - LONG_L t0, TI_FLAGS($28) - nop - andi t0, _TIF_NEED_RESCHED - bnez t0, 1f - nop - nop - nop -#ifdef CONFIG_CPU_MICROMIPS - nop - nop - nop - nop -#endif + /* Align to 32 bytes for the maximum idle interrupt region size. */ + .align 5 +LEAF(r4k_wait) + /* Keep the ISA bit clear for calculations on local labels here. */ +0: .fill 0 + /* Start of idle interrupt region. */ + local_irq_enable + /* + * If an interrupt lands here, before going idle on the next + * instruction, we must *NOT* go idle since the interrupt could + * have set TIF_NEED_RESCHED or caused a timer to need resched. + * Fall through -- see rollback_handler below -- and have the + * idle loop take care of things. + */ +1: .fill 0 + /* The R2 EI/EHB sequence takes 8 bytes, otherwise pad up. */ + .if 1b - 0b > 32 + .error "overlong idle interrupt region" + .elseif 1b - 0b > 8 + .align 4 + .endif +2: .fill 0 + .equ r4k_wait_idle_size, 2b - 0b + /* End of idle interrupt region; size has to be a power of 2. */ .set MIPS_ISA_ARCH_LEVEL_RAW +r4k_wait_insn: wait - /* end of rollback region (the region size must be power of two) */ -1: +r4k_wait_exit: + .set mips0 + local_irq_disable jr ra - nop - .set pop - END(__r4k_wait) + END(r4k_wait) + .previous
.macro BUILD_ROLLBACK_PROLOGUE handler FEXPORT(rollback_\handler) .set push .set noat MFC0 k0, CP0_EPC - PTR_LA k1, __r4k_wait - ori k0, 0x1f /* 32 byte rollback region */ - xori k0, 0x1f + /* Subtract/add 2 to let the ISA bit propagate through the mask. */ + PTR_LA k1, r4k_wait_insn - 2 + ori k0, r4k_wait_idle_size - 2 + .set noreorder bne k0, k1, \handler + PTR_ADDIU k0, r4k_wait_exit - r4k_wait_insn + 2 + .set reorder MTC0 k0, CP0_EPC .set pop .endm diff --git a/arch/mips/kernel/idle.c b/arch/mips/kernel/idle.c index 5abc8b7340f88..80e8a04a642e0 100644 --- a/arch/mips/kernel/idle.c +++ b/arch/mips/kernel/idle.c @@ -35,13 +35,6 @@ static void __cpuidle r3081_wait(void) write_c0_conf(cfg | R30XX_CONF_HALT); }
-void __cpuidle r4k_wait(void) -{ - raw_local_irq_enable(); - __r4k_wait(); - raw_local_irq_disable(); -} - /* * This variant is preferable as it allows testing need_resched and going to * sleep depending on the outcome atomically. Unfortunately the "It is
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marco Crivellari marco.crivellari@suse.com
[ Upstream commit b713f27e32d87c35737ec942dd6f5ed6b7475f48 ]
Fix missing .cpuidle.text section assignment for r4k_wait() to correct backtracing with nmi_backtrace().
Fixes: 97c8580e85cf ("MIPS: Annotate cpu_wait implementations with __cpuidle") Signed-off-by: Marco Crivellari marco.crivellari@suse.com Signed-off-by: Maciej W. Rozycki macro@orcam.me.uk Acked-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/kernel/genex.S | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S index 46d975d00298d..2cf312d9a3b09 100644 --- a/arch/mips/kernel/genex.S +++ b/arch/mips/kernel/genex.S @@ -104,6 +104,7 @@ handle_vcei:
__FINIT
+ .section .cpuidle.text,"ax" /* Align to 32 bytes for the maximum idle interrupt region size. */ .align 5 LEAF(r4k_wait)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit b71f9804f66c2592d4c3a2397b7374a4039005a5 ]
Lei Chen raised an issue with CLOCK_MONOTONIC_COARSE seeing time inconsistencies. Lei tracked down that this was being caused by the adjustment:
tk->tkr_mono.xtime_nsec -= offset;
which is made to compensate for the unaccumulated cycles in offset when the multiplicator is adjusted forward, so that the non-_COARSE clockids don't see inconsistencies.
However, the _COARSE clockid getter functions use the adjusted xtime_nsec value directly and do not compensate the negative offset via the clocksource delta multiplied with the new multiplicator. In that case the caller can observe time going backwards in consecutive calls.
By design, this negative adjustment should be fine, because the logic run from timekeeping_adjust() is done after it accumulated approximately
multiplicator * interval_cycles
into xtime_nsec. The accumulated value is always larger than the
mult_adj * offset
value, which is subtracted from xtime_nsec. Both operations are done together under the tk_core.lock, so the net change to xtime_nsec is always positive.

However, do_adjtimex() calls into timekeeping_advance() as well, to apply the NTP frequency adjustment immediately. In this case, timekeeping_advance() does not return early when the offset is smaller than interval_cycles. In that case there is no time accumulated into xtime_nsec. But the subsequent call into timekeeping_adjust(), which modifies the multiplicator, subtracts from xtime_nsec to correct for the new multiplicator.
Here because there was no accumulation, xtime_nsec becomes smaller than before, which opens a window up to the next accumulation, where the _COARSE clockid getters, which don't compensate for the offset, can observe the inconsistency.
An attempt was made to fix this by forwarding the timekeeper in the case that adjtimex() adjusts the multiplier, which resets the offset to zero:
757b000f7b93 ("timekeeping: Fix possible inconsistencies in _COARSE clockids")
That works correctly, but unfortunately causes a regression on the adjtimex() side. There are two issues:
1) The forwarding of the base time moves the update out of the original period and establishes a new one.
2) The clearing of the accumulated NTP error is changing the behaviour as well.
User space expects that multiplier/frequency updates are in effect when the syscall returns, so delaying the update to the next tick does not solve the problem either.
Commit 757b000f7b93 was reverted so that the established expectations of user space implementations (ntpd, chronyd) are restored, but that obviously brought the inconsistencies back.
One of the initial approaches to fix this was to establish a separate storage for the coarse time getter nanoseconds part by calculating it from the offset. That was dropped on the floor because not having yet another state to maintain was simpler. But given the result of the above exercise, this solution turns out to be the right one. Bring it back in a slightly modified form.
Thus introduce timekeeper::coarse_nsec and store that nanoseconds part in it, switch the time getter functions and the VDSO update to use that value. coarse_nsec is set on operations which forward or initialize the timekeeper and after time was accumulated during a tick. If there is no accumulation the timestamp is unchanged.
This leaves the adjtimex() behaviour unmodified and prevents coarse time from going backwards.
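The effect of the separate coarse snapshot can be modelled with a reduced structure. This is only a sketch: struct demo_tk and the demo_ helpers are hypothetical stand-ins for the timekeeper fields named above, and locking, seconds handling and the clocksource are left out.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for struct timekeeper. */
struct demo_tk {
	uint64_t xtime_nsec;	/* shifted nanoseconds */
	uint32_t shift;
	uint32_t coarse_nsec;	/* snapshot used by the coarse getters */
};

void demo_update_coarse(struct demo_tk *tk)
{
	tk->coarse_nsec = (uint32_t)(tk->xtime_nsec >> tk->shift);
}

/* Tick accumulation: time advances and the snapshot is refreshed. */
void demo_accumulate(struct demo_tk *tk, uint64_t ns)
{
	tk->xtime_nsec += ns << tk->shift;
	demo_update_coarse(tk);
}

/* Multiplier adjustment without accumulation: snapshot is left alone. */
void demo_adjust(struct demo_tk *tk, uint64_t ns)
{
	tk->xtime_nsec -= ns << tk->shift;
}

/* Old coarse getter: reads xtime_nsec directly. */
uint32_t demo_coarse_old(const struct demo_tk *tk)
{
	return (uint32_t)(tk->xtime_nsec >> tk->shift);
}

/* New coarse getter: reads the snapshot. */
uint32_t demo_coarse_new(const struct demo_tk *tk)
{
	return tk->coarse_nsec;
}
```

After an adjustment that is not preceded by accumulation, the old getter reports a smaller value than before, while the snapshot-based getter stays put until the next accumulation moves it forward.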
[ jstultz: Simplified the coarse_nsec calculation and kept behavior so coarse clockids aren't adjusted on each inter-tick adjtimex call, slightly reworked the comments and commit message ]
Fixes: da15cfdae033 ("time: Introduce CLOCK_REALTIME_COARSE") Reported-by: Lei Chen lei.chen@smartx.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: John Stultz jstultz@google.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/all/20250419054706.2319105-1-jstultz@google.com Closes: https://lore.kernel.org/lkml/20250310030004.3705801-1-lei.chen@smartx.com/ Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/timekeeper_internal.h | 8 +++-- kernel/time/timekeeping.c | 50 ++++++++++++++++++++++++----- kernel/time/vsyscall.c | 4 +-- 3 files changed, 49 insertions(+), 13 deletions(-)
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h index e39d4d563b197..785048a3b3e60 100644 --- a/include/linux/timekeeper_internal.h +++ b/include/linux/timekeeper_internal.h @@ -51,7 +51,7 @@ struct tk_read_base { * @offs_real: Offset clock monotonic -> clock realtime * @offs_boot: Offset clock monotonic -> clock boottime * @offs_tai: Offset clock monotonic -> clock tai - * @tai_offset: The current UTC to TAI offset in seconds + * @coarse_nsec: The nanoseconds part for coarse time getters * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds * @clock_was_set_seq: The sequence number of clock was set events @@ -76,6 +76,7 @@ struct tk_read_base { * ntp shifted nano seconds. * @ntp_err_mult: Multiplication factor for scaled math conversion * @skip_second_overflow: Flag used to avoid updating NTP twice with same second + * @tai_offset: The current UTC to TAI offset in seconds * * Note: For timespec(64) based interfaces wall_to_monotonic is what * we need to add to xtime (or xtime corrected for sub jiffy times) @@ -100,7 +101,7 @@ struct tk_read_base { * which results in the following cacheline layout: * * 0: seqcount, tkr_mono - * 1: xtime_sec ... tai_offset + * 1: xtime_sec ... coarse_nsec * 2: tkr_raw, raw_sec * 3,4: Internal variables * @@ -121,7 +122,7 @@ struct timekeeper { ktime_t offs_real; ktime_t offs_boot; ktime_t offs_tai; - s32 tai_offset; + u32 coarse_nsec;
/* Cacheline 2: */ struct tk_read_base tkr_raw; @@ -144,6 +145,7 @@ struct timekeeper { u32 ntp_error_shift; u32 ntp_err_mult; u32 skip_second_overflow; + s32 tai_offset; };
#ifdef CONFIG_GENERIC_TIME_VSYSCALL diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index 1e67d076f1955..a009c91f7b05f 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -164,10 +164,34 @@ static inline struct timespec64 tk_xtime(const struct timekeeper *tk) return ts; }
+static inline struct timespec64 tk_xtime_coarse(const struct timekeeper *tk) +{ + struct timespec64 ts; + + ts.tv_sec = tk->xtime_sec; + ts.tv_nsec = tk->coarse_nsec; + return ts; +} + +/* + * Update the nanoseconds part for the coarse time keepers. They can't rely + * on xtime_nsec because xtime_nsec could be adjusted by a small negative + * amount when the multiplication factor of the clock is adjusted, which + * could cause the coarse clocks to go slightly backwards. See + * timekeeping_apply_adjustment(). Thus we keep a separate copy for the coarse + * clockids which only is updated when the clock has been set or we have + * accumulated time. + */ +static inline void tk_update_coarse_nsecs(struct timekeeper *tk) +{ + tk->coarse_nsec = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift; +} + static void tk_set_xtime(struct timekeeper *tk, const struct timespec64 *ts) { tk->xtime_sec = ts->tv_sec; tk->tkr_mono.xtime_nsec = (u64)ts->tv_nsec << tk->tkr_mono.shift; + tk_update_coarse_nsecs(tk); }
static void tk_xtime_add(struct timekeeper *tk, const struct timespec64 *ts) @@ -175,6 +199,7 @@ static void tk_xtime_add(struct timekeeper *tk, const struct timespec64 *ts) tk->xtime_sec += ts->tv_sec; tk->tkr_mono.xtime_nsec += (u64)ts->tv_nsec << tk->tkr_mono.shift; tk_normalize_xtime(tk); + tk_update_coarse_nsecs(tk); }
static void tk_set_wall_to_mono(struct timekeeper *tk, struct timespec64 wtm) @@ -708,6 +733,7 @@ static void timekeeping_forward_now(struct timekeeper *tk) tk_normalize_xtime(tk); delta -= incr; } + tk_update_coarse_nsecs(tk); }
/** @@ -804,8 +830,8 @@ EXPORT_SYMBOL_GPL(ktime_get_with_offset); ktime_t ktime_get_coarse_with_offset(enum tk_offsets offs) { struct timekeeper *tk = &tk_core.timekeeper; - unsigned int seq; ktime_t base, *offset = offsets[offs]; + unsigned int seq; u64 nsecs;
WARN_ON(timekeeping_suspended); @@ -813,7 +839,7 @@ ktime_t ktime_get_coarse_with_offset(enum tk_offsets offs) do { seq = read_seqcount_begin(&tk_core.seq); base = ktime_add(tk->tkr_mono.base, *offset); - nsecs = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift; + nsecs = tk->coarse_nsec;
} while (read_seqcount_retry(&tk_core.seq, seq));
@@ -2161,7 +2187,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode) struct timekeeper *real_tk = &tk_core.timekeeper; unsigned int clock_set = 0; int shift = 0, maxshift; - u64 offset; + u64 offset, orig_offset;
guard(raw_spinlock_irqsave)(&tk_core.lock);
@@ -2172,7 +2198,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode) offset = clocksource_delta(tk_clock_read(&tk->tkr_mono), tk->tkr_mono.cycle_last, tk->tkr_mono.mask, tk->tkr_mono.clock->max_raw_delta); - + orig_offset = offset; /* Check if there's really nothing to do */ if (offset < real_tk->cycle_interval && mode == TK_ADV_TICK) return false; @@ -2205,6 +2231,14 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode) */ clock_set |= accumulate_nsecs_to_secs(tk);
+ /* + * To avoid inconsistencies caused adjtimex TK_ADV_FREQ calls + * making small negative adjustments to the base xtime_nsec + * value, only update the coarse clocks if we accumulated time + */ + if (orig_offset != offset) + tk_update_coarse_nsecs(tk); + timekeeping_update_from_shadow(&tk_core, clock_set);
return !!clock_set; @@ -2248,7 +2282,7 @@ void ktime_get_coarse_real_ts64(struct timespec64 *ts) do { seq = read_seqcount_begin(&tk_core.seq);
- *ts = tk_xtime(tk); + *ts = tk_xtime_coarse(tk); } while (read_seqcount_retry(&tk_core.seq, seq)); } EXPORT_SYMBOL(ktime_get_coarse_real_ts64); @@ -2271,7 +2305,7 @@ void ktime_get_coarse_real_ts64_mg(struct timespec64 *ts)
do { seq = read_seqcount_begin(&tk_core.seq); - *ts = tk_xtime(tk); + *ts = tk_xtime_coarse(tk); offset = tk_core.timekeeper.offs_real; } while (read_seqcount_retry(&tk_core.seq, seq));
@@ -2350,12 +2384,12 @@ void ktime_get_coarse_ts64(struct timespec64 *ts) do { seq = read_seqcount_begin(&tk_core.seq);
- now = tk_xtime(tk); + now = tk_xtime_coarse(tk); mono = tk->wall_to_monotonic; } while (read_seqcount_retry(&tk_core.seq, seq));
set_normalized_timespec64(ts, now.tv_sec + mono.tv_sec, - now.tv_nsec + mono.tv_nsec); + now.tv_nsec + mono.tv_nsec); } EXPORT_SYMBOL(ktime_get_coarse_ts64);
diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c index 05d3831431658..c9d946b012d8b 100644 --- a/kernel/time/vsyscall.c +++ b/kernel/time/vsyscall.c @@ -97,12 +97,12 @@ void update_vsyscall(struct timekeeper *tk) /* CLOCK_REALTIME_COARSE */ vdso_ts = &vdata[CS_HRES_COARSE].basetime[CLOCK_REALTIME_COARSE]; vdso_ts->sec = tk->xtime_sec; - vdso_ts->nsec = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift; + vdso_ts->nsec = tk->coarse_nsec;
/* CLOCK_MONOTONIC_COARSE */ vdso_ts = &vdata[CS_HRES_COARSE].basetime[CLOCK_MONOTONIC_COARSE]; vdso_ts->sec = tk->xtime_sec + tk->wall_to_monotonic.tv_sec; - nsec = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift; + nsec = tk->coarse_nsec; nsec = nsec + tk->wall_to_monotonic.tv_nsec; vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Karol Wachowski karol.wachowski@intel.com
[ Upstream commit 950942b4813f8c44dbec683fdb140cf4a238516b ]
Move the doorbell ID and command queue ID XArray allocations out of the command queue memory allocation function. This allows IDs to be allocated without allocating the actual command queue memory.
Signed-off-by: Karol Wachowski karol.wachowski@intel.com Signed-off-by: Maciej Falkowski maciej.falkowski@linux.intel.com Reviewed-by: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com Signed-off-by: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-2-maciej... Stable-dep-of: 75680b7cd461 ("accel/ivpu: Correct mutex unlock order in job submission") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/accel/ivpu/ivpu_job.c | 88 +++++++++++++++++++++++++---------- 1 file changed, 64 insertions(+), 24 deletions(-)
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c index 673801889c7b2..766fc383680f1 100644 --- a/drivers/accel/ivpu/ivpu_job.c +++ b/drivers/accel/ivpu/ivpu_job.c @@ -83,23 +83,9 @@ static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv) if (!cmdq) return NULL;
- ret = xa_alloc_cyclic(&vdev->db_xa, &cmdq->db_id, NULL, vdev->db_limit, &vdev->db_next, - GFP_KERNEL); - if (ret < 0) { - ivpu_err(vdev, "Failed to allocate doorbell id: %d\n", ret); - goto err_free_cmdq; - } - - ret = xa_alloc_cyclic(&file_priv->cmdq_xa, &cmdq->id, cmdq, file_priv->cmdq_limit, - &file_priv->cmdq_id_next, GFP_KERNEL); - if (ret < 0) { - ivpu_err(vdev, "Failed to allocate command queue id: %d\n", ret); - goto err_erase_db_xa; - } - cmdq->mem = ivpu_bo_create_global(vdev, SZ_4K, DRM_IVPU_BO_WC | DRM_IVPU_BO_MAPPABLE); if (!cmdq->mem) - goto err_erase_cmdq_xa; + goto err_free_cmdq;
ret = ivpu_preemption_buffers_create(vdev, file_priv, cmdq); if (ret) @@ -107,10 +93,6 @@ static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv)
return cmdq;
-err_erase_cmdq_xa: - xa_erase(&file_priv->cmdq_xa, cmdq->id); -err_erase_db_xa: - xa_erase(&vdev->db_xa, cmdq->db_id); err_free_cmdq: kfree(cmdq); return NULL; @@ -234,30 +216,88 @@ static int ivpu_cmdq_fini(struct ivpu_file_priv *file_priv, struct ivpu_cmdq *cm return 0; }
+static int ivpu_db_id_alloc(struct ivpu_device *vdev, u32 *db_id) +{ + int ret; + u32 id; + + ret = xa_alloc_cyclic(&vdev->db_xa, &id, NULL, vdev->db_limit, &vdev->db_next, GFP_KERNEL); + if (ret < 0) + return ret; + + *db_id = id; + return 0; +} + +static int ivpu_cmdq_id_alloc(struct ivpu_file_priv *file_priv, u32 *cmdq_id) +{ + int ret; + u32 id; + + ret = xa_alloc_cyclic(&file_priv->cmdq_xa, &id, NULL, file_priv->cmdq_limit, + &file_priv->cmdq_id_next, GFP_KERNEL); + if (ret < 0) + return ret; + + *cmdq_id = id; + return 0; +} + static struct ivpu_cmdq *ivpu_cmdq_acquire(struct ivpu_file_priv *file_priv, u8 priority) { + struct ivpu_device *vdev = file_priv->vdev; struct ivpu_cmdq *cmdq; - unsigned long cmdq_id; + unsigned long id; int ret;
lockdep_assert_held(&file_priv->lock);
- xa_for_each(&file_priv->cmdq_xa, cmdq_id, cmdq) + xa_for_each(&file_priv->cmdq_xa, id, cmdq) if (cmdq->priority == priority) break;
if (!cmdq) { cmdq = ivpu_cmdq_alloc(file_priv); - if (!cmdq) + if (!cmdq) { + ivpu_err(vdev, "Failed to allocate command queue\n"); return NULL; + } + + ret = ivpu_db_id_alloc(vdev, &cmdq->db_id); + if (ret) { + ivpu_err(file_priv->vdev, "Failed to allocate doorbell ID: %d\n", ret); + goto err_free_cmdq; + } + + ret = ivpu_cmdq_id_alloc(file_priv, &cmdq->id); + if (ret) { + ivpu_err(vdev, "Failed to allocate command queue ID: %d\n", ret); + goto err_erase_db_id; + } + cmdq->priority = priority; + ret = xa_err(xa_store(&file_priv->cmdq_xa, cmdq->id, cmdq, GFP_KERNEL)); + if (ret) { + ivpu_err(vdev, "Failed to store command queue in cmdq_xa: %d\n", ret); + goto err_erase_cmdq_id; + } }
ret = ivpu_cmdq_init(file_priv, cmdq, priority); - if (ret) - return NULL; + if (ret) { + ivpu_err(vdev, "Failed to initialize command queue: %d\n", ret); + goto err_free_cmdq; + }
return cmdq; + +err_erase_cmdq_id: + xa_erase(&file_priv->cmdq_xa, cmdq->id); +err_erase_db_id: + xa_erase(&vdev->db_xa, cmdq->db_id); +err_free_cmdq: + ivpu_cmdq_free(file_priv, cmdq); + return NULL; }
void ivpu_cmdq_release_all_locked(struct ivpu_file_priv *file_priv)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Karol Wachowski karol.wachowski@intel.com
[ Upstream commit 75680b7cd461b169c7ccd2a0fba7542868b7fce2 ]
The mutex unlock for vdev->submitted_jobs_lock was incorrectly placed before unlocking file_priv->lock. Change order of unlocks to avoid potential race conditions.
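The ordering rule behind this fix can be illustrated with a small userspace sketch. It assumes the locks are taken in the order submitted_jobs_lock, then file_priv->lock, which the hunk below implies but does not show. The toy_lock type, its helpers, and submit_error_path() are hypothetical and only track nesting depth; they are not the ivpu code or the kernel mutex API.

```c
#include <assert.h>

/* Toy lock that remembers how deeply nested it was when taken. */
struct toy_lock {
	int depth;	/* nesting depth at acquisition, 0 when free */
};

static int held;	/* number of toy locks currently held */

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(l->depth == 0);
	l->depth = ++held;
}

/* Returns 1 when the lock being released is the most recently taken one. */
static int toy_lock_release(struct toy_lock *l)
{
	int lifo = (l->depth == held);

	l->depth = 0;
	held--;
	return lifo;
}

static struct toy_lock submitted_jobs_lock;	/* outer lock, taken first */
static struct toy_lock file_lock;		/* inner lock, taken second */

/* Returns 1 if both releases happened in reverse acquisition order. */
int submit_error_path(int fixed_order)
{
	int ok;

	toy_lock_acquire(&submitted_jobs_lock);
	toy_lock_acquire(&file_lock);

	if (fixed_order) {
		/* Order after the patch: inner lock first, then outer. */
		ok = toy_lock_release(&file_lock);
		ok &= toy_lock_release(&submitted_jobs_lock);
	} else {
		/* Order before the patch: outer lock dropped first. */
		ok = toy_lock_release(&submitted_jobs_lock);
		ok &= toy_lock_release(&file_lock);
	}
	return ok;
}
```

In this sketch submit_error_path(1) reports a clean last-in, first-out release, while submit_error_path(0) reports the inverted order that the patch removes.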
Fixes: 5bbccadaf33e ("accel/ivpu: Abort all jobs after command queue unregister") Signed-off-by: Karol Wachowski karol.wachowski@intel.com Reviewed-by: Jeff Hugo jeff.hugo@oss.qualcomm.com Signed-off-by: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com Link: https://lore.kernel.org/r/20250425093656.2228168-1-jacek.lawrynowicz@linux.i... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/accel/ivpu/ivpu_job.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c index 766fc383680f1..79b77d8a35a77 100644 --- a/drivers/accel/ivpu/ivpu_job.c +++ b/drivers/accel/ivpu/ivpu_job.c @@ -646,8 +646,8 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority) err_erase_xa: xa_erase(&vdev->submitted_jobs_xa, job->job_id); err_unlock: - mutex_unlock(&vdev->submitted_jobs_lock); mutex_unlock(&file_priv->lock); + mutex_unlock(&vdev->submitted_jobs_lock); ivpu_rpm_put(vdev); return ret; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thorsten Blum thorsten.blum@linux.dev
[ Upstream commit c44572e0cc13c9afff83fd333135a0aa9b27ba26 ]
Fix MAX_REG_OFFSET to point to the last register in 'pt_regs' and not to the marker itself, which could allow regs_get_register() to return an invalid offset.
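As an illustration of the off-by-one, the sketch below uses a made-up register structure (struct toy_regs is hypothetical and much smaller than the MIPS pt_regs) with an end marker after the last register. It compares the old and new definitions of the maximum offset and checks whether a word read at that offset stays inside the structure.

```c
#include <stddef.h>

/* Hypothetical miniature of a pt_regs-style layout; not the MIPS struct. */
struct toy_regs {
	unsigned long regs[4];
	unsigned long status;
	unsigned long epc;		/* last real register */
	unsigned long end_marker[];	/* marks the end, holds no data */
};

/* Old definition: offset of the marker itself. */
#define OLD_MAX_REG_OFFSET (offsetof(struct toy_regs, end_marker))

/* New definition: offset of the last register before the marker. */
#define NEW_MAX_REG_OFFSET \
	(offsetof(struct toy_regs, end_marker) - sizeof(unsigned long))

/* Returns 1 if one unsigned long read at 'offset' stays inside the struct. */
int offset_is_readable(size_t offset)
{
	return offset + sizeof(unsigned long) <= sizeof(struct toy_regs);
}
```

With the old macro, an offset equal to the maximum passes a "offset <= MAX_REG_OFFSET" style bound check yet points one word past the last register; with the new macro the largest accepted offset is that of epc in this toy layout.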
Fixes: 40e084a506eb ("MIPS: Add uprobes support.") Suggested-by: Maciej W. Rozycki macro@orcam.me.uk Signed-off-by: Thorsten Blum thorsten.blum@linux.dev Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/include/asm/ptrace.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/mips/include/asm/ptrace.h b/arch/mips/include/asm/ptrace.h index 85fa9962266a2..ef72c46b55688 100644 --- a/arch/mips/include/asm/ptrace.h +++ b/arch/mips/include/asm/ptrace.h @@ -65,7 +65,8 @@ static inline void instruction_pointer_set(struct pt_regs *regs,
/* Query offset/name of register from its name/offset */ extern int regs_query_register_offset(const char *name); -#define MAX_REG_OFFSET (offsetof(struct pt_regs, __last)) +#define MAX_REG_OFFSET \ + (offsetof(struct pt_regs, __last) - sizeof(unsigned long))
/** * regs_get_register() - get register value from its offset
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nylon Chen nylon.chen@sifive.com
[ Upstream commit eb16b3727c05ed36420c90eca1e8f0e279514c1c ]
Add support for the Zcb extension's compressed half-word instructions (C.LHU, C.LH, and C.SH) in the RISC-V misaligned access trap handler.
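The match/mask test used by the handler can be tried out in userspace. The sketch below copies the six constants from the diff that follows; the enum and classify_zcb_half() are hypothetical names for illustration and do not exist in the kernel.

```c
#include <stdint.h>

/* Match/mask pairs as they appear in the patch. */
#define INSN_MATCH_C_LHU	0x8400
#define INSN_MASK_C_LHU		0xfc43
#define INSN_MATCH_C_LH		0x8440
#define INSN_MASK_C_LH		0xfc43
#define INSN_MATCH_C_SH		0x8c00
#define INSN_MASK_C_SH		0xfc43

enum zcb_half { ZCB_NONE, ZCB_LHU, ZCB_LH, ZCB_SH };

/*
 * An instruction matches when the bits selected by the mask equal the
 * match value; the remaining bits (registers, immediate) are ignored.
 */
enum zcb_half classify_zcb_half(uint16_t insn)
{
	if ((insn & INSN_MASK_C_LHU) == INSN_MATCH_C_LHU)
		return ZCB_LHU;
	if ((insn & INSN_MASK_C_LH) == INSN_MATCH_C_LH)
		return ZCB_LH;
	if ((insn & INSN_MASK_C_SH) == INSN_MATCH_C_SH)
		return ZCB_SH;
	return ZCB_NONE;
}
```

For example, 0x8588 and 0x85c8 differ only in bit 6, which is the bit that distinguishes the C.LHU match value from the C.LH one.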
Signed-off-by: Zong Li zong.li@sifive.com Signed-off-by: Nylon Chen nylon.chen@sifive.com Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE") Reviewed-by: Alexandre Ghiti alexghiti@rivosinc.com Link: https://lore.kernel.org/r/20250411073850.3699180-2-nylon.chen@sifive.com Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/traps_misaligned.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c index 4354c87c0376f..dde5d11dc1b50 100644 --- a/arch/riscv/kernel/traps_misaligned.c +++ b/arch/riscv/kernel/traps_misaligned.c @@ -88,6 +88,13 @@ #define INSN_MATCH_C_FSWSP 0xe002 #define INSN_MASK_C_FSWSP 0xe003
+#define INSN_MATCH_C_LHU 0x8400 +#define INSN_MASK_C_LHU 0xfc43 +#define INSN_MATCH_C_LH 0x8440 +#define INSN_MASK_C_LH 0xfc43 +#define INSN_MATCH_C_SH 0x8c00 +#define INSN_MASK_C_SH 0xfc43 + #define INSN_LEN(insn) ((((insn) & 0x3) < 0x3) ? 2 : 4)
#if defined(CONFIG_64BIT) @@ -431,6 +438,13 @@ static int handle_scalar_misaligned_load(struct pt_regs *regs) fp = 1; len = 4; #endif + } else if ((insn & INSN_MASK_C_LHU) == INSN_MATCH_C_LHU) { + len = 2; + insn = RVC_RS2S(insn) << SH_RD; + } else if ((insn & INSN_MASK_C_LH) == INSN_MATCH_C_LH) { + len = 2; + shift = 8 * (sizeof(ulong) - len); + insn = RVC_RS2S(insn) << SH_RD; } else { regs->epc = epc; return -1; @@ -530,6 +544,9 @@ static int handle_scalar_misaligned_store(struct pt_regs *regs) len = 4; val.data_ulong = GET_F32_RS2C(insn, regs); #endif + } else if ((insn & INSN_MASK_C_SH) == INSN_MATCH_C_SH) { + len = 2; + val.data_ulong = GET_RS2S(insn, regs); } else { regs->epc = epc; return -1;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
[ Upstream commit d278164832618bf2775c6a89e6434e2633de1eed ]
Split the code for setting up a backing file into a helper in preparation for adding more code to this path.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Damien Le Moal dlemoal@kernel.org Link: https://lore.kernel.org/r/20250131120120.1315125-2-hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: f5c84eff634b ("loop: Add sanity check for read/write_iter") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/loop.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 7668b79d8b0a9..61ce7ccde3445 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -496,6 +496,14 @@ static int loop_validate_file(struct file *file, struct block_device *bdev) return 0; }
+static void loop_assign_backing_file(struct loop_device *lo, struct file *file) +{ + lo->lo_backing_file = file; + lo->old_gfp_mask = mapping_gfp_mask(file->f_mapping); + mapping_set_gfp_mask(file->f_mapping, + lo->old_gfp_mask & ~(__GFP_IO | __GFP_FS)); +} + /* * loop_change_fd switched the backing store of a loopback device to * a new file. This is useful for operating system installers to free up @@ -549,10 +557,7 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev, disk_force_media_change(lo->lo_disk); memflags = blk_mq_freeze_queue(lo->lo_queue); mapping_set_gfp_mask(old_file->f_mapping, lo->old_gfp_mask); - lo->lo_backing_file = file; - lo->old_gfp_mask = mapping_gfp_mask(file->f_mapping); - mapping_set_gfp_mask(file->f_mapping, - lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS)); + loop_assign_backing_file(lo, file); loop_update_dio(lo); blk_mq_unfreeze_queue(lo->lo_queue, memflags); partscan = lo->lo_flags & LO_FLAGS_PARTSCAN; @@ -943,7 +948,6 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode, const struct loop_config *config) { struct file *file = fget(config->fd); - struct address_space *mapping; struct queue_limits lim; int error; loff_t size; @@ -979,8 +983,6 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode, if (error) goto out_unlock;
- mapping = file->f_mapping; - if ((config->info.lo_flags & ~LOOP_CONFIGURE_SETTABLE_FLAGS) != 0) { error = -EINVAL; goto out_unlock; @@ -1012,9 +1014,7 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode, set_disk_ro(lo->lo_disk, (lo->lo_flags & LO_FLAGS_READ_ONLY) != 0);
lo->lo_device = bdev; - lo->lo_backing_file = file; - lo->old_gfp_mask = mapping_gfp_mask(mapping); - mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS)); + loop_assign_backing_file(lo, file);
lim = queue_limits_start_update(lo->lo_queue); loop_update_limits(lo, &lim, config->block_size);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lizhi Xu lizhi.xu@windriver.com
[ Upstream commit f5c84eff634ba003326aa034c414e2a9dcb7c6a7 ]
Some file systems do not support read_iter/write_iter, such as selinuxfs in this issue. So before calling them, first confirm that the interface is supported and then call it.
This is relevant because vfs_iter_read/write perform this check, and removing their use allowed syzbot to hit this issue.
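The check described above can be sketched outside the kernel as follows. The toy_file and toy_file_ops types, the opened_for_write flag (standing in for f_mode & FMODE_WRITE), and the sample objects are all hypothetical; only the shape of the test mirrors loop_check_backing_file() from the diff below.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-ins for file_operations and file. */
struct toy_file_ops {
	long (*read_iter)(void);
	long (*write_iter)(void);
};

struct toy_file {
	const struct toy_file_ops *f_op;
	int opened_for_write;	/* plays the role of f_mode & FMODE_WRITE */
};

static long toy_io(void)
{
	return 0;
}

/* Reject files that lack the iterator methods the caller will invoke. */
int toy_check_backing_file(const struct toy_file *file)
{
	if (!file->f_op->read_iter)
		return -EINVAL;

	if (file->opened_for_write && !file->f_op->write_iter)
		return -EINVAL;

	return 0;
}

/* Sample operation tables and files for experimenting with the check. */
const struct toy_file_ops toy_rw_ops = { toy_io, toy_io };
const struct toy_file_ops toy_ro_ops = { toy_io, NULL };
const struct toy_file_ops toy_no_iter_ops = { NULL, NULL };

const struct toy_file toy_regular = { &toy_rw_ops, 1 };
const struct toy_file toy_ro_opened_rw = { &toy_ro_ops, 1 };
const struct toy_file toy_ro_opened_ro = { &toy_ro_ops, 0 };
const struct toy_file toy_no_iter = { &toy_no_iter_ops, 0 };
```

A file without read_iter is always rejected, while a missing write_iter only matters when the file was opened for writing.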
Fixes: f2fed441c69b ("loop: stop using vfs_iter__{read,write} for buffered I/O") Reported-by: syzbot+6af973a3b8dfd2faefdc@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6af973a3b8dfd2faefdc Signed-off-by: Lizhi Xu lizhi.xu@windriver.com Reviewed-by: Christoph Hellwig hch@lst.de Link: https://lore.kernel.org/r/20250428143626.3318717-1-lizhi.xu@windriver.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/loop.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 61ce7ccde3445..b378d2aa49f06 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -504,6 +504,17 @@ static void loop_assign_backing_file(struct loop_device *lo, struct file *file) lo->old_gfp_mask & ~(__GFP_IO | __GFP_FS)); }
+static int loop_check_backing_file(struct file *file) +{ + if (!file->f_op->read_iter) + return -EINVAL; + + if ((file->f_mode & FMODE_WRITE) && !file->f_op->write_iter) + return -EINVAL; + + return 0; +} + /* * loop_change_fd switched the backing store of a loopback device to * a new file. This is useful for operating system installers to free up @@ -525,6 +536,10 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev, if (!file) return -EBADF;
+ error = loop_check_backing_file(file); + if (error) + return error; + /* suppress uevents while reconfiguring the device */ dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 1);
@@ -956,6 +971,14 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode,
if (!file) return -EBADF; + + if ((mode & BLK_OPEN_WRITE) && !file->f_op->write_iter) + return -EINVAL; + + error = loop_check_backing_file(file); + if (error) + return error; + is_loop = is_loop_device(file);
/* This is safe, since we have a reference from open(). */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kevin Baker kevinb@ventureresearch.com
[ Upstream commit 7c6fa1797a725732981f2d77711c867166737719 ]
Switch to panel timings based on datasheet for the AUO G101EVN01.0 LVDS panel. Default timings were tested on the panel.
Previous mode-based timings resulted in horizontal display shift.
Signed-off-by: Kevin Baker kevinb@ventureresearch.com Fixes: 4fb86404a977 ("drm/panel: simple: Add AUO G101EVN010 panel support") Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20250505170256.1385113-1-kevinb@ventureresearch.co... Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20250505170256.1385113-1-kevinb@ventureresearch.co... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-simple.c | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c index 9b2f128fd3094..cf9ab2d1f1d2a 100644 --- a/drivers/gpu/drm/panel/panel-simple.c +++ b/drivers/gpu/drm/panel/panel-simple.c @@ -1027,27 +1027,28 @@ static const struct panel_desc auo_g070vvn01 = { }, };
-static const struct drm_display_mode auo_g101evn010_mode = { - .clock = 68930, - .hdisplay = 1280, - .hsync_start = 1280 + 82, - .hsync_end = 1280 + 82 + 2, - .htotal = 1280 + 82 + 2 + 84, - .vdisplay = 800, - .vsync_start = 800 + 8, - .vsync_end = 800 + 8 + 2, - .vtotal = 800 + 8 + 2 + 6, +static const struct display_timing auo_g101evn010_timing = { + .pixelclock = { 64000000, 68930000, 85000000 }, + .hactive = { 1280, 1280, 1280 }, + .hfront_porch = { 8, 64, 256 }, + .hback_porch = { 8, 64, 256 }, + .hsync_len = { 40, 168, 767 }, + .vactive = { 800, 800, 800 }, + .vfront_porch = { 4, 8, 100 }, + .vback_porch = { 4, 8, 100 }, + .vsync_len = { 8, 16, 223 }, };
static const struct panel_desc auo_g101evn010 = { - .modes = &auo_g101evn010_mode, - .num_modes = 1, + .timings = &auo_g101evn010_timing, + .num_timings = 1, .bpc = 6, .size = { .width = 216, .height = 135, }, .bus_format = MEDIA_BUS_FMT_RGB666_1X7X3_SPWG, + .bus_flags = DRM_BUS_FLAG_DE_HIGH, .connector_type = DRM_MODE_CONNECTOR_LVDS, };
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Wagner wagi@kernel.org
[ Upstream commit 650415fca0a97472fdd79725e35152614d1aad76 ]
The original nvme subsystem design didn't have a CONNECTING state; the state machine allowed transitions from RESETTING to LIVE directly.
With the introduction of nvme fabrics the CONNECTING state was introduced. Over time nvme-pci started to use the CONNECTING state as well.
Eventually, a bug fix for nvme-fc started to depend on CONNECTING being the only state from which LIVE can be entered. However, that change didn't update the firmware update handler, which still depended on the RESETTING to LIVE transition.
The simplest way to address it for the time being is to switch into CONNECTING state before going to LIVE state.
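A reduced model of the transition rule may help here. The sketch below is an assumption-laden toy: it has only three states and one allowed predecessor per state, whereas the real nvme_change_ctrl_state() has more states and more allowed transitions. The names toy_state, toy_transition_allowed() and toy_fw_activation_done() are hypothetical.

```c
enum toy_state { TOY_LIVE, TOY_RESETTING, TOY_CONNECTING };

/* Returns 1 if moving from 'from' to 'to' is allowed in this toy model. */
int toy_transition_allowed(enum toy_state from, enum toy_state to)
{
	switch (to) {
	case TOY_LIVE:
		/* The rule the firmware handler tripped over. */
		return from == TOY_CONNECTING;
	case TOY_CONNECTING:
		return from == TOY_RESETTING;
	case TOY_RESETTING:
		return from == TOY_LIVE;
	}
	return 0;
}

/*
 * Mirrors the patched handler: step through CONNECTING before LIVE.
 * Returns the state the controller ends up in.
 */
enum toy_state toy_fw_activation_done(enum toy_state cur)
{
	if (!toy_transition_allowed(cur, TOY_CONNECTING))
		return cur;
	cur = TOY_CONNECTING;

	if (!toy_transition_allowed(cur, TOY_LIVE))
		return cur;
	return TOY_LIVE;
}
```

In this model a direct RESETTING to LIVE request is refused, while the two-step sequence used by the patch reaches LIVE.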
Fixes: d2fe192348f9 ("nvme: only allow entering LIVE from CONNECTING state") Reported-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Daniel Wagner wagi@kernel.org Closes: https://lore.kernel.org/all/0134ea15-8d5f-41f7-9e9a-d7e6d82accaa@roeck-us.ne... Reviewed-by: Keith Busch kbusch@kernel.org Reviewed-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 150de63b26b2c..a27149e37a988 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4492,7 +4492,8 @@ static void nvme_fw_act_work(struct work_struct *work) msleep(100); }
- if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE)) + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) || + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE)) return;
nvme_unquiesce_io_queues(ctrl);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Clément Léger cleger@rivosinc.com
[ Upstream commit fd94de9f9e7aac11ec659e386b9db1203d502023 ]
Since both load/store and user/kernel should use almost the same path, and since we are going to add some code around it, factor it out into a common function.
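The table-driven shape of this refactoring can be sketched in plain C. Everything below is hypothetical: the toy handlers simply pretend that store emulation succeeds and load emulation fails, and toy_trap_misaligned() returns the message that would be reported instead of raising a signal.

```c
#include <stddef.h>

enum toy_access_type {
	TOY_STORE,
	TOY_LOAD,
};

/* Pretend emulation succeeded (0) or failed (nonzero). */
static int toy_fix_store(unsigned long addr)
{
	(void)addr;
	return 0;
}

static int toy_fix_load(unsigned long addr)
{
	(void)addr;
	return -1;
}

/* One table entry per access type replaces two near-identical functions. */
static const struct {
	const char *type_str;
	int (*handler)(unsigned long addr);
} toy_handlers[] = {
	[TOY_STORE] = {
		.type_str = "store address misaligned",
		.handler = toy_fix_store,
	},
	[TOY_LOAD] = {
		.type_str = "load address misaligned",
		.handler = toy_fix_load,
	},
};

/* Returns NULL when the access was handled, else the error string. */
const char *toy_trap_misaligned(unsigned long addr, enum toy_access_type type)
{
	if (toy_handlers[type].handler(addr))
		return toy_handlers[type].type_str;
	return NULL;
}
```

The common function only indexes the table, so code added around the handler call later needs to be written once rather than once per access type.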
Signed-off-by: Clément Léger cleger@rivosinc.com Reviewed-by: Alexandre Ghiti alexghiti@rivosinc.com Link: https://lore.kernel.org/r/20250422162324.956065-2-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Stable-dep-of: 453805f0a28f ("riscv: misaligned: enable IRQs while handling misaligned accesses") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/traps.c | 66 +++++++++++++++++++++------------------ 1 file changed, 36 insertions(+), 30 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 8ff8e8b36524b..b1d991c78a233 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -198,47 +198,53 @@ asmlinkage __visible __trap_section void do_trap_insn_illegal(struct pt_regs *re DO_ERROR_INFO(do_trap_load_fault, SIGSEGV, SEGV_ACCERR, "load access fault");
-asmlinkage __visible __trap_section void do_trap_load_misaligned(struct pt_regs *regs) +enum misaligned_access_type { + MISALIGNED_STORE, + MISALIGNED_LOAD, +}; +static const struct { + const char *type_str; + int (*handler)(struct pt_regs *regs); +} misaligned_handler[] = { + [MISALIGNED_STORE] = { + .type_str = "Oops - store (or AMO) address misaligned", + .handler = handle_misaligned_store, + }, + [MISALIGNED_LOAD] = { + .type_str = "Oops - load address misaligned", + .handler = handle_misaligned_load, + }, +}; + +static void do_trap_misaligned(struct pt_regs *regs, enum misaligned_access_type type) { - if (user_mode(regs)) { + irqentry_state_t state; + + if (user_mode(regs)) irqentry_enter_from_user_mode(regs); + else + state = irqentry_nmi_enter(regs);
- if (handle_misaligned_load(regs)) - do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, - "Oops - load address misaligned"); + if (misaligned_handler[type].handler(regs)) + do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, + misaligned_handler[type].type_str);
+ if (user_mode(regs)) irqentry_exit_to_user_mode(regs); - } else { - irqentry_state_t state = irqentry_nmi_enter(regs); - - if (handle_misaligned_load(regs)) - do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, - "Oops - load address misaligned"); - + else irqentry_nmi_exit(regs, state); - } }
-asmlinkage __visible __trap_section void do_trap_store_misaligned(struct pt_regs *regs) +asmlinkage __visible __trap_section void do_trap_load_misaligned(struct pt_regs *regs) { - if (user_mode(regs)) { - irqentry_enter_from_user_mode(regs); - - if (handle_misaligned_store(regs)) - do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, - "Oops - store (or AMO) address misaligned"); - - irqentry_exit_to_user_mode(regs); - } else { - irqentry_state_t state = irqentry_nmi_enter(regs); - - if (handle_misaligned_store(regs)) - do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, - "Oops - store (or AMO) address misaligned"); + do_trap_misaligned(regs, MISALIGNED_LOAD); +}
- irqentry_nmi_exit(regs, state); - } +asmlinkage __visible __trap_section void do_trap_store_misaligned(struct pt_regs *regs) +{ + do_trap_misaligned(regs, MISALIGNED_STORE); } + DO_ERROR_INFO(do_trap_store_fault, SIGSEGV, SEGV_ACCERR, "store (or AMO) access fault"); DO_ERROR_INFO(do_trap_ecall_s,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Clément Léger cleger@rivosinc.com
[ Upstream commit 453805f0a28fc5091e46145e6560c776f7c7a611 ]
We can safely re-enable IRQs when coming from userspace. This allows accessing user memory, which could potentially trigger a page fault.
Fixes: b686ecdeacf6 ("riscv: misaligned: Restrict user access to kernel memory") Signed-off-by: Clément Léger cleger@rivosinc.com Reviewed-by: Alexandre Ghiti alexghiti@rivosinc.com Link: https://lore.kernel.org/r/20250422162324.956065-3-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/traps.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index b1d991c78a233..9c83848797a78 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -220,19 +220,23 @@ static void do_trap_misaligned(struct pt_regs *regs, enum misaligned_access_type { irqentry_state_t state;
- if (user_mode(regs)) + if (user_mode(regs)) { irqentry_enter_from_user_mode(regs); - else + local_irq_enable(); + } else { state = irqentry_nmi_enter(regs); + }
if (misaligned_handler[type].handler(regs)) do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, misaligned_handler[type].type_str);
- if (user_mode(regs)) + if (user_mode(regs)) { + local_irq_disable(); irqentry_exit_to_user_mode(regs); - else + } else { irqentry_nmi_exit(regs, state); + } }
asmlinkage __visible __trap_section void do_trap_load_misaligned(struct pt_regs *regs)
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samuel Holland samuel.holland@sifive.com
[ Upstream commit 7f1c3de1370bc6a8ad5157336b258067dac0ae9c ]
When the prctl() interface for pointer masking was added, it did not check that the pointer masking ISA extension was supported, only the individual submodes. Userspace could still attempt to disable pointer masking and query the pointer masking state. commit 81de1afb2dd1 ("riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL") disallowed the former, as the senvcfg write could crash on older systems. PR_GET_TAGGED_ADDR_CTRL state does not crash, because it reads only kernel-internal state and not senvcfg, but it should still be disallowed for consistency.
Fixes: 09d6775f503b ("riscv: Add support for userspace pointer masking") Signed-off-by: Samuel Holland samuel.holland@sifive.com Reviewed-by: Nam Cao namcao@linutronix.de Link: https://lore.kernel.org/r/20250507145230.2272871-1-samuel.holland@sifive.com Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/process.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index 3db2c0c07acd0..15d8f75902f85 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -333,6 +333,9 @@ long get_tagged_addr_ctrl(struct task_struct *task) struct thread_info *ti = task_thread_info(task); long ret = 0;
+ if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM)) + return -EINVAL; + if (is_compat_thread(ti)) return -EINVAL;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tejas Upadhyay tejas.upadhyay@intel.com
[ Upstream commit 51c0ee84e4dc339287b2d7335f2b54d747794c83 ]
LNCF registers report wrong values when XE_FORCEWAKE_GT only is held. Holding XE_FORCEWAKE_ALL ensures correct operations on LNCF regs.
V2(Himal): - Use xe_force_wake_ref_has_domain
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1999 Fixes: a6a4ea6d7d37 ("drm/xe: Add mocs kunit") Reviewed-by: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20250428082357.1730068-1-tejas... Signed-off-by: Tejas Upadhyay tejas.upadhyay@intel.com (cherry picked from commit 70a2585e582058e94fe4381a337be42dec800337) Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/tests/xe_mocs.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_mocs.c b/drivers/gpu/drm/xe/tests/xe_mocs.c index ef1e5256c56a8..0e502feaca818 100644 --- a/drivers/gpu/drm/xe/tests/xe_mocs.c +++ b/drivers/gpu/drm/xe/tests/xe_mocs.c @@ -46,8 +46,11 @@ static void read_l3cc_table(struct xe_gt *gt, unsigned int fw_ref, i; u32 reg_val;
- fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT); - KUNIT_ASSERT_NE_MSG(test, fw_ref, 0, "Forcewake Failed.\n"); + fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL); + if (!xe_force_wake_ref_has_domain(fw_ref, XE_FORCEWAKE_ALL)) { + xe_force_wake_put(gt_to_fw(gt), fw_ref); + KUNIT_ASSERT_TRUE_MSG(test, true, "Forcewake Failed.\n"); + }
for (i = 0; i < info->num_mocs_regs; i++) { if (!(i & 1)) {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuicheng Lin shuicheng.lin@intel.com
[ Upstream commit 9d271a4f5ba52520e448ab223b1a91c6e35f17c7 ]
xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for the release path, xe_force_wake_put() should be called first then xe_pm_runtime_put(). Combine the error path and normal path together with goto.
Fixes: 85d547608ef5 ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return") Cc: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com Cc: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Shuicheng Lin shuicheng.lin@intel.com Reviewed-by: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com (cherry picked from commit 432cd94efdca06296cc5e76d673546f58aa90ee1) Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_gt_debugfs.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c index 2d63a69cbfa38..f7005a3643e62 100644 --- a/drivers/gpu/drm/xe/xe_gt_debugfs.c +++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c @@ -92,22 +92,23 @@ static int hw_engines(struct xe_gt *gt, struct drm_printer *p) struct xe_hw_engine *hwe; enum xe_hw_engine_id id; unsigned int fw_ref; + int ret = 0;
xe_pm_runtime_get(xe); fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL); if (!xe_force_wake_ref_has_domain(fw_ref, XE_FORCEWAKE_ALL)) { - xe_pm_runtime_put(xe); - xe_force_wake_put(gt_to_fw(gt), fw_ref); - return -ETIMEDOUT; + ret = -ETIMEDOUT; + goto fw_put; }
for_each_hw_engine(hwe, gt, id) xe_hw_engine_print(hwe, p);
+fw_put: xe_force_wake_put(gt_to_fw(gt), fw_ref); xe_pm_runtime_put(xe);
- return 0; + return ret; }
static int powergate_info(struct xe_gt *gt, struct drm_printer *p)
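Reviewer's illustration (not part of the patch): the ordering rule in the commit message, release the dependent reference before the one it depends on, with a single exit label shared by the error and success paths, can be sketched in user-space C. The struct and the pm_get()/fw_get() helpers below are hypothetical stand-ins, not the xe API.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the runtime-PM and forcewake references. */
struct dev_state {
	int pm_refs;       /* outer resource, taken first */
	int fw_refs;       /* inner resource, only valid while pm_refs > 0 */
	bool fw_all_awake; /* whether every requested domain woke up */
	bool order_ok;     /* cleared if fw is released after pm */
};

static void pm_get(struct dev_state *d) { d->pm_refs++; }
static void pm_put(struct dev_state *d) { d->pm_refs--; }

static void fw_get(struct dev_state *d) { d->fw_refs++; }

static void fw_put(struct dev_state *d)
{
	/* The inner reference must be dropped while the outer one is held. */
	if (d->pm_refs <= 0)
		d->order_ok = false;
	d->fw_refs--;
}

/* Returns 0 on success, -ETIMEDOUT if not all domains woke up. */
static int dump_engines(struct dev_state *d)
{
	int ret = 0;

	pm_get(d);
	fw_get(d);
	if (!d->fw_all_awake) {
		ret = -ETIMEDOUT;
		goto out_fw_put;
	}

	/* ... print engine state here ... */

out_fw_put:
	fw_put(d); /* inner first */
	pm_put(d); /* outer last */
	return ret;
}
```

Both paths fall through the same label, so the release order cannot diverge between them, which is the property the patch restores.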
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabriel Krisman Bertazi krisman@suse.de
[ Upstream commit 92835cebab120f8a5f023a26a792a2ac3f816c4f ]
Our QA team reported a 10%-23% throughput reduction on an io_uring sqpoll testcase doing IO to a null_blk, which I traced back to a reduction of the device submission queue depth utilization. It turns out that, after commit af5d68f8892f ("io_uring/sqpoll: manage task_work privately"), we capped the number of task_work entries that can be completed from a single spin of sqpoll to only 8 entries, before the sqpoll goes around to (potentially) sleep. While this cap doesn't drive the submission side directly, it impacts the completion behavior, which affects the number of IO queued by fio per sqpoll cycle on the submission side, and io_uring ends up seeing fewer IOs per sqpoll cycle. As a result, block layer plugging is less effective, and we see more time spent inside the block layer in profiling charts, and increased submission latency measured by fio.
There are other places that have increased overhead once sqpoll sleeps more often, such as the sqpoll utilization calculation. But, in this microbenchmark, those were not representative enough in perf charts, and their removal didn't yield measurable changes in throughput. The major overhead comes from the fact we plug less, and less often, when submitting to the block layer.
My benchmark is:
fio --ioengine=io_uring --direct=1 --iodepth=128 --runtime=300 --bs=4k \ --invalidate=1 --time_based --ramp_time=10 --group_reporting=1 \ --filename=/dev/nullb0 --name=RandomReads-direct-nullb-sqpoll-4k-1 \ --rw=randread --numjobs=1 --sqthread_poll
In one machine, tested on top of Linux 6.15-rc1, we have the following baseline: READ: bw=4994MiB/s (5236MB/s), 4994MiB/s-4994MiB/s (5236MB/s-5236MB/s), io=439GiB (471GB), run=90001-90001msec
With this patch: READ: bw=5762MiB/s (6042MB/s), 5762MiB/s-5762MiB/s (6042MB/s-6042MB/s), io=506GiB (544GB), run=90001-90001msec
which is a 15% improvement in measured bandwidth. The average submission latency is noticeably lowered too. As measured by fio:
Baseline: lat (usec): min=20, max=241, avg=99.81, stdev=3.38 Patched: lat (usec): min=26, max=226, avg=86.48, stdev=4.82
If we look at blktrace, we can also see the plugging behavior is improved. In the baseline, we end up limited to plugging 8 requests in the block layer regardless of the device queue depth size, while after patching we can drive more io, and we manage to utilize the full device queue.
In the baseline, after a stabilization phase, an ordinary submission looks like: 254,0 1 49942 0.016028795 5977 U N [iou-sqp-5976] 7
After patching, I see consistently more requests per unplug. 254,0 1 4996 0.001432872 3145 U N [iou-sqp-3144] 32
Ideally, the cap size would be at least deep enough to fill the device queue, but we can't predict that behavior, or assume all IO goes to a single device, and thus can't guess the ideal batch size. We also don't want to let the tw run unbounded, though I'm not sure it would really be a problem. Instead, let's just give it a more sensible value that will allow for more efficient batching. I've tested with different cap values, and initially proposed to increase the cap to 1024. Jens argued it is too big of a bump and I observed that, with 32, I'm no longer able to observe this bottleneck in any of my machines.
Fixes: af5d68f8892f ("io_uring/sqpoll: manage task_work privately") Signed-off-by: Gabriel Krisman Bertazi krisman@suse.de Link: https://lore.kernel.org/r/20250508181203.3785544-1-krisman@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- io_uring/sqpoll.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c index d037cc68e9d3e..03c699493b5ab 100644 --- a/io_uring/sqpoll.c +++ b/io_uring/sqpoll.c @@ -20,7 +20,7 @@ #include "sqpoll.h"
#define IORING_SQPOLL_CAP_ENTRIES_VALUE 8 -#define IORING_TW_CAP_ENTRIES_VALUE 8 +#define IORING_TW_CAP_ENTRIES_VALUE 32
enum { IO_SQ_THREAD_SHOULD_STOP = 0,
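Reviewer's illustration (not part of the patch): a toy model of why the per-spin cap matters. It assumes a fixed backlog of completion entries and counts how many loop iterations a given cap needs to drain it. The function names are hypothetical and the real sqpoll loop interleaves submission and sleeping, so this only shows the batching arithmetic.

```c
#include <stddef.h>

/*
 * Hypothetical model: retire up to `cap` completion entries per loop
 * iteration and report how many were retired.
 */
static size_t drain_once(size_t *pending, size_t cap)
{
	size_t done = *pending < cap ? *pending : cap;

	*pending -= done;
	return done;
}

/* Count the iterations needed to retire `pending` entries. */
static size_t iterations_to_drain(size_t pending, size_t cap)
{
	size_t iterations = 0;

	if (cap == 0)
		return 0; /* a zero cap would never make progress */

	while (pending > 0) {
		drain_once(&pending, cap);
		iterations++;
	}
	return iterations;
}
```

With the iodepth of 128 used in the fio command line above, a cap of 8 takes 16 iterations in this model and a cap of 32 takes 4. Fewer, larger batches per cycle is the effect the commit message associates with better plugging.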
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 65781e19dcfcb4aed1167d87a3ffcc2a0c071d47 ]
do_umount() analogue of the race fixed in 119e1ef80ecf "fix __legitimize_mnt()/mntput() race". Here we want to make sure that if __legitimize_mnt() doesn't notice our lock_mount_hash(), we will notice their refcount increment. Harder to hit than mntput_no_expire() one, fortunately, and consequences are milder (sync umount acting like umount -l on a rare race with RCU pathwalk hitting at just the wrong time instead of use-after-free galore mntput_no_expire() counterpart used to be hit). Still a bug...
Fixes: 48a066e72d97 ("RCU'd vfsmounts") Reviewed-by: Christian Brauner brauner@kernel.org Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/namespace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c index 280a6ebc46d93..5b84e29613fe4 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -778,7 +778,7 @@ int __legitimize_mnt(struct vfsmount *bastard, unsigned seq) return 0; mnt = real_mount(bastard); mnt_add_count(mnt, 1); - smp_mb(); // see mntput_no_expire() + smp_mb(); // see mntput_no_expire() and do_umount() if (likely(!read_seqretry(&mount_lock, seq))) return 0; if (bastard->mnt_flags & MNT_SYNC_UMOUNT) { @@ -1956,6 +1956,7 @@ static int do_umount(struct mount *mnt, int flags) umount_tree(mnt, UMOUNT_PROPAGATE); retval = 0; } else { + smp_mb(); // paired with __legitimize_mnt() shrink_submounts(mnt); retval = -EBUSY; if (!propagate_mount_busy(mnt, 2)) {
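Reviewer's illustration (not part of the patch): the pairing described above is the classic store-buffering pattern. Each side publishes its own write, issues a full barrier, then reads the other side's variable. With a full barrier on both sides, at least one side must observe the other. The sketch below uses C11 atomics as an analogue of smp_mb(). The two variables and both function names are hypothetical, and the real mount_lock seqlock is more involved than a plain counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for mount_lock's sequence and the mount refcount. */
static atomic_uint seq;     /* bumped by the umount side */
static atomic_int refcount; /* bumped by the pathwalk side */

/* Pathwalk side: take a reference, then check whether a umount started. */
static bool walker_legitimize(unsigned int seq_snapshot)
{
	atomic_fetch_add_explicit(&refcount, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* smp_mb() analogue */
	/* true: no umount noticed, so the reference is considered good */
	return atomic_load_explicit(&seq, memory_order_relaxed) == seq_snapshot;
}

/* Umount side: announce the umount, then look for extra references. */
static bool umount_is_busy(int expected_refs)
{
	atomic_fetch_add_explicit(&seq, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* analogue of the added barrier */
	return atomic_load_explicit(&refcount, memory_order_relaxed) > expected_refs;
}
```

If either fence were missing, both sides could in principle read stale values at once: the walker would keep its reference while the umount side concluded the mount was idle. That is the "sync umount acting like umount -l" outcome the commit message describes.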
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miguel Ojeda ojeda@kernel.org
commit a39f3087092716f2bd531d6fdc20403c3dc2a879 upstream.
Starting with Rust 1.87.0 (expected 2025-05-15) [1], Clippy may expand the `ptr_eq` lint, e.g.:
error: use `core::ptr::eq` when comparing raw pointers --> rust/kernel/list.rs:438:12 | 438 | if self.first == item { | ^^^^^^^^^^^^^^^^^^ help: try: `core::ptr::eq(self.first, item)` | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_eq = note: `-D clippy::ptr-eq` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::ptr_eq)]`
It is expected that a PR to relax the lint will be backported [2] by the time Rust 1.87.0 releases, since the lint was considered too eager (at least by default) [3].
Thus allow the lint temporarily just in case.
Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs). Link: https://github.com/rust-lang/rust-clippy/pull/14339 [1] Link: https://github.com/rust-lang/rust-clippy/pull/14526 [2] Link: https://github.com/rust-lang/rust-clippy/issues/14525 [3] Link: https://lore.kernel.org/r/20250502140237.1659624-3-ojeda@kernel.org [ Converted to `allow`s since backport was confirmed. - Miguel ] Signed-off-by: Miguel Ojeda ojeda@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- rust/kernel/alloc/kvec.rs | 3 +++ rust/kernel/list.rs | 3 +++ 2 files changed, 6 insertions(+)
--- a/rust/kernel/alloc/kvec.rs +++ b/rust/kernel/alloc/kvec.rs @@ -2,6 +2,9 @@
//! Implementation of [`Vec`].
+// May not be needed in Rust 1.87.0 (pending beta backport). +#![allow(clippy::ptr_eq)] + use super::{ allocator::{KVmalloc, Kmalloc, Vmalloc}, layout::ArrayLayout, --- a/rust/kernel/list.rs +++ b/rust/kernel/list.rs @@ -4,6 +4,9 @@
//! A linked list implementation.
+// May not be needed in Rust 1.87.0 (pending beta backport). +#![allow(clippy::ptr_eq)] + use crate::init::PinInit; use crate::sync::ArcBorrow; use crate::types::Opaque;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miguel Ojeda ojeda@kernel.org
commit 211dcf77856db64c73e0c3b9ce0c624ec855daca upstream.
Starting with Rust 1.88.0 (expected 2025-06-26) [1], `rustc` may move back the `uninlined_format_args` to `style` from `pedantic` (it was there waiting for rust-analyzer support), and thus we will start to see lints like:
warning: variables can be used directly in the `format!` string --> rust/macros/kunit.rs:105:37 | 105 | let kunit_wrapper_fn_name = format!("kunit_rust_wrapper_{}", test); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_a... help: change this to | 105 - let kunit_wrapper_fn_name = format!("kunit_rust_wrapper_{}", test); 105 + let kunit_wrapper_fn_name = format!("kunit_rust_wrapper_{test}");
There is even a case that is a pure removal:
warning: variables can be used directly in the `format!` string --> rust/macros/module.rs:51:13 | 51 | format!("{field}={content}\0", field = field, content = content) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_a... help: change this to | 51 - format!("{field}={content}\0", field = field, content = content) 51 + format!("{field}={content}\0")
The lints all seem like nice cleanups, thus just apply them.
We may want to disable `allow-mixed-uninlined-format-args` in the future.
Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs). Link: https://github.com/rust-lang/rust-clippy/pull/14160 [1] Acked-by: Benno Lossin lossin@kernel.org Reviewed-by: Tamir Duberstein tamird@gmail.com Reviewed-by: Alice Ryhl aliceryhl@google.com Link: https://lore.kernel.org/r/20250502140237.1659624-6-ojeda@kernel.org Signed-off-by: Miguel Ojeda ojeda@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- rust/kernel/str.rs | 46 ++++++++++++++++++++++----------------------- rust/macros/module.rs | 19 ++++-------------- rust/macros/paste.rs | 2 - rust/macros/pinned_drop.rs | 3 -- 4 files changed, 30 insertions(+), 40 deletions(-)
--- a/rust/kernel/str.rs +++ b/rust/kernel/str.rs @@ -56,7 +56,7 @@ impl fmt::Display for BStr { b'\r' => f.write_str("\r")?, // Printable characters. 0x20..=0x7e => f.write_char(b as char)?, - _ => write!(f, "\x{:02x}", b)?, + _ => write!(f, "\x{b:02x}")?, } } Ok(()) @@ -92,7 +92,7 @@ impl fmt::Debug for BStr { b'\' => f.write_str("\\")?, // Printable characters. 0x20..=0x7e => f.write_char(b as char)?, - _ => write!(f, "\x{:02x}", b)?, + _ => write!(f, "\x{b:02x}")?, } } f.write_char('"') @@ -401,7 +401,7 @@ impl fmt::Display for CStr { // Printable character. f.write_char(c as char)?; } else { - write!(f, "\x{:02x}", c)?; + write!(f, "\x{c:02x}")?; } } Ok(()) @@ -433,7 +433,7 @@ impl fmt::Debug for CStr { // Printable characters. b'"' => f.write_str("\"")?, 0x20..=0x7e => f.write_char(c as char)?, - _ => write!(f, "\x{:02x}", c)?, + _ => write!(f, "\x{c:02x}")?, } } f.write_str(""") @@ -595,13 +595,13 @@ mod tests { #[test] fn test_cstr_display() { let hello_world = CStr::from_bytes_with_nul(b"hello, world!\0").unwrap(); - assert_eq!(format!("{}", hello_world), "hello, world!"); + assert_eq!(format!("{hello_world}"), "hello, world!"); let non_printables = CStr::from_bytes_with_nul(b"\x01\x09\x0a\0").unwrap(); - assert_eq!(format!("{}", non_printables), "\x01\x09\x0a"); + assert_eq!(format!("{non_printables}"), "\x01\x09\x0a"); let non_ascii = CStr::from_bytes_with_nul(b"d\xe9j\xe0 vu\0").unwrap(); - assert_eq!(format!("{}", non_ascii), "d\xe9j\xe0 vu"); + assert_eq!(format!("{non_ascii}"), "d\xe9j\xe0 vu"); let good_bytes = CStr::from_bytes_with_nul(b"\xf0\x9f\xa6\x80\0").unwrap(); - assert_eq!(format!("{}", good_bytes), "\xf0\x9f\xa6\x80"); + assert_eq!(format!("{good_bytes}"), "\xf0\x9f\xa6\x80"); }
#[test] @@ -612,47 +612,47 @@ mod tests { bytes[i as usize] = i.wrapping_add(1); } let cstr = CStr::from_bytes_with_nul(&bytes).unwrap(); - assert_eq!(format!("{}", cstr), ALL_ASCII_CHARS); + assert_eq!(format!("{cstr}"), ALL_ASCII_CHARS); }
#[test] fn test_cstr_debug() { let hello_world = CStr::from_bytes_with_nul(b"hello, world!\0").unwrap(); - assert_eq!(format!("{:?}", hello_world), ""hello, world!""); + assert_eq!(format!("{hello_world:?}"), ""hello, world!""); let non_printables = CStr::from_bytes_with_nul(b"\x01\x09\x0a\0").unwrap(); - assert_eq!(format!("{:?}", non_printables), ""\x01\x09\x0a""); + assert_eq!(format!("{non_printables:?}"), ""\x01\x09\x0a""); let non_ascii = CStr::from_bytes_with_nul(b"d\xe9j\xe0 vu\0").unwrap(); - assert_eq!(format!("{:?}", non_ascii), ""d\xe9j\xe0 vu""); + assert_eq!(format!("{non_ascii:?}"), ""d\xe9j\xe0 vu""); let good_bytes = CStr::from_bytes_with_nul(b"\xf0\x9f\xa6\x80\0").unwrap(); - assert_eq!(format!("{:?}", good_bytes), ""\xf0\x9f\xa6\x80""); + assert_eq!(format!("{good_bytes:?}"), ""\xf0\x9f\xa6\x80""); }
#[test] fn test_bstr_display() { let hello_world = BStr::from_bytes(b"hello, world!"); - assert_eq!(format!("{}", hello_world), "hello, world!"); + assert_eq!(format!("{hello_world}"), "hello, world!"); let escapes = BStr::from_bytes(b"_\t_\n_\r_\_'_"_"); - assert_eq!(format!("{}", escapes), "_\t_\n_\r_\_'_"_"); + assert_eq!(format!("{escapes}"), "_\t_\n_\r_\_'_"_"); let others = BStr::from_bytes(b"\x01"); - assert_eq!(format!("{}", others), "\x01"); + assert_eq!(format!("{others}"), "\x01"); let non_ascii = BStr::from_bytes(b"d\xe9j\xe0 vu"); - assert_eq!(format!("{}", non_ascii), "d\xe9j\xe0 vu"); + assert_eq!(format!("{non_ascii}"), "d\xe9j\xe0 vu"); let good_bytes = BStr::from_bytes(b"\xf0\x9f\xa6\x80"); - assert_eq!(format!("{}", good_bytes), "\xf0\x9f\xa6\x80"); + assert_eq!(format!("{good_bytes}"), "\xf0\x9f\xa6\x80"); }
#[test] fn test_bstr_debug() { let hello_world = BStr::from_bytes(b"hello, world!"); - assert_eq!(format!("{:?}", hello_world), ""hello, world!""); + assert_eq!(format!("{hello_world:?}"), ""hello, world!""); let escapes = BStr::from_bytes(b"_\t_\n_\r_\_'_"_"); - assert_eq!(format!("{:?}", escapes), ""_\t_\n_\r_\\_'_\"_""); + assert_eq!(format!("{escapes:?}"), ""_\t_\n_\r_\\_'_\"_""); let others = BStr::from_bytes(b"\x01"); - assert_eq!(format!("{:?}", others), ""\x01""); + assert_eq!(format!("{others:?}"), ""\x01""); let non_ascii = BStr::from_bytes(b"d\xe9j\xe0 vu"); - assert_eq!(format!("{:?}", non_ascii), ""d\xe9j\xe0 vu""); + assert_eq!(format!("{non_ascii:?}"), ""d\xe9j\xe0 vu""); let good_bytes = BStr::from_bytes(b"\xf0\x9f\xa6\x80"); - assert_eq!(format!("{:?}", good_bytes), ""\xf0\x9f\xa6\x80""); + assert_eq!(format!("{good_bytes:?}"), ""\xf0\x9f\xa6\x80""); } }
--- a/rust/macros/module.rs +++ b/rust/macros/module.rs @@ -48,7 +48,7 @@ impl<'a> ModInfoBuilder<'a> { ) } else { // Loadable modules' modinfo strings go as-is. - format!("{field}={content}\0", field = field, content = content) + format!("{field}={content}\0") };
write!( @@ -124,10 +124,7 @@ impl ModuleInfo { };
if seen_keys.contains(&key) { - panic!( - "Duplicated key "{}". Keys can only be specified once.", - key - ); + panic!("Duplicated key "{key}". Keys can only be specified once."); }
assert_eq!(expect_punct(it), ':'); @@ -140,10 +137,7 @@ impl ModuleInfo { "license" => info.license = expect_string_ascii(it), "alias" => info.alias = Some(expect_string_array(it)), "firmware" => info.firmware = Some(expect_string_array(it)), - _ => panic!( - "Unknown key "{}". Valid keys are: {:?}.", - key, EXPECTED_KEYS - ), + _ => panic!("Unknown key "{key}". Valid keys are: {EXPECTED_KEYS:?}."), }
assert_eq!(expect_punct(it), ','); @@ -155,7 +149,7 @@ impl ModuleInfo {
for key in REQUIRED_KEYS { if !seen_keys.iter().any(|e| e == key) { - panic!("Missing required key "{}".", key); + panic!("Missing required key "{key}"."); } }
@@ -167,10 +161,7 @@ impl ModuleInfo { }
if seen_keys != ordered_keys { - panic!( - "Keys are not ordered as expected. Order them like: {:?}.", - ordered_keys - ); + panic!("Keys are not ordered as expected. Order them like: {ordered_keys:?}."); }
info --- a/rust/macros/paste.rs +++ b/rust/macros/paste.rs @@ -50,7 +50,7 @@ fn concat_helper(tokens: &[TokenTree]) - let tokens = group.stream().into_iter().collect::<Vec<TokenTree>>(); segments.append(&mut concat_helper(tokens.as_slice())); } - token => panic!("unexpected token in paste segments: {:?}", token), + token => panic!("unexpected token in paste segments: {token:?}"), }; }
--- a/rust/macros/pinned_drop.rs +++ b/rust/macros/pinned_drop.rs @@ -25,8 +25,7 @@ pub(crate) fn pinned_drop(_args: TokenSt // Found the end of the generics, this should be `PinnedDrop`. assert!( matches!(tt, TokenTree::Ident(i) if i.to_string() == "PinnedDrop"), - "expected 'PinnedDrop', found: '{:?}'", - tt + "expected 'PinnedDrop', found: '{tt:?}'" ); pinned_drop_idx = Some(i); break;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
Commit b53e523261bf058ea4a518b482222e7a277b186b upstream.
There are a few spots where linked timeouts are armed, and not all of them adhere to the pre-arm, attempt issue, post-arm pattern. This can be problematic if the linked request returns that it will trigger a callback later, and does so before the linked timeout is fully armed.
Consolidate all the linked timeout handling into __io_issue_sqe(), rather than have it spread throughout the various issue entry points.
Cc: stable@vger.kernel.org Link: https://github.com/axboe/liburing/issues/1390 Reported-by: Chase Hiltz chase@path.net Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 50 +++++++++++++++----------------------------------- 1 file changed, 15 insertions(+), 35 deletions(-)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -443,24 +443,6 @@ static struct io_kiocb *__io_prep_linked return req->link; }
-static inline struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req) -{ - if (likely(!(req->flags & REQ_F_ARM_LTIMEOUT))) - return NULL; - return __io_prep_linked_timeout(req); -} - -static noinline void __io_arm_ltimeout(struct io_kiocb *req) -{ - io_queue_linked_timeout(__io_prep_linked_timeout(req)); -} - -static inline void io_arm_ltimeout(struct io_kiocb *req) -{ - if (unlikely(req->flags & REQ_F_ARM_LTIMEOUT)) - __io_arm_ltimeout(req); -} - static void io_prep_async_work(struct io_kiocb *req) { const struct io_issue_def *def = &io_issue_defs[req->opcode]; @@ -513,7 +495,6 @@ static void io_prep_async_link(struct io
static void io_queue_iowq(struct io_kiocb *req) { - struct io_kiocb *link = io_prep_linked_timeout(req); struct io_uring_task *tctx = req->tctx;
BUG_ON(!tctx); @@ -538,8 +519,6 @@ static void io_queue_iowq(struct io_kioc
trace_io_uring_queue_async_work(req, io_wq_is_hashed(&req->work)); io_wq_enqueue(tctx->io_wq, &req->work); - if (link) - io_queue_linked_timeout(link); }
static void io_req_queue_iowq_tw(struct io_kiocb *req, struct io_tw_state *ts) @@ -1728,17 +1707,24 @@ static bool io_assign_file(struct io_kio return !!req->file; }
+#define REQ_ISSUE_SLOW_FLAGS (REQ_F_CREDS | REQ_F_ARM_LTIMEOUT) + static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags) { const struct io_issue_def *def = &io_issue_defs[req->opcode]; const struct cred *creds = NULL; + struct io_kiocb *link = NULL; int ret;
if (unlikely(!io_assign_file(req, def, issue_flags))) return -EBADF;
- if (unlikely((req->flags & REQ_F_CREDS) && req->creds != current_cred())) - creds = override_creds(req->creds); + if (unlikely(req->flags & REQ_ISSUE_SLOW_FLAGS)) { + if ((req->flags & REQ_F_CREDS) && req->creds != current_cred()) + creds = override_creds(req->creds); + if (req->flags & REQ_F_ARM_LTIMEOUT) + link = __io_prep_linked_timeout(req); + }
if (!def->audit_skip) audit_uring_entry(req->opcode); @@ -1748,8 +1734,12 @@ static int io_issue_sqe(struct io_kiocb if (!def->audit_skip) audit_uring_exit(!ret, ret);
- if (creds) - revert_creds(creds); + if (unlikely(creds || link)) { + if (creds) + revert_creds(creds); + if (link) + io_queue_linked_timeout(link); + }
if (ret == IOU_OK) { if (issue_flags & IO_URING_F_COMPLETE_DEFER) @@ -1762,7 +1752,6 @@ static int io_issue_sqe(struct io_kiocb
if (ret == IOU_ISSUE_SKIP_COMPLETE) { ret = 0; - io_arm_ltimeout(req);
/* If the op doesn't have a file, we're not polling for it */ if ((req->ctx->flags & IORING_SETUP_IOPOLL) && def->iopoll_queue) @@ -1805,8 +1794,6 @@ void io_wq_submit_work(struct io_wq_work else req_ref_get(req);
- io_arm_ltimeout(req); - /* either cancelled or io-wq is dying, so don't touch tctx->iowq */ if (atomic_read(&work->flags) & IO_WQ_WORK_CANCEL) { fail: @@ -1922,15 +1909,11 @@ struct file *io_file_get_normal(struct i static void io_queue_async(struct io_kiocb *req, int ret) __must_hold(&req->ctx->uring_lock) { - struct io_kiocb *linked_timeout; - if (ret != -EAGAIN || (req->flags & REQ_F_NOWAIT)) { io_req_defer_failed(req, ret); return; }
- linked_timeout = io_prep_linked_timeout(req); - switch (io_arm_poll_handler(req, 0)) { case IO_APOLL_READY: io_kbuf_recycle(req, 0); @@ -1943,9 +1926,6 @@ static void io_queue_async(struct io_kio case IO_APOLL_OK: break; } - - if (linked_timeout) - io_queue_linked_timeout(linked_timeout); }
static inline void io_queue_sqe(struct io_kiocb *req)
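Reviewer's illustration (not part of the patch): the "pre-arm, attempt issue, post-arm" pattern from the commit message, reduced to a single entry point. All types, flags and helpers here are hypothetical stand-ins rather than io_uring internals. The model assumes that preparing the linked timeout clears the arm flag so that no other path arms it a second time.

```c
#include <stdbool.h>
#include <stddef.h>

#define F_ARM_LTIMEOUT 0x1u /* hypothetical flag, modelled on REQ_F_ARM_LTIMEOUT */

struct request {
	unsigned int flags;
	struct request *link; /* linked timeout, if any */
	bool timeout_armed;
	int arm_order;        /* tick at which the timeout was armed */
	int issue_order;      /* tick at which the request was issued */
};

static int clock_tick;

/* Pre-arm: claim the linked timeout and clear the flag. */
static struct request *prep_linked_timeout(struct request *req)
{
	req->flags &= ~F_ARM_LTIMEOUT;
	return req->link;
}

/* Stand-in for the opcode handler. */
static int do_issue(struct request *req)
{
	req->issue_order = ++clock_tick;
	return 0;
}

/* Post-arm: start the timer only after the issue attempt. */
static void queue_linked_timeout(struct request *link)
{
	link->timeout_armed = true;
	link->arm_order = ++clock_tick;
}

/* Single entry point: pre-arm, attempt issue, post-arm. */
static int issue_request(struct request *req)
{
	struct request *link = NULL;
	int ret;

	if (req->flags & F_ARM_LTIMEOUT)
		link = prep_linked_timeout(req);

	ret = do_issue(req);

	if (link)
		queue_linked_timeout(link);
	return ret;
}
```

Because every caller goes through issue_request(), no entry point can skip the pre-arm step or arm the timeout at a different moment, which is the consolidation the patch performs in __io_issue_sqe().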
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hao Qin hao.qin@mediatek.com
commit 33634e2ab7c6369391e0ca4b9b97dc861e33d20e upstream.
Remove the resetting step before downloading the fw, as it may cause other usb devices to fail to initialise when connected during boot on kernels 6.11 and newer.
Signed-off-by: Hao Qin hao.qin@mediatek.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Cc: "Geoffrey D. Bennett" g@b4.vu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bluetooth/btmtk.c | 10 ---------- 1 file changed, 10 deletions(-)
--- a/drivers/bluetooth/btmtk.c +++ b/drivers/bluetooth/btmtk.c @@ -1330,13 +1330,6 @@ int btmtk_usb_setup(struct hci_dev *hdev break; case 0x7922: case 0x7925: - /* Reset the device to ensure it's in the initial state before - * downloading the firmware to ensure. - */ - - if (!test_bit(BTMTK_FIRMWARE_LOADED, &btmtk_data->flags)) - btmtk_usb_subsys_reset(hdev, dev_id); - fallthrough; case 0x7961: btmtk_fw_get_filename(fw_bin_name, sizeof(fw_bin_name), dev_id, fw_version, fw_flavor); @@ -1345,12 +1338,9 @@ int btmtk_usb_setup(struct hci_dev *hdev btmtk_usb_hci_wmt_sync); if (err < 0) { bt_dev_err(hdev, "Failed to set up firmware (%d)", err); - clear_bit(BTMTK_FIRMWARE_LOADED, &btmtk_data->flags); return err; }
- set_bit(BTMTK_FIRMWARE_LOADED, &btmtk_data->flags); - /* It's Device EndPoint Reset Option Register */ err = btmtk_usb_uhw_reg_write(hdev, MTK_EP_RST_OPT, MTK_EP_RST_IN_OUT_OPT);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Weiner hannes@cmpxchg.org
commit c2f6ea38fc1b640aa7a2e155cc1c0410ff91afa2 upstream.
The fallback code searches for the biggest buddy first in an attempt to steal the whole block and encourage type grouping down the line.
The approach used to be this:
- Non-movable requests will split the largest buddy and steal the remainder. This splits up contiguity, but it allows subsequent requests of this type to fall back into adjacent space.
- Movable requests go and look for the smallest buddy instead. The thinking is that movable requests can be compacted, so grouping is less important than retaining contiguity.
c0cd6f557b90 ("mm: page_alloc: fix freelist movement during block conversion") enforces freelist type hygiene, which restricts stealing to either claiming the whole block or just taking the requested chunk; no additional pages or buddy remainders can be stolen any more.
The patch mishandled when to switch to finding the smallest buddy in that new reality. As a result, it may steal the exact request size, but from the biggest buddy. This causes fracturing for no good reason.
Fix this by committing to the new behavior: either steal the whole block, or fall back to the smallest buddy.
Remove single-page stealing from steal_suitable_fallback(). Rename it to try_to_steal_block() to make the intentions clear. If this fails, always fall back to the smallest buddy.
The following is from 4 runs of mmtest's thpchallenge. "Pollute" is single page fallback, "steal" is conversion of a partially used block. The numbers for free block conversions (omitted) are comparable.
vanilla patched
@pollute[unmovable from reclaimable]: 27 106 @pollute[unmovable from movable]: 82 46 @pollute[reclaimable from unmovable]: 256 83 @pollute[reclaimable from movable]: 46 8 @pollute[movable from unmovable]: 4841 868 @pollute[movable from reclaimable]: 5278 12568
@steal[unmovable from reclaimable]: 11 12 @steal[unmovable from movable]: 113 49 @steal[reclaimable from unmovable]: 19 34 @steal[reclaimable from movable]: 47 21 @steal[movable from unmovable]: 250 183 @steal[movable from reclaimable]: 81 93
The allocator appears to do a better job at keeping stealing and polluting to the first fallback preference. As a result, the numbers for "from movable" - the least preferred fallback option, and most detrimental to compactability - are down across the board.
Link: https://lkml.kernel.org/r/20250225001023.1494422-2-hannes@cmpxchg.org Fixes: c0cd6f557b90 ("mm: page_alloc: fix freelist movement during block conversion") Signed-off-by: Johannes Weiner hannes@cmpxchg.org Suggested-by: Vlastimil Babka vbabka@suse.cz Reviewed-by: Brendan Jackman jackmanb@google.com Reviewed-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Johannes Weiner hannes@cmpxchg.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_alloc.c | 80 +++++++++++++++++++++++--------------------------------- 1 file changed, 34 insertions(+), 46 deletions(-)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1908,13 +1908,12 @@ static inline bool boost_watermark(struc * can claim the whole pageblock for the requested migratetype. If not, we check * the pageblock for constituent pages; if at least half of the pages are free * or compatible, we can still claim the whole block, so pages freed in the - * future will be put on the correct free list. Otherwise, we isolate exactly - * the order we need from the fallback block and leave its migratetype alone. + * future will be put on the correct free list. */ static struct page * -steal_suitable_fallback(struct zone *zone, struct page *page, - int current_order, int order, int start_type, - unsigned int alloc_flags, bool whole_block) +try_to_steal_block(struct zone *zone, struct page *page, + int current_order, int order, int start_type, + unsigned int alloc_flags) { int free_pages, movable_pages, alike_pages; unsigned long start_pfn; @@ -1927,7 +1926,7 @@ steal_suitable_fallback(struct zone *zon * highatomic accounting. */ if (is_migrate_highatomic(block_type)) - goto single_page; + return NULL;
/* Take ownership for orders >= pageblock_order */ if (current_order >= pageblock_order) { @@ -1948,14 +1947,10 @@ steal_suitable_fallback(struct zone *zon if (boost_watermark(zone) && (alloc_flags & ALLOC_KSWAPD)) set_bit(ZONE_BOOSTED_WATERMARK, &zone->flags);
- /* We are not allowed to try stealing from the whole block */ - if (!whole_block) - goto single_page; - /* moving whole block can fail due to zone boundary conditions */ if (!prep_move_freepages_block(zone, page, &start_pfn, &free_pages, &movable_pages)) - goto single_page; + return NULL;
/* * Determine how many pages are compatible with our allocation. @@ -1988,9 +1983,7 @@ steal_suitable_fallback(struct zone *zon return __rmqueue_smallest(zone, order, start_type); }
-single_page: - page_del_and_expand(zone, page, order, current_order, block_type); - return page; + return NULL; }
/* @@ -2172,14 +2165,19 @@ static bool unreserve_highatomic_pageblo }
/* - * Try finding a free buddy page on the fallback list and put it on the free - * list of requested migratetype, possibly along with other pages from the same - * block, depending on fragmentation avoidance heuristics. Returns true if - * fallback was found so that __rmqueue_smallest() can grab it. + * Try finding a free buddy page on the fallback list. + * + * This will attempt to steal a whole pageblock for the requested type + * to ensure grouping of such requests in the future. + * + * If a whole block cannot be stolen, regress to __rmqueue_smallest() + * logic to at least break up as little contiguity as possible. * * The use of signed ints for order and current_order is a deliberate * deviation from the rest of this file, to make the for loop * condition simpler. + * + * Return the stolen page, or NULL if none can be found. */ static __always_inline struct page * __rmqueue_fallback(struct zone *zone, int order, int start_migratetype, @@ -2213,45 +2211,35 @@ __rmqueue_fallback(struct zone *zone, in if (fallback_mt == -1) continue;
- /* - * We cannot steal all free pages from the pageblock and the - * requested migratetype is movable. In that case it's better to - * steal and split the smallest available page instead of the - * largest available page, because even if the next movable - * allocation falls back into a different pageblock than this - * one, it won't cause permanent fragmentation. - */ - if (!can_steal && start_migratetype == MIGRATE_MOVABLE - && current_order > order) - goto find_smallest; + if (!can_steal) + break;
- goto do_steal; + page = get_page_from_free_area(area, fallback_mt); + page = try_to_steal_block(zone, page, current_order, order, + start_migratetype, alloc_flags); + if (page) + goto got_one; }
- return NULL; + if (alloc_flags & ALLOC_NOFRAGMENT) + return NULL;
-find_smallest: + /* No luck stealing blocks. Find the smallest fallback page */ for (current_order = order; current_order < NR_PAGE_ORDERS; current_order++) { area = &(zone->free_area[current_order]); fallback_mt = find_suitable_fallback(area, current_order, start_migratetype, false, &can_steal); - if (fallback_mt != -1) - break; - } - - /* - * This should not happen - we already found a suitable fallback - * when looking for the largest page. - */ - VM_BUG_ON(current_order > MAX_PAGE_ORDER); + if (fallback_mt == -1) + continue;
-do_steal: - page = get_page_from_free_area(area, fallback_mt); + page = get_page_from_free_area(area, fallback_mt); + page_del_and_expand(zone, page, order, current_order, fallback_mt); + goto got_one; + }
- /* take off list, maybe claim block, expand remainder */ - page = steal_suitable_fallback(zone, page, current_order, order, - start_migratetype, alloc_flags, can_steal); + return NULL;
+got_one: trace_mm_page_alloc_extfrag(page, order, current_order, start_migratetype, fallback_mt);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Weiner hannes@cmpxchg.org
commit 90abee6d7895d5eef18c91d870d8168be4e76e9d upstream.
The test robot identified c2f6ea38fc1b ("mm: page_alloc: don't steal single pages from biggest buddy") as the root cause of a 56.4% regression in vm-scalability::lru-file-mmap-read.
Carlos reports an earlier patch, c0cd6f557b90 ("mm: page_alloc: fix freelist movement during block conversion"), as the root cause for a regression in worst-case zone->lock+irqoff hold times.
Both of these patches modify the page allocator's fallback path to be less greedy in an effort to stave off fragmentation. The flip side of this is that fallbacks are also less productive each time around, which means the fallback search can run much more frequently.
Carlos' traces point to rmqueue_bulk() specifically, which tries to refill the percpu cache by allocating a large batch of pages in a loop. It highlights how once the native freelists are exhausted, the fallback code first scans orders top-down for whole blocks to claim, then falls back to a bottom-up search for the smallest buddy to steal. For the next batch page, it goes through the same thing again.
This can be made more efficient. Since rmqueue_bulk() holds the zone->lock over the entire batch, the freelists are not subject to outside changes; when the search for a block to claim has already failed, there is no point in trying again for the next page.
Modify __rmqueue() to remember the last successful fallback mode, and restart directly from there on the next rmqueue_bulk() iteration.
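As a rough illustration of the idea above (not the kernel code), the following self-contained sketch models a batch loop that remembers which search stage last succeeded. All names here (struct pools, take_one(), take_bulk(), BLOCK_PAGES) are hypothetical stand-ins, and the "freelists" are plain counters. It only shows why remembering the mode reduces the number of stage searches per batch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the allocator's search stages. */
enum mode { MODE_NORMAL, MODE_CLAIM, MODE_STEAL };

#define BLOCK_PAGES 4	/* assumed pages gained by claiming one block */

struct pools {
	int normal;	/* pages on the preferred freelist */
	int claim;	/* whole blocks that could be claimed */
	int steal;	/* single pages that could be stolen */
	int searches;	/* how many stage searches were performed */
};

/* Take one page, starting the search at *mode and falling through. */
static int take_one(struct pools *p, enum mode *mode)
{
	switch (*mode) {
	case MODE_NORMAL:
		p->searches++;
		if (p->normal > 0) {
			p->normal--;
			return 1;
		}
		/* fall through */
	case MODE_CLAIM:
		p->searches++;
		if (p->claim > 0) {
			/* Claiming refills the preferred list: back to normal. */
			p->claim--;
			p->normal += BLOCK_PAGES - 1;
			*mode = MODE_NORMAL;
			return 1;
		}
		/* fall through */
	case MODE_STEAL:
		p->searches++;
		if (p->steal > 0) {
			p->steal--;
			*mode = MODE_STEAL;	/* restart here next time */
			return 1;
		}
	}
	return 0;
}

/* Batch loop; with remember == 0 every page restarts from MODE_NORMAL. */
static int take_bulk(struct pools *p, int count, int remember)
{
	enum mode mode = MODE_NORMAL;
	int got = 0;

	assert(count >= 0);
	for (int i = 0; i < count; i++) {
		if (!remember)
			mode = MODE_NORMAL;
		if (!take_one(p, &mode))
			break;
		got++;
	}
	return got;
}
```

With only stealable pages available, a batch of 8 costs 24 stage searches when every page restarts from the top, and 10 when the mode is remembered. The sketch is only valid under the same condition the commit message states: nothing else changes the lists during the batch.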
Oliver confirms that this improves beyond the regression that the test robot reported against c2f6ea38fc1b:
commit:
  f3b92176f4 ("tools/selftests: add guard region test for /proc/$pid/pagemap")
  c2f6ea38fc ("mm: page_alloc: don't steal single pages from biggest buddy")
  acc4d5ff0b ("Merge tag 'net-6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
  2c847f27c3 ("mm: page_alloc: speed up fallbacks in rmqueue_bulk()")   <--- your patch

f3b92176f4f7100f c2f6ea38fc1b640aa7a2e155cc1 acc4d5ff0b61eb1715c498b6536 2c847f27c37da65a93d23c237c5
---------------- --------------------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \
  25525364 ±  3%     -56.4%   11135467           -57.8%   10779336           +31.6%   33581409        vm-scalability.throughput
Carlos confirms that worst-case times are almost fully recovered compared to before the earlier culprit patch:
  2dd482ba627d (before freelist hygiene):    1ms
  c0cd6f557b90  (after freelist hygiene):   90ms
 next-20250319    (steal smallest buddy):  280ms
    this patch                          :    8ms
[jackmanb@google.com: comment updates]
  Link: https://lkml.kernel.org/r/D92AC0P9594X.3BML64MUKTF8Z@google.com
[hannes@cmpxchg.org: reset rmqueue_mode in rmqueue_buddy() error loop, per Yunsheng Lin]
  Link: https://lkml.kernel.org/r/20250409140023.GA2313@cmpxchg.org
Link: https://lkml.kernel.org/r/20250407180154.63348-1-hannes@cmpxchg.org
Fixes: c0cd6f557b90 ("mm: page_alloc: fix freelist movement during block conversion")
Fixes: c2f6ea38fc1b ("mm: page_alloc: don't steal single pages from biggest buddy")
Signed-off-by: Johannes Weiner hannes@cmpxchg.org
Signed-off-by: Brendan Jackman jackmanb@google.com
Reported-by: kernel test robot oliver.sang@intel.com
Reported-by: Carlos Song carlos.song@nxp.com
Tested-by: Carlos Song carlos.song@nxp.com
Tested-by: kernel test robot oliver.sang@intel.com
Closes: https://lore.kernel.org/oe-lkp/202503271547.fc08b188-lkp@intel.com
Reviewed-by: Brendan Jackman jackmanb@google.com
Tested-by: Shivank Garg shivankg@amd.com
Acked-by: Zi Yan ziy@nvidia.com
Reviewed-by: Vlastimil Babka vbabka@suse.cz
Cc: stable@vger.kernel.org [6.10+]
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Johannes Weiner hannes@cmpxchg.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/page_alloc.c | 113 +++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 80 insertions(+), 33 deletions(-)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2165,22 +2165,15 @@ static bool unreserve_highatomic_pageblo }
/* - * Try finding a free buddy page on the fallback list. - * - * This will attempt to steal a whole pageblock for the requested type - * to ensure grouping of such requests in the future. - * - * If a whole block cannot be stolen, regress to __rmqueue_smallest() - * logic to at least break up as little contiguity as possible. + * Try to allocate from some fallback migratetype by claiming the entire block, + * i.e. converting it to the allocation's start migratetype. * * The use of signed ints for order and current_order is a deliberate * deviation from the rest of this file, to make the for loop * condition simpler. - * - * Return the stolen page, or NULL if none can be found. */ static __always_inline struct page * -__rmqueue_fallback(struct zone *zone, int order, int start_migratetype, +__rmqueue_claim(struct zone *zone, int order, int start_migratetype, unsigned int alloc_flags) { struct free_area *area; @@ -2217,14 +2210,29 @@ __rmqueue_fallback(struct zone *zone, in page = get_page_from_free_area(area, fallback_mt); page = try_to_steal_block(zone, page, current_order, order, start_migratetype, alloc_flags); - if (page) - goto got_one; + if (page) { + trace_mm_page_alloc_extfrag(page, order, current_order, + start_migratetype, fallback_mt); + return page; + } }
- if (alloc_flags & ALLOC_NOFRAGMENT) - return NULL; + return NULL; +} + +/* + * Try to steal a single page from some fallback migratetype. Leave the rest of + * the block as its current migratetype, potentially causing fragmentation. + */ +static __always_inline struct page * +__rmqueue_steal(struct zone *zone, int order, int start_migratetype) +{ + struct free_area *area; + int current_order; + struct page *page; + int fallback_mt; + bool can_steal;
- /* No luck stealing blocks. Find the smallest fallback page */ for (current_order = order; current_order < NR_PAGE_ORDERS; current_order++) { area = &(zone->free_area[current_order]); fallback_mt = find_suitable_fallback(area, current_order, @@ -2234,25 +2242,28 @@ __rmqueue_fallback(struct zone *zone, in
page = get_page_from_free_area(area, fallback_mt); page_del_and_expand(zone, page, order, current_order, fallback_mt); - goto got_one; + trace_mm_page_alloc_extfrag(page, order, current_order, + start_migratetype, fallback_mt); + return page; }
return NULL; - -got_one: - trace_mm_page_alloc_extfrag(page, order, current_order, - start_migratetype, fallback_mt); - - return page; }
+enum rmqueue_mode { + RMQUEUE_NORMAL, + RMQUEUE_CMA, + RMQUEUE_CLAIM, + RMQUEUE_STEAL, +}; + /* * Do the hard work of removing an element from the buddy allocator. * Call me with the zone->lock already held. */ static __always_inline struct page * __rmqueue(struct zone *zone, unsigned int order, int migratetype, - unsigned int alloc_flags) + unsigned int alloc_flags, enum rmqueue_mode *mode) { struct page *page;
@@ -2271,16 +2282,49 @@ __rmqueue(struct zone *zone, unsigned in } }
- page = __rmqueue_smallest(zone, order, migratetype); - if (unlikely(!page)) { - if (alloc_flags & ALLOC_CMA) + /* + * First try the freelists of the requested migratetype, then try + * fallbacks modes with increasing levels of fragmentation risk. + * + * The fallback logic is expensive and rmqueue_bulk() calls in + * a loop with the zone->lock held, meaning the freelists are + * not subject to any outside changes. Remember in *mode where + * we found pay dirt, to save us the search on the next call. + */ + switch (*mode) { + case RMQUEUE_NORMAL: + page = __rmqueue_smallest(zone, order, migratetype); + if (page) + return page; + fallthrough; + case RMQUEUE_CMA: + if (alloc_flags & ALLOC_CMA) { page = __rmqueue_cma_fallback(zone, order); + if (page) { + *mode = RMQUEUE_CMA; + return page; + } + } + fallthrough; + case RMQUEUE_CLAIM: + page = __rmqueue_claim(zone, order, migratetype, alloc_flags); + if (page) { + /* Replenished preferred freelist, back to normal mode. */ + *mode = RMQUEUE_NORMAL; + return page; + } + fallthrough; + case RMQUEUE_STEAL: + if (!(alloc_flags & ALLOC_NOFRAGMENT)) { + page = __rmqueue_steal(zone, order, migratetype); + if (page) { + *mode = RMQUEUE_STEAL; + return page; + } + } + }
- if (!page) - page = __rmqueue_fallback(zone, order, migratetype, - alloc_flags); - } - return page; + return NULL; }
/* @@ -2292,13 +2336,14 @@ static int rmqueue_bulk(struct zone *zon unsigned long count, struct list_head *list, int migratetype, unsigned int alloc_flags) { + enum rmqueue_mode rmqm = RMQUEUE_NORMAL; unsigned long flags; int i;
spin_lock_irqsave(&zone->lock, flags); for (i = 0; i < count; ++i) { struct page *page = __rmqueue(zone, order, migratetype, - alloc_flags); + alloc_flags, &rmqm); if (unlikely(page == NULL)) break;
@@ -2899,7 +2944,9 @@ struct page *rmqueue_buddy(struct zone * if (alloc_flags & ALLOC_HIGHATOMIC) page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC); if (!page) { - page = __rmqueue(zone, order, migratetype, alloc_flags); + enum rmqueue_mode rmqm = RMQUEUE_NORMAL; + + page = __rmqueue(zone, order, migratetype, alloc_flags, &rmqm);
/* * If the allocation fails, allow OOM handling and
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Morse james.morse@arm.com
commit 63de8abd97ddb9b758bd8f915ecbd18e1f1a87a0 upstream.
To generate code in the eBPF epilogue that uses the DSB instruction, insn.c needs a helper to encode the type and domain.
Re-use the crm encoding logic from the DMB instruction.
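A minimal standalone sketch of the shared-CRm idea follows. It is not the kernel's insn.c: the enum, the ENCODE_FAULT sentinel and the function names are local to this example. The CRm values are the ones listed in the diff below; the two base opcodes (0xd50330bf for DMB, 0xd503309f for DSB, both with CRm cleared) are stated here as an assumption taken from the A64 encoding, and bits [11:8] carry the CRm field as in the GENMASK(11, 8) used by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the barrier option names; not the kernel's enum. */
enum mb_type {
	MB_SY, MB_ST, MB_LD,
	MB_ISH, MB_ISHST, MB_ISHLD,
	MB_NSH, MB_NSHST, MB_NSHLD,
};

#define ENCODE_FAULT	0xffffffffu	/* sketch-only error sentinel */
#define DMB_BASE	0xd50330bfu	/* assumed: DMB with CRm == 0 */
#define DSB_BASE	0xd503309fu	/* assumed: DSB with CRm == 0 */
#define CRM_SHIFT	8
#define CRM_MASK	(0xfu << CRM_SHIFT)

/* One table of CRm values shared by both barrier instructions. */
static uint32_t barrier_crm(enum mb_type type)
{
	switch (type) {
	case MB_SY:	return 0xf;
	case MB_ST:	return 0xe;
	case MB_LD:	return 0xd;
	case MB_ISH:	return 0xb;
	case MB_ISHST:	return 0xa;
	case MB_ISHLD:	return 0x9;
	case MB_NSH:	return 0x7;
	case MB_NSHST:	return 0x6;
	case MB_NSHLD:	return 0x5;
	default:	return ENCODE_FAULT;
	}
}

/* Clear the CRm field of the base opcode and insert the option. */
static uint32_t gen_barrier(uint32_t base, enum mb_type type)
{
	uint32_t crm = barrier_crm(type);

	if (crm == ENCODE_FAULT)
		return ENCODE_FAULT;
	assert((crm & ~0xfu) == 0);	/* CRm is a 4-bit field */
	return (base & ~CRM_MASK) | (crm << CRM_SHIFT);
}

static uint32_t gen_dmb(enum mb_type type)
{
	return gen_barrier(DMB_BASE, type);
}

static uint32_t gen_dsb(enum mb_type type)
{
	return gen_barrier(DSB_BASE, type);
}
```

Under those assumptions, gen_dsb(MB_ISH) yields 0xd5033b9f and gen_dmb(MB_SY) yields 0xd5033fbf. The only difference between the two generators is the base opcode, which is the point of factoring out the CRm lookup.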
Signed-off-by: James Morse james.morse@arm.com
Reviewed-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/arm64/include/asm/insn.h |    1
 arch/arm64/lib/insn.c         |   60 +++++++++++++++++++++++++-----------------
 2 files changed, 38 insertions(+), 23 deletions(-)
--- a/arch/arm64/include/asm/insn.h +++ b/arch/arm64/include/asm/insn.h @@ -698,6 +698,7 @@ u32 aarch64_insn_gen_cas(enum aarch64_in } #endif u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type); +u32 aarch64_insn_gen_dsb(enum aarch64_insn_mb_type type); u32 aarch64_insn_gen_mrs(enum aarch64_insn_register result, enum aarch64_insn_system_register sysreg);
--- a/arch/arm64/lib/insn.c +++ b/arch/arm64/lib/insn.c @@ -5,6 +5,7 @@ * * Copyright (C) 2014-2016 Zi Shen Lim zlim.lnx@gmail.com */ +#include <linux/bitfield.h> #include <linux/bitops.h> #include <linux/bug.h> #include <linux/printk.h> @@ -1471,48 +1472,61 @@ u32 aarch64_insn_gen_extr(enum aarch64_i return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RM, insn, Rm); }
-u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type) +static u32 __get_barrier_crm_val(enum aarch64_insn_mb_type type) { - u32 opt; - u32 insn; - switch (type) { case AARCH64_INSN_MB_SY: - opt = 0xf; - break; + return 0xf; case AARCH64_INSN_MB_ST: - opt = 0xe; - break; + return 0xe; case AARCH64_INSN_MB_LD: - opt = 0xd; - break; + return 0xd; case AARCH64_INSN_MB_ISH: - opt = 0xb; - break; + return 0xb; case AARCH64_INSN_MB_ISHST: - opt = 0xa; - break; + return 0xa; case AARCH64_INSN_MB_ISHLD: - opt = 0x9; - break; + return 0x9; case AARCH64_INSN_MB_NSH: - opt = 0x7; - break; + return 0x7; case AARCH64_INSN_MB_NSHST: - opt = 0x6; - break; + return 0x6; case AARCH64_INSN_MB_NSHLD: - opt = 0x5; - break; + return 0x5; default: - pr_err("%s: unknown dmb type %d\n", __func__, type); + pr_err("%s: unknown barrier type %d\n", __func__, type); return AARCH64_BREAK_FAULT; } +} + +u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type) +{ + u32 opt; + u32 insn; + + opt = __get_barrier_crm_val(type); + if (opt == AARCH64_BREAK_FAULT) + return AARCH64_BREAK_FAULT;
insn = aarch64_insn_get_dmb_value(); insn &= ~GENMASK(11, 8); insn |= (opt << 8);
+ return insn; +} + +u32 aarch64_insn_gen_dsb(enum aarch64_insn_mb_type type) +{ + u32 opt, insn; + + opt = __get_barrier_crm_val(type); + if (opt == AARCH64_BREAK_FAULT) + return AARCH64_BREAK_FAULT; + + insn = aarch64_insn_get_dsb_base_value(); + insn &= ~GENMASK(11, 8); + insn |= (opt << 8); + return insn; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Morse james.morse@arm.com
commit e7956c92f396a44eeeb6eaf7a5b5e1ad24db6748 upstream.
is_spectre_bhb_fw_affected() allows the caller to determine if the CPU is known to need a firmware mitigation. CPUs are either on the list of CPUs we know about, or firmware has been queried and reported that the platform is affected - and mitigated by firmware.
This helper is not useful to determine if the platform is mitigated by firmware. A CPU could be on the known list, but the firmware may not be implemented. It's affected but not mitigated.
spectre_bhb_enable_mitigation() handles this distinction by checking the firmware state before enabling the mitigation.
Add a helper to expose this state. This will be used by the BPF JIT to determine if calling firmware for a mitigation is necessary and supported.
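The distinction between "affected" and "mitigated" can be sketched in isolation as below. This is a toy model, not proton-pack.c: struct platform, its two fields, the SKETCH_* bit names and the helper names are all invented for illustration. Only the shape (a bit recorded at enable time, then tested by a small helper) follows the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions local to this sketch. */
enum { SKETCH_BHB_LOOP, SKETCH_BHB_FW };

struct platform {
	bool cpu_on_known_list;		/* CPU is known to need the firmware call */
	bool fw_implements_call;	/* firmware actually provides the workaround */
};

static unsigned long system_mitigations;	/* one bit per enabled mitigation */

/* "Affected": the CPU needs a firmware mitigation, whether or not one exists. */
static bool fw_affected(const struct platform *p)
{
	assert(p);
	return p->cpu_on_known_list;
}

/* Record the firmware mitigation only when firmware can really do it. */
static void enable_mitigation(const struct platform *p)
{
	assert(p);
	if (fw_affected(p) && p->fw_implements_call)
		system_mitigations |= 1UL << SKETCH_BHB_FW;
}

/* "Mitigated": the state a code generator should consult before emitting a call. */
static bool fw_mitigated(void)
{
	return (system_mitigations & (1UL << SKETCH_BHB_FW)) != 0;
}
```

A platform with the CPU on the list but no firmware support stays affected and unmitigated, so a caller that asks fw_mitigated() would not emit a firmware call that cannot work.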
Signed-off-by: James Morse james.morse@arm.com
Reviewed-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/arm64/include/asm/spectre.h |    1 +
 arch/arm64/kernel/proton-pack.c  |    5 +++++
 2 files changed, 6 insertions(+)

--- a/arch/arm64/include/asm/spectre.h
+++ b/arch/arm64/include/asm/spectre.h
@@ -97,6 +97,7 @@ enum mitigation_state arm64_get_meltdown
 
 enum mitigation_state arm64_get_spectre_bhb_state(void);
 bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, int scope);
+bool is_spectre_bhb_fw_mitigated(void);
 void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *__unused);
 bool try_emulate_el1_ssbs(struct pt_regs *regs, u32 instr);
 
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -1093,6 +1093,11 @@ void spectre_bhb_enable_mitigation(const
 	update_mitigation_state(&spectre_bhb_state, state);
 }
 
+bool is_spectre_bhb_fw_mitigated(void)
+{
+	return test_bit(BHB_FW, &system_bhb_mitigations);
+}
+
 /* Patched to NOP when enabled */
 void noinstr spectre_bhb_patch_loop_mitigation_enable(struct alt_instr *alt,
 						      __le32 *origptr,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Morse james.morse@arm.com
commit a1152be30a043d2d4dcb1683415f328bf3c51978 upstream.
Add a helper to expose the k value of the branchy loop. This is needed by the BPF JIT to generate the mitigation sequence in BPF programs.
Signed-off-by: James Morse james.morse@arm.com
Reviewed-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/arm64/include/asm/spectre.h |    1 +
 arch/arm64/kernel/proton-pack.c  |    5 +++++
 2 files changed, 6 insertions(+)

--- a/arch/arm64/include/asm/spectre.h
+++ b/arch/arm64/include/asm/spectre.h
@@ -97,6 +97,7 @@ enum mitigation_state arm64_get_meltdown
 
 enum mitigation_state arm64_get_spectre_bhb_state(void);
 bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, int scope);
+u8 get_spectre_bhb_loop_value(void);
 bool is_spectre_bhb_fw_mitigated(void);
 void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *__unused);
 bool try_emulate_el1_ssbs(struct pt_regs *regs, u32 instr);
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -998,6 +998,11 @@ bool is_spectre_bhb_affected(const struc
 	return true;
 }
 
+u8 get_spectre_bhb_loop_value(void)
+{
+	return max_bhb_k;
+}
+
 static void this_cpu_set_vectors(enum arm64_bp_harden_el1_vectors slot)
 {
 	const char *v = arm64_get_bp_hardening_vector(slot);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Morse james.morse@arm.com
commit 0dfefc2ea2f29ced2416017d7e5b1253a54c2735 upstream.
A malicious BPF program may manipulate the branch history to influence what the hardware speculates will happen next.
On exit from a BPF program, emit the BHB mitigation sequence.
This is only applied for 'classic' cBPF programs that are loaded by seccomp.
Signed-off-by: James Morse james.morse@arm.com
Reviewed-by: Catalin Marinas catalin.marinas@arm.com
Acked-by: Daniel Borkmann daniel@iogearbox.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/arm64/include/asm/spectre.h |    1
 arch/arm64/kernel/proton-pack.c  |    2 -
 arch/arm64/net/bpf_jit_comp.c    |   54 ++++++++++++++++++++++++++++++++++++---
 3 files changed, 52 insertions(+), 5 deletions(-)
--- a/arch/arm64/include/asm/spectre.h +++ b/arch/arm64/include/asm/spectre.h @@ -97,6 +97,7 @@ enum mitigation_state arm64_get_meltdown
enum mitigation_state arm64_get_spectre_bhb_state(void); bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, int scope); +extern bool __nospectre_bhb; u8 get_spectre_bhb_loop_value(void); bool is_spectre_bhb_fw_mitigated(void); void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *__unused); --- a/arch/arm64/kernel/proton-pack.c +++ b/arch/arm64/kernel/proton-pack.c @@ -1020,7 +1020,7 @@ static void this_cpu_set_vectors(enum ar isb(); }
-static bool __read_mostly __nospectre_bhb; +bool __read_mostly __nospectre_bhb; static int __init parse_spectre_bhb_param(char *str) { __nospectre_bhb = true; --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -7,6 +7,7 @@
#define pr_fmt(fmt) "bpf_jit: " fmt
+#include <linux/arm-smccc.h> #include <linux/bitfield.h> #include <linux/bpf.h> #include <linux/filter.h> @@ -17,6 +18,7 @@ #include <asm/asm-extable.h> #include <asm/byteorder.h> #include <asm/cacheflush.h> +#include <asm/cpufeature.h> #include <asm/debug-monitors.h> #include <asm/insn.h> #include <asm/text-patching.h> @@ -864,7 +866,48 @@ static void build_plt(struct jit_ctx *ct plt->target = (u64)&dummy_tramp; }
-static void build_epilogue(struct jit_ctx *ctx) +/* Clobbers BPF registers 1-4, aka x0-x3 */ +static void __maybe_unused build_bhb_mitigation(struct jit_ctx *ctx) +{ + const u8 r1 = bpf2a64[BPF_REG_1]; /* aka x0 */ + u8 k = get_spectre_bhb_loop_value(); + + if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY) || + cpu_mitigations_off() || __nospectre_bhb || + arm64_get_spectre_v2_state() == SPECTRE_VULNERABLE) + return; + + if (supports_clearbhb(SCOPE_SYSTEM)) { + emit(aarch64_insn_gen_hint(AARCH64_INSN_HINT_CLEARBHB), ctx); + return; + } + + if (k) { + emit_a64_mov_i64(r1, k, ctx); + emit(A64_B(1), ctx); + emit(A64_SUBS_I(true, r1, r1, 1), ctx); + emit(A64_B_(A64_COND_NE, -2), ctx); + emit(aarch64_insn_gen_dsb(AARCH64_INSN_MB_ISH), ctx); + emit(aarch64_insn_get_isb_value(), ctx); + } + + if (is_spectre_bhb_fw_mitigated()) { + emit(A64_ORR_I(false, r1, AARCH64_INSN_REG_ZR, + ARM_SMCCC_ARCH_WORKAROUND_3), ctx); + switch (arm_smccc_1_1_get_conduit()) { + case SMCCC_CONDUIT_HVC: + emit(aarch64_insn_get_hvc_value(), ctx); + break; + case SMCCC_CONDUIT_SMC: + emit(aarch64_insn_get_smc_value(), ctx); + break; + default: + pr_err_once("Firmware mitigation enabled with unknown conduit\n"); + } + } +} + +static void build_epilogue(struct jit_ctx *ctx, bool was_classic) { const u8 r0 = bpf2a64[BPF_REG_0]; const u8 ptr = bpf2a64[TCCNT_PTR]; @@ -877,10 +920,13 @@ static void build_epilogue(struct jit_ct
emit(A64_POP(A64_ZR, ptr, A64_SP), ctx);
+ if (was_classic) + build_bhb_mitigation(ctx); + /* Restore FP/LR registers */ emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx);
- /* Set return value */ + /* Move the return value from bpf:r0 (aka x7) to x0 */ emit(A64_MOV(1, A64_R(0), r0), ctx);
/* Authenticate lr */ @@ -1817,7 +1863,7 @@ struct bpf_prog *bpf_int_jit_compile(str }
ctx.epilogue_offset = ctx.idx; - build_epilogue(&ctx); + build_epilogue(&ctx, was_classic); build_plt(&ctx);
extable_align = __alignof__(struct exception_table_entry); @@ -1880,7 +1926,7 @@ skip_init_ctx: goto out_free_hdr; }
- build_epilogue(&ctx); + build_epilogue(&ctx, was_classic); build_plt(&ctx);
/* Extra pass to validate JITed code. */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Morse james.morse@arm.com
commit f300769ead032513a68e4a02e806393402e626f8 upstream.
Support for eBPF programs loaded by unprivileged users is typically disabled. This means only cBPF programs need to be mitigated for BHB.
In addition, only mitigate cBPF programs that were loaded by an unprivileged user. Privileged users can also load the same program via eBPF, making the mitigation pointless.
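The resulting decision order in the arm64 JIT epilogue can be summarised with the sketch below. It is a model only: struct bhb_env and its fields, the STEP_* flags and plan_bhb_mitigation() are hypothetical names, and the single mitigations_off field stands in for the several "mitigation disabled" checks in the real code. The ordering (classic programs only, skip for privileged loaders, prefer the clearing instruction, otherwise loop and/or firmware call) follows the diffs in this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Steps the epilogue would emit; flags are local to this sketch. */
enum {
	STEP_CLEARBHB	= 1 << 0,	/* single BHB-clearing hint instruction */
	STEP_LOOP	= 1 << 1,	/* branchy loop with k iterations */
	STEP_FW_CALL	= 1 << 2,	/* firmware workaround call */
};

struct bhb_env {
	bool was_classic;	/* program was loaded as cBPF */
	bool privileged;	/* loader has CAP_SYS_ADMIN */
	bool mitigations_off;	/* any of the "disabled" conditions holds */
	bool has_clearbhb;	/* CPU supports the clearing instruction */
	unsigned int k;		/* loop count, 0 if the loop is not needed */
	bool fw_mitigated;	/* firmware mitigation is in use */
};

static unsigned int plan_bhb_mitigation(const struct bhb_env *e)
{
	unsigned int steps = 0;

	assert(e);
	if (!e->was_classic || e->mitigations_off)
		return 0;
	/* A privileged user could load the same program as eBPF anyway. */
	if (e->privileged)
		return 0;
	if (e->has_clearbhb)
		return STEP_CLEARBHB;
	if (e->k)
		steps |= STEP_LOOP;
	if (e->fw_mitigated)
		steps |= STEP_FW_CALL;
	return steps;
}
```

For example, an unprivileged cBPF program on a CPU without the clearing instruction, with k = 8 and no firmware mitigation, would get only STEP_LOOP, while the same program loaded by a privileged user would get nothing.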
Signed-off-by: James Morse james.morse@arm.com
Reviewed-by: Catalin Marinas catalin.marinas@arm.com
Acked-by: Daniel Borkmann daniel@iogearbox.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/arm64/net/bpf_jit_comp.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -877,6 +877,9 @@ static void __maybe_unused build_bhb_mit
 	    arm64_get_spectre_v2_state() == SPECTRE_VULNERABLE)
 		return;
 
+	if (capable(CAP_SYS_ADMIN))
+		return;
+
 	if (supports_clearbhb(SCOPE_SYSTEM)) {
 		emit(aarch64_insn_gen_hint(AARCH64_INSN_HINT_CLEARBHB), ctx);
 		return;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Morse james.morse@arm.com
commit efe676a1a7554219eae0b0dcfe1e0cdcc9ef9aef upstream.
Update the list of 'k' values for the branch mitigation from arm's website.
Add the values for Cortex-X1C. The MIDR_EL1 value can be found here: https://developer.arm.com/documentation/101968/0002/Register-descriptions/AA...
Link: https://developer.arm.com/documentation/110280/2-0/?lang=en
Signed-off-by: James Morse james.morse@arm.com
Reviewed-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/arm64/include/asm/cputype.h |    2 ++
 arch/arm64/kernel/proton-pack.c  |    1 +
 2 files changed, 3 insertions(+)
--- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -81,6 +81,7 @@ #define ARM_CPU_PART_CORTEX_A78AE 0xD42 #define ARM_CPU_PART_CORTEX_X1 0xD44 #define ARM_CPU_PART_CORTEX_A510 0xD46 +#define ARM_CPU_PART_CORTEX_X1C 0xD4C #define ARM_CPU_PART_CORTEX_A520 0xD80 #define ARM_CPU_PART_CORTEX_A710 0xD47 #define ARM_CPU_PART_CORTEX_A715 0xD4D @@ -167,6 +168,7 @@ #define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE) #define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1) #define MIDR_CORTEX_A510 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510) +#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C) #define MIDR_CORTEX_A520 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A520) #define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710) #define MIDR_CORTEX_A715 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A715) --- a/arch/arm64/kernel/proton-pack.c +++ b/arch/arm64/kernel/proton-pack.c @@ -891,6 +891,7 @@ static u8 spectre_bhb_loop_affected(void MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE), MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C), MIDR_ALL_VERSIONS(MIDR_CORTEX_X1), + MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C), MIDR_ALL_VERSIONS(MIDR_CORTEX_A710), MIDR_ALL_VERSIONS(MIDR_CORTEX_X2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Sneddon daniel.sneddon@linux.intel.com
commit d4e89d212d401672e9cdfe825d947ee3a9fbe3f5 upstream.
Classic BPF programs have been identified as potential vectors for intra-mode Branch Target Injection (BTI) attacks. Classic BPF programs can be run by unprivileged users. They allow unprivileged code to execute inside the kernel. Attackers can use unprivileged cBPF to craft branch history in kernel mode that can influence the target of indirect branches.
Introduce a branch history buffer (BHB) clearing sequence during the JIT compilation of classic BPF programs. The clearing sequence is the same as is used in previous mitigations to protect syscalls. Since eBPF programs already have their own mitigations in place, only insert the call on classic programs that aren't run by privileged users.
Signed-off-by: Daniel Sneddon daniel.sneddon@linux.intel.com
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Acked-by: Daniel Borkmann daniel@iogearbox.net
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/net/bpf_jit_comp.c |   31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
--- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1450,6 +1450,30 @@ static void emit_priv_frame_ptr(u8 **ppr #define PRIV_STACK_GUARD_SZ 8 #define PRIV_STACK_GUARD_VAL 0xEB9F12345678eb9fULL
+static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip, + struct bpf_prog *bpf_prog) +{ + u8 *prog = *pprog; + u8 *func; + + if (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP)) { + /* The clearing sequence clobbers eax and ecx. */ + EMIT1(0x50); /* push rax */ + EMIT1(0x51); /* push rcx */ + ip += 2; + + func = (u8 *)clear_bhb_loop; + ip += x86_call_depth_emit_accounting(&prog, func, ip); + + if (emit_call(&prog, func, ip)) + return -EINVAL; + EMIT1(0x59); /* pop rcx */ + EMIT1(0x58); /* pop rax */ + } + *pprog = prog; + return 0; +} + static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image, int oldproglen, struct jit_context *ctx, bool jmp_padding) { @@ -2467,6 +2491,13 @@ emit_jmp: seen_exit = true; /* Update cleanup_addr */ ctx->cleanup_addr = proglen; + if (bpf_prog_was_classic(bpf_prog) && + !capable(CAP_SYS_ADMIN)) { + u8 *ip = image + addrs[i - 1]; + + if (emit_spectre_bhb_barrier(&prog, ip, bpf_prog)) + return -EINVAL; + } if (bpf_prog->aux->exception_boundary) { pop_callee_regs(&prog, all_callee_regs_used); pop_r12(&prog);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Sneddon daniel.sneddon@linux.intel.com
commit 9f725eec8fc0b39bdc07dcc8897283c367c1a163 upstream.
Classic BPF programs can be run by unprivileged users, allowing unprivileged code to execute inside the kernel. Attackers can use this to craft branch history in kernel mode that can influence the target of indirect branches.
BHI_DIS_S provides user-kernel isolation of branch history, but cBPF can be used to bypass this protection by crafting branch history in kernel mode. To stop intra-mode attacks via cBPF programs, Intel created a new instruction Indirect Branch History Fence (IBHF). IBHF prevents the predicted targets of subsequent indirect branches from being influenced by branch history prior to the IBHF. IBHF is only effective while BHI_DIS_S is enabled.
Add the IBHF instruction to cBPF jitted code's exit path. Add the new fence when the hardware mitigation is enabled (i.e., X86_FEATURE_CLEAR_BHB_HW is set), or after the software sequence when X86_FEATURE_CLEAR_BHB_LOOP is being used in a virtual machine. Note that X86_FEATURE_CLEAR_BHB_HW and X86_FEATURE_CLEAR_BHB_LOOP are mutually exclusive, so the JIT compiler will only emit the new fence, not the SW sequence, when X86_FEATURE_CLEAR_BHB_HW is set.
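The condition described above, together with the five instruction bytes used in the diff below, can be sketched as standalone C. This is not the JIT code: want_ibhf() and emit_ibhf() are hypothetical helpers, the three booleans stand in for the cpu_feature_enabled() checks, and a 64-bit build is assumed (the patch additionally checks CONFIG_X86_64 on the hardware-mitigation branch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Emit the fence when the hardware mitigation is enabled, or when the
 * software clearing loop is in use while running under a hypervisor.
 */
static bool want_ibhf(bool bhb_loop, bool hypervisor, bool bhb_hw)
{
	return (bhb_loop && hypervisor) || bhb_hw;
}

/*
 * Append the IBHF encoding to buf. Returns the number of bytes written,
 * or 0 if the buffer has fewer than five bytes of room.
 */
static size_t emit_ibhf(uint8_t *buf, size_t room)
{
	/* REP prefix, REX.W prefix, then 0F 1E F8, as in the patch. */
	static const uint8_t ibhf[] = { 0xF3, 0x48, 0x0F, 0x1E, 0xF8 };

	assert(buf);
	if (room < sizeof(ibhf))
		return 0;
	memcpy(buf, ibhf, sizeof(ibhf));
	return sizeof(ibhf);
}
```

On bare metal with only the software loop enabled, want_ibhf() is false and nothing extra is emitted; the fence is added only in the two cases the commit message names.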
Hardware that enumerates BHI_NO basically has BHI_DIS_S protections always enabled, regardless of the value of BHI_DIS_S. Since BHI_DIS_S doesn't protect against intra-mode attacks, enumerate BHI bug on BHI_NO hardware as well.
Signed-off-by: Daniel Sneddon daniel.sneddon@linux.intel.com
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Acked-by: Daniel Borkmann daniel@iogearbox.net
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/cpu/common.c |    9 ++++++---
 arch/x86/net/bpf_jit_comp.c  |   19 +++++++++++++++++++
 2 files changed, 25 insertions(+), 3 deletions(-)
--- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1436,9 +1436,12 @@ static void __init cpu_set_bug_bits(stru if (vulnerable_to_rfds(x86_arch_cap_msr)) setup_force_cpu_bug(X86_BUG_RFDS);
- /* When virtualized, eIBRS could be hidden, assume vulnerable */ - if (!(x86_arch_cap_msr & ARCH_CAP_BHI_NO) && - !cpu_matches(cpu_vuln_whitelist, NO_BHI) && + /* + * Intel parts with eIBRS are vulnerable to BHI attacks. Parts with + * BHI_NO still need to use the BHI mitigation to prevent Intra-mode + * attacks. When virtualized, eIBRS could be hidden, assume vulnerable. + */ + if (!cpu_matches(cpu_vuln_whitelist, NO_BHI) && (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED) || boot_cpu_has(X86_FEATURE_HYPERVISOR))) setup_force_cpu_bug(X86_BUG_BHI); --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -41,6 +41,8 @@ static u8 *emit_code(u8 *ptr, u32 bytes, #define EMIT2(b1, b2) EMIT((b1) + ((b2) << 8), 2) #define EMIT3(b1, b2, b3) EMIT((b1) + ((b2) << 8) + ((b3) << 16), 3) #define EMIT4(b1, b2, b3, b4) EMIT((b1) + ((b2) << 8) + ((b3) << 16) + ((b4) << 24), 4) +#define EMIT5(b1, b2, b3, b4, b5) \ + do { EMIT1(b1); EMIT4(b2, b3, b4, b5); } while (0)
#define EMIT1_off32(b1, off) \ do { EMIT1(b1); EMIT(off, 4); } while (0) @@ -1470,6 +1472,23 @@ static int emit_spectre_bhb_barrier(u8 * EMIT1(0x59); /* pop rcx */ EMIT1(0x58); /* pop rax */ } + /* Insert IBHF instruction */ + if ((cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP) && + cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) || + (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_HW) && + IS_ENABLED(CONFIG_X86_64))) { + /* + * Add an Indirect Branch History Fence (IBHF). IBHF acts as a + * fence preventing branch history from before the fence from + * affecting indirect branches after the fence. This is + * specifically used in cBPF jitted code to prevent Intra-mode + * BHI attacks. The IBHF instruction is designed to be a NOP on + * hardware that doesn't need or support it. The REP and REX.W + * prefixes are required by the microcode, and they also ensure + * that the NOP is unlikely to be used in existing code. + */ + EMIT5(0xF3, 0x48, 0x0F, 0x1E, 0xF8); /* ibhf */ + } *pprog = prog; return 0; }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 073fdbe02c69c43fb7c0d547ec265c7747d4a646 upstream.
With the possibility of intra-mode BHI via cBPF, the complete mitigation for BHI is to use the IBHF (history fence) instruction with BHI_DIS_S set. Since this new instruction is only available in 64-bit mode, setting BHI_DIS_S in 32-bit mode is only a partial mitigation.
Do not set BHI_DIS_S in 32-bit mode so as to avoid reporting misleading mitigated status. With this change IBHF won't be used in 32-bit mode, so also remove the CONFIG_X86_64 check from emit_spectre_bhb_barrier().
Suggested-by: Josh Poimboeuf jpoimboe@kernel.org
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/cpu/bugs.c  |    6 +++---
 arch/x86/net/bpf_jit_comp.c |    5 +++--
 2 files changed, 6 insertions(+), 5 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1684,11 +1684,11 @@ static void __init bhi_select_mitigation return; }
- /* Mitigate in hardware if supported */ - if (spec_ctrl_bhi_dis()) + if (!IS_ENABLED(CONFIG_X86_64)) return;
- if (!IS_ENABLED(CONFIG_X86_64)) + /* Mitigate in hardware if supported */ + if (spec_ctrl_bhi_dis()) return;
if (bhi_mitigation == BHI_MITIGATION_VMEXIT_ONLY) { --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1475,8 +1475,7 @@ static int emit_spectre_bhb_barrier(u8 * /* Insert IBHF instruction */ if ((cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP) && cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) || - (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_HW) && - IS_ENABLED(CONFIG_X86_64))) { + cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_HW)) { /* * Add an Indirect Branch History Fence (IBHF). IBHF acts as a * fence preventing branch history from before the fence from @@ -1486,6 +1485,8 @@ static int emit_spectre_bhb_barrier(u8 * * hardware that doesn't need or support it. The REP and REX.W * prefixes are required by the microcode, and they also ensure * that the NOP is unlikely to be used in existing code. + * + * IBHF is not a valid instruction in 32-bit mode. */ EMIT5(0xF3, 0x48, 0x0F, 0x1E, 0xF8); /* ibhf */ }
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 1ac116ce6468670eeda39345a5585df308243dca upstream.
Add the admin-guide for Indirect Target Selection (ITS).
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 Documentation/admin-guide/hw-vuln/index.rst                     |    1
 Documentation/admin-guide/hw-vuln/indirect-target-selection.rst |  168 ++++++++++
 2 files changed, 169 insertions(+)
 create mode 100644 Documentation/admin-guide/hw-vuln/indirect-target-selection.rst

--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -22,3 +22,4 @@ are configurable at compile, boot or run
    srso
    gather_data_sampling
    reg-file-data-sampling
+   indirect-target-selection
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/indirect-target-selection.rst
@@ -0,0 +1,168 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Indirect Target Selection (ITS)
+===============================
+
+ITS is a vulnerability in some Intel CPUs that support Enhanced IBRS and were
+released before Alder Lake. ITS may allow an attacker to control the prediction
+of indirect branches and RETs located in the lower half of a cacheline.
+
+ITS is assigned CVE-2024-28956 with a CVSS score of 4.7 (Medium).
+
+Scope of Impact
+---------------
+- **eIBRS Guest/Host Isolation**: Indirect branches in KVM/kernel may still be
+  predicted with unintended target corresponding to a branch in the guest.
+
+- **Intra-Mode BTI**: In-kernel training such as through cBPF or other native
+  gadgets.
+
+- **Indirect Branch Prediction Barrier (IBPB)**: After an IBPB, indirect
+  branches may still be predicted with targets corresponding to direct branches
+  executed prior to the IBPB. This is fixed by the IPU 2025.1 microcode, which
+  should be available via distro updates. Alternatively microcode can be
+  obtained from Intel's github repository [#f1]_.
+
+Affected CPUs
+-------------
+Below is the list of ITS affected CPUs [#f2]_ [#f3]_:
+
+   ========================  ============  ====================  ===============
+   Common name               Family_Model  eIBRS                 Intra-mode BTI
+                                           Guest/Host Isolation
+   ========================  ============  ====================  ===============
+   SKYLAKE_X (step >= 6)     06_55H        Affected              Affected
+   ICELAKE_X                 06_6AH        Not affected          Affected
+   ICELAKE_D                 06_6CH        Not affected          Affected
+   ICELAKE_L                 06_7EH        Not affected          Affected
+   TIGERLAKE_L               06_8CH        Not affected          Affected
+   TIGERLAKE                 06_8DH        Not affected          Affected
+   KABYLAKE_L (step >= 12)   06_8EH        Affected              Affected
+   KABYLAKE (step >= 13)     06_9EH        Affected              Affected
+   COMETLAKE                 06_A5H        Affected              Affected
+   COMETLAKE_L               06_A6H        Affected              Affected
+   ROCKETLAKE                06_A7H        Not affected          Affected
+   ========================  ============  ====================  ===============
+
+- All affected CPUs enumerate Enhanced IBRS feature.
+- IBPB isolation is affected on all ITS affected CPUs, and need a microcode
+  update for mitigation.
+- None of the affected CPUs enumerate BHI_CTRL which was introduced in Golden
+  Cove (Alder Lake and Sapphire Rapids). This can help guests to determine the
+  host's affected status.
+- Intel Atom CPUs are not affected by ITS.
+
+Mitigation
+----------
+As only the indirect branches and RETs that have their last byte of instruction
+in the lower half of the cacheline are vulnerable to ITS, the basic idea behind
+the mitigation is to not allow indirect branches in the lower half.
+
+This is achieved by relying on existing retpoline support in the kernel, and in
+compilers. ITS-vulnerable retpoline sites are runtime patched to point to newly
+added ITS-safe thunks. These safe thunks consists of indirect branch in the
+second half of the cacheline. Not all retpoline sites are patched to thunks, if
+a retpoline site is evaluated to be ITS-safe, it is replaced with an inline
+indirect branch.
+
+Dynamic thunks
+~~~~~~~~~~~~~~
+From a dynamically allocated pool of safe-thunks, each vulnerable site is
+replaced with a new thunk, such that they get a unique address. This could
+improve the branch prediction accuracy. Also, it is a defense-in-depth measure
+against aliasing.
+
+Note, for simplicity, indirect branches in eBPF programs are always replaced
+with a jump to a static thunk in __x86_indirect_its_thunk_array. If required,
+in future this can be changed to use dynamic thunks.
+
+All vulnerable RETs are replaced with a static thunk, they do not use dynamic
+thunks. This is because RETs get their prediction from RSB mostly that does not
+depend on source address. RETs that underflow RSB may benefit from dynamic
+thunks. But, RETs significantly outnumber indirect branches, and any benefit
+from a unique source address could be outweighed by the increased icache
+footprint and iTLB pressure.
+
+Retpoline
+~~~~~~~~~
+Retpoline sequence also mitigates ITS-unsafe indirect branches. For this
+reason, when retpoline is enabled, ITS mitigation only relocates the RETs to
+safe thunks. Unless user requested the RSB-stuffing mitigation.
+
+RSB Stuffing
+~~~~~~~~~~~~
+RSB-stuffing via Call Depth Tracking is a mitigation for Retbleed RSB-underflow
+attacks. And it also mitigates RETs that are vulnerable to ITS.
+
+Mitigation in guests
+^^^^^^^^^^^^^^^^^^^^
+All guests deploy ITS mitigation by default, irrespective of eIBRS enumeration
+and Family/Model of the guest. This is because eIBRS feature could be hidden
+from a guest. One exception to this is when a guest enumerates BHI_DIS_S, which
+indicates that the guest is running on an unaffected host.
+
+To prevent guests from unnecessarily deploying the mitigation on unaffected
+platforms, Intel has defined ITS_NO bit(62) in MSR IA32_ARCH_CAPABILITIES. When
+a guest sees this bit set, it should not enumerate the ITS bug. Note, this bit
+is not set by any hardware, but is **intended for VMMs to synthesize** it for
+guests as per the host's affected status.
+
+Mitigation options
+^^^^^^^^^^^^^^^^^^
+The ITS mitigation can be controlled using the "indirect_target_selection"
+kernel parameter. The available options are:
+
+   ======== ===================================================================
+   on       (default) Deploy the "Aligned branch/return thunks" mitigation.
+            If spectre_v2 mitigation enables retpoline, aligned-thunks are only
+            deployed for the affected RET instructions. Retpoline mitigates
+            indirect branches.
+
+   off      Disable ITS mitigation.
+
+   vmexit   Equivalent to "=on" if the CPU is affected by guest/host isolation
+            part of ITS. Otherwise, mitigation is not deployed. This option is
+            useful when host userspace is not in the threat model, and only
+            attacks from guest to host are considered.
+
+   stuff    Deploy RSB-fill mitigation when retpoline is also deployed.
+            Otherwise, deploy the default mitigation. When retpoline mitigation
+            is enabled, RSB-stuffing via Call-Depth-Tracking also mitigates
+            ITS.
+
+   force    Force the ITS bug and deploy the default mitigation.
+   ======== ===================================================================
+
+Sysfs reporting
+---------------
+
+The sysfs file showing ITS mitigation status is:
+
+  /sys/devices/system/cpu/vulnerabilities/indirect_target_selection
+
+Note, microcode mitigation status is not reported in this file.
+
+The possible values in this file are:
+
+.. list-table::
+
+   * - Not affected
+     - The processor is not vulnerable.
+   * - Vulnerable
+     - System is vulnerable and no mitigation has been applied.
+   * - Vulnerable, KVM: Not affected
+     - System is vulnerable to intra-mode BTI, but not affected by eIBRS
+       guest/host isolation.
+   * - Mitigation: Aligned branch/return thunks
+     - The mitigation is enabled, affected indirect branches and RETs are
+       relocated to safe thunks.
+   * - Mitigation: Retpolines, Stuffing RSB
+     - The mitigation is enabled using retpoline and RSB stuffing.
+
+References
+----------
+.. [#f1] Microcode repository - https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files
+
+.. [#f2] Affected Processors list - https://www.intel.com/content/www/us/en/developer/topic-technology/software-...
+
+.. [#f3] Affected Processors list (machine readable) - https://github.com/intel/Intel-affected-processor-list
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 159013a7ca18c271ff64192deb62a689b622d860 upstream.
The ITS bug in some pre-Alder Lake Intel CPUs may allow indirect branches in the first half of a cache line to get predicted to the target of a branch located in the second half of the cache line.
Set X86_BUG_ITS on affected CPUs. Mitigation to follow in later commits.
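The stepping-dependent entries this patch adds to cpu_vuln_blacklist[] rely on the first matching table entry winning. Below is a minimal user-space sketch of that idea. The struct, the table and lookup_bugs() are hypothetical simplifications rather than the kernel's x86_cpu_id matching. Only the model numbers and stepping limits come from the patch and its documentation, and STEP_MAX = 0xf is an assumption about X86_STEP_MAX.

```c
#include <assert.h>
#include <stddef.h>

#define STEP_MAX 0xf		/* assumed to mirror X86_STEP_MAX */
#define BUG_ITS  (1u << 8)	/* mirrors "#define ITS BIT(8)" in the patch */

/* Hypothetical, simplified table entry: covers steppings 0..max_step. */
struct vuln_entry {
	unsigned int model;
	unsigned int max_step;
	unsigned int bugs;
};

/* Only the ITS bit is shown; the real entries carry other bug bits too. */
static const struct vuln_entry table[] = {
	{ 0x55, 0x5,      0 },		/* SKYLAKE_X, steppings 0..5 */
	{ 0x55, STEP_MAX, BUG_ITS },	/* SKYLAKE_X, steppings 6 and up */
	{ 0x8e, 0xb,      0 },		/* KABYLAKE_L, steppings 0..11 */
	{ 0x8e, STEP_MAX, BUG_ITS },	/* KABYLAKE_L, steppings 12 and up */
};

/* Return the bug mask of the first entry matching model and stepping. */
static unsigned int lookup_bugs(unsigned int model, unsigned int step)
{
	size_t i;

	assert(step <= STEP_MAX);
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (table[i].model == model && step <= table[i].max_step)
			return table[i].bugs;
	}

	return 0;
}
```

Because the narrower stepping range is listed first, early steppings never reach the entry that carries the ITS bit. The ordering of the two entries per model is therefore significant.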
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/include/asm/cpufeatures.h |    1
 arch/x86/include/asm/msr-index.h   |    8 +++++
 arch/x86/kernel/cpu/common.c       |   58 +++++++++++++++++++++++++++++--------
 arch/x86/kvm/x86.c                 |    4 +-
 4 files changed, 58 insertions(+), 13 deletions(-)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -534,4 +534,5 @@
 #define X86_BUG_RFDS			X86_BUG(1*32 + 2) /* "rfds" CPU is vulnerable to Register File Data Sampling */
 #define X86_BUG_BHI			X86_BUG(1*32 + 3) /* "bhi" CPU is affected by Branch History Injection */
 #define X86_BUG_IBPB_NO_RET		X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */
+#define X86_BUG_ITS			X86_BUG(1*32 + 5) /* "its" CPU is affected by Indirect Target Selection */
 #endif /* _ASM_X86_CPUFEATURES_H */
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -209,6 +209,14 @@
 						 * VERW clears CPU Register
 						 * File.
 						 */
+#define ARCH_CAP_ITS_NO			BIT_ULL(62) /*
+						      * Not susceptible to
+						      * Indirect Target Selection.
+						      * This bit is not set by
+						      * HW, but is synthesized by
+						      * VMMs for guests to know
+						      * their affected status.
+						      */

 #define MSR_IA32_FLUSH_CMD		0x0000010b
 #define L1D_FLUSH			BIT(0)	/*
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1226,6 +1226,8 @@ static const __initconst struct x86_cpu_
 #define GDS		BIT(6)
 /* CPU is affected by Register File Data Sampling */
 #define RFDS		BIT(7)
+/* CPU is affected by Indirect Target Selection */
+#define ITS		BIT(8)

 static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
 	VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE, X86_STEP_MAX, SRBDS),
@@ -1237,22 +1239,25 @@ static const struct x86_cpu_id cpu_vuln_
 	VULNBL_INTEL_STEPS(INTEL_BROADWELL_G, X86_STEP_MAX, SRBDS),
 	VULNBL_INTEL_STEPS(INTEL_BROADWELL_X, X86_STEP_MAX, MMIO),
 	VULNBL_INTEL_STEPS(INTEL_BROADWELL, X86_STEP_MAX, SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, X86_STEP_MAX, MMIO | RETBLEED | GDS),
+	VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, 0x5, MMIO | RETBLEED | GDS),
+	VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS),
 	VULNBL_INTEL_STEPS(INTEL_SKYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS),
 	VULNBL_INTEL_STEPS(INTEL_SKYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_KABYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS),
+	VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, 0xb, MMIO | RETBLEED | GDS | SRBDS),
+	VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS),
+	VULNBL_INTEL_STEPS(INTEL_KABYLAKE, 0xc, MMIO | RETBLEED | GDS | SRBDS),
+	VULNBL_INTEL_STEPS(INTEL_KABYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS),
 	VULNBL_INTEL_STEPS(INTEL_CANNONLAKE_L, X86_STEP_MAX, RETBLEED),
-	VULNBL_INTEL_STEPS(INTEL_ICELAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS),
-	VULNBL_INTEL_STEPS(INTEL_ICELAKE_D, X86_STEP_MAX, MMIO | GDS),
-	VULNBL_INTEL_STEPS(INTEL_ICELAKE_X, X86_STEP_MAX, MMIO | GDS),
-	VULNBL_INTEL_STEPS(INTEL_COMETLAKE, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS),
-	VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, 0x0, MMIO | RETBLEED),
-	VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS),
-	VULNBL_INTEL_STEPS(INTEL_TIGERLAKE_L, X86_STEP_MAX, GDS),
-	VULNBL_INTEL_STEPS(INTEL_TIGERLAKE, X86_STEP_MAX, GDS),
+	VULNBL_INTEL_STEPS(INTEL_ICELAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS),
+	VULNBL_INTEL_STEPS(INTEL_ICELAKE_D, X86_STEP_MAX, MMIO | GDS | ITS),
+	VULNBL_INTEL_STEPS(INTEL_ICELAKE_X, X86_STEP_MAX, MMIO | GDS | ITS),
+	VULNBL_INTEL_STEPS(INTEL_COMETLAKE, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS),
+	VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, 0x0, MMIO | RETBLEED | ITS),
+	VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS),
+	VULNBL_INTEL_STEPS(INTEL_TIGERLAKE_L, X86_STEP_MAX, GDS | ITS),
+	VULNBL_INTEL_STEPS(INTEL_TIGERLAKE, X86_STEP_MAX, GDS | ITS),
 	VULNBL_INTEL_STEPS(INTEL_LAKEFIELD, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED),
-	VULNBL_INTEL_STEPS(INTEL_ROCKETLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS),
+	VULNBL_INTEL_STEPS(INTEL_ROCKETLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS),
 	VULNBL_INTEL_STEPS(INTEL_ALDERLAKE, X86_STEP_MAX, RFDS),
 	VULNBL_INTEL_STEPS(INTEL_ALDERLAKE_L, X86_STEP_MAX, RFDS),
 	VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE, X86_STEP_MAX, RFDS),
@@ -1317,6 +1322,32 @@ static bool __init vulnerable_to_rfds(u6
 	return cpu_matches(cpu_vuln_blacklist, RFDS);
 }

+static bool __init vulnerable_to_its(u64 x86_arch_cap_msr)
+{
+	/* The "immunity" bit trumps everything else: */
+	if (x86_arch_cap_msr & ARCH_CAP_ITS_NO)
+		return false;
+	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+		return false;
+
+	/* None of the affected CPUs have BHI_CTRL */
+	if (boot_cpu_has(X86_FEATURE_BHI_CTRL))
+		return false;
+
+	/*
+	 * If a VMM did not expose ITS_NO, assume that a guest could
+	 * be running on a vulnerable hardware or may migrate to such
+	 * hardware.
+	 */
+	if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
+		return true;
+
+	if (cpu_matches(cpu_vuln_blacklist, ITS))
+		return true;
+
+	return false;
+}
+
 static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 {
 	u64 x86_arch_cap_msr = x86_read_arch_cap_msr();
@@ -1449,6 +1480,9 @@ static void __init cpu_set_bug_bits(stru
 	if (cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET))
 		setup_force_cpu_bug(X86_BUG_IBPB_NO_RET);

+	if (vulnerable_to_its(x86_arch_cap_msr))
+		setup_force_cpu_bug(X86_BUG_ITS);
+
 	if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
 		return;

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1589,7 +1589,7 @@ EXPORT_SYMBOL_GPL(kvm_emulate_rdpmc);
 	 ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
 	 ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
 	 ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO | \
-	 ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_BHI_NO)
+	 ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_BHI_NO | ARCH_CAP_ITS_NO)

 static u64 kvm_get_arch_capabilities(void)
 {
@@ -1623,6 +1623,8 @@ static u64 kvm_get_arch_capabilities(voi
 		data |= ARCH_CAP_MDS_NO;
 	if (!boot_cpu_has_bug(X86_BUG_RFDS))
 		data |= ARCH_CAP_RFDS_NO;
+	if (!boot_cpu_has_bug(X86_BUG_ITS))
+		data |= ARCH_CAP_ITS_NO;

 	if (!boot_cpu_has(X86_FEATURE_RTM)) {
 		/*
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 8754e67ad4ac692c67ff1f99c0d07156f04ae40c upstream.
Due to ITS, indirect branches in the lower half of a cacheline may be vulnerable to branch target injection attack.
Introduce ITS-safe thunks to patch indirect branches in the lower half of cacheline with the thunk. Also thunk any eBPF generated indirect branches in emit_indirect_jump().
The following categories of indirect branches are not mitigated:

- Indirect branches in the .init section are not mitigated because they are discarded after boot.
- Indirect branches that are explicitly marked retpoline-safe.
Note that retpoline also mitigates the indirect branches against ITS. This is because the retpoline sequence fills an RSB entry before RET, and it does not suffer from RSB-underflow part of the ITS.
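The address test that decides whether a site needs the ITS-safe thunk can be sketched in user space as follows. The function name is hypothetical and the feature-flag check is left out. The arithmetic follows cpu_wants_indirect_its_thunk_at() from the diff below, assuming 64-byte cachelines where bit 5 of an address selects the upper half.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: addr is where the indirect branch would be
 * emitted, reg is the x86-64 register number (0..15) holding the target.
 */
static bool indirect_branch_is_its_unsafe(uintptr_t addr, int reg)
{
	assert(reg >= 0 && reg < 16);

	/*
	 * "jmp *%reg" is 2 bytes for rax..rdi and 3 bytes for r8..r15
	 * (REX prefix), so this moves addr to the last instruction byte.
	 */
	addr += 1 + reg / 8;

	/* Bit 5 clear means offset 0..31 in the line: the lower half. */
	return !(addr & 0x20);
}
```

For example, a 2-byte branch starting at offset 0x1f of a line ends at offset 0x20 and counts as safe, while a 3-byte branch starting at offset 0x3e spills into the lower half of the next line and would be redirected to the thunk.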
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/Kconfig                     |   11 ++++++++
 arch/x86/include/asm/cpufeatures.h   |    1
 arch/x86/include/asm/nospec-branch.h |    4 +++
 arch/x86/kernel/alternative.c        |   45 ++++++++++++++++++++++++++++++++---
 arch/x86/kernel/vmlinux.lds.S        |    6 ++++
 arch/x86/lib/retpoline.S             |   28 +++++++++++++++++++++
 arch/x86/net/bpf_jit_comp.c          |    5 +++
 7 files changed, 96 insertions(+), 4 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2764,6 +2764,17 @@ config MITIGATION_SSB
 	  of speculative execution in a similar way to the Meltdown and Spectre
 	  security vulnerabilities.

+config MITIGATION_ITS
+	bool "Enable Indirect Target Selection mitigation"
+	depends on CPU_SUP_INTEL && X86_64
+	depends on MITIGATION_RETPOLINE && MITIGATION_RETHUNK
+	default y
+	help
+	  Enable Indirect Target Selection (ITS) mitigation. ITS is a bug in
+	  BPU on some Intel CPUs that may allow Spectre V2 style attacks. If
+	  disabled, mitigation cannot be enabled via cmdline.
+	  See file:Documentation/admin-guide/hw-vuln/indirect-target-selection.rst
+
 endif

 config ARCH_HAS_ADD_PAGES
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -483,6 +483,7 @@
 #define X86_FEATURE_AMD_FAST_CPPC	(21*32 + 5) /* Fast CPPC */
 #define X86_FEATURE_AMD_HETEROGENEOUS_CORES (21*32 + 6) /* Heterogeneous Core Topology */
 #define X86_FEATURE_AMD_WORKLOAD_CLASS	(21*32 + 7) /* Workload Classification */
+#define X86_FEATURE_INDIRECT_THUNK_ITS	(21*32 + 8) /* Use thunk for indirect branches in lower half of cacheline */

 /*
  * BUG word(s)
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -337,10 +337,14 @@

 #else /* __ASSEMBLY__ */

+#define ITS_THUNK_SIZE	64
+
 typedef u8 retpoline_thunk_t[RETPOLINE_THUNK_SIZE];
+typedef u8 its_thunk_t[ITS_THUNK_SIZE];
 extern retpoline_thunk_t __x86_indirect_thunk_array[];
 extern retpoline_thunk_t __x86_indirect_call_thunk_array[];
 extern retpoline_thunk_t __x86_indirect_jump_thunk_array[];
+extern its_thunk_t	 __x86_indirect_its_thunk_array[];

 #ifdef CONFIG_MITIGATION_RETHUNK
 extern void __x86_return_thunk(void);
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -590,7 +590,8 @@ static int emit_indirect(int op, int reg
 	return i;
 }

-static int emit_call_track_retpoline(void *addr, struct insn *insn, int reg, u8 *bytes)
+static int __emit_trampoline(void *addr, struct insn *insn, u8 *bytes,
+			     void *call_dest, void *jmp_dest)
 {
 	u8 op = insn->opcode.bytes[0];
 	int i = 0;
@@ -611,7 +612,7 @@ static int emit_call_track_retpoline(voi
 	switch (op) {
 	case CALL_INSN_OPCODE:
 		__text_gen_insn(bytes+i, op, addr+i,
-				__x86_indirect_call_thunk_array[reg],
+				call_dest,
 				CALL_INSN_SIZE);
 		i += CALL_INSN_SIZE;
 		break;
@@ -619,7 +620,7 @@ static int emit_call_track_retpoline(voi
 	case JMP32_INSN_OPCODE:
 clang_jcc:
 		__text_gen_insn(bytes+i, op, addr+i,
-				__x86_indirect_jump_thunk_array[reg],
+				jmp_dest,
 				JMP32_INSN_SIZE);
 		i += JMP32_INSN_SIZE;
 		break;
@@ -634,6 +635,35 @@ clang_jcc:
 	return i;
 }

+static int emit_call_track_retpoline(void *addr, struct insn *insn, int reg, u8 *bytes)
+{
+	return __emit_trampoline(addr, insn, bytes,
+				 __x86_indirect_call_thunk_array[reg],
+				 __x86_indirect_jump_thunk_array[reg]);
+}
+
+#ifdef CONFIG_MITIGATION_ITS
+static int emit_its_trampoline(void *addr, struct insn *insn, int reg, u8 *bytes)
+{
+	return __emit_trampoline(addr, insn, bytes,
+				 __x86_indirect_its_thunk_array[reg],
+				 __x86_indirect_its_thunk_array[reg]);
+}
+
+/* Check if an indirect branch is at ITS-unsafe address */
+static bool cpu_wants_indirect_its_thunk_at(unsigned long addr, int reg)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS))
+		return false;
+
+	/* Indirect branch opcode is 2 or 3 bytes depending on reg */
+	addr += 1 + reg / 8;
+
+	/* Lower-half of the cacheline? */
+	return !(addr & 0x20);
+}
+#endif
+
 /*
  * Rewrite the compiler generated retpoline thunk calls.
  *
@@ -708,6 +738,15 @@ static int patch_retpoline(void *addr, s
 		bytes[i++] = 0xe8; /* LFENCE */
 	}

+#ifdef CONFIG_MITIGATION_ITS
+	/*
+	 * Check if the address of last byte of emitted-indirect is in
+	 * lower-half of the cacheline. Such branches need ITS mitigation.
+	 */
+	if (cpu_wants_indirect_its_thunk_at((unsigned long)addr + i, reg))
+		return emit_its_trampoline(addr, insn, reg, bytes);
+#endif
+
 	ret = emit_indirect(op, reg, bytes + i);
 	if (ret < 0)
 		return ret;
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -528,6 +528,12 @@ INIT_PER_CPU(irq_stack_backing_store);
 		"SRSO function pair won't alias");
 #endif

+#if defined(CONFIG_MITIGATION_ITS) && !defined(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B)
+. = ASSERT(__x86_indirect_its_thunk_rax & 0x20, "__x86_indirect_thunk_rax not in second half of cacheline");
+. = ASSERT(((__x86_indirect_its_thunk_rcx - __x86_indirect_its_thunk_rax) % 64) == 0, "Indirect thunks are not cacheline apart");
+. = ASSERT(__x86_indirect_its_thunk_array == __x86_indirect_its_thunk_rax, "Gap in ITS thunk array");
+#endif
+
 #endif /* CONFIG_X86_64 */

 /*
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -366,6 +366,34 @@ SYM_FUNC_END(call_depth_return_thunk)

 #endif /* CONFIG_MITIGATION_CALL_DEPTH_TRACKING */

+#ifdef CONFIG_MITIGATION_ITS
+
+.macro ITS_THUNK reg
+
+SYM_INNER_LABEL(__x86_indirect_its_thunk_\reg, SYM_L_GLOBAL)
+	UNWIND_HINT_UNDEFINED
+	ANNOTATE_NOENDBR
+	ANNOTATE_RETPOLINE_SAFE
+	jmp *%\reg
+	int3
+	.align 32, 0xcc		/* fill to the end of the line */
+	.skip  32, 0xcc		/* skip to the next upper half */
+.endm
+
+/* ITS mitigation requires thunks be aligned to upper half of cacheline */
+.align 64, 0xcc
+.skip 32, 0xcc
+SYM_CODE_START(__x86_indirect_its_thunk_array)
+
+#define GEN(reg) ITS_THUNK reg
+#include <asm/GEN-for-each-reg.h>
+#undef GEN
+
+	.align 64, 0xcc
+SYM_CODE_END(__x86_indirect_its_thunk_array)
+
+#endif
+
 /*
  * This function name is magical and is used by -mfunction-return=thunk-extern
  * for the compiler to generate JMPs to it.
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -655,7 +655,10 @@ static void emit_indirect_jump(u8 **ppro
 {
 	u8 *prog = *pprog;

-	if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) {
+	if (cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) {
+		OPTIMIZER_HIDE_VAR(reg);
+		emit_jump(&prog, &__x86_indirect_its_thunk_array[reg], ip);
+	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) {
 		EMIT_LFENCE();
 		EMIT2(0xFF, 0xE0 + reg);
 	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit a75bf27fe41abe658c53276a0c486c4bf9adecfc upstream.
RETs in the lower half of a cacheline may be affected by the ITS bug, specifically when the RSB underflows. Use the ITS-safe return thunk for such RETs.
RETs that are not patched:
- RET in retpoline sequence does not need to be patched, because the sequence itself fills an RSB before RET.
- RET in Call Depth Tracking (CDT) thunks __x86_indirect_{call|jump}_thunk and call_depth_return_thunk are not patched because CDT by design prevents RSB-underflow.
- RETs in .init section are not reachable after init.
- RETs that are explicitly marked safe with ANNOTATE_UNRET_SAFE.
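The per-site decision for RETs can be illustrated with the sketch below. The enum and function name are hypothetical. The kernel instead tests X86_FEATURE_RETHUNK and compares x86_return_thunk against its_return_thunk, as shown in the diff for cpu_wants_rethunk_at().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of which return thunk, if any, is active. */
enum ret_thunk {
	THUNK_NONE,	/* X86_FEATURE_RETHUNK not enabled */
	THUNK_ITS,	/* x86_return_thunk == its_return_thunk */
	THUNK_OTHER,	/* any other return thunk is installed */
};

/* Should the RET at addr be replaced by a jump to the return thunk? */
static bool ret_wants_thunk_at(enum ret_thunk active, uintptr_t addr)
{
	assert(active == THUNK_NONE || active == THUNK_ITS ||
	       active == THUNK_OTHER);

	if (active == THUNK_NONE)
		return false;

	/* Other thunks apply to every RET, wherever it is located. */
	if (active != THUNK_ITS)
		return true;

	/* For ITS only RETs in the lower half of a cacheline matter. */
	return !(addr & 0x20);
}
```

The address-independent cpu_wants_rethunk() variant is still used where no single patch address applies, for instance when forcing static call re-initialisation in apply_returns().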
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/include/asm/alternative.h   |   14 ++++++++++++++
 arch/x86/include/asm/nospec-branch.h |    6 ++++++
 arch/x86/kernel/alternative.c        |   19 +++++++++++++++++--
 arch/x86/kernel/ftrace.c             |    2 +-
 arch/x86/kernel/static_call.c        |    4 ++--
 arch/x86/kernel/vmlinux.lds.S        |    4 ++++
 arch/x86/lib/retpoline.S             |   13 ++++++++++++-
 arch/x86/net/bpf_jit_comp.c          |    2 +-
 8 files changed, 57 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -125,6 +125,20 @@ static __always_inline int x86_call_dept
 }
 #endif

+#if defined(CONFIG_MITIGATION_RETHUNK) && defined(CONFIG_OBJTOOL)
+extern bool cpu_wants_rethunk(void);
+extern bool cpu_wants_rethunk_at(void *addr);
+#else
+static __always_inline bool cpu_wants_rethunk(void)
+{
+	return false;
+}
+static __always_inline bool cpu_wants_rethunk_at(void *addr)
+{
+	return false;
+}
+#endif
+
 #ifdef CONFIG_SMP
 extern void alternatives_smp_module_add(struct module *mod, char *name,
 					void *locks, void *locks_end,
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -368,6 +368,12 @@ static inline void srso_return_thunk(voi
 static inline void srso_alias_return_thunk(void) {}
 #endif

+#ifdef CONFIG_MITIGATION_ITS
+extern void its_return_thunk(void);
+#else
+static inline void its_return_thunk(void) {}
+#endif
+
 extern void retbleed_return_thunk(void);
 extern void srso_return_thunk(void);
 extern void srso_alias_return_thunk(void);
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -820,6 +820,21 @@ void __init_or_module noinline apply_ret

 #ifdef CONFIG_MITIGATION_RETHUNK

+bool cpu_wants_rethunk(void)
+{
+	return cpu_feature_enabled(X86_FEATURE_RETHUNK);
+}
+
+bool cpu_wants_rethunk_at(void *addr)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_RETHUNK))
+		return false;
+	if (x86_return_thunk != its_return_thunk)
+		return true;
+
+	return !((unsigned long)addr & 0x20);
+}
+
 /*
  * Rewrite the compiler generated return thunk tail-calls.
  *
@@ -836,7 +851,7 @@ static int patch_return(void *addr, stru
 	int i = 0;

 	/* Patch the custom return thunks... */
-	if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) {
+	if (cpu_wants_rethunk_at(addr)) {
 		i = JMP32_INSN_SIZE;
 		__text_gen_insn(bytes, JMP32_INSN_OPCODE, addr, x86_return_thunk, i);
 	} else {
@@ -854,7 +869,7 @@ void __init_or_module noinline apply_ret
 {
 	s32 *s;

-	if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
+	if (cpu_wants_rethunk())
 		static_call_force_reinit();

 	for (s = start; s < end; s++) {
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -357,7 +357,7 @@ create_trampoline(struct ftrace_ops *ops
 		goto fail;

 	ip = trampoline + size;
-	if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
+	if (cpu_wants_rethunk_at(ip))
 		__text_gen_insn(ip, JMP32_INSN_OPCODE, ip, x86_return_thunk, JMP32_INSN_SIZE);
 	else
 		text_poke_copy(ip, retq, sizeof(retq));
--- a/arch/x86/kernel/static_call.c
+++ b/arch/x86/kernel/static_call.c
@@ -81,7 +81,7 @@ static void __ref __static_call_transfor
 		break;

 	case RET:
-		if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
+		if (cpu_wants_rethunk_at(insn))
 			code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk);
 		else
 			code = &retinsn;
@@ -90,7 +90,7 @@ static void __ref __static_call_transfor
 	case JCC:
 		if (!func) {
 			func = __static_call_return;
-			if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
+			if (cpu_wants_rethunk())
 				func = x86_return_thunk;
 		}

--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -534,6 +534,10 @@ INIT_PER_CPU(irq_stack_backing_store);
 . = ASSERT(__x86_indirect_its_thunk_array == __x86_indirect_its_thunk_rax, "Gap in ITS thunk array");
 #endif

+#if defined(CONFIG_MITIGATION_ITS) && !defined(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B)
+. = ASSERT(its_return_thunk & 0x20, "its_return_thunk not in second half of cacheline");
+#endif
+
 #endif /* CONFIG_X86_64 */

 /*
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -392,7 +392,18 @@ SYM_CODE_START(__x86_indirect_its_thunk_
 	.align 64, 0xcc
 SYM_CODE_END(__x86_indirect_its_thunk_array)

-#endif
+.align 64, 0xcc
+.skip 32, 0xcc
+SYM_CODE_START(its_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
+	ANNOTATE_UNRET_SAFE
+	ret
+	int3
+SYM_CODE_END(its_return_thunk)
+EXPORT_SYMBOL(its_return_thunk)
+
+#endif /* CONFIG_MITIGATION_ITS */

 /*
  * This function name is magical and is used by -mfunction-return=thunk-extern
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -680,7 +680,7 @@ static void emit_return(u8 **pprog, u8 *
 {
 	u8 *prog = *pprog;

-	if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) {
+	if (cpu_wants_rethunk()) {
 		emit_jump(&prog, x86_return_thunk, ip);
 	} else {
 		EMIT1(0xC3); /* ret */
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit f4818881c47fd91fcb6d62373c57c7844e3de1c0 upstream.
Indirect Target Selection (ITS) is a bug in some pre-ADL Intel CPUs with eIBRS. It affects the prediction of indirect branches and RETs in the lower half of a cacheline. Due to ITS, such branches may get wrongly predicted to the target of a (direct or indirect) branch that is located in the upper half of the cacheline.

Scope of impact
===============

Guest/host isolation
--------------------
When eIBRS is used for guest/host isolation, the indirect branches in the VMM may still be predicted with targets corresponding to branches in the guest.

Intra-mode
----------
cBPF or other native gadgets can be used for intra-mode training and disclosure using ITS.

User/kernel isolation
---------------------
When eIBRS is enabled user/kernel isolation is not impacted.

Indirect Branch Prediction Barrier (IBPB)
-----------------------------------------
After an IBPB, indirect branches may be predicted with targets corresponding to direct branches which were executed prior to IBPB. This is mitigated by a microcode update.
Add the cmdline parameter indirect_target_selection=off|on|force to control the mitigation, which relocates the affected branches to an ITS-safe thunk, i.e. one located in the upper half of a cacheline. Also add the sysfs reporting.
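As an illustration of the three option values named above, a user-space parser could look like the sketch below. The enum and function names are hypothetical and do not reproduce the handler in bugs.c, which is not shown in this part of the series. The accepted strings are the only thing taken from this message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for the value given after the '='. */
enum its_opt {
	ITS_OPT_OFF,
	ITS_OPT_ON,
	ITS_OPT_FORCE,
	ITS_OPT_UNKNOWN,
};

/* Map the value of indirect_target_selection= to an option. */
static enum its_opt parse_its_opt(const char *arg)
{
	assert(arg != NULL);

	if (!strcmp(arg, "off"))
		return ITS_OPT_OFF;
	if (!strcmp(arg, "on"))
		return ITS_OPT_ON;
	if (!strcmp(arg, "force"))
		return ITS_OPT_FORCE;

	return ITS_OPT_UNKNOWN;
}
```

How an unrecognised value is handled, and whether "force" also sets the bug bit on unaffected CPUs, is decided by the real handler and is not modelled here.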
When retpoline mitigation is deployed, ITS safe-thunks are not needed, because retpoline sequence is already ITS-safe. Similarly, when call depth tracking (CDT) mitigation is deployed (retbleed=stuff), ITS safe return thunk is not used, as CDT prevents RSB-underflow.
To not overcomplicate things, ITS mitigation is not supported with the spectre-v2 lfence;jmp mitigation. Moreover, it is less practical to deploy the lfence;jmp mitigation on ITS-affected parts anyway.
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |    1
 Documentation/admin-guide/kernel-parameters.txt    |   13 +
 arch/x86/kernel/cpu/bugs.c                         |  140 ++++++++++++++++++++-
 drivers/base/cpu.c                                 |    3
 include/linux/cpu.h                                |    2
 5 files changed, 155 insertions(+), 4 deletions(-)
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -511,6 +511,7 @@ Description: information about CPUs hete
What: /sys/devices/system/cpu/vulnerabilities /sys/devices/system/cpu/vulnerabilities/gather_data_sampling + /sys/devices/system/cpu/vulnerabilities/indirect_target_selection /sys/devices/system/cpu/vulnerabilities/itlb_multihit /sys/devices/system/cpu/vulnerabilities/l1tf /sys/devices/system/cpu/vulnerabilities/mds --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2178,6 +2178,18 @@ different crypto accelerators. This option can be used to achieve best performance for particular HW.
+ indirect_target_selection= [X86,Intel] Mitigation control for Indirect + Target Selection(ITS) bug in Intel CPUs. Updated + microcode is also required for a fix in IBPB. + + on: Enable mitigation (default). + off: Disable mitigation. + force: Force the ITS bug and deploy default + mitigation. + + For details see: + Documentation/admin-guide/hw-vuln/indirect-target-selection.rst + init= [KNL] Format: <full_path> Run specified binary instead of /sbin/init as init @@ -3666,6 +3678,7 @@ expose users to several CPU vulnerabilities. Equivalent to: if nokaslr then kpti=0 [ARM64] gather_data_sampling=off [X86] + indirect_target_selection=off [X86] kvm.nx_huge_pages=off [X86] l1tf=off [X86] mds=off [X86] --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -49,6 +49,7 @@ static void __init srbds_select_mitigati static void __init l1d_flush_select_mitigation(void); static void __init srso_select_mitigation(void); static void __init gds_select_mitigation(void); +static void __init its_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR without task-specific bits set */ u64 x86_spec_ctrl_base; @@ -67,6 +68,14 @@ static DEFINE_MUTEX(spec_ctrl_mutex);
void (*x86_return_thunk)(void) __ro_after_init = __x86_return_thunk;
+static void __init set_return_thunk(void *thunk)
+{
+	if (x86_return_thunk != __x86_return_thunk)
+		pr_warn("x86/bugs: return thunk changed\n");
+
+	x86_return_thunk = thunk;
+}
+
 /* Update SPEC_CTRL MSR and its cached copy unconditionally */
 static void update_spec_ctrl(u64 val)
 {
@@ -175,6 +184,7 @@ void __init cpu_select_mitigations(void)
 	 */
 	srso_select_mitigation();
 	gds_select_mitigation();
+	its_select_mitigation();
 }
/* @@ -1104,7 +1114,7 @@ do_cmd_auto: setup_force_cpu_cap(X86_FEATURE_RETHUNK); setup_force_cpu_cap(X86_FEATURE_UNRET);
-		x86_return_thunk = retbleed_return_thunk;
+		set_return_thunk(retbleed_return_thunk);
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD && boot_cpu_data.x86_vendor != X86_VENDOR_HYGON) @@ -1139,7 +1149,7 @@ do_cmd_auto: setup_force_cpu_cap(X86_FEATURE_RETHUNK); setup_force_cpu_cap(X86_FEATURE_CALL_DEPTH);
-		x86_return_thunk = call_depth_return_thunk;
+		set_return_thunk(call_depth_return_thunk);
 		break;
default: @@ -1174,6 +1184,115 @@ do_cmd_auto: }
#undef pr_fmt +#define pr_fmt(fmt) "ITS: " fmt + +enum its_mitigation_cmd { + ITS_CMD_OFF, + ITS_CMD_ON, +}; + +enum its_mitigation { + ITS_MITIGATION_OFF, + ITS_MITIGATION_ALIGNED_THUNKS, + ITS_MITIGATION_RETPOLINE_STUFF, +}; + +static const char * const its_strings[] = { + [ITS_MITIGATION_OFF] = "Vulnerable", + [ITS_MITIGATION_ALIGNED_THUNKS] = "Mitigation: Aligned branch/return thunks", + [ITS_MITIGATION_RETPOLINE_STUFF] = "Mitigation: Retpolines, Stuffing RSB", +}; + +static enum its_mitigation its_mitigation __ro_after_init = ITS_MITIGATION_ALIGNED_THUNKS; + +static enum its_mitigation_cmd its_cmd __ro_after_init = + IS_ENABLED(CONFIG_MITIGATION_ITS) ? ITS_CMD_ON : ITS_CMD_OFF; + +static int __init its_parse_cmdline(char *str) +{ + if (!str) + return -EINVAL; + + if (!IS_ENABLED(CONFIG_MITIGATION_ITS)) { + pr_err("Mitigation disabled at compile time, ignoring option (%s)", str); + return 0; + } + + if (!strcmp(str, "off")) { + its_cmd = ITS_CMD_OFF; + } else if (!strcmp(str, "on")) { + its_cmd = ITS_CMD_ON; + } else if (!strcmp(str, "force")) { + its_cmd = ITS_CMD_ON; + setup_force_cpu_bug(X86_BUG_ITS); + } else { + pr_err("Ignoring unknown indirect_target_selection option (%s).", str); + } + + return 0; +} +early_param("indirect_target_selection", its_parse_cmdline); + +static void __init its_select_mitigation(void) +{ + enum its_mitigation_cmd cmd = its_cmd; + + if (!boot_cpu_has_bug(X86_BUG_ITS) || cpu_mitigations_off()) { + its_mitigation = ITS_MITIGATION_OFF; + return; + } + + /* Retpoline+CDT mitigates ITS, bail out */ + if (boot_cpu_has(X86_FEATURE_RETPOLINE) && + boot_cpu_has(X86_FEATURE_CALL_DEPTH)) { + its_mitigation = ITS_MITIGATION_RETPOLINE_STUFF; + goto out; + } + + /* Exit early to avoid irrelevant warnings */ + if (cmd == ITS_CMD_OFF) { + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (spectre_v2_enabled == SPECTRE_V2_NONE) { + pr_err("WARNING: Spectre-v2 mitigation is off, disabling ITS\n"); + its_mitigation = ITS_MITIGATION_OFF; + 
goto out; + } + if (!IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) || + !IS_ENABLED(CONFIG_MITIGATION_RETHUNK)) { + pr_err("WARNING: ITS mitigation depends on retpoline and rethunk support\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (IS_ENABLED(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B)) { + pr_err("WARNING: ITS mitigation is not compatible with CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (boot_cpu_has(X86_FEATURE_RETPOLINE_LFENCE)) { + pr_err("WARNING: ITS mitigation is not compatible with lfence mitigation\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + + switch (cmd) { + case ITS_CMD_OFF: + its_mitigation = ITS_MITIGATION_OFF; + break; + case ITS_CMD_ON: + its_mitigation = ITS_MITIGATION_ALIGNED_THUNKS; + if (!boot_cpu_has(X86_FEATURE_RETPOLINE)) + setup_force_cpu_cap(X86_FEATURE_INDIRECT_THUNK_ITS); + setup_force_cpu_cap(X86_FEATURE_RETHUNK); + set_return_thunk(its_return_thunk); + break; + } +out: + pr_info("%s\n", its_strings[its_mitigation]); +} + +#undef pr_fmt #define pr_fmt(fmt) "Spectre V2 : " fmt
static enum spectre_v2_user_mitigation spectre_v2_user_stibp __ro_after_init = @@ -2627,10 +2746,10 @@ static void __init srso_select_mitigatio
if (boot_cpu_data.x86 == 0x19) { setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS); - x86_return_thunk = srso_alias_return_thunk; + set_return_thunk(srso_alias_return_thunk); } else { setup_force_cpu_cap(X86_FEATURE_SRSO); - x86_return_thunk = srso_return_thunk; + set_return_thunk(srso_return_thunk); } if (has_microcode) srso_mitigation = SRSO_MITIGATION_SAFE_RET; @@ -2806,6 +2925,11 @@ static ssize_t rfds_show_state(char *buf return sysfs_emit(buf, "%s\n", rfds_strings[rfds_mitigation]); }
+static ssize_t its_show_state(char *buf) +{ + return sysfs_emit(buf, "%s\n", its_strings[its_mitigation]); +} + static char *stibp_state(void) { if (spectre_v2_in_eibrs_mode(spectre_v2_enabled) && @@ -2988,6 +3112,9 @@ static ssize_t cpu_show_common(struct de case X86_BUG_RFDS: return rfds_show_state(buf);
+ case X86_BUG_ITS: + return its_show_state(buf); + default: break; } @@ -3067,6 +3194,11 @@ ssize_t cpu_show_reg_file_data_sampling( { return cpu_show_common(dev, attr, buf, X86_BUG_RFDS); } + +ssize_t cpu_show_indirect_target_selection(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_ITS); +} #endif
void __warn_thunk(void) --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -600,6 +600,7 @@ CPU_SHOW_VULN_FALLBACK(spec_rstack_overf CPU_SHOW_VULN_FALLBACK(gds); CPU_SHOW_VULN_FALLBACK(reg_file_data_sampling); CPU_SHOW_VULN_FALLBACK(ghostwrite); +CPU_SHOW_VULN_FALLBACK(indirect_target_selection);
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); @@ -616,6 +617,7 @@ static DEVICE_ATTR(spec_rstack_overflow, static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL); static DEVICE_ATTR(reg_file_data_sampling, 0444, cpu_show_reg_file_data_sampling, NULL); static DEVICE_ATTR(ghostwrite, 0444, cpu_show_ghostwrite, NULL); +static DEVICE_ATTR(indirect_target_selection, 0444, cpu_show_indirect_target_selection, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_meltdown.attr, @@ -633,6 +635,7 @@ static struct attribute *cpu_root_vulner &dev_attr_gather_data_sampling.attr, &dev_attr_reg_file_data_sampling.attr, &dev_attr_ghostwrite.attr, + &dev_attr_indirect_target_selection.attr, NULL };
--- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -78,6 +78,8 @@ extern ssize_t cpu_show_gds(struct devic extern ssize_t cpu_show_reg_file_data_sampling(struct device *dev, struct device_attribute *attr, char *buf); extern ssize_t cpu_show_ghostwrite(struct device *dev, struct device_attribute *attr, char *buf); +extern ssize_t cpu_show_indirect_target_selection(struct device *dev, + struct device_attribute *attr, char *buf);
extern __printf(4, 5) struct device *cpu_device_create(struct device *parent, void *drvdata,
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 2665281a07e19550944e8354a2024635a7b2714a upstream.
Ice Lake generation CPUs are not affected by the guest/host isolation part of ITS. If a user is only concerned about KVM guests, they can now choose a new cmdline option "vmexit" that will not deploy the ITS mitigation when the CPU is not affected by guest/host isolation. This saves the performance overhead of the ITS mitigation on Ice Lake generation CPUs.
When the "vmexit" option is selected and the CPU is affected by ITS guest/host isolation, the default ITS mitigation is deployed.
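The interaction between the new blacklist bit and the "vmexit" option can be sketched as follows. This is a simplified userspace model, not kernel code: the BLACKLIST dictionary is a small, hypothetical excerpt of cpu_vuln_blacklist[] with the other bug bits dropped, and vmexit_outcome() assumes all earlier checks in its_select_mitigation() have already passed.

```python
ITS = 1 << 8              # BIT(8) in arch/x86/kernel/cpu/common.c
ITS_NATIVE_ONLY = 1 << 9  # BIT(9), added by this patch

# Excerpt of the table as changed by this patch; other bug bits omitted.
BLACKLIST = {
    "ICELAKE_X": ITS | ITS_NATIVE_ONLY,
    "TIGERLAKE": ITS | ITS_NATIVE_ONLY,
    "ROCKETLAKE": ITS | ITS_NATIVE_ONLY,
    "COMETLAKE": ITS,
}

# Status strings copied from its_strings[].
VMEXIT_ONLY = "Mitigation: Vulnerable, KVM: Not affected"
ALIGNED_THUNKS = "Mitigation: Aligned branch/return thunks"


def vmexit_outcome(flags):
    """Result of indirect_target_selection=vmexit for a CPU with these flags.

    CPUs marked ITS_NATIVE_ONLY skip the mitigation; all others fall
    through to the default ("on") behaviour.
    """
    if flags & ITS_NATIVE_ONLY:
        return VMEXIT_ONLY
    return ALIGNED_THUNKS
```

With this excerpt, vmexit_outcome(BLACKLIST["ICELAKE_X"]) yields the "KVM: Not affected" string, while a Comet Lake entry still gets the aligned thunks.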
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 Documentation/admin-guide/kernel-parameters.txt |    2 ++
 arch/x86/include/asm/cpufeatures.h              |    1 +
 arch/x86/kernel/cpu/bugs.c                      |   11 +++++++++++
 arch/x86/kernel/cpu/common.c                    |   19 ++++++++++++-------
 4 files changed, 26 insertions(+), 7 deletions(-)
--- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2186,6 +2186,8 @@ off: Disable mitigation. force: Force the ITS bug and deploy default mitigation. + vmexit: Only deploy mitigation if CPU is affected by + guest/host isolation part of ITS.
For details see: Documentation/admin-guide/hw-vuln/indirect-target-selection.rst --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -536,4 +536,5 @@ #define X86_BUG_BHI X86_BUG(1*32 + 3) /* "bhi" CPU is affected by Branch History Injection */ #define X86_BUG_IBPB_NO_RET X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */ #define X86_BUG_ITS X86_BUG(1*32 + 5) /* "its" CPU is affected by Indirect Target Selection */ +#define X86_BUG_ITS_NATIVE_ONLY X86_BUG(1*32 + 6) /* "its_native_only" CPU is affected by ITS, VMX is not affected */ #endif /* _ASM_X86_CPUFEATURES_H */ --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1189,16 +1189,19 @@ do_cmd_auto: enum its_mitigation_cmd { ITS_CMD_OFF, ITS_CMD_ON, + ITS_CMD_VMEXIT, };
enum its_mitigation { ITS_MITIGATION_OFF, + ITS_MITIGATION_VMEXIT_ONLY, ITS_MITIGATION_ALIGNED_THUNKS, ITS_MITIGATION_RETPOLINE_STUFF, };
static const char * const its_strings[] = { [ITS_MITIGATION_OFF] = "Vulnerable", + [ITS_MITIGATION_VMEXIT_ONLY] = "Mitigation: Vulnerable, KVM: Not affected", [ITS_MITIGATION_ALIGNED_THUNKS] = "Mitigation: Aligned branch/return thunks", [ITS_MITIGATION_RETPOLINE_STUFF] = "Mitigation: Retpolines, Stuffing RSB", }; @@ -1225,6 +1228,8 @@ static int __init its_parse_cmdline(char } else if (!strcmp(str, "force")) { its_cmd = ITS_CMD_ON; setup_force_cpu_bug(X86_BUG_ITS); + } else if (!strcmp(str, "vmexit")) { + its_cmd = ITS_CMD_VMEXIT; } else { pr_err("Ignoring unknown indirect_target_selection option (%s).", str); } @@ -1280,6 +1285,12 @@ static void __init its_select_mitigation case ITS_CMD_OFF: its_mitigation = ITS_MITIGATION_OFF; break; + case ITS_CMD_VMEXIT: + if (boot_cpu_has_bug(X86_BUG_ITS_NATIVE_ONLY)) { + its_mitigation = ITS_MITIGATION_VMEXIT_ONLY; + goto out; + } + fallthrough; case ITS_CMD_ON: its_mitigation = ITS_MITIGATION_ALIGNED_THUNKS; if (!boot_cpu_has(X86_FEATURE_RETPOLINE)) --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1228,6 +1228,8 @@ static const __initconst struct x86_cpu_ #define RFDS BIT(7) /* CPU is affected by Indirect Target Selection */ #define ITS BIT(8) +/* CPU is affected by Indirect Target Selection, but guest-host isolation is not affected */ +#define ITS_NATIVE_ONLY BIT(9)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE, X86_STEP_MAX, SRBDS), @@ -1248,16 +1250,16 @@ static const struct x86_cpu_id cpu_vuln_ VULNBL_INTEL_STEPS(INTEL_KABYLAKE, 0xc, MMIO | RETBLEED | GDS | SRBDS), VULNBL_INTEL_STEPS(INTEL_KABYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS), VULNBL_INTEL_STEPS(INTEL_CANNONLAKE_L, X86_STEP_MAX, RETBLEED), - VULNBL_INTEL_STEPS(INTEL_ICELAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), - VULNBL_INTEL_STEPS(INTEL_ICELAKE_D, X86_STEP_MAX, MMIO | GDS | ITS), - VULNBL_INTEL_STEPS(INTEL_ICELAKE_X, X86_STEP_MAX, MMIO | GDS | ITS), + VULNBL_INTEL_STEPS(INTEL_ICELAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY), + VULNBL_INTEL_STEPS(INTEL_ICELAKE_D, X86_STEP_MAX, MMIO | GDS | ITS | ITS_NATIVE_ONLY), + VULNBL_INTEL_STEPS(INTEL_ICELAKE_X, X86_STEP_MAX, MMIO | GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPS(INTEL_COMETLAKE, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, 0x0, MMIO | RETBLEED | ITS), VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), - VULNBL_INTEL_STEPS(INTEL_TIGERLAKE_L, X86_STEP_MAX, GDS | ITS), - VULNBL_INTEL_STEPS(INTEL_TIGERLAKE, X86_STEP_MAX, GDS | ITS), + VULNBL_INTEL_STEPS(INTEL_TIGERLAKE_L, X86_STEP_MAX, GDS | ITS | ITS_NATIVE_ONLY), + VULNBL_INTEL_STEPS(INTEL_TIGERLAKE, X86_STEP_MAX, GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPS(INTEL_LAKEFIELD, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED), - VULNBL_INTEL_STEPS(INTEL_ROCKETLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS), + VULNBL_INTEL_STEPS(INTEL_ROCKETLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPS(INTEL_ALDERLAKE, X86_STEP_MAX, RFDS), VULNBL_INTEL_STEPS(INTEL_ALDERLAKE_L, X86_STEP_MAX, RFDS), VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE, X86_STEP_MAX, RFDS), @@ -1480,8 +1482,11 @@ static void __init cpu_set_bug_bits(stru if 
(cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET)) setup_force_cpu_bug(X86_BUG_IBPB_NO_RET);
- if (vulnerable_to_its(x86_arch_cap_msr)) + if (vulnerable_to_its(x86_arch_cap_msr)) { setup_force_cpu_bug(X86_BUG_ITS); + if (cpu_matches(cpu_vuln_blacklist, ITS_NATIVE_ONLY)) + setup_force_cpu_bug(X86_BUG_ITS_NATIVE_ONLY); + }
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) return;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit facd226f7e0c8ca936ac114aba43cb3e8b94e41e upstream.
When the retpoline mitigation is enabled for spectre-v2, enabling call-depth-tracking and RSB stuffing also mitigates ITS. Add the cmdline option indirect_target_selection=stuff to allow enabling the RSB stuffing mitigation.
When the retpoline mitigation is not enabled, the =stuff option is ignored and the default ITS mitigation is deployed.
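The fallback rule can be written down as a small pure function. This is a sketch only; the function and parameter names are hypothetical, and the condition is taken from the check this patch adds before the switch statement in its_select_mitigation().

```python
def effective_its_cmd(cmd, retpoline_enabled, cdt_built_in):
    """Return the command the kernel would actually act on.

    cmd               -- value parsed from indirect_target_selection=
    retpoline_enabled -- models boot_cpu_has(X86_FEATURE_RETPOLINE)
    cdt_built_in      -- models IS_ENABLED(CONFIG_MITIGATION_CALL_DEPTH_TRACKING)
    """
    if cmd == "stuff" and not (retpoline_enabled and cdt_built_in):
        # "RSB stuff mitigation not supported, using default"
        return "on"
    return cmd
```

Only when both conditions hold does "stuff" survive and select call_depth_return_thunk; in every other case the request degrades to the default mitigation rather than to no mitigation.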
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 Documentation/admin-guide/kernel-parameters.txt |    3 +++
 arch/x86/kernel/cpu/bugs.c                      |   19 +++++++++++++++++++
 2 files changed, 22 insertions(+)
--- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2188,6 +2188,9 @@ mitigation. vmexit: Only deploy mitigation if CPU is affected by guest/host isolation part of ITS. + stuff: Deploy RSB-fill mitigation when retpoline is + also deployed. Otherwise, deploy the default + mitigation.
For details see: Documentation/admin-guide/hw-vuln/indirect-target-selection.rst --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1190,6 +1190,7 @@ enum its_mitigation_cmd { ITS_CMD_OFF, ITS_CMD_ON, ITS_CMD_VMEXIT, + ITS_CMD_RSB_STUFF, };
enum its_mitigation { @@ -1230,6 +1231,8 @@ static int __init its_parse_cmdline(char setup_force_cpu_bug(X86_BUG_ITS); } else if (!strcmp(str, "vmexit")) { its_cmd = ITS_CMD_VMEXIT; + } else if (!strcmp(str, "stuff")) { + its_cmd = ITS_CMD_RSB_STUFF; } else { pr_err("Ignoring unknown indirect_target_selection option (%s).", str); } @@ -1281,6 +1284,12 @@ static void __init its_select_mitigation goto out; }
+ if (cmd == ITS_CMD_RSB_STUFF && + (!boot_cpu_has(X86_FEATURE_RETPOLINE) || !IS_ENABLED(CONFIG_MITIGATION_CALL_DEPTH_TRACKING))) { + pr_err("RSB stuff mitigation not supported, using default\n"); + cmd = ITS_CMD_ON; + } + switch (cmd) { case ITS_CMD_OFF: its_mitigation = ITS_MITIGATION_OFF; @@ -1298,6 +1307,16 @@ static void __init its_select_mitigation setup_force_cpu_cap(X86_FEATURE_RETHUNK); set_return_thunk(its_return_thunk); break; + case ITS_CMD_RSB_STUFF: + its_mitigation = ITS_MITIGATION_RETPOLINE_STUFF; + setup_force_cpu_cap(X86_FEATURE_RETHUNK); + setup_force_cpu_cap(X86_FEATURE_CALL_DEPTH); + set_return_thunk(call_depth_return_thunk); + if (retbleed_mitigation == RETBLEED_MITIGATION_NONE) { + retbleed_mitigation = RETBLEED_MITIGATION_STUFF; + pr_info("Retbleed mitigation updated to stuffing\n"); + } + break; } out: pr_info("%s\n", its_strings[its_mitigation]);
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit f0cd7091cc5a032c8870b4285305d9172569d126 upstream.
The software mitigation for BHI is to execute the BHB clear sequence at syscall entry, and possibly after a cBPF program. The ITS mitigation thunks RETs in the lower half of a cacheline. This causes the RETs in the BHB clear sequence to be thunked as well, adding unnecessary branches to the BHB clear sequence.
Since the sequence is in a hot path, align the RET instructions in the sequence to avoid thunking.
This is what the disassembly of clear_bhb_loop() looks like after this change:
   0x44 <+4>:     mov    $0x5,%ecx
   0x49 <+9>:     call   0xffffffff81001d9b <clear_bhb_loop+91>
   0x4e <+14>:    jmp    0xffffffff81001de5 <clear_bhb_loop+165>
   0x53 <+19>:    int3
   ...
   0x9b <+91>:    call   0xffffffff81001dce <clear_bhb_loop+142>
   0xa0 <+96>:    ret
   0xa1 <+97>:    int3
   ...
   0xce <+142>:   mov    $0x5,%eax
   0xd3 <+147>:   jmp    0xffffffff81001dd6 <clear_bhb_loop+150>
   0xd5 <+149>:   nop
   0xd6 <+150>:   sub    $0x1,%eax
   0xd9 <+153>:   jne    0xffffffff81001dd3 <clear_bhb_loop+147>
   0xdb <+155>:   sub    $0x1,%ecx
   0xde <+158>:   jne    0xffffffff81001d9b <clear_bhb_loop+91>
   0xe0 <+160>:   ret
   0xe1 <+161>:   int3
   0xe2 <+162>:   int3
   0xe3 <+163>:   int3
   0xe4 <+164>:   int3
   0xe5 <+165>:   lfence
   0xe8 <+168>:   pop    %rbp
   0xe9 <+169>:   ret
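The placement arithmetic behind the ".align 64" followed by ".skip 32 - distance" pattern in the hunk below can be checked against the disassembly above. The following sketch assumes 64-byte cachelines; the helper names are hypothetical, and the addresses are the low bytes shown in the listing.

```python
CACHELINE = 64
HALF = CACHELINE // 2


def align_up(addr, boundary=CACHELINE):
    """Round addr up to the next multiple of boundary, as .align 64 does."""
    return (addr + boundary - 1) & ~(boundary - 1)


def place_after_skip(addr, ret_distance):
    """Model '.align 64; .skip 32 - ret_distance' starting at addr.

    ret_distance is the number of bytes between the label that follows
    the .skip and its RET. Returns (label_addr, ret_addr).
    """
    base = align_up(addr)
    label = base + (HALF - ret_distance)
    return label, label + ret_distance


# First RET: padding starts after the 5-byte jmp at 0x4e, and the
# "call 2f" before .Lret1 is 5 bytes long.
print([hex(a) for a in place_after_skip(0x53, 5)])
# Second RET: padding starts after the int3 at 0xa1; the patch
# hardcodes the distance as 18.
print([hex(a) for a in place_after_skip(0xA1, 18)])
```

In both cases the RET lands at offset 32 of its cacheline, the first byte of the upper half, which is why neither RET needs to go through its_return_thunk.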
Suggested-by: Andrew Cooper andrew.cooper3@citrix.com
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/entry/entry_64.S |   20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)
--- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1523,7 +1523,9 @@ SYM_CODE_END(rewind_stack_and_make_dead) * ORC to unwind properly. * * The alignment is for performance and not for safety, and may be safely - * refactored in the future if needed. + * refactored in the future if needed. The .skips are for safety, to ensure + * that all RETs are in the second half of a cacheline to mitigate Indirect + * Target Selection, rather than taking the slowpath via its_return_thunk. */ SYM_FUNC_START(clear_bhb_loop) push %rbp @@ -1533,10 +1535,22 @@ SYM_FUNC_START(clear_bhb_loop) call 1f jmp 5f .align 64, 0xcc + /* + * Shift instructions so that the RET is in the upper half of the + * cacheline and don't take the slowpath to its_return_thunk. + */ + .skip 32 - (.Lret1 - 1f), 0xcc ANNOTATE_INTRA_FUNCTION_CALL 1: call 2f - RET +.Lret1: RET .align 64, 0xcc + /* + * As above shift instructions for RET at .Lret2 as well. + * + * This should be ideally be: .skip 32 - (.Lret2 - 2f), 0xcc + * but some Clang versions (e.g. 18) don't like this. + */ + .skip 32 - 18, 0xcc 2: movl $5, %eax 3: jmp 4f nop @@ -1544,7 +1558,7 @@ SYM_FUNC_START(clear_bhb_loop) jnz 3b sub $1, %ecx jnz 1b - RET +.Lret2: RET 5: lfence pop %rbp RET
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit ebebe30794d38c51f71fe4951ba6af4159d9837d upstream.
cfi_rewrite_callers() updates the fineIBT hash matching at the caller side, but except for paranoid-mode it relies on apply_retpoline() and friends for any ENDBR relocation. This could temporarily cause an indirect branch to land on a poisoned ENDBR.
For instance, with para-virtualization enabled, a simple wrmsrl() could have an indirect branch pointing to native_write_msr(), whose ENDBR has been relocated due to fineIBT:

<wrmsrl>:
       push   %rbp
       mov    %rsp,%rbp
       mov    %esi,%eax
       mov    %rsi,%rdx
       shr    $0x20,%rdx
       mov    %edi,%edi
       mov    %rax,%rsi
       call   *0x21e65d0(%rip)        # <pv_ops+0xb8>
       ^^^^^^^^^^^^^^^^^^^^^^^

Such an indirect call during alternative patching could raise a #CP if the caller is not *yet* adjusted for the new target ENDBR. To prevent a false #CP, keep CET-IBT disabled until all callers are patched.
Patching during the module load does not need to be guarded by IBT-disable because the module code is not executed until the patching is complete.
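The save, patch, restore shape that this patch adds around the boot-time patching can be illustrated with a small userspace analogy. Everything here is hypothetical: the Cpu class merely stands in for the IBT enable state, and the context manager only mirrors the pairing of ibt_save(true) and ibt_restore(ibt) in the hunk below. The try/finally is a Python idiom, not something the kernel code does.

```python
from contextlib import contextmanager


class Cpu:
    """Hypothetical stand-in for the CPU's CET-IBT enable state."""

    def __init__(self, ibt_enabled=True):
        self.ibt_enabled = ibt_enabled


@contextmanager
def ibt_disabled(cpu):
    """Disable IBT for the duration of the block, then restore the old state."""
    saved = cpu.ibt_enabled      # analogous to: ibt = ibt_save(true)
    cpu.ibt_enabled = False
    try:
        yield
    finally:
        cpu.ibt_enabled = saved  # analogous to: ibt_restore(ibt)
```

The point of restoring the saved value, rather than unconditionally re-enabling, is that a system booted with IBT off stays off after patching.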
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/alternative.c |    8 ++++++++
 1 file changed, 8 insertions(+)
--- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -31,6 +31,7 @@ #include <asm/paravirt.h> #include <asm/asm-prototypes.h> #include <asm/cfi.h> +#include <asm/ibt.h>
int __read_mostly alternatives_patched;
@@ -1748,6 +1749,8 @@ static noinline void __init alt_reloc_se
void __init alternative_instructions(void) { + u64 ibt; + int3_selftest();
/* @@ -1774,6 +1777,9 @@ void __init alternative_instructions(voi */ paravirt_set_cap();
+ /* Keep CET-IBT disabled until caller/callee are patched */ + ibt = ibt_save(/*disable*/ true); + __apply_fineibt(__retpoline_sites, __retpoline_sites_end, __cfi_sites, __cfi_sites_end, NULL);
@@ -1797,6 +1803,8 @@ void __init alternative_instructions(voi */ apply_seal_endbr(__ibt_endbr_seal, __ibt_endbr_seal_end, NULL);
+ ibt_restore(ibt); + #ifdef CONFIG_SMP /* Patch to UP if other cpus not imminent. */ if (!noreplace_smp && (num_present_cpus() == 1 || setup_max_cpus <= 1)) {
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit 872df34d7c51a79523820ea6a14860398c639b87 upstream.
The ITS mitigation moves the unsafe indirect branches to a safe thunk. This could degrade prediction accuracy, as the source address of the indirect branches becomes the same for different execution paths.
To improve the predictions, and hence the performance, assign a separate thunk for each indirect callsite. This is also a defense-in-depth measure to avoid indirect branches aliasing with each other.
As an example, 5000 dynamic thunks would utilize around 16 bits of the address space, thereby gaining entropy. For a BTB that uses 32 bits for indexing, dynamic thunks could provide better prediction accuracy over fixed thunks.
Have ITS thunks be variable sized and use EXECMEM_MODULE_TEXT, such that they are both more flexible (they will need to be extended later) and live in 2M TLBs, just like kernel code, avoiding undue TLB pressure.
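To make the placement rule concrete, the following userspace sketch models its_init_thunk() and the offset arithmetic of its_allocate_thunk() from the diff below. It is an approximation under stated assumptions: pages are 4 KiB bytearrays, the ThunkAllocator class is hypothetical, and the execmem allocation, per-module page tracking, and set_memory_rw()/set_memory_rox() calls are not modelled.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
INT3 = 0xCC


def encode_thunk(reg):
    """Bytes for 'jmp *reg; int3', following its_init_thunk()."""
    out = bytearray()
    if reg >= 8:
        out.append(0x41)  # REX.B prefix for r8..r15
        reg -= 8
    out += bytes([0xFF, 0xE0 + reg, INT3])
    return bytes(out)


class ThunkAllocator:
    """Hypothetical model of the its_page / its_offset bookkeeping."""

    def __init__(self):
        self.pages = []   # each page is a bytearray pre-filled with int3
        self.offset = 0

    def allocate(self, reg):
        """Place a thunk for register reg; return (page_index, offset)."""
        code = encode_thunk(reg)
        size = len(code)  # 3 bytes, or 4 with the REX prefix
        if not self.pages or self.offset + size - 1 >= PAGE_SIZE:
            self.pages.append(bytearray([INT3]) * PAGE_SIZE)
            self.offset = 32
        # If the last byte would land in the lower half of a cacheline,
        # advance to the start of the next upper half.
        if (self.offset + size - 1) % 64 < 32:
            self.offset = ((self.offset - 1) | 0x3F) + 33
        start = self.offset
        self.pages[-1][start:start + size] = code
        self.offset += size
        return len(self.pages) - 1, start
```

In this model, ten 3-byte thunks fit in the upper half of each cacheline (offsets 32 through 59), after which the allocator jumps to offset 96, and a fresh page is started once the current one is exhausted.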
[ pawan: CONFIG_EXECMEM_ROX is not supported on backport kernel, made adjustments to set memory to RW and ROX ]
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/Kconfig                   |    1
 arch/x86/include/asm/alternative.h |   10 ++
 arch/x86/kernel/alternative.c      |  129 ++++++++++++++++++++++++++++++++++++-
 arch/x86/kernel/module.c           |    6 +
 include/linux/execmem.h            |    3
 include/linux/module.h             |    5 +
 6 files changed, 151 insertions(+), 3 deletions(-)
--- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2768,6 +2768,7 @@ config MITIGATION_ITS bool "Enable Indirect Target Selection mitigation" depends on CPU_SUP_INTEL && X86_64 depends on MITIGATION_RETPOLINE && MITIGATION_RETHUNK + select EXECMEM default y help Enable Indirect Target Selection (ITS) mitigation. ITS is a bug in --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -125,6 +125,16 @@ static __always_inline int x86_call_dept } #endif
+#ifdef CONFIG_MITIGATION_ITS +extern void its_init_mod(struct module *mod); +extern void its_fini_mod(struct module *mod); +extern void its_free_mod(struct module *mod); +#else /* CONFIG_MITIGATION_ITS */ +static inline void its_init_mod(struct module *mod) { } +static inline void its_fini_mod(struct module *mod) { } +static inline void its_free_mod(struct module *mod) { } +#endif + #if defined(CONFIG_MITIGATION_RETHUNK) && defined(CONFIG_OBJTOOL) extern bool cpu_wants_rethunk(void); extern bool cpu_wants_rethunk_at(void *addr); --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -18,6 +18,7 @@ #include <linux/mmu_context.h> #include <linux/bsearch.h> #include <linux/sync_core.h> +#include <linux/execmem.h> #include <asm/text-patching.h> #include <asm/alternative.h> #include <asm/sections.h> @@ -32,6 +33,7 @@ #include <asm/asm-prototypes.h> #include <asm/cfi.h> #include <asm/ibt.h> +#include <asm/set_memory.h>
int __read_mostly alternatives_patched;
@@ -125,6 +127,123 @@ const unsigned char * const x86_nops[ASM #endif };
+#ifdef CONFIG_MITIGATION_ITS + +static struct module *its_mod; +static void *its_page; +static unsigned int its_offset; + +/* Initialize a thunk with the "jmp *reg; int3" instructions. */ +static void *its_init_thunk(void *thunk, int reg) +{ + u8 *bytes = thunk; + int i = 0; + + if (reg >= 8) { + bytes[i++] = 0x41; /* REX.B prefix */ + reg -= 8; + } + bytes[i++] = 0xff; + bytes[i++] = 0xe0 + reg; /* jmp *reg */ + bytes[i++] = 0xcc; + + return thunk; +} + +void its_init_mod(struct module *mod) +{ + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return; + + mutex_lock(&text_mutex); + its_mod = mod; + its_page = NULL; +} + +void its_fini_mod(struct module *mod) +{ + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return; + + WARN_ON_ONCE(its_mod != mod); + + its_mod = NULL; + its_page = NULL; + mutex_unlock(&text_mutex); + + for (int i = 0; i < mod->its_num_pages; i++) { + void *page = mod->its_page_array[i]; + set_memory_rox((unsigned long)page, 1); + } +} + +void its_free_mod(struct module *mod) +{ + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return; + + for (int i = 0; i < mod->its_num_pages; i++) { + void *page = mod->its_page_array[i]; + execmem_free(page); + } + kfree(mod->its_page_array); +} + +static void *its_alloc(void) +{ + void *page __free(execmem) = execmem_alloc(EXECMEM_MODULE_TEXT, PAGE_SIZE); + + if (!page) + return NULL; + + if (its_mod) { + void *tmp = krealloc(its_mod->its_page_array, + (its_mod->its_num_pages+1) * sizeof(void *), + GFP_KERNEL); + if (!tmp) + return NULL; + + its_mod->its_page_array = tmp; + its_mod->its_page_array[its_mod->its_num_pages++] = page; + } + + return no_free_ptr(page); +} + +static void *its_allocate_thunk(int reg) +{ + int size = 3 + (reg / 8); + void *thunk; + + if (!its_page || (its_offset + size - 1) >= PAGE_SIZE) { + its_page = its_alloc(); + if (!its_page) { + pr_err("ITS page allocation failed\n"); + return NULL; + } + memset(its_page, INT3_INSN_OPCODE, PAGE_SIZE); + 
its_offset = 32; + } + + /* + * If the indirect branch instruction will be in the lower half + * of a cacheline, then update the offset to reach the upper half. + */ + if ((its_offset + size - 1) % 64 < 32) + its_offset = ((its_offset - 1) | 0x3F) + 33; + + thunk = its_page + its_offset; + its_offset += size; + + set_memory_rw((unsigned long)its_page, 1); + thunk = its_init_thunk(thunk, reg); + set_memory_rox((unsigned long)its_page, 1); + + return thunk; +} + +#endif + /* * Nomenclature for variable names to simplify and clarify this code and ease * any potential staring at it: @@ -646,9 +765,13 @@ static int emit_call_track_retpoline(voi #ifdef CONFIG_MITIGATION_ITS static int emit_its_trampoline(void *addr, struct insn *insn, int reg, u8 *bytes) { - return __emit_trampoline(addr, insn, bytes, - __x86_indirect_its_thunk_array[reg], - __x86_indirect_its_thunk_array[reg]); + u8 *thunk = __x86_indirect_its_thunk_array[reg]; + u8 *tmp = its_allocate_thunk(reg); + + if (tmp) + thunk = tmp; + + return __emit_trampoline(addr, insn, bytes, thunk, thunk); }
/* Check if an indirect branch is at ITS-unsafe address */ --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -252,6 +252,8 @@ int module_finalize(const Elf_Ehdr *hdr, ibt_endbr = s; }
+ its_init_mod(me); + if (retpolines || cfi) { void *rseg = NULL, *cseg = NULL; unsigned int rsize = 0, csize = 0; @@ -272,6 +274,9 @@ int module_finalize(const Elf_Ehdr *hdr, void *rseg = (void *)retpolines->sh_addr; apply_retpolines(rseg, rseg + retpolines->sh_size, me); } + + its_fini_mod(me); + if (returns) { void *rseg = (void *)returns->sh_addr; apply_returns(rseg, rseg + returns->sh_size, me); @@ -335,4 +340,5 @@ int module_post_finalize(const Elf_Ehdr void module_arch_cleanup(struct module *mod) { alternatives_smp_module_del(mod); + its_free_mod(mod); } --- a/include/linux/execmem.h +++ b/include/linux/execmem.h @@ -4,6 +4,7 @@
#include <linux/types.h> #include <linux/moduleloader.h> +#include <linux/cleanup.h>
#if (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)) && \ !defined(CONFIG_KASAN_VMALLOC) @@ -139,6 +140,8 @@ void *execmem_alloc(enum execmem_type ty */ void execmem_free(void *ptr);
+DEFINE_FREE(execmem, void *, if (_T) execmem_free(_T)); + #ifdef CONFIG_MMU /** * execmem_vmap - create virtual mapping for EXECMEM_MODULE_DATA memory --- a/include/linux/module.h +++ b/include/linux/module.h @@ -587,6 +587,11 @@ struct module { atomic_t refcnt; #endif
+#ifdef CONFIG_MITIGATION_ITS + int its_num_pages; + void **its_page_array; +#endif + #ifdef CONFIG_CONSTRUCTORS /* Constructor functions. */ ctor_fn_t *ctors;
6.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 7a9b709e7cc5ce1ffb84ce07bf6d157e1de758df upstream.
Below are the tests added for Indirect Target Selection (ITS):
- its_sysfs.py - Check if sysfs reflects the correct mitigation status for the mitigation selected via the kernel cmdline.
- its_permutations.py - tests mitigation selection with cmdline permutations with other bugs like spectre_v2 and retbleed.
- its_indirect_alignment.py - verifies that addresses in the .retpoline_sites section that fall in the lower half of a cacheline are patched to an ITS-safe thunk. Typical output looks like this:
  Site 49: function symbol: __x64_sys_restart_syscall+0x1f <0xffffffffbb1509af>
  #     vmlinux: 0xffffffff813509af: jmp 0xffffffff81f5a8e0
  #     kcore: 0xffffffffbb1509af: jmpq *%rax
  #     ITS thunk NOT expected for site 49
  #     PASSED: Found *%rax
  #
  # Site 50: function symbol: __resched_curr+0xb0 <0xffffffffbb181910>
  #     vmlinux: 0xffffffff81381910: jmp 0xffffffff81f5a8e0
  #     kcore: 0xffffffffbb181910: jmp 0xffffffffc02000fc
  #     ITS thunk expected for site 50
  #     PASSED: Found 0xffffffffc02000fc -> jmpq *%rax <scattered-thunk?>
- its_ret_alignment.py - verifies that addresses in the .return_sites section that fall in the lower half of a cacheline are patched to its_return_thunk. Typical output looks like this:
  Site 97: function symbol: collect_event+0x48 <0xffffffffbb007f18>
  #     vmlinux: 0xffffffff81207f18: jmp 0xffffffff81f5b500
  #     kcore: 0xffffffffbb007f18: jmp 0xffffffffbbd5b560
  #     PASSED: Found jmp 0xffffffffbbd5b560 <its_return_thunk>
  #
  # Site 98: function symbol: collect_event+0xa4 <0xffffffffbb007f74>
  #     vmlinux: 0xffffffff81207f74: jmp 0xffffffff81f5b500
  #     kcore: 0xffffffffbb007f74: retq
  #     PASSED: Found retq
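The "lower half of a cacheline" check that both alignment tests rely on can be sketched as below. This mirrors the `insn_end & 0x20` test visible in the scripts in this patch, but the helper names are hypothetical, and the instruction sizes used in the example (2 bytes for `jmpq *%rax`, 5 bytes for a `jmp` with a 32-bit displacement) are assumptions taken from the standard x86-64 encodings rather than from the test output.

```python
CACHELINE_UPPER_HALF_BIT = 0x20  # bit 5 set -> byte is in the upper 32 bytes


def insn_end(site, size):
    """Address of the last byte of an instruction starting at `site`."""
    return site + size - 1


def is_safe_site(site, size):
    """True if the instruction's last byte lies in the upper half of a
    64-byte cacheline, i.e. no ITS thunk is expected for it."""
    return bool(insn_end(site, size) & CACHELINE_UPPER_HALF_BIT)


if __name__ == "__main__":
    # Addresses taken from the sample output in the commit message.
    print(is_safe_site(0xffffffffbb1509af, 2))  # site 49, jmpq *%rax
    print(is_safe_site(0xffffffffbb181910, 5))  # site 50, jmp rel32
    print(is_safe_site(0xffffffffc02000fc, 2))  # thunk target of site 50
```

Under these assumptions, site 49 ends in the upper half (no thunk expected), site 50 ends in the lower half (thunk expected), and the thunk that site 50 jumps to is itself at a safe address, which matches the sample output.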
Some of these tests depend on tools like virtme-ng[1] and drgn[2]. When the dependencies are not met, the test is skipped.
[1] https://github.com/arighi/virtme-ng
[2] https://github.com/osandov/drgn
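One way to decide up front whether such a test should be skipped is to probe for the Python modules and command-line tools before doing any work. The sketch below is not the code from this patch (which tries `__import__` and calls `shutil.which('vng')` directly); the function name missing_dependencies is hypothetical and only standard-library calls are used.

```python
import importlib.util
import shutil


def missing_dependencies(modules, tools):
    """Return the names of Python modules and executables that are not
    available, so the caller can report a skip instead of a failure."""
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    missing += [t for t in tools if shutil.which(t) is None]
    return missing


if __name__ == "__main__":
    absent = missing_dependencies(["drgn", "elftools", "capstone"], ["vng"])
    if absent:
        print("SKIP: missing " + ", ".join(absent))
    else:
        print("all dependencies present")
```

A caller in the kselftest framework would presumably turn a non-empty result into a skip result rather than printing, as the scripts in this patch do with ksft.test_result_skip().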
Co-developed-by: Tao Zhang tao1.zhang@linux.intel.com
Signed-off-by: Tao Zhang tao1.zhang@linux.intel.com
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 tools/testing/selftests/Makefile | 1
 tools/testing/selftests/x86/bugs/Makefile | 3
 tools/testing/selftests/x86/bugs/common.py | 164 +++++++++++++
 tools/testing/selftests/x86/bugs/its_indirect_alignment.py | 150 +++++++++++
 tools/testing/selftests/x86/bugs/its_permutations.py | 109 ++++++++
 tools/testing/selftests/x86/bugs/its_ret_alignment.py | 139 +++++++++++
 tools/testing/selftests/x86/bugs/its_sysfs.py | 65 +++++
 7 files changed, 631 insertions(+)
 create mode 100644 tools/testing/selftests/x86/bugs/Makefile
 create mode 100755 tools/testing/selftests/x86/bugs/common.py
 create mode 100755 tools/testing/selftests/x86/bugs/its_indirect_alignment.py
 create mode 100755 tools/testing/selftests/x86/bugs/its_permutations.py
 create mode 100755 tools/testing/selftests/x86/bugs/its_ret_alignment.py
 create mode 100755 tools/testing/selftests/x86/bugs/its_sysfs.py
--- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -118,6 +118,7 @@ TARGETS += user_events TARGETS += vDSO TARGETS += mm TARGETS += x86 +TARGETS += x86/bugs TARGETS += zram #Please keep the TARGETS list alphabetically sorted # Run "make quicktest=1 run_tests" or --- /dev/null +++ b/tools/testing/selftests/x86/bugs/Makefile @@ -0,0 +1,3 @@ +TEST_PROGS := its_sysfs.py its_permutations.py its_indirect_alignment.py its_ret_alignment.py +TEST_FILES := common.py +include ../../lib.mk --- /dev/null +++ b/tools/testing/selftests/x86/bugs/common.py @@ -0,0 +1,164 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (c) 2025 Intel Corporation +# +# This contains kselftest framework adapted common functions for testing +# mitigation for x86 bugs. + +import os, sys, re, shutil + +sys.path.insert(0, '../../kselftest') +import ksft + +def read_file(path): + if not os.path.exists(path): + return None + with open(path, 'r') as file: + return file.read().strip() + +def cpuinfo_has(arg): + cpuinfo = read_file('/proc/cpuinfo') + if arg in cpuinfo: + return True + return False + +def cmdline_has(arg): + cmdline = read_file('/proc/cmdline') + if arg in cmdline: + return True + return False + +def cmdline_has_either(args): + cmdline = read_file('/proc/cmdline') + for arg in args: + if arg in cmdline: + return True + return False + +def cmdline_has_none(args): + return not cmdline_has_either(args) + +def cmdline_has_all(args): + cmdline = read_file('/proc/cmdline') + for arg in args: + if arg not in cmdline: + return False + return True + +def get_sysfs(bug): + return read_file("/sys/devices/system/cpu/vulnerabilities/" + bug) + +def sysfs_has(bug, mitigation): + status = get_sysfs(bug) + if mitigation in status: + return True + return False + +def sysfs_has_either(bugs, mitigations): + for bug in bugs: + for mitigation in mitigations: + if sysfs_has(bug, mitigation): + return True + return False + +def sysfs_has_none(bugs, 
mitigations): + return not sysfs_has_either(bugs, mitigations) + +def sysfs_has_all(bugs, mitigations): + for bug in bugs: + for mitigation in mitigations: + if not sysfs_has(bug, mitigation): + return False + return True + +def bug_check_pass(bug, found): + ksft.print_msg(f"\nFound: {found}") + # ksft.print_msg(f"\ncmdline: {read_file('/proc/cmdline')}") + ksft.test_result_pass(f'{bug}: {found}') + +def bug_check_fail(bug, found, expected): + ksft.print_msg(f'\nFound:\t {found}') + ksft.print_msg(f'Expected:\t {expected}') + ksft.print_msg(f"\ncmdline: {read_file('/proc/cmdline')}") + ksft.test_result_fail(f'{bug}: {found}') + +def bug_status_unknown(bug, found): + ksft.print_msg(f'\nUnknown status: {found}') + ksft.print_msg(f"\ncmdline: {read_file('/proc/cmdline')}") + ksft.test_result_fail(f'{bug}: {found}') + +def basic_checks_sufficient(bug, mitigation): + if not mitigation: + bug_status_unknown(bug, "None") + return True + elif mitigation == "Not affected": + ksft.test_result_pass(bug) + return True + elif mitigation == "Vulnerable": + if cmdline_has_either([f'{bug}=off', 'mitigations=off']): + bug_check_pass(bug, mitigation) + return True + return False + +def get_section_info(vmlinux, section_name): + from elftools.elf.elffile import ELFFile + with open(vmlinux, 'rb') as f: + elffile = ELFFile(f) + section = elffile.get_section_by_name(section_name) + if section is None: + ksft.print_msg("Available sections in vmlinux:") + for sec in elffile.iter_sections(): + ksft.print_msg(sec.name) + raise ValueError(f"Section {section_name} not found in {vmlinux}") + return section['sh_addr'], section['sh_offset'], section['sh_size'] + +def get_patch_sites(vmlinux, offset, size): + import struct + output = [] + with open(vmlinux, 'rb') as f: + f.seek(offset) + i = 0 + while i < size: + data = f.read(4) # s32 + if not data: + break + sym_offset = struct.unpack('<i', data)[0] + i + i += 4 + output.append(sym_offset) + return output + +def 
get_instruction_from_vmlinux(elffile, section, virtual_address, target_address): + from capstone import Cs, CS_ARCH_X86, CS_MODE_64 + section_start = section['sh_addr'] + section_end = section_start + section['sh_size'] + + if not (section_start <= target_address < section_end): + return None + + offset = target_address - section_start + code = section.data()[offset:offset + 16] + + cap = init_capstone() + for instruction in cap.disasm(code, target_address): + if instruction.address == target_address: + return instruction + return None + +def init_capstone(): + from capstone import Cs, CS_ARCH_X86, CS_MODE_64, CS_OPT_SYNTAX_ATT + cap = Cs(CS_ARCH_X86, CS_MODE_64) + cap.syntax = CS_OPT_SYNTAX_ATT + return cap + +def get_runtime_kernel(): + import drgn + return drgn.program_from_kernel() + +def check_dependencies_or_skip(modules, script_name="unknown test"): + for mod in modules: + try: + __import__(mod) + except ImportError: + ksft.test_result_skip(f"Skipping {script_name}: missing module '{mod}'") + ksft.finished() --- /dev/null +++ b/tools/testing/selftests/x86/bugs/its_indirect_alignment.py @@ -0,0 +1,150 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (c) 2025 Intel Corporation +# +# Test for indirect target selection (ITS) mitigation. +# +# Test if indirect CALL/JMP are correctly patched by evaluating +# the vmlinux .retpoline_sites in /proc/kcore. 
+ +# Install dependencies +# add-apt-repository ppa:michel-slm/kernel-utils +# apt update +# apt install -y python3-drgn python3-pyelftools python3-capstone +# +# Best to copy the vmlinux at a standard location: +# mkdir -p /usr/lib/debug/lib/modules/$(uname -r) +# cp $VMLINUX /usr/lib/debug/lib/modules/$(uname -r)/vmlinux +# +# Usage: ./its_indirect_alignment.py [vmlinux] + +import os, sys, argparse +from pathlib import Path + +this_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.insert(0, this_dir + '/../../kselftest') +import ksft +import common as c + +bug = "indirect_target_selection" + +mitigation = c.get_sysfs(bug) +if not mitigation or "Aligned branch/return thunks" not in mitigation: + ksft.test_result_skip("Skipping its_indirect_alignment.py: Aligned branch/return thunks not enabled") + ksft.finished() + +if c.sysfs_has("spectre_v2", "Retpolines"): + ksft.test_result_skip("Skipping its_indirect_alignment.py: Retpolines deployed") + ksft.finished() + +c.check_dependencies_or_skip(['drgn', 'elftools', 'capstone'], script_name="its_indirect_alignment.py") + +from elftools.elf.elffile import ELFFile +from drgn.helpers.common.memory import identify_address + +cap = c.init_capstone() + +if len(os.sys.argv) > 1: + arg_vmlinux = os.sys.argv[1] + if not os.path.exists(arg_vmlinux): + ksft.test_result_fail(f"its_indirect_alignment.py: vmlinux not found at argument path: {arg_vmlinux}") + ksft.exit_fail() + os.makedirs(f"/usr/lib/debug/lib/modules/{os.uname().release}", exist_ok=True) + os.system(f'cp {arg_vmlinux} /usr/lib/debug/lib/modules/$(uname -r)/vmlinux') + +vmlinux = f"/usr/lib/debug/lib/modules/{os.uname().release}/vmlinux" +if not os.path.exists(vmlinux): + ksft.test_result_fail(f"its_indirect_alignment.py: vmlinux not found at {vmlinux}") + ksft.exit_fail() + +ksft.print_msg(f"Using vmlinux: {vmlinux}") + +retpolines_start_vmlinux, retpolines_sec_offset, size = c.get_section_info(vmlinux, '.retpoline_sites') +ksft.print_msg(f"vmlinux: Section 
.retpoline_sites (0x{retpolines_start_vmlinux:x}) found at 0x{retpolines_sec_offset:x} with size 0x{size:x}") + +sites_offset = c.get_patch_sites(vmlinux, retpolines_sec_offset, size) +total_retpoline_tests = len(sites_offset) +ksft.print_msg(f"Found {total_retpoline_tests} retpoline sites") + +prog = c.get_runtime_kernel() +retpolines_start_kcore = prog.symbol('__retpoline_sites').address +ksft.print_msg(f'kcore: __retpoline_sites: 0x{retpolines_start_kcore:x}') + +x86_indirect_its_thunk_r15 = prog.symbol('__x86_indirect_its_thunk_r15').address +ksft.print_msg(f'kcore: __x86_indirect_its_thunk_r15: 0x{x86_indirect_its_thunk_r15:x}') + +tests_passed = 0 +tests_failed = 0 +tests_unknown = 0 + +with open(vmlinux, 'rb') as f: + elffile = ELFFile(f) + text_section = elffile.get_section_by_name('.text') + + for i in range(0, len(sites_offset)): + site = retpolines_start_kcore + sites_offset[i] + vmlinux_site = retpolines_start_vmlinux + sites_offset[i] + passed = unknown = failed = False + try: + vmlinux_insn = c.get_instruction_from_vmlinux(elffile, text_section, text_section['sh_addr'], vmlinux_site) + kcore_insn = list(cap.disasm(prog.read(site, 16), site))[0] + operand = kcore_insn.op_str + insn_end = site + kcore_insn.size - 1 # TODO handle Jcc.32 __x86_indirect_thunk_\reg + safe_site = insn_end & 0x20 + site_status = "" if safe_site else "(unsafe)" + + ksft.print_msg(f"\nSite {i}: {identify_address(prog, site)} <0x{site:x}> {site_status}") + ksft.print_msg(f"\tvmlinux: 0x{vmlinux_insn.address:x}:\t{vmlinux_insn.mnemonic}\t{vmlinux_insn.op_str}") + ksft.print_msg(f"\tkcore: 0x{kcore_insn.address:x}:\t{kcore_insn.mnemonic}\t{kcore_insn.op_str}") + + if (site & 0x20) ^ (insn_end & 0x20): + ksft.print_msg(f"\tSite at safe/unsafe boundary: {str(kcore_insn.bytes)} {kcore_insn.mnemonic} {operand}") + if safe_site: + tests_passed += 1 + passed = True + ksft.print_msg(f"\tPASSED: At safe address") + continue + + if operand.startswith('0xffffffff'): + thunk = int(operand, 
16) + if thunk > x86_indirect_its_thunk_r15: + insn_at_thunk = list(cap.disasm(prog.read(thunk, 16), thunk))[0] + operand += ' -> ' + insn_at_thunk.mnemonic + ' ' + insn_at_thunk.op_str + ' <dynamic-thunk?>' + if 'jmp' in insn_at_thunk.mnemonic and thunk & 0x20: + ksft.print_msg(f"\tPASSED: Found {operand} at safe address") + passed = True + if not passed: + if kcore_insn.operands[0].type == capstone.CS_OP_IMM: + operand += ' <' + prog.symbol(int(operand, 16)) + '>' + if '__x86_indirect_its_thunk_' in operand: + ksft.print_msg(f"\tPASSED: Found {operand}") + else: + ksft.print_msg(f"\tPASSED: Found direct branch: {kcore_insn}, ITS thunk not required.") + passed = True + else: + unknown = True + if passed: + tests_passed += 1 + elif unknown: + ksft.print_msg(f"UNKNOWN: unexpected operand: {kcore_insn}") + tests_unknown += 1 + else: + ksft.print_msg(f'\t************* FAILED *************') + ksft.print_msg(f"\tFound {kcore_insn.bytes} {kcore_insn.mnemonic} {operand}") + ksft.print_msg(f'\t**********************************') + tests_failed += 1 + except Exception as e: + ksft.print_msg(f"UNKNOWN: An unexpected error occurred: {e}") + tests_unknown += 1 + +ksft.print_msg(f"\n\nSummary:") +ksft.print_msg(f"PASS: \t{tests_passed} \t/ {total_retpoline_tests}") +ksft.print_msg(f"FAIL: \t{tests_failed} \t/ {total_retpoline_tests}") +ksft.print_msg(f"UNKNOWN: \t{tests_unknown} \t/ {total_retpoline_tests}") + +if tests_failed == 0: + ksft.test_result_pass("All ITS return thunk sites passed") +else: + ksft.test_result_fail(f"{tests_failed} ITS return thunk sites failed") +ksft.finished() --- /dev/null +++ b/tools/testing/selftests/x86/bugs/its_permutations.py @@ -0,0 +1,109 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (c) 2025 Intel Corporation +# +# Test for indirect target selection (ITS) cmdline permutations with other bugs +# like spectre_v2 and retbleed. 
+ +import os, sys, subprocess, itertools, re, shutil + +test_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.insert(0, test_dir + '/../../kselftest') +import ksft +import common as c + +bug = "indirect_target_selection" +mitigation = c.get_sysfs(bug) + +if not mitigation or "Not affected" in mitigation: + ksft.test_result_skip("Skipping its_permutations.py: not applicable") + ksft.finished() + +if shutil.which('vng') is None: + ksft.test_result_skip("Skipping its_permutations.py: virtme-ng ('vng') not found in PATH.") + ksft.finished() + +TEST = f"{test_dir}/its_sysfs.py" +default_kparam = ['clearcpuid=hypervisor', 'panic=5', 'panic_on_warn=1', 'oops=panic', 'nmi_watchdog=1', 'hung_task_panic=1'] + +DEBUG = " -v " + +# Install dependencies +# https://github.com/arighi/virtme-ng +# apt install virtme-ng +BOOT_CMD = f"vng --run {test_dir}/../../../../../arch/x86/boot/bzImage " +#BOOT_CMD += DEBUG + +bug = "indirect_target_selection" + +input_options = { + 'indirect_target_selection' : ['off', 'on', 'stuff', 'vmexit'], + 'retbleed' : ['off', 'stuff', 'auto'], + 'spectre_v2' : ['off', 'on', 'eibrs', 'retpoline', 'ibrs', 'eibrs,retpoline'], +} + +def pretty_print(output): + OKBLUE = '\033[94m' + OKGREEN = '\033[92m' + WARNING = '\033[93m' + FAIL = '\033[91m' + ENDC = '\033[0m' + BOLD = '\033[1m' + + # Define patterns and their corresponding colors + patterns = { + r"^ok \d+": OKGREEN, + r"^not ok \d+": FAIL, + r"^# Testing .*": OKBLUE, + r"^# Found: .*": WARNING, + r"^# Totals: .*": BOLD, + r"pass:([1-9]\d*)": OKGREEN, + r"fail:([1-9]\d*)": FAIL, + r"skip:([1-9]\d*)": WARNING, + } + + # Apply colors based on patterns + for pattern, color in patterns.items(): + output = re.sub(pattern, lambda match: f"{color}{match.group(0)}{ENDC}", output, flags=re.MULTILINE) + + print(output) + +combinations = list(itertools.product(*input_options.values())) +ksft.print_header() +ksft.set_plan(len(combinations)) + +logs = "" + +for combination in combinations: + append = "" 
+ log = "" + for p in default_kparam: + append += f' --append={p}' + command = BOOT_CMD + append + test_params = "" + for i, key in enumerate(input_options.keys()): + param = f'{key}={combination[i]}' + test_params += f' {param}' + command += f" --append={param}" + command += f" -- {TEST}" + test_name = f"{bug} {test_params}" + pretty_print(f'# Testing {test_name}') + t = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT) + t.wait() + output, _ = t.communicate() + if t.returncode == 0: + ksft.test_result_pass(test_name) + else: + ksft.test_result_fail(test_name) + output = output.decode() + log += f" {output}" + pretty_print(log) + logs += output + "\n" + +# Optionally use tappy to parse the output +# apt install python3-tappy +with open("logs.txt", "w") as f: + f.write(logs) + +ksft.finished() --- /dev/null +++ b/tools/testing/selftests/x86/bugs/its_ret_alignment.py @@ -0,0 +1,139 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (c) 2025 Intel Corporation +# +# Test for indirect target selection (ITS) mitigation. +# +# Tests if the RETs are correctly patched by evaluating the +# vmlinux .return_sites in /proc/kcore. 
+# +# Install dependencies +# add-apt-repository ppa:michel-slm/kernel-utils +# apt update +# apt install -y python3-drgn python3-pyelftools python3-capstone +# +# Run on target machine +# mkdir -p /usr/lib/debug/lib/modules/$(uname -r) +# cp $VMLINUX /usr/lib/debug/lib/modules/$(uname -r)/vmlinux +# +# Usage: ./its_ret_alignment.py + +import os, sys, argparse +from pathlib import Path + +this_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.insert(0, this_dir + '/../../kselftest') +import ksft +import common as c + +bug = "indirect_target_selection" +mitigation = c.get_sysfs(bug) +if not mitigation or "Aligned branch/return thunks" not in mitigation: + ksft.test_result_skip("Skipping its_ret_alignment.py: Aligned branch/return thunks not enabled") + ksft.finished() + +c.check_dependencies_or_skip(['drgn', 'elftools', 'capstone'], script_name="its_ret_alignment.py") + +from elftools.elf.elffile import ELFFile +from drgn.helpers.common.memory import identify_address + +cap = c.init_capstone() + +if len(os.sys.argv) > 1: + arg_vmlinux = os.sys.argv[1] + if not os.path.exists(arg_vmlinux): + ksft.test_result_fail(f"its_ret_alignment.py: vmlinux not found at user-supplied path: {arg_vmlinux}") + ksft.exit_fail() + os.makedirs(f"/usr/lib/debug/lib/modules/{os.uname().release}", exist_ok=True) + os.system(f'cp {arg_vmlinux} /usr/lib/debug/lib/modules/$(uname -r)/vmlinux') + +vmlinux = f"/usr/lib/debug/lib/modules/{os.uname().release}/vmlinux" +if not os.path.exists(vmlinux): + ksft.test_result_fail(f"its_ret_alignment.py: vmlinux not found at {vmlinux}") + ksft.exit_fail() + +ksft.print_msg(f"Using vmlinux: {vmlinux}") + +rethunks_start_vmlinux, rethunks_sec_offset, size = c.get_section_info(vmlinux, '.return_sites') +ksft.print_msg(f"vmlinux: Section .return_sites (0x{rethunks_start_vmlinux:x}) found at 0x{rethunks_sec_offset:x} with size 0x{size:x}") + +sites_offset = c.get_patch_sites(vmlinux, rethunks_sec_offset, size) +total_rethunk_tests = 
len(sites_offset) +ksft.print_msg(f"Found {total_rethunk_tests} rethunk sites") + +prog = c.get_runtime_kernel() +rethunks_start_kcore = prog.symbol('__return_sites').address +ksft.print_msg(f'kcore: __rethunk_sites: 0x{rethunks_start_kcore:x}') + +its_return_thunk = prog.symbol('its_return_thunk').address +ksft.print_msg(f'kcore: its_return_thunk: 0x{its_return_thunk:x}') + +tests_passed = 0 +tests_failed = 0 +tests_unknown = 0 +tests_skipped = 0 + +with open(vmlinux, 'rb') as f: + elffile = ELFFile(f) + text_section = elffile.get_section_by_name('.text') + + for i in range(len(sites_offset)): + site = rethunks_start_kcore + sites_offset[i] + vmlinux_site = rethunks_start_vmlinux + sites_offset[i] + try: + passed = unknown = failed = skipped = False + + symbol = identify_address(prog, site) + vmlinux_insn = c.get_instruction_from_vmlinux(elffile, text_section, text_section['sh_addr'], vmlinux_site) + kcore_insn = list(cap.disasm(prog.read(site, 16), site))[0] + + insn_end = site + kcore_insn.size - 1 + + safe_site = insn_end & 0x20 + site_status = "" if safe_site else "(unsafe)" + + ksft.print_msg(f"\nSite {i}: {symbol} <0x{site:x}> {site_status}") + ksft.print_msg(f"\tvmlinux: 0x{vmlinux_insn.address:x}:\t{vmlinux_insn.mnemonic}\t{vmlinux_insn.op_str}") + ksft.print_msg(f"\tkcore: 0x{kcore_insn.address:x}:\t{kcore_insn.mnemonic}\t{kcore_insn.op_str}") + + if safe_site: + tests_passed += 1 + passed = True + ksft.print_msg(f"\tPASSED: At safe address") + continue + + if "jmp" in kcore_insn.mnemonic: + passed = True + elif "ret" not in kcore_insn.mnemonic: + skipped = True + + if passed: + ksft.print_msg(f"\tPASSED: Found {kcore_insn.mnemonic} {kcore_insn.op_str}") + tests_passed += 1 + elif skipped: + ksft.print_msg(f"\tSKIPPED: Found '{kcore_insn.mnemonic}'") + tests_skipped += 1 + elif unknown: + ksft.print_msg(f"UNKNOWN: An unknown instruction: {kcore_insn}") + tests_unknown += 1 + else: + ksft.print_msg(f'\t************* FAILED *************') + 
ksft.print_msg(f"\tFound {kcore_insn.mnemonic} {kcore_insn.op_str}") + ksft.print_msg(f'\t**********************************') + tests_failed += 1 + except Exception as e: + ksft.print_msg(f"UNKNOWN: An unexpected error occurred: {e}") + tests_unknown += 1 + +ksft.print_msg(f"\n\nSummary:") +ksft.print_msg(f"PASSED: \t{tests_passed} \t/ {total_rethunk_tests}") +ksft.print_msg(f"FAILED: \t{tests_failed} \t/ {total_rethunk_tests}") +ksft.print_msg(f"SKIPPED: \t{tests_skipped} \t/ {total_rethunk_tests}") +ksft.print_msg(f"UNKNOWN: \t{tests_unknown} \t/ {total_rethunk_tests}") + +if tests_failed == 0: + ksft.test_result_pass("All ITS return thunk sites passed.") +else: + ksft.test_result_fail(f"{tests_failed} failed sites need ITS return thunks.") +ksft.finished() --- /dev/null +++ b/tools/testing/selftests/x86/bugs/its_sysfs.py @@ -0,0 +1,65 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (c) 2025 Intel Corporation +# +# Test for Indirect Target Selection(ITS) mitigation sysfs status. 
+ +import sys, os, re +this_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.insert(0, this_dir + '/../../kselftest') +import ksft + +from common import * + +bug = "indirect_target_selection" +mitigation = get_sysfs(bug) + +ITS_MITIGATION_ALIGNED_THUNKS = "Mitigation: Aligned branch/return thunks" +ITS_MITIGATION_RETPOLINE_STUFF = "Mitigation: Retpolines, Stuffing RSB" +ITS_MITIGATION_VMEXIT_ONLY = "Mitigation: Vulnerable, KVM: Not affected" +ITS_MITIGATION_VULNERABLE = "Vulnerable" + +def check_mitigation(): + if mitigation == ITS_MITIGATION_ALIGNED_THUNKS: + if cmdline_has(f'{bug}=stuff') and sysfs_has("spectre_v2", "Retpolines"): + bug_check_fail(bug, ITS_MITIGATION_ALIGNED_THUNKS, ITS_MITIGATION_RETPOLINE_STUFF) + return + if cmdline_has(f'{bug}=vmexit') and cpuinfo_has('its_native_only'): + bug_check_fail(bug, ITS_MITIGATION_ALIGNED_THUNKS, ITS_MITIGATION_VMEXIT_ONLY) + return + bug_check_pass(bug, ITS_MITIGATION_ALIGNED_THUNKS) + return + + if mitigation == ITS_MITIGATION_RETPOLINE_STUFF: + if cmdline_has(f'{bug}=stuff') and sysfs_has("spectre_v2", "Retpolines"): + bug_check_pass(bug, ITS_MITIGATION_RETPOLINE_STUFF) + return + if sysfs_has('retbleed', 'Stuffing'): + bug_check_pass(bug, ITS_MITIGATION_RETPOLINE_STUFF) + return + bug_check_fail(bug, ITS_MITIGATION_RETPOLINE_STUFF, ITS_MITIGATION_ALIGNED_THUNKS) + + if mitigation == ITS_MITIGATION_VMEXIT_ONLY: + if cmdline_has(f'{bug}=vmexit') and cpuinfo_has('its_native_only'): + bug_check_pass(bug, ITS_MITIGATION_VMEXIT_ONLY) + return + bug_check_fail(bug, ITS_MITIGATION_VMEXIT_ONLY, ITS_MITIGATION_ALIGNED_THUNKS) + + if mitigation == ITS_MITIGATION_VULNERABLE: + if sysfs_has("spectre_v2", "Vulnerable"): + bug_check_pass(bug, ITS_MITIGATION_VULNERABLE) + else: + bug_check_fail(bug, "Mitigation", ITS_MITIGATION_VULNERABLE) + + bug_status_unknown(bug, mitigation) + return + +ksft.print_header() +ksft.set_plan(1) +ksft.print_msg(f'{bug}: {mitigation} ...') + +if not basic_checks_sufficient(bug, 
mitigation): + check_mitigation() + +ksft.finished()
Hi!
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
We are getting errors here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/18...
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/1001053...
arch/x86/kernel/alternative.c: In function 'its_fini_mod':
arch/x86/kernel/alternative.c:174:32: error: invalid use of undefined type 'struct module'
  174 |         for (int i = 0; i < mod->its_num_pages; i++) {
      |                                ^~
arch/x86/kernel/alternative.c:175:33: error: invalid use of undefined type 'struct module'
  175 |                 void *page = mod->its_page_array[i];
      |                                 ^~
...
6.12 has same problem, likely 6.6 too.
Best regards, Pavel
On Mon, 12 May 2025 19:37:30 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.14.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.14.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v6.14:
    10 builds:	10 pass, 0 fail
    28 boots:	28 pass, 0 fail
    116 tests:	116 pass, 0 fail
Linux version:	6.14.7-rc1-g4f7f8fb4f8e3
Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
		tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000,
		tegra194-p3509-0000+p3668-0000, tegra20-ventana,
		tegra210-p2371-2180, tegra210-p3450-0000,
		tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On 25/05/12 07:37PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
Hello everyone,
I have noticed that the following commit produces a whole bunch of lines in my journal, which look like errors to me:
Wayne Lin Wayne.Lin@amd.com drm/amd/display: Fix wrong handling for AUX_DEFER case
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
this does not seem to be serious, i.e. the system otherwise works as intended but it's still noteworthy. Is there a dependency commit missing maybe? From the code it looks like it was meant to be this way 🤔
You can find a full journal here, with the logspammed parts in highlight: https://gist.github.com/christian-heusel/e8418bbdca097871489a31d79ed166d6#fi...
Cheers, Chris
Yeah getting that too
[ 21.463202] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.464700] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.466133] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.467631] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.469127] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.470631] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.472127] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.473624] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.475130] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.476631] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.478127] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.479624] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.481126] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.482623] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.484130] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.485630] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.487127] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.488630] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.490125] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.491633] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.493120] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.494642] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.496128] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.497632] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.499128] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.500633] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.502130] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.503631] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.505126] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.506629] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.508127] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 21.509647] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
[ 22.259286] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.259935] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.260583] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.261234] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.261883] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.262533] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.263185] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.263835] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.264481] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.265128] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.265771] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.266323] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.266970] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.267616] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.268270] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.268918] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.269567] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.270213] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.270857] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.271506] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.272154] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.272802] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.273450] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.274097] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.274745] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.275393] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.276039] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.276682] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.277274] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.277916] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.278563] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 22.279210] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.335457] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.336103] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.336745] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.337387] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.338029] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.338676] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.339271] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.339922] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.340570] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.341216] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.341864] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.342512] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.343159] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.343806] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.344456] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.345279] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.345929] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.346579] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.347232] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.347878] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.348526] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.349173] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.349816] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.350458] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.351100] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.351743] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.352390] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.353039] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.353685] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.354273] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.354920] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
[ 27.355567] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4
But other than that, it seems to work.
On Tue, 13 May 2025 at 07:26, Christian Heusel christian@heusel.eu wrote:
On 25/05/12 07:37PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
Hello everyone,
I have noticed that the following commit produces a whole bunch of lines in my journal, which look like errors to me:
Wayne Lin Wayne.Lin@amd.com drm/amd/display: Fix wrong handling for AUX_DEFER case
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
this does not seem to be serious, i.e. the system otherwise works as intended but it's still noteworthy. Is there a dependency commit missing maybe? From the code it looks like it was meant to be this way 🤔
You can find a full journal here, with the logspammed parts in highlight: https://gist.github.com/christian-heusel/e8418bbdca097871489a31d79ed166d6#fi...
Cheers, Chris
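For anyone triaging a journal like the ones above, a small script can collapse the repeated lines into counts. This is only a sketch: the `summarize` helper and the timestamp pattern are illustrative assumptions, not part of any kernel or systemd tooling, and the sample lines are copied from the logs quoted in this thread.

```python
import re
from collections import Counter

# Matches a leading dmesg-style timestamp such as "[ 21.463202] ".
# Lines without a timestamp (plain journal output) are left unchanged.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def summarize(lines):
    """Count identical kernel messages, ignoring their timestamps."""
    counts = Counter()
    for line in lines:
        message = TIMESTAMP.sub("", line.strip())
        if message:
            counts[message] += 1
    return counts


# Sample input taken from the log excerpts in this thread.
sample = [
    "[ 21.463202] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.",
    "[ 21.464700] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.",
    "[ 22.259286] amdgpu 0000:08:00.0: amdgpu: [drm] amdgpu: DP AUX transfer fail:4",
]

for message, count in summarize(sample).most_common():
    print(f"{count:4d}x {message}")
```

In practice the input would come from a saved `dmesg` or `journalctl -k` dump read line by line instead of the hard-coded `sample` list.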
On 5/12/2025 7:26 PM, Christian Heusel wrote:
On 25/05/12 07:37PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
Hello everyone,
I have noticed that the following commit produces a whole bunch of lines in my journal, which look like errors to me:
Wayne Lin Wayne.Lin@amd.com drm/amd/display: Fix wrong handling for AUX_DEFER case
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX reply command not ACK: 0x01.
amdgpu 0000:04:00.0: amdgpu: [drm] amdgpu: AUX partially written
this does not seem to be serious, i.e. the system otherwise works as intended but it's still noteworthy. Is there a dependency commit missing maybe? From the code it looks like it was meant to be this way 🤔
You can find a full journal here, with the logspammed parts in highlight: https://gist.github.com/christian-heusel/e8418bbdca097871489a31d79ed166d6#fi...
Cheers, Chris
Here's the fix:
https://lore.kernel.org/amd-gfx/CADnq5_MrUPvFVTkMixCuhFqpEuk+cKQpXJPBBBpaVwq...
Thanks,
On 5/12/25 10:37, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.14.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.14.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Mon, May 12, 2025 at 07:37:30PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Tested-by: Mark Brown broonie@kernel.org
On Mon, 12 May 2025 at 18:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.14.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.14.y and the diffstat can be found below.
thanks,
greg k-h
Regressions on mips: the defconfig, tinyconfig and allnoconfig builds failed with the clang-20 toolchain on stable-rc 6.14.7-rc1, 6.12.29-rc1 and 6.6.91-rc1. The builds pass with gcc-13.
* mips, build
  - clang-20-allnoconfig
  - clang-20-defconfig
  - clang-20-tinyconfig
  - korg-clang-20-lkftconfig-hardening
  - korg-clang-20-lkftconfig-lto-full
  - korg-clang-20-lkftconfig-lto-thing
Regression Analysis:
- New regression? Yes
- Reproducibility? Yes
Build regression: mips defconfig clang-20 instantiation error expected an immediate
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
## Build error mips
<instantiation>:7:11: error: expected an immediate
 ori $26, r4k_wait_idle_size - 2
          ^
<instantiation>:10:13: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
            ^
<instantiation>:10:29: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
                            ^
<instantiation>:7:11: error: expected an immediate
 ori $26, r4k_wait_idle_size - 2
          ^
<instantiation>:10:13: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
            ^
<instantiation>:10:29: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
                            ^
arch/mips/kernel/genex.S:531:2: warning: macro defined with named parameters which are not used in macro body, possible positional parameter found in body which will have no effect
 .macro __BUILD_verbose nexception
 ^

## Build mips
* Build log: https://qa-reports.linaro.org/api/testruns/28409657/log_file/
* Build history: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.14.y/build/v6.14....
* Build details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.14.y/build/v6.14....
* Build link: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/builds/2x0SR9ZL9r...
* Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2x0SR9ZL9r6xvF3HT3Ugk...
* Toolchain: clang-20

## Steps to reproduce
- tuxmake --runtime podman --target-arch mips --toolchain clang-20 --kconfig defconfig LLVM=1 LLVM_IAS=1

## Build
* kernel: 6.14.7-rc1
* git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
* git commit: 4f7f8fb4f8e35798b197be0b6b13229aa1864da1
* git describe: v6.14.6-198-g4f7f8fb4f8e3
* test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.14.y/build/v6.14....

## Test Regressions (compared to v6.14.5-184-ga33747967783)
* mips, build
  - clang-20-allnoconfig
  - clang-20-defconfig
  - clang-20-tinyconfig
  - korg-clang-20-lkftconfig-hardening
  - korg-clang-20-lkftconfig-lto-full
  - korg-clang-20-lkftconfig-lto-thing
## Metric Regressions (compared to v6.14.5-184-ga33747967783)
## Test Fixes (compared to v6.14.5-184-ga33747967783)
## Metric Fixes (compared to v6.14.5-184-ga33747967783)
## Test result summary
total: 155299, pass: 129321, fail: 5951, skip: 19411, xfail: 616

## Build Summary
* arc: 5 total, 5 passed, 0 failed
* arm: 139 total, 137 passed, 2 failed
* arm64: 56 total, 55 passed, 1 failed
* i386: 18 total, 16 passed, 2 failed
* mips: 34 total, 27 passed, 7 failed
* parisc: 4 total, 4 passed, 0 failed
* powerpc: 40 total, 40 passed, 0 failed
* riscv: 25 total, 22 passed, 3 failed
* s390: 22 total, 22 passed, 0 failed
* sh: 5 total, 5 passed, 0 failed
* sparc: 4 total, 3 passed, 1 failed
* x86_64: 49 total, 42 passed, 7 failed

## Test suites summary
* boot
* commands
* kselftest-arm64
* kselftest-breakpoints
* kselftest-capabilities
* kselftest-cgroup
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-cpufreq
* kselftest-efivarfs
* kselftest-exec
* kselftest-fpu
* kselftest-ftrace
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-kcmp
* kselftest-kvm
* kselftest-livepatch
* kselftest-membarrier
* kselftest-memfd
* kselftest-mincore
* kselftest-mm
* kselftest-mqueue
* kselftest-net
* kselftest-net-mptcp
* kselftest-openat2
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-rust
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-tc-testing
* kselftest-timers
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user_events
* kselftest-vDSO
* kselftest-x86
* kunit
* kvm-unit-tests
* lava
* libgpiod
* libhugetlbfs
* log-parser-boot
* log-parser-build-clang
* log-parser-build-gcc
* log-parser-test
* ltp-capability
* ltp-commands
* ltp-containers
* ltp-controllers
* ltp-cpuhotplug
* ltp-crypto
* ltp-cve
* ltp-dio
* ltp-fcntl-locktests
* ltp-fs
* ltp-fs_bind
* ltp-fs_perms_simple
* ltp-hugetlb
* ltp-ipc
* ltp-math
* ltp-mm
* ltp-nptl
* ltp-pty
* ltp-sched
* ltp-smoke
* ltp-syscalls
* ltp-tracing
* perf
* rcutorture
* rt-tests-cyclicdeadline
* rt-tests-pi-stress
* rt-tests-pmqtest
* rt-tests-rt-migrate-test
* rt-tests-signaltest

--
Linaro LKFT
https://lkft.linaro.org
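The per-architecture numbers in the Build Summary above can be tallied with a short script. This is a minimal sketch, assuming the summary text is pasted in verbatim; the `parse_summary` helper and the regular expression are illustrative and not part of the LKFT tooling.

```python
import re

# Build Summary lines copied from the report above.
BUILD_SUMMARY = """\
* arc: 5 total, 5 passed, 0 failed
* arm: 139 total, 137 passed, 2 failed
* arm64: 56 total, 55 passed, 1 failed
* i386: 18 total, 16 passed, 2 failed
* mips: 34 total, 27 passed, 7 failed
* parisc: 4 total, 4 passed, 0 failed
* powerpc: 40 total, 40 passed, 0 failed
* riscv: 25 total, 22 passed, 3 failed
* s390: 22 total, 22 passed, 0 failed
* sh: 5 total, 5 passed, 0 failed
* sparc: 4 total, 3 passed, 1 failed
* x86_64: 49 total, 42 passed, 7 failed
"""

ROW = re.compile(
    r"^\* (\w+): (\d+) total, (\d+) passed, (\d+) failed$", re.MULTILINE
)


def parse_summary(text):
    """Map each architecture to a (total, passed, failed) tuple."""
    return {
        arch: (int(total), int(passed), int(failed))
        for arch, total, passed, failed in ROW.findall(text)
    }


rows = parse_summary(BUILD_SUMMARY)
total = sum(t for t, _, _ in rows.values())
passed = sum(p for _, p, _ in rows.values())
failed = sum(f for _, _, f in rows.values())

# prints: 401 builds: 378 passed, 23 failed
print(f"{total} builds: {passed} passed, {failed} failed")

# List only the architectures that have at least one failed build.
for arch, (_, _, arch_failed) in rows.items():
    if arch_failed:
        print(f"  {arch}: {arch_failed} failed")
```

The same check applies to the test result summary: pass, fail, skip and xfail (129321 + 5951 + 19411 + 616) add up to the reported total of 155299.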
On Tue, 13 May 2025 at 11:40, Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Mon, 12 May 2025 at 18:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.14.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.14.y and the diffstat can be found below.
thanks,
greg k-h
Regressions on mips: the defconfig, tinyconfig and allnoconfig builds failed with the clang-20 toolchain on stable-rc 6.14.7-rc1, 6.12.29-rc1 and 6.6.91-rc1. The builds pass with gcc-13.
- mips, build
- clang-20-allnoconfig
- clang-20-defconfig
- clang-20-tinyconfig
- korg-clang-20-lkftconfig-hardening
- korg-clang-20-lkftconfig-lto-full
- korg-clang-20-lkftconfig-lto-thing
Regression Analysis:
- New regression? Yes
- Reproducibility? Yes
Build regression: mips defconfig clang-20 instantiation error expected an immediate
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
## Build error mips
<instantiation>:7:11: error: expected an immediate
 ori $26, r4k_wait_idle_size - 2
          ^
<instantiation>:10:13: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
            ^
<instantiation>:10:29: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
                            ^
<instantiation>:7:11: error: expected an immediate
 ori $26, r4k_wait_idle_size - 2
          ^
<instantiation>:10:13: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
            ^
<instantiation>:10:29: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
                            ^
The bisection found this as the first bad commit:
MIPS: Fix idle VS timer enqueue
[ Upstream commit 56651128e2fbad80f632f388d6bf1f39c928267a ]
- Naresh Kamboju
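For readers unfamiliar with bisection, the idea behind finding a "first bad commit" can be sketched as a binary search over an ordered history. This is an illustration only: the commit names and the `first_bad` helper are hypothetical, and it is not how `git bisect` or the LKFT infrastructure is implemented.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the first one is
    known good, the last one is known bad, and every commit after the
    first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi]


# Hypothetical history; the names are placeholders, not real commits.
history = ["good-base", "patch-a", "patch-b", "mips-idle-fix", "patch-c", "rc1-tip"]
broken_from = history.index("mips-idle-fix")


def build_fails(commit):
    # Stand-in for "build this commit with clang-20 and check the result".
    return history.index(commit) >= broken_from


print(first_bad(history, build_fails))  # prints: mips-idle-fix
```

Each step halves the remaining range, so a series of 197 patches needs about eight builds to isolate the offending commit.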
On Tue, May 13, 2025 at 02:29:20PM +0100, Naresh Kamboju wrote:
On Tue, 13 May 2025 at 11:40, Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Mon, 12 May 2025 at 18:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.14.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.14.y and the diffstat can be found below.
thanks,
greg k-h
Regressions on mips: the defconfig, tinyconfig and allnoconfig builds failed with the clang-20 toolchain on stable-rc 6.14.7-rc1, 6.12.29-rc1 and 6.6.91-rc1. The builds pass with gcc-13.
- mips, build
- clang-20-allnoconfig
- clang-20-defconfig
- clang-20-tinyconfig
- korg-clang-20-lkftconfig-hardening
- korg-clang-20-lkftconfig-lto-full
- korg-clang-20-lkftconfig-lto-thing
Regression Analysis:
- New regression? Yes
- Reproducibility? Yes
Build regression: mips defconfig clang-20 instantiation error expected an immediate
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
## Build error mips
<instantiation>:7:11: error: expected an immediate
 ori $26, r4k_wait_idle_size - 2
          ^
<instantiation>:10:13: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
            ^
<instantiation>:10:29: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
                            ^
<instantiation>:7:11: error: expected an immediate
 ori $26, r4k_wait_idle_size - 2
          ^
<instantiation>:10:13: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
            ^
<instantiation>:10:29: error: expected an immediate
 addiu $26, r4k_wait_exit - r4k_wait_insn + 2
                            ^
The bisection found this as the first bad commit:
MIPS: Fix idle VS timer enqueue [ Upstream commit 56651128e2fbad80f632f388d6bf1f39c928267a ]
Thanks, now dropped from all queues.
greg k-h
Hi Greg
On Tue, May 13, 2025 at 2:44 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.14.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.14.y and the diffstat can be found below.
thanks,
greg k-h
6.14.7-rc1 tested.
Build successfully completed. Boot successfully completed. No dmesg regressions. Video output normal. Sound output normal.
Lenovo ThinkPad X1 Carbon Gen10 (Intel i7-1260P, x86_64, Arch Linux)
[ 0.000000] Linux version 6.14.7-rc1rv-g4f7f8fb4f8e3 (takeshi@ThinkPadX1Gen10J0764) (gcc (GCC) 15.1.1 20250425, GNU ld (GNU Binutils) 2.44.0) #1 SMP PREEMPT_DYNAMIC Tue May 13 19:59:27 JST 2025
Thanks
Tested-by: Takeshi Ogasawara takeshi.ogasawara@futuring-girl.com
On 5/12/2025 7:37 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.14.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.14.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
* Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
Hi Greg
6.14.7-rc1 compiles, boots and runs here on x86_64 (AMD Ryzen 5 7520U, Slackware64-current), no regressions observed.
Tested-by: Markus Reichelt lkt+2023@mareichelt.com
On 12.05.2025 at 19:37, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Builds, boots and works on my 2-socket Ivy Bridge Xeon E5-2697 v2 server. No dmesg oddities or regressions found.
Tested-by: Peter Schneider pschneider1968@googlemail.com
Best regards, Peter Schneider
On 5/12/25 11:37, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.14.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.14.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 25/05/12 07:37PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.14.7 release. There are 197 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 May 2025 17:19:58 +0000. Anything received after that time might be too late.
As it turns out, my [previous concern][0] was just about a logging facility that is a bit too noisy, so you can add my
Tested-by: Christian Heusel christian@heusel.eu
Tested on a ThinkPad E14 Gen 3 with an AMD Ryzen 5 5500U CPU and on the Steam Deck (LCD variant), as well as on a Framework Desktop.
[0]: https://lore.kernel.org/all/32c592ea-0afd-4753-a81d-73021b8e193c@heusel.eu
linux-stable-mirror@lists.linaro.org