This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.12.7-rc1
Xuewen Yan xuewen.yan@unisoc.com epoll: Add synchronous wakeup support for ep_poll_callback
Usama Arif usamaarif642@gmail.com mm: convert partially_mapped set/clear operations to be atomic
Hugh Dickins hughd@google.com mm: shmem: fix ShmemHugePages at swapout
Kefeng Wang wangkefeng.wang@huawei.com mm: use aligned address in copy_user_gigantic_page()
Kefeng Wang wangkefeng.wang@huawei.com mm: use aligned address in clear_gigantic_page()
Ilya Dryomov idryomov@gmail.com ceph: fix memory leak in ceph_direct_read_write()
Max Kellermann max.kellermann@ionos.com ceph: fix memory leaks in __ceph_sync_read()
Alex Markuze amarkuze@redhat.com ceph: improve error handling and short/overflow-read logic in __ceph_sync_read()
Ilya Dryomov idryomov@gmail.com ceph: validate snapdirname option length when mounting
Max Kellermann max.kellermann@ionos.com ceph: give up on paths longer than PATH_MAX
Zijun Hu quic_zijuhu@quicinc.com of: Fix refcount leakage for OF node returned by __of_get_dma_parent()
Herve Codina herve.codina@bootlin.com of: Fix error path in of_parse_phandle_with_args_map()
Andrea della Porta andrea.porta@suse.com of: address: Preserve the flags portion on 1:1 dma-ranges mapping
Samuel Holland samuel.holland@sifive.com of: property: fw_devlink: Do not use interrupt-parent directly
Jann Horn jannh@google.com udmabuf: also check for F_SEAL_FUTURE_WRITE
Jann Horn jannh@google.com udmabuf: fix racy memfd sealing check
Edward Adam Davis eadavis@qq.com nilfs2: prevent use of deleted inode
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: fix buffer head leaks in calls to truncate_inode_pages()
Heming Zhao heming.zhao@suse.com ocfs2: fix the space leak in LA when releasing LA
Zijun Hu quic_zijuhu@quicinc.com of/irq: Fix using uninitialized variable @addr_len in API of_irq_parse_one()
Zijun Hu quic_zijuhu@quicinc.com of/irq: Fix interrupt-map cell length check in of_irq_parse_imap_parent()
Sean Christopherson seanjc@google.com KVM: SVM: Allow guest writes to set MSR_AMD64_DE_CFG bits
Trond Myklebust trond.myklebust@hammerspace.com NFS/pnfs: Fix a live lock between recalled layouts and layoutget
Pavel Begunkov asml.silence@gmail.com io_uring: check if iowq is killed before queuing
Jann Horn jannh@google.com io_uring: Fix registered ring file refcount leak
Tiezhu Yang yangtiezhu@loongson.cn selftests/bpf: Use asm constraint "m" for LoongArch
Isaac J. Manjarres isaacmanjarres@google.com selftests/memfd: run sysctl tests when PID namespace support is enabled
Steven Rostedt rostedt@goodmis.org tracing: Check "%s" dereference via the field and not the TP_printk format
Steven Rostedt rostedt@goodmis.org tracing: Add "%s" check in test_event_printk()
Steven Rostedt rostedt@goodmis.org tracing: Add missing helper functions in event pointer dereference check
Steven Rostedt rostedt@goodmis.org tracing: Fix test_event_printk() to process entire print argument
Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com accel/ivpu: Fix WARN in ivpu_ipc_send_receive_internal()
Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com accel/ivpu: Fix general protection fault in ivpu_bo_list()
Enzo Matsumiya ematsumiya@suse.de smb: client: fix TCP timers deadlock after rmmod
Sean Christopherson seanjc@google.com KVM: x86: Play nice with protected guests in complete_hypercall_exit()
Naman Jain namjain@linux.microsoft.com x86/hyperv: Fix hv tsc page based sched_clock for hibernation
Dexuan Cui decui@microsoft.com tools: hv: Fix a complier warning in the fcopy uio daemon
Michael Kelley mhklinux@outlook.com Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet
Steven Rostedt rostedt@goodmis.org fgraph: Still initialize idle shadow stacks when starting
Alex Deucher alexander.deucher@amd.com drm/amdgpu/mmhub4.1: fix IP version check
Alex Deucher alexander.deucher@amd.com drm/amdgpu/gfx12: fix IP version check
Alex Deucher alexander.deucher@amd.com drm/amdgpu/nbio7.0: fix IP version check
Heiko Carstens hca@linux.ibm.com s390/mm: Fix DirectMap accounting
Qu Wenruo wqu@suse.com btrfs: tree-checker: reject inline extent items with 0 ref count
Josef Bacik josef@toxicpanda.com btrfs: fix improper generation check in snapshot delete
Christoph Hellwig hch@lst.de btrfs: split bios to the fs sector size boundary
Suren Baghdasaryan surenb@google.com alloc_tag: fix set_codetag_empty() when !CONFIG_MEM_ALLOC_PROFILING_DEBUG
Edward Adam Davis eadavis@qq.com ring-buffer: Fix overflow in __rb_map_vma
David Hildenbrand david@redhat.com mm/page_alloc: don't call pfn_to_page() on possibly non-existent PFN in split_large_buddy()
Matthew Wilcox (Oracle) willy@infradead.org vmalloc: fix accounting with i915
Kairui Song kasong@tencent.com zram: fix uninitialized ZRAM not releasing backing device
Kairui Song kasong@tencent.com zram: refuse to use zero sized block device as backing device
Alex Deucher alexander.deucher@amd.com drm/amdgpu/smu14.0.2: fix IP version check
Alex Deucher alexander.deucher@amd.com drm/amdgpu/nbio7.7: fix IP version check
Alex Deucher alexander.deucher@amd.com drm/amdgpu/nbio7.11: fix IP version check
Steven Rostedt rostedt@goodmis.org trace/ring-buffer: Do not use TP_printk() formatting for boot mapped buffers
Ming Lei ming.lei@redhat.com block: avoid to reuse `hctx` not removed from cpuhp callback list
Murad Masimov m.masimov@maxima.ru hwmon: (tmp513) Fix interpretation of values of Temperature Result and Limit Registers
Murad Masimov m.masimov@maxima.ru hwmon: (tmp513) Fix Current Register value interpretation
Murad Masimov m.masimov@maxima.ru hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers
Pierre-Eric Pelloux-Prayer pierre-eric.pelloux-prayer@amd.com drm/amdgpu: don't access invalid sched
Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com i915/guc: Accumulate active runtime on gt reset
Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com i915/guc: Ensure busyness counter increases motonically
Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com i915/guc: Reset engine utilization buffer before registration
Michael Trimarchi michael@amarulasolutions.com drm/panel: synaptics-r63353: Fix regulator unbalance
Marek Vasut marex@denx.de drm/panel: st7701: Add prepare_prev_first flag to drm_panel
Yang Yingliang yangyingliang@huawei.com drm/panel: novatek-nt35950: fix return value check in nt35950_probe()
Zhang Zekun zhangzekun11@huawei.com drm/panel: himax-hx83102: Add a check to prevent NULL pointer dereference
T.J. Mercier tjmercier@google.com dma-buf: Fix __dma_buf_debugfs_list_del argument for !CONFIG_DEBUG_FS
Jann Horn jannh@google.com udmabuf: fix memory leak on last export_udmabuf() error path
Huan Yang link@vivo.com udmabuf: udmabuf_create pin folio codestyle cleanup
Michel Dänzer mdaenzer@redhat.com drm/amdgpu: Handle NULL bo->tbo.resource (again) in amdgpu_vm_bo_update
Christian König christian.koenig@amd.com drm/amdgpu: fix amdgpu_coredump
Ville Syrjälä ville.syrjala@linux.intel.com drm/modes: Avoid divide by zero harder in drm_mode_vrefresh()
Mario Limonciello mario.limonciello@amd.com drm/amd: Update strapping for NBIO 2.5.0
Krzysztof Karas krzysztof.karas@intel.com drm/display: use ERR_PTR on DP tunnel manager creation fail
Mario Limonciello mario.limonciello@amd.com thunderbolt: Don't display nvm_version unless upgrade supported
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Improve redrive mode handling
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Add support for Intel Panther Lake-M/P
Mathias Nyman mathias.nyman@linux.intel.com xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic
Daniele Palmas dnlplm@gmail.com USB: serial: option: add Telit FE910C04 rmnet compositions
Jack Wu wojackbb@gmail.com USB: serial: option: add MediaTek T7XX compositions
Mank Wang mank.wang@netprisma.com USB: serial: option: add Netprisma LCUK54 modules for WWAN Ready
Michal Hrusecky michal.hrusecky@turris.com USB: serial: option: add MeiG Smart SLM770A
Daniel Swanemar d.swanemar@gmail.com USB: serial: option: add TCL IK512 MBIM & ECM
Nathan Chancellor nathan@kernel.org hexagon: Disable constant extender optimization for LLVM prior to 19.1.0
James Bottomley James.Bottomley@HansenPartnership.com efivarfs: Fix error on non-existent file
Geert Uytterhoeven geert+renesas@glider.be i2c: riic: Always round-up when calculating bus period
Ming Lei ming.lei@redhat.com block: Revert "block: Fix potential deadlock while freezing queue and acquiring sysfs_lock"
Jeremy Kerr jk@codeconstruct.com.au net: mctp: handle skb cleanup on sock_queue failures
Dan Carpenter dan.carpenter@linaro.org chelsio/chtls: prevent potential integer overflow on 32bit
Eric Dumazet edumazet@google.com net: tun: fix tun_napi_alloc_frags()
Sean Christopherson seanjc@google.com KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init
Marc Zyngier maz@kernel.org KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden
Borislav Petkov (AMD) bp@alien8.de EDAC/amd64: Simplify ECC check on unified memory controllers
Marc Zyngier maz@kernel.org irqchip/gic-v3: Work around insecure GIC integrations
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp mmc: mtk-sd: disable wakeup in .remove() and in the error path of .probe()
Prathamesh Shete pshete@nvidia.com mmc: sdhci-tegra: Remove SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp net: mdiobus: fix an OF node reference leak
Adrian Moreno amorenoz@redhat.com psample: adjust size if rate_as_probability is set
Jakub Kicinski kuba@kernel.org netdev-genl: avoid empty messages in queue dump
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: restore dsa_software_vlan_untag() ability to operate on VLAN-untagged traffic
Adrian Moreno amorenoz@redhat.com selftests: openvswitch: fix tcpdump execution
Phil Sutter phil@nwl.cc netfilter: ipset: Fix for recursive locking warning
David Laight David.Laight@ACULAB.COM ipvs: Fix clamp() of ip_vs_conn_tab on small memory systems
Matthias Schiffer matthias.schiffer@ew.tq-group.com can: m_can: fix missed interrupts with m_can_pci
Matthias Schiffer matthias.schiffer@ew.tq-group.com can: m_can: set init flag earlier in probe
Eric Dumazet edumazet@google.com net: netdevsim: fix nsim_pp_hold_write()
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp net: ethernet: bgmac-platform: fix an OF node reference leak
Parthiban Veerasooran parthiban.veerasooran@microchip.com net: ethernet: oa_tc6: fix tx skb race condition between reference pointers
Parthiban Veerasooran parthiban.veerasooran@microchip.com net: ethernet: oa_tc6: fix infinite loop error when tx credits becomes 0
Dan Carpenter dan.carpenter@linaro.org net: hinic: Fix cleanup in create_rxqs/txqs()
Daniel Borkmann daniel@iogearbox.net team: Fix feature exposure when no ports are present
Jakub Kicinski kuba@kernel.org netdev: fix repeated netlink messages in queue stats
Jakub Kicinski kuba@kernel.org netdev: fix repeated netlink messages in queue dump
Marios Makassikis mmakassikis@freebox.fr ksmbd: fix broken transfers when exceeding max simultaneous operations
Marios Makassikis mmakassikis@freebox.fr ksmbd: count all requests in req_running counter
Nikita Yushchenko nikita.yoush@cogentembedded.com net: renesas: rswitch: rework ts tags management
Shannon Nelson shannon.nelson@amd.com ionic: use ee->offset when returning sprom data
Shannon Nelson shannon.nelson@amd.com ionic: no double destroy workqueue
Brett Creeley brett.creeley@amd.com ionic: Fix netdev notifier unregister on failure
Donald Hunter donald.hunter@gmail.com tools/net/ynl: fix sub-message key lookup for nested attributes
Eric Dumazet edumazet@google.com netdevsim: prevent bad user input in nsim_dev_health_break_write()
Vladimir Oltean vladimir.oltean@nxp.com net: mscc: ocelot: fix incorrect IFH SRC_PORT field in ocelot_ifh_set_basic()
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check return value of sock_recvmsg when draining clc data
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check smcd_v2_ext_offset when receiving proposal msg
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt when receiving proposal msg
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check sndbuf_space again after NOSPACE flag is set in smc_poll
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: protect link down work from execute after lgr freed
Huaisheng Ye huaisheng.ye@intel.com cxl/region: Fix region creation for greater than x2 switches
Davidlohr Bueso dave@stgolabs.net cxl/pci: Fix potential bogus return value upon successful probing
Olaf Hering olaf@aepfle.de tools: hv: change permissions of NetworkManager configuration file
Darrick J. Wong djwong@kernel.org xfs: fix zero byte checking in the superblock scrubber
Darrick J. Wong djwong@kernel.org xfs: fix sb_spino_align checks for large fsblock sizes
Darrick J. Wong djwong@kernel.org xfs: fix off-by-one error in fsmap's end_daddr usage
Dave Chinner dchinner@redhat.com xfs: fix sparse inode limits on runt AG
Dave Chinner dchinner@redhat.com xfs: sb_spino_align is not verified
Gao Xiang xiang@kernel.org erofs: use buffered I/O for file-backed mounts by default
Gao Xiang xiang@kernel.org erofs: reference `struct erofs_device_info` for erofs_map_dev
Gao Xiang xiang@kernel.org erofs: use `struct erofs_device_info` for the primary device
Gao Xiang xiang@kernel.org erofs: add erofs_sb_free() helper
Vasily Gorbik gor@linux.ibm.com s390/mm: Consider KMSAN modules metadata for paging levels
Vineeth Pillai (Google) vineeth@bitbyteword.org sched/dlserver: Fix dlserver time accounting
Vineeth Pillai (Google) vineeth@bitbyteword.org sched/dlserver: Fix dlserver double enqueue
Gao Xiang xiang@kernel.org erofs: fix PSI memstall accounting
Alexander Gordeev agordeev@linux.ibm.com s390/ipl: Fix never less than zero warning
Vladimir Riabchun ferr.lambarginio@gmail.com i2c: pnx: Fix timeout in wait functions
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Do not scan and remove the P2SB device when it is unhidden
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Move P2SB hide and unhide code to p2sb_scan_and_cache()
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Introduce the global flag p2sb_hidden_by_bios
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Factor out p2sb_read_from_cache()
Peter Zijlstra peterz@infradead.org sched/eevdf: More PELT vs DELAYED_DEQUEUE
Vincent Guittot vincent.guittot@linaro.org sched/fair: Fix sched_can_stop_tick() for fair tasks
K Prateek Nayak kprateek.nayak@amd.com sched/fair: Fix NEXT_BUDDY
Michael Neuling michaelneuling@tenstorrent.com RISC-V: KVM: Fix csr_write -> csr_set for HVIEN PMU overflow bit
Levi Yun yeoreum.yun@arm.com firmware: arm_ffa: Fix the race around setting ffa_dev->properties
Arnd Bergmann arnd@arndb.de firmware: arm_scmi: Fix i.MX build dependency
Russell King (Oracle) rmk+kernel@armlinux.org.uk net: stmmac: fix TSO DMA API usage causing oops
Lion Ackermann nnamrec@gmail.com net: sched: fix ordering of qlen adjustment
-------------
Diffstat:
Makefile | 4 +- arch/arm64/kvm/sys_regs.c | 3 +- arch/hexagon/Makefile | 6 + arch/riscv/kvm/aia.c | 2 +- arch/s390/boot/startup.c | 2 + arch/s390/boot/vmem.c | 6 +- arch/s390/kernel/ipl.c | 2 +- arch/x86/kernel/cpu/mshyperv.c | 58 +++++ arch/x86/kvm/cpuid.c | 31 ++- arch/x86/kvm/cpuid.h | 1 + arch/x86/kvm/svm/svm.c | 9 - arch/x86/kvm/x86.c | 4 +- block/blk-mq-sysfs.c | 16 +- block/blk-mq.c | 40 ++-- block/blk-sysfs.c | 4 +- drivers/accel/ivpu/ivpu_gem.c | 2 +- drivers/accel/ivpu/ivpu_pm.c | 2 +- drivers/block/zram/zram_drv.c | 15 +- drivers/clocksource/hyperv_timer.c | 14 +- drivers/cxl/core/region.c | 25 +- drivers/cxl/pci.c | 3 +- drivers/dma-buf/dma-buf.c | 2 +- drivers/dma-buf/udmabuf.c | 180 ++++++++------ drivers/edac/amd64_edac.c | 32 +-- drivers/firmware/arm_ffa/bus.c | 15 +- drivers/firmware/arm_ffa/driver.c | 7 +- drivers/firmware/arm_scmi/vendors/imx/Kconfig | 1 + drivers/firmware/imx/Kconfig | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 5 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c | 11 + drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c | 2 +- drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c | 2 +- .../gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c | 2 +- drivers/gpu/drm/display/drm_dp_tunnel.c | 10 +- drivers/gpu/drm/drm_modes.c | 11 +- drivers/gpu/drm/i915/gt/intel_engine_types.h | 5 + drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 41 +++- drivers/gpu/drm/panel/panel-himax-hx83102.c | 2 + drivers/gpu/drm/panel/panel-novatek-nt35950.c | 4 +- drivers/gpu/drm/panel/panel-sitronix-st7701.c | 1 + drivers/gpu/drm/panel/panel-synaptics-r63353.c | 2 +- drivers/hv/hv_kvp.c | 6 + drivers/hv/hv_snapshot.c | 6 + drivers/hv/hv_util.c | 9 + drivers/hv/hyperv_vmbus.h | 2 + drivers/hwmon/tmp513.c | 10 +- drivers/i2c/busses/i2c-pnx.c | 4 +- drivers/i2c/busses/i2c-riic.c | 2 +- drivers/irqchip/irq-gic-v3.c | 17 +- drivers/mmc/host/mtk-sd.c | 2 + drivers/mmc/host/sdhci-tegra.c | 1 - drivers/net/can/m_can/m_can.c | 36 ++- drivers/net/can/m_can/m_can.h | 1 + drivers/net/can/m_can/m_can_pci.c | 1 + drivers/net/ethernet/broadcom/bgmac-platform.c | 5 +- .../chelsio/inline_crypto/chtls/chtls_main.c | 5 +- drivers/net/ethernet/huawei/hinic/hinic_main.c | 2 + drivers/net/ethernet/mscc/ocelot.c | 2 +- drivers/net/ethernet/oa_tc6.c | 11 +- drivers/net/ethernet/pensando/ionic/ionic_dev.c | 5 +- .../net/ethernet/pensando/ionic/ionic_ethtool.c | 4 +- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 4 +- drivers/net/ethernet/renesas/rswitch.c | 68 +++--- drivers/net/ethernet/renesas/rswitch.h | 13 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 +- drivers/net/mdio/fwnode_mdio.c | 13 +- drivers/net/netdevsim/health.c | 2 + drivers/net/netdevsim/netdev.c | 4 +- drivers/net/team/team_core.c | 10 +- drivers/net/tun.c | 2 +- drivers/of/address.c | 5 +- drivers/of/base.c | 15 +- drivers/of/irq.c | 2 + drivers/of/property.c | 2 - drivers/platform/x86/p2sb.c | 79 ++++-- drivers/thunderbolt/nhi.c | 8 + drivers/thunderbolt/nhi.h | 4 + drivers/thunderbolt/retimer.c | 19 +- drivers/thunderbolt/tb.c | 41 ++++ drivers/usb/host/xhci-ring.c | 2 - drivers/usb/serial/option.c | 27 +++ fs/btrfs/bio.c | 10 +- fs/btrfs/ctree.h | 19 ++ fs/btrfs/extent-tree.c | 6 +- fs/btrfs/tree-checker.c | 27 ++- fs/ceph/file.c | 77 +++--- fs/ceph/mds_client.c | 9 +- fs/ceph/super.c | 2 + fs/efivarfs/inode.c | 2 +- fs/efivarfs/internal.h | 1 - fs/efivarfs/super.c | 3 - fs/erofs/data.c | 36 +-- fs/erofs/fileio.c | 9 +- fs/erofs/fscache.c | 10 +- fs/erofs/internal.h | 15 +- fs/erofs/super.c | 80 ++++--- fs/erofs/zdata.c | 4 +- fs/eventpoll.c | 5 +- fs/hugetlbfs/inode.c | 2 +- fs/nfs/pnfs.c | 2 +- fs/nilfs2/btnode.c | 1 + fs/nilfs2/gcinode.c | 2 +- fs/nilfs2/inode.c | 13 +- fs/nilfs2/namei.c | 5 + fs/nilfs2/nilfs.h | 1 + fs/ocfs2/localalloc.c | 8 +- fs/smb/client/connect.c | 36 ++- fs/smb/server/connection.c | 18 +- fs/smb/server/connection.h | 1 - fs/smb/server/server.c | 7 +- fs/smb/server/server.h | 1 + fs/smb/server/transport_ipc.c | 5 +- fs/xfs/libxfs/xfs_ialloc.c | 16 +- fs/xfs/libxfs/xfs_sb.c | 15 ++ fs/xfs/scrub/agheader.c | 29 ++- fs/xfs/xfs_fsmap.c | 29 ++- include/clocksource/hyperv_timer.h | 2 + include/linux/alloc_tag.h | 7 +- include/linux/arm_ffa.h | 13 +- include/linux/hyperv.h | 1 + include/linux/io_uring.h | 4 +- include/linux/page-flags.h | 12 +- include/linux/sched.h | 7 + include/linux/trace_events.h | 6 +- include/linux/wait.h | 1 + io_uring/io_uring.c | 7 +- kernel/sched/core.c | 2 +- kernel/sched/deadline.c | 8 +- kernel/sched/debug.c | 1 + kernel/sched/fair.c | 73 ++++-- kernel/sched/pelt.c | 2 +- kernel/sched/sched.h | 13 +- kernel/trace/fgraph.c | 8 +- kernel/trace/ring_buffer.c | 6 +- kernel/trace/trace.c | 264 +++++---------------- kernel/trace/trace.h | 6 +- kernel/trace/trace_events.c | 227 ++++++++++++++---- kernel/trace/trace_output.c | 6 +- mm/huge_memory.c | 8 +- mm/hugetlb.c | 5 +- mm/memory.c | 8 +- mm/page_alloc.c | 6 +- mm/shmem.c | 22 +- mm/vmalloc.c | 6 +- net/core/netdev-genl.c | 19 +- net/dsa/tag.h | 16 +- net/mctp/route.c | 36 ++- net/mctp/test/route-test.c | 86 +++++++ net/netfilter/ipset/ip_set_list_set.c | 3 + net/netfilter/ipvs/ip_vs_conn.c | 4 +- net/psample/psample.c | 9 +- net/sched/sch_cake.c | 2 +- net/sched/sch_choke.c | 2 +- net/smc/af_smc.c | 18 +- net/smc/smc_clc.c | 17 +- net/smc/smc_clc.h | 22 +- net/smc/smc_core.c | 9 +- sound/soc/fsl/Kconfig | 1 + tools/hv/hv_fcopy_uio_daemon.c | 8 +- tools/hv/hv_set_ifconfig.sh | 2 +- tools/net/ynl/lib/ynl.py | 6 +- tools/testing/selftests/bpf/sdt.h | 2 + tools/testing/selftests/memfd/memfd_test.c | 14 +- .../selftests/net/openvswitch/openvswitch.sh | 6 +- 168 files changed, 1685 insertions(+), 891 deletions(-)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lion Ackermann nnamrec@gmail.com
commit 5eb7de8cd58e73851cd37ff8d0666517d9926948 upstream.
Changes to sch->q.qlen around qdisc_tree_reduce_backlog() need to happen _before_ a call to said function because otherwise it may fail to notify parent qdiscs when the child is about to become empty.
Signed-off-by: Lion Ackermann nnamrec@gmail.com Acked-by: Toke Høiland-Jørgensen toke@toke.dk Signed-off-by: David S. Miller davem@davemloft.net Cc: Artem Metla ametla@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_cake.c | 2 +- net/sched/sch_choke.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -1525,7 +1525,6 @@ static unsigned int cake_drop(struct Qdi b->backlogs[idx] -= len; b->tin_backlog -= len; sch->qstats.backlog -= len; - qdisc_tree_reduce_backlog(sch, 1, len);
flow->dropped++; b->tin_dropped++; @@ -1536,6 +1535,7 @@ static unsigned int cake_drop(struct Qdi
__qdisc_drop(skb, to_free); sch->q.qlen--; + qdisc_tree_reduce_backlog(sch, 1, len);
cake_heapify(q, 0);
--- a/net/sched/sch_choke.c +++ b/net/sched/sch_choke.c @@ -123,10 +123,10 @@ static void choke_drop_by_idx(struct Qdi if (idx == q->tail) choke_zap_tail_holes(q);
+ --sch->q.qlen; qdisc_qstats_backlog_dec(sch, skb); qdisc_tree_reduce_backlog(sch, 1, qdisc_pkt_len(skb)); qdisc_drop(skb, sch, to_free); - --sch->q.qlen; }
struct choke_skb_cb {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King (Oracle) rmk+kernel@armlinux.org.uk
[ Upstream commit 4c49f38e20a57f8abaebdf95b369295b153d1f8e ]
Commit 66600fac7a98 ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data") moved the assignment of tx_skbuff_dma[]'s members to be later in stmmac_tso_xmit().
The buf (dma cookie) and len stored in this structure are passed to dma_unmap_single() by stmmac_tx_clean(). The DMA API requires that the dma cookie passed to dma_unmap_single() is the same as the value returned from dma_map_single(). However, by moving the assignment later, this is not the case when priv->dma_cap.addr64 > 32 as "des" is offset by proto_hdr_len.
This causes problems such as:
dwc-eth-dwmac 2490000.ethernet eth0: Tx DMA map failed
and with DMA_API_DEBUG enabled:
DMA-API: dwc-eth-dwmac 2490000.ethernet: device driver tries to +free DMA memory it has not allocated [device address=0x000000ffffcf65c0] [size=66 bytes]
Fix this by maintaining "des" as the original DMA cookie, and use tso_des to pass the offset DMA cookie to stmmac_tso_allocator().
Full details of the crashes can be found at: https://lore.kernel.org/all/d8112193-0386-4e14-b516-37c2d838171a@nvidia.com/ https://lore.kernel.org/all/klkzp5yn5kq5efgtrow6wbvnc46bcqfxs65nz3qy77ujr5tu...
Reported-by: Jon Hunter jonathanh@nvidia.com Reported-by: Thierry Reding thierry.reding@gmail.com Fixes: 66600fac7a98 ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data") Tested-by: Jon Hunter jonathanh@nvidia.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Furong Xu 0x1207@gmail.com Link: https://patch.msgid.link/E1tJXcx-006N4Z-PC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 766213ee82c1..cf7b59b8cc64 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -4220,8 +4220,8 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev) struct stmmac_txq_stats *txq_stats; struct stmmac_tx_queue *tx_q; u32 pay_len, mss, queue; + dma_addr_t tso_des, des; u8 proto_hdr_len, hdr; - dma_addr_t des; bool set_ic; int i;
@@ -4317,14 +4317,15 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
/* If needed take extra descriptors to fill the remaining payload */ tmp_pay_len = pay_len - TSO_MAX_BUFF_SIZE; + tso_des = des; } else { stmmac_set_desc_addr(priv, first, des); tmp_pay_len = pay_len; - des += proto_hdr_len; + tso_des = des + proto_hdr_len; pay_len = 0; }
- stmmac_tso_allocator(priv, des, tmp_pay_len, (nfrags == 0), queue); + stmmac_tso_allocator(priv, tso_des, tmp_pay_len, (nfrags == 0), queue);
/* In case two or more DMA transmit descriptors are allocated for this * non-paged SKB data, the DMA buffer address should be saved to
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 514b2262ade48a0503ac6aa03c3bfb8c5be69b21 ]
The newly added SCMI vendor driver references functions in the protocol driver but needs a Kconfig dependency to ensure it can link, essentially the Kconfig dependency needs to be reversed to match the link time dependency:
| arm-linux-gnueabi-ld: sound/soc/fsl/fsl_mqs.o: in function `fsl_mqs_sm_write': | fsl_mqs.c:(.text+0x1aa): undefined reference to `scmi_imx_misc_ctrl_set' | arm-linux-gnueabi-ld: sound/soc/fsl/fsl_mqs.o: in function `fsl_mqs_sm_read': | fsl_mqs.c:(.text+0x1ee): undefined reference to `scmi_imx_misc_ctrl_get'
This however only works after changing the dependency in the SND_SOC_FSL_MQS driver as well, which uses 'select IMX_SCMI_MISC_DRV' to turn on a driver it depends on. This is generally a bad idea, so the best solution is to change that into a dependency.
To allow the ASoC driver to keep building with the SCMI support, this needs to be an optional dependency that enforces the link-time dependency if IMX_SCMI_MISC_DRV is a loadable module but not depend on it if that is disabled.
Fixes: 61c9f03e22fc ("firmware: arm_scmi: Add initial support for i.MX MISC protocol") Fixes: 101c9023594a ("ASoC: fsl_mqs: Support accessing registers by scmi interface") Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: Mark Brown broonie@kernel.org Acked-by: Shengjiu Wang shengjiu.wang@gmail.com Message-Id: 20241115230555.2435004-1-arnd@kernel.org Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/arm_scmi/vendors/imx/Kconfig | 1 + drivers/firmware/imx/Kconfig | 1 - sound/soc/fsl/Kconfig | 1 + 3 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/firmware/arm_scmi/vendors/imx/Kconfig b/drivers/firmware/arm_scmi/vendors/imx/Kconfig index 2883ed24a84d..a01bf5e47301 100644 --- a/drivers/firmware/arm_scmi/vendors/imx/Kconfig +++ b/drivers/firmware/arm_scmi/vendors/imx/Kconfig @@ -15,6 +15,7 @@ config IMX_SCMI_BBM_EXT config IMX_SCMI_MISC_EXT tristate "i.MX SCMI MISC EXTENSION" depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF) + depends on IMX_SCMI_MISC_DRV default y if ARCH_MXC help This enables i.MX System MISC control logic such as gpio expander diff --git a/drivers/firmware/imx/Kconfig b/drivers/firmware/imx/Kconfig index 477d3f32d99a..907cd149c40a 100644 --- a/drivers/firmware/imx/Kconfig +++ b/drivers/firmware/imx/Kconfig @@ -25,7 +25,6 @@ config IMX_SCU
config IMX_SCMI_MISC_DRV tristate "IMX SCMI MISC Protocol driver" - depends on IMX_SCMI_MISC_EXT || COMPILE_TEST default y if ARCH_MXC help The System Controller Management Interface firmware (SCMI FW) is diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig index e283751abfef..678540b78280 100644 --- a/sound/soc/fsl/Kconfig +++ b/sound/soc/fsl/Kconfig @@ -29,6 +29,7 @@ config SND_SOC_FSL_SAI config SND_SOC_FSL_MQS tristate "Medium Quality Sound (MQS) module support" depends on SND_SOC_FSL_SAI + depends on IMX_SCMI_MISC_DRV || !IMX_SCMI_MISC_DRV select REGMAP_MMIO help Say Y if you want to add Medium Quality Sound (MQS)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Levi Yun yeoreum.yun@arm.com
[ Upstream commit 6fe437cfe2cdc797b03f63b338a13fac96ed6a08 ]
Currently, ffa_dev->properties is set after the ffa_device_register() call return in ffa_setup_partitions(). This could potentially result in a race where the partition's properties is accessed while probing struct ffa_device before it is set.
Update the ffa_device_register() to receive ffa_partition_info so all the data from the partition information received from the firmware can be updated into the struct ffa_device before the calling device_register() in ffa_device_register().
Fixes: e781858488b9 ("firmware: arm_ffa: Add initial FFA bus support for device enumeration") Signed-off-by: Levi Yun yeoreum.yun@arm.com Message-Id: 20241203143109.1030514-2-yeoreum.yun@arm.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/arm_ffa/bus.c | 15 +++++++++++---- drivers/firmware/arm_ffa/driver.c | 7 +------ include/linux/arm_ffa.h | 13 ++++++++----- 3 files changed, 20 insertions(+), 15 deletions(-)
diff --git a/drivers/firmware/arm_ffa/bus.c b/drivers/firmware/arm_ffa/bus.c index eb17d03b66fe..dfda5ffc14db 100644 --- a/drivers/firmware/arm_ffa/bus.c +++ b/drivers/firmware/arm_ffa/bus.c @@ -187,13 +187,18 @@ bool ffa_device_is_valid(struct ffa_device *ffa_dev) return valid; }
-struct ffa_device *ffa_device_register(const uuid_t *uuid, int vm_id, - const struct ffa_ops *ops) +struct ffa_device * +ffa_device_register(const struct ffa_partition_info *part_info, + const struct ffa_ops *ops) { int id, ret; + uuid_t uuid; struct device *dev; struct ffa_device *ffa_dev;
+ if (!part_info) + return NULL; + id = ida_alloc_min(&ffa_bus_id, 1, GFP_KERNEL); if (id < 0) return NULL; @@ -210,9 +215,11 @@ struct ffa_device *ffa_device_register(const uuid_t *uuid, int vm_id, dev_set_name(&ffa_dev->dev, "arm-ffa-%d", id);
ffa_dev->id = id; - ffa_dev->vm_id = vm_id; + ffa_dev->vm_id = part_info->id; + ffa_dev->properties = part_info->properties; ffa_dev->ops = ops; - uuid_copy(&ffa_dev->uuid, uuid); + import_uuid(&uuid, (u8 *)part_info->uuid); + uuid_copy(&ffa_dev->uuid, &uuid);
ret = device_register(&ffa_dev->dev); if (ret) { diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c index b14cbdae94e8..2c2ec3c35f15 100644 --- a/drivers/firmware/arm_ffa/driver.c +++ b/drivers/firmware/arm_ffa/driver.c @@ -1387,7 +1387,6 @@ static struct notifier_block ffa_bus_nb = { static int ffa_setup_partitions(void) { int count, idx, ret; - uuid_t uuid; struct ffa_device *ffa_dev; struct ffa_dev_part_info *info; struct ffa_partition_info *pbuf, *tpbuf; @@ -1406,23 +1405,19 @@ static int ffa_setup_partitions(void)
xa_init(&drv_info->partition_info); for (idx = 0, tpbuf = pbuf; idx < count; idx++, tpbuf++) { - import_uuid(&uuid, (u8 *)tpbuf->uuid); - /* Note that if the UUID will be uuid_null, that will require * ffa_bus_notifier() to find the UUID of this partition id * with help of ffa_device_match_uuid(). FF-A v1.1 and above * provides UUID here for each partition as part of the * discovery API and the same is passed. */ - ffa_dev = ffa_device_register(&uuid, tpbuf->id, &ffa_drv_ops); + ffa_dev = ffa_device_register(tpbuf, &ffa_drv_ops); if (!ffa_dev) { pr_err("%s: failed to register partition ID 0x%x\n", __func__, tpbuf->id); continue; }
- ffa_dev->properties = tpbuf->properties; - if (drv_info->version > FFA_VERSION_1_0 && !(tpbuf->properties & FFA_PARTITION_AARCH64_EXEC)) ffa_mode_32bit_set(ffa_dev); diff --git a/include/linux/arm_ffa.h b/include/linux/arm_ffa.h index a28e2a6a13d0..74169dd0f659 100644 --- a/include/linux/arm_ffa.h +++ b/include/linux/arm_ffa.h @@ -166,9 +166,12 @@ static inline void *ffa_dev_get_drvdata(struct ffa_device *fdev) return dev_get_drvdata(&fdev->dev); }
+struct ffa_partition_info; + #if IS_REACHABLE(CONFIG_ARM_FFA_TRANSPORT) -struct ffa_device *ffa_device_register(const uuid_t *uuid, int vm_id, - const struct ffa_ops *ops); +struct ffa_device * +ffa_device_register(const struct ffa_partition_info *part_info, + const struct ffa_ops *ops); void ffa_device_unregister(struct ffa_device *ffa_dev); int ffa_driver_register(struct ffa_driver *driver, struct module *owner, const char *mod_name); @@ -176,9 +179,9 @@ void ffa_driver_unregister(struct ffa_driver *driver); bool ffa_device_is_valid(struct ffa_device *ffa_dev);
#else -static inline -struct ffa_device *ffa_device_register(const uuid_t *uuid, int vm_id, - const struct ffa_ops *ops) +static inline struct ffa_device * +ffa_device_register(const struct ffa_partition_info *part_info, + const struct ffa_ops *ops) { return NULL; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Neuling michaelneuling@tenstorrent.com
[ Upstream commit ea6398a5af81e3e7fb3da5d261694d479a321fd9 ]
This doesn't cause a problem currently as HVIEN isn't used elsewhere yet. Found by inspection.
Signed-off-by: Michael Neuling michaelneuling@tenstorrent.com Fixes: 16b0bde9a37c ("RISC-V: KVM: Add perf sampling support for guests") Reviewed-by: Atish Patra atishp@rivosinc.com Reviewed-by: Anup Patel anup@brainfault.org Link: https://lore.kernel.org/r/20241127041840.419940-1-michaelneuling@tenstorrent... Signed-off-by: Anup Patel anup@brainfault.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kvm/aia.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/kvm/aia.c b/arch/riscv/kvm/aia.c index 2967d305c442..9f3b527596de 100644 --- a/arch/riscv/kvm/aia.c +++ b/arch/riscv/kvm/aia.c @@ -552,7 +552,7 @@ void kvm_riscv_aia_enable(void) csr_set(CSR_HIE, BIT(IRQ_S_GEXT)); /* Enable IRQ filtering for overflow interrupt only if sscofpmf is present */ if (__riscv_isa_extension_available(NULL, RISCV_ISA_EXT_SSCOFPMF)) - csr_write(CSR_HVIEN, BIT(IRQ_PMU_OVF)); + csr_set(CSR_HVIEN, BIT(IRQ_PMU_OVF)); }
void kvm_riscv_aia_disable(void)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: K Prateek Nayak kprateek.nayak@amd.com
[ Upstream commit 493afbd187c4c9cc1642792c0d9ba400c3d6d90d ]
Adam reports that enabling NEXT_BUDDY insta triggers a WARN in pick_next_entity().
Moving clear_buddies() up before the delayed dequeue bits ensures no ->next buddy becomes delayed. Further ensure no new ->next buddy ever starts as delayed.
Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue") Reported-by: Adam Li adamli@os.amperecomputing.com Signed-off-by: K Prateek Nayak kprateek.nayak@amd.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Tested-by: Adam Li adamli@os.amperecomputing.com Link: https://lkml.kernel.org/r/670a0d54-e398-4b1f-8a6e-90784e2fdf89@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 782ce70ebd1b..c467e389cd6f 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5484,6 +5484,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) bool sleep = flags & DEQUEUE_SLEEP;
update_curr(cfs_rq); + clear_buddies(cfs_rq, se);
if (flags & DEQUEUE_DELAYED) { SCHED_WARN_ON(!se->sched_delayed); @@ -5500,8 +5501,6 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
if (sched_feat(DELAY_DEQUEUE) && delay && !entity_eligible(cfs_rq, se)) { - if (cfs_rq->next == se) - cfs_rq->next = NULL; update_load_avg(cfs_rq, se, 0); se->sched_delayed = 1; return false; @@ -5526,8 +5525,6 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
update_stats_dequeue_fair(cfs_rq, se, flags);
- clear_buddies(cfs_rq, se); - update_entity_lag(cfs_rq, se); if (sched_feat(PLACE_REL_DEADLINE) && !sleep) { se->deadline -= se->vruntime; @@ -8786,7 +8783,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int if (unlikely(throttled_hierarchy(cfs_rq_of(pse)))) return;
- if (sched_feat(NEXT_BUDDY) && !(wake_flags & WF_FORK)) { + if (sched_feat(NEXT_BUDDY) && !(wake_flags & WF_FORK) && !pse->sched_delayed) { set_next_buddy(pse); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vincent Guittot vincent.guittot@linaro.org
[ Upstream commit c1f43c342e1f2e32f0620bf2e972e2a9ea0a1e60 ]
We can't stop the tick of a rq if there are at least 2 tasks enqueued in the whole hierarchy and not only at the root cfs rq.
rq->cfs.nr_running tracks the number of sched_entity at one level whereas rq->cfs.h_nr_running tracks all queued tasks in the hierarchy.
Fixes: 11cc374f4643b ("sched_ext: Simplify scx_can_stop_tick() invocation in sched_can_stop_tick()") Signed-off-by: Vincent Guittot vincent.guittot@linaro.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Dietmar Eggemann dietmar.eggemann@arm.com Link: https://lore.kernel.org/r/20241202174606.4074512-2-vincent.guittot@linaro.or... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 6cc12777bb11..d07dc87787df 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1300,7 +1300,7 @@ bool sched_can_stop_tick(struct rq *rq) if (scx_enabled() && !scx_can_stop_tick(rq)) return false;
- if (rq->cfs.nr_running > 1) + if (rq->cfs.h_nr_running > 1) return false;
/*
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 76f2f783294d7d55c2564e2dfb0a7279ba0bc264 ]
Vincent and Dietmar noted that while commit fc1892becd56 ("sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE") fixes the entity runnable stats, it does not adjust the cfs_rq runnable stats, which are based off of h_nr_running.
Track h_nr_delayed such that we can discount those and adjust the signal.
Fixes: fc1892becd56 ("sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE") Closes: https://lore.kernel.org/lkml/a9a45193-d0c6-4ba2-a822-464ad30b550e@arm.com/ Closes: https://lore.kernel.org/lkml/CAKfTPtCNUvWE_GX5LyvTF-WdxUT=ZgvZZv-4t=eWntg5uO... [ Fixes checkpatch warnings and rebased ] Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reported-by: Dietmar Eggemann dietmar.eggemann@arm.com Reported-by: Vincent Guittot vincent.guittot@linaro.org Signed-off-by: "Peter Zijlstra (Intel)" peterz@infradead.org Signed-off-by: Vincent Guittot vincent.guittot@linaro.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Dietmar Eggemann dietmar.eggemann@arm.com Tested-by: K Prateek Nayak kprateek.nayak@amd.com Link: https://lore.kernel.org/r/20241202174606.4074512-3-vincent.guittot@linaro.or... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/debug.c | 1 + kernel/sched/fair.c | 51 +++++++++++++++++++++++++++++++++++++++----- kernel/sched/pelt.c | 2 +- kernel/sched/sched.h | 8 +++++-- 4 files changed, 54 insertions(+), 8 deletions(-)
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c index f4035c7a0fa1..82b165bf48c4 100644 --- a/kernel/sched/debug.c +++ b/kernel/sched/debug.c @@ -844,6 +844,7 @@ void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq) SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "spread", SPLIT_NS(spread)); SEQ_printf(m, " .%-30s: %d\n", "nr_running", cfs_rq->nr_running); SEQ_printf(m, " .%-30s: %d\n", "h_nr_running", cfs_rq->h_nr_running); + SEQ_printf(m, " .%-30s: %d\n", "h_nr_delayed", cfs_rq->h_nr_delayed); SEQ_printf(m, " .%-30s: %d\n", "idle_nr_running", cfs_rq->idle_nr_running); SEQ_printf(m, " .%-30s: %d\n", "idle_h_nr_running", diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c467e389cd6f..93142f9077c7 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5471,9 +5471,33 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
static __always_inline void return_cfs_rq_runtime(struct cfs_rq *cfs_rq);
-static inline void finish_delayed_dequeue_entity(struct sched_entity *se) +static void set_delayed(struct sched_entity *se) +{ + se->sched_delayed = 1; + for_each_sched_entity(se) { + struct cfs_rq *cfs_rq = cfs_rq_of(se); + + cfs_rq->h_nr_delayed++; + if (cfs_rq_throttled(cfs_rq)) + break; + } +} + +static void clear_delayed(struct sched_entity *se) { se->sched_delayed = 0; + for_each_sched_entity(se) { + struct cfs_rq *cfs_rq = cfs_rq_of(se); + + cfs_rq->h_nr_delayed--; + if (cfs_rq_throttled(cfs_rq)) + break; + } +} + +static inline void finish_delayed_dequeue_entity(struct sched_entity *se) +{ + clear_delayed(se); if (sched_feat(DELAY_ZERO) && se->vlag > 0) se->vlag = 0; } @@ -5502,7 +5526,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) if (sched_feat(DELAY_DEQUEUE) && delay && !entity_eligible(cfs_rq, se)) { update_load_avg(cfs_rq, se, 0); - se->sched_delayed = 1; + set_delayed(se); return false; } } @@ -5920,7 +5944,7 @@ static bool throttle_cfs_rq(struct cfs_rq *cfs_rq) struct rq *rq = rq_of(cfs_rq); struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg); struct sched_entity *se; - long task_delta, idle_task_delta, dequeue = 1; + long task_delta, idle_task_delta, delayed_delta, dequeue = 1; long rq_h_nr_running = rq->cfs.h_nr_running;
raw_spin_lock(&cfs_b->lock); @@ -5953,6 +5977,7 @@ static bool throttle_cfs_rq(struct cfs_rq *cfs_rq)
task_delta = cfs_rq->h_nr_running; idle_task_delta = cfs_rq->idle_h_nr_running; + delayed_delta = cfs_rq->h_nr_delayed; for_each_sched_entity(se) { struct cfs_rq *qcfs_rq = cfs_rq_of(se); int flags; @@ -5976,6 +6001,7 @@ static bool throttle_cfs_rq(struct cfs_rq *cfs_rq)
qcfs_rq->h_nr_running -= task_delta; qcfs_rq->idle_h_nr_running -= idle_task_delta; + qcfs_rq->h_nr_delayed -= delayed_delta;
if (qcfs_rq->load.weight) { /* Avoid re-evaluating load for this entity: */ @@ -5998,6 +6024,7 @@ static bool throttle_cfs_rq(struct cfs_rq *cfs_rq)
qcfs_rq->h_nr_running -= task_delta; qcfs_rq->idle_h_nr_running -= idle_task_delta; + qcfs_rq->h_nr_delayed -= delayed_delta; }
/* At this point se is NULL and we are at root level*/ @@ -6023,7 +6050,7 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq) struct rq *rq = rq_of(cfs_rq); struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg); struct sched_entity *se; - long task_delta, idle_task_delta; + long task_delta, idle_task_delta, delayed_delta; long rq_h_nr_running = rq->cfs.h_nr_running;
se = cfs_rq->tg->se[cpu_of(rq)]; @@ -6059,6 +6086,7 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
task_delta = cfs_rq->h_nr_running; idle_task_delta = cfs_rq->idle_h_nr_running; + delayed_delta = cfs_rq->h_nr_delayed; for_each_sched_entity(se) { struct cfs_rq *qcfs_rq = cfs_rq_of(se);
@@ -6076,6 +6104,7 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
qcfs_rq->h_nr_running += task_delta; qcfs_rq->idle_h_nr_running += idle_task_delta; + qcfs_rq->h_nr_delayed += delayed_delta;
/* end evaluation on encountering a throttled cfs_rq */ if (cfs_rq_throttled(qcfs_rq)) @@ -6093,6 +6122,7 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
qcfs_rq->h_nr_running += task_delta; qcfs_rq->idle_h_nr_running += idle_task_delta; + qcfs_rq->h_nr_delayed += delayed_delta;
/* end evaluation on encountering a throttled cfs_rq */ if (cfs_rq_throttled(qcfs_rq)) @@ -6946,7 +6976,7 @@ requeue_delayed_entity(struct sched_entity *se) }
update_load_avg(cfs_rq, se, 0); - se->sched_delayed = 0; + clear_delayed(se); }
/* @@ -6960,6 +6990,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags) struct cfs_rq *cfs_rq; struct sched_entity *se = &p->se; int idle_h_nr_running = task_has_idle_policy(p); + int h_nr_delayed = 0; int task_new = !(flags & ENQUEUE_WAKEUP); int rq_h_nr_running = rq->cfs.h_nr_running; u64 slice = 0; @@ -6986,6 +7017,9 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags) if (p->in_iowait) cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT);
+ if (task_new) + h_nr_delayed = !!se->sched_delayed; + for_each_sched_entity(se) { if (se->on_rq) { if (se->sched_delayed) @@ -7008,6 +7042,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
cfs_rq->h_nr_running++; cfs_rq->idle_h_nr_running += idle_h_nr_running; + cfs_rq->h_nr_delayed += h_nr_delayed;
if (cfs_rq_is_idle(cfs_rq)) idle_h_nr_running = 1; @@ -7031,6 +7066,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
cfs_rq->h_nr_running++; cfs_rq->idle_h_nr_running += idle_h_nr_running; + cfs_rq->h_nr_delayed += h_nr_delayed;
if (cfs_rq_is_idle(cfs_rq)) idle_h_nr_running = 1; @@ -7093,6 +7129,7 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags) struct task_struct *p = NULL; int idle_h_nr_running = 0; int h_nr_running = 0; + int h_nr_delayed = 0; struct cfs_rq *cfs_rq; u64 slice = 0;
@@ -7100,6 +7137,8 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags) p = task_of(se); h_nr_running = 1; idle_h_nr_running = task_has_idle_policy(p); + if (!task_sleep && !task_delayed) + h_nr_delayed = !!se->sched_delayed; } else { cfs_rq = group_cfs_rq(se); slice = cfs_rq_min_slice(cfs_rq); @@ -7117,6 +7156,7 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
cfs_rq->h_nr_running -= h_nr_running; cfs_rq->idle_h_nr_running -= idle_h_nr_running; + cfs_rq->h_nr_delayed -= h_nr_delayed;
if (cfs_rq_is_idle(cfs_rq)) idle_h_nr_running = h_nr_running; @@ -7155,6 +7195,7 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
cfs_rq->h_nr_running -= h_nr_running; cfs_rq->idle_h_nr_running -= idle_h_nr_running; + cfs_rq->h_nr_delayed -= h_nr_delayed;
if (cfs_rq_is_idle(cfs_rq)) idle_h_nr_running = h_nr_running; diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c index a9c65d97b3ca..171a802420a1 100644 --- a/kernel/sched/pelt.c +++ b/kernel/sched/pelt.c @@ -321,7 +321,7 @@ int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq) { if (___update_load_sum(now, &cfs_rq->avg, scale_load_down(cfs_rq->load.weight), - cfs_rq->h_nr_running, + cfs_rq->h_nr_running - cfs_rq->h_nr_delayed, cfs_rq->curr != NULL)) {
___update_load_avg(&cfs_rq->avg, 1); diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index c03b3d7b320e..c53696275ca1 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -649,6 +649,7 @@ struct cfs_rq { unsigned int h_nr_running; /* SCHED_{NORMAL,BATCH,IDLE} */ unsigned int idle_nr_running; /* SCHED_IDLE */ unsigned int idle_h_nr_running; /* SCHED_IDLE */ + unsigned int h_nr_delayed;
s64 avg_vruntime; u64 avg_load; @@ -898,8 +899,11 @@ struct dl_rq {
static inline void se_update_runnable(struct sched_entity *se) { - if (!entity_is_task(se)) - se->runnable_weight = se->my_q->h_nr_running; + if (!entity_is_task(se)) { + struct cfs_rq *cfs_rq = se->my_q; + + se->runnable_weight = cfs_rq->h_nr_running - cfs_rq->h_nr_delayed; + } }
static inline long se_runnable(struct sched_entity *se)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
[ Upstream commit 9244524d60ddea55f4df54c51200e8fef2032447 ]
To prepare for the following fix, factor out the code to read the P2SB resource from the cache to the new function p2sb_read_from_cache().
Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20241128002836.373745-2-shinichiro.kawasaki@wdc.co... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Stable-dep-of: 360c400d0f56 ("p2sb: Do not scan and remove the P2SB device when it is unhidden") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/p2sb.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/drivers/platform/x86/p2sb.c b/drivers/platform/x86/p2sb.c index 31f38309b389..aa34b8a69bc1 100644 --- a/drivers/platform/x86/p2sb.c +++ b/drivers/platform/x86/p2sb.c @@ -171,6 +171,22 @@ static int p2sb_cache_resources(void) return ret; }
+static int p2sb_read_from_cache(struct pci_bus *bus, unsigned int devfn, + struct resource *mem) +{ + struct p2sb_res_cache *cache = &p2sb_resources[PCI_FUNC(devfn)]; + + if (cache->bus_dev_id != bus->dev.id) + return -ENODEV; + + if (!p2sb_valid_resource(&cache->res)) + return -ENOENT; + + memcpy(mem, &cache->res, sizeof(*mem)); + + return 0; +} + /** * p2sb_bar - Get Primary to Sideband (P2SB) bridge device BAR * @bus: PCI bus to communicate with @@ -187,8 +203,6 @@ static int p2sb_cache_resources(void) */ int p2sb_bar(struct pci_bus *bus, unsigned int devfn, struct resource *mem) { - struct p2sb_res_cache *cache; - bus = p2sb_get_bus(bus); if (!bus) return -ENODEV; @@ -196,15 +210,7 @@ int p2sb_bar(struct pci_bus *bus, unsigned int devfn, struct resource *mem) if (!devfn) p2sb_get_devfn(&devfn);
- cache = &p2sb_resources[PCI_FUNC(devfn)]; - if (cache->bus_dev_id != bus->dev.id) - return -ENODEV; - - if (!p2sb_valid_resource(&cache->res)) - return -ENOENT; - - memcpy(mem, &cache->res, sizeof(*mem)); - return 0; + return p2sb_read_from_cache(bus, devfn, mem); } EXPORT_SYMBOL_GPL(p2sb_bar);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
[ Upstream commit ae3e6ebc5ab046d434c05c58a3e3f7e94441fec2 ]
To prepare for the following fix, introduce the global flag p2sb_hidden_by_bios. Check if the BIOS hides the P2SB device and store the result in the flag. This allows to refer to the check result across functions.
Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20241128002836.373745-3-shinichiro.kawasaki@wdc.co... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Stable-dep-of: 360c400d0f56 ("p2sb: Do not scan and remove the P2SB device when it is unhidden") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/p2sb.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/platform/x86/p2sb.c b/drivers/platform/x86/p2sb.c index aa34b8a69bc1..273ac90c8fbd 100644 --- a/drivers/platform/x86/p2sb.c +++ b/drivers/platform/x86/p2sb.c @@ -42,6 +42,7 @@ struct p2sb_res_cache { };
static struct p2sb_res_cache p2sb_resources[NR_P2SB_RES_CACHE]; +static bool p2sb_hidden_by_bios;
static void p2sb_get_devfn(unsigned int *devfn) { @@ -157,13 +158,14 @@ static int p2sb_cache_resources(void) * Unhide the P2SB device here, if needed. */ pci_bus_read_config_dword(bus, devfn_p2sb, P2SBC, &value); - if (value & P2SBC_HIDE) + p2sb_hidden_by_bios = value & P2SBC_HIDE; + if (p2sb_hidden_by_bios) pci_bus_write_config_dword(bus, devfn_p2sb, P2SBC, 0);
ret = p2sb_scan_and_cache(bus, devfn_p2sb);
/* Hide the P2SB device, if it was hidden */ - if (value & P2SBC_HIDE) + if (p2sb_hidden_by_bios) pci_bus_write_config_dword(bus, devfn_p2sb, P2SBC, P2SBC_HIDE);
pci_unlock_rescan_remove();
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
[ Upstream commit 0286070c74ee48391fc07f7f617460479472d221 ]
To prepare for the following fix, move the code to hide and unhide the P2SB device from p2sb_cache_resources() to p2sb_scan_and_cache().
Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20241128002836.373745-4-shinichiro.kawasaki@wdc.co... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Stable-dep-of: 360c400d0f56 ("p2sb: Do not scan and remove the P2SB device when it is unhidden") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/p2sb.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/platform/x86/p2sb.c b/drivers/platform/x86/p2sb.c index 273ac90c8fbd..0bc6b21c4c20 100644 --- a/drivers/platform/x86/p2sb.c +++ b/drivers/platform/x86/p2sb.c @@ -97,6 +97,14 @@ static void p2sb_scan_and_cache_devfn(struct pci_bus *bus, unsigned int devfn)
static int p2sb_scan_and_cache(struct pci_bus *bus, unsigned int devfn) { + /* + * The BIOS prevents the P2SB device from being enumerated by the PCI + * subsystem, so we need to unhide and hide it back to lookup the BAR. + * Unhide the P2SB device here, if needed. + */ + if (p2sb_hidden_by_bios) + pci_bus_write_config_dword(bus, devfn, P2SBC, 0); + /* Scan the P2SB device and cache its BAR0 */ p2sb_scan_and_cache_devfn(bus, devfn);
@@ -104,6 +112,10 @@ static int p2sb_scan_and_cache(struct pci_bus *bus, unsigned int devfn) if (devfn == P2SB_DEVFN_GOLDMONT) p2sb_scan_and_cache_devfn(bus, SPI_DEVFN_GOLDMONT);
+ /* Hide the P2SB device, if it was hidden */ + if (p2sb_hidden_by_bios) + pci_bus_write_config_dword(bus, devfn, P2SBC, P2SBC_HIDE); + if (!p2sb_valid_resource(&p2sb_resources[PCI_FUNC(devfn)].res)) return -ENOENT;
@@ -152,22 +164,11 @@ static int p2sb_cache_resources(void) */ pci_lock_rescan_remove();
- /* - * The BIOS prevents the P2SB device from being enumerated by the PCI - * subsystem, so we need to unhide and hide it back to lookup the BAR. - * Unhide the P2SB device here, if needed. - */ pci_bus_read_config_dword(bus, devfn_p2sb, P2SBC, &value); p2sb_hidden_by_bios = value & P2SBC_HIDE; - if (p2sb_hidden_by_bios) - pci_bus_write_config_dword(bus, devfn_p2sb, P2SBC, 0);
ret = p2sb_scan_and_cache(bus, devfn_p2sb);
- /* Hide the P2SB device, if it was hidden */ - if (p2sb_hidden_by_bios) - pci_bus_write_config_dword(bus, devfn_p2sb, P2SBC, P2SBC_HIDE); - pci_unlock_rescan_remove();
return ret;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
[ Upstream commit 360c400d0f568636c1b98d1d5f9f49aa3d420c70 ]
When drivers access P2SB device resources, it calls p2sb_bar(). Before the commit 5913320eb0b3 ("platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe"), p2sb_bar() obtained the resources and then called pci_stop_and_remove_bus_device() for clean up. Then the P2SB device disappeared. The commit 5913320eb0b3 introduced the P2SB device resource cache feature in the boot process. During the resource cache, pci_stop_and_remove_bus_device() is called for the P2SB device, then the P2SB device disappears regardless of whether p2sb_bar() is called or not. Such P2SB device disappearance caused a confusion [1]. To avoid the confusion, avoid the pci_stop_and_remove_bus_device() call when the BIOS does not hide the P2SB device.
For that purpose, cache the P2SB device resources only if the BIOS hides the P2SB device. Call p2sb_scan_and_cache() only if p2sb_hidden_by_bios is true. This allows removing two branches from p2sb_scan_and_cache(). When p2sb_bar() is called, get the resources from the cache if the P2SB device is hidden. Otherwise, read the resources from the unhidden P2SB device.
Reported-by: Daniel Walker (danielwa) danielwa@cisco.com Closes: https://lore.kernel.org/lkml/ZzTI+biIUTvFT6NC@goliath/ [1] Fixes: 5913320eb0b3 ("platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe") Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20241128002836.373745-5-shinichiro.kawasaki@wdc.co... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/p2sb.c | 42 +++++++++++++++++++++++++++++-------- 1 file changed, 33 insertions(+), 9 deletions(-)
diff --git a/drivers/platform/x86/p2sb.c b/drivers/platform/x86/p2sb.c index 0bc6b21c4c20..c56650b9ff96 100644 --- a/drivers/platform/x86/p2sb.c +++ b/drivers/platform/x86/p2sb.c @@ -100,10 +100,8 @@ static int p2sb_scan_and_cache(struct pci_bus *bus, unsigned int devfn) /* * The BIOS prevents the P2SB device from being enumerated by the PCI * subsystem, so we need to unhide and hide it back to lookup the BAR. - * Unhide the P2SB device here, if needed. */ - if (p2sb_hidden_by_bios) - pci_bus_write_config_dword(bus, devfn, P2SBC, 0); + pci_bus_write_config_dword(bus, devfn, P2SBC, 0);
/* Scan the P2SB device and cache its BAR0 */ p2sb_scan_and_cache_devfn(bus, devfn); @@ -112,9 +110,7 @@ static int p2sb_scan_and_cache(struct pci_bus *bus, unsigned int devfn) if (devfn == P2SB_DEVFN_GOLDMONT) p2sb_scan_and_cache_devfn(bus, SPI_DEVFN_GOLDMONT);
- /* Hide the P2SB device, if it was hidden */ - if (p2sb_hidden_by_bios) - pci_bus_write_config_dword(bus, devfn, P2SBC, P2SBC_HIDE); + pci_bus_write_config_dword(bus, devfn, P2SBC, P2SBC_HIDE);
if (!p2sb_valid_resource(&p2sb_resources[PCI_FUNC(devfn)].res)) return -ENOENT; @@ -141,7 +137,7 @@ static int p2sb_cache_resources(void) u32 value = P2SBC_HIDE; struct pci_bus *bus; u16 class; - int ret; + int ret = 0;
/* Get devfn for P2SB device itself */ p2sb_get_devfn(&devfn_p2sb); @@ -167,7 +163,12 @@ static int p2sb_cache_resources(void) pci_bus_read_config_dword(bus, devfn_p2sb, P2SBC, &value); p2sb_hidden_by_bios = value & P2SBC_HIDE;
- ret = p2sb_scan_and_cache(bus, devfn_p2sb); + /* + * If the BIOS does not hide the P2SB device then its resources + * are accesilble. Cache them only if the P2SB device is hidden. + */ + if (p2sb_hidden_by_bios) + ret = p2sb_scan_and_cache(bus, devfn_p2sb);
pci_unlock_rescan_remove();
@@ -190,6 +191,26 @@ static int p2sb_read_from_cache(struct pci_bus *bus, unsigned int devfn, return 0; }
+static int p2sb_read_from_dev(struct pci_bus *bus, unsigned int devfn, + struct resource *mem) +{ + struct pci_dev *pdev; + int ret = 0; + + pdev = pci_get_slot(bus, devfn); + if (!pdev) + return -ENODEV; + + if (p2sb_valid_resource(pci_resource_n(pdev, 0))) + p2sb_read_bar0(pdev, mem); + else + ret = -ENOENT; + + pci_dev_put(pdev); + + return ret; +} + /** * p2sb_bar - Get Primary to Sideband (P2SB) bridge device BAR * @bus: PCI bus to communicate with @@ -213,7 +234,10 @@ int p2sb_bar(struct pci_bus *bus, unsigned int devfn, struct resource *mem) if (!devfn) p2sb_get_devfn(&devfn);
- return p2sb_read_from_cache(bus, devfn, mem); + if (p2sb_hidden_by_bios) + return p2sb_read_from_cache(bus, devfn, mem); + + return p2sb_read_from_dev(bus, devfn, mem); } EXPORT_SYMBOL_GPL(p2sb_bar);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Riabchun ferr.lambarginio@gmail.com
[ Upstream commit 7363f2d4c18557c99c536b70489187bb4e05c412 ]
Since commit f63b94be6942 ("i2c: pnx: Fix potential deadlock warning from del_timer_sync() call in isr") jiffies are stored in i2c_pnx_algo_data.timeout, but wait_timeout and wait_reset are still using it as milliseconds. Convert jiffies back to milliseconds to wait for the expected amount of time.
Fixes: f63b94be6942 ("i2c: pnx: Fix potential deadlock warning from del_timer_sync() call in isr") Signed-off-by: Vladimir Riabchun ferr.lambarginio@gmail.com Signed-off-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-pnx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-pnx.c b/drivers/i2c/busses/i2c-pnx.c index 1dafadda73af..135300f3b534 100644 --- a/drivers/i2c/busses/i2c-pnx.c +++ b/drivers/i2c/busses/i2c-pnx.c @@ -95,7 +95,7 @@ enum {
static inline int wait_timeout(struct i2c_pnx_algo_data *data) { - long timeout = data->timeout; + long timeout = jiffies_to_msecs(data->timeout); while (timeout > 0 && (ioread32(I2C_REG_STS(data)) & mstatus_active)) { mdelay(1); @@ -106,7 +106,7 @@ static inline int wait_timeout(struct i2c_pnx_algo_data *data)
static inline int wait_reset(struct i2c_pnx_algo_data *data) { - long timeout = data->timeout; + long timeout = jiffies_to_msecs(data->timeout); while (timeout > 0 && (ioread32(I2C_REG_CTL(data)) & mcntrl_reset)) { mdelay(1);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Gordeev agordeev@linux.ibm.com
[ Upstream commit 5fa49dd8e521a42379e5e41fcf2c92edaaec0a8b ]
DEFINE_IPL_ATTR_STR_RW() macro produces "unsigned 'len' is never less than zero." warning when sys_vmcmd_on_*_store() callbacks are defined.
Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202412081614.5uel8F6W-lkp@intel.com/ Fixes: 247576bf624a ("s390/ipl: Do not accept z/VM CP diag X'008' cmds longer than max length") Reviewed-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/ipl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/kernel/ipl.c b/arch/s390/kernel/ipl.c index f17bb7bf9392..5fa203f4bc6b 100644 --- a/arch/s390/kernel/ipl.c +++ b/arch/s390/kernel/ipl.c @@ -270,7 +270,7 @@ static ssize_t sys_##_prefix##_##_name##_store(struct kobject *kobj, \ if (len >= sizeof(_value)) \ return -E2BIG; \ len = strscpy(_value, buf, sizeof(_value)); \ - if (len < 0) \ + if ((ssize_t)len < 0) \ return len; \ strim(_value); \ return len; \
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gao Xiang hsiangkao@linux.alibaba.com
[ Upstream commit 1a2180f6859c73c674809f9f82e36c94084682ba ]
Max Kellermann recently reported psi_group_cpu.tasks[NR_MEMSTALL] is incorrect in the 6.11.9 kernel.
The root cause appears to be that, since the problematic commit, bio can be NULL, causing psi_memstall_leave() to be skipped in z_erofs_submit_queue().
Reported-by: Max Kellermann max.kellermann@ionos.com Closes: https://lore.kernel.org/r/CAKPOu+8tvSowiJADW2RuKyofL_CSkm_SuyZA7ME5vMLWmL6pq... Fixes: 9e2f9d34dd12 ("erofs: handle overlapped pclusters out of crafted images properly") Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Link: https://lore.kernel.org/r/20241127085236.3538334-1-hsiangkao@linux.alibaba.c... Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/zdata.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index a569ff9dfd04..1a00f061798a 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -1679,9 +1679,9 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f, erofs_fscache_submit_bio(bio); else submit_bio(bio); - if (memstall) - psi_memstall_leave(&pflags); } + if (memstall) + psi_memstall_leave(&pflags);
/* * although background is preferred, no one is pending for submission.
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vineeth Pillai (Google) vineeth@bitbyteword.org
[ Upstream commit b53127db1dbf7f1047cf35c10922d801dcd40324 ]
dlserver can get dequeued during a dlserver pick_task due to the delayed deueue feature and this can lead to issues with dlserver logic as it still thinks that dlserver is on the runqueue. The dlserver throttling and replenish logic gets confused and can lead to double enqueue of dlserver.
Double enqueue of dlserver could happend due to couple of reasons:
Case 1 ------
Delayed dequeue feature[1] can cause dlserver being stopped during a pick initiated by dlserver: __pick_next_task pick_task_dl -> server_pick_task pick_task_fair pick_next_entity (if (sched_delayed)) dequeue_entities dl_server_stop
server_pick_task goes ahead with update_curr_dl_se without knowing that dlserver is dequeued and this confuses the logic and may lead to unintended enqueue while the server is stopped.
Case 2 ------ A race condition between a task dequeue on one cpu and same task's enqueue on this cpu by a remote cpu while the lock is released causing dlserver double enqueue.
One cpu would be in the schedule() and releasing RQ-lock:
current->state = TASK_INTERRUPTIBLE(); schedule(); deactivate_task() dl_stop_server(); pick_next_task() pick_next_task_fair() sched_balance_newidle() rq_unlock(this_rq)
at which point another CPU can take our RQ-lock and do:
try_to_wake_up() ttwu_queue() rq_lock() ... activate_task() dl_server_start() --> first enqueue wakeup_preempt() := check_preempt_wakeup_fair() update_curr() update_curr_task() if (current->dl_server) dl_server_update() enqueue_dl_entity() --> second enqueue
This bug was not apparent as the enqueue in dl_server_start doesn't usually happen because of the defer logic. But as a side effect of the first case(dequeue during dlserver pick), dl_throttled and dl_yield will be set and this causes the time accounting of dlserver to messup and then leading to a enqueue in dl_server_start.
Have an explicit flag representing the status of dlserver to avoid the confusion. This is set in dl_server_start and reset in dlserver_stop.
Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers") Suggested-by: Peter Zijlstra peterz@infradead.org Signed-off-by: "Vineeth Pillai (Google)" vineeth@bitbyteword.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Tested-by: Marcel Ziswiler marcel.ziswiler@codethink.co.uk # ROCK 5B Link: https://lkml.kernel.org/r/20241213032244.877029-1-vineeth@bitbyteword.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sched.h | 7 +++++++ kernel/sched/deadline.c | 8 ++++++-- kernel/sched/sched.h | 5 +++++ 3 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h index bb343136ddd0..c14446c6164d 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -656,6 +656,12 @@ struct sched_dl_entity { * @dl_defer_armed tells if the deferrable server is waiting * for the replenishment timer to activate it. * + * @dl_server_active tells if the dlserver is active(started). + * dlserver is started on first cfs enqueue on an idle runqueue + * and is stopped when a dequeue results in 0 cfs tasks on the + * runqueue. In other words, dlserver is active only when cpu's + * runqueue has atleast one cfs task. + * * @dl_defer_running tells if the deferrable server is actually * running, skipping the defer phase. */ @@ -664,6 +670,7 @@ struct sched_dl_entity { unsigned int dl_non_contending : 1; unsigned int dl_overrun : 1; unsigned int dl_server : 1; + unsigned int dl_server_active : 1; unsigned int dl_defer : 1; unsigned int dl_defer_armed : 1; unsigned int dl_defer_running : 1; diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index fc6f41ac33eb..a17c23b53049 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -1647,6 +1647,7 @@ void dl_server_start(struct sched_dl_entity *dl_se) if (!dl_se->dl_runtime) return;
+ dl_se->dl_server_active = 1; enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP); if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl)) resched_curr(dl_se->rq); @@ -1661,6 +1662,7 @@ void dl_server_stop(struct sched_dl_entity *dl_se) hrtimer_try_to_cancel(&dl_se->dl_timer); dl_se->dl_defer_armed = 0; dl_se->dl_throttled = 0; + dl_se->dl_server_active = 0; }
void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq, @@ -2420,8 +2422,10 @@ static struct task_struct *__pick_task_dl(struct rq *rq) if (dl_server(dl_se)) { p = dl_se->server_pick_task(dl_se); if (!p) { - dl_se->dl_yielded = 1; - update_curr_dl_se(rq, dl_se, 0); + if (dl_server_active(dl_se)) { + dl_se->dl_yielded = 1; + update_curr_dl_se(rq, dl_se, 0); + } goto again; } rq->dl_server = dl_se; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index c53696275ca1..f2ef520513c4 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -398,6 +398,11 @@ extern void __dl_server_attach_root(struct sched_dl_entity *dl_se, struct rq *rq extern int dl_server_apply_params(struct sched_dl_entity *dl_se, u64 runtime, u64 period, bool init);
+static inline bool dl_server_active(struct sched_dl_entity *dl_se) +{ + return dl_se->dl_server_active; +} + #ifdef CONFIG_CGROUP_SCHED
extern struct list_head task_groups;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vineeth Pillai (Google) vineeth@bitbyteword.org
[ Upstream commit c7f7e9c73178e0e342486fd31e7f363ef60e3f83 ]
dlserver time is accounted when: - dlserver is active and the dlserver proxies the cfs task. - dlserver is active but deferred and cfs task runs after being picked through the normal fair class pick.
dl_server_update is called in two places to make sure that both the above times are accounted for. But it doesn't check if dlserver is active or not. Now that we have this dl_server_active flag, we can consolidate dl_server_update into one place and all we need to check is whether dlserver is active or not. When dlserver is active there is only two possible conditions: - dlserver is deferred. - cfs task is running on behalf of dlserver.
Fixes: a110a81c52a9 ("sched/deadline: Deferrable dl server") Signed-off-by: "Vineeth Pillai (Google)" vineeth@bitbyteword.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Tested-by: Marcel Ziswiler marcel.ziswiler@codethink.co.uk # ROCK 5B Link: https://lore.kernel.org/r/20241213032244.877029-2-vineeth@bitbyteword.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 93142f9077c7..1ca96c99872f 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -1159,8 +1159,6 @@ static inline void update_curr_task(struct task_struct *p, s64 delta_exec) trace_sched_stat_runtime(p, delta_exec); account_group_exec_runtime(p, delta_exec); cgroup_account_cputime(p, delta_exec); - if (p->dl_server) - dl_server_update(p->dl_server, delta_exec); }
static inline bool did_preempt_short(struct cfs_rq *cfs_rq, struct sched_entity *curr) @@ -1237,11 +1235,16 @@ static void update_curr(struct cfs_rq *cfs_rq) update_curr_task(p, delta_exec);
/* - * Any fair task that runs outside of fair_server should - * account against fair_server such that it can account for - * this time and possibly avoid running this period. + * If the fair_server is active, we need to account for the + * fair_server time whether or not the task is running on + * behalf of fair_server or not: + * - If the task is running on behalf of fair_server, we need + * to limit its time based on the assigned runtime. + * - Fair task that runs outside of fair_server should account + * against fair_server such that it can account for this time + * and possibly avoid running this period. */ - if (p->dl_server != &rq->fair_server) + if (dl_server_active(&rq->fair_server)) dl_server_update(&rq->fair_server, delta_exec); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vasily Gorbik gor@linux.ibm.com
[ Upstream commit 282da38b465395c930687974627c24f47ddce5ff ]
The calculation determining whether to use three- or four-level paging didn't account for KMSAN modules metadata. Include this metadata in the virtual memory size calculation to ensure correct paging mode selection and avoiding potentially unnecessary physical memory size limitations.
Fixes: 65ca73f9fb36 ("s390/mm: define KMSAN metadata for vmalloc and modules") Acked-by: Heiko Carstens hca@linux.ibm.com Reviewed-by: Alexander Gordeev agordeev@linux.ibm.com Reviewed-by: Ilya Leoshkevich iii@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/boot/startup.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c index c8f149ad77e5..c2ee0745f59e 100644 --- a/arch/s390/boot/startup.c +++ b/arch/s390/boot/startup.c @@ -231,6 +231,8 @@ static unsigned long get_vmem_size(unsigned long identity_size, vsize = round_up(SZ_2G + max_mappable, rte_size) + round_up(vmemmap_size, rte_size) + FIXMAP_SIZE + MODULES_LEN + KASLR_LEN; + if (IS_ENABLED(CONFIG_KMSAN)) + vsize += MODULES_LEN * 2; return size_add(vsize, vmalloc_size); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gao Xiang hsiangkao@linux.alibaba.com
[ Upstream commit e2de3c1bf6a0c99b089bd706a62da8f988918858 ]
Unify the common parts of erofs_fc_free() and erofs_kill_sb() as erofs_sb_free().
Thus, fput() in erofs_fc_get_tree() is no longer needed, too.
Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Link: https://lore.kernel.org/r/20241212133504.2047178-1-hsiangkao@linux.alibaba.c... Stable-dep-of: 6422cde1b0d5 ("erofs: use buffered I/O for file-backed mounts by default") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/super.c | 36 +++++++++++++++++++----------------- 1 file changed, 19 insertions(+), 17 deletions(-)
diff --git a/fs/erofs/super.c b/fs/erofs/super.c index 2dd7d819572f..c40821346d50 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -718,16 +718,19 @@ static int erofs_fc_get_tree(struct fs_context *fc) GET_TREE_BDEV_QUIET_LOOKUP : 0); #ifdef CONFIG_EROFS_FS_BACKED_BY_FILE if (ret == -ENOTBLK) { + struct file *file; + if (!fc->source) return invalf(fc, "No source specified"); - sbi->fdev = filp_open(fc->source, O_RDONLY | O_LARGEFILE, 0); - if (IS_ERR(sbi->fdev)) - return PTR_ERR(sbi->fdev); + + file = filp_open(fc->source, O_RDONLY | O_LARGEFILE, 0); + if (IS_ERR(file)) + return PTR_ERR(file); + sbi->fdev = file;
if (S_ISREG(file_inode(sbi->fdev)->i_mode) && sbi->fdev->f_mapping->a_ops->read_folio) return get_tree_nodev(fc, erofs_fc_fill_super); - fput(sbi->fdev); } #endif return ret; @@ -778,19 +781,24 @@ static void erofs_free_dev_context(struct erofs_dev_context *devs) kfree(devs); }
-static void erofs_fc_free(struct fs_context *fc) +static void erofs_sb_free(struct erofs_sb_info *sbi) { - struct erofs_sb_info *sbi = fc->s_fs_info; - - if (!sbi) - return; - erofs_free_dev_context(sbi->devs); kfree(sbi->fsid); kfree(sbi->domain_id); + if (sbi->fdev) + fput(sbi->fdev); kfree(sbi); }
+static void erofs_fc_free(struct fs_context *fc) +{ + struct erofs_sb_info *sbi = fc->s_fs_info; + + if (sbi) /* free here if an error occurs before transferring to sb */ + erofs_sb_free(sbi); +} + static const struct fs_context_operations erofs_context_ops = { .parse_param = erofs_fc_parse_param, .get_tree = erofs_fc_get_tree, @@ -828,15 +836,9 @@ static void erofs_kill_sb(struct super_block *sb) kill_anon_super(sb); else kill_block_super(sb); - - erofs_free_dev_context(sbi->devs); fs_put_dax(sbi->dax_dev, NULL); erofs_fscache_unregister_fs(sb); - kfree(sbi->fsid); - kfree(sbi->domain_id); - if (sbi->fdev) - fput(sbi->fdev); - kfree(sbi); + erofs_sb_free(sbi); sb->s_fs_info = NULL; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gao Xiang hsiangkao@linux.alibaba.com
[ Upstream commit 7b00af2c5414dc01e0718deef7ead81102867636 ]
Instead of just listing each one directly in `struct erofs_sb_info` except that we still use `sb->s_bdev` for the primary block device.
Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Link: https://lore.kernel.org/r/20241216125310.930933-2-hsiangkao@linux.alibaba.co... Stable-dep-of: 6422cde1b0d5 ("erofs: use buffered I/O for file-backed mounts by default") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/data.c | 12 ++++-------- fs/erofs/fscache.c | 6 +++--- fs/erofs/internal.h | 8 ++------ fs/erofs/super.c | 27 +++++++++++++-------------- 4 files changed, 22 insertions(+), 31 deletions(-)
diff --git a/fs/erofs/data.c b/fs/erofs/data.c index fa51437e1d99..365c988262b1 100644 --- a/fs/erofs/data.c +++ b/fs/erofs/data.c @@ -63,10 +63,10 @@ void erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb)
buf->file = NULL; if (erofs_is_fileio_mode(sbi)) { - buf->file = sbi->fdev; /* some fs like FUSE needs it */ + buf->file = sbi->dif0.file; /* some fs like FUSE needs it */ buf->mapping = buf->file->f_mapping; } else if (erofs_is_fscache_mode(sb)) - buf->mapping = sbi->s_fscache->inode->i_mapping; + buf->mapping = sbi->dif0.fscache->inode->i_mapping; else buf->mapping = sb->s_bdev->bd_mapping; } @@ -208,12 +208,8 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map) erofs_off_t startoff, length; int id;
- map->m_bdev = sb->s_bdev; - map->m_daxdev = EROFS_SB(sb)->dax_dev; - map->m_dax_part_off = EROFS_SB(sb)->dax_part_off; - map->m_fscache = EROFS_SB(sb)->s_fscache; - map->m_fp = EROFS_SB(sb)->fdev; - + erofs_fill_from_devinfo(map, &EROFS_SB(sb)->dif0); + map->m_bdev = sb->s_bdev; /* use s_bdev for the primary device */ if (map->m_deviceid) { down_read(&devs->rwsem); dif = idr_find(&devs->tree, map->m_deviceid - 1); diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c index fda16eedafb5..ce7e38c82719 100644 --- a/fs/erofs/fscache.c +++ b/fs/erofs/fscache.c @@ -657,7 +657,7 @@ int erofs_fscache_register_fs(struct super_block *sb) if (IS_ERR(fscache)) return PTR_ERR(fscache);
- sbi->s_fscache = fscache; + sbi->dif0.fscache = fscache; return 0; }
@@ -665,14 +665,14 @@ void erofs_fscache_unregister_fs(struct super_block *sb) { struct erofs_sb_info *sbi = EROFS_SB(sb);
- erofs_fscache_unregister_cookie(sbi->s_fscache); + erofs_fscache_unregister_cookie(sbi->dif0.fscache);
if (sbi->domain) erofs_fscache_domain_put(sbi->domain); else fscache_relinquish_volume(sbi->volume, NULL, false);
- sbi->s_fscache = NULL; + sbi->dif0.fscache = NULL; sbi->volume = NULL; sbi->domain = NULL; } diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index 9b03c8f323a7..d70aa2410472 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -113,6 +113,7 @@ struct erofs_xattr_prefix_item { };
struct erofs_sb_info { + struct erofs_device_info dif0; struct erofs_mount_opts opt; /* options */ #ifdef CONFIG_EROFS_FS_ZIP /* list for all registered superblocks, mainly for shrinker */ @@ -130,13 +131,9 @@ struct erofs_sb_info {
struct erofs_sb_lz4_info lz4; #endif /* CONFIG_EROFS_FS_ZIP */ - struct file *fdev; struct inode *packed_inode; struct erofs_dev_context *devs; - struct dax_device *dax_dev; - u64 dax_part_off; u64 total_blocks; - u32 primarydevice_blocks;
u32 meta_blkaddr; #ifdef CONFIG_EROFS_FS_XATTR @@ -172,7 +169,6 @@ struct erofs_sb_info {
/* fscache support */ struct fscache_volume *volume; - struct erofs_fscache *s_fscache; struct erofs_domain *domain; char *fsid; char *domain_id; @@ -193,7 +189,7 @@ struct erofs_sb_info {
static inline bool erofs_is_fileio_mode(struct erofs_sb_info *sbi) { - return IS_ENABLED(CONFIG_EROFS_FS_BACKED_BY_FILE) && sbi->fdev; + return IS_ENABLED(CONFIG_EROFS_FS_BACKED_BY_FILE) && sbi->dif0.file; }
static inline bool erofs_is_fscache_mode(struct super_block *sb) diff --git a/fs/erofs/super.c b/fs/erofs/super.c index c40821346d50..60f7bd43a5a4 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -218,7 +218,7 @@ static int erofs_scan_devices(struct super_block *sb, struct erofs_device_info *dif; int id, err = 0;
- sbi->total_blocks = sbi->primarydevice_blocks; + sbi->total_blocks = sbi->dif0.blocks; if (!erofs_sb_has_device_table(sbi)) ondisk_extradevs = 0; else @@ -322,7 +322,7 @@ static int erofs_read_superblock(struct super_block *sb) sbi->sb_size); goto out; } - sbi->primarydevice_blocks = le32_to_cpu(dsb->blocks); + sbi->dif0.blocks = le32_to_cpu(dsb->blocks); sbi->meta_blkaddr = le32_to_cpu(dsb->meta_blkaddr); #ifdef CONFIG_EROFS_FS_XATTR sbi->xattr_blkaddr = le32_to_cpu(dsb->xattr_blkaddr); @@ -617,9 +617,8 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc) return -EINVAL; }
- sbi->dax_dev = fs_dax_get_by_bdev(sb->s_bdev, - &sbi->dax_part_off, - NULL, NULL); + sbi->dif0.dax_dev = fs_dax_get_by_bdev(sb->s_bdev, + &sbi->dif0.dax_part_off, NULL, NULL); }
err = erofs_read_superblock(sb); @@ -642,7 +641,7 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc) }
if (test_opt(&sbi->opt, DAX_ALWAYS)) { - if (!sbi->dax_dev) { + if (!sbi->dif0.dax_dev) { errorfc(fc, "DAX unsupported by block device. Turning off DAX."); clear_opt(&sbi->opt, DAX_ALWAYS); } else if (sbi->blkszbits != PAGE_SHIFT) { @@ -722,14 +721,13 @@ static int erofs_fc_get_tree(struct fs_context *fc)
if (!fc->source) return invalf(fc, "No source specified"); - file = filp_open(fc->source, O_RDONLY | O_LARGEFILE, 0); if (IS_ERR(file)) return PTR_ERR(file); - sbi->fdev = file; + sbi->dif0.file = file;
- if (S_ISREG(file_inode(sbi->fdev)->i_mode) && - sbi->fdev->f_mapping->a_ops->read_folio) + if (S_ISREG(file_inode(sbi->dif0.file)->i_mode) && + sbi->dif0.file->f_mapping->a_ops->read_folio) return get_tree_nodev(fc, erofs_fc_fill_super); } #endif @@ -786,8 +784,8 @@ static void erofs_sb_free(struct erofs_sb_info *sbi) erofs_free_dev_context(sbi->devs); kfree(sbi->fsid); kfree(sbi->domain_id); - if (sbi->fdev) - fput(sbi->fdev); + if (sbi->dif0.file) + fput(sbi->dif0.file); kfree(sbi); }
@@ -832,11 +830,12 @@ static void erofs_kill_sb(struct super_block *sb) { struct erofs_sb_info *sbi = EROFS_SB(sb);
- if ((IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) && sbi->fsid) || sbi->fdev) + if ((IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) && sbi->fsid) || + sbi->dif0.file) kill_anon_super(sb); else kill_block_super(sb); - fs_put_dax(sbi->dax_dev, NULL); + fs_put_dax(sbi->dif0.dax_dev, NULL); erofs_fscache_unregister_fs(sb); erofs_sb_free(sbi); sb->s_fs_info = NULL;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gao Xiang hsiangkao@linux.alibaba.com
[ Upstream commit f8d920a402aec3482931cb5f1539ed438740fc49 ]
Record `m_sb` and `m_dif` to replace `m_fscache`, `m_daxdev`, `m_fp` and `m_dax_part_off` in order to simplify the codebase.
Note that `m_bdev` is still left since it can be assigned from `sb->s_bdev` directly.
Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Link: https://lore.kernel.org/r/20241212235401.2857246-1-hsiangkao@linux.alibaba.c... Stable-dep-of: 6422cde1b0d5 ("erofs: use buffered I/O for file-backed mounts by default") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/data.c | 26 ++++++++++---------------- fs/erofs/fileio.c | 2 +- fs/erofs/fscache.c | 4 ++-- fs/erofs/internal.h | 6 ++---- 4 files changed, 15 insertions(+), 23 deletions(-)
diff --git a/fs/erofs/data.c b/fs/erofs/data.c index 365c988262b1..722151d3fee8 100644 --- a/fs/erofs/data.c +++ b/fs/erofs/data.c @@ -186,19 +186,13 @@ int erofs_map_blocks(struct inode *inode, struct erofs_map_blocks *map) }
static void erofs_fill_from_devinfo(struct erofs_map_dev *map, - struct erofs_device_info *dif) + struct super_block *sb, struct erofs_device_info *dif) { + map->m_sb = sb; + map->m_dif = dif; map->m_bdev = NULL; - map->m_fp = NULL; - if (dif->file) { - if (S_ISBLK(file_inode(dif->file)->i_mode)) - map->m_bdev = file_bdev(dif->file); - else - map->m_fp = dif->file; - } - map->m_daxdev = dif->dax_dev; - map->m_dax_part_off = dif->dax_part_off; - map->m_fscache = dif->fscache; + if (dif->file && S_ISBLK(file_inode(dif->file)->i_mode)) + map->m_bdev = file_bdev(dif->file); }
int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map) @@ -208,7 +202,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map) erofs_off_t startoff, length; int id;
- erofs_fill_from_devinfo(map, &EROFS_SB(sb)->dif0); + erofs_fill_from_devinfo(map, sb, &EROFS_SB(sb)->dif0); map->m_bdev = sb->s_bdev; /* use s_bdev for the primary device */ if (map->m_deviceid) { down_read(&devs->rwsem); @@ -222,7 +216,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map) up_read(&devs->rwsem); return 0; } - erofs_fill_from_devinfo(map, dif); + erofs_fill_from_devinfo(map, sb, dif); up_read(&devs->rwsem); } else if (devs->extra_devices && !devs->flatdev) { down_read(&devs->rwsem); @@ -235,7 +229,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map) if (map->m_pa >= startoff && map->m_pa < startoff + length) { map->m_pa -= startoff; - erofs_fill_from_devinfo(map, dif); + erofs_fill_from_devinfo(map, sb, dif); break; } } @@ -305,7 +299,7 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
iomap->offset = map.m_la; if (flags & IOMAP_DAX) - iomap->dax_dev = mdev.m_daxdev; + iomap->dax_dev = mdev.m_dif->dax_dev; else iomap->bdev = mdev.m_bdev; iomap->length = map.m_llen; @@ -334,7 +328,7 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length, iomap->type = IOMAP_MAPPED; iomap->addr = mdev.m_pa; if (flags & IOMAP_DAX) - iomap->addr += mdev.m_dax_part_off; + iomap->addr += mdev.m_dif->dax_part_off; } return 0; } diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c index 3af96b1e2c2a..a61b8faec651 100644 --- a/fs/erofs/fileio.c +++ b/fs/erofs/fileio.c @@ -67,7 +67,7 @@ static struct erofs_fileio_rq *erofs_fileio_rq_alloc(struct erofs_map_dev *mdev) GFP_KERNEL | __GFP_NOFAIL);
bio_init(&rq->bio, NULL, rq->bvecs, BIO_MAX_VECS, REQ_OP_READ); - rq->iocb.ki_filp = mdev->m_fp; + rq->iocb.ki_filp = mdev->m_dif->file; return rq; }
diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c index ce7e38c82719..ce3d8737df85 100644 --- a/fs/erofs/fscache.c +++ b/fs/erofs/fscache.c @@ -198,7 +198,7 @@ struct bio *erofs_fscache_bio_alloc(struct erofs_map_dev *mdev)
io = kmalloc(sizeof(*io), GFP_KERNEL | __GFP_NOFAIL); bio_init(&io->bio, NULL, io->bvecs, BIO_MAX_VECS, REQ_OP_READ); - io->io.private = mdev->m_fscache->cookie; + io->io.private = mdev->m_dif->fscache->cookie; io->io.end_io = erofs_fscache_bio_endio; refcount_set(&io->io.ref, 1); return &io->bio; @@ -316,7 +316,7 @@ static int erofs_fscache_data_read_slice(struct erofs_fscache_rq *req) if (!io) return -ENOMEM; iov_iter_xarray(&io->iter, ITER_DEST, &mapping->i_pages, pos, count); - ret = erofs_fscache_read_io_async(mdev.m_fscache->cookie, + ret = erofs_fscache_read_io_async(mdev.m_dif->fscache->cookie, mdev.m_pa + (pos - map.m_la), io); erofs_fscache_req_io_put(io);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index d70aa2410472..3108ece1d709 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -366,11 +366,9 @@ enum { };
struct erofs_map_dev { - struct erofs_fscache *m_fscache; + struct super_block *m_sb; + struct erofs_device_info *m_dif; struct block_device *m_bdev; - struct dax_device *m_daxdev; - struct file *m_fp; - u64 m_dax_part_off;
erofs_off_t m_pa; unsigned int m_deviceid;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gao Xiang hsiangkao@linux.alibaba.com
[ Upstream commit 6422cde1b0d5a31b206b263417c1c2b3c80fe82c ]
For many use cases (e.g. container images are just fetched from remote), performance will be impacted if underlay page cache is up-to-date but direct i/o flushes dirty pages first.
Instead, let's use buffered I/O by default to keep in sync with loop devices and add a (re)mount option to explicitly give a try to use direct I/O if supported by the underlying files.
The container startup time is improved as below: [workload] docker.io/library/workpress:latest unpack 1st run non-1st runs EROFS snapshotter buffered I/O file 4.586404265s 0.308s 0.198s EROFS snapshotter direct I/O file 4.581742849s 2.238s 0.222s EROFS snapshotter loop 4.596023152s 0.346s 0.201s Overlayfs snapshotter 5.382851037s 0.206s 0.214s
Fixes: fb176750266a ("erofs: add file-backed mount support") Cc: Derek McGowan derek@mcg.dev Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Link: https://lore.kernel.org/r/20241212134336.2059899-1-hsiangkao@linux.alibaba.c... Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/fileio.c | 7 +++++-- fs/erofs/internal.h | 1 + fs/erofs/super.c | 23 +++++++++++++++-------- 3 files changed, 21 insertions(+), 10 deletions(-)
diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c index a61b8faec651..33f8539dda4a 100644 --- a/fs/erofs/fileio.c +++ b/fs/erofs/fileio.c @@ -9,6 +9,7 @@ struct erofs_fileio_rq { struct bio_vec bvecs[BIO_MAX_VECS]; struct bio bio; struct kiocb iocb; + struct super_block *sb; };
struct erofs_fileio { @@ -52,8 +53,9 @@ static void erofs_fileio_rq_submit(struct erofs_fileio_rq *rq) rq->iocb.ki_pos = rq->bio.bi_iter.bi_sector << SECTOR_SHIFT; rq->iocb.ki_ioprio = get_current_ioprio(); rq->iocb.ki_complete = erofs_fileio_ki_complete; - rq->iocb.ki_flags = (rq->iocb.ki_filp->f_mode & FMODE_CAN_ODIRECT) ? - IOCB_DIRECT : 0; + if (test_opt(&EROFS_SB(rq->sb)->opt, DIRECT_IO) && + rq->iocb.ki_filp->f_mode & FMODE_CAN_ODIRECT) + rq->iocb.ki_flags = IOCB_DIRECT; iov_iter_bvec(&iter, ITER_DEST, rq->bvecs, rq->bio.bi_vcnt, rq->bio.bi_iter.bi_size); ret = vfs_iocb_iter_read(rq->iocb.ki_filp, &rq->iocb, &iter); @@ -68,6 +70,7 @@ static struct erofs_fileio_rq *erofs_fileio_rq_alloc(struct erofs_map_dev *mdev)
bio_init(&rq->bio, NULL, rq->bvecs, BIO_MAX_VECS, REQ_OP_READ); rq->iocb.ki_filp = mdev->m_dif->file; + rq->sb = mdev->m_sb; return rq; }
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index 3108ece1d709..77e785a6dfa7 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -182,6 +182,7 @@ struct erofs_sb_info { #define EROFS_MOUNT_POSIX_ACL 0x00000020 #define EROFS_MOUNT_DAX_ALWAYS 0x00000040 #define EROFS_MOUNT_DAX_NEVER 0x00000080 +#define EROFS_MOUNT_DIRECT_IO 0x00000100
#define clear_opt(opt, option) ((opt)->mount_opt &= ~EROFS_MOUNT_##option) #define set_opt(opt, option) ((opt)->mount_opt |= EROFS_MOUNT_##option) diff --git a/fs/erofs/super.c b/fs/erofs/super.c index 60f7bd43a5a4..5b279977c9d5 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -379,14 +379,8 @@ static void erofs_default_options(struct erofs_sb_info *sbi) }
enum { - Opt_user_xattr, - Opt_acl, - Opt_cache_strategy, - Opt_dax, - Opt_dax_enum, - Opt_device, - Opt_fsid, - Opt_domain_id, + Opt_user_xattr, Opt_acl, Opt_cache_strategy, Opt_dax, Opt_dax_enum, + Opt_device, Opt_fsid, Opt_domain_id, Opt_directio, Opt_err };
@@ -413,6 +407,7 @@ static const struct fs_parameter_spec erofs_fs_parameters[] = { fsparam_string("device", Opt_device), fsparam_string("fsid", Opt_fsid), fsparam_string("domain_id", Opt_domain_id), + fsparam_flag_no("directio", Opt_directio), {} };
@@ -526,6 +521,16 @@ static int erofs_fc_parse_param(struct fs_context *fc, errorfc(fc, "%s option not supported", erofs_fs_parameters[opt].name); break; #endif + case Opt_directio: +#ifdef CONFIG_EROFS_FS_BACKED_BY_FILE + if (result.boolean) + set_opt(&sbi->opt, DIRECT_IO); + else + clear_opt(&sbi->opt, DIRECT_IO); +#else + errorfc(fc, "%s option not supported", erofs_fs_parameters[opt].name); +#endif + break; default: return -ENOPARAM; } @@ -963,6 +968,8 @@ static int erofs_show_options(struct seq_file *seq, struct dentry *root) seq_puts(seq, ",dax=always"); if (test_opt(opt, DAX_NEVER)) seq_puts(seq, ",dax=never"); + if (erofs_is_fileio_mode(sbi) && test_opt(opt, DIRECT_IO)) + seq_puts(seq, ",directio"); #ifdef CONFIG_EROFS_FS_ONDEMAND if (sbi->fsid) seq_printf(seq, ",fsid=%s", sbi->fsid);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Chinner dchinner@redhat.com
commit 59e43f5479cce106d71c0b91a297c7ad1913176c upstream.
It's just read in from the superblock and used without doing any validity checks at all on the value.
Fixes: fb4f2b4e5a82 ("xfs: add sparse inode chunk alignment superblock field") Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Carlos Maiolino cem@kernel.org [djwong: actually tag for 6.12 because upstream maintainer ignored cc-stable tag] Link: https://lore.kernel.org/linux-xfs/20241024165544.GI21853@frogsfrogsfrogs/ Signed-off-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_sb.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 02ebcbc4882f..9e0ae312bc80 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -391,6 +391,20 @@ xfs_validate_sb_common( sbp->sb_inoalignmt, align); return -EINVAL; } + + if (!sbp->sb_spino_align || + sbp->sb_spino_align > sbp->sb_inoalignmt || + (sbp->sb_inoalignmt % sbp->sb_spino_align) != 0) { + xfs_warn(mp, + "Sparse inode alignment (%u) is invalid.", + sbp->sb_spino_align); + return -EINVAL; + } + } else if (sbp->sb_spino_align) { + xfs_warn(mp, + "Sparse inode alignment (%u) should be zero.", + sbp->sb_spino_align); + return -EINVAL; } } else if (sbp->sb_qflags & (XFS_PQUOTA_ENFD | XFS_GQUOTA_ENFD | XFS_PQUOTA_CHKD | XFS_GQUOTA_CHKD)) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Chinner dchinner@redhat.com
commit 13325333582d4820d39b9e8f63d6a54e745585d9 upstream.
The runt AG at the end of a filesystem is almost always smaller than the mp->m_sb.sb_agblocks. Unfortunately, when setting the max_agbno limit for the inode chunk allocation, we do not take this into account. This means we can allocate a sparse inode chunk that overlaps beyond the end of an AG. When we go to allocate an inode from that sparse chunk, the irec fails validation because the agbno of the start of the irec is beyond valid limits for the runt AG.
Prevent this from happening by taking into account the size of the runt AG when allocating inode chunks. Also convert the various checks for valid inode chunk agbnos to use xfs_ag_block_count() so that they will also catch such issues in the future.
Fixes: 56d1115c9bc7 ("xfs: allocate sparse inode chunks on full chunk allocation failure") Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Carlos Maiolino cem@kernel.org [djwong: backport to stable because upstream maintainer ignored cc-stable] Link: https://lore.kernel.org/linux-xfs/20241112231539.GG9438@frogsfrogsfrogs/ Signed-off-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_ialloc.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c index 271855227514..6258527315f2 100644 --- a/fs/xfs/libxfs/xfs_ialloc.c +++ b/fs/xfs/libxfs/xfs_ialloc.c @@ -855,7 +855,8 @@ xfs_ialloc_ag_alloc( * the end of the AG. */ args.min_agbno = args.mp->m_sb.sb_inoalignmt; - args.max_agbno = round_down(args.mp->m_sb.sb_agblocks, + args.max_agbno = round_down(xfs_ag_block_count(args.mp, + pag->pag_agno), args.mp->m_sb.sb_inoalignmt) - igeo->ialloc_blks;
@@ -2332,9 +2333,9 @@ xfs_difree( return -EINVAL; } agbno = XFS_AGINO_TO_AGBNO(mp, agino); - if (agbno >= mp->m_sb.sb_agblocks) { - xfs_warn(mp, "%s: agbno >= mp->m_sb.sb_agblocks (%d >= %d).", - __func__, agbno, mp->m_sb.sb_agblocks); + if (agbno >= xfs_ag_block_count(mp, pag->pag_agno)) { + xfs_warn(mp, "%s: agbno >= xfs_ag_block_count (%d >= %d).", + __func__, agbno, xfs_ag_block_count(mp, pag->pag_agno)); ASSERT(0); return -EINVAL; } @@ -2457,7 +2458,7 @@ xfs_imap( */ agino = XFS_INO_TO_AGINO(mp, ino); agbno = XFS_AGINO_TO_AGBNO(mp, agino); - if (agbno >= mp->m_sb.sb_agblocks || + if (agbno >= xfs_ag_block_count(mp, pag->pag_agno) || ino != XFS_AGINO_TO_INO(mp, pag->pag_agno, agino)) { error = -EINVAL; #ifdef DEBUG @@ -2467,11 +2468,12 @@ xfs_imap( */ if (flags & XFS_IGET_UNTRUSTED) return error; - if (agbno >= mp->m_sb.sb_agblocks) { + if (agbno >= xfs_ag_block_count(mp, pag->pag_agno)) { xfs_alert(mp, "%s: agbno (0x%llx) >= mp->m_sb.sb_agblocks (0x%lx)", __func__, (unsigned long long)agbno, - (unsigned long)mp->m_sb.sb_agblocks); + (unsigned long)xfs_ag_block_count(mp, + pag->pag_agno)); } if (ino != XFS_AGINO_TO_INO(mp, pag->pag_agno, agino)) { xfs_alert(mp,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit a440a28ddbdcb861150987b4d6e828631656b92f upstream.
In commit ca6448aed4f10a, we created an "end_daddr" variable to fix fsmap reporting when the end of the range requested falls in the middle of an unknown (aka free on the rmapbt) region. Unfortunately, I didn't notice that the the code sets end_daddr to the last sector of the device but then uses that quantity to compute the length of the synthesized mapping.
Zizhi Wo later observed that when end_daddr isn't set, we still don't report the last fsblock on a device because in that case (aka when info->last is true), the info->high mapping that we pass to xfs_getfsmap_group_helper has a startblock that points to the last fsblock. This is also wrong because the code uses startblock to compute the length of the synthesized mapping.
Fix the second problem by setting end_daddr unconditionally, and fix the first problem by setting start_daddr to one past the end of the range to query.
Cc: stable@vger.kernel.org # v6.11 Fixes: ca6448aed4f10a ("xfs: Fix missing interval for missing_owner in xfs fsmap") Signed-off-by: "Darrick J. Wong" djwong@kernel.org Reported-by: Zizhi Wo wozizhi@huawei.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_fsmap.c | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index ae18ab86e608..8712b891defb 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -162,7 +162,8 @@ struct xfs_getfsmap_info { xfs_daddr_t next_daddr; /* next daddr we expect */ /* daddr of low fsmap key when we're using the rtbitmap */ xfs_daddr_t low_daddr; - xfs_daddr_t end_daddr; /* daddr of high fsmap key */ + /* daddr of high fsmap key, or the last daddr on the device */ + xfs_daddr_t end_daddr; u64 missing_owner; /* owner of holes */ u32 dev; /* device id */ /* @@ -306,7 +307,7 @@ xfs_getfsmap_helper( * Note that if the btree query found a mapping, there won't be a gap. */ if (info->last && info->end_daddr != XFS_BUF_DADDR_NULL) - rec_daddr = info->end_daddr; + rec_daddr = info->end_daddr + 1;
/* Are we just counting mappings? */ if (info->head->fmh_count == 0) { @@ -898,7 +899,10 @@ xfs_getfsmap( struct xfs_trans *tp = NULL; struct xfs_fsmap dkeys[2]; /* per-dev keys */ struct xfs_getfsmap_dev handlers[XFS_GETFSMAP_DEVS]; - struct xfs_getfsmap_info info = { NULL }; + struct xfs_getfsmap_info info = { + .fsmap_recs = fsmap_recs, + .head = head, + }; bool use_rmap; int i; int error = 0; @@ -963,9 +967,6 @@ xfs_getfsmap(
info.next_daddr = head->fmh_keys[0].fmr_physical + head->fmh_keys[0].fmr_length; - info.end_daddr = XFS_BUF_DADDR_NULL; - info.fsmap_recs = fsmap_recs; - info.head = head;
/* For each device we support... */ for (i = 0; i < XFS_GETFSMAP_DEVS; i++) { @@ -978,17 +979,23 @@ xfs_getfsmap( break;
/* - * If this device number matches the high key, we have - * to pass the high key to the handler to limit the - * query results. If the device number exceeds the - * low key, zero out the low key so that we get - * everything from the beginning. + * If this device number matches the high key, we have to pass + * the high key to the handler to limit the query results, and + * set the end_daddr so that we can synthesize records at the + * end of the query range or device. */ if (handlers[i].dev == head->fmh_keys[1].fmr_device) { dkeys[1] = head->fmh_keys[1]; info.end_daddr = min(handlers[i].nr_sectors - 1, dkeys[1].fmr_physical); + } else { + info.end_daddr = handlers[i].nr_sectors - 1; } + + /* + * If the device number exceeds the low key, zero out the low + * key so that we get everything from the beginning. + */ if (handlers[i].dev > head->fmh_keys[0].fmr_device) memset(&dkeys[0], 0, sizeof(struct xfs_fsmap));
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 7f8a44f37229fc76bfcafa341a4b8862368ef44a upstream.
For a sparse inodes filesystem, mkfs.xfs computes the values of sb_spino_align and sb_inoalignmt with the following code:
int cluster_size = XFS_INODE_BIG_CLUSTER_SIZE;
if (cfg->sb_feat.crcs_enabled) cluster_size *= cfg->inodesize / XFS_DINODE_MIN_SIZE;
sbp->sb_spino_align = cluster_size >> cfg->blocklog; sbp->sb_inoalignmt = XFS_INODES_PER_CHUNK * cfg->inodesize >> cfg->blocklog;
On a V5 filesystem with 64k fsblocks and 512 byte inodes, this results in cluster_size = 8192 * (512 / 256) = 16384. As a result, sb_spino_align and sb_inoalignmt are both set to zero. Unfortunately, this trips the new sb_spino_align check that was just added to xfs_validate_sb_common, and the mkfs fails:
# mkfs.xfs -f -b size=64k, /dev/sda meta-data=/dev/sda isize=512 agcount=4, agsize=81136 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=1 = reflink=1 bigtime=1 inobtcount=1 nrext64=1 = exchange=0 metadir=0 data = bsize=65536 blocks=324544, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=65536 ascii-ci=0, ftype=1, parent=0 log =internal log bsize=65536 blocks=5006, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=65536 blocks=0, rtextents=0 = rgcount=0 rgsize=0 extents Discarding blocks...Sparse inode alignment (0) is invalid. Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200 libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1 mkfs.xfs: Releasing dirty buffer to free list! found dirty buffer (bulk) on free list! Sparse inode alignment (0) is invalid. Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200 libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1 mkfs.xfs: writing AG headers failed, err=22
Prior to commit 59e43f5479cce1 this all worked fine, even if "sparse" inodes are somewhat meaningless when everything fits in a single fsblock. Adjust the checks to handle existing filesystems.
Cc: stable@vger.kernel.org # v6.13-rc1 Fixes: 59e43f5479cce1 ("xfs: sb_spino_align is not verified") Signed-off-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_sb.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 9e0ae312bc80..e27b63281d01 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -392,12 +392,13 @@ xfs_validate_sb_common( return -EINVAL; }
- if (!sbp->sb_spino_align || - sbp->sb_spino_align > sbp->sb_inoalignmt || - (sbp->sb_inoalignmt % sbp->sb_spino_align) != 0) { + if (sbp->sb_spino_align && + (sbp->sb_spino_align > sbp->sb_inoalignmt || + (sbp->sb_inoalignmt % sbp->sb_spino_align) != 0)) { xfs_warn(mp, - "Sparse inode alignment (%u) is invalid.", - sbp->sb_spino_align); +"Sparse inode alignment (%u) is invalid, must be integer factor of (%u).", + sbp->sb_spino_align, + sbp->sb_inoalignmt); return -EINVAL; } } else if (sbp->sb_spino_align) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit c004a793e0ec34047c3bd423bcd8966f5fac88dc upstream.
The logic to check that the region past the end of the superblock is all zeroes is wrong -- we don't want to check only the bytes past the end of the maximally sized ondisk superblock structure as currently defined in xfs_format.h; we want to check the bytes beyond the end of the ondisk as defined by the feature bits.
Port the superblock size logic from xfs_repair and then put it to use in xfs_scrub.
Cc: stable@vger.kernel.org # v4.15 Fixes: 21fb4cb1981ef7 ("xfs: scrub the secondary superblocks") Signed-off-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/scrub/agheader.c | 29 +++++++++++++++++++++++++++-- 1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/scrub/agheader.c b/fs/xfs/scrub/agheader.c index da30f926cbe6..0f2f1852d58f 100644 --- a/fs/xfs/scrub/agheader.c +++ b/fs/xfs/scrub/agheader.c @@ -59,6 +59,30 @@ xchk_superblock_xref( /* scrub teardown will take care of sc->sa for us */ }
+/* + * Calculate the ondisk superblock size in bytes given the feature set of the + * mounted filesystem (aka the primary sb). This is subtlely different from + * the logic in xfs_repair, which computes the size of a secondary sb given the + * featureset listed in the secondary sb. + */ +STATIC size_t +xchk_superblock_ondisk_size( + struct xfs_mount *mp) +{ + if (xfs_has_metauuid(mp)) + return offsetofend(struct xfs_dsb, sb_meta_uuid); + if (xfs_has_crc(mp)) + return offsetofend(struct xfs_dsb, sb_lsn); + if (xfs_sb_version_hasmorebits(&mp->m_sb)) + return offsetofend(struct xfs_dsb, sb_bad_features2); + if (xfs_has_logv2(mp)) + return offsetofend(struct xfs_dsb, sb_logsunit); + if (xfs_has_sector(mp)) + return offsetofend(struct xfs_dsb, sb_logsectsize); + /* only support dirv2 or more recent */ + return offsetofend(struct xfs_dsb, sb_dirblklog); +} + /* * Scrub the filesystem superblock. * @@ -75,6 +99,7 @@ xchk_superblock( struct xfs_buf *bp; struct xfs_dsb *sb; struct xfs_perag *pag; + size_t sblen; xfs_agnumber_t agno; uint32_t v2_ok; __be32 features_mask; @@ -350,8 +375,8 @@ xchk_superblock( }
/* Everything else must be zero. */ - if (memchr_inv(sb + 1, 0, - BBTOB(bp->b_length) - sizeof(struct xfs_dsb))) + sblen = xchk_superblock_ondisk_size(mp); + if (memchr_inv((char *)sb + sblen, 0, BBTOB(bp->b_length) - sblen)) xchk_block_set_corrupt(sc, bp);
xchk_superblock_xref(sc, bp);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olaf Hering olaf@aepfle.de
[ Upstream commit 91ae69c7ed9e262f24240c425ad1eef2cf6639b7 ]
Align permissions of the resulting .nmconnection file, instead of the input file from hv_kvp_daemon. To avoid the tiny time frame where the output file is world-readable, use umask instead of chmod.
Fixes: 42999c904612 ("hv/hv_kvp_daemon:Support for keyfile based connection profile") Signed-off-by: Olaf Hering olaf@aepfle.de Reviewed-by: Shradha Gupta shradhagupta@linux.microsoft.com Link: https://lore.kernel.org/r/20241016143521.3735-1-olaf@aepfle.de Signed-off-by: Wei Liu wei.liu@kernel.org Message-ID: 20241016143521.3735-1-olaf@aepfle.de Signed-off-by: Sasha Levin sashal@kernel.org --- tools/hv/hv_set_ifconfig.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/hv/hv_set_ifconfig.sh b/tools/hv/hv_set_ifconfig.sh index 440a91b35823..2f8baed2b8f7 100755 --- a/tools/hv/hv_set_ifconfig.sh +++ b/tools/hv/hv_set_ifconfig.sh @@ -81,7 +81,7 @@ echo "ONBOOT=yes" >> $1
cp $1 /etc/sysconfig/network-scripts/
-chmod 600 $2 +umask 0177 interface=$(echo $2 | awk -F - '{ print $2 }') filename="${2##*/}"
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Davidlohr Bueso dave@stgolabs.net
[ Upstream commit da4d8c83358163df9a4addaeba0ef8bcb03b22e8 ]
If cxl_pci_ras_unmask() returns non-zero, cxl_pci_probe() will end up returning that value, instead of zero.
Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL") Reviewed-by: Fan Ni fan.ni@samsung.com Signed-off-by: Davidlohr Bueso dave@stgolabs.net Reviewed-by: Ira Weiny ira.weiny@intel.com Link: https://patch.msgid.link/20241115170032.108445-1-dave@stgolabs.net Signed-off-by: Dave Jiang dave.jiang@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cxl/pci.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 188412d45e0d..6e553b5752b1 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -942,8 +942,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) if (rc) return rc;
- rc = cxl_pci_ras_unmask(pdev); - if (rc) + if (cxl_pci_ras_unmask(pdev)) dev_dbg(&pdev->dev, "No RAS reporting unmasked\n");
pci_save_state(pdev);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huaisheng Ye huaisheng.ye@intel.com
[ Upstream commit 76467a94810c2aa4dd3096903291ac6df30c399e ]
The cxl_port_setup_targets() algorithm fails to identify valid target list ordering in the presence of 4-way and above switches resulting in 'cxl create-region' failures of the form:
$ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0 cxl region: create_region: region0: failed to set target7 to mem0 cxl region: cmd_create_region: created 0 regions
[kernel debug message] check_last_peer:1213: cxl region0: pci0000:0c:port1: cannot host mem6:decoder7.0 at 2 bus_remove_device:574: bus: 'cxl': remove device region0
QEMU can create this failing topology:
ACPI0017:00 [root0] | HB_0 [port1] / \ RP_0 RP_1 | | USP [port2] USP [port3] / / \ \ / / \ \ DSP DSP DSP DSP DSP DSP DSP DSP | | | | | | | | mem4 mem6 mem2 mem7 mem1 mem3 mem5 mem0 Pos: 0 2 4 6 1 3 5 7
HB: Host Bridge RP: Root Port USP: Upstream Port DSP: Downstream Port
...with the following command steps:
$ qemu-system-x86_64 -machine q35,cxl=on,accel=tcg \ -smp cpus=8 \ -m 8G \ -hda /home/work/vm-images/centos-stream8-02.qcow2 \ -object memory-backend-ram,size=4G,id=m0 \ -object memory-backend-ram,size=4G,id=m1 \ -object memory-backend-ram,size=2G,id=cxl-mem0 \ -object memory-backend-ram,size=2G,id=cxl-mem1 \ -object memory-backend-ram,size=2G,id=cxl-mem2 \ -object memory-backend-ram,size=2G,id=cxl-mem3 \ -object memory-backend-ram,size=2G,id=cxl-mem4 \ -object memory-backend-ram,size=2G,id=cxl-mem5 \ -object memory-backend-ram,size=2G,id=cxl-mem6 \ -object memory-backend-ram,size=2G,id=cxl-mem7 \ -numa node,memdev=m0,cpus=0-3,nodeid=0 \ -numa node,memdev=m1,cpus=4-7,nodeid=1 \ -netdev user,id=net0,hostfwd=tcp::2222-:22 \ -device virtio-net-pci,netdev=net0 \ -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \ -device cxl-upstream,bus=root_port0,id=us0 \ -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \ -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem0,id=cxl-vmem0 \ -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \ -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem1,id=cxl-vmem1 \ -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \ -device cxl-type3,bus=swport2,volatile-memdev=cxl-mem2,id=cxl-vmem2 \ -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \ -device cxl-type3,bus=swport3,volatile-memdev=cxl-mem3,id=cxl-vmem3 \ -device cxl-upstream,bus=root_port1,id=us1 \ -device cxl-downstream,port=4,bus=us1,id=swport4,chassis=0,slot=8 \ -device cxl-type3,bus=swport4,volatile-memdev=cxl-mem4,id=cxl-vmem4 \ -device cxl-downstream,port=5,bus=us1,id=swport5,chassis=0,slot=9 \ -device cxl-type3,bus=swport5,volatile-memdev=cxl-mem5,id=cxl-vmem5 \ -device cxl-downstream,port=6,bus=us1,id=swport6,chassis=0,slot=10 \ -device cxl-type3,bus=swport6,volatile-memdev=cxl-mem6,id=cxl-vmem6 \ -device cxl-downstream,port=7,bus=us1,id=swport7,chassis=0,slot=11 \ -device cxl-type3,bus=swport7,volatile-memdev=cxl-mem7,id=cxl-vmem7 \ -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=32G &
In Guest OS: $ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0
Fix the method to calculate @distance by iterativeley multiplying the number of targets per switch port. This also follows the algorithm recommended here [1].
Fixes: 27b3f8d13830 ("cxl/region: Program target lists") Link: http://lore.kernel.org/6538824b52349_7258329466@dwillia2-xfh.jf.intel.com.no... [1] Signed-off-by: Huaisheng Ye huaisheng.ye@intel.com Tested-by: Li Zhijian lizhijian@fujitsu.com [djbw: add a comment explaining 'distance'] Signed-off-by: Dan Williams dan.j.williams@intel.com Link: https://patch.msgid.link/173378716722.1270362.9546805175813426729.stgit@dwil... Signed-off-by: Dave Jiang dave.jiang@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cxl/core/region.c | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index dff618c708dc..a0d6e8d7f42c 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1295,6 +1295,7 @@ static int cxl_port_setup_targets(struct cxl_port *port, struct cxl_region_params *p = &cxlr->params; struct cxl_decoder *cxld = cxl_rr->decoder; struct cxl_switch_decoder *cxlsd; + struct cxl_port *iter = port; u16 eig, peig; u8 eiw, peiw;
@@ -1311,16 +1312,26 @@ static int cxl_port_setup_targets(struct cxl_port *port,
cxlsd = to_cxl_switch_decoder(&cxld->dev); if (cxl_rr->nr_targets_set) { - int i, distance; + int i, distance = 1; + struct cxl_region_ref *cxl_rr_iter;
/* - * Passthrough decoders impose no distance requirements between - * peers + * The "distance" between peer downstream ports represents which + * endpoint positions in the region interleave a given port can + * host. + * + * For example, at the root of a hierarchy the distance is + * always 1 as every index targets a different host-bridge. At + * each subsequent switch level those ports map every Nth region + * position where N is the width of the switch == distance. */ - if (cxl_rr->nr_targets == 1) - distance = 0; - else - distance = p->nr_targets / cxl_rr->nr_targets; + do { + cxl_rr_iter = cxl_rr_load(iter, cxlr); + distance *= cxl_rr_iter->nr_targets; + iter = to_cxl_port(iter->dev.parent); + } while (!is_cxl_root(iter)); + distance *= cxlrd->cxlsd.cxld.interleave_ways; + for (i = 0; i < cxl_rr->nr_targets_set; i++) if (ep->dport == cxlsd->target[i]) { rc = check_last_peer(cxled, ep, cxl_rr,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit 2b33eb8f1b3e8c2f87cfdbc8cc117f6bdfabc6ec ]
link down work may be scheduled before lgr freed but execute after lgr freed, which may result in crash. So it is need to hold a reference before shedule link down work, and put the reference after work executed or canceled.
The relevant crash call stack as follows: list_del corruption. prev->next should be ffffb638c9c0fe20, but was 0000000000000000 ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:51! invalid opcode: 0000 [#1] SMP NOPTI CPU: 6 PID: 978112 Comm: kworker/6:119 Kdump: loaded Tainted: G #1 Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 2221b89 04/01/2014 Workqueue: events smc_link_down_work [smc] RIP: 0010:__list_del_entry_valid.cold+0x31/0x47 RSP: 0018:ffffb638c9c0fdd8 EFLAGS: 00010086 RAX: 0000000000000054 RBX: ffff942fb75e5128 RCX: 0000000000000000 RDX: ffff943520930aa0 RSI: ffff94352091fc80 RDI: ffff94352091fc80 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb638c9c0fc38 R10: ffffb638c9c0fc30 R11: ffffffffa015eb28 R12: 0000000000000002 R13: ffffb638c9c0fe20 R14: 0000000000000001 R15: ffff942f9cd051c0 FS: 0000000000000000(0000) GS:ffff943520900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4f25214000 CR3: 000000025fbae004 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: rwsem_down_write_slowpath+0x17e/0x470 smc_link_down_work+0x3c/0x60 [smc] process_one_work+0x1ac/0x350 worker_thread+0x49/0x2f0 ? rescuer_thread+0x360/0x360 kthread+0x118/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x1f/0x30
Fixes: 541afa10c126 ("net/smc: add smcr_port_err() and smcr_link_down() processing") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Reviewed-by: Tony Lu tonylu@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/smc_core.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index 4e694860ece4..68515a41d776 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -1818,7 +1818,9 @@ void smcr_link_down_cond_sched(struct smc_link *lnk) { if (smc_link_downing(&lnk->state)) { trace_smcr_link_down(lnk, __builtin_return_address(0)); - schedule_work(&lnk->link_down_wrk); + smcr_link_hold(lnk); /* smcr_link_put in link_down_wrk */ + if (!schedule_work(&lnk->link_down_wrk)) + smcr_link_put(lnk); } }
@@ -1850,11 +1852,14 @@ static void smc_link_down_work(struct work_struct *work) struct smc_link_group *lgr = link->lgr;
if (list_empty(&lgr->list)) - return; + goto out; wake_up_all(&lgr->llc_msg_waiter); down_write(&lgr->llc_conf_mutex); smcr_link_down(link); up_write(&lgr->llc_conf_mutex); + +out: + smcr_link_put(link); /* smcr_link_hold by schedulers of link_down_work */ }
static int smc_vlan_by_tcpsk_walk(struct net_device *lower_dev,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit 679e9ddcf90dbdf98aaaa71a492454654b627bcb ]
When application sending data more than sndbuf_space, there have chances application will sleep in epoll_wait, and will never be wakeup again. This is caused by a race between smc_poll and smc_cdc_tx_handler.
application tasklet smc_tx_sendmsg(len > sndbuf_space) | epoll_wait for EPOLL_OUT,timeout=0 | smc_poll | if (!smc->conn.sndbuf_space) | | smc_cdc_tx_handler | atomic_add sndbuf_space | smc_tx_sndbuf_nonfull | if (!test_bit SOCK_NOSPACE) | do not sk_write_space; set_bit SOCK_NOSPACE; | return mask=0; |
Application will sleep in epoll_wait as smc_poll returns 0. And smc_cdc_tx_handler will not call sk_write_space because the SOCK_NOSPACE has not be set. If there is no inflight cdc msg, sk_write_space will not be called any more, and application will sleep in epoll_wait forever. So check sndbuf_space again after NOSPACE flag is set to break the race.
Fixes: 8dce2786a290 ("net/smc: smc_poll improvements") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Suggested-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 9e6c69d18581..92448f2c362c 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2881,6 +2881,13 @@ __poll_t smc_poll(struct file *file, struct socket *sock, } else { sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk); set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); + + if (sk->sk_state != SMC_INIT) { + /* Race breaker the same way as tcp_poll(). */ + smp_mb__after_atomic(); + if (atomic_read(&smc->conn.sndbuf_space)) + mask |= EPOLLOUT | EPOLLWRNORM; + } } if (atomic_read(&smc->conn.bytes_to_rcv)) mask |= EPOLLIN | EPOLLRDNORM;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit a29e220d3c8edbf0e1beb0f028878a4a85966556 ]
When receiving proposal msg in server, the field iparea_offset and the field ipv6_prefixes_cnt in proposal msg are from the remote client and can not be fully trusted. Especially the field iparea_offset, once exceed the max value, there has the chance to access wrong address, and crash may happen.
This patch checks iparea_offset and ipv6_prefixes_cnt before using them.
Fixes: e7b7a64a8493 ("smc: support variable CLC proposal messages") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Reviewed-by: Wen Gu guwen@linux.alibaba.com Reviewed-by: D. Wythe alibuda@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 6 +++++- net/smc/smc_clc.c | 4 ++++ net/smc/smc_clc.h | 6 +++++- 3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 92448f2c362c..9a74c9693f09 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2032,6 +2032,8 @@ static int smc_listen_prfx_check(struct smc_sock *new_smc, if (pclc->hdr.typev1 == SMC_TYPE_N) return 0; pclc_prfx = smc_clc_proposal_get_prefix(pclc); + if (!pclc_prfx) + return -EPROTO; if (smc_clc_prfx_match(newclcsock, pclc_prfx)) return SMC_CLC_DECL_DIFFPREFIX;
@@ -2221,7 +2223,9 @@ static void smc_find_ism_v1_device_serv(struct smc_sock *new_smc, int rc = 0;
/* check if ISM V1 is available */ - if (!(ini->smcd_version & SMC_V1) || !smcd_indicated(ini->smc_type_v1)) + if (!(ini->smcd_version & SMC_V1) || + !smcd_indicated(ini->smc_type_v1) || + !pclc_smcd) goto not_found; ini->is_smcd = true; /* prepare ISM check */ ini->ism_peer_gid[0].gid = ntohll(pclc_smcd->ism.gid); diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c index 33fa787c28eb..66a43b97eede 100644 --- a/net/smc/smc_clc.c +++ b/net/smc/smc_clc.c @@ -354,6 +354,10 @@ static bool smc_clc_msg_prop_valid(struct smc_clc_msg_proposal *pclc)
v2_ext = smc_get_clc_v2_ext(pclc); pclc_prfx = smc_clc_proposal_get_prefix(pclc); + if (!pclc_prfx || + pclc_prfx->ipv6_prefixes_cnt > SMC_CLC_MAX_V6_PREFIX) + return false; + if (hdr->version == SMC_V1) { if (hdr->typev1 == SMC_TYPE_N) return false; diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h index 5625fda2960b..ddad4af8e88f 100644 --- a/net/smc/smc_clc.h +++ b/net/smc/smc_clc.h @@ -336,8 +336,12 @@ struct smc_clc_msg_decline_v2 { /* clc decline message */ static inline struct smc_clc_msg_proposal_prefix * smc_clc_proposal_get_prefix(struct smc_clc_msg_proposal *pclc) { + u16 offset = ntohs(pclc->iparea_offset); + + if (offset > sizeof(struct smc_clc_msg_smcd)) + return NULL; return (struct smc_clc_msg_proposal_prefix *) - ((u8 *)pclc + sizeof(*pclc) + ntohs(pclc->iparea_offset)); + ((u8 *)pclc + sizeof(*pclc) + offset); }
static inline bool smcr_indicated(int smc_type)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit 7863c9f3d24ba49dbead7e03dfbe40deb5888fdf ]
When receiving proposal msg in server, the fields v2_ext_offset/ eid_cnt/ism_gid_cnt in proposal msg are from the remote client and can not be fully trusted. Especially the field v2_ext_offset, once exceed the max value, there has the chance to access wrong address, and crash may happen.
This patch checks the fields v2_ext_offset/eid_cnt/ism_gid_cnt before using them.
Fixes: 8c3dca341aea ("net/smc: build and send V2 CLC proposal") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Reviewed-by: Wen Gu guwen@linux.alibaba.com Reviewed-by: D. Wythe alibuda@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 3 ++- net/smc/smc_clc.c | 8 +++++++- net/smc/smc_clc.h | 8 +++++++- 3 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 9a74c9693f09..5d96f9de5b5d 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2276,7 +2276,8 @@ static void smc_find_rdma_v2_device_serv(struct smc_sock *new_smc, goto not_found;
smc_v2_ext = smc_get_clc_v2_ext(pclc); - if (!smc_clc_match_eid(ini->negotiated_eid, smc_v2_ext, NULL, NULL)) + if (!smc_v2_ext || + !smc_clc_match_eid(ini->negotiated_eid, smc_v2_ext, NULL, NULL)) goto not_found;
/* prepare RDMA check */ diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c index 66a43b97eede..f721d03efcbd 100644 --- a/net/smc/smc_clc.c +++ b/net/smc/smc_clc.c @@ -352,7 +352,6 @@ static bool smc_clc_msg_prop_valid(struct smc_clc_msg_proposal *pclc) struct smc_clc_msg_hdr *hdr = &pclc->hdr; struct smc_clc_v2_extension *v2_ext;
- v2_ext = smc_get_clc_v2_ext(pclc); pclc_prfx = smc_clc_proposal_get_prefix(pclc); if (!pclc_prfx || pclc_prfx->ipv6_prefixes_cnt > SMC_CLC_MAX_V6_PREFIX) @@ -369,6 +368,13 @@ static bool smc_clc_msg_prop_valid(struct smc_clc_msg_proposal *pclc) sizeof(struct smc_clc_msg_trail)) return false; } else { + v2_ext = smc_get_clc_v2_ext(pclc); + if ((hdr->typev2 != SMC_TYPE_N && + (!v2_ext || v2_ext->hdr.eid_cnt > SMC_CLC_MAX_UEID)) || + (smcd_indicated(hdr->typev2) && + v2_ext->hdr.ism_gid_cnt > SMCD_CLC_MAX_V2_GID_ENTRIES)) + return false; + if (ntohs(hdr->length) != sizeof(*pclc) + sizeof(struct smc_clc_msg_smcd) + diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h index ddad4af8e88f..2ff423224a59 100644 --- a/net/smc/smc_clc.h +++ b/net/smc/smc_clc.h @@ -380,8 +380,14 @@ static inline struct smc_clc_v2_extension * smc_get_clc_v2_ext(struct smc_clc_msg_proposal *prop) { struct smc_clc_msg_smcd *prop_smcd = smc_get_clc_msg_smcd(prop); + u16 max_offset;
- if (!prop_smcd || !ntohs(prop_smcd->v2_ext_offset)) + max_offset = offsetof(struct smc_clc_msg_proposal_area, pclc_v2_ext) - + offsetof(struct smc_clc_msg_proposal_area, pclc_smcd) - + offsetofend(struct smc_clc_msg_smcd, v2_ext_offset); + + if (!prop_smcd || !ntohs(prop_smcd->v2_ext_offset) || + ntohs(prop_smcd->v2_ext_offset) > max_offset) return NULL;
return (struct smc_clc_v2_extension *)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit 9ab332deb671d8f7e66d82a2ff2b3f715bc3a4ad ]
When receiving proposal msg in server, the field smcd_v2_ext_offset in proposal msg is from the remote client and can not be fully trusted. Once the value of smcd_v2_ext_offset exceed the max value, there has the chance to access wrong address, and crash may happen.
This patch checks the value of smcd_v2_ext_offset before using it.
Fixes: 5c21c4ccafe8 ("net/smc: determine accepted ISM devices") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Reviewed-by: Wen Gu guwen@linux.alibaba.com Reviewed-by: D. Wythe alibuda@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 2 ++ net/smc/smc_clc.h | 8 +++++++- 2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 5d96f9de5b5d..6cc7b846cff1 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2147,6 +2147,8 @@ static void smc_find_ism_v2_device_serv(struct smc_sock *new_smc, pclc_smcd = smc_get_clc_msg_smcd(pclc); smc_v2_ext = smc_get_clc_v2_ext(pclc); smcd_v2_ext = smc_get_clc_smcd_v2_ext(smc_v2_ext); + if (!pclc_smcd || !smc_v2_ext || !smcd_v2_ext) + goto not_found;
mutex_lock(&smcd_dev_list.mutex); if (pclc_smcd->ism.chid) { diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h index 2ff423224a59..1a7676227f16 100644 --- a/net/smc/smc_clc.h +++ b/net/smc/smc_clc.h @@ -400,9 +400,15 @@ smc_get_clc_v2_ext(struct smc_clc_msg_proposal *prop) static inline struct smc_clc_smcd_v2_extension * smc_get_clc_smcd_v2_ext(struct smc_clc_v2_extension *prop_v2ext) { + u16 max_offset = offsetof(struct smc_clc_msg_proposal_area, pclc_smcd_v2_ext) - + offsetof(struct smc_clc_msg_proposal_area, pclc_v2_ext) - + offsetof(struct smc_clc_v2_extension, hdr) - + offsetofend(struct smc_clnt_opts_area_hdr, smcd_v2_ext_offset); + if (!prop_v2ext) return NULL; - if (!ntohs(prop_v2ext->hdr.smcd_v2_ext_offset)) + if (!ntohs(prop_v2ext->hdr.smcd_v2_ext_offset) || + ntohs(prop_v2ext->hdr.smcd_v2_ext_offset) > max_offset) return NULL;
return (struct smc_clc_smcd_v2_extension *)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit c5b8ee5022a19464783058dc6042e8eefa34e8cd ]
When receiving clc msg, the field length in smc_clc_msg_hdr indicates the length of msg should be received from network and the value should not be fully trusted as it is from the network. Once the value of length exceeds the value of buflen in function smc_clc_wait_msg it may run into deadloop when trying to drain the remaining data exceeding buflen.
This patch checks the return value of sock_recvmsg when draining data in case of deadloop in draining.
Fixes: fb4f79264c0f ("net/smc: tolerate future SMCD versions") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Reviewed-by: Wen Gu guwen@linux.alibaba.com Reviewed-by: D. Wythe alibuda@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/smc_clc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c index f721d03efcbd..521f5df80e10 100644 --- a/net/smc/smc_clc.c +++ b/net/smc/smc_clc.c @@ -774,6 +774,11 @@ int smc_clc_wait_msg(struct smc_sock *smc, void *buf, int buflen, SMC_CLC_RECV_BUF_LEN : datlen; iov_iter_kvec(&msg.msg_iter, ITER_DEST, &vec, 1, recvlen); len = sock_recvmsg(smc->clcsock, &msg, krflags); + if (len < recvlen) { + smc->sk.sk_err = EPROTO; + reason_code = -EPROTO; + goto out; + } datlen -= len; } if (clcm->type == SMC_CLC_DECLINE) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 2d5df3a680ffdaf606baa10636bdb1daf757832e ]
Packets injected by the CPU should have a SRC_PORT field equal to the CPU port module index in the Analyzer block (ocelot->num_phys_ports).
The blamed commit copied the ocelot_ifh_set_basic() call incorrectly from ocelot_xmit_common() in net/dsa/tag_ocelot.c. Instead of calling with "x", it calls with BIT_ULL(x), but the field is not a port mask, but rather a single port index.
[ side note: this is the technical debt of code duplication :( ]
The error used to be silent and doesn't appear to have other user-visible manifestations, but with new changes in the packing library, it now fails loudly as follows:
------------[ cut here ]------------ Cannot store 0x40 inside bits 46-43 - will truncate sja1105 spi2.0: xmit timed out WARNING: CPU: 1 PID: 102 at lib/packing.c:98 __pack+0x90/0x198 sja1105 spi2.0: timed out polling for tstamp CPU: 1 UID: 0 PID: 102 Comm: felix_xmit Tainted: G W N 6.13.0-rc1-00372-gf706b85d972d-dirty #2605 Call trace: __pack+0x90/0x198 (P) __pack+0x90/0x198 (L) packing+0x78/0x98 ocelot_ifh_set_basic+0x260/0x368 ocelot_port_inject_frame+0xa8/0x250 felix_port_deferred_xmit+0x14c/0x258 kthread_worker_fn+0x134/0x350 kthread+0x114/0x138
The code path pertains to the ocelot switchdev driver and to the felix secondary DSA tag protocol, ocelot-8021q. Here seen with ocelot-8021q.
The messenger (packing) is not really to blame, so fix the original commit instead.
Fixes: e1b9e80236c5 ("net: mscc: ocelot: fix QoS class for injected packets with "ocelot-8021q"") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20241212165546.879567-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mscc/ocelot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index 3d72aa7b1305..ef93df520887 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -1432,7 +1432,7 @@ void ocelot_ifh_set_basic(void *ifh, struct ocelot *ocelot, int port,
memset(ifh, 0, OCELOT_TAG_LEN); ocelot_ifh_set_bypass(ifh, 1); - ocelot_ifh_set_src(ifh, BIT_ULL(ocelot->num_phys_ports)); + ocelot_ifh_set_src(ifh, ocelot->num_phys_ports); ocelot_ifh_set_dest(ifh, BIT_ULL(port)); ocelot_ifh_set_qos_class(ifh, qos_class); ocelot_ifh_set_tag_type(ifh, tag_type);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit ee76746387f6233bdfa93d7406990f923641568f ]
If either a zero count or a large one is provided, kernel can crash.
Fixes: 82c93a87bf8b ("netdevsim: implement couple of testing devlink health reporters") Reported-by: syzbot+ea40e4294e58b0292f74@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/675c6862.050a0220.37aaf.00b1.GAE@google.com/T... Signed-off-by: Eric Dumazet edumazet@google.com Cc: Jiri Pirko jiri@nvidia.com Reviewed-by: Joe Damato jdamato@fastly.com Link: https://patch.msgid.link/20241213172518.2415666-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/netdevsim/health.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/netdevsim/health.c b/drivers/net/netdevsim/health.c index 70e8bdf34be9..688f05316b5e 100644 --- a/drivers/net/netdevsim/health.c +++ b/drivers/net/netdevsim/health.c @@ -149,6 +149,8 @@ static ssize_t nsim_dev_health_break_write(struct file *file, char *break_msg; int err;
+ if (count == 0 || count > PAGE_SIZE) + return -EINVAL; break_msg = memdup_user_nul(data, count); if (IS_ERR(break_msg)) return PTR_ERR(break_msg);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Donald Hunter donald.hunter@gmail.com
[ Upstream commit 663ad7481f068057f6f692c5368c47150e855370 ]
Use the correct attribute space for sub-message key lookup in nested attributes when adding attributes. This fixes rt_link where the "kind" key and "data" sub-message are nested attributes in "linkinfo".
For example:
./tools/net/ynl/cli.py \ --create \ --spec Documentation/netlink/specs/rt_link.yaml \ --do newlink \ --json '{"link": 99, "linkinfo": { "kind": "vlan", "data": {"id": 4 } } }'
Signed-off-by: Donald Hunter donald.hunter@gmail.com Fixes: ab463c4342d1 ("tools/net/ynl: Add support for encoding sub-messages") Link: https://patch.msgid.link/20241213130711.40267-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/net/ynl/lib/ynl.py | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/net/ynl/lib/ynl.py b/tools/net/ynl/lib/ynl.py index c22c22bf2cb7..a3f741fed0a3 100644 --- a/tools/net/ynl/lib/ynl.py +++ b/tools/net/ynl/lib/ynl.py @@ -553,10 +553,10 @@ class YnlFamily(SpecFamily): if attr["type"] == 'nest': nl_type |= Netlink.NLA_F_NESTED attr_payload = b'' - sub_attrs = SpaceAttrs(self.attr_sets[space], value, search_attrs) + sub_space = attr['nested-attributes'] + sub_attrs = SpaceAttrs(self.attr_sets[sub_space], value, search_attrs) for subname, subvalue in value.items(): - attr_payload += self._add_attr(attr['nested-attributes'], - subname, subvalue, sub_attrs) + attr_payload += self._add_attr(sub_space, subname, subvalue, sub_attrs) elif attr["type"] == 'flag': if not value: # If value is absent or false then skip attribute creation.
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brett Creeley brett.creeley@amd.com
[ Upstream commit 9590d32e090ea2751e131ae5273859ca22f5ac14 ]
If register_netdev() fails, then the driver leaks the netdev notifier. Fix this by calling ionic_lif_unregister() on register_netdev() failure. This will also call ionic_lif_unregister_phc() if it has already been registered.
Fixes: 30b87ab4c0b3 ("ionic: remove lif list concept") Signed-off-by: Brett Creeley brett.creeley@amd.com Signed-off-by: Shannon Nelson shannon.nelson@amd.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20241212213157.12212-2-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c index 40496587b2b3..3d3f936779f7 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c @@ -3869,8 +3869,8 @@ int ionic_lif_register(struct ionic_lif *lif) /* only register LIF0 for now */ err = register_netdev(lif->netdev); if (err) { - dev_err(lif->ionic->dev, "Cannot register net device, aborting\n"); - ionic_lif_unregister_phc(lif); + dev_err(lif->ionic->dev, "Cannot register net device: %d, aborting\n", err); + ionic_lif_unregister(lif); return err; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shannon Nelson shannon.nelson@amd.com
[ Upstream commit 746e6ae2e202b062b9deee7bd86d94937997ecd7 ]
There are some FW error handling paths that can cause us to try to destroy the workqueue more than once, so let's be sure we're checking for that.
The case where this popped up was in an AER event where the handlers got called in such a way that ionic_reset_prepare() and thus ionic_dev_teardown() got called twice in a row. The second time through the workqueue was already destroyed, and destroy_workqueue() choked on the bad wq pointer.
We didn't hit this in AER handler testing before because at that time we weren't using a private workqueue. Later we replaced the use of the system workqueue with our own private workqueue but hadn't rerun the AER handler testing since then.
Fixes: 9e25450da700 ("ionic: add private workqueue per-device") Signed-off-by: Shannon Nelson shannon.nelson@amd.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20241212213157.12212-3-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_dev.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.c b/drivers/net/ethernet/pensando/ionic/ionic_dev.c index 9e42d599840d..57edcde9e6f8 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_dev.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.c @@ -277,7 +277,10 @@ void ionic_dev_teardown(struct ionic *ionic) idev->phy_cmb_pages = 0; idev->cmb_npages = 0;
- destroy_workqueue(ionic->wq); + if (ionic->wq) { + destroy_workqueue(ionic->wq); + ionic->wq = NULL; + } mutex_destroy(&idev->cmb_inuse_lock); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shannon Nelson shannon.nelson@amd.com
[ Upstream commit b096d62ba1323391b2db98b7704e2468cf3b1588 ]
Some calls into ionic_get_module_eeprom() don't use a single full buffer size, but instead multiple calls with an offset. Teach our driver to use the offset correctly so we can respond appropriately to the caller.
Fixes: 4d03e00a2140 ("ionic: Add initial ethtool support") Signed-off-by: Shannon Nelson shannon.nelson@amd.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20241212213157.12212-4-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_ethtool.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c index dda22fa4448c..9b7f78b6cdb1 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c @@ -961,8 +961,8 @@ static int ionic_get_module_eeprom(struct net_device *netdev, len = min_t(u32, sizeof(xcvr->sprom), ee->len);
do { - memcpy(data, xcvr->sprom, len); - memcpy(tbuf, xcvr->sprom, len); + memcpy(data, &xcvr->sprom[ee->offset], len); + memcpy(tbuf, &xcvr->sprom[ee->offset], len);
/* Let's make sure we got a consistent copy */ if (!memcmp(data, tbuf, len))
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikita Yushchenko nikita.yoush@cogentembedded.com
[ Upstream commit 922b4b955a03d19fea98938f33ef0e62d01f5159 ]
The existing linked list based implementation of how ts tags are assigned and managed is unsafe against concurrency and corner cases: - element addition in tx processing can race against element removal in ts queue completion, - element removal in ts queue completion can race against element removal in device close, - if a large number of frames gets added to tx queue without ts queue completions in between, elements with duplicate tag values can get added.
Use a different implementation, based on per-port used tags bitmaps and saved skb arrays.
Safety for addition in tx processing vs removal in ts completion is provided by:
tag = find_first_zero_bit(...); smp_mb(); <write rdev->ts_skb[tag]> set_bit(...);
vs
<read rdev->ts_skb[tag]> smp_mb(); clear_bit(...);
Safety for removal in ts completion vs removal in device close is provided by using atomic read-and-clear for rdev->ts_skb[tag]:
ts_skb = xchg(&rdev->ts_skb[tag], NULL); if (ts_skb) <handle it>
Fixes: 33f5d733b589 ("net: renesas: rswitch: Improve TX timestamp accuracy") Signed-off-by: Nikita Yushchenko nikita.yoush@cogentembedded.com Link: https://patch.msgid.link/20241212062558.436455-1-nikita.yoush@cogentembedded... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/renesas/rswitch.c | 74 ++++++++++++++------------ drivers/net/ethernet/renesas/rswitch.h | 13 ++--- 2 files changed, 42 insertions(+), 45 deletions(-)
diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c index 09117110e3dd..f86fcecb91a8 100644 --- a/drivers/net/ethernet/renesas/rswitch.c +++ b/drivers/net/ethernet/renesas/rswitch.c @@ -547,7 +547,6 @@ static int rswitch_gwca_ts_queue_alloc(struct rswitch_private *priv) desc = &gq->ts_ring[gq->ring_size]; desc->desc.die_dt = DT_LINKFIX; rswitch_desc_set_dptr(&desc->desc, gq->ring_dma); - INIT_LIST_HEAD(&priv->gwca.ts_info_list);
return 0; } @@ -1003,9 +1002,10 @@ static int rswitch_gwca_request_irqs(struct rswitch_private *priv) static void rswitch_ts(struct rswitch_private *priv) { struct rswitch_gwca_queue *gq = &priv->gwca.ts_queue; - struct rswitch_gwca_ts_info *ts_info, *ts_info2; struct skb_shared_hwtstamps shhwtstamps; struct rswitch_ts_desc *desc; + struct rswitch_device *rdev; + struct sk_buff *ts_skb; struct timespec64 ts; unsigned int num; u32 tag, port; @@ -1015,23 +1015,28 @@ static void rswitch_ts(struct rswitch_private *priv) dma_rmb();
port = TS_DESC_DPN(__le32_to_cpu(desc->desc.dptrl)); - tag = TS_DESC_TSUN(__le32_to_cpu(desc->desc.dptrl)); - - list_for_each_entry_safe(ts_info, ts_info2, &priv->gwca.ts_info_list, list) { - if (!(ts_info->port == port && ts_info->tag == tag)) - continue; - - memset(&shhwtstamps, 0, sizeof(shhwtstamps)); - ts.tv_sec = __le32_to_cpu(desc->ts_sec); - ts.tv_nsec = __le32_to_cpu(desc->ts_nsec & cpu_to_le32(0x3fffffff)); - shhwtstamps.hwtstamp = timespec64_to_ktime(ts); - skb_tstamp_tx(ts_info->skb, &shhwtstamps); - dev_consume_skb_irq(ts_info->skb); - list_del(&ts_info->list); - kfree(ts_info); - break; - } + if (unlikely(port >= RSWITCH_NUM_PORTS)) + goto next; + rdev = priv->rdev[port];
+ tag = TS_DESC_TSUN(__le32_to_cpu(desc->desc.dptrl)); + if (unlikely(tag >= TS_TAGS_PER_PORT)) + goto next; + ts_skb = xchg(&rdev->ts_skb[tag], NULL); + smp_mb(); /* order rdev->ts_skb[] read before bitmap update */ + clear_bit(tag, rdev->ts_skb_used); + + if (unlikely(!ts_skb)) + goto next; + + memset(&shhwtstamps, 0, sizeof(shhwtstamps)); + ts.tv_sec = __le32_to_cpu(desc->ts_sec); + ts.tv_nsec = __le32_to_cpu(desc->ts_nsec & cpu_to_le32(0x3fffffff)); + shhwtstamps.hwtstamp = timespec64_to_ktime(ts); + skb_tstamp_tx(ts_skb, &shhwtstamps); + dev_consume_skb_irq(ts_skb); + +next: gq->cur = rswitch_next_queue_index(gq, true, 1); desc = &gq->ts_ring[gq->cur]; } @@ -1576,8 +1581,9 @@ static int rswitch_open(struct net_device *ndev) static int rswitch_stop(struct net_device *ndev) { struct rswitch_device *rdev = netdev_priv(ndev); - struct rswitch_gwca_ts_info *ts_info, *ts_info2; + struct sk_buff *ts_skb; unsigned long flags; + unsigned int tag;
netif_tx_stop_all_queues(ndev);
@@ -1594,12 +1600,13 @@ static int rswitch_stop(struct net_device *ndev) if (bitmap_empty(rdev->priv->opened_ports, RSWITCH_NUM_PORTS)) iowrite32(GWCA_TS_IRQ_BIT, rdev->priv->addr + GWTSDID);
- list_for_each_entry_safe(ts_info, ts_info2, &rdev->priv->gwca.ts_info_list, list) { - if (ts_info->port != rdev->port) - continue; - dev_kfree_skb_irq(ts_info->skb); - list_del(&ts_info->list); - kfree(ts_info); + for (tag = find_first_bit(rdev->ts_skb_used, TS_TAGS_PER_PORT); + tag < TS_TAGS_PER_PORT; + tag = find_next_bit(rdev->ts_skb_used, TS_TAGS_PER_PORT, tag + 1)) { + ts_skb = xchg(&rdev->ts_skb[tag], NULL); + clear_bit(tag, rdev->ts_skb_used); + if (ts_skb) + dev_kfree_skb(ts_skb); }
return 0; @@ -1612,20 +1619,17 @@ static bool rswitch_ext_desc_set_info1(struct rswitch_device *rdev, desc->info1 = cpu_to_le64(INFO1_DV(BIT(rdev->etha->index)) | INFO1_IPV(GWCA_IPV_NUM) | INFO1_FMT); if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) { - struct rswitch_gwca_ts_info *ts_info; + unsigned int tag;
- ts_info = kzalloc(sizeof(*ts_info), GFP_ATOMIC); - if (!ts_info) + tag = find_first_zero_bit(rdev->ts_skb_used, TS_TAGS_PER_PORT); + if (tag == TS_TAGS_PER_PORT) return false; + smp_mb(); /* order bitmap read before rdev->ts_skb[] write */ + rdev->ts_skb[tag] = skb_get(skb); + set_bit(tag, rdev->ts_skb_used);
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS; - rdev->ts_tag++; - desc->info1 |= cpu_to_le64(INFO1_TSUN(rdev->ts_tag) | INFO1_TXC); - - ts_info->skb = skb_get(skb); - ts_info->port = rdev->port; - ts_info->tag = rdev->ts_tag; - list_add_tail(&ts_info->list, &rdev->priv->gwca.ts_info_list); + desc->info1 |= cpu_to_le64(INFO1_TSUN(tag) | INFO1_TXC);
skb_tx_timestamp(skb); } diff --git a/drivers/net/ethernet/renesas/rswitch.h b/drivers/net/ethernet/renesas/rswitch.h index e020800dcc57..d8d4ed7d7f8b 100644 --- a/drivers/net/ethernet/renesas/rswitch.h +++ b/drivers/net/ethernet/renesas/rswitch.h @@ -972,14 +972,6 @@ struct rswitch_gwca_queue { }; };
-struct rswitch_gwca_ts_info { - struct sk_buff *skb; - struct list_head list; - - int port; - u8 tag; -}; - #define RSWITCH_NUM_IRQ_REGS (RSWITCH_MAX_NUM_QUEUES / BITS_PER_TYPE(u32)) struct rswitch_gwca { unsigned int index; @@ -989,7 +981,6 @@ struct rswitch_gwca { struct rswitch_gwca_queue *queues; int num_queues; struct rswitch_gwca_queue ts_queue; - struct list_head ts_info_list; DECLARE_BITMAP(used, RSWITCH_MAX_NUM_QUEUES); u32 tx_irq_bits[RSWITCH_NUM_IRQ_REGS]; u32 rx_irq_bits[RSWITCH_NUM_IRQ_REGS]; @@ -997,6 +988,7 @@ struct rswitch_gwca { };
#define NUM_QUEUES_PER_NDEV 2 +#define TS_TAGS_PER_PORT 256 struct rswitch_device { struct rswitch_private *priv; struct net_device *ndev; @@ -1004,7 +996,8 @@ struct rswitch_device { void __iomem *addr; struct rswitch_gwca_queue *tx_queue; struct rswitch_gwca_queue *rx_queue; - u8 ts_tag; + struct sk_buff *ts_skb[TS_TAGS_PER_PORT]; + DECLARE_BITMAP(ts_skb_used, TS_TAGS_PER_PORT); bool disabled;
int port;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marios Makassikis mmakassikis@freebox.fr
[ Upstream commit 83c47d9e0ce79b5d7c0b21b9f35402dbde0fa15c ]
This changes the semantics of req_running to count all in-flight requests on a given connection, rather than the number of elements in the conn->request list. The latter is used only in smb2_cancel, and the counter is not used
Signed-off-by: Marios Makassikis mmakassikis@freebox.fr Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: 43fb7bce8866 ("ksmbd: fix broken transfers when exceeding max simultaneous operations") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/connection.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/smb/server/connection.c b/fs/smb/server/connection.c index e6a72f75ab94..3980645085ed 100644 --- a/fs/smb/server/connection.c +++ b/fs/smb/server/connection.c @@ -120,8 +120,8 @@ void ksmbd_conn_enqueue_request(struct ksmbd_work *work) if (conn->ops->get_cmd_val(work) != SMB2_CANCEL_HE) requests_queue = &conn->requests;
+ atomic_inc(&conn->req_running); if (requests_queue) { - atomic_inc(&conn->req_running); spin_lock(&conn->request_lock); list_add_tail(&work->request_entry, requests_queue); spin_unlock(&conn->request_lock); @@ -132,11 +132,12 @@ void ksmbd_conn_try_dequeue_request(struct ksmbd_work *work) { struct ksmbd_conn *conn = work->conn;
+ atomic_dec(&conn->req_running); + if (list_empty(&work->request_entry) && list_empty(&work->async_request_entry)) return;
- atomic_dec(&conn->req_running); spin_lock(&conn->request_lock); list_del_init(&work->request_entry); spin_unlock(&conn->request_lock);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marios Makassikis mmakassikis@freebox.fr
[ Upstream commit 43fb7bce8866e793275c4f9f25af6a37745f3416 ]
Since commit 0a77d947f599 ("ksmbd: check outstanding simultaneous SMB operations"), ksmbd enforces a maximum number of simultaneous operations for a connection. The problem is that reaching the limit causes ksmbd to close the socket, and the client has no indication that it should have slowed down.
This behaviour can be reproduced by setting "smb2 max credits = 128" (or lower), and transferring a large file (25GB).
smbclient fails as below:
$ smbclient //192.168.1.254/testshare -U user%pass smb: > put file.bin cli_push returned NT_STATUS_USER_SESSION_DELETED putting file file.bin as \file.bin smb2cli_req_compound_submit: Insufficient credits. 0 available, 1 needed NT_STATUS_INTERNAL_ERROR closing remote file \file.bin smb: > smb2cli_req_compound_submit: Insufficient credits. 0 available, 1 needed
Windows clients fail with 0x8007003b (with smaller files even).
Fix this by delaying reading from the socket until there's room to allocate a request. This effectively applies backpressure on the client, so the transfer completes, albeit at a slower rate.
Fixes: 0a77d947f599 ("ksmbd: check outstanding simultaneous SMB operations") Signed-off-by: Marios Makassikis mmakassikis@freebox.fr Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/connection.c | 13 +++++++++++-- fs/smb/server/connection.h | 1 - fs/smb/server/server.c | 7 +------ fs/smb/server/server.h | 1 + fs/smb/server/transport_ipc.c | 5 ++++- 5 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/fs/smb/server/connection.c b/fs/smb/server/connection.c index 3980645085ed..bf45822db5d5 100644 --- a/fs/smb/server/connection.c +++ b/fs/smb/server/connection.c @@ -70,7 +70,6 @@ struct ksmbd_conn *ksmbd_conn_alloc(void) atomic_set(&conn->req_running, 0); atomic_set(&conn->r_count, 0); atomic_set(&conn->refcnt, 1); - atomic_set(&conn->mux_smb_requests, 0); conn->total_credits = 1; conn->outstanding_credits = 0;
@@ -133,6 +132,8 @@ void ksmbd_conn_try_dequeue_request(struct ksmbd_work *work) struct ksmbd_conn *conn = work->conn;
atomic_dec(&conn->req_running); + if (waitqueue_active(&conn->req_running_q)) + wake_up(&conn->req_running_q);
if (list_empty(&work->request_entry) && list_empty(&work->async_request_entry)) @@ -309,7 +310,7 @@ int ksmbd_conn_handler_loop(void *p) { struct ksmbd_conn *conn = (struct ksmbd_conn *)p; struct ksmbd_transport *t = conn->transport; - unsigned int pdu_size, max_allowed_pdu_size; + unsigned int pdu_size, max_allowed_pdu_size, max_req; char hdr_buf[4] = {0,}; int size;
@@ -319,6 +320,7 @@ int ksmbd_conn_handler_loop(void *p) if (t->ops->prepare && t->ops->prepare(t)) goto out;
+ max_req = server_conf.max_inflight_req; conn->last_active = jiffies; set_freezable(); while (ksmbd_conn_alive(conn)) { @@ -328,6 +330,13 @@ int ksmbd_conn_handler_loop(void *p) kvfree(conn->request_buf); conn->request_buf = NULL;
+recheck: + if (atomic_read(&conn->req_running) + 1 > max_req) { + wait_event_interruptible(conn->req_running_q, + atomic_read(&conn->req_running) < max_req); + goto recheck; + } + size = t->ops->read(t, hdr_buf, sizeof(hdr_buf), -1); if (size != sizeof(hdr_buf)) break; diff --git a/fs/smb/server/connection.h b/fs/smb/server/connection.h index 8ddd5a3c7baf..b379ae4fdcdf 100644 --- a/fs/smb/server/connection.h +++ b/fs/smb/server/connection.h @@ -107,7 +107,6 @@ struct ksmbd_conn { __le16 signing_algorithm; bool binding; atomic_t refcnt; - atomic_t mux_smb_requests; };
struct ksmbd_conn_ops { diff --git a/fs/smb/server/server.c b/fs/smb/server/server.c index 698af37e988d..d146b0e7c3a9 100644 --- a/fs/smb/server/server.c +++ b/fs/smb/server/server.c @@ -270,7 +270,6 @@ static void handle_ksmbd_work(struct work_struct *wk)
ksmbd_conn_try_dequeue_request(work); ksmbd_free_work_struct(work); - atomic_dec(&conn->mux_smb_requests); /* * Checking waitqueue to dropping pending requests on * disconnection. waitqueue_active is safe because it @@ -300,11 +299,6 @@ static int queue_ksmbd_work(struct ksmbd_conn *conn) if (err) return 0;
- if (atomic_inc_return(&conn->mux_smb_requests) >= conn->vals->max_credits) { - atomic_dec_return(&conn->mux_smb_requests); - return -ENOSPC; - } - work = ksmbd_alloc_work_struct(); if (!work) { pr_err("allocation for work failed\n"); @@ -367,6 +361,7 @@ static int server_conf_init(void) server_conf.auth_mechs |= KSMBD_AUTH_KRB5 | KSMBD_AUTH_MSKRB5; #endif + server_conf.max_inflight_req = SMB2_MAX_CREDITS; return 0; }
diff --git a/fs/smb/server/server.h b/fs/smb/server/server.h index 4fc529335271..94187628ff08 100644 --- a/fs/smb/server/server.h +++ b/fs/smb/server/server.h @@ -42,6 +42,7 @@ struct ksmbd_server_config { struct smb_sid domain_sid; unsigned int auth_mechs; unsigned int max_connections; + unsigned int max_inflight_req;
char *conf[SERVER_CONF_WORK_GROUP + 1]; struct task_struct *dh_task; diff --git a/fs/smb/server/transport_ipc.c b/fs/smb/server/transport_ipc.c index 2f27afb695f6..6de351cc2b60 100644 --- a/fs/smb/server/transport_ipc.c +++ b/fs/smb/server/transport_ipc.c @@ -319,8 +319,11 @@ static int ipc_server_config_on_startup(struct ksmbd_startup_request *req) init_smb2_max_write_size(req->smb2_max_write); if (req->smb2_max_trans) init_smb2_max_trans_size(req->smb2_max_trans); - if (req->smb2_max_credits) + if (req->smb2_max_credits) { init_smb2_max_credits(req->smb2_max_credits); + server_conf.max_inflight_req = + req->smb2_max_credits; + } if (req->smbd_max_io_size) init_smbd_max_io_size(req->smbd_max_io_size);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit b1f3a2f5a742c1e939a73031bd31b9e557a2d77d ]
The context is supposed to record the next queue to dump, not last dumped. If the dump doesn't fit we will restart from the already-dumped queue, duplicating the message.
Before this fix and with the selftest improvements later in this series we see:
# ./run_kselftest.sh -t drivers/net:queues.py timeout set to 45 selftests: drivers/net: queues.py KTAP version 1 1..2 # Check| At /root/ksft-net-drv/drivers/net/./queues.py, line 32, in get_queues: # Check| ksft_eq(queues, expected) # Check failed 102 != 100 # Check| At /root/ksft-net-drv/drivers/net/./queues.py, line 32, in get_queues: # Check| ksft_eq(queues, expected) # Check failed 101 != 100 not ok 1 queues.get_queues ok 2 queues.addremove_queues # Totals: pass:1 fail:1 xfail:0 xpass:0 skip:0 error:0 not ok 1 selftests: drivers/net: queues.py # exit=1
With the fix:
# ./ksft-net-drv/run_kselftest.sh -t drivers/net:queues.py timeout set to 45 selftests: drivers/net: queues.py KTAP version 1 1..2 ok 1 queues.get_queues ok 2 queues.addremove_queues # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0
Fixes: 6b6171db7fc8 ("netdev-genl: Add netlink framework functions for queue") Reviewed-by: Joe Damato jdamato@fastly.com Link: https://patch.msgid.link/20241213152244.3080955-2-kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/netdev-genl.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index d2baa1af9df0..71359922ae8b 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -417,24 +417,21 @@ netdev_nl_queue_dump_one(struct net_device *netdev, struct sk_buff *rsp, struct netdev_nl_dump_ctx *ctx) { int err = 0; - int i;
if (!(netdev->flags & IFF_UP)) return err;
- for (i = ctx->rxq_idx; i < netdev->real_num_rx_queues;) { - err = netdev_nl_queue_fill_one(rsp, netdev, i, + for (; ctx->rxq_idx < netdev->real_num_rx_queues; ctx->rxq_idx++) { + err = netdev_nl_queue_fill_one(rsp, netdev, ctx->rxq_idx, NETDEV_QUEUE_TYPE_RX, info); if (err) return err; - ctx->rxq_idx = i++; } - for (i = ctx->txq_idx; i < netdev->real_num_tx_queues;) { - err = netdev_nl_queue_fill_one(rsp, netdev, i, + for (; ctx->txq_idx < netdev->real_num_tx_queues; ctx->txq_idx++) { + err = netdev_nl_queue_fill_one(rsp, netdev, ctx->txq_idx, NETDEV_QUEUE_TYPE_TX, info); if (err) return err; - ctx->txq_idx = i++; }
return err;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit ecc391a541573da46b7ccc188105efedd40aef1b ]
The context is supposed to record the next queue to dump, not last dumped. If the dump doesn't fit we will restart from the already-dumped queue, duplicating the message.
Before this fix and with the selftest improvements later in this series we see:
# ./run_kselftest.sh -t drivers/net:stats.py timeout set to 45 selftests: drivers/net: stats.py KTAP version 1 1..5 ok 1 stats.check_pause ok 2 stats.check_fec ok 3 stats.pkt_byte_sum # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 45 != 44 repeated queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 45 != 44 missing queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 45 != 44 repeated queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 45 != 44 missing queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 103 != 100 repeated queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 103 != 100 missing queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 102 != 100 repeated queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 102 != 100 missing queue keys not ok 4 stats.qstat_by_ifindex ok 5 stats.check_down # Totals: pass:4 fail:1 xfail:0 xpass:0 skip:0 error:0
With the fix:
# ./ksft-net-drv/run_kselftest.sh -t drivers/net:stats.py timeout set to 45 selftests: drivers/net: stats.py KTAP version 1 1..5 ok 1 stats.check_pause ok 2 stats.check_fec ok 3 stats.pkt_byte_sum ok 4 stats.qstat_by_ifindex ok 5 stats.check_down # Totals: pass:5 fail:0 xfail:0 xpass:0 skip:0 error:0
Fixes: ab63a2387cb9 ("netdev: add per-queue statistics") Reviewed-by: Joe Damato jdamato@fastly.com Link: https://patch.msgid.link/20241213152244.3080955-3-kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/netdev-genl.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 71359922ae8b..224d1b5b79a7 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -597,7 +597,7 @@ netdev_nl_stats_by_queue(struct net_device *netdev, struct sk_buff *rsp, i, info); if (err) return err; - ctx->rxq_idx = i++; + ctx->rxq_idx = ++i; } i = ctx->txq_idx; while (ops->get_queue_stats_tx && i < netdev->real_num_tx_queues) { @@ -605,7 +605,7 @@ netdev_nl_stats_by_queue(struct net_device *netdev, struct sk_buff *rsp, i, info); if (err) return err; - ctx->txq_idx = i++; + ctx->txq_idx = ++i; }
ctx->rxq_idx = 0;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann daniel@iogearbox.net
[ Upstream commit e78c20f327bd94dabac68b98218dff069a8780f0 ]
Small follow-up to align this to an equivalent behavior as the bond driver. The change in 3625920b62c3 ("teaming: fix vlan_features computing") removed the netdevice vlan_features when there is no team port attached, yet it leaves the full set of enc_features intact.
Instead, leave the default features as pre 3625920b62c3, and recompute once we do have ports attached. Also, similarly as in bonding case, call the netdev_base_features() helper on the enc_features.
Fixes: 3625920b62c3 ("teaming: fix vlan_features computing") Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Nikolay Aleksandrov razor@blackwall.org Link: https://patch.msgid.link/20241213123657.401868-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/team/team_core.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c index 6ace5a74cddb..1c85dda83825 100644 --- a/drivers/net/team/team_core.c +++ b/drivers/net/team/team_core.c @@ -998,9 +998,13 @@ static void __team_compute_features(struct team *team) unsigned int dst_release_flag = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
+ rcu_read_lock(); + if (list_empty(&team->port_list)) + goto done; + vlan_features = netdev_base_features(vlan_features); + enc_features = netdev_base_features(enc_features);
- rcu_read_lock(); list_for_each_entry_rcu(port, &team->port_list, list) { vlan_features = netdev_increment_features(vlan_features, port->dev->vlan_features, @@ -1010,11 +1014,11 @@ static void __team_compute_features(struct team *team) port->dev->hw_enc_features, TEAM_ENC_FEATURES);
- dst_release_flag &= port->dev->priv_flags; if (port->dev->hard_header_len > max_hard_header_len) max_hard_header_len = port->dev->hard_header_len; } +done: rcu_read_unlock();
team->dev->vlan_features = vlan_features;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 7203d10e93b6e6e1d19481ef7907de6a9133a467 ]
There is a check for NULL at the start of create_txqs() and create_rxqs() which tess if "nic_dev->txqs" is non-NULL. The intention is that if the device is already open and the queues are already created then we don't create them a second time.
However, the bug is that if we have an error in the create_txqs() then the pointer doesn't get set back to NULL. The NULL check at the start of the function will say that it's already open when it's not and the device can't be used.
Set ->txqs back to NULL on cleanup on error.
Fixes: c3e79baf1b03 ("net-next/hinic: Add logical Txq and Rxq") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/0cc98faf-a0ed-4565-a55b-0fa2734bc205@stanley.mounta... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/huawei/hinic/hinic_main.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c b/drivers/net/ethernet/huawei/hinic/hinic_main.c index 890f213da8d1..ae1f523d6841 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_main.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c @@ -172,6 +172,7 @@ static int create_txqs(struct hinic_dev *nic_dev) hinic_sq_dbgfs_uninit(nic_dev);
devm_kfree(&netdev->dev, nic_dev->txqs); + nic_dev->txqs = NULL; return err; }
@@ -268,6 +269,7 @@ static int create_rxqs(struct hinic_dev *nic_dev) hinic_rq_dbgfs_uninit(nic_dev);
devm_kfree(&netdev->dev, nic_dev->rxqs); + nic_dev->rxqs = NULL; return err; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Parthiban Veerasooran parthiban.veerasooran@microchip.com
[ Upstream commit 7d2f320e12744e5906a4fab40381060a81d22c12 ]
SPI thread wakes up to perform SPI transfer whenever there is an TX skb from n/w stack or interrupt from MAC-PHY. Ethernet frame from TX skb is transferred based on the availability tx credits in the MAC-PHY which is reported from the previous SPI transfer. Sometimes there is a possibility that TX skb is available to transmit but there is no tx credits from MAC-PHY. In this case, there will not be any SPI transfer but the thread will be running in an endless loop until tx credits available again.
So checking the availability of tx credits along with TX skb will prevent the above infinite loop. When the tx credits available again that will be notified through interrupt which will trigger the SPI transfer to get the available tx credits.
Fixes: 53fbde8ab21e ("net: ethernet: oa_tc6: implement transmit path to transfer tx ethernet frames") Reviewed-by: Jacob Keller jacob.e.keller@intel.com Signed-off-by: Parthiban Veerasooran parthiban.veerasooran@microchip.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/oa_tc6.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/oa_tc6.c b/drivers/net/ethernet/oa_tc6.c index f9c0dcd965c2..4c8b0ca922b7 100644 --- a/drivers/net/ethernet/oa_tc6.c +++ b/drivers/net/ethernet/oa_tc6.c @@ -1111,8 +1111,9 @@ static int oa_tc6_spi_thread_handler(void *data) /* This kthread will be waken up if there is a tx skb or mac-phy * interrupt to perform spi transfer with tx chunks. */ - wait_event_interruptible(tc6->spi_wq, tc6->waiting_tx_skb || - tc6->int_flag || + wait_event_interruptible(tc6->spi_wq, tc6->int_flag || + (tc6->waiting_tx_skb && + tc6->tx_credits) || kthread_should_stop());
if (kthread_should_stop())
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Parthiban Veerasooran parthiban.veerasooran@microchip.com
[ Upstream commit e592b5110b3e9393881b0a019d86832bbf71a47f ]
There are two skb pointers to manage tx skb's enqueued from n/w stack. waiting_tx_skb pointer points to the tx skb which needs to be processed and ongoing_tx_skb pointer points to the tx skb which is being processed.
SPI thread prepares the tx data chunks from the tx skb pointed by the ongoing_tx_skb pointer. When the tx skb pointed by the ongoing_tx_skb is processed, the tx skb pointed by the waiting_tx_skb is assigned to ongoing_tx_skb and the waiting_tx_skb pointer is assigned with NULL. Whenever there is a new tx skb from n/w stack, it will be assigned to waiting_tx_skb pointer if it is NULL. Enqueuing and processing of a tx skb handled in two different threads.
Consider a scenario where the SPI thread processed an ongoing_tx_skb and it moves next tx skb from waiting_tx_skb pointer to ongoing_tx_skb pointer without doing any NULL check. At this time, if the waiting_tx_skb pointer is NULL then ongoing_tx_skb pointer is also assigned with NULL. After that, if a new tx skb is assigned to waiting_tx_skb pointer by the n/w stack and there is a chance to overwrite the tx skb pointer with NULL in the SPI thread. Finally one of the tx skb will be left as unhandled, resulting packet missing and memory leak.
- Consider the below scenario where the TXC reported from the previous transfer is 10 and ongoing_tx_skb holds an tx ethernet frame which can be transported in 20 TXCs and waiting_tx_skb is still NULL. tx_credits = 10; /* 21 are filled in the previous transfer */ ongoing_tx_skb = 20; waiting_tx_skb = NULL; /* Still NULL */ - So, (tc6->ongoing_tx_skb || tc6->waiting_tx_skb) becomes true. - After oa_tc6_prepare_spi_tx_buf_for_tx_skbs() ongoing_tx_skb = 10; waiting_tx_skb = NULL; /* Still NULL */ - Perform SPI transfer. - Process SPI rx buffer to get the TXC from footers. - Now let's assume previously filled 21 TXCs are freed so we are good to transport the next remaining 10 tx chunks from ongoing_tx_skb. tx_credits = 21; ongoing_tx_skb = 10; waiting_tx_skb = NULL; - So, (tc6->ongoing_tx_skb || tc6->waiting_tx_skb) becomes true again. - In the oa_tc6_prepare_spi_tx_buf_for_tx_skbs() ongoing_tx_skb = NULL; waiting_tx_skb = NULL;
- Now the below bad case might happen,
Thread1 (oa_tc6_start_xmit) Thread2 (oa_tc6_spi_thread_handler) --------------------------- ----------------------------------- - if waiting_tx_skb is NULL - if ongoing_tx_skb is NULL - ongoing_tx_skb = waiting_tx_skb - waiting_tx_skb = skb - waiting_tx_skb = NULL ... - ongoing_tx_skb = NULL - if waiting_tx_skb is NULL - waiting_tx_skb = skb
To overcome the above issue, protect the moving of tx skb reference from waiting_tx_skb pointer to ongoing_tx_skb pointer and assigning new tx skb to waiting_tx_skb pointer, so that the other thread can't access the waiting_tx_skb pointer until the current thread completes moving the tx skb reference safely.
Fixes: 53fbde8ab21e ("net: ethernet: oa_tc6: implement transmit path to transfer tx ethernet frames") Signed-off-by: Parthiban Veerasooran parthiban.veerasooran@microchip.com Reviewed-by: Larysa Zaremba larysa.zaremba@intel.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/oa_tc6.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/oa_tc6.c b/drivers/net/ethernet/oa_tc6.c index 4c8b0ca922b7..db200e4ec284 100644 --- a/drivers/net/ethernet/oa_tc6.c +++ b/drivers/net/ethernet/oa_tc6.c @@ -113,6 +113,7 @@ struct oa_tc6 { struct mii_bus *mdiobus; struct spi_device *spi; struct mutex spi_ctrl_lock; /* Protects spi control transfer */ + spinlock_t tx_skb_lock; /* Protects tx skb handling */ void *spi_ctrl_tx_buf; void *spi_ctrl_rx_buf; void *spi_data_tx_buf; @@ -1004,8 +1005,10 @@ static u16 oa_tc6_prepare_spi_tx_buf_for_tx_skbs(struct oa_tc6 *tc6) for (used_tx_credits = 0; used_tx_credits < tc6->tx_credits; used_tx_credits++) { if (!tc6->ongoing_tx_skb) { + spin_lock_bh(&tc6->tx_skb_lock); tc6->ongoing_tx_skb = tc6->waiting_tx_skb; tc6->waiting_tx_skb = NULL; + spin_unlock_bh(&tc6->tx_skb_lock); } if (!tc6->ongoing_tx_skb) break; @@ -1210,7 +1213,9 @@ netdev_tx_t oa_tc6_start_xmit(struct oa_tc6 *tc6, struct sk_buff *skb) return NETDEV_TX_OK; }
+ spin_lock_bh(&tc6->tx_skb_lock); tc6->waiting_tx_skb = skb; + spin_unlock_bh(&tc6->tx_skb_lock);
/* Wake spi kthread to perform spi transfer */ wake_up_interruptible(&tc6->spi_wq); @@ -1240,6 +1245,7 @@ struct oa_tc6 *oa_tc6_init(struct spi_device *spi, struct net_device *netdev) tc6->netdev = netdev; SET_NETDEV_DEV(netdev, &spi->dev); mutex_init(&tc6->spi_ctrl_lock); + spin_lock_init(&tc6->tx_skb_lock);
/* Set the SPI controller to pump at realtime priority */ tc6->spi->rt = true;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
[ Upstream commit 0cb2c504d79e7caa3abade3f466750c82ad26f01 ]
The OF node obtained by of_parse_phandle() is not freed. Call of_node_put() to balance the refcount.
This bug was found by an experimental static analysis tool that I am developing.
Fixes: 1676aba5ef7e ("net: ethernet: bgmac: device tree phy enablement") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20241214014912.2810315-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bgmac-platform.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bgmac-platform.c b/drivers/net/ethernet/broadcom/bgmac-platform.c index 77425c7a32db..78f7862ca006 100644 --- a/drivers/net/ethernet/broadcom/bgmac-platform.c +++ b/drivers/net/ethernet/broadcom/bgmac-platform.c @@ -171,6 +171,7 @@ static int platform_phy_connect(struct bgmac *bgmac) static int bgmac_probe(struct platform_device *pdev) { struct device_node *np = pdev->dev.of_node; + struct device_node *phy_node; struct bgmac *bgmac; struct resource *regs; int ret; @@ -236,7 +237,9 @@ static int bgmac_probe(struct platform_device *pdev) bgmac->cco_ctl_maskset = platform_bgmac_cco_ctl_maskset; bgmac->get_bus_clock = platform_bgmac_get_bus_clock; bgmac->cmn_maskset32 = platform_bgmac_cmn_maskset32; - if (of_parse_phandle(np, "phy-handle", 0)) { + phy_node = of_parse_phandle(np, "phy-handle", 0); + if (phy_node) { + of_node_put(phy_node); bgmac->phy_connect = platform_phy_connect; } else { bgmac->phy_connect = bgmac_phy_connect_direct;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit b9b8301d369b4c876de5255dbf067b19ba88ac71 ]
nsim_pp_hold_write() has two problems:
1) It may return with rtnl held, as found by syzbot.
2) Its return value does not propagate an error if any.
Fixes: 1580cbcbfe77 ("net: netdevsim: add some fake page pool use") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20241216083703.1859921-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/netdevsim/netdev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c index 017a6102be0a..1b29d1d794a2 100644 --- a/drivers/net/netdevsim/netdev.c +++ b/drivers/net/netdevsim/netdev.c @@ -596,10 +596,10 @@ nsim_pp_hold_write(struct file *file, const char __user *data, page_pool_put_full_page(ns->page->pp, ns->page, false); ns->page = NULL; } - rtnl_unlock();
exit: - return count; + rtnl_unlock(); + return ret; }
static const struct file_operations nsim_pp_hold_fops = {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Schiffer matthias.schiffer@ew.tq-group.com
[ Upstream commit fca2977629f49dee437e217c3fc423b6e0cad98c ]
While an m_can controller usually already has the init flag from a hardware reset, no such reset happens on the integrated m_can_pci of the Intel Elkhart Lake. If the CAN controller is found in an active state, m_can_dev_setup() would fail because m_can_niso_supported() calls m_can_cccr_update_bits(), which refuses to modify any other configuration bits when CCCR_INIT is not set.
To avoid this issue, set CCCR_INIT before attempting to modify any other configuration flags.
Fixes: cd5a46ce6fa6 ("can: m_can: don't enable transceiver when probing") Signed-off-by: Matthias Schiffer matthias.schiffer@ew.tq-group.com Reviewed-by: Markus Schneider-Pargmann msp@baylibre.com Link: https://patch.msgid.link/e247f331cb72829fcbdfda74f31a59cbad1a6006.1728288535... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/m_can/m_can.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c index 533bcb77c9f9..67c404fbe166 100644 --- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -1695,6 +1695,14 @@ static int m_can_dev_setup(struct m_can_classdev *cdev) return -EINVAL; }
+ /* Write the INIT bit, in case no hardware reset has happened before + * the probe (for example, it was observed that the Intel Elkhart Lake + * SoCs do not properly reset the CAN controllers on reboot) + */ + err = m_can_cccr_update_bits(cdev, CCCR_INIT, CCCR_INIT); + if (err) + return err; + if (!cdev->is_peripheral) netif_napi_add(dev, &cdev->napi, m_can_poll);
@@ -1746,11 +1754,7 @@ static int m_can_dev_setup(struct m_can_classdev *cdev) return -EINVAL; }
- /* Forcing standby mode should be redundant, as the chip should be in - * standby after a reset. Write the INIT bit anyways, should the chip - * be configured by previous stage. - */ - return m_can_cccr_update_bits(cdev, CCCR_INIT, CCCR_INIT); + return 0; }
static void m_can_stop(struct net_device *dev)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Schiffer matthias.schiffer@ew.tq-group.com
[ Upstream commit 743375f8deee360b0e902074bab99b0c9368d42f ]
The interrupt line of PCI devices is interpreted as edge-triggered, however the interrupt signal of the m_can controller integrated in Intel Elkhart Lake CPUs appears to be generated level-triggered.
Consider the following sequence of events:
- IR register is read, interrupt X is set - A new interrupt Y is triggered in the m_can controller - IR register is written to acknowledge interrupt X. Y remains set in IR
As at no point in this sequence no interrupt flag is set in IR, the m_can interrupt line will never become deasserted, and no edge will ever be observed to trigger another run of the ISR. This was observed to result in the TX queue of the EHL m_can to get stuck under high load, because frames were queued to the hardware in m_can_start_xmit(), but m_can_finish_tx() was never run to account for their successful transmission.
On an Elkhart Lake based board with the two CAN interfaces connected to each other, the following script can reproduce the issue:
ip link set can0 up type can bitrate 1000000 ip link set can1 up type can bitrate 1000000
cangen can0 -g 2 -I 000 -L 8 & cangen can0 -g 2 -I 001 -L 8 & cangen can0 -g 2 -I 002 -L 8 & cangen can0 -g 2 -I 003 -L 8 & cangen can0 -g 2 -I 004 -L 8 & cangen can0 -g 2 -I 005 -L 8 & cangen can0 -g 2 -I 006 -L 8 & cangen can0 -g 2 -I 007 -L 8 &
cangen can1 -g 2 -I 100 -L 8 & cangen can1 -g 2 -I 101 -L 8 & cangen can1 -g 2 -I 102 -L 8 & cangen can1 -g 2 -I 103 -L 8 & cangen can1 -g 2 -I 104 -L 8 & cangen can1 -g 2 -I 105 -L 8 & cangen can1 -g 2 -I 106 -L 8 & cangen can1 -g 2 -I 107 -L 8 &
stress-ng --matrix 0 &
To fix the issue, repeatedly read and acknowledge interrupts at the start of the ISR until no interrupt flags are set, so the next incoming interrupt will also result in an edge on the interrupt line.
While we have received a report that even with this patch, the TX queue can become stuck under certain (currently unknown) circumstances on the Elkhart Lake, this patch completely fixes the issue with the above reproducer, and it is unclear whether the remaining issue has a similar cause at all.
Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake") Signed-off-by: Matthias Schiffer matthias.schiffer@ew.tq-group.com Reviewed-by: Markus Schneider-Pargmann msp@baylibre.com Link: https://patch.msgid.link/fdf0439c51bcb3a46c21e9fb21c7f1d06363be84.1728288535... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/m_can/m_can.c | 22 +++++++++++++++++----- drivers/net/can/m_can/m_can.h | 1 + drivers/net/can/m_can/m_can_pci.c | 1 + 3 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c index 67c404fbe166..97cd8bbf2e32 100644 --- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -1220,20 +1220,32 @@ static void m_can_coalescing_update(struct m_can_classdev *cdev, u32 ir) static int m_can_interrupt_handler(struct m_can_classdev *cdev) { struct net_device *dev = cdev->net; - u32 ir; + u32 ir = 0, ir_read; int ret;
if (pm_runtime_suspended(cdev->dev)) return IRQ_NONE;
- ir = m_can_read(cdev, M_CAN_IR); + /* The m_can controller signals its interrupt status as a level, but + * depending in the integration the CPU may interpret the signal as + * edge-triggered (for example with m_can_pci). For these + * edge-triggered integrations, we must observe that IR is 0 at least + * once to be sure that the next interrupt will generate an edge. + */ + while ((ir_read = m_can_read(cdev, M_CAN_IR)) != 0) { + ir |= ir_read; + + /* ACK all irqs */ + m_can_write(cdev, M_CAN_IR, ir); + + if (!cdev->irq_edge_triggered) + break; + } + m_can_coalescing_update(cdev, ir); if (!ir) return IRQ_NONE;
- /* ACK all irqs */ - m_can_write(cdev, M_CAN_IR, ir); - if (cdev->ops->clear_interrupts) cdev->ops->clear_interrupts(cdev);
diff --git a/drivers/net/can/m_can/m_can.h b/drivers/net/can/m_can/m_can.h index 92b2bd8628e6..ef39e8e527ab 100644 --- a/drivers/net/can/m_can/m_can.h +++ b/drivers/net/can/m_can/m_can.h @@ -99,6 +99,7 @@ struct m_can_classdev { int pm_clock_support; int pm_wake_source; int is_peripheral; + bool irq_edge_triggered;
// Cached M_CAN_IE register content u32 active_interrupts; diff --git a/drivers/net/can/m_can/m_can_pci.c b/drivers/net/can/m_can/m_can_pci.c index d72fe771dfc7..9ad7419f88f8 100644 --- a/drivers/net/can/m_can/m_can_pci.c +++ b/drivers/net/can/m_can/m_can_pci.c @@ -127,6 +127,7 @@ static int m_can_pci_probe(struct pci_dev *pci, const struct pci_device_id *id) mcan_class->pm_clock_support = 1; mcan_class->pm_wake_source = 0; mcan_class->can.clock.freq = id->driver_data; + mcan_class->irq_edge_triggered = true; mcan_class->ops = &m_can_pci_ops;
pci_set_drvdata(pci, mcan_class);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Laight David.Laight@ACULAB.COM
[ Upstream commit cf2c97423a4f89c8b798294d3f34ecfe7e7035c3 ]
The 'max_avail' value is calculated from the system memory size using order_base_2(). order_base_2(x) is defined as '(x) ? fn(x) : 0'. The compiler generates two copies of the code that follows and then expands clamp(max, min, PAGE_SHIFT - 12) (11 on 32bit). This triggers a compile-time assert since min is 5.
In reality a system would have to have less than 512MB memory for the bounds passed to clamp to be reversed.
Swap the order of the arguments to clamp() to avoid the warning.
Replace the clamp_val() on the line below with clamp(). clamp_val() is just 'an accident waiting to happen' and not needed here.
Detected by compile time checks added to clamp(), specifically: minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
Reported-by: Linux Kernel Functional Testing lkft@linaro.org Closes: https://lore.kernel.org/all/CA+G9fYsT34UkGFKxus63H6UVpYi5GRZkezT9MRLfAbM3f6k... Fixes: 4f325e26277b ("ipvs: dynamically limit the connection hash table") Tested-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Reviewed-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: David Laight david.laight@aculab.com Acked-by: Julian Anastasov ja@ssi.bg Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipvs/ip_vs_conn.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c index 98d7dbe3d787..c0289f83f96d 100644 --- a/net/netfilter/ipvs/ip_vs_conn.c +++ b/net/netfilter/ipvs/ip_vs_conn.c @@ -1495,8 +1495,8 @@ int __init ip_vs_conn_init(void) max_avail -= 2; /* ~4 in hash row */ max_avail -= 1; /* IPVS up to 1/2 of mem */ max_avail -= order_base_2(sizeof(struct ip_vs_conn)); - max = clamp(max, min, max_avail); - ip_vs_conn_tab_bits = clamp_val(ip_vs_conn_tab_bits, min, max); + max = clamp(max_avail, min, max); + ip_vs_conn_tab_bits = clamp(ip_vs_conn_tab_bits, min, max); ip_vs_conn_tab_size = 1 << ip_vs_conn_tab_bits; ip_vs_conn_tab_mask = ip_vs_conn_tab_size - 1;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Phil Sutter phil@nwl.cc
[ Upstream commit 70b6f46a4ed8bd56c85ffff22df91e20e8c85e33 ]
With CONFIG_PROVE_LOCKING, when creating a set of type bitmap:ip, adding it to a set of type list:set and populating it from iptables SET target triggers a kernel warning:
| WARNING: possible recursive locking detected | 6.12.0-rc7-01692-g5e9a28f41134-dirty #594 Not tainted | -------------------------------------------- | ping/4018 is trying to acquire lock: | ffff8881094a6848 (&set->lock){+.-.}-{2:2}, at: ip_set_add+0x28c/0x360 [ip_set] | | but task is already holding lock: | ffff88811034c048 (&set->lock){+.-.}-{2:2}, at: ip_set_add+0x28c/0x360 [ip_set]
This is a false alarm: ipset does not allow nested list:set type, so the loop in list_set_kadd() can never encounter the outer set itself. No other set type supports embedded sets, so this is the only case to consider.
To avoid the false report, create a distinct lock class for list:set type ipset locks.
Fixes: f830837f0eed ("netfilter: ipset: list:set set type support") Signed-off-by: Phil Sutter phil@nwl.cc Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipset/ip_set_list_set.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c index bfae7066936b..db794fe1300e 100644 --- a/net/netfilter/ipset/ip_set_list_set.c +++ b/net/netfilter/ipset/ip_set_list_set.c @@ -611,6 +611,8 @@ init_list_set(struct net *net, struct ip_set *set, u32 size) return true; }
+static struct lock_class_key list_set_lockdep_key; + static int list_set_create(struct net *net, struct ip_set *set, struct nlattr *tb[], u32 flags) @@ -627,6 +629,7 @@ list_set_create(struct net *net, struct ip_set *set, struct nlattr *tb[], if (size < IP_SET_LIST_MIN_SIZE) size = IP_SET_LIST_MIN_SIZE;
+ lockdep_set_class(&set->lock, &list_set_lockdep_key); set->variant = &set_variant; set->dsize = ip_set_elem_len(set, tb, sizeof(struct set_elem), __alignof__(struct set_elem));
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Moreno amorenoz@redhat.com
[ Upstream commit a17975992cc11588767175247ccaae1213a8b582 ]
Fix the way tcpdump is executed by: - Using the right variable for the namespace. Currently the use of the empty "ns" makes the command fail. - Waiting until it starts to capture to ensure the interesting traffic is caught on slow systems. - Using line-buffered output to ensure logs are available when the test is paused with "-p". Otherwise the last chunk of data might only be written when tcpdump is killed.
Fixes: 74cc26f416b9 ("selftests: openvswitch: add interface support") Signed-off-by: Adrian Moreno amorenoz@redhat.com Acked-by: Eelco Chaudron echaudro@redhat.com Link: https://patch.msgid.link/20241217211652.483016-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/openvswitch/openvswitch.sh | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh index cc0bfae2bafa..960e1ab4dd04 100755 --- a/tools/testing/selftests/net/openvswitch/openvswitch.sh +++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh @@ -171,8 +171,10 @@ ovs_add_netns_and_veths () { ovs_add_if "$1" "$2" "$4" -u || return 1 fi
- [ $TRACING -eq 1 ] && ovs_netns_spawn_daemon "$1" "$ns" \ - tcpdump -i any -s 65535 + if [ $TRACING -eq 1 ]; then + ovs_netns_spawn_daemon "$1" "$3" tcpdump -l -i any -s 6553 + ovs_wait grep -q "listening on any" ${ovs_dir}/stderr + fi
return 0 }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 16f027cd40eeedd2325f7e720689462ca8d9d13e ]
Robert Hodaszi reports that locally terminated traffic towards VLAN-unaware bridge ports is broken with ocelot-8021q. He is describing the same symptoms as for commit 1f9fc48fd302 ("net: dsa: sja1105: fix reception from VLAN-unaware bridges").
For context, the set merged as "VLAN fixes for Ocelot driver": https://lore.kernel.org/netdev/20240815000707.2006121-1-vladimir.oltean@nxp....
was developed in a slightly different form earlier this year, in January. Initially, the switch was unconditionally configured to set OCELOT_ES0_TAG when using ocelot-8021q, regardless of port operating mode.
This led to the situation where VLAN-unaware bridge ports would always push their PVID - see ocelot_vlan_unaware_pvid() - a negligible value anyway - into RX packets. To strip this in software, we would have needed DSA to know what private VID the switch chose for VLAN-unaware bridge ports, and pushed into the packets. This was implemented downstream, and a remnant of it remains in the form of a comment mentioning ds->ops->get_private_vid(), as something which would maybe need to be considered in the future.
However, for upstream, it was deemed inappropriate, because it would mean introducing yet another behavior for stripping VLAN tags from VLAN-unaware bridge ports, when one already existed (ds->untag_bridge_pvid). The latter has been marked as obsolete along with an explanation why it is logically broken, but still, it would have been confusing.
So, for upstream, felix_update_tag_8021q_rx_rule() was developed, which essentially changed the state of affairs from "Felix with ocelot-8021q delivers all packets as VLAN-tagged towards the CPU" into "Felix with ocelot-8021q delivers all packets from VLAN-aware bridge ports towards the CPU". This was done on the premise that in VLAN-unaware mode, there's nothing useful in the VLAN tags, and we can avoid introducing ds->ops->get_private_vid() in the DSA receive path if we configure the switch to not push those VLAN tags into packets in the first place.
Unfortunately, and this is when the trainwreck started, the selftests developed initially and posted with the series were not re-ran. dsa_software_vlan_untag() was initially written given the assumption that users of this feature would send _all_ traffic as VLAN-tagged. It was only partially adapted to the new scheme, by removing ds->ops->get_private_vid(), which also used to be necessary in standalone ports mode.
Where the trainwreck became even worse is that I had a second opportunity to think about this, when the dsa_software_vlan_untag() logic change initially broke sja1105, in commit 1f9fc48fd302 ("net: dsa: sja1105: fix reception from VLAN-unaware bridges"). I did not connect the dots that it also breaks ocelot-8021q, for pretty much the same reason that not all received packets will be VLAN-tagged.
To be compatible with the optimized Felix control path which runs felix_update_tag_8021q_rx_rule() to only push VLAN tags when useful (in VLAN-aware mode), we need to restore the old dsa_software_vlan_untag() logic. The blamed commit introduced the assumption that dsa_software_vlan_untag() will see only VLAN-tagged packets, assumption which is false. What corrupts RX traffic is the fact that we call skb_vlan_untag() on packets which are not VLAN-tagged in the first place.
Fixes: 93e4649efa96 ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges") Reported-by: Robert Hodaszi robert.hodaszi@digi.com Closes: https://lore.kernel.org/netdev/20241215163334.615427-1-robert.hodaszi@digi.c... Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://patch.msgid.link/20241216135059.1258266-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/dsa/tag.h | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/net/dsa/tag.h b/net/dsa/tag.h index d5707870906b..5d80ddad4ff6 100644 --- a/net/dsa/tag.h +++ b/net/dsa/tag.h @@ -138,9 +138,10 @@ static inline void dsa_software_untag_vlan_unaware_bridge(struct sk_buff *skb, * dsa_software_vlan_untag: Software VLAN untagging in DSA receive path * @skb: Pointer to socket buffer (packet) * - * Receive path method for switches which cannot avoid tagging all packets - * towards the CPU port. Called when ds->untag_bridge_pvid (legacy) or - * ds->untag_vlan_aware_bridge_pvid is set to true. + * Receive path method for switches which send some packets as VLAN-tagged + * towards the CPU port (generally from VLAN-aware bridge ports) even when the + * packet was not tagged on the wire. Called when ds->untag_bridge_pvid + * (legacy) or ds->untag_vlan_aware_bridge_pvid is set to true. * * As a side effect of this method, any VLAN tag from the skb head is moved * to hwaccel. @@ -149,14 +150,19 @@ static inline struct sk_buff *dsa_software_vlan_untag(struct sk_buff *skb) { struct dsa_port *dp = dsa_user_to_port(skb->dev); struct net_device *br = dsa_port_bridge_dev_get(dp); - u16 vid; + u16 vid, proto; + int err;
/* software untagging for standalone ports not yet necessary */ if (!br) return skb;
+ err = br_vlan_get_proto(br, &proto); + if (err) + return skb; + /* Move VLAN tag from data to hwaccel */ - if (!skb_vlan_tag_present(skb)) { + if (!skb_vlan_tag_present(skb) && skb->protocol == htons(proto)) { skb = skb_vlan_untag(skb); if (!skb) return NULL;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 5eb70dbebf32c2fd1f2814c654ae17fc47d6e859 ]
Empty netlink responses from do() are not correct (as opposed to dump() where not dumping anything is perfectly fine). We should return an error if the target object does not exist, in this case if the netdev is down it has no queues.
Fixes: 6b6171db7fc8 ("netdev-genl: Add netlink framework functions for queue") Reported-by: syzbot+0a884bc2d304ce4af70f@syzkaller.appspotmail.com Reviewed-by: Eric Dumazet edumazet@google.com Reviewed-by: Joe Damato jdamato@fastly.com Link: https://patch.msgid.link/20241218022508.815344-1-kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/netdev-genl.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 224d1b5b79a7..7ce22f40db5b 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -359,10 +359,10 @@ static int netdev_nl_queue_fill(struct sk_buff *rsp, struct net_device *netdev, u32 q_idx, u32 q_type, const struct genl_info *info) { - int err = 0; + int err;
if (!(netdev->flags & IFF_UP)) - return err; + return -ENOENT;
err = netdev_nl_queue_validate(netdev, q_idx, q_type); if (err)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Moreno amorenoz@redhat.com
[ Upstream commit 5eecd85c77a254a43bde3212da8047b001745c9f ]
If PSAMPLE_ATTR_SAMPLE_PROBABILITY flag is to be sent, the available size for the packet data has to be adjusted accordingly.
Also, check the error code returned by nla_put_flag.
Fixes: 7b1b2b60c63f ("net: psample: allow using rate as probability") Signed-off-by: Adrian Moreno amorenoz@redhat.com Reviewed-by: Aaron Conole aconole@redhat.com Reviewed-by: Ido Schimmel idosch@nvidia.com Link: https://patch.msgid.link/20241217113739.3929300-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/psample/psample.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/psample/psample.c b/net/psample/psample.c index a0ddae8a65f9..25f92ba0840c 100644 --- a/net/psample/psample.c +++ b/net/psample/psample.c @@ -393,7 +393,9 @@ void psample_sample_packet(struct psample_group *group, nla_total_size_64bit(sizeof(u64)) + /* timestamp */ nla_total_size(sizeof(u16)) + /* protocol */ (md->user_cookie_len ? - nla_total_size(md->user_cookie_len) : 0); /* user cookie */ + nla_total_size(md->user_cookie_len) : 0) + /* user cookie */ + (md->rate_as_probability ? + nla_total_size(0) : 0); /* rate as probability */
#ifdef CONFIG_INET tun_info = skb_tunnel_info(skb); @@ -498,8 +500,9 @@ void psample_sample_packet(struct psample_group *group, md->user_cookie)) goto error;
- if (md->rate_as_probability) - nla_put_flag(nl_skb, PSAMPLE_ATTR_SAMPLE_PROBABILITY); + if (md->rate_as_probability && + nla_put_flag(nl_skb, PSAMPLE_ATTR_SAMPLE_PROBABILITY)) + goto error;
genlmsg_end(nl_skb, data); genlmsg_multicast_netns(&psample_nl_family, group->net, nl_skb, 0,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
[ Upstream commit 572af9f284669d31d9175122bbef9bc62cea8ded ]
fwnode_find_mii_timestamper() calls of_parse_phandle_with_fixed_args() but does not decrement the refcount of the obtained OF node. Add an of_node_put() call before returning from the function.
This bug was detected by an experimental static analysis tool that I am developing.
Fixes: bc1bee3b87ee ("net: mdiobus: Introduce fwnode_mdiobus_register_phy()") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20241218035106.1436405-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/mdio/fwnode_mdio.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/net/mdio/fwnode_mdio.c b/drivers/net/mdio/fwnode_mdio.c index b156493d7084..aea0f0357568 100644 --- a/drivers/net/mdio/fwnode_mdio.c +++ b/drivers/net/mdio/fwnode_mdio.c @@ -40,6 +40,7 @@ fwnode_find_pse_control(struct fwnode_handle *fwnode) static struct mii_timestamper * fwnode_find_mii_timestamper(struct fwnode_handle *fwnode) { + struct mii_timestamper *mii_ts; struct of_phandle_args arg; int err;
@@ -53,10 +54,16 @@ fwnode_find_mii_timestamper(struct fwnode_handle *fwnode) else if (err) return ERR_PTR(err);
- if (arg.args_count != 1) - return ERR_PTR(-EINVAL); + if (arg.args_count != 1) { + mii_ts = ERR_PTR(-EINVAL); + goto put_node; + } + + mii_ts = register_mii_timestamper(arg.np, arg.args[0]);
- return register_mii_timestamper(arg.np, arg.args[0]); +put_node: + of_node_put(arg.np); + return mii_ts; }
int fwnode_mdiobus_phy_device_register(struct mii_bus *mdio,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prathamesh Shete pshete@nvidia.com
commit a56335c85b592cb2833db0a71f7112b7d9f0d56b upstream.
Value 0 in ADMA length descriptor is interpreted as 65536 on new Tegra chips, remove SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk to make sure max ADMA2 length is 65536.
Fixes: 4346b7c7941d ("mmc: tegra: Add Tegra186 support") Cc: stable@vger.kernel.org Signed-off-by: Prathamesh Shete pshete@nvidia.com Acked-by: Thierry Reding treding@nvidia.com Acked-by: Adrian Hunter adrian.hunter@intel.com Message-ID: 20241209101009.22710-1-pshete@nvidia.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-tegra.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/mmc/host/sdhci-tegra.c +++ b/drivers/mmc/host/sdhci-tegra.c @@ -1525,7 +1525,6 @@ static const struct sdhci_pltfm_data sdh .quirks = SDHCI_QUIRK_BROKEN_TIMEOUT_VAL | SDHCI_QUIRK_SINGLE_POWER_WRITE | SDHCI_QUIRK_NO_HISPD_BIT | - SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC | SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN, .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN | SDHCI_QUIRK2_ISSUE_CMD_DAT_RESET_TOGETHER,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
commit f3d87abe11ed04d1b23a474a212f0e5deeb50892 upstream.
Current implementation leaves pdev->dev as a wakeup source. Add a device_init_wakeup(&pdev->dev, false) call in the .remove() function and in the error path of the .probe() function.
Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Fixes: 527f36f5efa4 ("mmc: mediatek: add support for SDIO eint wakup IRQ") Cc: stable@vger.kernel.org Message-ID: 20241203023442.2434018-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/mtk-sd.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/mmc/host/mtk-sd.c +++ b/drivers/mmc/host/mtk-sd.c @@ -2924,6 +2924,7 @@ release_clk: msdc_gate_clock(host); platform_set_drvdata(pdev, NULL); release_mem: + device_init_wakeup(&pdev->dev, false); if (host->dma.gpd) dma_free_coherent(&pdev->dev, 2 * sizeof(struct mt_gpdma_desc), @@ -2957,6 +2958,7 @@ static void msdc_drv_remove(struct platf host->dma.gpd, host->dma.gpd_addr); dma_free_coherent(&pdev->dev, MAX_BD_NUM * sizeof(struct mt_bdma_desc), host->dma.bd, host->dma.bd_addr); + device_init_wakeup(&pdev->dev, false); }
static void msdc_save_reg(struct msdc_host *host)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier maz@kernel.org
commit 773c05f417fa14e1ac94776619e9c978ec001f0b upstream.
It appears that the relatively popular RK3399 SoC has been put together using a large amount of illicit substances, as experiments reveal that its integration of GIC500 exposes the *secure* programming interface to non-secure.
This has some pretty bad effects on the way priorities are handled, and results in a dead machine if booting with pseudo-NMI enabled (irqchip.gicv3_pseudo_nmi=1) if the kernel contains 18fdb6348c480 ("arm64: irqchip/gic-v3: Select priorities at boot time"), which relies on the priorities being programmed using the NS view.
Let's restore some sanity by going one step further and disable security altogether in this case. This is not any worse, and puts us in a mode where priorities actually make some sense.
Huge thanks to Mark Kettenis who initially identified this issue on OpenBSD, and to Chen-Yu Tsai who reported the problem in Linux.
Fixes: 18fdb6348c480 ("arm64: irqchip/gic-v3: Select priorities at boot time") Reported-by: Mark Kettenis mark.kettenis@xs4all.nl Reported-by: Chen-Yu Tsai wens@csie.org Signed-off-by: Marc Zyngier maz@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Chen-Yu Tsai wens@csie.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241213141037.3995049-1-maz@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/irqchip/irq-gic-v3.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index 34db379d066a..79d8cc80693c 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -161,7 +161,22 @@ static bool cpus_have_group0 __ro_after_init;
static void __init gic_prio_init(void) { - cpus_have_security_disabled = gic_dist_security_disabled(); + bool ds; + + ds = gic_dist_security_disabled(); + if (!ds) { + u32 val; + + val = readl_relaxed(gic_data.dist_base + GICD_CTLR); + val |= GICD_CTLR_DS; + writel_relaxed(val, gic_data.dist_base + GICD_CTLR); + + ds = gic_dist_security_disabled(); + if (ds) + pr_warn("Broken GIC integration, security disabled"); + } + + cpus_have_security_disabled = ds; cpus_have_group0 = gic_has_group0();
/*
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov (AMD) bp@alien8.de
commit 747367340ca6b5070728b86ae36ad6747f66b2fb upstream.
The intent of the check is to see whether at least one UMC has ECC enabled. So do that instead of tracking which ones are enabled in masks which are too small in size anyway and lead to not loading the driver on Zen4 machines with UMCs enabled over UMC8.
Fixes: e2be5955a886 ("EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh") Reported-by: Avadhut Naik avadhut.naik@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Tested-by: Avadhut Naik avadhut.naik@amd.com Reviewed-by: Avadhut Naik avadhut.naik@amd.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20241210212054.3895697-1-avadhut.naik@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/edac/amd64_edac.c | 34 +++++++++++----------------------- 1 file changed, 11 insertions(+), 23 deletions(-)
--- a/drivers/edac/amd64_edac.c +++ b/drivers/edac/amd64_edac.c @@ -3362,36 +3362,24 @@ static bool dct_ecc_enabled(struct amd64
static bool umc_ecc_enabled(struct amd64_pvt *pvt) { - u8 umc_en_mask = 0, ecc_en_mask = 0; - u16 nid = pvt->mc_node_id; struct amd64_umc *umc; - u8 ecc_en = 0, i; + bool ecc_en = false; + int i;
+ /* Check whether at least one UMC is enabled: */ for_each_umc(i) { umc = &pvt->umc[i];
- /* Only check enabled UMCs. */ - if (!(umc->sdp_ctrl & UMC_SDP_INIT)) - continue; - - umc_en_mask |= BIT(i); - - if (umc->umc_cap_hi & UMC_ECC_ENABLED) - ecc_en_mask |= BIT(i); + if (umc->sdp_ctrl & UMC_SDP_INIT && + umc->umc_cap_hi & UMC_ECC_ENABLED) { + ecc_en = true; + break; + } }
- /* Check whether at least one UMC is enabled: */ - if (umc_en_mask) - ecc_en = umc_en_mask == ecc_en_mask; - else - edac_dbg(0, "Node %d: No enabled UMCs.\n", nid); - - edac_dbg(3, "Node %d: DRAM ECC %s.\n", nid, (ecc_en ? "enabled" : "disabled")); - - if (!ecc_en) - return false; - else - return true; + edac_dbg(3, "Node %d: DRAM ECC %s.\n", pvt->mc_node_id, (ecc_en ? "enabled" : "disabled")); + + return ecc_en; }
static inline void
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier maz@kernel.org
commit 03c7527e97f73081633d773f9f8c2373f9854b25 upstream.
Catalin reports that a hypervisor lying to a guest about the size of the ASID field may result in unexpected issues:
- if the underlying HW does only supports 8 bit ASIDs, the ASID field in a TLBI VAE1* operation is only 8 bits, and the HW will ignore the other 8 bits
- if on the contrary the HW is 16 bit capable, the ASID field in the same TLBI operation is always 16 bits, irrespective of the value of TCR_ELx.AS.
This could lead to missed invalidations if the guest was lead to assume that the HW had 8 bit ASIDs while they really are 16 bit wide.
In order to avoid any potential disaster that would be hard to debug, prenent the migration between a host with 8 bit ASIDs to one with wider ASIDs (the converse was obviously always forbidden). This is also consistent with what we already do for VMIDs.
If it becomes absolutely mandatory to support such a migration path in the future, we will have to trap and emulate all TLBIs, something that nobody should look forward to.
Fixes: d5a32b60dc18 ("KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1") Reported-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Cc: Will Deacon will@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Marc Zyngier maz@kernel.org Cc: James Morse james.morse@arm.com Cc: Oliver Upton oliver.upton@linux.dev Acked-by: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/r/20241203190236.505759-1-maz@kernel.org Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/sys_regs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2503,7 +2503,8 @@ static const struct sys_reg_desc sys_reg ID_WRITABLE(ID_AA64MMFR0_EL1, ~(ID_AA64MMFR0_EL1_RES0 | ID_AA64MMFR0_EL1_TGRAN4_2 | ID_AA64MMFR0_EL1_TGRAN64_2 | - ID_AA64MMFR0_EL1_TGRAN16_2)), + ID_AA64MMFR0_EL1_TGRAN16_2 | + ID_AA64MMFR0_EL1_ASIDBITS)), ID_WRITABLE(ID_AA64MMFR1_EL1, ~(ID_AA64MMFR1_EL1_RES0 | ID_AA64MMFR1_EL1_HCX | ID_AA64MMFR1_EL1_TWED |
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 1201f226c863b7da739f7420ddba818cedf372fc upstream.
Snapshot the output of CPUID.0xD.[1..n] during kvm.ko initiliaization to avoid the overead of CPUID during runtime. The offset, size, and metadata for CPUID.0xD.[1..n] sub-leaves does not depend on XCR0 or XSS values, i.e. is constant for a given CPU, and thus can be cached during module load.
On Intel's Emerald Rapids, CPUID is *wildly* expensive, to the point where recomputing XSAVE offsets and sizes results in a 4x increase in latency of nested VM-Enter and VM-Exit (nested transitions can trigger xstate_required_size() multiple times per transition), relative to using cached values. The issue is easily visible by running `perf top` while triggering nested transitions: kvm_update_cpuid_runtime() shows up at a whopping 50%.
As measured via RDTSC from L2 (using KVM-Unit-Test's CPUID VM-Exit test and a slightly modified L1 KVM to handle CPUID in the fastpath), a nested roundtrip to emulate CPUID on Skylake (SKX), Icelake (ICX), and Emerald Rapids (EMR) takes:
SKX 11650 ICX 22350 EMR 28850
Using cached values, the latency drops to:
SKX 6850 ICX 9000 EMR 7900
The underlying issue is that CPUID itself is slow on ICX, and comically slow on EMR. The problem is exacerbated on CPUs which support XSAVES and/or XSAVEC, as KVM invokes xstate_required_size() twice on each runtime CPUID update, and because there are more supported XSAVE features (CPUID for supported XSAVE feature sub-leafs is significantly slower).
SKX: CPUID.0xD.2 = 348 cycles CPUID.0xD.3 = 400 cycles CPUID.0xD.4 = 276 cycles CPUID.0xD.5 = 236 cycles <other sub-leaves are similar>
EMR: CPUID.0xD.2 = 1138 cycles CPUID.0xD.3 = 1362 cycles CPUID.0xD.4 = 1068 cycles CPUID.0xD.5 = 910 cycles CPUID.0xD.6 = 914 cycles CPUID.0xD.7 = 1350 cycles CPUID.0xD.8 = 734 cycles CPUID.0xD.9 = 766 cycles CPUID.0xD.10 = 732 cycles CPUID.0xD.11 = 718 cycles CPUID.0xD.12 = 734 cycles CPUID.0xD.13 = 1700 cycles CPUID.0xD.14 = 1126 cycles CPUID.0xD.15 = 898 cycles CPUID.0xD.16 = 716 cycles CPUID.0xD.17 = 748 cycles CPUID.0xD.18 = 776 cycles
Note, updating runtime CPUID information multiple times per nested transition is itself a flaw, especially since CPUID is a mandotory intercept on both Intel and AMD. E.g. KVM doesn't need to ensure emulated CPUID state is up-to-date while running L2. That flaw will be fixed in a future patch, as deferring runtime CPUID updates is more subtle than it appears at first glance, the benefits aren't super critical to have once the XSAVE issue is resolved, and caching CPUID output is desirable even if KVM's updates are deferred.
Cc: Jim Mattson jmattson@google.com Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Message-ID: 20241211013302.1347853-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/cpuid.c | 31 ++++++++++++++++++++++++++----- arch/x86/kvm/cpuid.h | 1 + arch/x86/kvm/x86.c | 2 ++ 3 files changed, 29 insertions(+), 5 deletions(-)
--- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -36,6 +36,26 @@ u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly; EXPORT_SYMBOL_GPL(kvm_cpu_caps);
+struct cpuid_xstate_sizes { + u32 eax; + u32 ebx; + u32 ecx; +}; + +static struct cpuid_xstate_sizes xstate_sizes[XFEATURE_MAX] __ro_after_init; + +void __init kvm_init_xstate_sizes(void) +{ + u32 ign; + int i; + + for (i = XFEATURE_YMM; i < ARRAY_SIZE(xstate_sizes); i++) { + struct cpuid_xstate_sizes *xs = &xstate_sizes[i]; + + cpuid_count(0xD, i, &xs->eax, &xs->ebx, &xs->ecx, &ign); + } +} + u32 xstate_required_size(u64 xstate_bv, bool compacted) { int feature_bit = 0; @@ -44,14 +64,15 @@ u32 xstate_required_size(u64 xstate_bv, xstate_bv &= XFEATURE_MASK_EXTEND; while (xstate_bv) { if (xstate_bv & 0x1) { - u32 eax, ebx, ecx, edx, offset; - cpuid_count(0xD, feature_bit, &eax, &ebx, &ecx, &edx); + struct cpuid_xstate_sizes *xs = &xstate_sizes[feature_bit]; + u32 offset; + /* ECX[1]: 64B alignment in compacted form */ if (compacted) - offset = (ecx & 0x2) ? ALIGN(ret, 64) : ret; + offset = (xs->ecx & 0x2) ? ALIGN(ret, 64) : ret; else - offset = ebx; - ret = max(ret, offset + eax); + offset = xs->ebx; + ret = max(ret, offset + xs->eax); }
xstate_bv >>= 1; --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -32,6 +32,7 @@ int kvm_vcpu_ioctl_get_cpuid2(struct kvm bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx, bool exact_only);
+void __init kvm_init_xstate_sizes(void); u32 xstate_required_size(u64 xstate_bv, bool compacted);
int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu); --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -14010,6 +14010,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_rmp_fau
static int __init kvm_x86_init(void) { + kvm_init_xstate_sizes(); + kvm_mmu_x86_module_init(); mitigate_smt_rsb &= boot_cpu_has_bug(X86_BUG_SMT_RSB) && cpu_smt_possible(); return 0;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
commit 429fde2d81bcef0ebab002215358955704586457 upstream.
syzbot reported the following crash [1]
Issue came with the blamed commit. Instead of going through all the iov components, we keep using the first one and end up with a malformed skb.
[1]
kernel BUG at net/core/skbuff.c:2849 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 6230 Comm: syz-executor132 Not tainted 6.13.0-rc1-syzkaller-00407-g96b6fcc0ee41 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024 RIP: 0010:__pskb_pull_tail+0x1568/0x1570 net/core/skbuff.c:2848 Code: 38 c1 0f 8c 32 f1 ff ff 4c 89 f7 e8 92 96 74 f8 e9 25 f1 ff ff e8 e8 ae 09 f8 48 8b 5c 24 08 e9 eb fb ff ff e8 d9 ae 09 f8 90 <0f> 0b 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffc90004cbef30 EFLAGS: 00010293 RAX: ffffffff8995c347 RBX: 00000000fffffff2 RCX: ffff88802cf45a00 RDX: 0000000000000000 RSI: 00000000fffffff2 RDI: 0000000000000000 RBP: ffff88807df0c06a R08: ffffffff8995b084 R09: 1ffff1100fbe185c R10: dffffc0000000000 R11: ffffed100fbe185d R12: ffff888076e85d50 R13: ffff888076e85c80 R14: ffff888076e85cf4 R15: ffff888076e85c80 FS: 00007f0dca6ea6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0dca6ead58 CR3: 00000000119da000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_cow_data+0x2da/0xcb0 net/core/skbuff.c:5284 tipc_aead_decrypt net/tipc/crypto.c:894 [inline] tipc_crypto_rcv+0x402/0x24e0 net/tipc/crypto.c:1844 tipc_rcv+0x57e/0x12a0 net/tipc/node.c:2109 tipc_l2_rcv_msg+0x2bd/0x450 net/tipc/bearer.c:668 __netif_receive_skb_list_ptype net/core/dev.c:5720 [inline] __netif_receive_skb_list_core+0x8b7/0x980 net/core/dev.c:5762 __netif_receive_skb_list net/core/dev.c:5814 [inline] netif_receive_skb_list_internal+0xa51/0xe30 net/core/dev.c:5905 gro_normal_list include/net/gro.h:515 [inline] napi_complete_done+0x2b5/0x870 net/core/dev.c:6256 napi_complete include/linux/netdevice.h:567 [inline] tun_get_user+0x2ea0/0x4890 drivers/net/tun.c:1982 tun_chr_write_iter+0x10d/0x1f0 drivers/net/tun.c:2057 do_iter_readv_writev+0x600/0x880 vfs_writev+0x376/0xba0 fs/read_write.c:1050 do_writev+0x1b6/0x360 fs/read_write.c:1096 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: de4f5fed3f23 ("iov_iter: add iter_iovec() helper") Reported-by: syzbot+4f66250f6663c0c1d67e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/675b61aa.050a0220.599f4.00bb.GAE@google.com/T... Cc: stable@vger.kernel.org Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Joe Damato jdamato@fastly.com Reviewed-by: Jens Axboe axboe@kernel.dk Acked-by: Willem de Bruijn willemb@google.com Acked-by: Michael S. Tsirkin mst@redhat.com Link: https://patch.msgid.link/20241212222247.724674-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/tun.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1481,7 +1481,7 @@ static struct sk_buff *tun_napi_alloc_fr skb->truesize += skb->data_len;
for (i = 1; i < it->nr_segs; i++) { - const struct iovec *iov = iter_iov(it); + const struct iovec *iov = iter_iov(it) + i; size_t fragsz = iov->iov_len; struct page *page; void *frag;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit fbbd84af6ba70334335bdeba3ae536cf751c14c6 upstream.
The "gl->tot_len" variable is controlled by the user. It comes from process_responses(). On 32bit systems, the "gl->tot_len + sizeof(struct cpl_pass_accept_req) + sizeof(struct rss_header)" addition could have an integer wrapping bug. Use size_add() to prevent this.
Fixes: a08943947873 ("crypto: chtls - Register chtls with net tls") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/c6bfb23c-2db2-4e1b-b8ab-ba3925c82ef5@stanley.mounta... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c @@ -346,8 +346,9 @@ static struct sk_buff *copy_gl_to_skb_pk * driver. Once driver synthesizes cpl_pass_accpet_req the skb will go * through the regular cpl_pass_accept_req processing in TOM. */ - skb = alloc_skb(gl->tot_len + sizeof(struct cpl_pass_accept_req) - - pktshift, GFP_ATOMIC); + skb = alloc_skb(size_add(gl->tot_len, + sizeof(struct cpl_pass_accept_req)) - + pktshift, GFP_ATOMIC); if (unlikely(!skb)) return NULL; __skb_put(skb, gl->tot_len + sizeof(struct cpl_pass_accept_req)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeremy Kerr jk@codeconstruct.com.au
commit ce1219c3f76bb131d095e90521506d3c6ccfa086 upstream.
Currently, we don't use the return value from sock_queue_rcv_skb, which means we may leak skbs if a message is not successfully queued to a socket.
Instead, ensure that we're freeing the skb where the sock hasn't otherwise taken ownership of the skb by adding checks on the sock_queue_rcv_skb() to invoke a kfree on failure.
In doing so, rather than using the 'rc' value to trigger the kfree_skb(), use the skb pointer itself, which is more explicit.
Also, add a kunit test for the sock delivery failure cases.
Fixes: 4a992bbd3650 ("mctp: Implement message fragmentation & reassembly") Cc: stable@vger.kernel.org Signed-off-by: Jeremy Kerr jk@codeconstruct.com.au Link: https://patch.msgid.link/20241218-mctp-next-v2-1-1c1729645eaa@codeconstruct.... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mctp/route.c | 36 +++++++++++++----- net/mctp/test/route-test.c | 86 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 112 insertions(+), 10 deletions(-)
--- a/net/mctp/route.c +++ b/net/mctp/route.c @@ -374,8 +374,13 @@ static int mctp_route_input(struct mctp_ msk = NULL; rc = -EINVAL;
- /* we may be receiving a locally-routed packet; drop source sk - * accounting + /* We may be receiving a locally-routed packet; drop source sk + * accounting. + * + * From here, we will either queue the skb - either to a frag_queue, or + * to a receiving socket. When that succeeds, we clear the skb pointer; + * a non-NULL skb on exit will be otherwise unowned, and hence + * kfree_skb()-ed. */ skb_orphan(skb);
@@ -434,7 +439,9 @@ static int mctp_route_input(struct mctp_ * pending key. */ if (flags & MCTP_HDR_FLAG_EOM) { - sock_queue_rcv_skb(&msk->sk, skb); + rc = sock_queue_rcv_skb(&msk->sk, skb); + if (!rc) + skb = NULL; if (key) { /* we've hit a pending reassembly; not much we * can do but drop it @@ -443,7 +450,6 @@ static int mctp_route_input(struct mctp_ MCTP_TRACE_KEY_REPLIED); key = NULL; } - rc = 0; goto out_unlock; }
@@ -470,8 +476,10 @@ static int mctp_route_input(struct mctp_ * this function. */ rc = mctp_key_add(key, msk); - if (!rc) + if (!rc) { trace_mctp_key_acquire(key); + skb = NULL; + }
/* we don't need to release key->lock on exit, so * clean up here and suppress the unlock via @@ -489,6 +497,8 @@ static int mctp_route_input(struct mctp_ key = NULL; } else { rc = mctp_frag_queue(key, skb); + if (!rc) + skb = NULL; } }
@@ -503,12 +513,19 @@ static int mctp_route_input(struct mctp_ else rc = mctp_frag_queue(key, skb);
+ if (rc) + goto out_unlock; + + /* we've queued; the queue owns the skb now */ + skb = NULL; + /* end of message? deliver to socket, and we're done with * the reassembly/response key */ - if (!rc && flags & MCTP_HDR_FLAG_EOM) { - sock_queue_rcv_skb(key->sk, key->reasm_head); - key->reasm_head = NULL; + if (flags & MCTP_HDR_FLAG_EOM) { + rc = sock_queue_rcv_skb(key->sk, key->reasm_head); + if (!rc) + key->reasm_head = NULL; __mctp_key_done_in(key, net, f, MCTP_TRACE_KEY_REPLIED); key = NULL; } @@ -527,8 +544,7 @@ out_unlock: if (any_key) mctp_key_unref(any_key); out: - if (rc) - kfree_skb(skb); + kfree_skb(skb); return rc; }
--- a/net/mctp/test/route-test.c +++ b/net/mctp/test/route-test.c @@ -837,6 +837,90 @@ static void mctp_test_route_input_multip mctp_test_route_input_multiple_nets_key_fini(test, &t2); }
+/* Input route to socket, using a single-packet message, where sock delivery + * fails. Ensure we're handling the failure appropriately. + */ +static void mctp_test_route_input_sk_fail_single(struct kunit *test) +{ + const struct mctp_hdr hdr = RX_HDR(1, 10, 8, FL_S | FL_E | FL_TO); + struct mctp_test_route *rt; + struct mctp_test_dev *dev; + struct socket *sock; + struct sk_buff *skb; + int rc; + + __mctp_route_test_init(test, &dev, &rt, &sock, MCTP_NET_ANY); + + /* No rcvbuf space, so delivery should fail. __sock_set_rcvbuf will + * clamp the minimum to SOCK_MIN_RCVBUF, so we open-code this. + */ + lock_sock(sock->sk); + WRITE_ONCE(sock->sk->sk_rcvbuf, 0); + release_sock(sock->sk); + + skb = mctp_test_create_skb(&hdr, 10); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, skb); + skb_get(skb); + + mctp_test_skb_set_dev(skb, dev); + + /* do route input, which should fail */ + rc = mctp_route_input(&rt->rt, skb); + KUNIT_EXPECT_NE(test, rc, 0); + + /* we should hold the only reference to skb */ + KUNIT_EXPECT_EQ(test, refcount_read(&skb->users), 1); + kfree_skb(skb); + + __mctp_route_test_fini(test, dev, rt, sock); +} + +/* Input route to socket, using a fragmented message, where sock delivery fails. + */ +static void mctp_test_route_input_sk_fail_frag(struct kunit *test) +{ + const struct mctp_hdr hdrs[2] = { RX_FRAG(FL_S, 0), RX_FRAG(FL_E, 1) }; + struct mctp_test_route *rt; + struct mctp_test_dev *dev; + struct sk_buff *skbs[2]; + struct socket *sock; + unsigned int i; + int rc; + + __mctp_route_test_init(test, &dev, &rt, &sock, MCTP_NET_ANY); + + lock_sock(sock->sk); + WRITE_ONCE(sock->sk->sk_rcvbuf, 0); + release_sock(sock->sk); + + for (i = 0; i < ARRAY_SIZE(skbs); i++) { + skbs[i] = mctp_test_create_skb(&hdrs[i], 10); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, skbs[i]); + skb_get(skbs[i]); + + mctp_test_skb_set_dev(skbs[i], dev); + } + + /* first route input should succeed, we're only queueing to the + * frag list + */ + rc = mctp_route_input(&rt->rt, skbs[0]); + KUNIT_EXPECT_EQ(test, rc, 0); + + /* final route input should fail to deliver to the socket */ + rc = mctp_route_input(&rt->rt, skbs[1]); + KUNIT_EXPECT_NE(test, rc, 0); + + /* we should hold the only reference to both skbs */ + KUNIT_EXPECT_EQ(test, refcount_read(&skbs[0]->users), 1); + kfree_skb(skbs[0]); + + KUNIT_EXPECT_EQ(test, refcount_read(&skbs[1]->users), 1); + kfree_skb(skbs[1]); + + __mctp_route_test_fini(test, dev, rt, sock); +} + #if IS_ENABLED(CONFIG_MCTP_FLOWS)
static void mctp_test_flow_init(struct kunit *test, @@ -1053,6 +1137,8 @@ static struct kunit_case mctp_test_cases mctp_route_input_sk_reasm_gen_params), KUNIT_CASE_PARAM(mctp_test_route_input_sk_keys, mctp_route_input_sk_keys_gen_params), + KUNIT_CASE(mctp_test_route_input_sk_fail_single), + KUNIT_CASE(mctp_test_route_input_sk_fail_frag), KUNIT_CASE(mctp_test_route_input_multiple_nets_bind), KUNIT_CASE(mctp_test_route_input_multiple_nets_key), KUNIT_CASE(mctp_test_packet_flow),
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming Lei ming.lei@redhat.com
commit 224749be6c23efe7fb8a030854f4fc5d1dd813b3 upstream.
This reverts commit be26ba96421ab0a8fa2055ccf7db7832a13c44d2.
Commit be26ba96421a ("block: Fix potential deadlock while freezing queue and acquiring sysfs_loc") actually reverts commit 22465bbac53c ("blk-mq: move cpuhp callback registering out of q->sysfs_lock"), and causes the original resctrl lockdep warning.
So revert it and we need to fix the issue in another way.
Cc: Nilay Shroff nilay@linux.ibm.com Fixes: be26ba96421a ("block: Fix potential deadlock while freezing queue and acquiring sysfs_loc") Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20241218101617.3275704-2-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/blk-mq-sysfs.c | 16 ++++++++++------ block/blk-mq.c | 29 +++++++++++------------------ block/blk-sysfs.c | 4 ++-- 3 files changed, 23 insertions(+), 26 deletions(-)
--- a/block/blk-mq-sysfs.c +++ b/block/blk-mq-sysfs.c @@ -275,13 +275,15 @@ void blk_mq_sysfs_unregister_hctxs(struc struct blk_mq_hw_ctx *hctx; unsigned long i;
- lockdep_assert_held(&q->sysfs_dir_lock); - + mutex_lock(&q->sysfs_dir_lock); if (!q->mq_sysfs_init_done) - return; + goto unlock;
queue_for_each_hw_ctx(q, hctx, i) blk_mq_unregister_hctx(hctx); + +unlock: + mutex_unlock(&q->sysfs_dir_lock); }
int blk_mq_sysfs_register_hctxs(struct request_queue *q) @@ -290,10 +292,9 @@ int blk_mq_sysfs_register_hctxs(struct r unsigned long i; int ret = 0;
- lockdep_assert_held(&q->sysfs_dir_lock); - + mutex_lock(&q->sysfs_dir_lock); if (!q->mq_sysfs_init_done) - return ret; + goto unlock;
queue_for_each_hw_ctx(q, hctx, i) { ret = blk_mq_register_hctx(hctx); @@ -301,5 +302,8 @@ int blk_mq_sysfs_register_hctxs(struct r break; }
+unlock: + mutex_unlock(&q->sysfs_dir_lock); + return ret; } --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4462,8 +4462,7 @@ static void blk_mq_realloc_hw_ctxs(struc unsigned long i, j;
/* protect against switching io scheduler */ - lockdep_assert_held(&q->sysfs_lock); - + mutex_lock(&q->sysfs_lock); for (i = 0; i < set->nr_hw_queues; i++) { int old_node; int node = blk_mq_get_hctx_node(set, i); @@ -4496,6 +4495,7 @@ static void blk_mq_realloc_hw_ctxs(struc
xa_for_each_start(&q->hctx_table, j, hctx, j) blk_mq_exit_hctx(q, set, hctx, j); + mutex_unlock(&q->sysfs_lock);
/* unregister cpuhp callbacks for exited hctxs */ blk_mq_remove_hw_queues_cpuhp(q); @@ -4527,14 +4527,10 @@ int blk_mq_init_allocated_queue(struct b
xa_init(&q->hctx_table);
- mutex_lock(&q->sysfs_lock); - blk_mq_realloc_hw_ctxs(set, q); if (!q->nr_hw_queues) goto err_hctxs;
- mutex_unlock(&q->sysfs_lock); - INIT_WORK(&q->timeout_work, blk_mq_timeout_work); blk_queue_rq_timeout(q, set->timeout ? set->timeout : 30 * HZ);
@@ -4553,7 +4549,6 @@ int blk_mq_init_allocated_queue(struct b return 0;
err_hctxs: - mutex_unlock(&q->sysfs_lock); blk_mq_release(q); err_exit: q->mq_ops = NULL; @@ -4934,12 +4929,12 @@ static bool blk_mq_elv_switch_none(struc return false;
/* q->elevator needs protection from ->sysfs_lock */ - lockdep_assert_held(&q->sysfs_lock); + mutex_lock(&q->sysfs_lock);
/* the check has to be done with holding sysfs_lock */ if (!q->elevator) { kfree(qe); - goto out; + goto unlock; }
INIT_LIST_HEAD(&qe->node); @@ -4949,7 +4944,9 @@ static bool blk_mq_elv_switch_none(struc __elevator_get(qe->type); list_add(&qe->node, head); elevator_disable(q); -out: +unlock: + mutex_unlock(&q->sysfs_lock); + return true; }
@@ -4978,9 +4975,11 @@ static void blk_mq_elv_switch_back(struc list_del(&qe->node); kfree(qe);
+ mutex_lock(&q->sysfs_lock); elevator_switch(q, t); /* drop the reference acquired in blk_mq_elv_switch_none */ elevator_put(t); + mutex_unlock(&q->sysfs_lock); }
static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, @@ -5000,11 +4999,8 @@ static void __blk_mq_update_nr_hw_queues if (set->nr_maps == 1 && nr_hw_queues == set->nr_hw_queues) return;
- list_for_each_entry(q, &set->tag_list, tag_set_list) { - mutex_lock(&q->sysfs_dir_lock); - mutex_lock(&q->sysfs_lock); + list_for_each_entry(q, &set->tag_list, tag_set_list) blk_mq_freeze_queue(q); - } /* * Switch IO scheduler to 'none', cleaning up the data associated * with the previous scheduler. We will switch back once we are done @@ -5060,11 +5056,8 @@ switch_back: list_for_each_entry(q, &set->tag_list, tag_set_list) blk_mq_elv_switch_back(&head, q);
- list_for_each_entry(q, &set->tag_list, tag_set_list) { + list_for_each_entry(q, &set->tag_list, tag_set_list) blk_mq_unfreeze_queue(q); - mutex_unlock(&q->sysfs_lock); - mutex_unlock(&q->sysfs_dir_lock); - }
/* Free the excess tags when nr_hw_queues shrink. */ for (i = set->nr_hw_queues; i < prev_nr_hw_queues; i++) --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -690,11 +690,11 @@ queue_attr_store(struct kobject *kobj, s return res; }
- mutex_lock(&q->sysfs_lock); blk_mq_freeze_queue(q); + mutex_lock(&q->sysfs_lock); res = entry->store(disk, page, length); - blk_mq_unfreeze_queue(q); mutex_unlock(&q->sysfs_lock); + blk_mq_unfreeze_queue(q); return res; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geert Uytterhoeven geert+renesas@glider.be
commit de6b43798d9043a7c749a0428dbb02d5fff156e5 upstream.
Currently, the RIIC driver may run the I2C bus faster than requested, which may cause subtle failures. E.g. Biju reported a measured bus speed of 450 kHz instead of the expected maximum of 400 kHz on RZ/G2L.
The initial calculation of the bus period uses DIV_ROUND_UP(), to make sure the actual bus speed never becomes faster than the requested bus speed. However, the subsequent division-by-two steps do not use round-up, which may lead to a too-small period, hence a too-fast and possible out-of-spec bus speed. E.g. on RZ/Five, requesting a bus speed of 100 resp. 400 kHz will yield too-fast target bus speeds of 100806 resp. 403226 Hz instead of 97656 resp. 390625 Hz.
Fix this by using DIV_ROUND_UP() in the subsequent divisions, too.
Tested on RZ/A1H, RZ/A2M, and RZ/Five.
Fixes: d982d66514192cdb ("i2c: riic: remove clock and frequency restrictions") Reported-by: Biju Das biju.das.jz@bp.renesas.com Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Cc: stable@vger.kernel.org # v4.15+ Link: https://lore.kernel.org/r/c59aea77998dfea1b4456c4b33b55ab216fcbf5e.173228474... Signed-off-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-riic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i2c/busses/i2c-riic.c +++ b/drivers/i2c/busses/i2c-riic.c @@ -352,7 +352,7 @@ static int riic_init_hw(struct riic_dev if (brl <= (0x1F + 3)) break;
- total_ticks /= 2; + total_ticks = DIV_ROUND_UP(total_ticks, 2); rate /= 2; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Bottomley James.Bottomley@HansenPartnership.com
commit 2ab0837cb91b7de507daa145d17b3b6b2efb3abf upstream.
When looking up a non-existent file, efivarfs returns -EINVAL if the file does not conform to the NAME-GUID format and -ENOENT if it does. This is caused by efivars_d_hash() returning -EINVAL if the name is not formatted correctly. This error is returned before simple_lookup() returns a negative dentry, and is the error value that the user sees.
Fix by removing this check. If the file does not exist, simple_lookup() will return a negative dentry leading to -ENOENT and efivarfs_create() already has a validity check before it creates an entry (and will correctly return -EINVAL)
Signed-off-by: James Bottomley James.Bottomley@HansenPartnership.com Cc: stable@vger.kernel.org [ardb: make efivarfs_valid_name() static] Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/efivarfs/inode.c | 2 +- fs/efivarfs/internal.h | 1 - fs/efivarfs/super.c | 3 --- 3 files changed, 1 insertion(+), 5 deletions(-)
--- a/fs/efivarfs/inode.c +++ b/fs/efivarfs/inode.c @@ -51,7 +51,7 @@ struct inode *efivarfs_get_inode(struct * * VariableName-12345678-1234-1234-1234-1234567891bc */ -bool efivarfs_valid_name(const char *str, int len) +static bool efivarfs_valid_name(const char *str, int len) { const char *s = str + len - EFI_VARIABLE_GUID_LEN;
--- a/fs/efivarfs/internal.h +++ b/fs/efivarfs/internal.h @@ -60,7 +60,6 @@ bool efivar_variable_is_removable(efi_gu
extern const struct file_operations efivarfs_file_operations; extern const struct inode_operations efivarfs_dir_inode_operations; -extern bool efivarfs_valid_name(const char *str, int len); extern struct inode *efivarfs_get_inode(struct super_block *sb, const struct inode *dir, int mode, dev_t dev, bool is_removable); --- a/fs/efivarfs/super.c +++ b/fs/efivarfs/super.c @@ -144,9 +144,6 @@ static int efivarfs_d_hash(const struct const unsigned char *s = qstr->name; unsigned int len = qstr->len;
- if (!efivarfs_valid_name(s, len)) - return -EINVAL; - while (len-- > EFI_VARIABLE_GUID_LEN) hash = partial_name_hash(*s++, hash);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
commit aef25be35d23ec768eed08bfcf7ca3cf9685bc28 upstream.
The Hexagon-specific constant extender optimization in LLVM may crash on Linux kernel code [1], such as fs/bcache/btree_io.c after commit 32ed4a620c54 ("bcachefs: Btree path tracepoints") in 6.12:
clang: llvm/lib/Target/Hexagon/HexagonConstExtenders.cpp:745: bool (anonymous namespace)::HexagonConstExtenders::ExtRoot::operator<(const HCE::ExtRoot &) const: Assertion `ThisB->getParent() == OtherB->getParent()' failed. Stack dump: 0. Program arguments: clang --target=hexagon-linux-musl ... fs/bcachefs/btree_io.c 1. <eof> parser at end of file 2. Code generation 3. Running pass 'Function Pass Manager' on module 'fs/bcachefs/btree_io.c'. 4. Running pass 'Hexagon constant-extender optimization' on function '@__btree_node_lock_nopath'
Without assertions enabled, there is just a hang during compilation.
This has been resolved in LLVM main (20.0.0) [2] and backported to LLVM 19.1.0 but the kernel supports LLVM 13.0.1 and newer, so disable the constant expander optimization using the '-mllvm' option when using a toolchain that is not fixed.
Cc: stable@vger.kernel.org Link: https://github.com/llvm/llvm-project/issues/99714 [1] Link: https://github.com/llvm/llvm-project/commit/68df06a0b2998765cb0a41353fcf0919... [2] Link: https://github.com/llvm/llvm-project/commit/2ab8d93061581edad3501561722ebd56... [3] Reviewed-by: Brian Cain bcain@quicinc.com Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/hexagon/Makefile | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/arch/hexagon/Makefile +++ b/arch/hexagon/Makefile @@ -32,3 +32,9 @@ KBUILD_LDFLAGS += $(ldflags-y) TIR_NAME := r19 KBUILD_CFLAGS += -ffixed-$(TIR_NAME) -DTHREADINFO_REG=$(TIR_NAME) -D__linux__ KBUILD_AFLAGS += -DTHREADINFO_REG=$(TIR_NAME) + +# Disable HexagonConstExtenders pass for LLVM versions prior to 19.1.0 +# https://github.com/llvm/llvm-project/issues/99714 +ifneq ($(call clang-min-version, 190100),y) +KBUILD_CFLAGS += -mllvm -hexagon-cext=false +endif
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Swanemar d.swanemar@gmail.com
commit fdad4fb7c506bea8b419f70ff2163d99962e8ede upstream.
Add the following TCL IK512 compositions:
0x0530: Modem + Diag + AT + MBIM T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 3 Spd=10000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=1bbb ProdID=0530 Rev=05.04 S: Manufacturer=TCL S: Product=TCL 5G USB Dongle S: SerialNumber=3136b91a C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=896mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 4 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
0x0640: ECM + Modem + Diag + AT T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 4 Spd=10000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=1bbb ProdID=0640 Rev=05.04 S: Manufacturer=TCL S: Product=TCL 5G USB Dongle S: SerialNumber=3136b91a C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=896mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
Signed-off-by: Daniel Swanemar d.swanemar@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -2385,6 +2385,10 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x60) }, + { USB_DEVICE_INTERFACE_CLASS(0x1bbb, 0x0530, 0xff), /* TCL IK512 MBIM */ + .driver_info = NCTRL(1) }, + { USB_DEVICE_INTERFACE_CLASS(0x1bbb, 0x0640, 0xff), /* TCL IK512 ECM */ + .driver_info = NCTRL(3) }, { } /* Terminating entry */ }; MODULE_DEVICE_TABLE(usb, option_ids);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Hrusecky michal.hrusecky@turris.com
commit 724d461e44dfc0815624d2a9792f2f2beb7ee46d upstream.
Update the USB serial option driver to support MeiG Smart SLM770A.
ID 2dee:4d57 Marvell Mobile Composite Device Bus
T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=2dee ProdID=4d57 Rev= 1.00 S: Manufacturer=Marvell S: Product=Mobile Composite Device Bus C:* #Ifs= 6 Cfg#= 1 Atr=c0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=03 I:* If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=89(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0e(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Tested successfully connecting to the Internet via rndis interface after dialing via AT commands on If#=3 or If#=4. Not sure of the purpose of the other serial interfaces.
Signed-off-by: Michal Hrusecky michal.hrusecky@turris.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -625,6 +625,8 @@ static void option_instat_callback(struc #define MEIGSMART_PRODUCT_SRM825L 0x4d22 /* MeiG Smart SLM320 based on UNISOC UIS8910 */ #define MEIGSMART_PRODUCT_SLM320 0x4d41 +/* MeiG Smart SLM770A based on ASR1803 */ +#define MEIGSMART_PRODUCT_SLM770A 0x4d57
/* Device flags */
@@ -2382,6 +2384,7 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, TOZED_PRODUCT_LT70C, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, LUAT_PRODUCT_AIR720U, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SLM320, 0xff, 0, 0) }, + { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SLM770A, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x60) },
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mank Wang mank.wang@netprisma.com
commit aa954ae08262bb5cd6ab18dd56a0b58c1315db8b upstream.
LCUK54-WRD's pid/vid 0x3731/0x010a 0x3731/0x010c
LCUK54-WWD's pid/vid 0x3731/0x010b 0x3731/0x010d
Above products use the exact same interface layout and option driver: MBIM + GNSS + DIAG + NMEA + AT + QDSS + DPL
T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 5 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=3731 ProdID=0101 Rev= 5.04 S: Manufacturer=NetPrisma S: Product=LCUK54-WRD S: SerialNumber=feeba631 C:* #Ifs= 8 Cfg#= 1 Atr=a0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00 I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=option E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none) E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=8f(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Mank Wang mank.wang@netprisma.com [ johan: use lower case hex notation ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -2377,6 +2377,18 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x0116, 0xff, 0xff, 0x30) }, /* NetPrisma LCUK54-WWD for Golbal EDU */ { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x0116, 0xff, 0x00, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x0116, 0xff, 0xff, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010a, 0xff, 0xff, 0x30) }, /* NetPrisma LCUK54-WRD for WWAN Ready */ + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010a, 0xff, 0x00, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010a, 0xff, 0xff, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010b, 0xff, 0xff, 0x30) }, /* NetPrisma LCUK54-WWD for WWAN Ready */ + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010b, 0xff, 0x00, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010b, 0xff, 0xff, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010c, 0xff, 0xff, 0x30) }, /* NetPrisma LCUK54-WRD for WWAN Ready */ + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010c, 0xff, 0x00, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010c, 0xff, 0xff, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010d, 0xff, 0xff, 0x30) }, /* NetPrisma LCUK54-WWD for WWAN Ready */ + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010d, 0xff, 0x00, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010d, 0xff, 0xff, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(OPPO_VENDOR_ID, OPPO_PRODUCT_R11, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0xff, 0x40) },
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jack Wu wojackbb@gmail.com
commit f07dfa6a1b65034a5c3ba3a555950d972f252757 upstream.
Add the MediaTek T7XX compositions:
T: Bus=03 Lev=01 Prnt=01 Port=05 Cnt=01 Dev#= 74 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0e8d ProdID=7129 Rev= 0.01 S: Manufacturer=MediaTek Inc. S: Product=USB DATA CARD S: SerialNumber=004402459035402 C:* #Ifs=10 Cfg#= 1 Atr=a0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00 I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=08(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=8a(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=09(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
------------------------------- | If Number | Function | ------------------------------- | 2 | USB AP Log Port | ------------------------------- | 3 | USB AP GNSS Port| ------------------------------- | 4 | USB AP META Port| ------------------------------- | 5 | ADB port | ------------------------------- | 6 | USB MD AT Port | ------------------------------ | 7 | USB MD META Port| ------------------------------- | 8 | USB NTZ Port | ------------------------------- | 9 | USB Debug port | -------------------------------
Signed-off-by: Jack Wu wojackbb@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -2249,6 +2249,8 @@ static const struct usb_device_id option .driver_info = NCTRL(2) }, { USB_DEVICE_AND_INTERFACE_INFO(MEDIATEK_VENDOR_ID, 0x7127, 0xff, 0x00, 0x00), .driver_info = NCTRL(2) | NCTRL(3) | NCTRL(4) }, + { USB_DEVICE_AND_INTERFACE_INFO(MEDIATEK_VENDOR_ID, 0x7129, 0xff, 0x00, 0x00), /* MediaTek T7XX */ + .driver_info = NCTRL(2) | NCTRL(3) | NCTRL(4) }, { USB_DEVICE(CELLIENT_VENDOR_ID, CELLIENT_PRODUCT_MEN200) }, { USB_DEVICE(CELLIENT_VENDOR_ID, CELLIENT_PRODUCT_MPL200), .driver_info = RSVD(1) | RSVD(4) },
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniele Palmas dnlplm@gmail.com
commit 8366e64a4454481339e7c56a8ad280161f2e441d upstream.
Add the following Telit FE910C04 compositions:
0x10c0: rmnet + tty (AT/NMEA) + tty (AT) + tty (diag) T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 13 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c0 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x10c4: rmnet + tty (AT) + tty (AT) + tty (diag) T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 14 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c4 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x10c8: rmnet + tty (AT) + tty (diag) + DPL (data packet logging) + adb T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c8 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Daniele Palmas dnlplm@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -1397,6 +1397,12 @@ static const struct usb_device_id option .driver_info = RSVD(0) | NCTRL(2) | RSVD(3) | RSVD(4) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10aa, 0xff), /* Telit FN920C04 (MBIM) */ .driver_info = NCTRL(3) | RSVD(4) | RSVD(5) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10c0, 0xff), /* Telit FE910C04 (rmnet) */ + .driver_info = RSVD(0) | NCTRL(3) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10c4, 0xff), /* Telit FE910C04 (rmnet) */ + .driver_info = RSVD(0) | NCTRL(3) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10c8, 0xff), /* Telit FE910C04 (rmnet) */ + .driver_info = RSVD(0) | NCTRL(2) | RSVD(3) | RSVD(4) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910), .driver_info = NCTRL(0) | RSVD(1) | RSVD(3) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910_DUAL_MODEM),
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathias Nyman mathias.nyman@linux.intel.com
commit e21ebe51af688eb98fd6269240212a3c7300deea upstream.
xHC hosts from several vendors have the same issue where endpoints start so slowly that a later queued 'Stop Endpoint' command may complete before endpoint is up and running.
The 'Stop Endpoint' command fails with context state error as the endpoint still appears as stopped.
See commit 42b758137601 ("usb: xhci: Limit Stop Endpoint retries") for details
CC: stable@vger.kernel.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20241217102122.2316814-2-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-ring.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -1192,8 +1192,6 @@ static void xhci_handle_cmd_stop_ep(stru * Keep retrying until the EP starts and stops again, on * chips where this is known to help. Wait for 100ms. */ - if (!(xhci->quirks & XHCI_NEC_HOST)) - break; if (time_is_before_jiffies(ep->stop_time + msecs_to_jiffies(100))) break; fallthrough;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
commit 8644b48714dca8bf2f42a4ff8311de8efc9bd8c3 upstream.
Intel Panther Lake-M/P has the same integrated Thunderbolt/USB4 controller as Lunar Lake. Add these PCI IDs to the driver list of supported devices.
Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thunderbolt/nhi.c | 8 ++++++++ drivers/thunderbolt/nhi.h | 4 ++++ 2 files changed, 12 insertions(+)
--- a/drivers/thunderbolt/nhi.c +++ b/drivers/thunderbolt/nhi.c @@ -1520,6 +1520,14 @@ static struct pci_device_id nhi_ids[] = .driver_data = (kernel_ulong_t)&icl_nhi_ops }, { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_LNL_NHI1), .driver_data = (kernel_ulong_t)&icl_nhi_ops }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_PTL_M_NHI0), + .driver_data = (kernel_ulong_t)&icl_nhi_ops }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_PTL_M_NHI1), + .driver_data = (kernel_ulong_t)&icl_nhi_ops }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_PTL_P_NHI0), + .driver_data = (kernel_ulong_t)&icl_nhi_ops }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_PTL_P_NHI1), + .driver_data = (kernel_ulong_t)&icl_nhi_ops }, { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_80G_NHI) }, { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_40G_NHI) },
--- a/drivers/thunderbolt/nhi.h +++ b/drivers/thunderbolt/nhi.h @@ -92,6 +92,10 @@ extern const struct tb_nhi_ops icl_nhi_o #define PCI_DEVICE_ID_INTEL_RPL_NHI1 0xa76d #define PCI_DEVICE_ID_INTEL_LNL_NHI0 0xa833 #define PCI_DEVICE_ID_INTEL_LNL_NHI1 0xa834 +#define PCI_DEVICE_ID_INTEL_PTL_M_NHI0 0xe333 +#define PCI_DEVICE_ID_INTEL_PTL_M_NHI1 0xe334 +#define PCI_DEVICE_ID_INTEL_PTL_P_NHI0 0xe433 +#define PCI_DEVICE_ID_INTEL_PTL_P_NHI1 0xe434
#define PCI_CLASS_SERIAL_USB_USB4 0x0c0340
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
commit 24740385cb0d6d22ab7fa7adf36546d5b3cdcf73 upstream.
When USB-C monitor is connected directly to Intel Barlow Ridge host, it goes into "redrive" mode that basically routes the DisplayPort signals directly from the GPU to the USB-C monitor without any tunneling needed. However, the host router must be powered on for this to work. Aaron reported that there are a couple of cases where this will not work with the current code:
- Booting with USB-C monitor plugged in. - Plugging in USB-C monitor when the host router is in sleep state (runtime suspended). - Plugging in USB-C device while the system is in system sleep state.
In all these cases once the host router is runtime suspended the picture on the connected USB-C display disappears too. This is certainly not what the user expected.
For this reason improve the redrive mode handling to keep the host router from runtime suspending when detect that any of the above cases is happening.
Fixes: a75e0684efe5 ("thunderbolt: Keep the domain powered when USB4 port is in redrive mode") Reported-by: Aaron Rainbolt arainbolt@kfocus.org Closes: https://lore.kernel.org/linux-usb/20241009220118.70bfedd0@kf-ir16/ Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thunderbolt/tb.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
--- a/drivers/thunderbolt/tb.c +++ b/drivers/thunderbolt/tb.c @@ -2059,6 +2059,37 @@ static void tb_exit_redrive(struct tb_po } }
+static void tb_switch_enter_redrive(struct tb_switch *sw) +{ + struct tb_port *port; + + tb_switch_for_each_port(sw, port) + tb_enter_redrive(port); +} + +/* + * Called during system and runtime suspend to forcefully exit redrive + * mode without querying whether the resource is available. + */ +static void tb_switch_exit_redrive(struct tb_switch *sw) +{ + struct tb_port *port; + + if (!(sw->quirks & QUIRK_KEEP_POWER_IN_DP_REDRIVE)) + return; + + tb_switch_for_each_port(sw, port) { + if (!tb_port_is_dpin(port)) + continue; + + if (port->redrive) { + port->redrive = false; + pm_runtime_put(&sw->dev); + tb_port_dbg(port, "exit redrive mode\n"); + } + } +} + static void tb_dp_resource_unavailable(struct tb *tb, struct tb_port *port) { struct tb_port *in, *out; @@ -2909,6 +2940,7 @@ static int tb_start(struct tb *tb, bool tb_create_usb3_tunnels(tb->root_switch); /* Add DP IN resources for the root switch */ tb_add_dp_resources(tb->root_switch); + tb_switch_enter_redrive(tb->root_switch); /* Make the discovered switches available to the userspace */ device_for_each_child(&tb->root_switch->dev, NULL, tb_scan_finalize_switch); @@ -2924,6 +2956,7 @@ static int tb_suspend_noirq(struct tb *t
tb_dbg(tb, "suspending...\n"); tb_disconnect_and_release_dp(tb); + tb_switch_exit_redrive(tb->root_switch); tb_switch_suspend(tb->root_switch, false); tcm->hotplug_active = false; /* signal tb_handle_hotplug to quit */ tb_dbg(tb, "suspend finished\n"); @@ -3016,6 +3049,7 @@ static int tb_resume_noirq(struct tb *tb tb_dbg(tb, "tunnels restarted, sleeping for 100ms\n"); msleep(100); } + tb_switch_enter_redrive(tb->root_switch); /* Allow tb_handle_hotplug to progress events */ tcm->hotplug_active = true; tb_dbg(tb, "resume finished\n"); @@ -3079,6 +3113,12 @@ static int tb_runtime_suspend(struct tb struct tb_cm *tcm = tb_priv(tb);
mutex_lock(&tb->lock); + /* + * The below call only releases DP resources to allow exiting and + * re-entering redrive mode. + */ + tb_disconnect_and_release_dp(tb); + tb_switch_exit_redrive(tb->root_switch); tb_switch_suspend(tb->root_switch, true); tcm->hotplug_active = false; mutex_unlock(&tb->lock); @@ -3110,6 +3150,7 @@ static int tb_runtime_resume(struct tb * tb_restore_children(tb->root_switch); list_for_each_entry_safe(tunnel, n, &tcm->tunnel_list, list) tb_tunnel_restart(tunnel); + tb_switch_enter_redrive(tb->root_switch); tcm->hotplug_active = true; mutex_unlock(&tb->lock);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
commit e34f1717ef0632fcec5cb827e5e0e9f223d70c9b upstream.
The read will never succeed if NVM wasn't initialized due to an unknown format.
Add a new callback for visibility to only show when supported.
Cc: stable@vger.kernel.org Fixes: aef9c693e7e5 ("thunderbolt: Move vendor specific NVM handling into nvm.c") Reported-by: Richard Hughes hughsient@gmail.com Closes: https://github.com/fwupd/fwupd/issues/8200 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thunderbolt/retimer.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
--- a/drivers/thunderbolt/retimer.c +++ b/drivers/thunderbolt/retimer.c @@ -103,6 +103,7 @@ static int tb_retimer_nvm_add(struct tb_
err_nvm: dev_dbg(&rt->dev, "NVM upgrade disabled\n"); + rt->no_nvm_upgrade = true; if (!IS_ERR(nvm)) tb_nvm_free(nvm);
@@ -182,8 +183,6 @@ static ssize_t nvm_authenticate_show(str
if (!rt->nvm) ret = -EAGAIN; - else if (rt->no_nvm_upgrade) - ret = -EOPNOTSUPP; else ret = sysfs_emit(buf, "%#x\n", rt->auth_status);
@@ -323,8 +322,6 @@ static ssize_t nvm_version_show(struct d
if (!rt->nvm) ret = -EAGAIN; - else if (rt->no_nvm_upgrade) - ret = -EOPNOTSUPP; else ret = sysfs_emit(buf, "%x.%x\n", rt->nvm->major, rt->nvm->minor);
@@ -342,6 +339,19 @@ static ssize_t vendor_show(struct device } static DEVICE_ATTR_RO(vendor);
+static umode_t retimer_is_visible(struct kobject *kobj, struct attribute *attr, + int n) +{ + struct device *dev = kobj_to_dev(kobj); + struct tb_retimer *rt = tb_to_retimer(dev); + + if (attr == &dev_attr_nvm_authenticate.attr || + attr == &dev_attr_nvm_version.attr) + return rt->no_nvm_upgrade ? 0 : attr->mode; + + return attr->mode; +} + static struct attribute *retimer_attrs[] = { &dev_attr_device.attr, &dev_attr_nvm_authenticate.attr, @@ -351,6 +361,7 @@ static struct attribute *retimer_attrs[] };
static const struct attribute_group retimer_group = { + .is_visible = retimer_is_visible, .attrs = retimer_attrs, };
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Karas krzysztof.karas@intel.com
commit 080b2e7b5e9ad23343e4b11f0751e4c724a78958 upstream.
Instead of returning a generic NULL on error from drm_dp_tunnel_mgr_create(), use error pointers with informative codes to align the function with stub that is executed when CONFIG_DRM_DISPLAY_DP_TUNNEL is unset. This will also trigger IS_ERR() in current caller (intel_dp_tunnerl_mgr_init()) instead of bypassing it via NULL pointer.
v2: use error codes inside drm_dp_tunnel_mgr_create() instead of handling on caller's side (Michal, Imre)
v3: fixup commit message and add "CC"/"Fixes" lines (Andi), mention aligning function code with stub
Fixes: 91888b5b1ad2 ("drm/i915/dp: Add support for DP tunnel BW allocation") Cc: Imre Deak imre.deak@intel.com Cc: stable@vger.kernel.org # v6.9+ Signed-off-by: Krzysztof Karas krzysztof.karas@intel.com Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Signed-off-by: Imre Deak imre.deak@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/7q4fpnmmztmchczjewgm6igy55qt6j... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/display/drm_dp_tunnel.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/display/drm_dp_tunnel.c b/drivers/gpu/drm/display/drm_dp_tunnel.c index 48b2df120086..90fe07a89260 100644 --- a/drivers/gpu/drm/display/drm_dp_tunnel.c +++ b/drivers/gpu/drm/display/drm_dp_tunnel.c @@ -1896,8 +1896,8 @@ static void destroy_mgr(struct drm_dp_tunnel_mgr *mgr) * * Creates a DP tunnel manager for @dev. * - * Returns a pointer to the tunnel manager if created successfully or NULL in - * case of an error. + * Returns a pointer to the tunnel manager if created successfully or error + * pointer in case of failure. */ struct drm_dp_tunnel_mgr * drm_dp_tunnel_mgr_create(struct drm_device *dev, int max_group_count) @@ -1907,7 +1907,7 @@ drm_dp_tunnel_mgr_create(struct drm_device *dev, int max_group_count)
mgr = kzalloc(sizeof(*mgr), GFP_KERNEL); if (!mgr) - return NULL; + return ERR_PTR(-ENOMEM);
mgr->dev = dev; init_waitqueue_head(&mgr->bw_req_queue); @@ -1916,7 +1916,7 @@ drm_dp_tunnel_mgr_create(struct drm_device *dev, int max_group_count) if (!mgr->groups) { kfree(mgr);
- return NULL; + return ERR_PTR(-ENOMEM); }
#ifdef CONFIG_DRM_DISPLAY_DP_TUNNEL_STATE_DEBUG @@ -1927,7 +1927,7 @@ drm_dp_tunnel_mgr_create(struct drm_device *dev, int max_group_count) if (!init_group(mgr, &mgr->groups[i])) { destroy_mgr(mgr);
- return NULL; + return ERR_PTR(-ENOMEM); }
mgr->group_count++;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
commit a7f9d98eb1202132014ba760c26ad8608ffc9caf upstream.
This helps to avoid a spurious PME event on hotplug to Azalia.
Cc: Vijendar Mukunda Vijendar.Mukunda@amd.com Reported-and-tested-by: ionut_n2001@yahoo.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215884 Tested-by: Gabriel Marcano gabemarcano@yahoo.com Acked-by: Alex Deucher alexander.deucher@amd.com Link: https://lore.kernel.org/r/20241211024414.7840-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 3f6f237b9dd189e1fb85b8a3f7c97a8f27c1e49a) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c @@ -271,8 +271,19 @@ const struct nbio_hdp_flush_reg nbio_v7_ .ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__SDMA1_MASK, };
+#define regRCC_DEV0_EPF6_STRAP4 0xd304 +#define regRCC_DEV0_EPF6_STRAP4_BASE_IDX 5 + static void nbio_v7_0_init_registers(struct amdgpu_device *adev) { + uint32_t data; + + switch (adev->ip_versions[NBIO_HWIP][0]) { + case IP_VERSION(2, 5, 0): + data = RREG32_SOC15(NBIO, 0, regRCC_DEV0_EPF6_STRAP4) & ~BIT(23); + WREG32_SOC15(NBIO, 0, regRCC_DEV0_EPF6_STRAP4, data); + break; + } }
#define MMIO_REG_HOLE_OFFSET (0x80000 - PAGE_SIZE)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 9398332f23fab10c5ec57c168b44e72997d6318e upstream.
drm_mode_vrefresh() is trying to avoid divide by zero by checking whether htotal or vtotal are zero. But we may still end up with a div-by-zero of vtotal*htotal*...
Cc: stable@vger.kernel.org Reported-by: syzbot+622bba18029bcde672e1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=622bba18029bcde672e1 Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241129042629.18280-2-ville.s... Reviewed-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_modes.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/drm_modes.c +++ b/drivers/gpu/drm/drm_modes.c @@ -1287,14 +1287,11 @@ EXPORT_SYMBOL(drm_mode_set_name); */ int drm_mode_vrefresh(const struct drm_display_mode *mode) { - unsigned int num, den; + unsigned int num = 1, den = 1;
if (mode->htotal == 0 || mode->vtotal == 0) return 0;
- num = mode->clock; - den = mode->htotal * mode->vtotal; - if (mode->flags & DRM_MODE_FLAG_INTERLACE) num *= 2; if (mode->flags & DRM_MODE_FLAG_DBLSCAN) @@ -1302,6 +1299,12 @@ int drm_mode_vrefresh(const struct drm_d if (mode->vscan > 1) den *= mode->vscan;
+ if (check_mul_overflow(mode->clock, num, &num)) + return 0; + + if (check_mul_overflow(mode->htotal * mode->vtotal, den, &den)) + return 0; + return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(num, 1000), den); } EXPORT_SYMBOL(drm_mode_vrefresh);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian König christian.koenig@amd.com
commit 8d1a13816e59254bd3b18f5ae0895230922bd120 upstream.
The VM pointer might already be outdated when that function is called. Use the PASID instead to gather the information instead.
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 57f812d171af4ba233d3ed7c94dfa5b8e92dcc04) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c @@ -345,11 +345,10 @@ void amdgpu_coredump(struct amdgpu_devic coredump->skip_vram_check = skip_vram_check; coredump->reset_vram_lost = vram_lost;
- if (job && job->vm) { - struct amdgpu_vm *vm = job->vm; + if (job && job->pasid) { struct amdgpu_task_info *ti;
- ti = amdgpu_vm_get_task_info_vm(vm); + ti = amdgpu_vm_get_task_info_pasid(adev, job->pasid); if (ti) { coredump->reset_task_info = *ti; amdgpu_vm_put_task_info(ti);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michel Dänzer mdaenzer@redhat.com
commit 85230ee36d88e7a09fb062d43203035659dd10a5 upstream.
Third time's the charm, I hope?
Fixes: d3116756a710 ("drm/ttm: rename bo->mem and make it a pointer") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3837 Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Michel Dänzer mdaenzer@redhat.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 695c2c745e5dff201b75da8a1d237ce403600d04) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -1260,10 +1260,9 @@ int amdgpu_vm_bo_update(struct amdgpu_de * next command submission. */ if (amdgpu_vm_is_bo_always_valid(vm, bo)) { - uint32_t mem_type = bo->tbo.resource->mem_type; - - if (!(bo->preferred_domains & - amdgpu_mem_type_to_domain(mem_type))) + if (bo->tbo.resource && + !(bo->preferred_domains & + amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type))) amdgpu_vm_bo_evicted(&bo_va->base); else amdgpu_vm_bo_idle(&bo_va->base);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huan Yang link@vivo.com
[ Upstream commit 164fd9efd46531fddfaa933d394569259896642b ]
This patch aim to simplify the memfd folio pin during the udmabuf create. No functional changes.
This patch create a udmabuf_pin_folios function, in this, do the memfd pin folio and then record each pinned folio, offset.
This patch simplify the pinned folio record, iter by each pinned folio, and then record each offset in it.
Compare to iter by pgcnt, more readable.
Suggested-by: Vivek Kasireddy vivek.kasireddy@intel.com Signed-off-by: Huan Yang link@vivo.com Acked-by: Vivek Kasireddy vivek.kasireddy@intel.com Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240918025238.2957823-5-link@... Stable-dep-of: f49856f525ac ("udmabuf: fix memory leak on last export_udmabuf() error path") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma-buf/udmabuf.c | 137 +++++++++++++++++++++----------------- 1 file changed, 76 insertions(+), 61 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c index a3638ccc15f5..970e08a95dc0 100644 --- a/drivers/dma-buf/udmabuf.c +++ b/drivers/dma-buf/udmabuf.c @@ -262,9 +262,6 @@ static int check_memfd_seals(struct file *memfd) { int seals;
- if (!memfd) - return -EBADFD; - if (!shmem_file(memfd) && !is_file_hugepages(memfd)) return -EBADFD;
@@ -299,17 +296,68 @@ static int export_udmabuf(struct udmabuf *ubuf, return dma_buf_fd(buf, flags); }
+static long udmabuf_pin_folios(struct udmabuf *ubuf, struct file *memfd, + loff_t start, loff_t size) +{ + pgoff_t pgoff, pgcnt, upgcnt = ubuf->pagecount; + struct folio **folios = NULL; + u32 cur_folio, cur_pgcnt; + long nr_folios; + long ret = 0; + loff_t end; + + pgcnt = size >> PAGE_SHIFT; + folios = kvmalloc_array(pgcnt, sizeof(*folios), GFP_KERNEL); + if (!folios) + return -ENOMEM; + + end = start + (pgcnt << PAGE_SHIFT) - 1; + nr_folios = memfd_pin_folios(memfd, start, end, folios, pgcnt, &pgoff); + if (nr_folios <= 0) { + ret = nr_folios ? nr_folios : -EINVAL; + goto end; + } + + cur_pgcnt = 0; + for (cur_folio = 0; cur_folio < nr_folios; ++cur_folio) { + pgoff_t subpgoff = pgoff; + size_t fsize = folio_size(folios[cur_folio]); + + ret = add_to_unpin_list(&ubuf->unpin_list, folios[cur_folio]); + if (ret < 0) + goto end; + + for (; subpgoff < fsize; subpgoff += PAGE_SIZE) { + ubuf->folios[upgcnt] = folios[cur_folio]; + ubuf->offsets[upgcnt] = subpgoff; + ++upgcnt; + + if (++cur_pgcnt >= pgcnt) + goto end; + } + + /** + * In a given range, only the first subpage of the first folio + * has an offset, that is returned by memfd_pin_folios(). + * The first subpages of other folios (in the range) have an + * offset of 0. + */ + pgoff = 0; + } +end: + ubuf->pagecount = upgcnt; + kvfree(folios); + return ret; +} + static long udmabuf_create(struct miscdevice *device, struct udmabuf_create_list *head, struct udmabuf_create_item *list) { - pgoff_t pgoff, pgcnt, pglimit, pgbuf = 0; - long nr_folios, ret = -EINVAL; - struct file *memfd = NULL; - struct folio **folios; + pgoff_t pgcnt = 0, pglimit; struct udmabuf *ubuf; - u32 i, j, k, flags; - loff_t end; + long ret = -EINVAL; + u32 i, flags;
ubuf = kzalloc(sizeof(*ubuf), GFP_KERNEL); if (!ubuf) @@ -318,81 +366,50 @@ static long udmabuf_create(struct miscdevice *device, INIT_LIST_HEAD(&ubuf->unpin_list); pglimit = (size_limit_mb * 1024 * 1024) >> PAGE_SHIFT; for (i = 0; i < head->count; i++) { - if (!IS_ALIGNED(list[i].offset, PAGE_SIZE)) + if (!PAGE_ALIGNED(list[i].offset)) goto err; - if (!IS_ALIGNED(list[i].size, PAGE_SIZE)) + if (!PAGE_ALIGNED(list[i].size)) goto err; - ubuf->pagecount += list[i].size >> PAGE_SHIFT; - if (ubuf->pagecount > pglimit) + + pgcnt += list[i].size >> PAGE_SHIFT; + if (pgcnt > pglimit) goto err; }
- if (!ubuf->pagecount) + if (!pgcnt) goto err;
- ubuf->folios = kvmalloc_array(ubuf->pagecount, sizeof(*ubuf->folios), - GFP_KERNEL); + ubuf->folios = kvmalloc_array(pgcnt, sizeof(*ubuf->folios), GFP_KERNEL); if (!ubuf->folios) { ret = -ENOMEM; goto err; } - ubuf->offsets = kvcalloc(ubuf->pagecount, sizeof(*ubuf->offsets), - GFP_KERNEL); + + ubuf->offsets = kvcalloc(pgcnt, sizeof(*ubuf->offsets), GFP_KERNEL); if (!ubuf->offsets) { ret = -ENOMEM; goto err; }
- pgbuf = 0; for (i = 0; i < head->count; i++) { - memfd = fget(list[i].memfd); - ret = check_memfd_seals(memfd); - if (ret < 0) - goto err; + struct file *memfd = fget(list[i].memfd);
- pgcnt = list[i].size >> PAGE_SHIFT; - folios = kvmalloc_array(pgcnt, sizeof(*folios), GFP_KERNEL); - if (!folios) { - ret = -ENOMEM; + if (!memfd) { + ret = -EBADFD; goto err; }
- end = list[i].offset + (pgcnt << PAGE_SHIFT) - 1; - ret = memfd_pin_folios(memfd, list[i].offset, end, - folios, pgcnt, &pgoff); - if (ret <= 0) { - kvfree(folios); - if (!ret) - ret = -EINVAL; + ret = check_memfd_seals(memfd); + if (ret < 0) { + fput(memfd); goto err; }
- nr_folios = ret; - pgoff >>= PAGE_SHIFT; - for (j = 0, k = 0; j < pgcnt; j++) { - ubuf->folios[pgbuf] = folios[k]; - ubuf->offsets[pgbuf] = pgoff << PAGE_SHIFT; - - if (j == 0 || ubuf->folios[pgbuf-1] != folios[k]) { - ret = add_to_unpin_list(&ubuf->unpin_list, - folios[k]); - if (ret < 0) { - kfree(folios); - goto err; - } - } - - pgbuf++; - if (++pgoff == folio_nr_pages(folios[k])) { - pgoff = 0; - if (++k == nr_folios) - break; - } - } - - kvfree(folios); + ret = udmabuf_pin_folios(ubuf, memfd, list[i].offset, + list[i].size); fput(memfd); - memfd = NULL; + if (ret) + goto err; }
flags = head->flags & UDMABUF_FLAGS_CLOEXEC ? O_CLOEXEC : 0; @@ -403,8 +420,6 @@ static long udmabuf_create(struct miscdevice *device, return ret;
err: - if (memfd) - fput(memfd); unpin_all_folios(&ubuf->unpin_list); kvfree(ubuf->offsets); kvfree(ubuf->folios);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
[ Upstream commit f49856f525acd5bef52ae28b7da2e001bbe7439e ]
In export_udmabuf(), if dma_buf_fd() fails because the FD table is full, a dma_buf owning the udmabuf has already been created; but the error handling in udmabuf_create() will tear down the udmabuf without doing anything about the containing dma_buf.
This leaves a dma_buf in memory that contains a dangling pointer; though that doesn't seem to lead to anything bad except a memory leak.
Fix it by moving the dma_buf_fd() call out of export_udmabuf() so that we can give it different error handling.
Note that the shape of this code changed a lot in commit 5e72b2b41a21 ("udmabuf: convert udmabuf driver to use folios"); but the memory leak seems to have existed since the introduction of udmabuf.
Fixes: fbb0de795078 ("Add udmabuf misc device") Acked-by: Vivek Kasireddy vivek.kasireddy@intel.com Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241204-udmabuf-fixes-v2-3-23... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma-buf/udmabuf.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c index 970e08a95dc0..614df433c451 100644 --- a/drivers/dma-buf/udmabuf.c +++ b/drivers/dma-buf/udmabuf.c @@ -276,12 +276,10 @@ static int check_memfd_seals(struct file *memfd) return 0; }
-static int export_udmabuf(struct udmabuf *ubuf, - struct miscdevice *device, - u32 flags) +static struct dma_buf *export_udmabuf(struct udmabuf *ubuf, + struct miscdevice *device) { DEFINE_DMA_BUF_EXPORT_INFO(exp_info); - struct dma_buf *buf;
ubuf->device = device; exp_info.ops = &udmabuf_ops; @@ -289,11 +287,7 @@ static int export_udmabuf(struct udmabuf *ubuf, exp_info.priv = ubuf; exp_info.flags = O_RDWR;
- buf = dma_buf_export(&exp_info); - if (IS_ERR(buf)) - return PTR_ERR(buf); - - return dma_buf_fd(buf, flags); + return dma_buf_export(&exp_info); }
static long udmabuf_pin_folios(struct udmabuf *ubuf, struct file *memfd, @@ -356,6 +350,7 @@ static long udmabuf_create(struct miscdevice *device, { pgoff_t pgcnt = 0, pglimit; struct udmabuf *ubuf; + struct dma_buf *dmabuf; long ret = -EINVAL; u32 i, flags;
@@ -413,9 +408,20 @@ static long udmabuf_create(struct miscdevice *device, }
flags = head->flags & UDMABUF_FLAGS_CLOEXEC ? O_CLOEXEC : 0; - ret = export_udmabuf(ubuf, device, flags); - if (ret < 0) + dmabuf = export_udmabuf(ubuf, device); + if (IS_ERR(dmabuf)) { + ret = PTR_ERR(dmabuf); goto err; + } + /* + * Ownership of ubuf is held by the dmabuf from here. + * If the following dma_buf_fd() fails, dma_buf_put() cleans up both the + * dmabuf and the ubuf (through udmabuf_ops.release). + */ + + ret = dma_buf_fd(dmabuf, flags); + if (ret < 0) + dma_buf_put(dmabuf);
return ret;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: T.J. Mercier tjmercier@google.com
[ Upstream commit 0cff90dec63da908fb16d9ea2872ebbcd2d18e6a ]
The arguments for __dma_buf_debugfs_list_del do not match for both the CONFIG_DEBUG_FS case and the !CONFIG_DEBUG_FS case. The !CONFIG_DEBUG_FS case should take a struct dma_buf *, but it's currently struct file *. This can lead to the build error:
error: passing argument 1 of ‘__dma_buf_debugfs_list_del’ from incompatible pointer type [-Werror=incompatible-pointer-types]
dma-buf.c:63:53: note: expected ‘struct file *’ but argument is of type ‘struct dma_buf *’ 63 | static void __dma_buf_debugfs_list_del(struct file *file)
Fixes: bfc7bc539392 ("dma-buf: Do not build debugfs related code when !CONFIG_DEBUG_FS") Signed-off-by: T.J. Mercier tjmercier@google.com Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@igalia.com Signed-off-by: Sumit Semwal sumit.semwal@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20241117170326.1971113-1-tjmer... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma-buf/dma-buf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 8892bc701a66..afb8c1c50107 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -60,7 +60,7 @@ static void __dma_buf_debugfs_list_add(struct dma_buf *dmabuf) { }
-static void __dma_buf_debugfs_list_del(struct file *file) +static void __dma_buf_debugfs_list_del(struct dma_buf *dmabuf) { } #endif
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Zekun zhangzekun11@huawei.com
[ Upstream commit e1e1af9148dc4c866eda3fb59cd6ec3c7ea34b1d ]
drm_mode_duplicate() could return NULL due to lack of memory, which will then call NULL pointer dereference. Add a check to prevent it.
Fixes: 0ef94554dc40 ("drm/panel: himax-hx83102: Break out as separate driver") Signed-off-by: Zhang Zekun zhangzekun11@huawei.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20241025073408.27481-3-zhangzekun11@huawei.com Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20241025073408.27481-3-zhangze... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-himax-hx83102.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/panel/panel-himax-hx83102.c b/drivers/gpu/drm/panel/panel-himax-hx83102.c index 8b48bba18131..3644a7544b93 100644 --- a/drivers/gpu/drm/panel/panel-himax-hx83102.c +++ b/drivers/gpu/drm/panel/panel-himax-hx83102.c @@ -565,6 +565,8 @@ static int hx83102_get_modes(struct drm_panel *panel, struct drm_display_mode *mode;
mode = drm_mode_duplicate(connector->dev, m); + if (!mode) + return -ENOMEM;
mode->type = DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED; drm_mode_set_name(mode);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit f8fd0968eff52cf092c0d517d17507ea2f6e5ea5 ]
mipi_dsi_device_register_full() never returns NULL pointer, it will return ERR_PTR() when it fails, so replace the check with IS_ERR().
Fixes: 623a3531e9cf ("drm/panel: Add driver for Novatek NT35950 DSI DriverIC panels") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20241029123957.1588-1-yangyingliang@huaweicloud.co... Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20241029123957.1588-1-yangying... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-novatek-nt35950.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/panel/panel-novatek-nt35950.c b/drivers/gpu/drm/panel/panel-novatek-nt35950.c index b036208f9356..08b22b592ab0 100644 --- a/drivers/gpu/drm/panel/panel-novatek-nt35950.c +++ b/drivers/gpu/drm/panel/panel-novatek-nt35950.c @@ -481,9 +481,9 @@ static int nt35950_probe(struct mipi_dsi_device *dsi) return dev_err_probe(dev, -EPROBE_DEFER, "Cannot get secondary DSI host\n");
nt->dsi[1] = mipi_dsi_device_register_full(dsi_r_host, info); - if (!nt->dsi[1]) { + if (IS_ERR(nt->dsi[1])) { dev_err(dev, "Cannot get secondary DSI node\n"); - return -ENODEV; + return PTR_ERR(nt->dsi[1]); } num_dsis++; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Vasut marex@denx.de
[ Upstream commit 406dd4c7984a457567ca652455d5efad81983f02 ]
The DSI host must be enabled for the panel to be initialized in prepare(). Set the prepare_prev_first flag to guarantee this. This fixes the panel operation on NXP i.MX8MP SoC / Samsung DSIM DSI host.
Fixes: 849b2e3ff969 ("drm/panel: Add Sitronix ST7701 panel driver") Signed-off-by: Marek Vasut marex@denx.de Reviewed-by: Jessica Zhang quic_jesszhan@quicinc.com Link: https://lore.kernel.org/r/20241124224812.150263-1-marex@denx.de Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20241124224812.150263-1-marex@... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-sitronix-st7701.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/panel/panel-sitronix-st7701.c b/drivers/gpu/drm/panel/panel-sitronix-st7701.c index eef03d04e0cd..1f72ef7ca74c 100644 --- a/drivers/gpu/drm/panel/panel-sitronix-st7701.c +++ b/drivers/gpu/drm/panel/panel-sitronix-st7701.c @@ -1177,6 +1177,7 @@ static int st7701_probe(struct device *dev, int connector_type) return dev_err_probe(dev, ret, "Failed to get orientation\n");
drm_panel_init(&st7701->panel, dev, &st7701_funcs, connector_type); + st7701->panel.prepare_prev_first = true;
/** * Once sleep out has been issued, ST7701 IC required to wait 120ms
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Trimarchi michael@amarulasolutions.com
[ Upstream commit d2bd3fcb825725a59c8880070b1206b1710922bd ]
The shutdown function can be called when the display is already unprepared. For example during reboot this trigger a kernel backlog. Calling the drm_panel_unprepare, allow us to avoid to trigger the kernel warning.
Fixes: 2e87bad7cd33 ("drm/panel: Add Synaptics R63353 panel driver") Tested-by: Dario Binacchi dario.binacchi@amarulasolutions.com Signed-off-by: Michael Trimarchi michael@amarulasolutions.com Signed-off-by: Dario Binacchi dario.binacchi@amarulasolutions.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Reviewed-by: Jessica Zhang quic_jesszhan@quicinc.com Link: https://lore.kernel.org/r/20241205163002.1804784-1-dario.binacchi@amarulasol... Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20241205163002.1804784-1-dario... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-synaptics-r63353.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panel/panel-synaptics-r63353.c b/drivers/gpu/drm/panel/panel-synaptics-r63353.c index 169c629746c7..17349825543f 100644 --- a/drivers/gpu/drm/panel/panel-synaptics-r63353.c +++ b/drivers/gpu/drm/panel/panel-synaptics-r63353.c @@ -325,7 +325,7 @@ static void r63353_panel_shutdown(struct mipi_dsi_device *dsi) { struct r63353_panel *rpanel = mipi_dsi_get_drvdata(dsi);
- r63353_panel_unprepare(&rpanel->base); + drm_panel_unprepare(&rpanel->base); }
static const struct r63353_desc sharp_ls068b3sx02_data = {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com
[ Upstream commit abcc2ddae5f82aa6cfca162e3db643dd33f0a2e8 ]
On GT reset, we store total busyness counts for all engines and re-register the utilization buffer with GuC. At that time we should reset the buffer, so that we don't get spurious busyness counts on subsequent queries.
To repro this issue, run igt@perf_pmu@busy-hang followed by igt@perf_pmu@most-busy-idle-check-all for a couple iterations.
Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: John Harrison John.C.Harrison@Intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-2-umesh.... (cherry picked from commit abd318237fa6556c1e5225529af145ef15d5ff0d) Signed-off-by: Tvrtko Ursulin tursulin@ursulin.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +++++++++++++++++++ 1 file changed, 21 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index ed979847187f..4793759f4d4a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -1243,6 +1243,21 @@ static void __get_engine_usage_record(struct intel_engine_cs *engine, } while (++i < 6); }
+static void __set_engine_usage_record(struct intel_engine_cs *engine, + u32 last_in, u32 id, u32 total) +{ + struct iosys_map rec_map = intel_guc_engine_usage_record_map(engine); + +#define record_write(map_, field_, val_) \ + iosys_map_wr_field(map_, 0, struct guc_engine_usage_record, field_, val_) + + record_write(&rec_map, last_switch_in_stamp, last_in); + record_write(&rec_map, current_context_index, id); + record_write(&rec_map, total_runtime, total); + +#undef record_write +} + static void guc_update_engine_gt_clks(struct intel_engine_cs *engine) { struct intel_engine_guc_stats *stats = &engine->stats.guc; @@ -1543,6 +1558,9 @@ static void guc_timestamp_ping(struct work_struct *wrk)
static int guc_action_enable_usage_stats(struct intel_guc *guc) { + struct intel_gt *gt = guc_to_gt(guc); + struct intel_engine_cs *engine; + enum intel_engine_id id; u32 offset = intel_guc_engine_usage_offset(guc); u32 action[] = { INTEL_GUC_ACTION_SET_ENG_UTIL_BUFF, @@ -1550,6 +1568,9 @@ static int guc_action_enable_usage_stats(struct intel_guc *guc) 0, };
+ for_each_engine(engine, gt, id) + __set_engine_usage_record(engine, 0, 0xffffffff, 0); + return intel_guc_send(guc, action, ARRAY_SIZE(action)); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com
[ Upstream commit 59a0b46788d58fdcee8d2f6b4e619d264a1799bf ]
Active busyness of an engine is calculated using gt timestamp and the context switch in time. While capturing the gt timestamp, it's possible that the context switches out. This race could result in an active busyness value that is greater than the actual context runtime value by a small amount. This leads to a negative delta and throws off busyness calculations for the user.
If a subsequent count is smaller than the previous one, just return the previous one, since we expect the busyness to catch up.
Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: John Harrison John.C.Harrison@Intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-3-umesh.... (cherry picked from commit cf907f6d294217985e9dafd9985dce874e04ca37) Signed-off-by: Tvrtko Ursulin tursulin@ursulin.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_engine_types.h | 5 +++++ drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 ++++- 2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index ba55c059063d..fe1f85e5dda3 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -343,6 +343,11 @@ struct intel_engine_guc_stats { * @start_gt_clk: GT clock time of last idle to active transition. */ u64 start_gt_clk; + + /** + * @total: The last value of total returned + */ + u64 total; };
union intel_engine_tlb_inv_reg { diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 4793759f4d4a..fbff9b9a067c 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -1378,9 +1378,12 @@ static ktime_t guc_engine_busyness(struct intel_engine_cs *engine, ktime_t *now) total += intel_gt_clock_interval_to_ns(gt, clk); }
+ if (total > stats->total) + stats->total = total; + spin_unlock_irqrestore(&guc->timestamp.lock, flags);
- return ns_to_ktime(total); + return ns_to_ktime(stats->total); }
static void guc_enable_busyness_worker(struct intel_guc *guc)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com
[ Upstream commit 1622ed27d26ab4c234476be746aa55bcd39159dd ]
On gt reset, if a context is running, then accumulate it's active time into the busyness counter since there will be no chance for the context to switch out and update it's run time.
v2: Move comment right above the if (John)
Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: John Harrison John.C.Harrison@Intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-4-umesh.... (cherry picked from commit 7ed047da59cfa1acb558b95169d347acc8d85da1) Signed-off-by: Tvrtko Ursulin tursulin@ursulin.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index fbff9b9a067c..ee12ee0ed418 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -1449,8 +1449,21 @@ static void __reset_guc_busyness_stats(struct intel_guc *guc)
guc_update_pm_timestamp(guc, &unused); for_each_engine(engine, gt, id) { + struct intel_engine_guc_stats *stats = &engine->stats.guc; + guc_update_engine_gt_clks(engine); - engine->stats.guc.prev_total = 0; + + /* + * If resetting a running context, accumulate the active + * time as well since there will be no context switch. + */ + if (stats->running) { + u64 clk = guc->timestamp.gt_stamp - stats->start_gt_clk; + + stats->total_gt_clks += clk; + } + stats->prev_total = 0; + stats->running = 0; }
spin_unlock_irqrestore(&guc->timestamp.lock, flags);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pierre-Eric Pelloux-Prayer pierre-eric.pelloux-prayer@amd.com
[ Upstream commit a93b1020eb9386d7da11608477121b10079c076a ]
Since 2320c9e6a768 ("drm/sched: memset() 'job' in drm_sched_job_init()") accessing job->base.sched can produce unexpected results as the initialisation of (*job)->base.sched done in amdgpu_job_alloc is overwritten by the memset.
This commit fixes an issue when a CS would fail validation and would be rejected after job->num_ibs is incremented. In this case, amdgpu_ib_free(ring->adev, ...) will be called, which would crash the machine because the ring value is bogus.
To fix this, pass a NULL pointer to amdgpu_ib_free(): we can do this because the device is actually not used in this function.
The next commit will remove the ring argument completely.
Fixes: 2320c9e6a768 ("drm/sched: memset() 'job' in drm_sched_job_init()") Signed-off-by: Pierre-Eric Pelloux-Prayer pierre-eric.pelloux-prayer@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 2ae520cb12831d264ceb97c61f72c59d33c0dbd7) Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 16f2605ac50b..1ce20a19be8b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -253,7 +253,6 @@ void amdgpu_job_set_resources(struct amdgpu_job *job, struct amdgpu_bo *gds,
void amdgpu_job_free_resources(struct amdgpu_job *job) { - struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched); struct dma_fence *f; unsigned i;
@@ -266,7 +265,7 @@ void amdgpu_job_free_resources(struct amdgpu_job *job) f = NULL;
for (i = 0; i < job->num_ibs; ++i) - amdgpu_ib_free(ring->adev, &job->ibs[i], f); + amdgpu_ib_free(NULL, &job->ibs[i], f); }
static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Murad Masimov m.masimov@maxima.ru
[ Upstream commit 74d7e038fd072635d21e4734e3223378e09168d3 ]
The values returned by the driver after processing the contents of the Shunt Voltage Register and the Shunt Limit Registers do not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect. Moreover, the PGA shift calculated with the tmp51x_get_pga_shift function is relevant only to the Shunt Voltage Register, but is also applied to the Shunt Limit Registers.
According to the TMP512 and TMP513 datasheets, the Shunt Voltage Register (04h) is 13 to 16 bit two's complement integer value, depending on the PGA setting. The Shunt Positive (0Ch) and Negative (0Dh) Limit Registers are 16-bit two's complement integer values. Below are some examples:
* Shunt Voltage Register If PGA = 8, and regval = 1000 0011 0000 0000, then the decimal value must be -32000, but the value calculated by the driver will be 33536.
* Shunt Limit Register If regval = 1000 0011 0000 0000, then the decimal value must be -32000, but the value calculated by the driver will be 768, if PGA = 1.
Fix sign bit index, and also correct misleading comment describing the tmp51x_get_pga_shift function.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov m.masimov@maxima.ru Link: https://lore.kernel.org/r/20241216173648.526-2-m.masimov@maxima.ru [groeck: Fixed description and multi-line alignments] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp513.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c index 926d28cd3fab..d87fcea3ef24 100644 --- a/drivers/hwmon/tmp513.c +++ b/drivers/hwmon/tmp513.c @@ -182,7 +182,7 @@ struct tmp51x_data { struct regmap *regmap; };
-// Set the shift based on the gain 8=4, 4=3, 2=2, 1=1 +// Set the shift based on the gain: 8 -> 1, 4 -> 2, 2 -> 3, 1 -> 4 static inline u8 tmp51x_get_pga_shift(struct tmp51x_data *data) { return 5 - ffs(data->pga_gain); @@ -204,7 +204,9 @@ static int tmp51x_get_value(struct tmp51x_data *data, u8 reg, u8 pos, * 2's complement number shifted by one to four depending * on the pga gain setting. 1lsb = 10uV */ - *val = sign_extend32(regval, 17 - tmp51x_get_pga_shift(data)); + *val = sign_extend32(regval, + reg == TMP51X_SHUNT_CURRENT_RESULT ? + 16 - tmp51x_get_pga_shift(data) : 15); *val = DIV_ROUND_CLOSEST(*val * 10 * MILLI, data->shunt_uohms); break; case TMP51X_BUS_VOLTAGE_RESULT:
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Murad Masimov m.masimov@maxima.ru
[ Upstream commit da1d0e6ba211baf6747db74c07700caddfd8a179 ]
The value returned by the driver after processing the contents of the Current Register does not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect. Moreover, negative values will be reported as large positive due to missing sign extension from u32 to long.
According to the TMP512 and TMP513 datasheets, the Current Register (07h) is a 16-bit two's complement integer value. E.g., if regval = 1000 0011 0000 0000, then the value must be (-32000 * lsb), but the driver will return (33536 * lsb).
Fix off-by-one bug, and also cast data->curr_lsb_ua (which is of type u32) to long to prevent incorrect cast for negative values.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov m.masimov@maxima.ru Link: https://lore.kernel.org/r/20241216173648.526-3-m.masimov@maxima.ru [groeck: Fixed description line length] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp513.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c index d87fcea3ef24..2846b1cc515d 100644 --- a/drivers/hwmon/tmp513.c +++ b/drivers/hwmon/tmp513.c @@ -222,7 +222,7 @@ static int tmp51x_get_value(struct tmp51x_data *data, u8 reg, u8 pos, break; case TMP51X_BUS_CURRENT_RESULT: // Current = (ShuntVoltage * CalibrationRegister) / 4096 - *val = sign_extend32(regval, 16) * data->curr_lsb_ua; + *val = sign_extend32(regval, 15) * (long)data->curr_lsb_ua; *val = DIV_ROUND_CLOSEST(*val, MILLI); break; case TMP51X_LOCAL_TEMP_RESULT:
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Murad Masimov m.masimov@maxima.ru
[ Upstream commit dd471e25770e7e632f736b90db1e2080b2171668 ]
The values returned by the driver after processing the contents of the Temperature Result and the Temperature Limit Registers do not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect.
According to the TMP512 and TMP513 datasheets, the Temperature Result (08h to 0Bh) and Limit (11h to 14h) Registers are 13-bit two's complement integer values, shifted left by 3 bits. The value is scaled by 0.0625 degrees Celsius per bit. E.g., if regval = 1 1110 0111 0000 000, the output should be -25 degrees, but the driver will return +487 degrees.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov m.masimov@maxima.ru Link: https://lore.kernel.org/r/20241216173648.526-4-m.masimov@maxima.ru [groeck: fixed description line length] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp513.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c index 2846b1cc515d..1c2cb12071b8 100644 --- a/drivers/hwmon/tmp513.c +++ b/drivers/hwmon/tmp513.c @@ -234,7 +234,7 @@ static int tmp51x_get_value(struct tmp51x_data *data, u8 reg, u8 pos, case TMP51X_REMOTE_TEMP_LIMIT_2: case TMP513_REMOTE_TEMP_LIMIT_3: // 1lsb = 0.0625 degrees centigrade - *val = sign_extend32(regval, 16) >> TMP51X_TEMP_SHIFT; + *val = sign_extend32(regval, 15) >> TMP51X_TEMP_SHIFT; *val = DIV_ROUND_CLOSEST(*val * 625, 10); break; case TMP51X_N_FACTOR_AND_HYST_1:
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 85672ca9ceeaa1dcf2777a7048af5f4aee3fd02b ]
If the 'hctx' isn't removed from cpuhp callback list, we can't reuse it, otherwise use-after-free may be triggered.
Reported-by: kernel test robot oliver.sang@intel.com Closes: https://lore.kernel.org/oe-lkp/202412172217.b906db7c-lkp@intel.com Tested-by: kernel test robot oliver.sang@intel.com Fixes: 22465bbac53c ("blk-mq: move cpuhp callback registering out of q->sysfs_lock") Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20241218101617.3275704-3-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c index 1030875a3e95..d5995021815d 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4421,6 +4421,15 @@ struct gendisk *blk_mq_alloc_disk_for_queue(struct request_queue *q, } EXPORT_SYMBOL(blk_mq_alloc_disk_for_queue);
+/* + * Only hctx removed from cpuhp list can be reused + */ +static bool blk_mq_hctx_is_reusable(struct blk_mq_hw_ctx *hctx) +{ + return hlist_unhashed(&hctx->cpuhp_online) && + hlist_unhashed(&hctx->cpuhp_dead); +} + static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx( struct blk_mq_tag_set *set, struct request_queue *q, int hctx_idx, int node) @@ -4430,7 +4439,7 @@ static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx( /* reuse dead hctx first */ spin_lock(&q->unused_hctx_lock); list_for_each_entry(tmp, &q->unused_hctx_list, hctx_list) { - if (tmp->numa_node == node) { + if (tmp->numa_node == node && blk_mq_hctx_is_reusable(tmp)) { hctx = tmp; break; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit 8cd63406d08110c8098e1efda8aef7ddab4db348 upstream.
The TP_printk() of a TRACE_EVENT() is a generic printf format that any developer can create for their event. It may include pointers to strings and such. A boot mapped buffer may contain data from a previous kernel where the strings addresses are different.
One solution is to copy the event content and update the pointers by the recorded delta, but a simpler solution (for now) is to just use the print_fields() function to print these events. The print_fields() function just iterates the fields and prints them according to what type they are, and ignores the TP_printk() format from the event itself.
To understand the difference, when printing via TP_printk() the output looks like this:
4582.696626: kmem_cache_alloc: call_site=getname_flags+0x47/0x1f0 ptr=00000000e70e10e0 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL node=-1 accounted=false 4582.696629: kmem_cache_alloc: call_site=alloc_empty_file+0x6b/0x110 ptr=0000000095808002 bytes_req=360 bytes_alloc=384 gfp_flags=GFP_KERNEL node=-1 accounted=false 4582.696630: kmem_cache_alloc: call_site=security_file_alloc+0x24/0x100 ptr=00000000576339c3 bytes_req=16 bytes_alloc=16 gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1 accounted=false 4582.696653: kmem_cache_free: call_site=do_sys_openat2+0xa7/0xd0 ptr=00000000e70e10e0 name=names_cache
But when printing via print_fields() (echo 1 > /sys/kernel/tracing/options/fields) the same event output looks like this:
4582.696626: kmem_cache_alloc: call_site=0xffffffff92d10d97 (-1831793257) ptr=0xffff9e0e8571e000 (-107689771147264) bytes_req=0x1000 (4096) bytes_alloc=0x1000 (4096) gfp_flags=0xcc0 (3264) node=0xffffffff (-1) accounted=(0) 4582.696629: kmem_cache_alloc: call_site=0xffffffff92d0250b (-1831852789) ptr=0xffff9e0e8577f800 (-107689770747904) bytes_req=0x168 (360) bytes_alloc=0x180 (384) gfp_flags=0xcc0 (3264) node=0xffffffff (-1) accounted=(0) 4582.696630: kmem_cache_alloc: call_site=0xffffffff92efca74 (-1829778828) ptr=0xffff9e0e8d35d3b0 (-107689640864848) bytes_req=0x10 (16) bytes_alloc=0x10 (16) gfp_flags=0xdc0 (3520) node=0xffffffff (-1) accounted=(0) 4582.696653: kmem_cache_free: call_site=0xffffffff92cfbea7 (-1831879001) ptr=0xffff9e0e8571e000 (-107689771147264) name=names_cache
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/20241218141507.28389a1d@gandalf.local.home Fixes: 07714b4bb3f98 ("tracing: Handle old buffer mappings for event strings and functions") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -4377,6 +4377,15 @@ static enum print_line_t print_trace_fmt if (event) { if (tr->trace_flags & TRACE_ITER_FIELDS) return print_event_fields(iter, event); + /* + * For TRACE_EVENT() events, the print_fmt is not + * safe to use if the array has delta offsets + * Force printing via the fields. + */ + if ((tr->text_delta || tr->data_delta) && + event->type > __TRACE_LAST_TYPE) + return print_event_fields(iter, event); + return event->funcs->trace(iter, sym_flags, event); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 8c1ecc7197a88c6ae62de56e1c0887f220712a32 upstream.
Use the helper function rather than reading it directly.
Reviewed-by: Yang Wang kevinyang.wang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 2c8eeaaa0fe5841ccf07a0eb51b1426f34ef39f7) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c @@ -275,7 +275,7 @@ static void nbio_v7_11_init_registers(st if (def != data) WREG32_SOC15(NBIO, 0, regBIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3, data);
- switch (adev->ip_versions[NBIO_HWIP][0]) { + switch (amdgpu_ip_version(adev, NBIO_HWIP, 0)) { case IP_VERSION(7, 11, 0): case IP_VERSION(7, 11, 1): case IP_VERSION(7, 11, 2):
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 458600da793da12e0f3724ecbea34a80703f4d5b upstream.
Use the helper function rather than reading it directly.
Reviewed-by: Yang Wang kevinyang.wang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 22b9555bc90df22b585bdd1f161b61584b13af51) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c @@ -247,7 +247,7 @@ static void nbio_v7_7_init_registers(str if (def != data) WREG32_SOC15(NBIO, 0, regBIF0_PCIE_MST_CTRL_3, data);
- switch (adev->ip_versions[NBIO_HWIP][0]) { + switch (amdgpu_ip_version(adev, NBIO_HWIP, 0)) { case IP_VERSION(7, 7, 0): data = RREG32_SOC15(NBIO, 0, regRCC_DEV0_EPF5_STRAP4) & ~BIT(23); WREG32_SOC15(NBIO, 0, regRCC_DEV0_EPF5_STRAP4, data);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 9e752ee26c1031312a01d2afc281f5f6fdfca176 upstream.
Use the helper function rather than reading it directly.
Reviewed-by: Yang Wang kevinyang.wang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 8f2cd1067afe68372a1723e05e19b68ed187676a) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c @@ -2108,7 +2108,7 @@ static int smu_v14_0_2_enable_gfx_featur { struct amdgpu_device *adev = smu->adev;
- if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(14, 0, 2)) + if (amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(14, 0, 2)) return smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_EnableAllSmuFeatures, FEATURE_PWR_GFX, NULL); else
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kairui Song kasong@tencent.com
commit be48c412f6ebf38849213c19547bc6d5b692b5e5 upstream.
Patch series "zram: fix backing device setup issue", v2.
This series fixes two bugs of backing device setting:
- ZRAM should reject using a zero sized (or the uninitialized ZRAM device itself) as the backing device. - Fix backing device leaking when removing a uninitialized ZRAM device.
This patch (of 2):
Setting a zero sized block device as backing device is pointless, and one can easily create a recursive loop by setting the uninitialized ZRAM device itself as its own backing device by (zram0 is uninitialized):
echo /dev/zram0 > /sys/block/zram0/backing_dev
It's definitely a wrong config, and the module will pin itself, kernel should refuse doing so in the first place.
By refusing to use zero sized device we avoided misuse cases including this one above.
Link: https://lkml.kernel.org/r/20241209165717.94215-1-ryncsn@gmail.com Link: https://lkml.kernel.org/r/20241209165717.94215-2-ryncsn@gmail.com Fixes: 013bf95a83ec ("zram: add interface to specif backing device") Signed-off-by: Kairui Song kasong@tencent.com Reported-by: Desheng Wu deshengwu@tencent.com Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/zram/zram_drv.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -524,6 +524,12 @@ static ssize_t backing_dev_store(struct }
nr_pages = i_size_read(inode) >> PAGE_SHIFT; + /* Refuse to use zero sized device (also prevents self reference) */ + if (!nr_pages) { + err = -EINVAL; + goto out; + } + bitmap_sz = BITS_TO_LONGS(nr_pages) * sizeof(long); bitmap = kvzalloc(bitmap_sz, GFP_KERNEL); if (!bitmap) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kairui Song kasong@tencent.com
commit 74363ec674cb172d8856de25776c8f3103f05e2f upstream.
Setting backing device is done before ZRAM initialization. If we set the backing device, then remove the ZRAM module without initializing the device, the backing device reference will be leaked and the device will be hold forever.
Fix this by always reset the ZRAM fully on rmmod or reset store.
Link: https://lkml.kernel.org/r/20241209165717.94215-3-ryncsn@gmail.com Fixes: 013bf95a83ec ("zram: add interface to specif backing device") Signed-off-by: Kairui Song kasong@tencent.com Reported-by: Desheng Wu deshengwu@tencent.com Suggested-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/zram/zram_drv.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
--- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -1325,12 +1325,16 @@ static void zram_meta_free(struct zram * size_t num_pages = disksize >> PAGE_SHIFT; size_t index;
+ if (!zram->table) + return; + /* Free all pages that are still in this zram device */ for (index = 0; index < num_pages; index++) zram_free_page(zram, index);
zs_destroy_pool(zram->mem_pool); vfree(zram->table); + zram->table = NULL; }
static bool zram_meta_alloc(struct zram *zram, u64 disksize) @@ -2171,11 +2175,6 @@ static void zram_reset_device(struct zra
zram->limit_pages = 0;
- if (!init_done(zram)) { - up_write(&zram->init_lock); - return; - } - set_capacity_and_notify(zram->disk, 0); part_stat_set_all(zram->disk->part0, 0);
On (24/12/23 16:58), Greg Kroah-Hartman wrote:
6.12-stable review patch. If anyone has any objections, please let me know.
From: Kairui Song kasong@tencent.com
commit 74363ec674cb172d8856de25776c8f3103f05e2f upstream.
Setting backing device is done before ZRAM initialization. If we set the backing device, then remove the ZRAM module without initializing the device, the backing device reference will be leaked and the device will be hold forever.
Fix this by always reset the ZRAM fully on rmmod or reset store.
Link: https://lkml.kernel.org/r/20241209165717.94215-3-ryncsn@gmail.com Fixes: 013bf95a83ec ("zram: add interface to specif backing device") Signed-off-by: Kairui Song kasong@tencent.com Reported-by: Desheng Wu deshengwu@tencent.com Suggested-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Can we please drop this patch?
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox (Oracle) willy@infradead.org
commit a2e740e216f5bf49ccb83b6d490c72a340558a43 upstream.
If the caller of vmap() specifies VM_MAP_PUT_PAGES (currently only the i915 driver), we will decrement nr_vmalloc_pages and MEMCG_VMALLOC in vfree(). These counters are incremented by vmalloc() but not by vmap() so this will cause an underflow. Check the VM_MAP_PUT_PAGES flag before decrementing either counter.
Link: https://lkml.kernel.org/r/20241211202538.168311-1-willy@infradead.org Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap") Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Acked-by: Johannes Weiner hannes@cmpxchg.org Reviewed-by: Shakeel Butt shakeel.butt@linux.dev Reviewed-by: Balbir Singh balbirs@nvidia.com Acked-by: Michal Hocko mhocko@suse.com Cc: Christoph Hellwig hch@lst.de Cc: Muchun Song muchun.song@linux.dev Cc: Roman Gushchin roman.gushchin@linux.dev Cc: "Uladzislau Rezki (Sony)" urezki@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/vmalloc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3369,7 +3369,8 @@ void vfree(const void *addr) struct page *page = vm->pages[i];
BUG_ON(!page); - mod_memcg_page_state(page, MEMCG_VMALLOC, -1); + if (!(vm->flags & VM_MAP_PUT_PAGES)) + mod_memcg_page_state(page, MEMCG_VMALLOC, -1); /* * High-order allocs for huge vmallocs are split, so * can be freed as an array of order-0 allocations @@ -3377,7 +3378,8 @@ void vfree(const void *addr) __free_page(page); cond_resched(); } - atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages); + if (!(vm->flags & VM_MAP_PUT_PAGES)) + atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages); kvfree(vm->pages); kfree(vm); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Hildenbrand david@redhat.com
commit faeec8e23c10bd30e8aa759a2eb3018dae00f924 upstream.
In split_large_buddy(), we might call pfn_to_page() on a PFN that might not exist. In corner cases, such as when freeing the highest pageblock in the last memory section, this could result with CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_EXTREME in __pfn_to_section() returning NULL and and __section_mem_map_addr() dereferencing that NULL pointer.
Let's fix it, and avoid doing a pfn_to_page() call for the first iteration, where we already have the page.
So far this was found by code inspection, but let's just CC stable as the fix is easy.
Link: https://lkml.kernel.org/r/20241210093437.174413-1-david@redhat.com Fixes: fd919a85cd55 ("mm: page_isolation: prepare for hygienic freelists") Signed-off-by: David Hildenbrand david@redhat.com Reported-by: Vlastimil Babka vbabka@suse.cz Closes: https://lkml.kernel.org/r/e1a898ba-a717-4d20-9144-29df1a6c8813@suse.cz Reviewed-by: Vlastimil Babka vbabka@suse.cz Reviewed-by: Zi Yan ziy@nvidia.com Acked-by: Johannes Weiner hannes@cmpxchg.org Cc: Yu Zhao yuzhao@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_alloc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 1cb4b8c8886d..cae7b93864c2 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1238,13 +1238,15 @@ static void split_large_buddy(struct zone *zone, struct page *page, if (order > pageblock_order) order = pageblock_order;
- while (pfn != end) { + do { int mt = get_pfnblock_migratetype(page, pfn);
__free_one_page(page, pfn, zone, order, mt, fpi); pfn += 1 << order; + if (pfn == end) + break; page = pfn_to_page(pfn); - } + } while (1); }
static void free_one_page(struct zone *zone, struct page *page,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Edward Adam Davis eadavis@qq.com
commit c58a812c8e49ad688f94f4b050ad5c5b388fc5d2 upstream.
An overflow occurred when performing the following calculation:
nr_pages = ((nr_subbufs + 1) << subbuf_order) - pgoff;
Add a check before the calculation to avoid this problem.
syzbot reported this as a slab-out-of-bounds in __rb_map_vma:
BUG: KASAN: slab-out-of-bounds in __rb_map_vma+0x9ab/0xae0 kernel/trace/ring_buffer.c:7058 Read of size 8 at addr ffff8880767dd2b8 by task syz-executor187/5836
CPU: 0 UID: 0 PID: 5836 Comm: syz-executor187 Not tainted 6.13.0-rc2-syzkaller-00159-gf932fb9b4074 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xc3/0x620 mm/kasan/report.c:489 kasan_report+0xd9/0x110 mm/kasan/report.c:602 __rb_map_vma+0x9ab/0xae0 kernel/trace/ring_buffer.c:7058 ring_buffer_map+0x56e/0x9b0 kernel/trace/ring_buffer.c:7138 tracing_buffers_mmap+0xa6/0x120 kernel/trace/trace.c:8482 call_mmap include/linux/fs.h:2183 [inline] mmap_file mm/internal.h:124 [inline] __mmap_new_file_vma mm/vma.c:2291 [inline] __mmap_new_vma mm/vma.c:2355 [inline] __mmap_region+0x1786/0x2670 mm/vma.c:2456 mmap_region+0x127/0x320 mm/mmap.c:1348 do_mmap+0xc00/0xfc0 mm/mmap.c:496 vm_mmap_pgoff+0x1ba/0x360 mm/util.c:580 ksys_mmap_pgoff+0x32c/0x5c0 mm/mmap.c:542 __do_sys_mmap arch/x86/kernel/sys_x86_64.c:89 [inline] __se_sys_mmap arch/x86/kernel/sys_x86_64.c:82 [inline] __x64_sys_mmap+0x125/0x190 arch/x86/kernel/sys_x86_64.c:82 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f
The reproducer for this bug is:
------------------------8<------------------------- #include <fcntl.h> #include <stdlib.h> #include <unistd.h> #include <asm/types.h> #include <sys/mman.h>
int main(int argc, char **argv) { int page_size = getpagesize(); int fd; void *meta;
system("echo 1 > /sys/kernel/tracing/buffer_size_kb"); fd = open("/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw", O_RDONLY);
meta = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, page_size * 5); } ------------------------>8-------------------------
Cc: stable@vger.kernel.org Fixes: 117c39200d9d7 ("ring-buffer: Introducing ring-buffer mapping functions") Link: https://lore.kernel.org/tencent_06924B6674ED771167C23CC336C097223609@qq.com Reported-by: syzbot+345e4443a21200874b18@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=345e4443a21200874b18 Signed-off-by: Edward Adam Davis eadavis@qq.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/ring_buffer.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 7e257e855dd1..60210fb5b211 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -7019,7 +7019,11 @@ static int __rb_map_vma(struct ring_buffer_per_cpu *cpu_buffer, lockdep_assert_held(&cpu_buffer->mapping_lock);
nr_subbufs = cpu_buffer->nr_pages + 1; /* + reader-subbuf */ - nr_pages = ((nr_subbufs + 1) << subbuf_order) - pgoff; /* + meta-page */ + nr_pages = ((nr_subbufs + 1) << subbuf_order); /* + meta-page */ + if (nr_pages <= pgoff) + return -EINVAL; + + nr_pages -= pgoff;
nr_vma_pages = vma_pages(vma); if (!nr_vma_pages || nr_vma_pages > nr_pages)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Suren Baghdasaryan surenb@google.com
commit 60da7445a142bd15e67f3cda915497781c3f781f upstream.
It was recently noticed that set_codetag_empty() might be used not only to mark NULL alloctag references as empty to avoid warnings but also to reset valid tags (in clear_page_tag_ref()). Since set_codetag_empty() is defined as NOOP for CONFIG_MEM_ALLOC_PROFILING_DEBUG=n, such use of set_codetag_empty() leads to subtle bugs. Fix set_codetag_empty() for CONFIG_MEM_ALLOC_PROFILING_DEBUG=n to reset the tag reference.
Link: https://lkml.kernel.org/r/20241130001423.1114965-2-surenb@google.com Fixes: a8fc28dad6d5 ("alloc_tag: introduce clear_page_tag_ref() helper function") Signed-off-by: Suren Baghdasaryan surenb@google.com Reported-by: David Wang 00107082@163.com Closes: https://lore.kernel.org/lkml/20241124074318.399027-1-00107082@163.com/ Cc: David Wang 00107082@163.com Cc: Kent Overstreet kent.overstreet@linux.dev Cc: Mike Rapoport (Microsoft) rppt@kernel.org Cc: Pasha Tatashin pasha.tatashin@soleen.com Cc: Sourav Panda souravpanda@google.com Cc: Yu Zhao yuzhao@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/alloc_tag.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/include/linux/alloc_tag.h +++ b/include/linux/alloc_tag.h @@ -48,7 +48,12 @@ static inline void set_codetag_empty(uni #else /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */
static inline bool is_codetag_empty(union codetag_ref *ref) { return false; } -static inline void set_codetag_empty(union codetag_ref *ref) {} + +static inline void set_codetag_empty(union codetag_ref *ref) +{ + if (ref) + ref->ct = NULL; +}
#endif /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
commit be691b5e593f2cc8cef67bbc59c1fb91b74a86a9 upstream.
Btrfs like other file systems can't really deal with I/O not aligned to it's internal block size (which strangely is called sector size in btrfs, for historical reasons), but the block layer split helper doesn't even know about that.
Round down the split boundary so that all I/Os are aligned.
Fixes: d5e4377d5051 ("btrfs: split zone append bios in btrfs_submit_bio") CC: stable@vger.kernel.org # 6.12 Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Damien Le Moal dlemoal@kernel.org Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/bio.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/fs/btrfs/bio.c +++ b/fs/btrfs/bio.c @@ -649,8 +649,14 @@ static u64 btrfs_append_map_length(struc map_length = min(map_length, bbio->fs_info->max_zone_append_size); sector_offset = bio_split_rw_at(&bbio->bio, &bbio->fs_info->limits, &nr_segs, map_length); - if (sector_offset) - return sector_offset << SECTOR_SHIFT; + if (sector_offset) { + /* + * bio_split_rw_at() could split at a size smaller than our + * sectorsize and thus cause unaligned I/Os. Fix that by + * always rounding down to the nearest boundary. + */ + return ALIGN_DOWN(sector_offset << SECTOR_SHIFT, bbio->fs_info->sectorsize); + } return map_length; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
commit d75d72a858f0c00ca8ae161b48cdb403807be4de upstream.
We have been using the following check
if (generation <= root->root_key.offset)
to make decisions about whether or not to visit a node during snapshot delete. This is because for normal subvolumes this is set to 0, and for snapshots it's set to the creation generation. The idea being that if the generation of the node is less than or equal to our creation generation then we don't need to visit that node, because it doesn't belong to us, we can simply drop our reference and move on.
However reloc roots don't have their generation stored in root->root_key.offset, instead that is the objectid of their corresponding fs root. This means we can incorrectly not walk into nodes that need to be dropped when deleting a reloc root.
There are a variety of consequences to making the wrong choice in two distinct areas.
visit_node_for_delete()
1. False positive. We think we are newer than the block when we really aren't. We don't visit the node and drop our reference to the node and carry on. This would result in leaked space. 2. False negative. We do decide to walk down into a block that we should have just dropped our reference to. However this means that the child node will have refs > 1, so we will switch to UPDATE_BACKREF, and then the subsequent walk_down_proc() will notice that btrfs_header_owner(node) != root->root_key.objectid and it'll break out of the loop, and then walk_up_proc() will drop our reference, so this appears to be ok.
do_walk_down()
1. False positive. We are in UPDATE_BACKREF and incorrectly decide that we are done and don't need to update the backref for our lower nodes. This is another case that simply won't happen with relocation, as we only have to do UPDATE_BACKREF if the node below us was shared and didn't have FULL_BACKREF set, and since we don't own that node because we're a reloc root we actually won't end up in this case. 2. False negative. Again this is tricky because as described above, we simply wouldn't be here from relocation, because we don't own any of the nodes because we never set btrfs_header_owner() to the reloc root objectid, and we always use FULL_BACKREF, we never actually need to set FULL_BACKREF on any children.
Having spent a lot of time stressing relocation/snapshot delete recently I've not seen this pop in practice. But this is objectively incorrect, so fix this to get the correct starting generation based on the root we're dropping to keep me from thinking there's a problem here.
CC: stable@vger.kernel.org Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Josef Bacik josef@toxicpanda.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/ctree.h | 19 +++++++++++++++++++ fs/btrfs/extent-tree.c | 6 +++--- 2 files changed, 22 insertions(+), 3 deletions(-)
--- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -371,6 +371,25 @@ static inline void btrfs_set_root_last_t }
/* + * Return the generation this root started with. + * + * Every normal root that is created with root->root_key.offset set to it's + * originating generation. If it is a snapshot it is the generation when the + * snapshot was created. + * + * However for TREE_RELOC roots root_key.offset is the objectid of the owning + * tree root. Thankfully we copy the root item of the owning tree root, which + * has it's last_snapshot set to what we would have root_key.offset set to, so + * return that if this is a TREE_RELOC root. + */ +static inline u64 btrfs_root_origin_generation(const struct btrfs_root *root) +{ + if (btrfs_root_id(root) == BTRFS_TREE_RELOC_OBJECTID) + return btrfs_root_last_snapshot(&root->root_item); + return root->root_key.offset; +} + +/* * Structure that conveys information about an extent that is going to replace * all the extents in a file range. */ --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -5308,7 +5308,7 @@ static bool visit_node_for_delete(struct * reference to it. */ generation = btrfs_node_ptr_generation(eb, slot); - if (!wc->update_ref || generation <= root->root_key.offset) + if (!wc->update_ref || generation <= btrfs_root_origin_generation(root)) return false;
/* @@ -5363,7 +5363,7 @@ static noinline void reada_walk_down(str goto reada;
if (wc->stage == UPDATE_BACKREF && - generation <= root->root_key.offset) + generation <= btrfs_root_origin_generation(root)) continue;
/* We don't lock the tree block, it's OK to be racy here */ @@ -5706,7 +5706,7 @@ static noinline int do_walk_down(struct * for the subtree */ if (wc->stage == UPDATE_BACKREF && - generation <= root->root_key.offset) { + generation <= btrfs_root_origin_generation(root)) { wc->lookup_info = 1; return 1; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit dfb92681a19e1d5172420baa242806414b3eff6f upstream.
[BUG] There is a bug report in the mailing list where btrfs_run_delayed_refs() failed to drop the ref count for logical 25870311358464 num_bytes 2113536.
The involved leaf dump looks like this:
item 166 key (25870311358464 168 2113536) itemoff 10091 itemsize 50 extent refs 1 gen 84178 flags 1 ref#0: shared data backref parent 32399126528000 count 0 <<< ref#1: shared data backref parent 31808973717504 count 1
Notice the count number is 0.
[CAUSE] There is no concrete evidence yet, but considering 0 -> 1 is also a single bit flipped, it's possible that hardware memory bitflip is involved, causing the on-disk extent tree to be corrupted.
[FIX] To prevent us reading such corrupted extent item, or writing such damaged extent item back to disk, enhance the handling of BTRFS_EXTENT_DATA_REF_KEY and BTRFS_SHARED_DATA_REF_KEY keys for both inlined and key items, to detect such 0 ref count and reject them.
CC: stable@vger.kernel.org # 5.4+ Link: https://lore.kernel.org/linux-btrfs/7c69dd49-c346-4806-86e7-e6f863a66f48@app... Reported-by: Frankie Fisher frankie@terrorise.me.uk Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-checker.c | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-)
--- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -1527,6 +1527,11 @@ static int check_extent_item(struct exte dref_offset, fs_info->sectorsize); return -EUCLEAN; } + if (unlikely(btrfs_extent_data_ref_count(leaf, dref) == 0)) { + extent_err(leaf, slot, + "invalid data ref count, should have non-zero value"); + return -EUCLEAN; + } inline_refs += btrfs_extent_data_ref_count(leaf, dref); break; /* Contains parent bytenr and ref count */ @@ -1539,6 +1544,11 @@ static int check_extent_item(struct exte inline_offset, fs_info->sectorsize); return -EUCLEAN; } + if (unlikely(btrfs_shared_data_ref_count(leaf, sref) == 0)) { + extent_err(leaf, slot, + "invalid shared data ref count, should have non-zero value"); + return -EUCLEAN; + } inline_refs += btrfs_shared_data_ref_count(leaf, sref); break; case BTRFS_EXTENT_OWNER_REF_KEY: @@ -1611,8 +1621,18 @@ static int check_simple_keyed_refs(struc { u32 expect_item_size = 0;
- if (key->type == BTRFS_SHARED_DATA_REF_KEY) + if (key->type == BTRFS_SHARED_DATA_REF_KEY) { + struct btrfs_shared_data_ref *sref; + + sref = btrfs_item_ptr(leaf, slot, struct btrfs_shared_data_ref); + if (unlikely(btrfs_shared_data_ref_count(leaf, sref) == 0)) { + extent_err(leaf, slot, + "invalid shared data backref count, should have non-zero value"); + return -EUCLEAN; + } + expect_item_size = sizeof(struct btrfs_shared_data_ref); + }
if (unlikely(btrfs_item_size(leaf, slot) != expect_item_size)) { generic_err(leaf, slot, @@ -1689,6 +1709,11 @@ static int check_extent_data_ref(struct offset, leaf->fs_info->sectorsize); return -EUCLEAN; } + if (unlikely(btrfs_extent_data_ref_count(leaf, dref) == 0)) { + extent_err(leaf, slot, + "invalid extent data backref count, should have non-zero value"); + return -EUCLEAN; + } } return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
commit 41856638e6c4ed51d8aa9e54f70059d1e357b46e upstream.
With uncoupling of physical and virtual address spaces population of the identity mapping was changed to use the type POPULATE_IDENTITY instead of POPULATE_DIRECT. This breaks DirectMap accounting:
cat /proc/meminfo
DirectMap4k: 55296 kB DirectMap1M: 18446744073709496320 kB
Adjust all locations of update_page_count() in vmem.c to use POPULATE_IDENTITY instead of POPULATE_DIRECT as well. With this accounting is correct again:
cat /proc/meminfo
DirectMap4k: 54264 kB DirectMap1M: 8334336 kB
Fixes: c98d2ecae08f ("s390/mm: Uncouple physical vs virtual address spaces") Cc: stable@vger.kernel.org Reviewed-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/boot/vmem.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/s390/boot/vmem.c +++ b/arch/s390/boot/vmem.c @@ -306,7 +306,7 @@ static void pgtable_pte_populate(pmd_t * pages++; } } - if (mode == POPULATE_DIRECT) + if (mode == POPULATE_IDENTITY) update_page_count(PG_DIRECT_MAP_4K, pages); }
@@ -339,7 +339,7 @@ static void pgtable_pmd_populate(pud_t * } pgtable_pte_populate(pmd, addr, next, mode); } - if (mode == POPULATE_DIRECT) + if (mode == POPULATE_IDENTITY) update_page_count(PG_DIRECT_MAP_1M, pages); }
@@ -372,7 +372,7 @@ static void pgtable_pud_populate(p4d_t * } pgtable_pmd_populate(pud, addr, next, mode); } - if (mode == POPULATE_DIRECT) + if (mode == POPULATE_IDENTITY) update_page_count(PG_DIRECT_MAP_2G, pages); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 3abb660f9e18925468685591a3702bda05faba4f upstream.
Use the helper function rather than reading it directly.
Reviewed-by: Yang Wang kevinyang.wang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 0ec43fbece784215d3c4469973e4556d70bce915) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c index 49e953f86ced..d1032e9992b4 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c @@ -278,7 +278,7 @@ static void nbio_v7_0_init_registers(struct amdgpu_device *adev) { uint32_t data;
- switch (adev->ip_versions[NBIO_HWIP][0]) { + switch (amdgpu_ip_version(adev, NBIO_HWIP, 0)) { case IP_VERSION(2, 5, 0): data = RREG32_SOC15(NBIO, 0, regRCC_DEV0_EPF6_STRAP4) & ~BIT(23); WREG32_SOC15(NBIO, 0, regRCC_DEV0_EPF6_STRAP4, data);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 41be00f839e9ee7753892a73a36ce4c14c6f5cbf upstream.
Use the helper function rather than reading it directly.
Reviewed-by: Yang Wang kevinyang.wang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit f1fd1d0f40272948aa6ab82a3a82ecbbc76dff53) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c @@ -4105,7 +4105,7 @@ static int gfx_v12_0_set_clockgating_sta if (amdgpu_sriov_vf(adev)) return 0;
- switch (adev->ip_versions[GC_HWIP][0]) { + switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { case IP_VERSION(12, 0, 0): case IP_VERSION(12, 0, 1): gfx_v12_0_update_gfx_clock_gating(adev,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 6ebc5b92190e01dd48313b68cbf752c9adcfefa8 upstream.
Use the helper function rather than reading it directly.
Reviewed-by: Yang Wang kevinyang.wang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 63bfd24088b42c6f55c2096bfc41b50213d419b2) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c index 0fbc3be81f14..f2ab5001b492 100644 --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c @@ -108,7 +108,7 @@ mmhub_v4_1_0_print_l2_protection_fault_status(struct amdgpu_device *adev, dev_err(adev->dev, "MMVM_L2_PROTECTION_FAULT_STATUS_LO32:0x%08X\n", status); - switch (adev->ip_versions[MMHUB_HWIP][0]) { + switch (amdgpu_ip_version(adev, MMHUB_HWIP, 0)) { case IP_VERSION(4, 1, 0): mmhub_cid = mmhub_client_ids_v4_1_0[cid][rw]; break;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit cc252bb592638e0f7aea40d580186c36d89526b8 upstream.
A bug was discovered where the idle shadow stacks were not initialized for offline CPUs when starting function graph tracer, and when they came online they were not traced due to the missing shadow stack. To fix this, the idle task shadow stack initialization was moved to using the CPU hotplug callbacks. But it removed the initialization when the function graph was enabled. The problem here is that the hotplug callbacks are called when the CPUs come online, but the idle shadow stack initialization only happens if function graph is currently active. This caused the online CPUs to not get their shadow stack initialized.
The idle shadow stack initialization still needs to be done when the function graph is registered, as they will not be allocated if function graph is not registered.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Link: https://lore.kernel.org/20241211135335.094ba282@batman.local.home Fixes: 2c02f7375e65 ("fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks") Reported-by: Linus Walleij linus.walleij@linaro.org Tested-by: Linus Walleij linus.walleij@linaro.org Closes: https://lore.kernel.org/all/CACRpkdaTBrHwRbbrphVy-=SeDz6MSsXhTKypOtLrTQ+DgGA... Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/fgraph.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/kernel/trace/fgraph.c +++ b/kernel/trace/fgraph.c @@ -1160,7 +1160,7 @@ void fgraph_update_pid_func(void) static int start_graph_tracing(void) { unsigned long **ret_stack_list; - int ret; + int ret, cpu;
ret_stack_list = kcalloc(FTRACE_RETSTACK_ALLOC_SIZE, sizeof(*ret_stack_list), GFP_KERNEL); @@ -1168,6 +1168,12 @@ static int start_graph_tracing(void) if (!ret_stack_list) return -ENOMEM;
+ /* The cpu_boot init_task->ret_stack will never be freed */ + for_each_online_cpu(cpu) { + if (!idle_task(cpu)->ret_stack) + ftrace_graph_init_idle_task(idle_task(cpu), cpu); + } + do { ret = alloc_retstack_tasklist(ret_stack_list); } while (ret == -EAGAIN);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Kelley mhklinux@outlook.com
commit 07a756a49f4b4290b49ea46e089cbe6f79ff8d26 upstream.
If the KVP (or VSS) daemon starts before the VMBus channel's ringbuffer is fully initialized, we can hit the panic below:
hv_utils: Registering HyperV Utility Driver hv_vmbus: registering driver hv_utils ... BUG: kernel NULL pointer dereference, address: 0000000000000000 CPU: 44 UID: 0 PID: 2552 Comm: hv_kvp_daemon Tainted: G E 6.11.0-rc3+ #1 RIP: 0010:hv_pkt_iter_first+0x12/0xd0 Call Trace: ... vmbus_recvpacket hv_kvp_onchannelcallback vmbus_on_event tasklet_action_common tasklet_action handle_softirqs irq_exit_rcu sysvec_hyperv_stimer0 </IRQ> <TASK> asm_sysvec_hyperv_stimer0 ... kvp_register_done hvt_op_read vfs_read ksys_read __x64_sys_read
This can happen because the KVP/VSS channel callback can be invoked even before the channel is fully opened: 1) as soon as hv_kvp_init() -> hvutil_transport_init() creates /dev/vmbus/hv_kvp, the kvp daemon can open the device file immediately and register itself to the driver by writing a message KVP_OP_REGISTER1 to the file (which is handled by kvp_on_msg() ->kvp_handle_handshake()) and reading the file for the driver's response, which is handled by hvt_op_read(), which calls hvt->on_read(), i.e. kvp_register_done().
2) the problem with kvp_register_done() is that it can cause the channel callback to be called even before the channel is fully opened, and when the channel callback is starting to run, util_probe()-> vmbus_open() may have not initialized the ringbuffer yet, so the callback can hit the panic of NULL pointer dereference.
To reproduce the panic consistently, we can add a "ssleep(10)" for KVP in __vmbus_open(), just before the first hv_ringbuffer_init(), and then we unload and reload the driver hv_utils, and run the daemon manually within the 10 seconds.
Fix the panic by reordering the steps in util_probe() so the char dev entry used by the KVP or VSS daemon is not created until after vmbus_open() has completed. This reordering prevents the race condition from happening.
Reported-by: Dexuan Cui decui@microsoft.com Fixes: e0fa3e5e7df6 ("Drivers: hv: utils: fix a race on userspace daemons registration") Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley mhklinux@outlook.com Acked-by: Wei Liu wei.liu@kernel.org Link: https://lore.kernel.org/r/20241106154247.2271-3-mhklinux@outlook.com Signed-off-by: Wei Liu wei.liu@kernel.org Message-ID: 20241106154247.2271-3-mhklinux@outlook.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hv/hv_kvp.c | 6 ++++++ drivers/hv/hv_snapshot.c | 6 ++++++ drivers/hv/hv_util.c | 9 +++++++++ drivers/hv/hyperv_vmbus.h | 2 ++ include/linux/hyperv.h | 1 + 5 files changed, 24 insertions(+)
--- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -767,6 +767,12 @@ hv_kvp_init(struct hv_util_service *srv) */ kvp_transaction.state = HVUTIL_DEVICE_INIT;
+ return 0; +} + +int +hv_kvp_init_transport(void) +{ hvt = hvutil_transport_init(kvp_devname, CN_KVP_IDX, CN_KVP_VAL, kvp_on_msg, kvp_on_reset); if (!hvt) --- a/drivers/hv/hv_snapshot.c +++ b/drivers/hv/hv_snapshot.c @@ -388,6 +388,12 @@ hv_vss_init(struct hv_util_service *srv) */ vss_transaction.state = HVUTIL_DEVICE_INIT;
+ return 0; +} + +int +hv_vss_init_transport(void) +{ hvt = hvutil_transport_init(vss_devname, CN_VSS_IDX, CN_VSS_VAL, vss_on_msg, vss_on_reset); if (!hvt) { --- a/drivers/hv/hv_util.c +++ b/drivers/hv/hv_util.c @@ -141,6 +141,7 @@ static struct hv_util_service util_heart static struct hv_util_service util_kvp = { .util_cb = hv_kvp_onchannelcallback, .util_init = hv_kvp_init, + .util_init_transport = hv_kvp_init_transport, .util_pre_suspend = hv_kvp_pre_suspend, .util_pre_resume = hv_kvp_pre_resume, .util_deinit = hv_kvp_deinit, @@ -149,6 +150,7 @@ static struct hv_util_service util_kvp = static struct hv_util_service util_vss = { .util_cb = hv_vss_onchannelcallback, .util_init = hv_vss_init, + .util_init_transport = hv_vss_init_transport, .util_pre_suspend = hv_vss_pre_suspend, .util_pre_resume = hv_vss_pre_resume, .util_deinit = hv_vss_deinit, @@ -613,6 +615,13 @@ static int util_probe(struct hv_device * if (ret) goto error;
+ if (srv->util_init_transport) { + ret = srv->util_init_transport(); + if (ret) { + vmbus_close(dev->channel); + goto error; + } + } return 0;
error: --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -370,12 +370,14 @@ void vmbus_on_event(unsigned long data); void vmbus_on_msg_dpc(unsigned long data);
int hv_kvp_init(struct hv_util_service *srv); +int hv_kvp_init_transport(void); void hv_kvp_deinit(void); int hv_kvp_pre_suspend(void); int hv_kvp_pre_resume(void); void hv_kvp_onchannelcallback(void *context);
int hv_vss_init(struct hv_util_service *srv); +int hv_vss_init_transport(void); void hv_vss_deinit(void); int hv_vss_pre_suspend(void); int hv_vss_pre_resume(void); --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1559,6 +1559,7 @@ struct hv_util_service { void *channel; void (*util_cb)(void *); int (*util_init)(struct hv_util_service *); + int (*util_init_transport)(void); void (*util_deinit)(void); int (*util_pre_suspend)(void); int (*util_pre_resume)(void);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dexuan Cui decui@microsoft.com
commit cb1b78f1c726c938bd47497c1ab16b01ce967f37 upstream.
hv_fcopy_uio_daemon.c:436:53: warning: '%s' directive output may be truncated writing up to 14 bytes into a region of size 10 [-Wformat-truncation=] 436 | snprintf(uio_dev_path, sizeof(uio_dev_path), "/dev/%s", uio_name);
Also added 'static' for the array 'desc[]'.
Fixes: 82b0945ce2c2 ("tools: hv: Add new fcopy application based on uio driver") Cc: stable@vger.kernel.org # 6.10+ Signed-off-by: Dexuan Cui decui@microsoft.com Reviewed-by: Saurabh Sengar ssengar@linux.microsoft.com Link: https://lore.kernel.org/r/20240910004433.50254-1-decui@microsoft.com Signed-off-by: Wei Liu wei.liu@kernel.org Message-ID: 20240910004433.50254-1-decui@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/hv/hv_fcopy_uio_daemon.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/tools/hv/hv_fcopy_uio_daemon.c b/tools/hv/hv_fcopy_uio_daemon.c index 7a00f3066a98..12743d7f164f 100644 --- a/tools/hv/hv_fcopy_uio_daemon.c +++ b/tools/hv/hv_fcopy_uio_daemon.c @@ -35,8 +35,6 @@ #define WIN8_SRV_MINOR 1 #define WIN8_SRV_VERSION (WIN8_SRV_MAJOR << 16 | WIN8_SRV_MINOR)
-#define MAX_FOLDER_NAME 15 -#define MAX_PATH_LEN 15 #define FCOPY_UIO "/sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/uio"
#define FCOPY_VER_COUNT 1 @@ -51,7 +49,7 @@ static const int fw_versions[] = {
#define HV_RING_SIZE 0x4000 /* 16KB ring buffer size */
-unsigned char desc[HV_RING_SIZE]; +static unsigned char desc[HV_RING_SIZE];
static int target_fd; static char target_fname[PATH_MAX]; @@ -409,8 +407,8 @@ int main(int argc, char *argv[]) struct vmbus_br txbr, rxbr; void *ring; uint32_t len = HV_RING_SIZE; - char uio_name[MAX_FOLDER_NAME] = {0}; - char uio_dev_path[MAX_PATH_LEN] = {0}; + char uio_name[NAME_MAX] = {0}; + char uio_dev_path[PATH_MAX] = {0};
static struct option long_options[] = { {"help", no_argument, 0, 'h' },
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Naman Jain namjain@linux.microsoft.com
commit bcc80dec91ee745b3d66f3e48f0ec2efdea97149 upstream.
read_hv_sched_clock_tsc() assumes that the Hyper-V clock counter is bigger than the variable hv_sched_clock_offset, which is cached during early boot, but depending on the timing this assumption may be false when a hibernated VM starts again (the clock counter starts from 0 again) and is resuming back (Note: hv_init_tsc_clocksource() is not called during hibernation/resume); consequently, read_hv_sched_clock_tsc() may return a negative integer (which is interpreted as a huge positive integer since the return type is u64) and new kernel messages are prefixed with huge timestamps before read_hv_sched_clock_tsc() grows big enough (which typically takes several seconds).
Fix the issue by saving the Hyper-V clock counter just before the suspend, and using it to correct the hv_sched_clock_offset in resume. This makes hv tsc page based sched_clock continuous and ensures that post resume, it starts from where it left off during suspend. Override x86_platform.save_sched_clock_state and x86_platform.restore_sched_clock_state routines to correct this as soon as possible.
Note: if Invariant TSC is available, the issue doesn't happen because 1) we don't register read_hv_sched_clock_tsc() for sched clock: See commit e5313f1c5404 ("clocksource/drivers/hyper-v: Rework clocksource and sched clock setup"); 2) the common x86 code adjusts TSC similarly: see __restore_processor_state() -> tsc_verify_tsc_adjust(true) and x86_platform.restore_sched_clock_state().
Cc: stable@vger.kernel.org Fixes: 1349401ff1aa ("clocksource/drivers/hyper-v: Suspend/resume Hyper-V clocksource for hibernation") Co-developed-by: Dexuan Cui decui@microsoft.com Signed-off-by: Dexuan Cui decui@microsoft.com Signed-off-by: Naman Jain namjain@linux.microsoft.com Reviewed-by: Michael Kelley mhklinux@outlook.com Link: https://lore.kernel.org/r/20240917053917.76787-1-namjain@linux.microsoft.com Signed-off-by: Wei Liu wei.liu@kernel.org Message-ID: 20240917053917.76787-1-namjain@linux.microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/mshyperv.c | 58 +++++++++++++++++++++++++++++++++++++ drivers/clocksource/hyperv_timer.c | 14 ++++++++ include/clocksource/hyperv_timer.h | 2 + 3 files changed, 73 insertions(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -223,6 +223,63 @@ static void hv_machine_crash_shutdown(st hyperv_cleanup(); } #endif /* CONFIG_CRASH_DUMP */ + +static u64 hv_ref_counter_at_suspend; +static void (*old_save_sched_clock_state)(void); +static void (*old_restore_sched_clock_state)(void); + +/* + * Hyper-V clock counter resets during hibernation. Save and restore clock + * offset during suspend/resume, while also considering the time passed + * before suspend. This is to make sure that sched_clock using hv tsc page + * based clocksource, proceeds from where it left off during suspend and + * it shows correct time for the timestamps of kernel messages after resume. + */ +static void save_hv_clock_tsc_state(void) +{ + hv_ref_counter_at_suspend = hv_read_reference_counter(); +} + +static void restore_hv_clock_tsc_state(void) +{ + /* + * Adjust the offsets used by hv tsc clocksource to + * account for the time spent before hibernation. + * adjusted value = reference counter (time) at suspend + * - reference counter (time) now. + */ + hv_adj_sched_clock_offset(hv_ref_counter_at_suspend - hv_read_reference_counter()); +} + +/* + * Functions to override save_sched_clock_state and restore_sched_clock_state + * functions of x86_platform. The Hyper-V clock counter is reset during + * suspend-resume and the offset used to measure time needs to be + * corrected, post resume. + */ +static void hv_save_sched_clock_state(void) +{ + old_save_sched_clock_state(); + save_hv_clock_tsc_state(); +} + +static void hv_restore_sched_clock_state(void) +{ + restore_hv_clock_tsc_state(); + old_restore_sched_clock_state(); +} + +static void __init x86_setup_ops_for_tsc_pg_clock(void) +{ + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE)) + return; + + old_save_sched_clock_state = x86_platform.save_sched_clock_state; + x86_platform.save_sched_clock_state = hv_save_sched_clock_state; + + old_restore_sched_clock_state = x86_platform.restore_sched_clock_state; + x86_platform.restore_sched_clock_state = hv_restore_sched_clock_state; +} #endif /* CONFIG_HYPERV */
static uint32_t __init ms_hyperv_platform(void) @@ -579,6 +636,7 @@ static void __init ms_hyperv_init_platfo
/* Register Hyper-V specific clocksource */ hv_init_clocksource(); + x86_setup_ops_for_tsc_pg_clock(); hv_vtl_init_platform(); #endif /* --- a/drivers/clocksource/hyperv_timer.c +++ b/drivers/clocksource/hyperv_timer.c @@ -27,7 +27,8 @@ #include <asm/mshyperv.h>
static struct clock_event_device __percpu *hv_clock_event; -static u64 hv_sched_clock_offset __ro_after_init; +/* Note: offset can hold negative values after hibernation. */ +static u64 hv_sched_clock_offset __read_mostly;
/* * If false, we're using the old mechanism for stimer0 interrupts @@ -470,6 +471,17 @@ static void resume_hv_clock_tsc(struct c hv_set_msr(HV_MSR_REFERENCE_TSC, tsc_msr.as_uint64); }
+/* + * Called during resume from hibernation, from overridden + * x86_platform.restore_sched_clock_state routine. This is to adjust offsets + * used to calculate time for hv tsc page based sched_clock, to account for + * time spent before hibernation. + */ +void hv_adj_sched_clock_offset(u64 offset) +{ + hv_sched_clock_offset -= offset; +} + #ifdef HAVE_VDSO_CLOCKMODE_HVCLOCK static int hv_cs_enable(struct clocksource *cs) { --- a/include/clocksource/hyperv_timer.h +++ b/include/clocksource/hyperv_timer.h @@ -38,6 +38,8 @@ extern void hv_remap_tsc_clocksource(voi extern unsigned long hv_get_tsc_pfn(void); extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
+extern void hv_adj_sched_clock_offset(u64 offset); + static __always_inline bool hv_read_tsc_page_tsc(const struct ms_hyperv_tsc_page *tsc_pg, u64 *cur_tsc, u64 *time)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 9b42d1e8e4fe9dc631162c04caa69b0d1860b0f0 upstream.
Use is_64_bit_hypercall() instead of is_64_bit_mode() to detect a 64-bit hypercall when completing said hypercall. For guests with protected state, e.g. SEV-ES and SEV-SNP, KVM must assume the hypercall was made in 64-bit mode as the vCPU state needed to detect 64-bit mode is unavailable.
Hacking the sev_smoke_test selftest to generate a KVM_HC_MAP_GPA_RANGE hypercall via VMGEXIT trips the WARN:
------------[ cut here ]------------ WARNING: CPU: 273 PID: 326626 at arch/x86/kvm/x86.h:180 complete_hypercall_exit+0x44/0xe0 [kvm] Modules linked in: kvm_amd kvm ... [last unloaded: kvm] CPU: 273 UID: 0 PID: 326626 Comm: sev_smoke_test Not tainted 6.12.0-smp--392e932fa0f3-feat #470 Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024 RIP: 0010:complete_hypercall_exit+0x44/0xe0 [kvm] Call Trace: <TASK> kvm_arch_vcpu_ioctl_run+0x2400/0x2720 [kvm] kvm_vcpu_ioctl+0x54f/0x630 [kvm] __se_sys_ioctl+0x6b/0xc0 do_syscall_64+0x83/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK> ---[ end trace 0000000000000000 ]---
Fixes: b5aead0064f3 ("KVM: x86: Assume a 64-bit hypercall for guests with protected state") Cc: stable@vger.kernel.org Cc: Tom Lendacky thomas.lendacky@amd.com Reviewed-by: Xiaoyao Li xiaoyao.li@intel.com Reviewed-by: Nikunj A Dadhania nikunj@amd.com Reviewed-by: Tom Lendacky thomas.lendacky@amd.com Reviewed-by: Binbin Wu binbin.wu@linux.intel.com Reviewed-by: Kai Huang kai.huang@intel.com Link: https://lore.kernel.org/r/20241128004344.4072099-2-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/x86.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9991,7 +9991,7 @@ static int complete_hypercall_exit(struc { u64 ret = vcpu->run->hypercall.ret;
- if (!is_64_bit_mode(vcpu)) + if (!is_64_bit_hypercall(vcpu)) ret = (u32)ret; kvm_rax_write(vcpu, ret); ++vcpu->stat.hypercalls;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Enzo Matsumiya ematsumiya@suse.de
commit e9f2517a3e18a54a3943c098d2226b245d488801 upstream.
Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") fixed a netns UAF by manually enabled socket refcounting (sk->sk_net_refcnt=1 and sock_inuse_add(net, 1)).
The reason the patch worked for that bug was because we now hold references to the netns (get_net_track() gets a ref internally) and they're properly released (internally, on __sk_destruct()), but only because sk->sk_net_refcnt was set.
Problem: (this happens regardless of CONFIG_NET_NS_REFCNT_TRACKER and regardless if init_net or other)
Setting sk->sk_net_refcnt=1 *manually* and *after* socket creation is not only out of cifs scope, but also technically wrong -- it's set conditionally based on user (=1) vs kernel (=0) sockets. And net/ implementations seem to base their user vs kernel space operations on it.
e.g. upon TCP socket close, the TCP timers are not cleared because sk->sk_net_refcnt=1: (cf. commit 151c9c724d05 ("tcp: properly terminate timers for kernel sockets"))
net/ipv4/tcp.c: void tcp_close(struct sock *sk, long timeout) { lock_sock(sk); __tcp_close(sk, timeout); release_sock(sk); if (!sk->sk_net_refcnt) inet_csk_clear_xmit_timers_sync(sk); sock_put(sk); }
Which will throw a lockdep warning and then, as expected, deadlock on tcp_write_timer().
A way to reproduce this is by running the reproducer from ef7134c7fc48 and then 'rmmod cifs'. A few seconds later, the deadlock/lockdep warning shows up.
Fix: We shouldn't mess with socket internals ourselves, so do not set sk_net_refcnt manually.
Also change __sock_create() to sock_create_kern() for explicitness.
As for non-init_net network namespaces, we deal with it the best way we can -- hold an extra netns reference for server->ssocket and drop it when it's released. This ensures that the netns still exists whenever we need to create/destroy server->ssocket, but is not directly tied to it.
Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") Cc: stable@vger.kernel.org Signed-off-by: Enzo Matsumiya ematsumiya@suse.de Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/connect.c | 36 ++++++++++++++++++++++++++---------- 1 file changed, 26 insertions(+), 10 deletions(-)
--- a/fs/smb/client/connect.c +++ b/fs/smb/client/connect.c @@ -987,9 +987,13 @@ clean_demultiplex_info(struct TCP_Server msleep(125); if (cifs_rdma_enabled(server)) smbd_destroy(server); + if (server->ssocket) { sock_release(server->ssocket); server->ssocket = NULL; + + /* Release netns reference for the socket. */ + put_net(cifs_net_ns(server)); }
if (!list_empty(&server->pending_mid_q)) { @@ -1037,6 +1041,7 @@ clean_demultiplex_info(struct TCP_Server */ }
+ /* Release netns reference for this server. */ put_net(cifs_net_ns(server)); kfree(server->leaf_fullpath); kfree(server); @@ -1713,6 +1718,8 @@ cifs_get_tcp_session(struct smb3_fs_cont
tcp_ses->ops = ctx->ops; tcp_ses->vals = ctx->vals; + + /* Grab netns reference for this server. */ cifs_set_net_ns(tcp_ses, get_net(current->nsproxy->net_ns));
tcp_ses->conn_id = atomic_inc_return(&tcpSesNextId); @@ -1844,6 +1851,7 @@ smbd_connected: out_err_crypto_release: cifs_crypto_secmech_release(tcp_ses);
+ /* Release netns reference for this server. */ put_net(cifs_net_ns(tcp_ses));
out_err: @@ -1852,8 +1860,10 @@ out_err: cifs_put_tcp_session(tcp_ses->primary_server, false); kfree(tcp_ses->hostname); kfree(tcp_ses->leaf_fullpath); - if (tcp_ses->ssocket) + if (tcp_ses->ssocket) { sock_release(tcp_ses->ssocket); + put_net(cifs_net_ns(tcp_ses)); + } kfree(tcp_ses); } return ERR_PTR(rc); @@ -3111,20 +3121,20 @@ generic_ip_connect(struct TCP_Server_Inf socket = server->ssocket; } else { struct net *net = cifs_net_ns(server); - struct sock *sk;
- rc = __sock_create(net, sfamily, SOCK_STREAM, - IPPROTO_TCP, &server->ssocket, 1); + rc = sock_create_kern(net, sfamily, SOCK_STREAM, IPPROTO_TCP, &server->ssocket); if (rc < 0) { cifs_server_dbg(VFS, "Error %d creating socket\n", rc); return rc; }
- sk = server->ssocket->sk; - __netns_tracker_free(net, &sk->ns_tracker, false); - sk->sk_net_refcnt = 1; - get_net_track(net, &sk->ns_tracker, GFP_KERNEL); - sock_inuse_add(net, 1); + /* + * Grab netns reference for the socket. + * + * It'll be released here, on error, or in clean_demultiplex_info() upon server + * teardown. + */ + get_net(net);
/* BB other socket options to set KEEPALIVE, NODELAY? */ cifs_dbg(FYI, "Socket created\n"); @@ -3138,8 +3148,10 @@ generic_ip_connect(struct TCP_Server_Inf }
rc = bind_socket(server); - if (rc < 0) + if (rc < 0) { + put_net(cifs_net_ns(server)); return rc; + }
/* * Eventually check for other socket options to change from @@ -3176,6 +3188,7 @@ generic_ip_connect(struct TCP_Server_Inf if (rc < 0) { cifs_dbg(FYI, "Error %d connecting to server\n", rc); trace_smb3_connect_err(server->hostname, server->conn_id, &server->dstaddr, rc); + put_net(cifs_net_ns(server)); sock_release(socket); server->ssocket = NULL; return rc; @@ -3184,6 +3197,9 @@ generic_ip_connect(struct TCP_Server_Inf if (sport == htons(RFC1001_PORT)) rc = ip_rfc1001_connect(server);
+ if (rc < 0) + put_net(cifs_net_ns(server)); + return rc; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com
commit 4b2efb9db0c22a130bbd1275e489b42c02d08050 upstream.
Check if ctx is not NULL before accessing its fields.
Fixes: 37dee2a2f433 ("accel/ivpu: Improve buffer object debug logs") Cc: stable@vger.kernel.org # v6.8 Reviewed-by: Karol Wachowski karol.wachowski@intel.com Reviewed-by: Jeffrey Hugo quic_jhugo@quicinc.com Signed-off-by: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241210130939.1575610-2-jacek... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/accel/ivpu/ivpu_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/accel/ivpu/ivpu_gem.c +++ b/drivers/accel/ivpu/ivpu_gem.c @@ -406,7 +406,7 @@ static void ivpu_bo_print_info(struct iv mutex_lock(&bo->lock);
drm_printf(p, "%-9p %-3u 0x%-12llx %-10lu 0x%-8x %-4u", - bo, bo->ctx->id, bo->vpu_addr, bo->base.base.size, + bo, bo->ctx ? bo->ctx->id : 0, bo->vpu_addr, bo->base.base.size, bo->flags, kref_read(&bo->base.base.refcount));
if (bo->base.pages)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com
commit 0f6482caa6acdfdfc744db7430771fe7e6c4e787 upstream.
Move pm_runtime_set_active() to ivpu_pm_init() so when ivpu_ipc_send_receive_internal() is executed before ivpu_pm_enable() it already has correct runtime state, even if last resume was not successful.
Fixes: 8ed520ff4682 ("accel/ivpu: Move set autosuspend delay to HW specific code") Cc: stable@vger.kernel.org # v6.7+ Reviewed-by: Karol Wachowski karol.wachowski@intel.com Reviewed-by: Jeffrey Hugo quic_jhugo@quicinc.com Signed-off-by: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241210130939.1575610-4-jacek... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/accel/ivpu/ivpu_pm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/accel/ivpu/ivpu_pm.c +++ b/drivers/accel/ivpu/ivpu_pm.c @@ -364,6 +364,7 @@ void ivpu_pm_init(struct ivpu_device *vd
pm_runtime_use_autosuspend(dev); pm_runtime_set_autosuspend_delay(dev, delay); + pm_runtime_set_active(dev);
ivpu_dbg(vdev, PM, "Autosuspend delay = %d\n", delay); } @@ -378,7 +379,6 @@ void ivpu_pm_enable(struct ivpu_device * { struct device *dev = vdev->drm.dev;
- pm_runtime_set_active(dev); pm_runtime_allow(dev); pm_runtime_mark_last_busy(dev); pm_runtime_put_autosuspend(dev);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit a6629626c584200daf495cc9a740048b455addcd upstream.
The test_event_printk() analyzes print formats of trace events looking for cases where it may dereference a pointer that is not in the ring buffer which can possibly be a bug when the trace event is read from the ring buffer and the content of that pointer no longer exists.
The function needs to accurately go from one print format argument to the next. It handles quotes and parenthesis that may be included in an argument. When it finds the start of the next argument, it uses a simple "c = strstr(fmt + i, ',')" to find the end of that argument!
In order to include "%s" dereferencing, it needs to process the entire content of the print format argument and not just the content of the first ',' it finds. As there may be content like:
({ const char *saved_ptr = trace_seq_buffer_ptr(p); static const char *access_str[] = { "---", "--x", "w--", "w-x", "-u-", "-ux", "wu-", "wux" }; union kvm_mmu_page_role role; role.word = REC->role; trace_seq_printf(p, "sp gen %u gfn %llx l%u %u-byte q%u%s %s%s" " %snxe %sad root %u %s%c", REC->mmu_valid_gen, REC->gfn, role.level, role.has_4_byte_gpte ? 4 : 8, role.quadrant, role.direct ? " direct" : "", access_str[role.access], role.invalid ? " invalid" : "", role.efer_nx ? "" : "!", role.ad_disabled ? "!" : "", REC->root_count, REC->unsync ? "unsync" : "sync", 0); saved_ptr; })
Which is an example of a full argument of an existing event. As the code already handles finding the next print format argument, process the argument at the end of it and not the start of it. This way it has both the start of the argument as well as the end of it.
Add a helper function "process_pointer()" that will do the processing during the loop as well as at the end. It also makes the code cleaner and easier to read.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Al Viro viro@ZenIV.linux.org.uk Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/20241217024720.362271189@goodmis.org Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 82 ++++++++++++++++++++++++++++---------------- 1 file changed, 53 insertions(+), 29 deletions(-)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -265,8 +265,7 @@ static bool test_field(const char *fmt, len = p - fmt;
for (; field->type; field++) { - if (strncmp(field->name, fmt, len) || - field->name[len]) + if (strncmp(field->name, fmt, len) || field->name[len]) continue; array_descriptor = strchr(field->type, '['); /* This is an array and is OK to dereference. */ @@ -275,6 +274,32 @@ static bool test_field(const char *fmt, return false; }
+/* Return true if the argument pointer is safe */ +static bool process_pointer(const char *fmt, int len, struct trace_event_call *call) +{ + const char *r, *e, *a; + + e = fmt + len; + + /* Find the REC-> in the argument */ + r = strstr(fmt, "REC->"); + if (r && r < e) { + /* + * Addresses of events on the buffer, or an array on the buffer is + * OK to dereference. There's ways to fool this, but + * this is to catch common mistakes, not malicious code. + */ + a = strchr(fmt, '&'); + if ((a && (a < r)) || test_field(r, call)) + return true; + } else if ((r = strstr(fmt, "__get_dynamic_array(")) && r < e) { + return true; + } else if ((r = strstr(fmt, "__get_sockaddr(")) && r < e) { + return true; + } + return false; +} + /* * Examine the print fmt of the event looking for unsafe dereference * pointers using %p* that could be recorded in the trace event and @@ -285,12 +310,12 @@ static void test_event_printk(struct tra { u64 dereference_flags = 0; bool first = true; - const char *fmt, *c, *r, *a; + const char *fmt; int parens = 0; char in_quote = 0; int start_arg = 0; int arg = 0; - int i; + int i, e;
fmt = call->print_fmt;
@@ -403,42 +428,41 @@ static void test_event_printk(struct tra case ',': if (in_quote || parens) continue; + e = i; i++; while (isspace(fmt[i])) i++; - start_arg = i; - if (!(dereference_flags & (1ULL << arg))) - goto next_arg;
- /* Find the REC-> in the argument */ - c = strchr(fmt + i, ','); - r = strstr(fmt + i, "REC->"); - if (r && (!c || r < c)) { - /* - * Addresses of events on the buffer, - * or an array on the buffer is - * OK to dereference. - * There's ways to fool this, but - * this is to catch common mistakes, - * not malicious code. - */ - a = strchr(fmt + i, '&'); - if ((a && (a < r)) || test_field(r, call)) + /* + * If start_arg is zero, then this is the start of the + * first argument. The processing of the argument happens + * when the end of the argument is found, as it needs to + * handle paranthesis and such. + */ + if (!start_arg) { + start_arg = i; + /* Balance out the i++ in the for loop */ + i--; + continue; + } + + if (dereference_flags & (1ULL << arg)) { + if (process_pointer(fmt + start_arg, e - start_arg, call)) dereference_flags &= ~(1ULL << arg); - } else if ((r = strstr(fmt + i, "__get_dynamic_array(")) && - (!c || r < c)) { - dereference_flags &= ~(1ULL << arg); - } else if ((r = strstr(fmt + i, "__get_sockaddr(")) && - (!c || r < c)) { - dereference_flags &= ~(1ULL << arg); }
- next_arg: - i--; + start_arg = i; arg++; + /* Balance out the i++ in the for loop */ + i--; } }
+ if (dereference_flags & (1ULL << arg)) { + if (process_pointer(fmt + start_arg, i - start_arg, call)) + dereference_flags &= ~(1ULL << arg); + } + /* * If you triggered the below warning, the trace event reported * uses an unsafe dereference pointer %p*. As the data stored
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit 917110481f6bc1c96b1e54b62bb114137fbc6d17 upstream.
The process_pointer() helper function looks to see if various trace event macros are used. These macros are for storing data in the event. This makes it safe to dereference as the dereference will then point into the event on the ring buffer where the content of the data stays with the event itself.
A few helper functions were missing. Those were:
__get_rel_dynamic_array() __get_dynamic_array_len() __get_rel_dynamic_array_len() __get_rel_sockaddr()
Also add a helper function find_print_string() to not need to use a middle man variable to test if the string exists.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Al Viro viro@ZenIV.linux.org.uk Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/20241217024720.521836792@goodmis.org Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -274,6 +274,15 @@ static bool test_field(const char *fmt, return false; }
+/* Look for a string within an argument */ +static bool find_print_string(const char *arg, const char *str, const char *end) +{ + const char *r; + + r = strstr(arg, str); + return r && r < end; +} + /* Return true if the argument pointer is safe */ static bool process_pointer(const char *fmt, int len, struct trace_event_call *call) { @@ -292,9 +301,17 @@ static bool process_pointer(const char * a = strchr(fmt, '&'); if ((a && (a < r)) || test_field(r, call)) return true; - } else if ((r = strstr(fmt, "__get_dynamic_array(")) && r < e) { + } else if (find_print_string(fmt, "__get_dynamic_array(", e)) { + return true; + } else if (find_print_string(fmt, "__get_rel_dynamic_array(", e)) { + return true; + } else if (find_print_string(fmt, "__get_dynamic_array_len(", e)) { + return true; + } else if (find_print_string(fmt, "__get_rel_dynamic_array_len(", e)) { + return true; + } else if (find_print_string(fmt, "__get_sockaddr(", e)) { return true; - } else if ((r = strstr(fmt, "__get_sockaddr(")) && r < e) { + } else if (find_print_string(fmt, "__get_rel_sockaddr(", e)) { return true; } return false;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit 65a25d9f7ac02e0cf361356e834d1c71d36acca9 upstream.
The test_event_printk() code makes sure that when a trace event is registered, any dereferenced pointers in from the event's TP_printk() are pointing to content in the ring buffer. But currently it does not handle "%s", as there's cases where the string pointer saved in the ring buffer points to a static string in the kernel that will never be freed. As that is a valid case, the pointer needs to be checked at runtime.
Currently the runtime check is done via trace_check_vprintf(), but to not have to replicate everything in vsnprintf() it does some logic with the va_list that may not be reliable across architectures. In order to get rid of that logic, more work in the test_event_printk() needs to be done. Some of the strings can be validated at this time when it is obvious the string is valid because the string will be saved in the ring buffer content.
Do all the validation of strings in the ring buffer at boot in test_event_printk(), and make sure that the field of the strings that point into the kernel are accessible. This will allow adding checks at runtime that will validate the fields themselves and not rely on paring the TP_printk() format at runtime.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Al Viro viro@ZenIV.linux.org.uk Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/20241217024720.685917008@goodmis.org Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 104 +++++++++++++++++++++++++++++++++++++------- 1 file changed, 89 insertions(+), 15 deletions(-)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -244,19 +244,16 @@ int trace_event_get_offsets(struct trace return tail->offset + tail->size; }
-/* - * Check if the referenced field is an array and return true, - * as arrays are OK to dereference. - */ -static bool test_field(const char *fmt, struct trace_event_call *call) + +static struct trace_event_fields *find_event_field(const char *fmt, + struct trace_event_call *call) { struct trace_event_fields *field = call->class->fields_array; - const char *array_descriptor; const char *p = fmt; int len;
if (!(len = str_has_prefix(fmt, "REC->"))) - return false; + return NULL; fmt += len; for (p = fmt; *p; p++) { if (!isalnum(*p) && *p != '_') @@ -267,11 +264,26 @@ static bool test_field(const char *fmt, for (; field->type; field++) { if (strncmp(field->name, fmt, len) || field->name[len]) continue; - array_descriptor = strchr(field->type, '['); - /* This is an array and is OK to dereference. */ - return array_descriptor != NULL; + + return field; } - return false; + return NULL; +} + +/* + * Check if the referenced field is an array and return true, + * as arrays are OK to dereference. + */ +static bool test_field(const char *fmt, struct trace_event_call *call) +{ + struct trace_event_fields *field; + + field = find_event_field(fmt, call); + if (!field) + return false; + + /* This is an array and is OK to dereference. */ + return strchr(field->type, '[') != NULL; }
/* Look for a string within an argument */ @@ -317,6 +329,53 @@ static bool process_pointer(const char * return false; }
+/* Return true if the string is safe */ +static bool process_string(const char *fmt, int len, struct trace_event_call *call) +{ + const char *r, *e, *s; + + e = fmt + len; + + /* + * There are several helper functions that return strings. + * If the argument contains a function, then assume its field is valid. + * It is considered that the argument has a function if it has: + * alphanumeric or '_' before a parenthesis. + */ + s = fmt; + do { + r = strstr(s, "("); + if (!r || r >= e) + break; + for (int i = 1; r - i >= s; i++) { + char ch = *(r - i); + if (isspace(ch)) + continue; + if (isalnum(ch) || ch == '_') + return true; + /* Anything else, this isn't a function */ + break; + } + /* A function could be wrapped in parethesis, try the next one */ + s = r + 1; + } while (s < e); + + /* + * If there's any strings in the argument consider this arg OK as it + * could be: REC->field ? "foo" : "bar" and we don't want to get into + * verifying that logic here. + */ + if (find_print_string(fmt, """, e)) + return true; + + /* Dereferenced strings are also valid like any other pointer */ + if (process_pointer(fmt, len, call)) + return true; + + /* Make sure the field is found, and consider it OK for now if it is */ + return find_event_field(fmt, call) != NULL; +} + /* * Examine the print fmt of the event looking for unsafe dereference * pointers using %p* that could be recorded in the trace event and @@ -326,6 +385,7 @@ static bool process_pointer(const char * static void test_event_printk(struct trace_event_call *call) { u64 dereference_flags = 0; + u64 string_flags = 0; bool first = true; const char *fmt; int parens = 0; @@ -416,8 +476,16 @@ static void test_event_printk(struct tra star = true; continue; } - if ((fmt[i + j] == 's') && star) - arg++; + if ((fmt[i + j] == 's')) { + if (star) + arg++; + if (WARN_ONCE(arg == 63, + "Too many args for event: %s", + trace_event_name(call))) + return; + dereference_flags |= 1ULL << arg; + string_flags |= 1ULL << arg; + } break; } break; @@ -464,7 +532,10 @@ static void test_event_printk(struct tra }
if (dereference_flags & (1ULL << arg)) { - if (process_pointer(fmt + start_arg, e - start_arg, call)) + if (string_flags & (1ULL << arg)) { + if (process_string(fmt + start_arg, e - start_arg, call)) + dereference_flags &= ~(1ULL << arg); + } else if (process_pointer(fmt + start_arg, e - start_arg, call)) dereference_flags &= ~(1ULL << arg); }
@@ -476,7 +547,10 @@ static void test_event_printk(struct tra }
if (dereference_flags & (1ULL << arg)) { - if (process_pointer(fmt + start_arg, i - start_arg, call)) + if (string_flags & (1ULL << arg)) { + if (process_string(fmt + start_arg, i - start_arg, call)) + dereference_flags &= ~(1ULL << arg); + } else if (process_pointer(fmt + start_arg, i - start_arg, call)) dereference_flags &= ~(1ULL << arg); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit afd2627f727b89496d79a6b934a025fc916d4ded upstream.
The TP_printk() portion of a trace event is executed at the time a event is read from the trace. This can happen seconds, minutes, hours, days, months, years possibly later since the event was recorded. If the print format contains a dereference to a string via "%s", and that string was allocated, there's a chance that string could be freed before it is read by the trace file.
To protect against such bugs, there are two functions that verify the event. The first one is test_event_printk(), which is called when the event is created. It reads the TP_printk() format as well as its arguments to make sure nothing may be dereferencing a pointer that was not copied into the ring buffer along with the event. If it is, it will trigger a WARN_ON().
For strings that use "%s", it is not so easy. The string may not reside in the ring buffer but may still be valid. Strings that are static and part of the kernel proper which will not be freed for the life of the running system, are safe to dereference. But to know if it is a pointer to a static string or to something on the heap can not be determined until the event is triggered.
This brings us to the second function that tests for the bad dereferencing of strings, trace_check_vprintf(). It would walk through the printf format looking for "%s", and when it finds it, it would validate that the pointer is safe to read. If not, it would produces a WARN_ON() as well and write into the ring buffer "[UNSAFE-MEMORY]".
The problem with this is how it used va_list to have vsnprintf() handle all the cases that it didn't need to check. Instead of re-implementing vsnprintf(), it would make a copy of the format up to the %s part, and call vsnprintf() with the current va_list ap variable, where the ap would then be ready to point at the string in question.
For architectures that passed va_list by reference this was possible. For architectures that passed it by copy it was not. A test_can_verify() function was used to differentiate between the two, and if it wasn't possible, it would disable it.
Even for architectures where this was feasible, it was a stretch to rely on such a method that is undocumented, and could cause issues later on with new optimizations of the compiler.
Instead, the first function test_event_printk() was updated to look at "%s" as well. If the "%s" argument is a pointer outside the event in the ring buffer, it would find the field type of the event that is the problem and mark the structure with a new flag called "needs_test". The event itself will be marked by TRACE_EVENT_FL_TEST_STR to let it be known that this event has a field that needs to be verified before the event can be printed using the printf format.
When the event fields are created from the field type structure, the fields would copy the field type's "needs_test" value.
Finally, before being printed, a new function ignore_event() is called which will check if the event has the TEST_STR flag set (if not, it returns false). If the flag is set, it then iterates through the events fields looking for the ones that have the "needs_test" flag set.
Then it uses the offset field from the field structure to find the pointer in the ring buffer event. It runs the tests to make sure that pointer is safe to print and if not, it triggers the WARN_ON() and also adds to the trace output that the event in question has an unsafe memory access.
The ignore_event() makes the trace_check_vprintf() obsolete so it is removed.
Link: https://lore.kernel.org/all/CAHk-=wh3uOnqnZPpR0PeLZZtyWbZLboZ7cHLCKRWsocvs9Y...
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Al Viro viro@ZenIV.linux.org.uk Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/20241217024720.848621576@goodmis.org Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/trace_events.h | 6 - kernel/trace/trace.c | 255 ++++++++----------------------------------- kernel/trace/trace.h | 6 - kernel/trace/trace_events.c | 32 +++-- kernel/trace/trace_output.c | 6 - 5 files changed, 88 insertions(+), 217 deletions(-)
--- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -285,7 +285,8 @@ struct trace_event_fields { const char *name; const int size; const int align; - const int is_signed; + const unsigned int is_signed:1; + unsigned int needs_test:1; const int filter_type; const int len; }; @@ -337,6 +338,7 @@ enum { TRACE_EVENT_FL_EPROBE_BIT, TRACE_EVENT_FL_FPROBE_BIT, TRACE_EVENT_FL_CUSTOM_BIT, + TRACE_EVENT_FL_TEST_STR_BIT, };
/* @@ -354,6 +356,7 @@ enum { * CUSTOM - Event is a custom event (to be attached to an exsiting tracepoint) * This is set when the custom event has not been attached * to a tracepoint yet, then it is cleared when it is. + * TEST_STR - The event has a "%s" that points to a string outside the event */ enum { TRACE_EVENT_FL_FILTERED = (1 << TRACE_EVENT_FL_FILTERED_BIT), @@ -367,6 +370,7 @@ enum { TRACE_EVENT_FL_EPROBE = (1 << TRACE_EVENT_FL_EPROBE_BIT), TRACE_EVENT_FL_FPROBE = (1 << TRACE_EVENT_FL_FPROBE_BIT), TRACE_EVENT_FL_CUSTOM = (1 << TRACE_EVENT_FL_CUSTOM_BIT), + TRACE_EVENT_FL_TEST_STR = (1 << TRACE_EVENT_FL_TEST_STR_BIT), };
#define TRACE_EVENT_FL_UKPROBE (TRACE_EVENT_FL_KPROBE | TRACE_EVENT_FL_UPROBE) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3635,17 +3635,12 @@ char *trace_iter_expand_format(struct tr }
/* Returns true if the string is safe to dereference from an event */ -static bool trace_safe_str(struct trace_iterator *iter, const char *str, - bool star, int len) +static bool trace_safe_str(struct trace_iterator *iter, const char *str) { unsigned long addr = (unsigned long)str; struct trace_event *trace_event; struct trace_event_call *event;
- /* Ignore strings with no length */ - if (star && !len) - return true; - /* OK if part of the event data */ if ((addr >= (unsigned long)iter->ent) && (addr < (unsigned long)iter->ent + iter->ent_size)) @@ -3685,181 +3680,69 @@ static bool trace_safe_str(struct trace_ return false; }
-static DEFINE_STATIC_KEY_FALSE(trace_no_verify); - -static int test_can_verify_check(const char *fmt, ...) -{ - char buf[16]; - va_list ap; - int ret; - - /* - * The verifier is dependent on vsnprintf() modifies the va_list - * passed to it, where it is sent as a reference. Some architectures - * (like x86_32) passes it by value, which means that vsnprintf() - * does not modify the va_list passed to it, and the verifier - * would then need to be able to understand all the values that - * vsnprintf can use. If it is passed by value, then the verifier - * is disabled. - */ - va_start(ap, fmt); - vsnprintf(buf, 16, "%d", ap); - ret = va_arg(ap, int); - va_end(ap); - - return ret; -} - -static void test_can_verify(void) -{ - if (!test_can_verify_check("%d %d", 0, 1)) { - pr_info("trace event string verifier disabled\n"); - static_branch_inc(&trace_no_verify); - } -} - /** - * trace_check_vprintf - Check dereferenced strings while writing to the seq buffer + * ignore_event - Check dereferenced fields while writing to the seq buffer * @iter: The iterator that holds the seq buffer and the event being printed - * @fmt: The format used to print the event - * @ap: The va_list holding the data to print from @fmt. * - * This writes the data into the @iter->seq buffer using the data from - * @fmt and @ap. If the format has a %s, then the source of the string - * is examined to make sure it is safe to print, otherwise it will - * warn and print "[UNSAFE MEMORY]" in place of the dereferenced string - * pointer. + * At boot up, test_event_printk() will flag any event that dereferences + * a string with "%s" that does exist in the ring buffer. It may still + * be valid, as the string may point to a static string in the kernel + * rodata that never gets freed. But if the string pointer is pointing + * to something that was allocated, there's a chance that it can be freed + * by the time the user reads the trace. This would cause a bad memory + * access by the kernel and possibly crash the system. + * + * This function will check if the event has any fields flagged as needing + * to be checked at runtime and perform those checks. + * + * If it is found that a field is unsafe, it will write into the @iter->seq + * a message stating what was found to be unsafe. + * + * @return: true if the event is unsafe and should be ignored, + * false otherwise. */ -void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, - va_list ap) +bool ignore_event(struct trace_iterator *iter) { - long text_delta = 0; - long data_delta = 0; - const char *p = fmt; - const char *str; - bool good; - int i, j; + struct ftrace_event_field *field; + struct trace_event *trace_event; + struct trace_event_call *event; + struct list_head *head; + struct trace_seq *seq; + const void *ptr;
- if (WARN_ON_ONCE(!fmt)) - return; + trace_event = ftrace_find_event(iter->ent->type);
- if (static_branch_unlikely(&trace_no_verify)) - goto print; + seq = &iter->seq;
- /* - * When the kernel is booted with the tp_printk command line - * parameter, trace events go directly through to printk(). - * It also is checked by this function, but it does not - * have an associated trace_array (tr) for it. - */ - if (iter->tr) { - text_delta = iter->tr->text_delta; - data_delta = iter->tr->data_delta; + if (!trace_event) { + trace_seq_printf(seq, "EVENT ID %d NOT FOUND?\n", iter->ent->type); + return true; }
- /* Don't bother checking when doing a ftrace_dump() */ - if (iter->fmt == static_fmt_buf) - goto print; - - while (*p) { - bool star = false; - int len = 0; - - j = 0; - - /* - * We only care about %s and variants - * as well as %p[sS] if delta is non-zero - */ - for (i = 0; p[i]; i++) { - if (i + 1 >= iter->fmt_size) { - /* - * If we can't expand the copy buffer, - * just print it. - */ - if (!trace_iter_expand_format(iter)) - goto print; - } - - if (p[i] == '\' && p[i+1]) { - i++; - continue; - } - if (p[i] == '%') { - /* Need to test cases like %08.*s */ - for (j = 1; p[i+j]; j++) { - if (isdigit(p[i+j]) || - p[i+j] == '.') - continue; - if (p[i+j] == '*') { - star = true; - continue; - } - break; - } - if (p[i+j] == 's') - break; + event = container_of(trace_event, struct trace_event_call, event); + if (!(event->flags & TRACE_EVENT_FL_TEST_STR)) + return false;
- if (text_delta && p[i+1] == 'p' && - ((p[i+2] == 's' || p[i+2] == 'S'))) - break; + head = trace_get_fields(event); + if (!head) { + trace_seq_printf(seq, "FIELDS FOR EVENT '%s' NOT FOUND?\n", + trace_event_name(event)); + return true; + }
- star = false; - } - j = 0; - } - /* If no %s found then just print normally */ - if (!p[i]) - break; + /* Offsets are from the iter->ent that points to the raw event */ + ptr = iter->ent;
- /* Copy up to the %s, and print that */ - strncpy(iter->fmt, p, i); - iter->fmt[i] = '\0'; - trace_seq_vprintf(&iter->seq, iter->fmt, ap); - - /* Add delta to %pS pointers */ - if (p[i+1] == 'p') { - unsigned long addr; - char fmt[4]; - - fmt[0] = '%'; - fmt[1] = 'p'; - fmt[2] = p[i+2]; /* Either %ps or %pS */ - fmt[3] = '\0'; - - addr = va_arg(ap, unsigned long); - addr += text_delta; - trace_seq_printf(&iter->seq, fmt, (void *)addr); + list_for_each_entry(field, head, link) { + const char *str; + bool good;
- p += i + 3; + if (!field->needs_test) continue; - }
- /* - * If iter->seq is full, the above call no longer guarantees - * that ap is in sync with fmt processing, and further calls - * to va_arg() can return wrong positional arguments. - * - * Ensure that ap is no longer used in this case. - */ - if (iter->seq.full) { - p = ""; - break; - } - - if (star) - len = va_arg(ap, int); - - /* The ap now points to the string data of the %s */ - str = va_arg(ap, const char *); - - good = trace_safe_str(iter, str, star, len); + str = *(const char **)(ptr + field->offset);
- /* Could be from the last boot */ - if (data_delta && !good) { - str += data_delta; - good = trace_safe_str(iter, str, star, len); - } + good = trace_safe_str(iter, str);
/* * If you hit this warning, it is likely that the @@ -3870,44 +3753,14 @@ void trace_check_vprintf(struct trace_it * instead. See samples/trace_events/trace-events-sample.h * for reference. */ - if (WARN_ONCE(!good, "fmt: '%s' current_buffer: '%s'", - fmt, seq_buf_str(&iter->seq.seq))) { - int ret; - - /* Try to safely read the string */ - if (star) { - if (len + 1 > iter->fmt_size) - len = iter->fmt_size - 1; - if (len < 0) - len = 0; - ret = copy_from_kernel_nofault(iter->fmt, str, len); - iter->fmt[len] = 0; - star = false; - } else { - ret = strncpy_from_kernel_nofault(iter->fmt, str, - iter->fmt_size); - } - if (ret < 0) - trace_seq_printf(&iter->seq, "(0x%px)", str); - else - trace_seq_printf(&iter->seq, "(0x%px:%s)", - str, iter->fmt); - str = "[UNSAFE-MEMORY]"; - strcpy(iter->fmt, "%s"); - } else { - strncpy(iter->fmt, p + i, j + 1); - iter->fmt[j+1] = '\0'; + if (WARN_ONCE(!good, "event '%s' has unsafe pointer field '%s'", + trace_event_name(event), field->name)) { + trace_seq_printf(seq, "EVENT %s: HAS UNSAFE POINTER FIELD '%s'\n", + trace_event_name(event), field->name); + return true; } - if (star) - trace_seq_printf(&iter->seq, iter->fmt, len, str); - else - trace_seq_printf(&iter->seq, iter->fmt, str); - - p += i + j + 1; } - print: - if (*p) - trace_seq_vprintf(&iter->seq, p, ap); + return false; }
const char *trace_event_format(struct trace_iterator *iter, const char *fmt) @@ -10803,8 +10656,6 @@ __init static int tracer_alloc_buffers(v
register_snapshot_cmd();
- test_can_verify(); - return 0;
out_free_pipe_cpumask: --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -664,9 +664,8 @@ void trace_buffer_unlock_commit_nostack(
bool trace_is_tracepoint_string(const char *str); const char *trace_event_format(struct trace_iterator *iter, const char *fmt); -void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, - va_list ap) __printf(2, 0); char *trace_iter_expand_format(struct trace_iterator *iter); +bool ignore_event(struct trace_iterator *iter);
int trace_empty(struct trace_iterator *iter);
@@ -1402,7 +1401,8 @@ struct ftrace_event_field { int filter_type; int offset; int size; - int is_signed; + unsigned int is_signed:1; + unsigned int needs_test:1; int len; };
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -82,7 +82,7 @@ static int system_refcount_dec(struct ev }
static struct ftrace_event_field * -__find_event_field(struct list_head *head, char *name) +__find_event_field(struct list_head *head, const char *name) { struct ftrace_event_field *field;
@@ -114,7 +114,8 @@ trace_find_event_field(struct trace_even
static int __trace_define_field(struct list_head *head, const char *type, const char *name, int offset, int size, - int is_signed, int filter_type, int len) + int is_signed, int filter_type, int len, + int need_test) { struct ftrace_event_field *field;
@@ -133,6 +134,7 @@ static int __trace_define_field(struct l field->offset = offset; field->size = size; field->is_signed = is_signed; + field->needs_test = need_test; field->len = len;
list_add(&field->link, head); @@ -151,13 +153,13 @@ int trace_define_field(struct trace_even
head = trace_get_fields(call); return __trace_define_field(head, type, name, offset, size, - is_signed, filter_type, 0); + is_signed, filter_type, 0, 0); } EXPORT_SYMBOL_GPL(trace_define_field);
static int trace_define_field_ext(struct trace_event_call *call, const char *type, const char *name, int offset, int size, int is_signed, - int filter_type, int len) + int filter_type, int len, int need_test) { struct list_head *head;
@@ -166,13 +168,13 @@ static int trace_define_field_ext(struct
head = trace_get_fields(call); return __trace_define_field(head, type, name, offset, size, - is_signed, filter_type, len); + is_signed, filter_type, len, need_test); }
#define __generic_field(type, item, filter_type) \ ret = __trace_define_field(&ftrace_generic_fields, #type, \ #item, 0, 0, is_signed_type(type), \ - filter_type, 0); \ + filter_type, 0, 0); \ if (ret) \ return ret;
@@ -181,7 +183,8 @@ static int trace_define_field_ext(struct "common_" #item, \ offsetof(typeof(ent), item), \ sizeof(ent.item), \ - is_signed_type(type), FILTER_OTHER, 0); \ + is_signed_type(type), FILTER_OTHER, \ + 0, 0); \ if (ret) \ return ret;
@@ -332,6 +335,7 @@ static bool process_pointer(const char * /* Return true if the string is safe */ static bool process_string(const char *fmt, int len, struct trace_event_call *call) { + struct trace_event_fields *field; const char *r, *e, *s;
e = fmt + len; @@ -372,8 +376,16 @@ static bool process_string(const char *f if (process_pointer(fmt, len, call)) return true;
- /* Make sure the field is found, and consider it OK for now if it is */ - return find_event_field(fmt, call) != NULL; + /* Make sure the field is found */ + field = find_event_field(fmt, call); + if (!field) + return false; + + /* Test this field's string before printing the event */ + call->flags |= TRACE_EVENT_FL_TEST_STR; + field->needs_test = 1; + + return true; }
/* @@ -2586,7 +2598,7 @@ event_define_fields(struct trace_event_c ret = trace_define_field_ext(call, field->type, field->name, offset, field->size, field->is_signed, field->filter_type, - field->len); + field->len, field->needs_test); if (WARN_ON_ONCE(ret)) { pr_err("error code is %d\n", ret); break; --- a/kernel/trace/trace_output.c +++ b/kernel/trace/trace_output.c @@ -317,10 +317,14 @@ EXPORT_SYMBOL(trace_raw_output_prep);
void trace_event_printf(struct trace_iterator *iter, const char *fmt, ...) { + struct trace_seq *s = &iter->seq; va_list ap;
+ if (ignore_event(iter)) + return; + va_start(ap, fmt); - trace_check_vprintf(iter, trace_event_format(iter, fmt), ap); + trace_seq_vprintf(s, trace_event_format(iter, fmt), ap); va_end(ap); } EXPORT_SYMBOL(trace_event_printf);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Isaac J. Manjarres isaacmanjarres@google.com
commit 6a75f19af16ff482cfd6085c77123aa0f464f8dd upstream.
The sysctl tests for vm.memfd_noexec rely on the kernel to support PID namespaces (i.e. the kernel is built with CONFIG_PID_NS=y). If the kernel the test runs on does not support PID namespaces, the first sysctl test will fail when attempting to spawn a new thread in a new PID namespace, abort the test, preventing the remaining tests from being run.
This is not desirable, as not all kernels need PID namespaces, but can still use the other features provided by memfd. Therefore, only run the sysctl tests if the kernel supports PID namespaces. Otherwise, skip those tests and emit an informative message to let the user know why the sysctl tests are not being run.
Link: https://lkml.kernel.org/r/20241205192943.3228757-1-isaacmanjarres@google.com Fixes: 11f75a01448f ("selftests/memfd: add tests for MFD_NOEXEC_SEAL MFD_EXEC") Signed-off-by: Isaac J. Manjarres isaacmanjarres@google.com Reviewed-by: Jeff Xu jeffxu@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Kalesh Singh kaleshsingh@google.com Cc: stable@vger.kernel.org [6.6+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/memfd/memfd_test.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/memfd/memfd_test.c +++ b/tools/testing/selftests/memfd/memfd_test.c @@ -9,6 +9,7 @@ #include <fcntl.h> #include <linux/memfd.h> #include <sched.h> +#include <stdbool.h> #include <stdio.h> #include <stdlib.h> #include <signal.h> @@ -1557,6 +1558,11 @@ static void test_share_fork(char *banner close(fd); }
+static bool pid_ns_supported(void) +{ + return access("/proc/self/ns/pid", F_OK) == 0; +} + int main(int argc, char **argv) { pid_t pid; @@ -1591,8 +1597,12 @@ int main(int argc, char **argv) test_seal_grow(); test_seal_resize();
- test_sysctl_simple(); - test_sysctl_nested(); + if (pid_ns_supported()) { + test_sysctl_simple(); + test_sysctl_nested(); + } else { + printf("PID namespaces are not supported; skipping sysctl tests\n"); + }
test_share_dup("SHARE-DUP", ""); test_share_mmap("SHARE-MMAP", "");
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tiezhu Yang yangtiezhu@loongson.cn
commit 29d44cce324dab2bd86c447071a596262e7109b6 upstream.
Currently, LoongArch LLVM does not support the constraint "o" and no plan to support it, it only supports the similar constraint "m", so change the constraints from "nor" in the "else" case to arch-specific "nmr" to avoid the build error such as "unexpected asm memory constraint" for LoongArch.
Fixes: 630301b0d59d ("selftests/bpf: Add basic USDT selftests") Suggested-by: Weining Lu luweining@loongson.cn Suggested-by: Li Chen chenli@loongson.cn Signed-off-by: Tiezhu Yang yangtiezhu@loongson.cn Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Huacai Chen chenhuacai@loongson.cn Cc: stable@vger.kernel.org Link: https://llvm.org/docs/LangRef.html#supported-constraint-code-list Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/LoongArch/Loo... Link: https://lore.kernel.org/bpf/20241219111506.20643-1-yangtiezhu@loongson.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/bpf/sdt.h | 2 ++ 1 file changed, 2 insertions(+)
--- a/tools/testing/selftests/bpf/sdt.h +++ b/tools/testing/selftests/bpf/sdt.h @@ -102,6 +102,8 @@ # define STAP_SDT_ARG_CONSTRAINT nZr # elif defined __arm__ # define STAP_SDT_ARG_CONSTRAINT g +# elif defined __loongarch__ +# define STAP_SDT_ARG_CONSTRAINT nmr # else # define STAP_SDT_ARG_CONSTRAINT nor # endif
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit 12d908116f7efd34f255a482b9afc729d7a5fb78 upstream.
Currently, io_uring_unreg_ringfd() (which cleans up registered rings) is only called on exit, but __io_uring_free (which frees the tctx in which the registered ring pointers are stored) is also called on execve (via begin_new_exec -> io_uring_task_cancel -> __io_uring_cancel -> io_uring_cancel_generic -> __io_uring_free).
This means: A process going through execve while having registered rings will leak references to the rings' `struct file`.
Fix it by zapping registered rings on execve(). This is implemented by moving the io_uring_unreg_ringfd() from io_uring_files_cancel() into its callee __io_uring_cancel(), which is called from io_uring_task_cancel() on execve.
This could probably be exploited *on 32-bit kernels* by leaking 2^32 references to the same ring, because the file refcount is stored in a pointer-sized field and get_file() doesn't have protection against refcount overflow, just a WARN_ONCE(); but on 64-bit it should have no impact beyond a memory leak.
Cc: stable@vger.kernel.org Fixes: e7a6c00dc77a ("io_uring: add support for registering ring file descriptors") Signed-off-by: Jann Horn jannh@google.com Link: https://lore.kernel.org/r/20241218-uring-reg-ring-cleanup-v1-1-8f63e999045b@... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/io_uring.h | 4 +--- io_uring/io_uring.c | 1 + 2 files changed, 2 insertions(+), 3 deletions(-)
--- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -15,10 +15,8 @@ bool io_is_uring_fops(struct file *file)
static inline void io_uring_files_cancel(void) { - if (current->io_uring) { - io_uring_unreg_ringfd(); + if (current->io_uring) __io_uring_cancel(false); - } } static inline void io_uring_task_cancel(void) { --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3230,6 +3230,7 @@ end_wait:
void __io_uring_cancel(bool cancel_all) { + io_uring_unreg_ringfd(); io_uring_cancel_generic(cancel_all, NULL); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Begunkov asml.silence@gmail.com
commit dbd2ca9367eb19bc5e269b8c58b0b1514ada9156 upstream.
task work can be executed after the task has gone through io_uring termination, whether it's the final task_work run or the fallback path. In this case, task work will find ->io_wq being already killed and null'ed, which is a problem if it then tries to forward the request to io_queue_iowq(). Make io_queue_iowq() fail requests in this case.
Note that it also checks PF_KTHREAD, because the user can first close a DEFER_TASKRUN ring and shortly after kill the task, in which case ->iowq check would race.
Cc: stable@vger.kernel.org Fixes: 50c52250e2d74 ("block: implement async io_uring discard cmd") Fixes: 773af69121ecc ("io_uring: always reissue from task_work context") Reported-by: Will willsroot@protonmail.com Signed-off-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/63312b4a2c2bb67ad67b857d17a300e1d3b078e8.173463790... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -515,7 +515,11 @@ static void io_queue_iowq(struct io_kioc struct io_uring_task *tctx = req->task->io_uring;
BUG_ON(!tctx); - BUG_ON(!tctx->io_wq); + + if ((current->flags & PF_KTHREAD) || !tctx->io_wq) { + io_req_task_queue_fail(req, -ECANCELED); + return; + }
/* init ->work of the whole link before punting */ io_prep_async_link(req);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 62e2a47ceab8f3f7d2e3f0e03fdd1c5e0059fd8b upstream.
When the server is recalling a layout, we should ignore the count of outstanding layoutget calls, since the server is expected to return either NFS4ERR_RECALLCONFLICT or NFS4ERR_RETURNCONFLICT for as long as the recall is outstanding. Currently, we may end up livelocking, causing the layout to eventually be forcibly revoked.
Fixes: bf0291dd2267 ("pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfs/pnfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -1308,7 +1308,7 @@ pnfs_prepare_layoutreturn(struct pnfs_la enum pnfs_iomode *iomode) { /* Serialise LAYOUTGET/LAYOUTRETURN */ - if (atomic_read(&lo->plh_outstanding) != 0) + if (atomic_read(&lo->plh_outstanding) != 0 && lo->plh_return_seq == 0) return false; if (test_and_set_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags)) return false;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 4d5163cba43fe96902165606fa54e1aecbbb32de upstream.
Drop KVM's arbitrary behavior of making DE_CFG.LFENCE_SERIALIZE read-only for the guest, as rejecting writes can lead to guest crashes, e.g. Windows in particular doesn't gracefully handle unexpected #GPs on the WRMSR, and nothing in the AMD manuals suggests that LFENCE_SERIALIZE is read-only _if it exists_.
KVM only allows LFENCE_SERIALIZE to be set, by the guest or host, if the underlying CPU has X86_FEATURE_LFENCE_RDTSC, i.e. if LFENCE is guaranteed to be serializing. So if the guest sets LFENCE_SERIALIZE, KVM will provide the desired/correct behavior without any additional action (the guest's value is never stuffed into hardware). And having LFENCE be serializing even when it's not _required_ to be is a-ok from a functional perspective.
Fixes: 74a0e79df68a ("KVM: SVM: Disallow guest from changing userspace's MSR_AMD64_DE_CFG value") Fixes: d1d93fa90f1a ("KVM: SVM: Add MSR-based feature support for serializing LFENCE") Reported-by: Simon Pilkington simonp.git@mailbox.org Closes: https://lore.kernel.org/all/52914da7-a97b-45ad-86a0-affdf8266c61@mailbox.org Cc: Tom Lendacky thomas.lendacky@amd.com Cc: stable@vger.kernel.org Reviewed-by: Tom Lendacky thomas.lendacky@amd.com Link: https://lore.kernel.org/r/20241211172952.1477605-1-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/svm.c | 9 --------- 1 file changed, 9 deletions(-)
--- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3199,15 +3199,6 @@ static int svm_set_msr(struct kvm_vcpu * if (data & ~supported_de_cfg) return 1;
- /* - * Don't let the guest change the host-programmed value. The - * MSR is very model specific, i.e. contains multiple bits that - * are completely unknown to KVM, and the one bit known to KVM - * is simply a reflection of hardware capabilities. - */ - if (!msr->host_initiated && data != svm->msr_decfg) - return 1; - svm->msr_decfg = data; break; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zijun Hu quic_zijuhu@quicinc.com
commit fec3edc47d5cfc2dd296a5141df887bf567944db upstream.
On a malformed interrupt-map property which is shorter than expected by 1 cell, we may read bogus data past the end of the property instead of returning an error in of_irq_parse_imap_parent().
Decrement the remaining length when skipping over the interrupt parent phandle cell.
Fixes: 935df1bd40d4 ("of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw()") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com Link: https://lore.kernel.org/r/20241209-of_irq_fix-v1-1-782f1419c8a1@quicinc.com [rh: reword commit msg] Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/irq.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/of/irq.c +++ b/drivers/of/irq.c @@ -111,6 +111,7 @@ const __be32 *of_irq_parse_imap_parent(c else np = of_find_node_by_phandle(be32_to_cpup(imap)); imap++; + len--;
/* Check if not found */ if (!np) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zijun Hu quic_zijuhu@quicinc.com
commit 0f7ca6f69354e0c3923bbc28c92d0ecab4d50a3e upstream.
of_irq_parse_one() may use uninitialized variable @addr_len as shown below:
// @addr_len is uninitialized int addr_len;
// This operation does not touch @addr_len if it fails. addr = of_get_property(device, "reg", &addr_len);
// Use uninitialized @addr_len if the operation fails. if (addr_len > sizeof(addr_buf)) addr_len = sizeof(addr_buf);
// Check the operation result here. if (addr) memcpy(addr_buf, addr, addr_len);
Fix by initializing @addr_len before the operation.
Fixes: b739dffa5d57 ("of/irq: Prevent device address out-of-bounds read in interrupt map walk") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com Link: https://lore.kernel.org/r/20241209-of_irq_fix-v1-4-782f1419c8a1@quicinc.com Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/irq.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/of/irq.c +++ b/drivers/of/irq.c @@ -355,6 +355,7 @@ int of_irq_parse_one(struct device_node return of_irq_parse_oldworld(device, index, out_irq);
/* Get the reg property (if any) */ + addr_len = 0; addr = of_get_property(device, "reg", &addr_len);
/* Prevent out-of-bounds read in case of longer interrupt parent address size */
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heming Zhao heming.zhao@suse.com
commit 7782e3b3b004e8cb94a88621a22cc3c2f33e5b90 upstream.
Commit 30dd3478c3cd ("ocfs2: correctly use ocfs2_find_next_zero_bit()") introduced an issue, the ocfs2_sync_local_to_main() ignores the last contiguous free bits, which causes an OCFS2 volume to lose the last free clusters of LA window during the release routine.
Please note, because commit dfe6c5692fb5 ("ocfs2: fix the la space leak when unmounting an ocfs2 volume") was reverted, this commit is a replacement fix for commit dfe6c5692fb5.
Link: https://lkml.kernel.org/r/20241205104835.18223-3-heming.zhao@suse.com Fixes: 30dd3478c3cd ("ocfs2: correctly use ocfs2_find_next_zero_bit()") Signed-off-by: Heming Zhao heming.zhao@suse.com Suggested-by: Joseph Qi joseph.qi@linux.alibaba.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Jun Piao piaojun@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/localalloc.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/fs/ocfs2/localalloc.c +++ b/fs/ocfs2/localalloc.c @@ -971,9 +971,9 @@ static int ocfs2_sync_local_to_main(stru start = count = 0; left = le32_to_cpu(alloc->id1.bitmap1.i_total);
- while ((bit_off = ocfs2_find_next_zero_bit(bitmap, left, start)) < - left) { - if (bit_off == start) { + while (1) { + bit_off = ocfs2_find_next_zero_bit(bitmap, left, start); + if ((bit_off < left) && (bit_off == start)) { count++; start++; continue; @@ -998,6 +998,8 @@ static int ocfs2_sync_local_to_main(stru } }
+ if (bit_off >= left) + break; count = 1; start = bit_off + 1; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi konishi.ryusuke@gmail.com
commit 6309b8ce98e9a18390b9fd8f03fc412f3c17aee9 upstream.
When block_invalidatepage was converted to block_invalidate_folio, the fallback to block_invalidatepage in folio_invalidate() if the address_space_operations method invalidatepage (currently invalidate_folio) was not set, was removed.
Unfortunately, some pseudo-inodes in nilfs2 use empty_aops set by inode_init_always_gfp() as is, or explicitly set it to address_space_operations. Therefore, with this change, block_invalidatepage() is no longer called from folio_invalidate(), and as a result, the buffer_head structures attached to these pages/folios are no longer freed via try_to_free_buffers().
Thus, these buffer heads are now leaked by truncate_inode_pages(), which cleans up the page cache from inode evict(), etc.
Three types of caches use empty_aops: gc inode caches and the DAT shadow inode used by GC, and b-tree node caches. Of these, b-tree node caches explicitly call invalidate_mapping_pages() during cleanup, which involves calling try_to_free_buffers(), so the leak was not visible during normal operation but worsened when GC was performed.
Fix this issue by using address_space_operations with invalidate_folio set to block_invalidate_folio instead of empty_aops, which will ensure the same behavior as before.
Link: https://lkml.kernel.org/r/20241212164556.21338-1-konishi.ryusuke@gmail.com Fixes: 7ba13abbd31e ("fs: Turn block_invalidatepage into block_invalidate_folio") Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org [5.18+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/btnode.c | 1 + fs/nilfs2/gcinode.c | 2 +- fs/nilfs2/inode.c | 5 +++++ fs/nilfs2/nilfs.h | 1 + 4 files changed, 8 insertions(+), 1 deletion(-)
--- a/fs/nilfs2/btnode.c +++ b/fs/nilfs2/btnode.c @@ -35,6 +35,7 @@ void nilfs_init_btnc_inode(struct inode ii->i_flags = 0; memset(&ii->i_bmap_data, 0, sizeof(struct nilfs_bmap)); mapping_set_gfp_mask(btnc_inode->i_mapping, GFP_NOFS); + btnc_inode->i_mapping->a_ops = &nilfs_buffer_cache_aops; }
void nilfs_btnode_cache_clear(struct address_space *btnc) --- a/fs/nilfs2/gcinode.c +++ b/fs/nilfs2/gcinode.c @@ -163,7 +163,7 @@ int nilfs_init_gcinode(struct inode *ino
inode->i_mode = S_IFREG; mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS); - inode->i_mapping->a_ops = &empty_aops; + inode->i_mapping->a_ops = &nilfs_buffer_cache_aops;
ii->i_flags = 0; nilfs_bmap_init_gc(ii->i_bmap); --- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -307,6 +307,10 @@ const struct address_space_operations ni .is_partially_uptodate = block_is_partially_uptodate, };
+const struct address_space_operations nilfs_buffer_cache_aops = { + .invalidate_folio = block_invalidate_folio, +}; + static int nilfs_insert_inode_locked(struct inode *inode, struct nilfs_root *root, unsigned long ino) @@ -706,6 +710,7 @@ struct inode *nilfs_iget_for_shadow(stru NILFS_I(s_inode)->i_flags = 0; memset(NILFS_I(s_inode)->i_bmap, 0, sizeof(struct nilfs_bmap)); mapping_set_gfp_mask(s_inode->i_mapping, GFP_NOFS); + s_inode->i_mapping->a_ops = &nilfs_buffer_cache_aops;
err = nilfs_attach_btree_node_cache(s_inode); if (unlikely(err)) { --- a/fs/nilfs2/nilfs.h +++ b/fs/nilfs2/nilfs.h @@ -401,6 +401,7 @@ extern const struct file_operations nilf extern const struct inode_operations nilfs_file_inode_operations; extern const struct file_operations nilfs_file_operations; extern const struct address_space_operations nilfs_aops; +extern const struct address_space_operations nilfs_buffer_cache_aops; extern const struct inode_operations nilfs_dir_inode_operations; extern const struct inode_operations nilfs_special_inode_operations; extern const struct inode_operations nilfs_symlink_inode_operations;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Edward Adam Davis eadavis@qq.com
commit 901ce9705fbb9f330ff1f19600e5daf9770b0175 upstream.
syzbot reported a WARNING in nilfs_rmdir. [1]
Because the inode bitmap is corrupted, an inode with an inode number that should exist as a ".nilfs" file was reassigned by nilfs_mkdir for "file0", causing an inode duplication during execution. And this causes an underflow of i_nlink in rmdir operations.
The inode is used twice by the same task to unmount and remove directories ".nilfs" and "file0", it trigger warning in nilfs_rmdir.
Avoid to this issue, check i_nlink in nilfs_iget(), if it is 0, it means that this inode has been deleted, and iput is executed to reclaim it.
[1] WARNING: CPU: 1 PID: 5824 at fs/inode.c:407 drop_nlink+0xc4/0x110 fs/inode.c:407 ... Call Trace: <TASK> nilfs_rmdir+0x1b0/0x250 fs/nilfs2/namei.c:342 vfs_rmdir+0x3a3/0x510 fs/namei.c:4394 do_rmdir+0x3b5/0x580 fs/namei.c:4453 __do_sys_rmdir fs/namei.c:4472 [inline] __se_sys_rmdir fs/namei.c:4470 [inline] __x64_sys_rmdir+0x47/0x50 fs/namei.c:4470 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Link: https://lkml.kernel.org/r/20241209065759.6781-1-konishi.ryusuke@gmail.com Fixes: d25006523d0b ("nilfs2: pathname operations") Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Reported-by: syzbot+9260555647a5132edd48@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9260555647a5132edd48 Tested-by: syzbot+9260555647a5132edd48@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis eadavis@qq.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/inode.c | 8 +++++++- fs/nilfs2/namei.c | 5 +++++ 2 files changed, 12 insertions(+), 1 deletion(-)
--- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -579,8 +579,14 @@ struct inode *nilfs_iget(struct super_bl inode = nilfs_iget_locked(sb, root, ino); if (unlikely(!inode)) return ERR_PTR(-ENOMEM); - if (!(inode->i_state & I_NEW)) + + if (!(inode->i_state & I_NEW)) { + if (!inode->i_nlink) { + iput(inode); + return ERR_PTR(-ESTALE); + } return inode; + }
err = __nilfs_read_inode(sb, root, ino, inode); if (unlikely(err)) { --- a/fs/nilfs2/namei.c +++ b/fs/nilfs2/namei.c @@ -67,6 +67,11 @@ nilfs_lookup(struct inode *dir, struct d inode = NULL; } else { inode = nilfs_iget(dir->i_sb, NILFS_I(dir)->i_root, ino); + if (inode == ERR_PTR(-ESTALE)) { + nilfs_error(dir->i_sb, + "deleted inode referenced: %lu", ino); + return ERR_PTR(-EIO); + } }
return d_splice_alias(inode, dentry);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit 9cb189a882738c1d28b349d4e7c6a1ef9b3d8f87 upstream.
The current check_memfd_seals() is racy: Since we first do check_memfd_seals() and then udmabuf_pin_folios() without holding any relevant lock across both, F_SEAL_WRITE can be set in between. This is problematic because we can end up holding pins to pages in a write-sealed memfd.
Fix it using the inode lock, that's probably the easiest way. In the future, we might want to consider moving this logic into memfd, especially if anyone else wants to use memfd_pin_folios().
Reported-by: Julian Orth ju.orth@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219106 Closes: https://lore.kernel.org/r/CAG48ez0w8HrFEZtJkfmkVKFDhE5aP7nz=obrimeTgpD+StkV9... Fixes: fbb0de795078 ("Add udmabuf misc device") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn jannh@google.com Acked-by: Joel Fernandes (Google) joel@joelfernandes.org Acked-by: Vivek Kasireddy vivek.kasireddy@intel.com Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241204-udmabuf-fixes-v2-1-23... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma-buf/udmabuf.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
--- a/drivers/dma-buf/udmabuf.c +++ b/drivers/dma-buf/udmabuf.c @@ -394,14 +394,19 @@ static long udmabuf_create(struct miscde goto err; }
+ /* + * Take the inode lock to protect against concurrent + * memfd_add_seals(), which takes this lock in write mode. + */ + inode_lock_shared(file_inode(memfd)); ret = check_memfd_seals(memfd); - if (ret < 0) { - fput(memfd); - goto err; - } + if (ret) + goto out_unlock;
ret = udmabuf_pin_folios(ubuf, memfd, list[i].offset, list[i].size); +out_unlock: + inode_unlock_shared(file_inode(memfd)); fput(memfd); if (ret) goto err;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit 0a16e24e34f28210f68195259456c73462518597 upstream.
When F_SEAL_FUTURE_WRITE was introduced, it was overlooked that udmabuf must reject memfds with this flag, just like ones with F_SEAL_WRITE. Fix it by adding F_SEAL_FUTURE_WRITE to SEALS_DENIED.
Fixes: ab3948f58ff8 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd") Cc: stable@vger.kernel.org Acked-by: Vivek Kasireddy vivek.kasireddy@intel.com Signed-off-by: Jann Horn jannh@google.com Reviewed-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241204-udmabuf-fixes-v2-2-23... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma-buf/udmabuf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/dma-buf/udmabuf.c +++ b/drivers/dma-buf/udmabuf.c @@ -256,7 +256,7 @@ static const struct dma_buf_ops udmabuf_ };
#define SEALS_WANTED (F_SEAL_SHRINK) -#define SEALS_DENIED (F_SEAL_WRITE) +#define SEALS_DENIED (F_SEAL_WRITE|F_SEAL_FUTURE_WRITE)
static int check_memfd_seals(struct file *memfd) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samuel Holland samuel.holland@sifive.com
commit bc7acc0bd0f94c26bc0defc902311794a3d0fae9 upstream.
commit 7f00be96f125 ("of: property: Add device link support for interrupt-parent, dmas and -gpio(s)") started adding device links for the interrupt-parent property. commit 4104ca776ba3 ("of: property: Add fw_devlink support for interrupts") and commit f265f06af194 ("of: property: Fix fw_devlink handling of interrupts/interrupts-extended") later added full support for parsing the interrupts and interrupts-extended properties, which includes looking up the node of the parent domain. This made the handler for the interrupt-parent property redundant.
In fact, creating device links based solely on interrupt-parent is problematic, because it can create spurious cycles. A node may have this property without itself being an interrupt controller or consumer. For example, this property is often present in the root node or a /soc bus node to set the default interrupt parent for child nodes. However, it is incorrect for the bus to depend on the interrupt controller, as some of the bus's children may not be interrupt consumers at all or may have a different interrupt parent.
Resolving these spurious dependency cycles can cause an incorrect probe order for interrupt controller drivers. This was observed on a RISC-V system with both an APLIC and IMSIC under /soc, where interrupt-parent in /soc points to the APLIC, and the APLIC msi-parent points to the IMSIC. fw_devlink found three dependency cycles and attempted to probe the APLIC before the IMSIC. After applying this patch, there were no dependency cycles and the probe order was correct.
Acked-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Fixes: 4104ca776ba3 ("of: property: Add fw_devlink support for interrupts") Signed-off-by: Samuel Holland samuel.holland@sifive.com Link: https://lore.kernel.org/r/20241120233124.3649382-1-samuel.holland@sifive.com Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/property.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/of/property.c +++ b/drivers/of/property.c @@ -1213,7 +1213,6 @@ DEFINE_SIMPLE_PROP(iommus, "iommus", "#i DEFINE_SIMPLE_PROP(mboxes, "mboxes", "#mbox-cells") DEFINE_SIMPLE_PROP(io_channels, "io-channels", "#io-channel-cells") DEFINE_SIMPLE_PROP(io_backends, "io-backends", "#io-backend-cells") -DEFINE_SIMPLE_PROP(interrupt_parent, "interrupt-parent", NULL) DEFINE_SIMPLE_PROP(dmas, "dmas", "#dma-cells") DEFINE_SIMPLE_PROP(power_domains, "power-domains", "#power-domain-cells") DEFINE_SIMPLE_PROP(hwlocks, "hwlocks", "#hwlock-cells") @@ -1359,7 +1358,6 @@ static const struct supplier_bindings of { .parse_prop = parse_mboxes, }, { .parse_prop = parse_io_channels, }, { .parse_prop = parse_io_backends, }, - { .parse_prop = parse_interrupt_parent, }, { .parse_prop = parse_dmas, .optional = true, }, { .parse_prop = parse_power_domains, }, { .parse_prop = parse_hwlocks, },
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrea della Porta andrea.porta@suse.com
commit 7f05e20b989ac33c9c0f8c2028ec0a566493548f upstream.
A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma translations. In this specific case, the current behaviour is to zero out the entire specifier so that the translation could be carried on as an offset from zero. This includes address specifier that has flags (e.g. PCI ranges).
Once the flags portion has been zeroed, the translation chain is broken since the mapping functions will check the upcoming address specifier against mismatching flags, always failing the 1:1 mapping and its entire purpose of always succeeding.
Set to zero only the address portion while passing the flags through.
Fixes: dbbdee94734b ("of/address: Merge all of the bus translation code") Cc: stable@vger.kernel.org Signed-off-by: Andrea della Porta andrea.porta@suse.com Tested-by: Herve Codina herve.codina@bootlin.com Link: https://lore.kernel.org/r/e51ae57874e58a9b349c35e2e877425ebc075d7a.173244181... Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/address.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -455,7 +455,8 @@ static int of_translate_one(struct devic } if (ranges == NULL || rlen == 0) { offset = of_read_number(addr, na); - memset(addr, 0, pna * 4); + /* set address to zero, pass flags through */ + memset(addr + pbus->flag_cells, 0, (pna - pbus->flag_cells) * 4); pr_debug("empty ranges; 1:1 translation\n"); goto finish; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herve Codina herve.codina@bootlin.com
commit d7dfa7fde63dde4d2ec0083133efe2c6686c03ff upstream.
The current code uses some 'goto put;' to cancel the parsing operation and can lead to a return code value of 0 even on error cases.
Indeed, some goto calls are done from a loop without setting the ret value explicitly before the goto call and so the ret value can be set to 0 due to operation done in previous loop iteration. For instance match can be set to 0 in the previous loop iteration (leading to a new iteration) but ret can also be set to 0 it the of_property_read_u32() call succeed. In that case if no match are found or if an error is detected the new iteration, the return value can be wrongly 0.
Avoid those cases setting the ret value explicitly before the goto calls.
Fixes: bd6f2fd5a1d5 ("of: Support parsing phandle argument lists through a nexus node") Cc: stable@vger.kernel.org Signed-off-by: Herve Codina herve.codina@bootlin.com Link: https://lore.kernel.org/r/20241202165819.158681-1-herve.codina@bootlin.com Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/base.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
--- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -1455,8 +1455,10 @@ int of_parse_phandle_with_args_map(const map_len--;
/* Check if not found */ - if (!new) + if (!new) { + ret = -EINVAL; goto put; + }
if (!of_device_is_available(new)) match = 0; @@ -1466,17 +1468,20 @@ int of_parse_phandle_with_args_map(const goto put;
/* Check for malformed properties */ - if (WARN_ON(new_size > MAX_PHANDLE_ARGS)) - goto put; - if (map_len < new_size) + if (WARN_ON(new_size > MAX_PHANDLE_ARGS) || + map_len < new_size) { + ret = -EINVAL; goto put; + }
/* Move forward by new node's #<list>-cells amount */ map += new_size; map_len -= new_size; } - if (!match) + if (!match) { + ret = -ENOENT; goto put; + }
/* Get the <list>-map-pass-thru property (optional) */ pass = of_get_property(cur, pass_name, NULL);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zijun Hu quic_zijuhu@quicinc.com
commit 5d009e024056ded20c5bb1583146b833b23bbd5a upstream.
__of_get_dma_parent() returns OF device node @args.np, but the node's refcount is increased twice, by both of_parse_phandle_with_args() and of_node_get(), so causes refcount leakage for the node.
Fix by directly returning the node got by of_parse_phandle_with_args().
Fixes: f83a6e5dea6c ("of: address: Add support for the parent DMA bus") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com Link: https://lore.kernel.org/r/20241206-of_core_fix-v1-4-dc28ed56bec3@quicinc.com Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/address.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -616,7 +616,7 @@ struct device_node *__of_get_dma_parent( if (ret < 0) return of_get_parent(np);
- return of_node_get(args.np); + return args.np; } #endif
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Kellermann max.kellermann@ionos.com
commit 550f7ca98ee028a606aa75705a7e77b1bd11720f upstream.
If the full path to be built by ceph_mdsc_build_path() happens to be longer than PATH_MAX, then this function will enter an endless (retry) loop, effectively blocking the whole task. Most of the machine becomes unusable, making this a very simple and effective DoS vulnerability.
I cannot imagine why this retry was ever implemented, but it seems rather useless and harmful to me. Let's remove it and fail with ENAMETOOLONG instead.
Cc: stable@vger.kernel.org Reported-by: Dario Weißer dario@cure53.de Signed-off-by: Max Kellermann max.kellermann@ionos.com Reviewed-by: Alex Markuze amarkuze@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/mds_client.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
--- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2808,12 +2808,11 @@ retry:
if (pos < 0) { /* - * A rename didn't occur, but somehow we didn't end up where - * we thought we would. Throw a warning and try again. + * The path is longer than PATH_MAX and this function + * cannot ever succeed. Creating paths that long is + * possible with Ceph, but Linux cannot use them. */ - pr_warn_client(cl, "did not end path lookup where expected (pos = %d)\n", - pos); - goto retry; + return ERR_PTR(-ENAMETOOLONG); }
*pbase = base;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Dryomov idryomov@gmail.com
commit 12eb22a5a609421b380c3c6ca887474fb2089b2c upstream.
It becomes a path component, so it shouldn't exceed NAME_MAX characters. This was hardened in commit c152737be22b ("ceph: Use strscpy() instead of strcpy() in __get_snap_name()"), but no actual check was put in place.
Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Alex Markuze amarkuze@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/super.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -431,6 +431,8 @@ static int ceph_parse_mount_param(struct
switch (token) { case Opt_snapdirname: + if (strlen(param->string) > NAME_MAX) + return invalfc(fc, "snapdirname too long"); kfree(fsopt->snapdir_name); fsopt->snapdir_name = param->string; param->string = NULL;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Markuze amarkuze@redhat.com
commit 9abee475803fab6ad59d4f4fc59c6a75374a7d9d upstream.
This patch refines the read logic in __ceph_sync_read() to ensure more predictable and efficient behavior in various edge cases.
- Return early if the requested read length is zero or if the file size (`i_size`) is zero. - Initialize the index variable (`idx`) where needed and reorder some code to ensure it is always set before use. - Improve error handling by checking for negative return values earlier. - Remove redundant encrypted file checks after failures. Only attempt filesystem-level decryption if the read succeeded. - Simplify leftover calculations to correctly handle cases where the read extends beyond the end of the file or stops short. This can be hit by continuously reading a file while, on another client, we keep truncating and writing new data into it. - This resolves multiple issues caused by integer and consequent buffer overflow (`pages` array being accessed beyond `num_pages`): - https://tracker.ceph.com/issues/67524 - https://tracker.ceph.com/issues/68980 - https://tracker.ceph.com/issues/68981
Cc: stable@vger.kernel.org Fixes: 1065da21e5df ("ceph: stop copying to iter at EOF on sync reads") Reported-by: Luis Henriques (SUSE) luis.henriques@linux.dev Signed-off-by: Alex Markuze amarkuze@redhat.com Reviewed-by: Viacheslav Dubeyko Slava.Dubeyko@ibm.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/file.c | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-)
--- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1066,7 +1066,7 @@ ssize_t __ceph_sync_read(struct inode *i if (ceph_inode_is_shutdown(inode)) return -EIO;
- if (!len) + if (!len || !i_size) return 0; /* * flush any page cache pages in this range. this @@ -1086,7 +1086,7 @@ ssize_t __ceph_sync_read(struct inode *i int num_pages; size_t page_off; bool more; - int idx; + int idx = 0; size_t left; struct ceph_osd_req_op *op; u64 read_off = off; @@ -1160,7 +1160,14 @@ ssize_t __ceph_sync_read(struct inode *i else if (ret == -ENOENT) ret = 0;
- if (ret > 0 && IS_ENCRYPTED(inode)) { + if (ret < 0) { + ceph_osdc_put_request(req); + if (ret == -EBLOCKLISTED) + fsc->blocklisted = true; + break; + } + + if (IS_ENCRYPTED(inode)) { int fret;
fret = ceph_fscrypt_decrypt_extents(inode, pages, @@ -1189,7 +1196,7 @@ ssize_t __ceph_sync_read(struct inode *i ceph_osdc_put_request(req);
/* Short read but not EOF? Zero out the remainder. */ - if (ret >= 0 && ret < len && (off + ret < i_size)) { + if (ret < len && (off + ret < i_size)) { int zlen = min(len - ret, i_size - off - ret); int zoff = page_off + ret;
@@ -1199,13 +1206,11 @@ ssize_t __ceph_sync_read(struct inode *i ret += zlen; }
- idx = 0; - if (ret <= 0) - left = 0; - else if (off + ret > i_size) - left = i_size - off; + if (off + ret > i_size) + left = (i_size > off) ? i_size - off : 0; else left = ret; + while (left > 0) { size_t plen, copied;
@@ -1223,12 +1228,6 @@ ssize_t __ceph_sync_read(struct inode *i } ceph_release_page_vector(pages, num_pages);
- if (ret < 0) { - if (ret == -EBLOCKLISTED) - fsc->blocklisted = true; - break; - } - if (off >= i_size || !more) break; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Kellermann max.kellermann@ionos.com
commit d6fd6f8280f0257ba93f16900a0d3d3912f32c79 upstream.
In two `break` statements, the call to ceph_release_page_vector() was missing, leaking the allocation from ceph_alloc_page_vector().
Instead of adding the missing ceph_release_page_vector() calls, the Ceph maintainers preferred to transfer page ownership to the `ceph_osd_request` by passing `own_pages=true` to osd_req_op_extent_osd_data_pages(). This requires postponing the ceph_osdc_put_request() call until after the block that accesses the `pages`.
Cc: stable@vger.kernel.org Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads") Fixes: f0fe1e54cfcf ("ceph: plumb in decryption during reads") Signed-off-by: Max Kellermann max.kellermann@ionos.com Reviewed-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/file.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1127,7 +1127,7 @@ ssize_t __ceph_sync_read(struct inode *i
osd_req_op_extent_osd_data_pages(req, 0, pages, read_len, offset_in_page(read_off), - false, false); + false, true);
op = &req->r_ops[0]; if (sparse) { @@ -1193,8 +1193,6 @@ ssize_t __ceph_sync_read(struct inode *i ret = min_t(ssize_t, fret, len); }
- ceph_osdc_put_request(req); - /* Short read but not EOF? Zero out the remainder. */ if (ret < len && (off + ret < i_size)) { int zlen = min(len - ret, i_size - off - ret); @@ -1226,7 +1224,8 @@ ssize_t __ceph_sync_read(struct inode *i break; } } - ceph_release_page_vector(pages, num_pages); + + ceph_osdc_put_request(req);
if (off >= i_size || !more) break;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Dryomov idryomov@gmail.com
commit 66e0c4f91461d17d48071695271c824620bed4ef upstream.
The bvecs array which is allocated in iter_get_bvecs_alloc() is leaked and pages remain pinned if ceph_alloc_sparse_ext_map() fails.
There is no need to delay the allocation of sparse_ext map until after the bvecs array is set up, so fix this by moving sparse_ext allocation a bit earlier. Also, make a similar adjustment in __ceph_sync_read() for consistency (a leak of the same kind in __ceph_sync_read() has been addressed differently).
Cc: stable@vger.kernel.org Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads") Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Alex Markuze amarkuze@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/file.c | 43 ++++++++++++++++++++++--------------------- 1 file changed, 22 insertions(+), 21 deletions(-)
--- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1116,6 +1116,16 @@ ssize_t __ceph_sync_read(struct inode *i len = read_off + read_len - off; more = len < iov_iter_count(to);
+ op = &req->r_ops[0]; + if (sparse) { + extent_cnt = __ceph_sparse_read_ext_count(inode, read_len); + ret = ceph_alloc_sparse_ext_map(op, extent_cnt); + if (ret) { + ceph_osdc_put_request(req); + break; + } + } + num_pages = calc_pages_for(read_off, read_len); page_off = offset_in_page(off); pages = ceph_alloc_page_vector(num_pages, GFP_KERNEL); @@ -1129,16 +1139,6 @@ ssize_t __ceph_sync_read(struct inode *i offset_in_page(read_off), false, true);
- op = &req->r_ops[0]; - if (sparse) { - extent_cnt = __ceph_sparse_read_ext_count(inode, read_len); - ret = ceph_alloc_sparse_ext_map(op, extent_cnt); - if (ret) { - ceph_osdc_put_request(req); - break; - } - } - ceph_osdc_start_request(osdc, req); ret = ceph_osdc_wait_request(osdc, req);
@@ -1551,6 +1551,16 @@ ceph_direct_read_write(struct kiocb *ioc break; }
+ op = &req->r_ops[0]; + if (sparse) { + extent_cnt = __ceph_sparse_read_ext_count(inode, size); + ret = ceph_alloc_sparse_ext_map(op, extent_cnt); + if (ret) { + ceph_osdc_put_request(req); + break; + } + } + len = iter_get_bvecs_alloc(iter, size, &bvecs, &num_pages); if (len < 0) { ceph_osdc_put_request(req); @@ -1560,6 +1570,8 @@ ceph_direct_read_write(struct kiocb *ioc if (len != size) osd_req_op_extent_update(req, 0, len);
+ osd_req_op_extent_osd_data_bvecs(req, 0, bvecs, num_pages, len); + /* * To simplify error handling, allow AIO when IO within i_size * or IO can be satisfied by single OSD request. @@ -1591,17 +1603,6 @@ ceph_direct_read_write(struct kiocb *ioc req->r_mtime = mtime; }
- osd_req_op_extent_osd_data_bvecs(req, 0, bvecs, num_pages, len); - op = &req->r_ops[0]; - if (sparse) { - extent_cnt = __ceph_sparse_read_ext_count(inode, size); - ret = ceph_alloc_sparse_ext_map(op, extent_cnt); - if (ret) { - ceph_osdc_put_request(req); - break; - } - } - if (aio_req) { aio_req->total_len += len; aio_req->num_reqs++;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kefeng Wang wangkefeng.wang@huawei.com
commit 8aca2bc96c833ba695ede7a45ad7784c836a262e upstream.
In current kernel, hugetlb_no_page() calls folio_zero_user() with the fault address. Where the fault address may be not aligned with the huge page size. Then, folio_zero_user() may call clear_gigantic_page() with the address, while clear_gigantic_page() requires the address to be huge page size aligned. So, this may cause memory corruption or information leak, addtional, use more obvious naming 'addr_hint' instead of 'addr' for clear_gigantic_page().
Link: https://lkml.kernel.org/r/20241028145656.932941-1-wangkefeng.wang@huawei.com Fixes: 78fefd04c123 ("mm: memory: convert clear_huge_page() to folio_zero_user()") Signed-off-by: Kefeng Wang wangkefeng.wang@huawei.com Reviewed-by: "Huang, Ying" ying.huang@intel.com Reviewed-by: David Hildenbrand david@redhat.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Muchun Song muchun.song@linux.dev Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/hugetlbfs/inode.c | 2 +- mm/memory.c | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-)
--- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -893,7 +893,7 @@ static long hugetlbfs_fallocate(struct f error = PTR_ERR(folio); goto out; } - folio_zero_user(folio, ALIGN_DOWN(addr, hpage_size)); + folio_zero_user(folio, addr); __folio_mark_uptodate(folio); error = hugetlb_add_to_page_cache(folio, mapping, index); if (unlikely(error)) { --- a/mm/memory.c +++ b/mm/memory.c @@ -6780,9 +6780,10 @@ static inline int process_huge_page( return 0; }
-static void clear_gigantic_page(struct folio *folio, unsigned long addr, +static void clear_gigantic_page(struct folio *folio, unsigned long addr_hint, unsigned int nr_pages) { + unsigned long addr = ALIGN_DOWN(addr_hint, folio_size(folio)); int i;
might_sleep();
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kefeng Wang wangkefeng.wang@huawei.com
commit f5d09de9f1bf9674c6418ff10d0a40cfe29268e1 upstream.
In current kernel, hugetlb_wp() calls copy_user_large_folio() with the fault address. Where the fault address may be not aligned with the huge page size. Then, copy_user_large_folio() may call copy_user_gigantic_page() with the address, while copy_user_gigantic_page() requires the address to be huge page size aligned. So, this may cause memory corruption or information leak, addtional, use more obvious naming 'addr_hint' instead of 'addr' for copy_user_gigantic_page().
Link: https://lkml.kernel.org/r/20241028145656.932941-2-wangkefeng.wang@huawei.com Fixes: 530dd9926dc1 ("mm: memory: improve copy_user_large_folio()") Signed-off-by: Kefeng Wang wangkefeng.wang@huawei.com Reviewed-by: David Hildenbrand david@redhat.com Cc: Huang Ying ying.huang@intel.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Muchun Song muchun.song@linux.dev Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/hugetlb.c | 5 ++--- mm/memory.c | 5 +++-- 2 files changed, 5 insertions(+), 5 deletions(-)
--- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5333,7 +5333,7 @@ again: break; } ret = copy_user_large_folio(new_folio, pte_folio, - ALIGN_DOWN(addr, sz), dst_vma); + addr, dst_vma); folio_put(pte_folio); if (ret) { folio_put(new_folio); @@ -6632,8 +6632,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_ *foliop = NULL; goto out; } - ret = copy_user_large_folio(folio, *foliop, - ALIGN_DOWN(dst_addr, size), dst_vma); + ret = copy_user_large_folio(folio, *foliop, dst_addr, dst_vma); folio_put(*foliop); *foliop = NULL; if (ret) { --- a/mm/memory.c +++ b/mm/memory.c @@ -6817,13 +6817,14 @@ void folio_zero_user(struct folio *folio }
static int copy_user_gigantic_page(struct folio *dst, struct folio *src, - unsigned long addr, + unsigned long addr_hint, struct vm_area_struct *vma, unsigned int nr_pages) { - int i; + unsigned long addr = ALIGN_DOWN(addr_hint, folio_size(dst)); struct page *dst_page; struct page *src_page; + int i;
for (i = 0; i < nr_pages; i++) { dst_page = folio_page(dst, i);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hugh Dickins hughd@google.com
commit dad2dc9c92e0f93f33cebcb0595b8daa3d57473f upstream.
/proc/meminfo ShmemHugePages has been showing overlarge amounts (more than Shmem) after swapping out THPs: we forgot to update NR_SHMEM_THPS.
Add shmem_update_stats(), to avoid repetition, and risk of making that mistake again: the call from shmem_delete_from_page_cache() is the bugfix; the call from shmem_replace_folio() is reassuring, but not really a bugfix (replace corrects misplaced swapin readahead, but huge swapin readahead would be a mistake).
Link: https://lkml.kernel.org/r/5ba477c8-a569-70b5-923e-09ab221af45b@google.com Fixes: 809bc86517cc ("mm: shmem: support large folio swap out") Signed-off-by: Hugh Dickins hughd@google.com Reviewed-by: Shakeel Butt shakeel.butt@linux.dev Reviewed-by: Yosry Ahmed yosryahmed@google.com Reviewed-by: Baolin Wang baolin.wang@linux.alibaba.com Tested-by: Baolin Wang baolin.wang@linux.alibaba.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/shmem.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
--- a/mm/shmem.c +++ b/mm/shmem.c @@ -779,6 +779,14 @@ static bool shmem_huge_global_enabled(st } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+static void shmem_update_stats(struct folio *folio, int nr_pages) +{ + if (folio_test_pmd_mappable(folio)) + __lruvec_stat_mod_folio(folio, NR_SHMEM_THPS, nr_pages); + __lruvec_stat_mod_folio(folio, NR_FILE_PAGES, nr_pages); + __lruvec_stat_mod_folio(folio, NR_SHMEM, nr_pages); +} + /* * Somewhat like filemap_add_folio, but error if expected item has gone. */ @@ -813,10 +821,7 @@ static int shmem_add_to_page_cache(struc xas_store(&xas, folio); if (xas_error(&xas)) goto unlock; - if (folio_test_pmd_mappable(folio)) - __lruvec_stat_mod_folio(folio, NR_SHMEM_THPS, nr); - __lruvec_stat_mod_folio(folio, NR_FILE_PAGES, nr); - __lruvec_stat_mod_folio(folio, NR_SHMEM, nr); + shmem_update_stats(folio, nr); mapping->nrpages += nr; unlock: xas_unlock_irq(&xas); @@ -844,8 +849,7 @@ static void shmem_delete_from_page_cache error = shmem_replace_entry(mapping, folio->index, folio, radswap); folio->mapping = NULL; mapping->nrpages -= nr; - __lruvec_stat_mod_folio(folio, NR_FILE_PAGES, -nr); - __lruvec_stat_mod_folio(folio, NR_SHMEM, -nr); + shmem_update_stats(folio, -nr); xa_unlock_irq(&mapping->i_pages); folio_put_refs(folio, nr); BUG_ON(error); @@ -1944,10 +1948,8 @@ static int shmem_replace_folio(struct fo } if (!error) { mem_cgroup_replace_folio(old, new); - __lruvec_stat_mod_folio(new, NR_FILE_PAGES, nr_pages); - __lruvec_stat_mod_folio(new, NR_SHMEM, nr_pages); - __lruvec_stat_mod_folio(old, NR_FILE_PAGES, -nr_pages); - __lruvec_stat_mod_folio(old, NR_SHMEM, -nr_pages); + shmem_update_stats(new, nr_pages); + shmem_update_stats(old, -nr_pages); } xa_unlock_irq(&swap_mapping->i_pages);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Usama Arif usamaarif642@gmail.com
commit 42b2eb69835b0fda797f70eb5b4fc213dbe3a7ea upstream.
Other page flags in the 2nd page, like PG_hwpoison and PG_anon_exclusive can get modified concurrently. Changes to other page flags might be lost if they are happening at the same time as non-atomic partially_mapped operations. Hence, make partially_mapped operations atomic.
Link: https://lkml.kernel.org/r/20241212183351.1345389-1-usamaarif642@gmail.com Fixes: 8422acdc97ed ("mm: introduce a pageflag for partially mapped folios") Reported-by: David Hildenbrand david@redhat.com Link: https://lore.kernel.org/all/e53b04ad-1827-43a2-a1ab-864c7efecf6e@redhat.com/ Signed-off-by: Usama Arif usamaarif642@gmail.com Acked-by: David Hildenbrand david@redhat.com Acked-by: Johannes Weiner hannes@cmpxchg.org Acked-by: Roman Gushchin roman.gushchin@linux.dev Cc: Barry Song baohua@kernel.org Cc: Domenico Cerasuolo cerasuolodomenico@gmail.com Cc: Jonathan Corbet corbet@lwn.net Cc: Matthew Wilcox willy@infradead.org Cc: Mike Rapoport (Microsoft) rppt@kernel.org Cc: Nico Pache npache@redhat.com Cc: Rik van Riel riel@surriel.com Cc: Ryan Roberts ryan.roberts@arm.com Cc: Shakeel Butt shakeel.butt@linux.dev Cc: Yu Zhao yuzhao@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/page-flags.h | 12 ++---------- mm/huge_memory.c | 8 ++++---- 2 files changed, 6 insertions(+), 14 deletions(-)
--- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -860,18 +860,10 @@ static inline void ClearPageCompound(str ClearPageHead(page); } FOLIO_FLAG(large_rmappable, FOLIO_SECOND_PAGE) -FOLIO_TEST_FLAG(partially_mapped, FOLIO_SECOND_PAGE) -/* - * PG_partially_mapped is protected by deferred_split split_queue_lock, - * so its safe to use non-atomic set/clear. - */ -__FOLIO_SET_FLAG(partially_mapped, FOLIO_SECOND_PAGE) -__FOLIO_CLEAR_FLAG(partially_mapped, FOLIO_SECOND_PAGE) +FOLIO_FLAG(partially_mapped, FOLIO_SECOND_PAGE) #else FOLIO_FLAG_FALSE(large_rmappable) -FOLIO_TEST_FLAG_FALSE(partially_mapped) -__FOLIO_SET_FLAG_NOOP(partially_mapped) -__FOLIO_CLEAR_FLAG_NOOP(partially_mapped) +FOLIO_FLAG_FALSE(partially_mapped) #endif
#define PG_head_mask ((1UL << PG_head)) --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3503,7 +3503,7 @@ int split_huge_page_to_list_to_order(str !list_empty(&folio->_deferred_list)) { ds_queue->split_queue_len--; if (folio_test_partially_mapped(folio)) { - __folio_clear_partially_mapped(folio); + folio_clear_partially_mapped(folio); mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1); } @@ -3615,7 +3615,7 @@ bool __folio_unqueue_deferred_split(stru if (!list_empty(&folio->_deferred_list)) { ds_queue->split_queue_len--; if (folio_test_partially_mapped(folio)) { - __folio_clear_partially_mapped(folio); + folio_clear_partially_mapped(folio); mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1); } @@ -3659,7 +3659,7 @@ void deferred_split_folio(struct folio * spin_lock_irqsave(&ds_queue->split_queue_lock, flags); if (partially_mapped) { if (!folio_test_partially_mapped(folio)) { - __folio_set_partially_mapped(folio); + folio_set_partially_mapped(folio); if (folio_test_pmd_mappable(folio)) count_vm_event(THP_DEFERRED_SPLIT_PAGE); count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED); @@ -3752,7 +3752,7 @@ static unsigned long deferred_split_scan } else { /* We lost race with folio_put() */ if (folio_test_partially_mapped(folio)) { - __folio_clear_partially_mapped(folio); + folio_clear_partially_mapped(folio); mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xuewen Yan xuewen.yan@unisoc.com
commit 900bbaae67e980945dec74d36f8afe0de7556d5a upstream.
Now, the epoll only use wake_up() interface to wake up task. However, sometimes, there are epoll users which want to use the synchronous wakeup flag to hint the scheduler, such as Android binder driver. So add a wake_up_sync() define, and use the wake_up_sync() when the sync is true in ep_poll_callback().
Co-developed-by: Jing Xia jing.xia@unisoc.com Signed-off-by: Jing Xia jing.xia@unisoc.com Signed-off-by: Xuewen Yan xuewen.yan@unisoc.com Link: https://lore.kernel.org/r/20240426080548.8203-1-xuewen.yan@unisoc.com Tested-by: Brian Geffon bgeffon@google.com Reviewed-by: Brian Geffon bgeffon@google.com Reported-by: Benoit Lize lizeb@google.com Signed-off-by: Christian Brauner brauner@kernel.org Cc: Brian Geffon bgeffon@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/eventpoll.c | 5 ++++- include/linux/wait.h | 1 + 2 files changed, 5 insertions(+), 1 deletion(-)
--- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -1373,7 +1373,10 @@ static int ep_poll_callback(wait_queue_e break; } } - wake_up(&ep->wq); + if (sync) + wake_up_sync(&ep->wq); + else + wake_up(&ep->wq); } if (waitqueue_active(&ep->poll_wait)) pwake++; --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -221,6 +221,7 @@ void __wake_up_pollfree(struct wait_queu #define wake_up_all(x) __wake_up(x, TASK_NORMAL, 0, NULL) #define wake_up_locked(x) __wake_up_locked((x), TASK_NORMAL, 1) #define wake_up_all_locked(x) __wake_up_locked((x), TASK_NORMAL, 0) +#define wake_up_sync(x) __wake_up_sync(x, TASK_NORMAL)
#define wake_up_interruptible(x) __wake_up(x, TASK_INTERRUPTIBLE, 1, NULL) #define wake_up_interruptible_nr(x, nr) __wake_up(x, TASK_INTERRUPTIBLE, nr, NULL)
Hello,
On Mon, 23 Dec 2024 16:56:51 +0100 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
This rc kernel passes DAMON functionality test[1] on my test machine. Attaching the test results summary below. Please note that I retrieved the kernel from linux-stable-rc tree[2].
Tested-by: SeongJae Park sj@kernel.org
[1] https://github.com/damonitor/damon-tests/tree/next/corr [2] c157915828d8 ("Linux 6.12.7-rc1")
Thanks, SJ
[...]
---
ok 9 selftests: damon: damos_tried_regions.py ok 10 selftests: damon: damon_nr_regions.py ok 11 selftests: damon: reclaim.sh ok 12 selftests: damon: lru_sort.sh ok 13 selftests: damon: debugfs_empty_targets.sh ok 14 selftests: damon: debugfs_huge_count_read_write.sh ok 15 selftests: damon: debugfs_duplicate_context_creation.sh ok 16 selftests: damon: debugfs_rm_non_contexts.sh ok 17 selftests: damon: debugfs_target_ids_read_before_terminate_race.sh ok 18 selftests: damon: debugfs_target_ids_pid_leak.sh ok 19 selftests: damon: sysfs_update_removed_scheme_dir.sh ok 20 selftests: damon: sysfs_update_schemes_tried_regions_hang.py ok 1 selftests: damon-tests: kunit.sh ok 2 selftests: damon-tests: huge_count_read_write.sh ok 3 selftests: damon-tests: buffer_overflow.sh ok 4 selftests: damon-tests: rm_contexts.sh ok 5 selftests: damon-tests: record_null_deref.sh ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh ok 8 selftests: damon-tests: damo_tests.sh ok 9 selftests: damon-tests: masim-record.sh ok 10 selftests: damon-tests: build_i386.sh ok 11 selftests: damon-tests: build_arm64.sh # SKIP ok 12 selftests: damon-tests: build_m68k.sh # SKIP ok 13 selftests: damon-tests: build_i386_idle_flag.sh ok 14 selftests: damon-tests: build_i386_highpte.sh ok 15 selftests: damon-tests: build_nomemcg.sh [33m [92mPASS [39m
On 12/23/24 08:56, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
Hi Greg
On Tue, Dec 24, 2024 at 1:00 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
6.12.7-rc1 tested.
Build successfully completed. Boot successfully completed. No dmesg regressions. Video output normal. Sound output normal.
Lenovo ThinkPad X1 Carbon Gen10(Intel i7-1260P(x86_64) arch linux)
[ 0.000000] Linux version 6.12.7-rc1rv (takeshi@ThinkPadX1Gen10J0764) (gcc (GCC) 14.2.1 20240910, GNU ld (GNU Binutils) 2.43.0) #1 SMP PREEMPT_DYNAMIC Tue Dec 24 07:47:05 JST 2024
Thanks
Tested-by: Takeshi Ogasawara takeshi.ogasawara@futuring-girl.com
Hi Greg,
On 23/12/24 21:26, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
No problems seen on x86_64 and aarch64 with our testing.
Tested-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
Thanks, Harshit
On 12/23/24 07:56, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
Am 23.12.2024 um 16:56 schrieb Greg Kroah-Hartman:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Builds, boots and works on my 2-socket Ivy Bridge Xeon E5-2697 v2 server. No dmesg oddities or regressions found.
Tested-by: Peter Schneider pschneider1968@googlemail.com
Happy holiday season! Peter Schneider
Works as it should
Tested-by: Luna Jernberg droidbittin@gmail.com
AMD Ryzen 5 5600 6-Core Processor: https://www.inet.se/produkt/5304697/amd-ryzen-5-5600-3-5-ghz-35mb on a https://www.gigabyte.com/Motherboard/B550-AORUS-ELITE-V2-rev-12 https://www.inet.se/produkt/1903406/gigabyte-b550-aorus-elite-v2 motherboard :)
running Arch Linux with the testing repos enabled: https://archlinux.org/ https://archboot.com/ https://wiki.archlinux.org/title/Arch_Testing_Team
and merry xmas here from Sweden :)
Den mån 23 dec. 2024 kl 17:00 skrev Greg Kroah-Hartman gregkh@linuxfoundation.org:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.12.7-rc1
Xuewen Yan xuewen.yan@unisoc.com epoll: Add synchronous wakeup support for ep_poll_callback
Usama Arif usamaarif642@gmail.com mm: convert partially_mapped set/clear operations to be atomic
Hugh Dickins hughd@google.com mm: shmem: fix ShmemHugePages at swapout
Kefeng Wang wangkefeng.wang@huawei.com mm: use aligned address in copy_user_gigantic_page()
Kefeng Wang wangkefeng.wang@huawei.com mm: use aligned address in clear_gigantic_page()
Ilya Dryomov idryomov@gmail.com ceph: fix memory leak in ceph_direct_read_write()
Max Kellermann max.kellermann@ionos.com ceph: fix memory leaks in __ceph_sync_read()
Alex Markuze amarkuze@redhat.com ceph: improve error handling and short/overflow-read logic in __ceph_sync_read()
Ilya Dryomov idryomov@gmail.com ceph: validate snapdirname option length when mounting
Max Kellermann max.kellermann@ionos.com ceph: give up on paths longer than PATH_MAX
Zijun Hu quic_zijuhu@quicinc.com of: Fix refcount leakage for OF node returned by __of_get_dma_parent()
Herve Codina herve.codina@bootlin.com of: Fix error path in of_parse_phandle_with_args_map()
Andrea della Porta andrea.porta@suse.com of: address: Preserve the flags portion on 1:1 dma-ranges mapping
Samuel Holland samuel.holland@sifive.com of: property: fw_devlink: Do not use interrupt-parent directly
Jann Horn jannh@google.com udmabuf: also check for F_SEAL_FUTURE_WRITE
Jann Horn jannh@google.com udmabuf: fix racy memfd sealing check
Edward Adam Davis eadavis@qq.com nilfs2: prevent use of deleted inode
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: fix buffer head leaks in calls to truncate_inode_pages()
Heming Zhao heming.zhao@suse.com ocfs2: fix the space leak in LA when releasing LA
Zijun Hu quic_zijuhu@quicinc.com of/irq: Fix using uninitialized variable @addr_len in API of_irq_parse_one()
Zijun Hu quic_zijuhu@quicinc.com of/irq: Fix interrupt-map cell length check in of_irq_parse_imap_parent()
Sean Christopherson seanjc@google.com KVM: SVM: Allow guest writes to set MSR_AMD64_DE_CFG bits
Trond Myklebust trond.myklebust@hammerspace.com NFS/pnfs: Fix a live lock between recalled layouts and layoutget
Pavel Begunkov asml.silence@gmail.com io_uring: check if iowq is killed before queuing
Jann Horn jannh@google.com io_uring: Fix registered ring file refcount leak
Tiezhu Yang yangtiezhu@loongson.cn selftests/bpf: Use asm constraint "m" for LoongArch
Isaac J. Manjarres isaacmanjarres@google.com selftests/memfd: run sysctl tests when PID namespace support is enabled
Steven Rostedt rostedt@goodmis.org tracing: Check "%s" dereference via the field and not the TP_printk format
Steven Rostedt rostedt@goodmis.org tracing: Add "%s" check in test_event_printk()
Steven Rostedt rostedt@goodmis.org tracing: Add missing helper functions in event pointer dereference check
Steven Rostedt rostedt@goodmis.org tracing: Fix test_event_printk() to process entire print argument
Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com accel/ivpu: Fix WARN in ivpu_ipc_send_receive_internal()
Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com accel/ivpu: Fix general protection fault in ivpu_bo_list()
Enzo Matsumiya ematsumiya@suse.de smb: client: fix TCP timers deadlock after rmmod
Sean Christopherson seanjc@google.com KVM: x86: Play nice with protected guests in complete_hypercall_exit()
Naman Jain namjain@linux.microsoft.com x86/hyperv: Fix hv tsc page based sched_clock for hibernation
Dexuan Cui decui@microsoft.com tools: hv: Fix a complier warning in the fcopy uio daemon
Michael Kelley mhklinux@outlook.com Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet
Steven Rostedt rostedt@goodmis.org fgraph: Still initialize idle shadow stacks when starting
Alex Deucher alexander.deucher@amd.com drm/amdgpu/mmhub4.1: fix IP version check
Alex Deucher alexander.deucher@amd.com drm/amdgpu/gfx12: fix IP version check
Alex Deucher alexander.deucher@amd.com drm/amdgpu/nbio7.0: fix IP version check
Heiko Carstens hca@linux.ibm.com s390/mm: Fix DirectMap accounting
Qu Wenruo wqu@suse.com btrfs: tree-checker: reject inline extent items with 0 ref count
Josef Bacik josef@toxicpanda.com btrfs: fix improper generation check in snapshot delete
Christoph Hellwig hch@lst.de btrfs: split bios to the fs sector size boundary
Suren Baghdasaryan surenb@google.com alloc_tag: fix set_codetag_empty() when !CONFIG_MEM_ALLOC_PROFILING_DEBUG
Edward Adam Davis eadavis@qq.com ring-buffer: Fix overflow in __rb_map_vma
David Hildenbrand david@redhat.com mm/page_alloc: don't call pfn_to_page() on possibly non-existent PFN in split_large_buddy()
Matthew Wilcox (Oracle) willy@infradead.org vmalloc: fix accounting with i915
Kairui Song kasong@tencent.com zram: fix uninitialized ZRAM not releasing backing device
Kairui Song kasong@tencent.com zram: refuse to use zero sized block device as backing device
Alex Deucher alexander.deucher@amd.com drm/amdgpu/smu14.0.2: fix IP version check
Alex Deucher alexander.deucher@amd.com drm/amdgpu/nbio7.7: fix IP version check
Alex Deucher alexander.deucher@amd.com drm/amdgpu/nbio7.11: fix IP version check
Steven Rostedt rostedt@goodmis.org trace/ring-buffer: Do not use TP_printk() formatting for boot mapped buffers
Ming Lei ming.lei@redhat.com block: avoid to reuse `hctx` not removed from cpuhp callback list
Murad Masimov m.masimov@maxima.ru hwmon: (tmp513) Fix interpretation of values of Temperature Result and Limit Registers
Murad Masimov m.masimov@maxima.ru hwmon: (tmp513) Fix Current Register value interpretation
Murad Masimov m.masimov@maxima.ru hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers
Pierre-Eric Pelloux-Prayer pierre-eric.pelloux-prayer@amd.com drm/amdgpu: don't access invalid sched
Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com i915/guc: Accumulate active runtime on gt reset
Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com i915/guc: Ensure busyness counter increases motonically
Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com i915/guc: Reset engine utilization buffer before registration
Michael Trimarchi michael@amarulasolutions.com drm/panel: synaptics-r63353: Fix regulator unbalance
Marek Vasut marex@denx.de drm/panel: st7701: Add prepare_prev_first flag to drm_panel
Yang Yingliang yangyingliang@huawei.com drm/panel: novatek-nt35950: fix return value check in nt35950_probe()
Zhang Zekun zhangzekun11@huawei.com drm/panel: himax-hx83102: Add a check to prevent NULL pointer dereference
T.J. Mercier tjmercier@google.com dma-buf: Fix __dma_buf_debugfs_list_del argument for !CONFIG_DEBUG_FS
Jann Horn jannh@google.com udmabuf: fix memory leak on last export_udmabuf() error path
Huan Yang link@vivo.com udmabuf: udmabuf_create pin folio codestyle cleanup
Michel Dänzer mdaenzer@redhat.com drm/amdgpu: Handle NULL bo->tbo.resource (again) in amdgpu_vm_bo_update
Christian König christian.koenig@amd.com drm/amdgpu: fix amdgpu_coredump
Ville Syrjälä ville.syrjala@linux.intel.com drm/modes: Avoid divide by zero harder in drm_mode_vrefresh()
Mario Limonciello mario.limonciello@amd.com drm/amd: Update strapping for NBIO 2.5.0
Krzysztof Karas krzysztof.karas@intel.com drm/display: use ERR_PTR on DP tunnel manager creation fail
Mario Limonciello mario.limonciello@amd.com thunderbolt: Don't display nvm_version unless upgrade supported
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Improve redrive mode handling
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Add support for Intel Panther Lake-M/P
Mathias Nyman mathias.nyman@linux.intel.com xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic
Daniele Palmas dnlplm@gmail.com USB: serial: option: add Telit FE910C04 rmnet compositions
Jack Wu wojackbb@gmail.com USB: serial: option: add MediaTek T7XX compositions
Mank Wang mank.wang@netprisma.com USB: serial: option: add Netprisma LCUK54 modules for WWAN Ready
Michal Hrusecky michal.hrusecky@turris.com USB: serial: option: add MeiG Smart SLM770A
Daniel Swanemar d.swanemar@gmail.com USB: serial: option: add TCL IK512 MBIM & ECM
Nathan Chancellor nathan@kernel.org hexagon: Disable constant extender optimization for LLVM prior to 19.1.0
James Bottomley James.Bottomley@HansenPartnership.com efivarfs: Fix error on non-existent file
Geert Uytterhoeven geert+renesas@glider.be i2c: riic: Always round-up when calculating bus period
Ming Lei ming.lei@redhat.com block: Revert "block: Fix potential deadlock while freezing queue and acquiring sysfs_lock"
Jeremy Kerr jk@codeconstruct.com.au net: mctp: handle skb cleanup on sock_queue failures
Dan Carpenter dan.carpenter@linaro.org chelsio/chtls: prevent potential integer overflow on 32bit
Eric Dumazet edumazet@google.com net: tun: fix tun_napi_alloc_frags()
Sean Christopherson seanjc@google.com KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init
Marc Zyngier maz@kernel.org KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden
Borislav Petkov (AMD) bp@alien8.de EDAC/amd64: Simplify ECC check on unified memory controllers
Marc Zyngier maz@kernel.org irqchip/gic-v3: Work around insecure GIC integrations
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp mmc: mtk-sd: disable wakeup in .remove() and in the error path of .probe()
Prathamesh Shete pshete@nvidia.com mmc: sdhci-tegra: Remove SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp net: mdiobus: fix an OF node reference leak
Adrian Moreno amorenoz@redhat.com psample: adjust size if rate_as_probability is set
Jakub Kicinski kuba@kernel.org netdev-genl: avoid empty messages in queue dump
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: restore dsa_software_vlan_untag() ability to operate on VLAN-untagged traffic
Adrian Moreno amorenoz@redhat.com selftests: openvswitch: fix tcpdump execution
Phil Sutter phil@nwl.cc netfilter: ipset: Fix for recursive locking warning
David Laight David.Laight@ACULAB.COM ipvs: Fix clamp() of ip_vs_conn_tab on small memory systems
Matthias Schiffer matthias.schiffer@ew.tq-group.com can: m_can: fix missed interrupts with m_can_pci
Matthias Schiffer matthias.schiffer@ew.tq-group.com can: m_can: set init flag earlier in probe
Eric Dumazet edumazet@google.com net: netdevsim: fix nsim_pp_hold_write()
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp net: ethernet: bgmac-platform: fix an OF node reference leak
Parthiban Veerasooran parthiban.veerasooran@microchip.com net: ethernet: oa_tc6: fix tx skb race condition between reference pointers
Parthiban Veerasooran parthiban.veerasooran@microchip.com net: ethernet: oa_tc6: fix infinite loop error when tx credits becomes 0
Dan Carpenter dan.carpenter@linaro.org net: hinic: Fix cleanup in create_rxqs/txqs()
Daniel Borkmann daniel@iogearbox.net team: Fix feature exposure when no ports are present
Jakub Kicinski kuba@kernel.org netdev: fix repeated netlink messages in queue stats
Jakub Kicinski kuba@kernel.org netdev: fix repeated netlink messages in queue dump
Marios Makassikis mmakassikis@freebox.fr ksmbd: fix broken transfers when exceeding max simultaneous operations
Marios Makassikis mmakassikis@freebox.fr ksmbd: count all requests in req_running counter
Nikita Yushchenko nikita.yoush@cogentembedded.com net: renesas: rswitch: rework ts tags management
Shannon Nelson shannon.nelson@amd.com ionic: use ee->offset when returning sprom data
Shannon Nelson shannon.nelson@amd.com ionic: no double destroy workqueue
Brett Creeley brett.creeley@amd.com ionic: Fix netdev notifier unregister on failure
Donald Hunter donald.hunter@gmail.com tools/net/ynl: fix sub-message key lookup for nested attributes
Eric Dumazet edumazet@google.com netdevsim: prevent bad user input in nsim_dev_health_break_write()
Vladimir Oltean vladimir.oltean@nxp.com net: mscc: ocelot: fix incorrect IFH SRC_PORT field in ocelot_ifh_set_basic()
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check return value of sock_recvmsg when draining clc data
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check smcd_v2_ext_offset when receiving proposal msg
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt when receiving proposal msg
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check sndbuf_space again after NOSPACE flag is set in smc_poll
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: protect link down work from execute after lgr freed
Huaisheng Ye huaisheng.ye@intel.com cxl/region: Fix region creation for greater than x2 switches
Davidlohr Bueso dave@stgolabs.net cxl/pci: Fix potential bogus return value upon successful probing
Olaf Hering olaf@aepfle.de tools: hv: change permissions of NetworkManager configuration file
Darrick J. Wong djwong@kernel.org xfs: fix zero byte checking in the superblock scrubber
Darrick J. Wong djwong@kernel.org xfs: fix sb_spino_align checks for large fsblock sizes
Darrick J. Wong djwong@kernel.org xfs: fix off-by-one error in fsmap's end_daddr usage
Dave Chinner dchinner@redhat.com xfs: fix sparse inode limits on runt AG
Dave Chinner dchinner@redhat.com xfs: sb_spino_align is not verified
Gao Xiang xiang@kernel.org erofs: use buffered I/O for file-backed mounts by default
Gao Xiang xiang@kernel.org erofs: reference `struct erofs_device_info` for erofs_map_dev
Gao Xiang xiang@kernel.org erofs: use `struct erofs_device_info` for the primary device
Gao Xiang xiang@kernel.org erofs: add erofs_sb_free() helper
Vasily Gorbik gor@linux.ibm.com s390/mm: Consider KMSAN modules metadata for paging levels
Vineeth Pillai (Google) vineeth@bitbyteword.org sched/dlserver: Fix dlserver time accounting
Vineeth Pillai (Google) vineeth@bitbyteword.org sched/dlserver: Fix dlserver double enqueue
Gao Xiang xiang@kernel.org erofs: fix PSI memstall accounting
Alexander Gordeev agordeev@linux.ibm.com s390/ipl: Fix never less than zero warning
Vladimir Riabchun ferr.lambarginio@gmail.com i2c: pnx: Fix timeout in wait functions
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Do not scan and remove the P2SB device when it is unhidden
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Move P2SB hide and unhide code to p2sb_scan_and_cache()
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Introduce the global flag p2sb_hidden_by_bios
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Factor out p2sb_read_from_cache()
Peter Zijlstra peterz@infradead.org sched/eevdf: More PELT vs DELAYED_DEQUEUE
Vincent Guittot vincent.guittot@linaro.org sched/fair: Fix sched_can_stop_tick() for fair tasks
K Prateek Nayak kprateek.nayak@amd.com sched/fair: Fix NEXT_BUDDY
Michael Neuling michaelneuling@tenstorrent.com RISC-V: KVM: Fix csr_write -> csr_set for HVIEN PMU overflow bit
Levi Yun yeoreum.yun@arm.com firmware: arm_ffa: Fix the race around setting ffa_dev->properties
Arnd Bergmann arnd@arndb.de firmware: arm_scmi: Fix i.MX build dependency
Russell King (Oracle) rmk+kernel@armlinux.org.uk net: stmmac: fix TSO DMA API usage causing oops
Lion Ackermann nnamrec@gmail.com net: sched: fix ordering of qlen adjustment
Diffstat:
Makefile | 4 +- arch/arm64/kvm/sys_regs.c | 3 +- arch/hexagon/Makefile | 6 + arch/riscv/kvm/aia.c | 2 +- arch/s390/boot/startup.c | 2 + arch/s390/boot/vmem.c | 6 +- arch/s390/kernel/ipl.c | 2 +- arch/x86/kernel/cpu/mshyperv.c | 58 +++++ arch/x86/kvm/cpuid.c | 31 ++- arch/x86/kvm/cpuid.h | 1 + arch/x86/kvm/svm/svm.c | 9 - arch/x86/kvm/x86.c | 4 +- block/blk-mq-sysfs.c | 16 +- block/blk-mq.c | 40 ++-- block/blk-sysfs.c | 4 +- drivers/accel/ivpu/ivpu_gem.c | 2 +- drivers/accel/ivpu/ivpu_pm.c | 2 +- drivers/block/zram/zram_drv.c | 15 +- drivers/clocksource/hyperv_timer.c | 14 +- drivers/cxl/core/region.c | 25 +- drivers/cxl/pci.c | 3 +- drivers/dma-buf/dma-buf.c | 2 +- drivers/dma-buf/udmabuf.c | 180 ++++++++------ drivers/edac/amd64_edac.c | 32 +-- drivers/firmware/arm_ffa/bus.c | 15 +- drivers/firmware/arm_ffa/driver.c | 7 +- drivers/firmware/arm_scmi/vendors/imx/Kconfig | 1 + drivers/firmware/imx/Kconfig | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 5 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c | 11 + drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c | 2 +- drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c | 2 +- .../gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c | 2 +- drivers/gpu/drm/display/drm_dp_tunnel.c | 10 +- drivers/gpu/drm/drm_modes.c | 11 +- drivers/gpu/drm/i915/gt/intel_engine_types.h | 5 + drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 41 +++- drivers/gpu/drm/panel/panel-himax-hx83102.c | 2 + drivers/gpu/drm/panel/panel-novatek-nt35950.c | 4 +- drivers/gpu/drm/panel/panel-sitronix-st7701.c | 1 + drivers/gpu/drm/panel/panel-synaptics-r63353.c | 2 +- drivers/hv/hv_kvp.c | 6 + drivers/hv/hv_snapshot.c | 6 + drivers/hv/hv_util.c | 9 + drivers/hv/hyperv_vmbus.h | 2 + drivers/hwmon/tmp513.c | 10 +- drivers/i2c/busses/i2c-pnx.c | 4 +- drivers/i2c/busses/i2c-riic.c | 2 +- drivers/irqchip/irq-gic-v3.c | 17 +- drivers/mmc/host/mtk-sd.c | 2 + drivers/mmc/host/sdhci-tegra.c | 1 - drivers/net/can/m_can/m_can.c | 36 ++- drivers/net/can/m_can/m_can.h | 1 + drivers/net/can/m_can/m_can_pci.c | 1 + drivers/net/ethernet/broadcom/bgmac-platform.c | 5 +- .../chelsio/inline_crypto/chtls/chtls_main.c | 5 +- drivers/net/ethernet/huawei/hinic/hinic_main.c | 2 + drivers/net/ethernet/mscc/ocelot.c | 2 +- drivers/net/ethernet/oa_tc6.c | 11 +- drivers/net/ethernet/pensando/ionic/ionic_dev.c | 5 +- .../net/ethernet/pensando/ionic/ionic_ethtool.c | 4 +- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 4 +- drivers/net/ethernet/renesas/rswitch.c | 68 +++--- drivers/net/ethernet/renesas/rswitch.h | 13 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 +- drivers/net/mdio/fwnode_mdio.c | 13 +- drivers/net/netdevsim/health.c | 2 + drivers/net/netdevsim/netdev.c | 4 +- drivers/net/team/team_core.c | 10 +- drivers/net/tun.c | 2 +- drivers/of/address.c | 5 +- drivers/of/base.c | 15 +- drivers/of/irq.c | 2 + drivers/of/property.c | 2 - drivers/platform/x86/p2sb.c | 79 ++++-- drivers/thunderbolt/nhi.c | 8 + drivers/thunderbolt/nhi.h | 4 + drivers/thunderbolt/retimer.c | 19 +- drivers/thunderbolt/tb.c | 41 ++++ drivers/usb/host/xhci-ring.c | 2 - drivers/usb/serial/option.c | 27 +++ fs/btrfs/bio.c | 10 +- fs/btrfs/ctree.h | 19 ++ fs/btrfs/extent-tree.c | 6 +- fs/btrfs/tree-checker.c | 27 ++- fs/ceph/file.c | 77 +++--- fs/ceph/mds_client.c | 9 +- fs/ceph/super.c | 2 + fs/efivarfs/inode.c | 2 +- fs/efivarfs/internal.h | 1 - fs/efivarfs/super.c | 3 - fs/erofs/data.c | 36 +-- fs/erofs/fileio.c | 9 +- fs/erofs/fscache.c | 10 +- fs/erofs/internal.h | 15 +- fs/erofs/super.c | 80 ++++--- fs/erofs/zdata.c | 4 +- fs/eventpoll.c | 5 +- fs/hugetlbfs/inode.c | 2 +- fs/nfs/pnfs.c | 2 +- fs/nilfs2/btnode.c | 1 + fs/nilfs2/gcinode.c | 2 +- fs/nilfs2/inode.c | 13 +- fs/nilfs2/namei.c | 5 + fs/nilfs2/nilfs.h | 1 + fs/ocfs2/localalloc.c | 8 +- fs/smb/client/connect.c | 36 ++- fs/smb/server/connection.c | 18 +- fs/smb/server/connection.h | 1 - fs/smb/server/server.c | 7 +- fs/smb/server/server.h | 1 + fs/smb/server/transport_ipc.c | 5 +- fs/xfs/libxfs/xfs_ialloc.c | 16 +- fs/xfs/libxfs/xfs_sb.c | 15 ++ fs/xfs/scrub/agheader.c | 29 ++- fs/xfs/xfs_fsmap.c | 29 ++- include/clocksource/hyperv_timer.h | 2 + include/linux/alloc_tag.h | 7 +- include/linux/arm_ffa.h | 13 +- include/linux/hyperv.h | 1 + include/linux/io_uring.h | 4 +- include/linux/page-flags.h | 12 +- include/linux/sched.h | 7 + include/linux/trace_events.h | 6 +- include/linux/wait.h | 1 + io_uring/io_uring.c | 7 +- kernel/sched/core.c | 2 +- kernel/sched/deadline.c | 8 +- kernel/sched/debug.c | 1 + kernel/sched/fair.c | 73 ++++-- kernel/sched/pelt.c | 2 +- kernel/sched/sched.h | 13 +- kernel/trace/fgraph.c | 8 +- kernel/trace/ring_buffer.c | 6 +- kernel/trace/trace.c | 264 +++++---------------- kernel/trace/trace.h | 6 +- kernel/trace/trace_events.c | 227 ++++++++++++++---- kernel/trace/trace_output.c | 6 +- mm/huge_memory.c | 8 +- mm/hugetlb.c | 5 +- mm/memory.c | 8 +- mm/page_alloc.c | 6 +- mm/shmem.c | 22 +- mm/vmalloc.c | 6 +- net/core/netdev-genl.c | 19 +- net/dsa/tag.h | 16 +- net/mctp/route.c | 36 ++- net/mctp/test/route-test.c | 86 +++++++ net/netfilter/ipset/ip_set_list_set.c | 3 + net/netfilter/ipvs/ip_vs_conn.c | 4 +- net/psample/psample.c | 9 +- net/sched/sch_cake.c | 2 +- net/sched/sch_choke.c | 2 +- net/smc/af_smc.c | 18 +- net/smc/smc_clc.c | 17 +- net/smc/smc_clc.h | 22 +- net/smc/smc_core.c | 9 +- sound/soc/fsl/Kconfig | 1 + tools/hv/hv_fcopy_uio_daemon.c | 8 +- tools/hv/hv_set_ifconfig.sh | 2 +- tools/net/ynl/lib/ynl.py | 6 +- tools/testing/selftests/bpf/sdt.h | 2 + tools/testing/selftests/memfd/memfd_test.c | 14 +- .../selftests/net/openvswitch/openvswitch.sh | 6 +- 168 files changed, 1685 insertions(+), 891 deletions(-)
On Mon, 23 Dec 2024 16:56:51 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v6.12: 10 builds: 10 pass, 0 fail 26 boots: 26 pass, 0 fail 116 tests: 116 pass, 0 fail
Linux version: 6.12.7-rc1-gc157915828d8 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Mon, 23 Dec 2024 at 21:31, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
The following test regressions found on arm64 selftests kvm kvm_set_id_regs.
This was reported and fixed by a patch [1].
* graviton4-metal, kselftest-kvm - kvm_set_id_regs
* rk3399-rock-pi-4b-nvhe, kselftest-kvm - kvm_set_id_regs
* rk3399-rock-pi-4b-protected, kselftest-kvm - kvm_set_id_regs
* rk3399-rock-pi-4b-vhe, kselftest-kvm - kvm_set_id_regs
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
Test log: ----------- # ==== Test Assertion Failure ==== # aarch64/set_id_regs.c:434: masks[idx] & ftr_bits[j].mask == ftr_bits[j].mask # pid=2627 tid=2627 errno=22 - Invalid argument # 1 0x0000000000402fe7: test_vm_ftr_id_regs at set_id_regs.c:434 # 2 0x0000000000401b53: main at set_id_regs.c:588 # 3 0x0000ffffa640773f: ?? ??:0 # 4 0x0000ffffa6407817: ?? ??:0 # 5 0x0000000000401e2f: _start at ??:? # 0 != 0xf0 (masks[idx] & ftr_bits[j].mask != ftr_bits[j].mask) not ok 7 selftests: kvm: set_id_regs # exit=254
Test report and fix link, [1] https://lore.kernel.org/all/20241216-kvm-arm64-fix-set-id-asidbits-v1-1-8b10...
Test failed Links: --------- - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.12.y/build/v6.12.... - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.12.y/build/v6.12....
## Build * kernel: 6.12.7-rc1 * git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git * git commit: c157915828d8f4b0a4f2e60fffed2459c27f3003 * git describe: v6.12.6-161-gc157915828d8 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.12.y/build/v6.12....
## Test Regressions (compared to v6.12.5-173-g83a2a70d2d65)
* graviton4-metal, kselftest-kvm - kvm_set_id_regs
* rk3399-rock-pi-4b-nvhe, kselftest-kvm - kvm_set_id_regs
* rk3399-rock-pi-4b-protected, kselftest-kvm - kvm_set_id_regs
* rk3399-rock-pi-4b-vhe, kselftest-kvm - kvm_set_id_regs
## Metric Regressions (compared to v6.12.5-173-g83a2a70d2d65)
## Test Fixes (compared to v6.12.5-173-g83a2a70d2d65)
## Metric Fixes (compared to v6.12.5-173-g83a2a70d2d65)
## Test result summary total: 116741, pass: 93918, fail: 4621, skip: 18202, xfail: 0
## Build Summary * arc: 6 total, 5 passed, 1 failed * arm: 143 total, 137 passed, 6 failed * arm64: 58 total, 56 passed, 2 failed * i386: 22 total, 19 passed, 3 failed * mips: 38 total, 33 passed, 5 failed * parisc: 5 total, 3 passed, 2 failed * powerpc: 44 total, 40 passed, 4 failed * riscv: 27 total, 24 passed, 3 failed * s390: 26 total, 22 passed, 4 failed * sh: 6 total, 5 passed, 1 failed * sparc: 5 total, 3 passed, 2 failed * x86_64: 50 total, 49 passed, 1 failed
## Test suites summary * boot * commands * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-efivarfs * kselftest-exec * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-filesystems-epoll * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-kcmp * kselftest-kvm * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-mincore * kselftest-mqueue * kselftest-net * kselftest-net-mptcp * kselftest-openat2 * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-rust * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-tc-testing * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user_events * kselftest-vDSO * kselftest-x86 * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-build-clang * log-parser-build-gcc * log-parser-test * ltp-capability * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-hugetlb * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-smoke * ltp-syscalls * ltp-tracing * perf * rcutorture
-- Linaro LKFT https://lkft.linaro.org
On Tue, 24 Dec 2024 19:12:40 +0000, Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Mon, 23 Dec 2024 at 21:31, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
The following test regressions found on arm64 selftests kvm kvm_set_id_regs.
This was reported and fixed by a patch [1].
graviton4-metal, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-nvhe, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-protected, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-vhe, kselftest-kvm
- kvm_set_id_regs
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
This is totally harmless, and if anything, indicates that the *fix* is doing its job, and that this patch *must* be backported.
M.
On Thu, Dec 26, 2024 at 01:41:41PM +0000, Marc Zyngier wrote:
On Tue, 24 Dec 2024 19:12:40 +0000, Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Mon, 23 Dec 2024 at 21:31, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
The following test regressions found on arm64 selftests kvm kvm_set_id_regs.
This was reported and fixed by a patch [1].
graviton4-metal, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-nvhe, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-protected, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-vhe, kselftest-kvm
- kvm_set_id_regs
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
This is totally harmless, and if anything, indicates that the *fix* is doing its job, and that this patch *must* be backported.
Ok, but for some bizare reason someone stripped OFF the Fixes: tag, which causes this problem to now show up. Hopefully that will not happen again in the future, but now I don't know what the git id is in Linus's tree to be able to apply here.
So, what do I do now?
confused,
greg k-h
On Fri, 27 Dec 2024 13:04:11 +0000, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Dec 26, 2024 at 01:41:41PM +0000, Marc Zyngier wrote:
On Tue, 24 Dec 2024 19:12:40 +0000, Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Mon, 23 Dec 2024 at 21:31, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
The following test regressions found on arm64 selftests kvm kvm_set_id_regs.
This was reported and fixed by a patch [1].
graviton4-metal, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-nvhe, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-protected, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-vhe, kselftest-kvm
- kvm_set_id_regs
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
This is totally harmless, and if anything, indicates that the *fix* is doing its job, and that this patch *must* be backported.
I think I caused the confusion here, as "this patch" refers to the original fix which has been queued, rather than the patch to the selftest, which I don't consider a candidate for backports.
Ok, but for some bizare reason someone stripped OFF the Fixes: tag,
"Someone" == we, the KVM/arm64 maintainers.
And that's on purpose. A selftest patch doesn't fix anything, and I really don't want to use the "Fixes:" tag as a type of dependency. Additionally, these tests are mostly pointless anyway, specially this one, which really should be deleted.
which causes this problem to now show up. Hopefully that will not happen again in the future, but now I don't know what the git id is in Linus's tree to be able to apply here.
So, what do I do now?
Nothing, apart from applying the original fix, and blissfully ignoring the selftest, unless someone really want to backport it (I don't).
Thanks, and sorry for the confusion.
M.
On Fri, Dec 27, 2024 at 01:23:40PM +0000, Marc Zyngier wrote:
On Fri, 27 Dec 2024 13:04:11 +0000, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Dec 26, 2024 at 01:41:41PM +0000, Marc Zyngier wrote:
On Tue, 24 Dec 2024 19:12:40 +0000, Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Mon, 23 Dec 2024 at 21:31, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
The following test regressions found on arm64 selftests kvm kvm_set_id_regs.
This was reported and fixed by a patch [1].
graviton4-metal, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-nvhe, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-protected, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-vhe, kselftest-kvm
- kvm_set_id_regs
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
This is totally harmless, and if anything, indicates that the *fix* is doing its job, and that this patch *must* be backported.
I think I caused the confusion here, as "this patch" refers to the original fix which has been queued, rather than the patch to the selftest, which I don't consider a candidate for backports.
Ok, but for some bizare reason someone stripped OFF the Fixes: tag,
"Someone" == we, the KVM/arm64 maintainers.
And that's on purpose. A selftest patch doesn't fix anything, and I really don't want to use the "Fixes:" tag as a type of dependency. Additionally, these tests are mostly pointless anyway, specially this one, which really should be deleted.
So should I drop something? Revert it? Add a new commit? What is going to help solve the issue that we now have selftests failing?
still confused,
greg k-h
On Fri, 27 Dec 2024 13:34:52 +0000, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Fri, Dec 27, 2024 at 01:23:40PM +0000, Marc Zyngier wrote:
On Fri, 27 Dec 2024 13:04:11 +0000, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Dec 26, 2024 at 01:41:41PM +0000, Marc Zyngier wrote:
On Tue, 24 Dec 2024 19:12:40 +0000, Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Mon, 23 Dec 2024 at 21:31, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
The following test regressions found on arm64 selftests kvm kvm_set_id_regs.
This was reported and fixed by a patch [1].
graviton4-metal, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-nvhe, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-protected, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-vhe, kselftest-kvm
- kvm_set_id_regs
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
This is totally harmless, and if anything, indicates that the *fix* is doing its job, and that this patch *must* be backported.
I think I caused the confusion here, as "this patch" refers to the original fix which has been queued, rather than the patch to the selftest, which I don't consider a candidate for backports.
Ok, but for some bizare reason someone stripped OFF the Fixes: tag,
"Someone" == we, the KVM/arm64 maintainers.
And that's on purpose. A selftest patch doesn't fix anything, and I really don't want to use the "Fixes:" tag as a type of dependency. Additionally, these tests are mostly pointless anyway, specially this one, which really should be deleted.
So should I drop something? Revert it? Add a new commit? What is going to help solve the issue that we now have selftests failing?
There is nothing to add, as the fix for the selftest isn't upstream yet. If you are that bothered about an utterly pointless test failing, feel free to drop the backport of this patch and leave 6.12 being broken.
M.
On Fri, Dec 27, 2024 at 01:43:46PM +0000, Marc Zyngier wrote:
On Fri, 27 Dec 2024 13:34:52 +0000, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Fri, Dec 27, 2024 at 01:23:40PM +0000, Marc Zyngier wrote:
On Fri, 27 Dec 2024 13:04:11 +0000, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Dec 26, 2024 at 01:41:41PM +0000, Marc Zyngier wrote:
On Tue, 24 Dec 2024 19:12:40 +0000, Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Mon, 23 Dec 2024 at 21:31, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote: > > This is the start of the stable review cycle for the 6.12.7 release. > There are 160 patches in this series, all will be posted as a response > to this one. If anyone has any issues with these being applied, please > let me know. > > Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. > Anything received after that time might be too late. > > The whole patch series can be found in one patch at: > https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... > or in the git tree and branch at: > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y > and the diffstat can be found below. > > thanks, > > greg k-h
The following test regressions found on arm64 selftests kvm kvm_set_id_regs.
This was reported and fixed by a patch [1].
graviton4-metal, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-nvhe, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-protected, kselftest-kvm
- kvm_set_id_regs
rk3399-rock-pi-4b-vhe, kselftest-kvm
- kvm_set_id_regs
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
This is totally harmless, and if anything, indicates that the *fix* is doing its job, and that this patch *must* be backported.
I think I caused the confusion here, as "this patch" refers to the original fix which has been queued, rather than the patch to the selftest, which I don't consider a candidate for backports.
Ok, but for some bizare reason someone stripped OFF the Fixes: tag,
"Someone" == we, the KVM/arm64 maintainers.
And that's on purpose. A selftest patch doesn't fix anything, and I really don't want to use the "Fixes:" tag as a type of dependency. Additionally, these tests are mostly pointless anyway, specially this one, which really should be deleted.
So should I drop something? Revert it? Add a new commit? What is going to help solve the issue that we now have selftests failing?
There is nothing to add, as the fix for the selftest isn't upstream yet. If you are that bothered about an utterly pointless test failing, feel free to drop the backport of this patch and leave 6.12 being broken.
I'm not running arm kvm selftests, so it doesn't bother me! :)
thanks,
greg k-h
On Fri, Dec 27, 2024 at 01:23:40PM +0000, Marc Zyngier wrote: ...
Additionally, these tests are mostly pointless anyway, specially this one, which really should be deleted.
Would it by any chance be possible to remove such pointless tests, or at least mark them as BROKEN ? Having them present suggests that people should run them, and feedback such as this one isn't really helpful. If anything, it is tremendously frustrating for anyone trying to run those tests and getting that kind of feedback when noticing and reporting that they are broken.
Thanks, Guenter
On Mon, Dec 23, 2024 at 04:56:51PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
Tested rc1 against the Fedora build system (aarch64, ppc64le, s390x, x86_64), and boot tested x86_64. No regressions noted.
Tested-by: Justin M. Forbes jforbes@fedoraproject.org
On 12/23/24 8:56 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
OVERVIEW
Builds: 42 passed, 0 failed
Boot tests: 571 passed, 0 failed
CI systems: broonie, maestro
REVISION
Commit name: v6.12.6-161-gc157915828d8 hash: c157915828d8f4b0a4f2e60fffed2459c27f3003 Checked out from https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y
BUILDS
No build failures found
BOOT TESTS
No boot failures found
See complete and up-to-date report at:
https://kcidb.kernelci.org/d/revision/revision?orgId=1&var-git_commit_ha...
Tested-by: kernelci.org bot bot@kernelci.org
Thanks, KernelCI team
* Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
Hi Greg
6.12.7-rc1 compiles, boots and runs here on x86_64 (AMD Ryzen 5 PRO 4650G, Slackware64-15.0)
Nitpicking: On Slackware64-current (Tue Dec 24 19:32:08 UTC 2024) I notice the following info in dmesg on an AMD Ryzen 5 7520U lappy with kernel-firmware-20240904_87cae27 popping up within some minutes after booting, no negative system impact noticed; tho, once the event is logged, it seems to show up more frequent even in idle usage.
Dec 26 11:09:22 karrde kernel: rtw_8821ce 0000:02:00.0: unhandled firmware c2h interrupt Dec 26 11:10:11 karrde last message buffered 3 times Dec 26 11:12:14 karrde last message buffered 12 times Dec 26 11:21:19 karrde last message buffered 41 times
Bluetooth is enabled. This happens on both 6.12.7-rc1 and 6.12.6 (probably affects earlier 6.12.X kernels too), I can't bisect.
Pat, what about a more recent kernel-firmware package to test? :)
Tested-by: Markus Reichelt lkt+2023@mareichelt.com
On 12/23/2024 7:56 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
The kernel builds fine for v6.12.7-rc1 on x86 and arm64 Azure VM.
Tested-by: Hardik Garg hargar@linux.microsoft.com
Thanks, Hardik
Hi!
This is the start of the stable review cycle for the 6.12.7 release. There are 160 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here (obsvx2, bbb are test problems, not kernel problems):
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-6...
6.6 passes our testing, too:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-6...
Tested-by: Pavel Machek (CIP) pavel@denx.de
Best regards, Pavel
linux-stable-mirror@lists.linaro.org