This is the start of the stable review cycle for the 6.6.68 release. There are 116 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.68-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.6.68-rc1
Michel Dänzer mdaenzer@redhat.com drm/amdgpu: Handle NULL bo->tbo.resource (again) in amdgpu_vm_bo_update
Francesco Dolcini francesco.dolcini@toradex.com net: fec: make PPS channel configurable
Francesco Dolcini francesco.dolcini@toradex.com net: fec: refactor PPS channel configuration
Pavel Begunkov asml.silence@gmail.com io_uring/rw: avoid punting to io-wq directly
Jens Axboe axboe@kernel.dk io_uring/rw: treat -EOPNOTSUPP for IOCB_NOWAIT like -EAGAIN
Jens Axboe axboe@kernel.dk io_uring/rw: split io_read() into a helper
Xuewen Yan xuewen.yan@unisoc.com epoll: Add synchronous wakeup support for ep_poll_callback
Max Kellermann max.kellermann@ionos.com ceph: fix memory leaks in __ceph_sync_read()
Alex Markuze amarkuze@redhat.com ceph: improve error handling and short/overflow-read logic in __ceph_sync_read()
Ilya Dryomov idryomov@gmail.com ceph: validate snapdirname option length when mounting
Zijun Hu quic_zijuhu@quicinc.com of: Fix refcount leakage for OF node returned by __of_get_dma_parent()
Herve Codina herve.codina@bootlin.com of: Fix error path in of_parse_phandle_with_args_map()
Jann Horn jannh@google.com udmabuf: also check for F_SEAL_FUTURE_WRITE
Edward Adam Davis eadavis@qq.com nilfs2: prevent use of deleted inode
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: fix buffer head leaks in calls to truncate_inode_pages()
Zijun Hu quic_zijuhu@quicinc.com of/irq: Fix using uninitialized variable @addr_len in API of_irq_parse_one()
Zijun Hu quic_zijuhu@quicinc.com of/irq: Fix interrupt-map cell length check in of_irq_parse_imap_parent()
Trond Myklebust trond.myklebust@hammerspace.com NFS/pnfs: Fix a live lock between recalled layouts and layoutget
Pavel Begunkov asml.silence@gmail.com io_uring: check if iowq is killed before queuing
Jann Horn jannh@google.com io_uring: Fix registered ring file refcount leak
Tiezhu Yang yangtiezhu@loongson.cn selftests/bpf: Use asm constraint "m" for LoongArch
Isaac J. Manjarres isaacmanjarres@google.com selftests/memfd: run sysctl tests when PID namespace support is enabled
Steven Rostedt rostedt@goodmis.org tracing: Add "%s" check in test_event_printk()
Steven Rostedt rostedt@goodmis.org tracing: Add missing helper functions in event pointer dereference check
Steven Rostedt rostedt@goodmis.org tracing: Fix test_event_printk() to process entire print argument
Enzo Matsumiya ematsumiya@suse.de smb: client: fix TCP timers deadlock after rmmod
Sean Christopherson seanjc@google.com KVM: x86: Play nice with protected guests in complete_hypercall_exit()
Michael Kelley mhklinux@outlook.com Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet
Qu Wenruo wqu@suse.com btrfs: tree-checker: reject inline extent items with 0 ref count
Matthew Wilcox (Oracle) willy@infradead.org vmalloc: fix accounting with i915
Kairui Song kasong@tencent.com zram: fix uninitialized ZRAM not releasing backing device
Kairui Song kasong@tencent.com zram: refuse to use zero sized block device as backing device
Murad Masimov m.masimov@maxima.ru hwmon: (tmp513) Fix interpretation of values of Temperature Result and Limit Registers
Murad Masimov m.masimov@maxima.ru hwmon: (tmp513) Fix Current Register value interpretation
Murad Masimov m.masimov@maxima.ru hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers
Andy Shevchenko andriy.shevchenko@linux.intel.com hwmon: (tmp513) Use SI constants from units.h
Andy Shevchenko andriy.shevchenko@linux.intel.com hwmon: (tmp513) Simplify with dev_err_probe()
Andy Shevchenko andriy.shevchenko@linux.intel.com hwmon: (tmp513) Don't use "proxy" headers
Pierre-Eric Pelloux-Prayer pierre-eric.pelloux-prayer@amd.com drm/amdgpu: don't access invalid sched
Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com i915/guc: Accumulate active runtime on gt reset
Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com i915/guc: Ensure busyness counter increases motonically
Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com i915/guc: Reset engine utilization buffer before registration
Yang Yingliang yangyingliang@huawei.com drm/panel: novatek-nt35950: fix return value check in nt35950_probe()
Ville Syrjälä ville.syrjala@linux.intel.com drm/modes: Avoid divide by zero harder in drm_mode_vrefresh()
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Improve redrive mode handling
Daniele Palmas dnlplm@gmail.com USB: serial: option: add Telit FE910C04 rmnet compositions
Jack Wu wojackbb@gmail.com USB: serial: option: add MediaTek T7XX compositions
Mank Wang mank.wang@netprisma.com USB: serial: option: add Netprisma LCUK54 modules for WWAN Ready
Michal Hrusecky michal.hrusecky@turris.com USB: serial: option: add MeiG Smart SLM770A
Daniel Swanemar d.swanemar@gmail.com USB: serial: option: add TCL IK512 MBIM & ECM
Nathan Chancellor nathan@kernel.org hexagon: Disable constant extender optimization for LLVM prior to 19.1.0
James Bottomley James.Bottomley@HansenPartnership.com efivarfs: Fix error on non-existent file
Geert Uytterhoeven geert+renesas@glider.be i2c: riic: Always round-up when calculating bus period
Dan Carpenter dan.carpenter@linaro.org chelsio/chtls: prevent potential integer overflow on 32bit
Eric Dumazet edumazet@google.com net: tun: fix tun_napi_alloc_frags()
Sean Christopherson seanjc@google.com KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init
Borislav Petkov (AMD) bp@alien8.de EDAC/amd64: Simplify ECC check on unified memory controllers
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp mmc: mtk-sd: disable wakeup in .remove() and in the error path of .probe()
Prathamesh Shete pshete@nvidia.com mmc: sdhci-tegra: Remove SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp net: mdiobus: fix an OF node reference leak
Adrian Moreno amorenoz@redhat.com selftests: openvswitch: fix tcpdump execution
Phil Sutter phil@nwl.cc netfilter: ipset: Fix for recursive locking warning
David Laight David.Laight@ACULAB.COM ipvs: Fix clamp() of ip_vs_conn_tab on small memory systems
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp net: ethernet: bgmac-platform: fix an OF node reference leak
Dan Carpenter dan.carpenter@linaro.org net: hinic: Fix cleanup in create_rxqs/txqs()
Marios Makassikis mmakassikis@freebox.fr ksmbd: fix broken transfers when exceeding max simultaneous operations
Marios Makassikis mmakassikis@freebox.fr ksmbd: count all requests in req_running counter
Nikita Yushchenko nikita.yoush@cogentembedded.com net: renesas: rswitch: rework ts tags management
Shannon Nelson shannon.nelson@amd.com ionic: use ee->offset when returning sprom data
Brett Creeley brett.creeley@amd.com ionic: Fix netdev notifier unregister on failure
Eric Dumazet edumazet@google.com netdevsim: prevent bad user input in nsim_dev_health_break_write()
Vladimir Oltean vladimir.oltean@nxp.com net: mscc: ocelot: fix incorrect IFH SRC_PORT field in ocelot_ifh_set_basic()
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check return value of sock_recvmsg when draining clc data
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check smcd_v2_ext_offset when receiving proposal msg
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt when receiving proposal msg
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: check sndbuf_space again after NOSPACE flag is set in smc_poll
Guangguan Wang guangguan.wang@linux.alibaba.com net/smc: protect link down work from execute after lgr freed
Huaisheng Ye huaisheng.ye@intel.com cxl/region: Fix region creation for greater than x2 switches
Davidlohr Bueso dave@stgolabs.net cxl/pci: Fix potential bogus return value upon successful probing
Olaf Hering olaf@aepfle.de tools: hv: change permissions of NetworkManager configuration file
Darrick J. Wong djwong@kernel.org xfs: reset rootdir extent size hint after growfsrt
Darrick J. Wong djwong@kernel.org xfs: take m_growlock when running growfsrt
Darrick J. Wong djwong@kernel.org xfs: use XFS_BUF_DADDR_NULL for daddrs in getfsmap code
Zizhi Wo wozizhi@huawei.com xfs: Fix the owner setting issue for rmap query in xfs fsmap
Darrick J. Wong djwong@kernel.org xfs: conditionally allow FS_XFLAG_REALTIME changes if S_DAX is set
Darrick J. Wong djwong@kernel.org xfs: attr forks require attr, not attr2
Julian Sun sunjunchao2870@gmail.com xfs: remove unused parameter in macro XFS_DQUOT_LOGRES
Darrick J. Wong djwong@kernel.org xfs: fix file_path handling in tracepoints
Chen Ni nichen@iscas.ac.cn xfs: convert comma to semicolon
lei lu llfamsec@gmail.com xfs: don't walk off the end of a directory data block
John Garry john.g.garry@oracle.com xfs: Fix xfs_prepare_shift() range for RT
John Garry john.g.garry@oracle.com xfs: Fix xfs_flush_unmap_range() range for RT
Darrick J. Wong djwong@kernel.org xfs: create a new helper to return a file's allocation unit
Darrick J. Wong djwong@kernel.org xfs: declare xfs_file.c symbols in xfs_file.h
Darrick J. Wong djwong@kernel.org xfs: use consistent uid/gid when grabbing dquots for inodes
Darrick J. Wong djwong@kernel.org xfs: verify buffer, inode, and dquot items every tx commit
Christoph Hellwig hch@lst.de xfs: fix the contact address for the sysfs ABI documentation
Vladimir Riabchun ferr.lambarginio@gmail.com i2c: pnx: Fix timeout in wait functions
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Do not scan and remove the P2SB device when it is unhidden
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Move P2SB hide and unhide code to p2sb_scan_and_cache()
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Introduce the global flag p2sb_hidden_by_bios
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com p2sb: Factor out p2sb_read_from_cache()
Hans de Goede hdegoede@redhat.com platform/x86: p2sb: Make p2sb_get_devfn() return void
Russell King (Oracle) rmk+kernel@armlinux.org.uk net: stmmac: fix TSO DMA API usage causing oops
Roger Quadros rogerq@kernel.org usb: cdns3: Add quirk flag to enable suspend residency
Kai-Heng Feng kai.heng.feng@canonical.com PCI/AER: Disable AER service on suspend
Vidya Sagar vidyas@nvidia.com PCI: Use preserve_config in place of pci_flags
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: Intel: sof_sdw: add quirk for Dell SKU 0B8C
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: Intel: sof_sdw: fix jack detection on ADL-N variant RVP
Jiaxun Yang jiaxun.yang@flygoat.com MIPS: Loongson64: DTS: Fix msi node for ls7a
Roger Quadros rogerq@kernel.org usb: cdns3-ti: Add workaround for Errata i2409
Ajit Khaparde ajit.khaparde@broadcom.com PCI: Add ACS quirk for Broadcom BCM5760X NIC
Jiwei Sun sunjw10@lenovo.com PCI: vmd: Create domain symlink before pci_bus_add_devices()
Peng Hongchi hongchi.peng@siengine.com usb: dwc2: gadget: Don't write invalid mapped sg entries into dma_desc with iommu enabled
Lion Ackermann nnamrec@gmail.com net: sched: fix ordering of qlen adjustment
-------------
Diffstat:
Documentation/ABI/testing/sysfs-fs-xfs | 8 +- Makefile | 4 +- arch/hexagon/Makefile | 6 + .../boot/dts/loongson/loongson64g_4core_ls7a.dts | 1 + arch/x86/kvm/cpuid.c | 31 +++- arch/x86/kvm/cpuid.h | 1 + arch/x86/kvm/x86.c | 4 +- drivers/block/zram/zram_drv.c | 15 +- drivers/cxl/core/region.c | 25 ++- drivers/cxl/pci.c | 3 +- drivers/dma-buf/udmabuf.c | 2 +- drivers/edac/amd64_edac.c | 32 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +- drivers/gpu/drm/drm_modes.c | 11 +- drivers/gpu/drm/i915/gt/intel_engine_types.h | 5 + drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 41 ++++- drivers/gpu/drm/panel/panel-novatek-nt35950.c | 4 +- drivers/hv/hv_kvp.c | 6 + drivers/hv/hv_snapshot.c | 6 + drivers/hv/hv_util.c | 9 + drivers/hv/hyperv_vmbus.h | 2 + drivers/hwmon/tmp513.c | 74 ++++---- drivers/i2c/busses/i2c-pnx.c | 4 +- drivers/i2c/busses/i2c-riic.c | 2 +- drivers/mmc/host/mtk-sd.c | 2 + drivers/mmc/host/sdhci-tegra.c | 1 - drivers/net/ethernet/broadcom/bgmac-platform.c | 5 +- .../chelsio/inline_crypto/chtls/chtls_main.c | 5 +- drivers/net/ethernet/freescale/fec_ptp.c | 11 +- drivers/net/ethernet/huawei/hinic/hinic_main.c | 2 + drivers/net/ethernet/mscc/ocelot.c | 2 +- .../net/ethernet/pensando/ionic/ionic_ethtool.c | 4 +- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 4 +- drivers/net/ethernet/renesas/rswitch.c | 68 +++---- drivers/net/ethernet/renesas/rswitch.h | 13 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 +- drivers/net/mdio/fwnode_mdio.c | 13 +- drivers/net/netdevsim/health.c | 2 + drivers/net/tun.c | 2 +- drivers/of/address.c | 2 +- drivers/of/base.c | 15 +- drivers/of/irq.c | 2 + drivers/pci/controller/pci-host-common.c | 4 - drivers/pci/controller/vmd.c | 8 +- drivers/pci/pcie/aer.c | 18 ++ drivers/pci/probe.c | 22 ++- drivers/pci/quirks.c | 4 + drivers/platform/x86/p2sb.c | 94 ++++++---- drivers/thunderbolt/tb.c | 41 +++++ drivers/usb/cdns3/cdns3-ti.c | 15 +- drivers/usb/cdns3/core.h | 1 + drivers/usb/cdns3/drd.c | 10 +- drivers/usb/cdns3/drd.h | 3 + drivers/usb/dwc2/gadget.c | 4 +- drivers/usb/serial/option.c | 27 +++ fs/btrfs/tree-checker.c | 27 ++- fs/ceph/file.c | 34 ++-- fs/ceph/super.c | 2 + fs/efivarfs/inode.c | 2 +- fs/efivarfs/internal.h | 1 - fs/efivarfs/super.c | 3 - fs/eventpoll.c | 5 +- fs/nfs/pnfs.c | 2 +- fs/nilfs2/btnode.c | 1 + fs/nilfs2/gcinode.c | 2 +- fs/nilfs2/inode.c | 13 +- fs/nilfs2/namei.c | 5 + fs/nilfs2/nilfs.h | 1 + fs/smb/client/connect.c | 36 ++-- fs/smb/server/connection.c | 18 +- fs/smb/server/connection.h | 1 - fs/smb/server/server.c | 7 +- fs/smb/server/server.h | 1 + fs/smb/server/transport_ipc.c | 5 +- fs/xfs/Kconfig | 12 ++ fs/xfs/libxfs/xfs_dir2_data.c | 31 +++- fs/xfs/libxfs/xfs_dir2_priv.h | 7 + fs/xfs/libxfs/xfs_quota_defs.h | 2 +- fs/xfs/libxfs/xfs_trans_resv.c | 28 +-- fs/xfs/scrub/agheader_repair.c | 2 +- fs/xfs/scrub/bmap.c | 8 +- fs/xfs/scrub/trace.h | 10 +- fs/xfs/xfs.h | 4 + fs/xfs/xfs_bmap_util.c | 22 ++- fs/xfs/xfs_buf_item.c | 32 ++++ fs/xfs/xfs_dquot_item.c | 31 ++++ fs/xfs/xfs_file.c | 29 ++- fs/xfs/xfs_file.h | 15 ++ fs/xfs/xfs_fsmap.c | 6 +- fs/xfs/xfs_inode.c | 29 ++- fs/xfs/xfs_inode.h | 2 + fs/xfs/xfs_inode_item.c | 32 ++++ fs/xfs/xfs_ioctl.c | 12 ++ fs/xfs/xfs_iops.c | 1 + fs/xfs/xfs_iops.h | 3 - fs/xfs/xfs_rtalloc.c | 78 ++++++-- fs/xfs/xfs_symlink.c | 8 +- include/linux/hyperv.h | 1 + include/linux/io_uring.h | 4 +- include/linux/wait.h | 1 + io_uring/io_uring.c | 15 +- io_uring/io_uring.h | 1 - io_uring/rw.c | 31 +++- kernel/trace/trace_events.c | 199 ++++++++++++++++----- mm/vmalloc.c | 6 +- net/netfilter/ipset/ip_set_list_set.c | 3 + net/netfilter/ipvs/ip_vs_conn.c | 4 +- net/sched/sch_cake.c | 2 +- net/sched/sch_choke.c | 2 +- net/smc/af_smc.c | 18 +- net/smc/smc_clc.c | 17 +- net/smc/smc_clc.h | 22 ++- net/smc/smc_core.c | 9 +- sound/soc/intel/boards/sof_sdw.c | 18 ++ tools/hv/hv_set_ifconfig.sh | 2 +- tools/testing/selftests/bpf/sdt.h | 2 + tools/testing/selftests/memfd/memfd_test.c | 14 +- .../selftests/net/openvswitch/openvswitch.sh | 6 +- 119 files changed, 1223 insertions(+), 441 deletions(-)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lion Ackermann nnamrec@gmail.com
commit 5eb7de8cd58e73851cd37ff8d0666517d9926948 upstream.
Changes to sch->q.qlen around qdisc_tree_reduce_backlog() need to happen _before_ a call to said function because otherwise it may fail to notify parent qdiscs when the child is about to become empty.
Signed-off-by: Lion Ackermann nnamrec@gmail.com Acked-by: Toke Høiland-Jørgensen toke@toke.dk Signed-off-by: David S. Miller davem@davemloft.net Cc: Artem Metla ametla@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_cake.c | 2 +- net/sched/sch_choke.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -1542,7 +1542,6 @@ static unsigned int cake_drop(struct Qdi b->backlogs[idx] -= len; b->tin_backlog -= len; sch->qstats.backlog -= len; - qdisc_tree_reduce_backlog(sch, 1, len);
flow->dropped++; b->tin_dropped++; @@ -1553,6 +1552,7 @@ static unsigned int cake_drop(struct Qdi
__qdisc_drop(skb, to_free); sch->q.qlen--; + qdisc_tree_reduce_backlog(sch, 1, len);
cake_heapify(q, 0);
--- a/net/sched/sch_choke.c +++ b/net/sched/sch_choke.c @@ -123,10 +123,10 @@ static void choke_drop_by_idx(struct Qdi if (idx == q->tail) choke_zap_tail_holes(q);
+ --sch->q.qlen; qdisc_qstats_backlog_dec(sch, skb); qdisc_tree_reduce_backlog(sch, 1, qdisc_pkt_len(skb)); qdisc_drop(skb, sch, to_free); - --sch->q.qlen; }
struct choke_skb_cb {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peng Hongchi hongchi.peng@siengine.com
[ Upstream commit 1134289b6b93d73721340b66c310fd985385e8fa ]
When using dma_map_sg() to map the scatterlist with iommu enabled, the entries in the scatterlist can be mergerd into less but longer entries in the function __finalise_sg(). So that the number of valid mapped entries is actually smaller than ureq->num_reqs,and there are still some invalid entries in the scatterlist with dma_addr=0xffffffff and len=0. Writing these invalid sg entries into the dma_desc can cause a data transmission error.
The function dma_map_sg() returns the number of valid map entries and the return value is assigned to usb_request::num_mapped_sgs in function usb_gadget_map_request_by_dev(). So that just write valid mapped entries into dma_desc according to the usb_request::num_mapped_sgs, and set the IOC bit if it's the last valid mapped entry.
This patch poses no risk to no-iommu situation, cause ureq->num_mapped_sgs equals ureq->num_sgs while using dma_direct_map_sg() to map the scatterlist whith iommu disabled.
Signed-off-by: Peng Hongchi hongchi.peng@siengine.com Link: https://lore.kernel.org/r/20240523100315.7226-1-hongchi.peng@siengine.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/dwc2/gadget.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c index b2f6da5b65cc..b26de09f6b6d 100644 --- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -885,10 +885,10 @@ static void dwc2_gadget_config_nonisoc_xfer_ddma(struct dwc2_hsotg_ep *hs_ep, }
/* DMA sg buffer */ - for_each_sg(ureq->sg, sg, ureq->num_sgs, i) { + for_each_sg(ureq->sg, sg, ureq->num_mapped_sgs, i) { dwc2_gadget_fill_nonisoc_xfer_ddma_one(hs_ep, &desc, sg_dma_address(sg) + sg->offset, sg_dma_len(sg), - sg_is_last(sg)); + (i == (ureq->num_mapped_sgs - 1))); desc_count += hs_ep->desc_count; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiwei Sun sunjw10@lenovo.com
[ Upstream commit f24c9bfcd423e2b2bb0d198456412f614ec2030a ]
The vmd driver creates a "domain" symlink in sysfs for each VMD bridge. Previously this symlink was created after pci_bus_add_devices() added devices below the VMD bridge and emitted udev events to announce them to userspace.
This led to a race between userspace consumers of the udev events and the kernel creation of the symlink. One such consumer is mdadm, which assembles block devices into a RAID array, and for devices below a VMD bridge, mdadm depends on the "domain" symlink.
If mdadm loses the race, it may be unable to assemble a RAID array, which may cause a boot failure or other issues, with complaints like this:
(udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: Unable to get real path for '/sys/bus/pci/drivers/vmd/0000:c7:00.5/domain/device'' (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: /dev/nvme1n1 is not attached to Intel(R) RAID controller.' (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: No OROM/EFI properties for /dev/nvme1n1' (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: no RAID superblock on /dev/nvme1n1.' (udev-worker)[2149]: nvme1n1: Process '/sbin/mdadm -I /dev/nvme1n1' failed with exit code 1.
This symptom prevents the OS from booting successfully.
After a NVMe disk is probed/added by the nvme driver, udevd invokes mdadm to detect if there is a mdraid associated with this NVMe disk, and mdadm determines if a NVMe device is connected to a particular VMD domain by checking the "domain" symlink. For example:
Thread A Thread B Thread mdadm vmd_enable_domain pci_bus_add_devices __driver_probe_device ... work_on_cpu schedule_work_on : wakeup Thread B nvme_probe : wakeup scan_work to scan nvme disk and add nvme disk then wakeup udevd : udevd executes mdadm command flush_work main : wait for nvme_probe done ... __driver_probe_device find_driver_devices : probe next nvme device : 1) Detect domain symlink ... 2) Find domain symlink ... from vmd sysfs ... 3) Domain symlink not ... created yet; failed sysfs_create_link : create domain symlink
Create the VMD "domain" symlink before invoking pci_bus_add_devices() to avoid this race.
Suggested-by: Adrian Huang ahuang12@lenovo.com Link: https://lore.kernel.org/linux-pci/20240605124844.24293-1-sjiwei@163.com Signed-off-by: Jiwei Sun sunjw10@lenovo.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Nirmal Patel nirmal.patel@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/vmd.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c index 992fea22fd9f..5ff2066aa516 100644 --- a/drivers/pci/controller/vmd.c +++ b/drivers/pci/controller/vmd.c @@ -930,6 +930,9 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features) dev_set_msi_domain(&vmd->bus->dev, dev_get_msi_domain(&vmd->dev->dev));
+ WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus->dev.kobj, + "domain"), "Can't create symlink to domain\n"); + vmd_acpi_begin();
pci_scan_child_bus(vmd->bus); @@ -969,9 +972,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features) pci_bus_add_devices(vmd->bus);
vmd_acpi_end(); - - WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus->dev.kobj, - "domain"), "Can't create symlink to domain\n"); return 0; }
@@ -1047,8 +1047,8 @@ static void vmd_remove(struct pci_dev *dev) { struct vmd_dev *vmd = pci_get_drvdata(dev);
- sysfs_remove_link(&vmd->dev->dev.kobj, "domain"); pci_stop_root_bus(vmd->bus); + sysfs_remove_link(&vmd->dev->dev.kobj, "domain"); pci_remove_root_bus(vmd->bus); vmd_cleanup_srcu(vmd); vmd_detach_resources(vmd);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ajit Khaparde ajit.khaparde@broadcom.com
[ Upstream commit 524e057b2d66b61f9b63b6db30467ab7b0bb4796 ]
The Broadcom BCM5760X NIC may be a multi-function device.
While it does not advertise an ACS capability, peer-to-peer transactions are not possible between the individual functions. So it is ok to treat them as fully isolated.
Add an ACS quirk for this device so the functions can be in independent IOMMU groups and attached individually to userspace applications using VFIO.
[kwilczynski: commit log] Link: https://lore.kernel.org/linux-pci/20240510204228.73435-1-ajit.khaparde@broad... Signed-off-by: Ajit Khaparde ajit.khaparde@broadcom.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Andy Gospodarek gospo@broadcom.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index c5115ad59766..fd35ad0648a0 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5116,6 +5116,10 @@ static const struct pci_dev_acs_enabled { { PCI_VENDOR_ID_BROADCOM, 0x1750, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0x1751, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0x1752, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_BROADCOM, 0x1760, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_BROADCOM, 0x1761, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_BROADCOM, 0x1762, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_BROADCOM, 0x1763, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0xD714, pci_quirk_brcm_acs }, /* Amazon Annapurna Labs */ { PCI_VENDOR_ID_AMAZON_ANNAPURNA_LABS, 0x0031, pci_quirk_al_acs },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roger Quadros rogerq@kernel.org
[ Upstream commit b50a2da03bd95784541b3f9058e452cc38f9ba05 ]
TI USB2 PHY is known to have a lockup issue on very short suspend intervals. Enable the Suspend Residency quirk flag to workaround this as described in Errata i2409 [1].
[1] - https://www.ti.com/lit/er/sprz457h/sprz457h.pdf
Signed-off-by: Roger Quadros rogerq@kernel.org Signed-off-by: Ravi Gunasekaran r-gunasekaran@ti.com Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/20240516044537.16801-3-r-gunasekaran@ti.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/cdns3/cdns3-ti.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/cdns3/cdns3-ti.c b/drivers/usb/cdns3/cdns3-ti.c index 5945c4b1e11f..cfabc12ee0e3 100644 --- a/drivers/usb/cdns3/cdns3-ti.c +++ b/drivers/usb/cdns3/cdns3-ti.c @@ -16,6 +16,7 @@ #include <linux/of_platform.h> #include <linux/pm_runtime.h> #include <linux/property.h> +#include "core.h"
/* USB Wrapper register offsets */ #define USBSS_PID 0x0 @@ -85,6 +86,18 @@ static inline void cdns_ti_writel(struct cdns_ti *data, u32 offset, u32 value) writel(value, data->usbss + offset); }
+static struct cdns3_platform_data cdns_ti_pdata = { + .quirks = CDNS3_DRD_SUSPEND_RESIDENCY_ENABLE, /* Errata i2409 */ +}; + +static const struct of_dev_auxdata cdns_ti_auxdata[] = { + { + .compatible = "cdns,usb3", + .platform_data = &cdns_ti_pdata, + }, + {}, +}; + static int cdns_ti_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; @@ -176,7 +189,7 @@ static int cdns_ti_probe(struct platform_device *pdev) reg |= USBSS_W1_PWRUP_RST; cdns_ti_writel(data, USBSS_W1, reg);
- error = of_platform_populate(node, NULL, NULL, dev); + error = of_platform_populate(node, NULL, cdns_ti_auxdata, dev); if (error) { dev_err(dev, "failed to create children: %d\n", error); goto err;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiaxun Yang jiaxun.yang@flygoat.com
[ Upstream commit 98a9e2ac3755a353eefea8c52e23d5b0c50f3899 ]
Add it to silent warning: arch/mips/boot/dts/loongson/ls7a-pch.dtsi:68.16-416.5: Warning (interrupt_provider): /bus@10000000/pci@1a000000: '#interrupt-cells' found, but node is not an interrupt provider arch/mips/boot/dts/loongson/loongson64g_4core_ls7a.dts:32.31-40.4: Warning (interrupt_provider): /bus@10000000/msi-controller@2ff00000: Missing '#interrupt-cells' in interrupt provider arch/mips/boot/dts/loongson/loongson64g_4core_ls7a.dtb: Warning (interrupt_map): Failed prerequisite 'interrupt_provider'
Signed-off-by: Jiaxun Yang jiaxun.yang@flygoat.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/boot/dts/loongson/loongson64g_4core_ls7a.dts | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/mips/boot/dts/loongson/loongson64g_4core_ls7a.dts b/arch/mips/boot/dts/loongson/loongson64g_4core_ls7a.dts index c945f8565d54..fb180cb2b8e2 100644 --- a/arch/mips/boot/dts/loongson/loongson64g_4core_ls7a.dts +++ b/arch/mips/boot/dts/loongson/loongson64g_4core_ls7a.dts @@ -33,6 +33,7 @@ compatible = "loongson,pch-msi-1.0"; reg = <0 0x2ff00000 0 0x8>; interrupt-controller; + #interrupt-cells = <1>; msi-controller; loongson,msi-base-vec = <64>; loongson,msi-num-vecs = <192>;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 65c90df918205bc84f5448550cde76a54dae5f52 ]
Experimental tests show that JD2_100K is required, otherwise the jack is detected always even with nothing plugged-in.
To avoid matching with other known quirks the SKU information is used.
Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Link: https://patch.msgid.link/20240624121119.91552-2-pierre-louis.bossart@linux.i... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/boards/sof_sdw.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/sound/soc/intel/boards/sof_sdw.c b/sound/soc/intel/boards/sof_sdw.c index 5980fce81797..dc144cd7e0e3 100644 --- a/sound/soc/intel/boards/sof_sdw.c +++ b/sound/soc/intel/boards/sof_sdw.c @@ -286,6 +286,15 @@ static const struct dmi_system_id sof_sdw_quirk_table[] = { SOF_BT_OFFLOAD_SSP(2) | SOF_SSP_BT_OFFLOAD_PRESENT), }, + { + .callback = sof_sdw_quirk_cb, + .matches = { + DMI_MATCH(DMI_BOARD_VENDOR, "Intel Corporation"), + DMI_MATCH(DMI_PRODUCT_SKU, "0000000000070000"), + }, + .driver_data = (void *)(SOF_SDW_TGL_HDMI | + RT711_JD2_100K), + }, { .callback = sof_sdw_quirk_cb, .matches = {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 92d5b5930e7d55ca07b483490d6298eee828bbe4 ]
Jack detection needs to rely on JD2, as most other Dell AlderLake-based devices.
Closes: https://github.com/thesofproject/linux/issues/5021 Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Link: https://patch.msgid.link/20240624121119.91552-4-pierre-louis.bossart@linux.i... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/boards/sof_sdw.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/sound/soc/intel/boards/sof_sdw.c b/sound/soc/intel/boards/sof_sdw.c index dc144cd7e0e3..db1dcb9d7046 100644 --- a/sound/soc/intel/boards/sof_sdw.c +++ b/sound/soc/intel/boards/sof_sdw.c @@ -425,6 +425,15 @@ static const struct dmi_system_id sof_sdw_quirk_table[] = { /* No Jack */ .driver_data = (void *)SOF_SDW_TGL_HDMI, }, + { + .callback = sof_sdw_quirk_cb, + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc"), + DMI_EXACT_MATCH(DMI_PRODUCT_SKU, "0B8C"), + }, + .driver_data = (void *)(SOF_SDW_TGL_HDMI | + RT711_JD2), + }, { .callback = sof_sdw_quirk_cb, .matches = {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vidya Sagar vidyas@nvidia.com
[ Upstream commit 7246a4520b4bf1494d7d030166a11b5226f6d508 ]
Use preserve_config in place of checking for PCI_PROBE_ONLY flag to enable support for "linux,pci-probe-only" on a per host bridge basis.
This also obviates the use of adding PCI_REASSIGN_ALL_BUS flag if !PCI_PROBE_ONLY, as pci_assign_unassigned_root_bus_resources() takes care of reassigning the resources that are not already claimed.
Link: https://lore.kernel.org/r/20240508174138.3630283-5-vidyas@nvidia.com Signed-off-by: Vidya Sagar vidyas@nvidia.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pci-host-common.c | 4 ---- drivers/pci/probe.c | 20 +++++++++----------- 2 files changed, 9 insertions(+), 15 deletions(-)
diff --git a/drivers/pci/controller/pci-host-common.c b/drivers/pci/controller/pci-host-common.c index 6be3266cd7b5..e2602e38ae45 100644 --- a/drivers/pci/controller/pci-host-common.c +++ b/drivers/pci/controller/pci-host-common.c @@ -73,10 +73,6 @@ int pci_host_common_probe(struct platform_device *pdev) if (IS_ERR(cfg)) return PTR_ERR(cfg);
- /* Do not reassign resources if probe only */ - if (!pci_has_flag(PCI_PROBE_ONLY)) - pci_add_flags(PCI_REASSIGN_ALL_BUS); - bridge->sysdata = cfg; bridge->ops = (struct pci_ops *)&ops->pci_ops; bridge->msi_domain = true; diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 03b519a22840..7e84e472b338 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -3096,20 +3096,18 @@ int pci_host_probe(struct pci_host_bridge *bridge)
bus = bridge->bus;
+ /* If we must preserve the resource configuration, claim now */ + if (bridge->preserve_config) + pci_bus_claim_resources(bus); + /* - * We insert PCI resources into the iomem_resource and - * ioport_resource trees in either pci_bus_claim_resources() - * or pci_bus_assign_resources(). + * Assign whatever was left unassigned. If we didn't claim above, + * this will reassign everything. */ - if (pci_has_flag(PCI_PROBE_ONLY)) { - pci_bus_claim_resources(bus); - } else { - pci_bus_size_bridges(bus); - pci_bus_assign_resources(bus); + pci_assign_unassigned_root_bus_resources(bus);
- list_for_each_entry(child, &bus->children, node) - pcie_bus_configure_settings(child); - } + list_for_each_entry(child, &bus->children, node) + pcie_bus_configure_settings(child);
pci_bus_add_devices(bus); return 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kai-Heng Feng kai.heng.feng@canonical.com
[ Upstream commit 5afc2f763edc5daae4722ee46fea4e627d01fa90 ]
If the link is powered off during suspend, electrical noise may cause errors that are logged via AER. If the AER interrupt is enabled and shares an IRQ with PME, that causes a spurious wakeup during suspend.
Disable the AER interrupt during suspend to prevent this. Clear error status before re-enabling IRQ interrupts during resume so we don't get an interrupt for errors that occurred during the suspend/resume process.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=209149 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295 Link: https://bugzilla.kernel.org/show_bug.cgi?id=218090 Link: https://lore.kernel.org/r/20240416043225.1462548-2-kai.heng.feng@canonical.c... Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com [bhelgaas: drop pci_ancestor_pr3_present() etc, commit log] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/aer.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index c9afe4362835..eeb9ea9044b4 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -1342,6 +1342,22 @@ static int aer_probe(struct pcie_device *dev) return 0; }
+static int aer_suspend(struct pcie_device *dev) +{ + struct aer_rpc *rpc = get_service_data(dev); + + aer_disable_rootport(rpc); + return 0; +} + +static int aer_resume(struct pcie_device *dev) +{ + struct aer_rpc *rpc = get_service_data(dev); + + aer_enable_rootport(rpc); + return 0; +} + /** * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP * @dev: pointer to Root Port, RCEC, or RCiEP @@ -1413,6 +1429,8 @@ static struct pcie_port_service_driver aerdriver = { .service = PCIE_PORT_SERVICE_AER,
.probe = aer_probe, + .suspend = aer_suspend, + .resume = aer_resume, .remove = aer_remove, };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roger Quadros rogerq@kernel.org
[ Upstream commit 0aca19e4037a4143273e90f1b44666b78b4dde9b ]
Some platforms (e.g. ti,j721e-usb, ti,am64-usb) require this bit to be set to workaround a lockup issue with PHY short suspend intervals [1]. Add a platform quirk flag to indicate if Suspend Residency should be enabled.
[1] - https://www.ti.com/lit/er/sprz457h/sprz457h.pdf i2409 - USB: USB2 PHY locks up due to short suspend
Signed-off-by: Roger Quadros rogerq@kernel.org Signed-off-by: Ravi Gunasekaran r-gunasekaran@ti.com Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/20240516044537.16801-2-r-gunasekaran@ti.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/cdns3/core.h | 1 + drivers/usb/cdns3/drd.c | 10 +++++++++- drivers/usb/cdns3/drd.h | 3 +++ 3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/cdns3/core.h b/drivers/usb/cdns3/core.h index 81a9c9d6be08..57d47348dc19 100644 --- a/drivers/usb/cdns3/core.h +++ b/drivers/usb/cdns3/core.h @@ -44,6 +44,7 @@ struct cdns3_platform_data { bool suspend, bool wakeup); unsigned long quirks; #define CDNS3_DEFAULT_PM_RUNTIME_ALLOW BIT(0) +#define CDNS3_DRD_SUSPEND_RESIDENCY_ENABLE BIT(1) };
/** diff --git a/drivers/usb/cdns3/drd.c b/drivers/usb/cdns3/drd.c index ee917f1b091c..1b4ce2da1e4b 100644 --- a/drivers/usb/cdns3/drd.c +++ b/drivers/usb/cdns3/drd.c @@ -389,7 +389,7 @@ static irqreturn_t cdns_drd_irq(int irq, void *data) int cdns_drd_init(struct cdns *cdns) { void __iomem *regs; - u32 state; + u32 state, reg; int ret;
regs = devm_ioremap_resource(cdns->dev, &cdns->otg_res); @@ -433,6 +433,14 @@ int cdns_drd_init(struct cdns *cdns) cdns->otg_irq_regs = (struct cdns_otg_irq_regs __iomem *) &cdns->otg_v1_regs->ien; writel(1, &cdns->otg_v1_regs->simulate); + + if (cdns->pdata && + (cdns->pdata->quirks & CDNS3_DRD_SUSPEND_RESIDENCY_ENABLE)) { + reg = readl(&cdns->otg_v1_regs->susp_ctrl); + reg |= SUSP_CTRL_SUSPEND_RESIDENCY_ENABLE; + writel(reg, &cdns->otg_v1_regs->susp_ctrl); + } + cdns->version = CDNS3_CONTROLLER_V1; } else { dev_err(cdns->dev, "not supporte DID=0x%08x\n", state); diff --git a/drivers/usb/cdns3/drd.h b/drivers/usb/cdns3/drd.h index d72370c321d3..1e2aee14d629 100644 --- a/drivers/usb/cdns3/drd.h +++ b/drivers/usb/cdns3/drd.h @@ -193,6 +193,9 @@ struct cdns_otg_irq_regs { /* OTGREFCLK - bitmasks */ #define OTGREFCLK_STB_CLK_SWITCH_EN BIT(31)
+/* SUPS_CTRL - bitmasks */ +#define SUSP_CTRL_SUSPEND_RESIDENCY_ENABLE BIT(17) + /* OVERRIDE - bitmasks */ #define OVERRIDE_IDPULLUP BIT(0) /* Only for CDNS3_CONTROLLER_V0 version */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King (Oracle) rmk+kernel@armlinux.org.uk
[ Upstream commit 4c49f38e20a57f8abaebdf95b369295b153d1f8e ]
Commit 66600fac7a98 ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data") moved the assignment of tx_skbuff_dma[]'s members to be later in stmmac_tso_xmit().
The buf (dma cookie) and len stored in this structure are passed to dma_unmap_single() by stmmac_tx_clean(). The DMA API requires that the dma cookie passed to dma_unmap_single() is the same as the value returned from dma_map_single(). However, by moving the assignment later, this is not the case when priv->dma_cap.addr64 > 32 as "des" is offset by proto_hdr_len.
This causes problems such as:
dwc-eth-dwmac 2490000.ethernet eth0: Tx DMA map failed
and with DMA_API_DEBUG enabled:
DMA-API: dwc-eth-dwmac 2490000.ethernet: device driver tries to +free DMA memory it has not allocated [device address=0x000000ffffcf65c0] [size=66 bytes]
Fix this by maintaining "des" as the original DMA cookie, and use tso_des to pass the offset DMA cookie to stmmac_tso_allocator().
Full details of the crashes can be found at: https://lore.kernel.org/all/d8112193-0386-4e14-b516-37c2d838171a@nvidia.com/ https://lore.kernel.org/all/klkzp5yn5kq5efgtrow6wbvnc46bcqfxs65nz3qy77ujr5tu...
Reported-by: Jon Hunter jonathanh@nvidia.com Reported-by: Thierry Reding thierry.reding@gmail.com Fixes: 66600fac7a98 ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data") Tested-by: Jon Hunter jonathanh@nvidia.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Furong Xu 0x1207@gmail.com Link: https://patch.msgid.link/E1tJXcx-006N4Z-PC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 853851d5f362..d6ee90fef2ec 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -4119,9 +4119,9 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev) int tmp_pay_len = 0, first_tx; struct stmmac_tx_queue *tx_q; bool has_vlan, set_ic; + dma_addr_t tso_des, des; u8 proto_hdr_len, hdr; u32 pay_len, mss; - dma_addr_t des; int i;
tx_q = &priv->dma_conf.tx_queue[queue]; @@ -4206,14 +4206,15 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
/* If needed take extra descriptors to fill the remaining payload */ tmp_pay_len = pay_len - TSO_MAX_BUFF_SIZE; + tso_des = des; } else { stmmac_set_desc_addr(priv, first, des); tmp_pay_len = pay_len; - des += proto_hdr_len; + tso_des = des + proto_hdr_len; pay_len = 0; }
- stmmac_tso_allocator(priv, des, tmp_pay_len, (nfrags == 0), queue); + stmmac_tso_allocator(priv, tso_des, tmp_pay_len, (nfrags == 0), queue);
/* In case two or more DMA transmit descriptors are allocated for this * non-paged SKB data, the DMA buffer address should be saved to
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 3ff5873602a874035ba28826852bd45393002a08 ]
p2sb_get_devfn() always succeeds, make it return void and remove error checking from its callers.
Reviewed-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20240305094500.23778-1-hdegoede@redhat.com Stable-dep-of: 360c400d0f56 ("p2sb: Do not scan and remove the P2SB device when it is unhidden") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/p2sb.c | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-)
diff --git a/drivers/platform/x86/p2sb.c b/drivers/platform/x86/p2sb.c index 053be5c5e0ca..687e341e3206 100644 --- a/drivers/platform/x86/p2sb.c +++ b/drivers/platform/x86/p2sb.c @@ -43,7 +43,7 @@ struct p2sb_res_cache {
static struct p2sb_res_cache p2sb_resources[NR_P2SB_RES_CACHE];
-static int p2sb_get_devfn(unsigned int *devfn) +static void p2sb_get_devfn(unsigned int *devfn) { unsigned int fn = P2SB_DEVFN_DEFAULT; const struct x86_cpu_id *id; @@ -53,7 +53,6 @@ static int p2sb_get_devfn(unsigned int *devfn) fn = (unsigned int)id->driver_data;
*devfn = fn; - return 0; }
static bool p2sb_valid_resource(const struct resource *res) @@ -132,9 +131,7 @@ static int p2sb_cache_resources(void) int ret;
/* Get devfn for P2SB device itself */ - ret = p2sb_get_devfn(&devfn_p2sb); - if (ret) - return ret; + p2sb_get_devfn(&devfn_p2sb);
bus = p2sb_get_bus(NULL); if (!bus) @@ -191,17 +188,13 @@ static int p2sb_cache_resources(void) int p2sb_bar(struct pci_bus *bus, unsigned int devfn, struct resource *mem) { struct p2sb_res_cache *cache; - int ret;
bus = p2sb_get_bus(bus); if (!bus) return -ENODEV;
- if (!devfn) { - ret = p2sb_get_devfn(&devfn); - if (ret) - return ret; - } + if (!devfn) + p2sb_get_devfn(&devfn);
cache = &p2sb_resources[PCI_FUNC(devfn)]; if (cache->bus_dev_id != bus->dev.id)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
[ Upstream commit 9244524d60ddea55f4df54c51200e8fef2032447 ]
To prepare for the following fix, factor out the code to read the P2SB resource from the cache to the new function p2sb_read_from_cache().
Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20241128002836.373745-2-shinichiro.kawasaki@wdc.co... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Stable-dep-of: 360c400d0f56 ("p2sb: Do not scan and remove the P2SB device when it is unhidden") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/p2sb.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/drivers/platform/x86/p2sb.c b/drivers/platform/x86/p2sb.c index 687e341e3206..d6ee4b34f911 100644 --- a/drivers/platform/x86/p2sb.c +++ b/drivers/platform/x86/p2sb.c @@ -171,6 +171,22 @@ static int p2sb_cache_resources(void) return ret; }
+static int p2sb_read_from_cache(struct pci_bus *bus, unsigned int devfn, + struct resource *mem) +{ + struct p2sb_res_cache *cache = &p2sb_resources[PCI_FUNC(devfn)]; + + if (cache->bus_dev_id != bus->dev.id) + return -ENODEV; + + if (!p2sb_valid_resource(&cache->res)) + return -ENOENT; + + memcpy(mem, &cache->res, sizeof(*mem)); + + return 0; +} + /** * p2sb_bar - Get Primary to Sideband (P2SB) bridge device BAR * @bus: PCI bus to communicate with @@ -187,8 +203,6 @@ static int p2sb_cache_resources(void) */ int p2sb_bar(struct pci_bus *bus, unsigned int devfn, struct resource *mem) { - struct p2sb_res_cache *cache; - bus = p2sb_get_bus(bus); if (!bus) return -ENODEV; @@ -196,15 +210,7 @@ int p2sb_bar(struct pci_bus *bus, unsigned int devfn, struct resource *mem) if (!devfn) p2sb_get_devfn(&devfn);
- cache = &p2sb_resources[PCI_FUNC(devfn)]; - if (cache->bus_dev_id != bus->dev.id) - return -ENODEV; - - if (!p2sb_valid_resource(&cache->res)) - return -ENOENT; - - memcpy(mem, &cache->res, sizeof(*mem)); - return 0; + return p2sb_read_from_cache(bus, devfn, mem); } EXPORT_SYMBOL_GPL(p2sb_bar);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
[ Upstream commit ae3e6ebc5ab046d434c05c58a3e3f7e94441fec2 ]
To prepare for the following fix, introduce the global flag p2sb_hidden_by_bios. Check if the BIOS hides the P2SB device and store the result in the flag. This allows to refer to the check result across functions.
Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20241128002836.373745-3-shinichiro.kawasaki@wdc.co... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Stable-dep-of: 360c400d0f56 ("p2sb: Do not scan and remove the P2SB device when it is unhidden") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/p2sb.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/platform/x86/p2sb.c b/drivers/platform/x86/p2sb.c index d6ee4b34f911..d015ddc9f30e 100644 --- a/drivers/platform/x86/p2sb.c +++ b/drivers/platform/x86/p2sb.c @@ -42,6 +42,7 @@ struct p2sb_res_cache { };
static struct p2sb_res_cache p2sb_resources[NR_P2SB_RES_CACHE]; +static bool p2sb_hidden_by_bios;
static void p2sb_get_devfn(unsigned int *devfn) { @@ -157,13 +158,14 @@ static int p2sb_cache_resources(void) * Unhide the P2SB device here, if needed. */ pci_bus_read_config_dword(bus, devfn_p2sb, P2SBC, &value); - if (value & P2SBC_HIDE) + p2sb_hidden_by_bios = value & P2SBC_HIDE; + if (p2sb_hidden_by_bios) pci_bus_write_config_dword(bus, devfn_p2sb, P2SBC, 0);
ret = p2sb_scan_and_cache(bus, devfn_p2sb);
/* Hide the P2SB device, if it was hidden */ - if (value & P2SBC_HIDE) + if (p2sb_hidden_by_bios) pci_bus_write_config_dword(bus, devfn_p2sb, P2SBC, P2SBC_HIDE);
pci_unlock_rescan_remove();
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
[ Upstream commit 0286070c74ee48391fc07f7f617460479472d221 ]
To prepare for the following fix, move the code to hide and unhide the P2SB device from p2sb_cache_resources() to p2sb_scan_and_cache().
Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20241128002836.373745-4-shinichiro.kawasaki@wdc.co... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Stable-dep-of: 360c400d0f56 ("p2sb: Do not scan and remove the P2SB device when it is unhidden") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/p2sb.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/platform/x86/p2sb.c b/drivers/platform/x86/p2sb.c index d015ddc9f30e..6fb76b82ecce 100644 --- a/drivers/platform/x86/p2sb.c +++ b/drivers/platform/x86/p2sb.c @@ -97,6 +97,14 @@ static void p2sb_scan_and_cache_devfn(struct pci_bus *bus, unsigned int devfn)
static int p2sb_scan_and_cache(struct pci_bus *bus, unsigned int devfn) { + /* + * The BIOS prevents the P2SB device from being enumerated by the PCI + * subsystem, so we need to unhide and hide it back to lookup the BAR. + * Unhide the P2SB device here, if needed. + */ + if (p2sb_hidden_by_bios) + pci_bus_write_config_dword(bus, devfn, P2SBC, 0); + /* Scan the P2SB device and cache its BAR0 */ p2sb_scan_and_cache_devfn(bus, devfn);
@@ -104,6 +112,10 @@ static int p2sb_scan_and_cache(struct pci_bus *bus, unsigned int devfn) if (devfn == P2SB_DEVFN_GOLDMONT) p2sb_scan_and_cache_devfn(bus, SPI_DEVFN_GOLDMONT);
+ /* Hide the P2SB device, if it was hidden */ + if (p2sb_hidden_by_bios) + pci_bus_write_config_dword(bus, devfn, P2SBC, P2SBC_HIDE); + if (!p2sb_valid_resource(&p2sb_resources[PCI_FUNC(devfn)].res)) return -ENOENT;
@@ -152,22 +164,11 @@ static int p2sb_cache_resources(void) */ pci_lock_rescan_remove();
- /* - * The BIOS prevents the P2SB device from being enumerated by the PCI - * subsystem, so we need to unhide and hide it back to lookup the BAR. - * Unhide the P2SB device here, if needed. - */ pci_bus_read_config_dword(bus, devfn_p2sb, P2SBC, &value); p2sb_hidden_by_bios = value & P2SBC_HIDE; - if (p2sb_hidden_by_bios) - pci_bus_write_config_dword(bus, devfn_p2sb, P2SBC, 0);
ret = p2sb_scan_and_cache(bus, devfn_p2sb);
- /* Hide the P2SB device, if it was hidden */ - if (p2sb_hidden_by_bios) - pci_bus_write_config_dword(bus, devfn_p2sb, P2SBC, P2SBC_HIDE); - pci_unlock_rescan_remove();
return ret;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
[ Upstream commit 360c400d0f568636c1b98d1d5f9f49aa3d420c70 ]
When drivers access P2SB device resources, it calls p2sb_bar(). Before the commit 5913320eb0b3 ("platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe"), p2sb_bar() obtained the resources and then called pci_stop_and_remove_bus_device() for clean up. Then the P2SB device disappeared. The commit 5913320eb0b3 introduced the P2SB device resource cache feature in the boot process. During the resource cache, pci_stop_and_remove_bus_device() is called for the P2SB device, then the P2SB device disappears regardless of whether p2sb_bar() is called or not. Such P2SB device disappearance caused a confusion [1]. To avoid the confusion, avoid the pci_stop_and_remove_bus_device() call when the BIOS does not hide the P2SB device.
For that purpose, cache the P2SB device resources only if the BIOS hides the P2SB device. Call p2sb_scan_and_cache() only if p2sb_hidden_by_bios is true. This allows removing two branches from p2sb_scan_and_cache(). When p2sb_bar() is called, get the resources from the cache if the P2SB device is hidden. Otherwise, read the resources from the unhidden P2SB device.
Reported-by: Daniel Walker (danielwa) danielwa@cisco.com Closes: https://lore.kernel.org/lkml/ZzTI+biIUTvFT6NC@goliath/ [1] Fixes: 5913320eb0b3 ("platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe") Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20241128002836.373745-5-shinichiro.kawasaki@wdc.co... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/p2sb.c | 42 +++++++++++++++++++++++++++++-------- 1 file changed, 33 insertions(+), 9 deletions(-)
diff --git a/drivers/platform/x86/p2sb.c b/drivers/platform/x86/p2sb.c index 6fb76b82ecce..eff920de31c2 100644 --- a/drivers/platform/x86/p2sb.c +++ b/drivers/platform/x86/p2sb.c @@ -100,10 +100,8 @@ static int p2sb_scan_and_cache(struct pci_bus *bus, unsigned int devfn) /* * The BIOS prevents the P2SB device from being enumerated by the PCI * subsystem, so we need to unhide and hide it back to lookup the BAR. - * Unhide the P2SB device here, if needed. */ - if (p2sb_hidden_by_bios) - pci_bus_write_config_dword(bus, devfn, P2SBC, 0); + pci_bus_write_config_dword(bus, devfn, P2SBC, 0);
/* Scan the P2SB device and cache its BAR0 */ p2sb_scan_and_cache_devfn(bus, devfn); @@ -112,9 +110,7 @@ static int p2sb_scan_and_cache(struct pci_bus *bus, unsigned int devfn) if (devfn == P2SB_DEVFN_GOLDMONT) p2sb_scan_and_cache_devfn(bus, SPI_DEVFN_GOLDMONT);
- /* Hide the P2SB device, if it was hidden */ - if (p2sb_hidden_by_bios) - pci_bus_write_config_dword(bus, devfn, P2SBC, P2SBC_HIDE); + pci_bus_write_config_dword(bus, devfn, P2SBC, P2SBC_HIDE);
if (!p2sb_valid_resource(&p2sb_resources[PCI_FUNC(devfn)].res)) return -ENOENT; @@ -141,7 +137,7 @@ static int p2sb_cache_resources(void) u32 value = P2SBC_HIDE; struct pci_bus *bus; u16 class; - int ret; + int ret = 0;
/* Get devfn for P2SB device itself */ p2sb_get_devfn(&devfn_p2sb); @@ -167,7 +163,12 @@ static int p2sb_cache_resources(void) pci_bus_read_config_dword(bus, devfn_p2sb, P2SBC, &value); p2sb_hidden_by_bios = value & P2SBC_HIDE;
- ret = p2sb_scan_and_cache(bus, devfn_p2sb); + /* + * If the BIOS does not hide the P2SB device then its resources + * are accesilble. Cache them only if the P2SB device is hidden. + */ + if (p2sb_hidden_by_bios) + ret = p2sb_scan_and_cache(bus, devfn_p2sb);
pci_unlock_rescan_remove();
@@ -190,6 +191,26 @@ static int p2sb_read_from_cache(struct pci_bus *bus, unsigned int devfn, return 0; }
+static int p2sb_read_from_dev(struct pci_bus *bus, unsigned int devfn, + struct resource *mem) +{ + struct pci_dev *pdev; + int ret = 0; + + pdev = pci_get_slot(bus, devfn); + if (!pdev) + return -ENODEV; + + if (p2sb_valid_resource(pci_resource_n(pdev, 0))) + p2sb_read_bar0(pdev, mem); + else + ret = -ENOENT; + + pci_dev_put(pdev); + + return ret; +} + /** * p2sb_bar - Get Primary to Sideband (P2SB) bridge device BAR * @bus: PCI bus to communicate with @@ -213,7 +234,10 @@ int p2sb_bar(struct pci_bus *bus, unsigned int devfn, struct resource *mem) if (!devfn) p2sb_get_devfn(&devfn);
- return p2sb_read_from_cache(bus, devfn, mem); + if (p2sb_hidden_by_bios) + return p2sb_read_from_cache(bus, devfn, mem); + + return p2sb_read_from_dev(bus, devfn, mem); } EXPORT_SYMBOL_GPL(p2sb_bar);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Riabchun ferr.lambarginio@gmail.com
[ Upstream commit 7363f2d4c18557c99c536b70489187bb4e05c412 ]
Since commit f63b94be6942 ("i2c: pnx: Fix potential deadlock warning from del_timer_sync() call in isr") jiffies are stored in i2c_pnx_algo_data.timeout, but wait_timeout and wait_reset are still using it as milliseconds. Convert jiffies back to milliseconds to wait for the expected amount of time.
Fixes: f63b94be6942 ("i2c: pnx: Fix potential deadlock warning from del_timer_sync() call in isr") Signed-off-by: Vladimir Riabchun ferr.lambarginio@gmail.com Signed-off-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-pnx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-pnx.c b/drivers/i2c/busses/i2c-pnx.c index f448505d5468..2b9cc86fb3ab 100644 --- a/drivers/i2c/busses/i2c-pnx.c +++ b/drivers/i2c/busses/i2c-pnx.c @@ -95,7 +95,7 @@ enum {
static inline int wait_timeout(struct i2c_pnx_algo_data *data) { - long timeout = data->timeout; + long timeout = jiffies_to_msecs(data->timeout); while (timeout > 0 && (ioread32(I2C_REG_STS(data)) & mstatus_active)) { mdelay(1); @@ -106,7 +106,7 @@ static inline int wait_timeout(struct i2c_pnx_algo_data *data)
static inline int wait_reset(struct i2c_pnx_algo_data *data) { - long timeout = data->timeout; + long timeout = jiffies_to_msecs(data->timeout); while (timeout > 0 && (ioread32(I2C_REG_CTL(data)) & mcntrl_reset)) { mdelay(1);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
commit 9ff4490e2ab364ec433f15668ef3f5edfb53feca upstream.
oss.sgi.com is long dead, refer to the current linux-xfs list instead.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/ABI/testing/sysfs-fs-xfs | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-fs-xfs b/Documentation/ABI/testing/sysfs-fs-xfs index f704925f6fe9..82d8e2f79834 100644 --- a/Documentation/ABI/testing/sysfs-fs-xfs +++ b/Documentation/ABI/testing/sysfs-fs-xfs @@ -1,7 +1,7 @@ What: /sys/fs/xfs/<disk>/log/log_head_lsn Date: July 2014 KernelVersion: 3.17 -Contact: xfs@oss.sgi.com +Contact: linux-xfs@vger.kernel.org Description: The log sequence number (LSN) of the current head of the log. The LSN is exported in "cycle:basic block" format. @@ -10,7 +10,7 @@ Users: xfstests What: /sys/fs/xfs/<disk>/log/log_tail_lsn Date: July 2014 KernelVersion: 3.17 -Contact: xfs@oss.sgi.com +Contact: linux-xfs@vger.kernel.org Description: The log sequence number (LSN) of the current tail of the log. The LSN is exported in "cycle:basic block" format. @@ -18,7 +18,7 @@ Description: What: /sys/fs/xfs/<disk>/log/reserve_grant_head Date: July 2014 KernelVersion: 3.17 -Contact: xfs@oss.sgi.com +Contact: linux-xfs@vger.kernel.org Description: The current state of the log reserve grant head. It represents the total log reservation of all currently @@ -29,7 +29,7 @@ Users: xfstests What: /sys/fs/xfs/<disk>/log/write_grant_head Date: July 2014 KernelVersion: 3.17 -Contact: xfs@oss.sgi.com +Contact: linux-xfs@vger.kernel.org Description: The current state of the log write grant head. It represents the total log reservation of all currently
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 150bb10a28b9c8709ae227fc898d9cf6136faa1e upstream.
generic/388 has an annoying tendency to fail like this during log recovery:
XFS (sda4): Unmounting Filesystem 435fe39b-82b6-46ef-be56-819499585130 XFS (sda4): Mounting V5 Filesystem 435fe39b-82b6-46ef-be56-819499585130 XFS (sda4): Starting recovery (logdev: internal) 00000000: 49 4e 81 b6 03 02 00 00 00 00 00 07 00 00 00 07 IN.............. 00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 10 ................ 00000020: 35 9a 8b c1 3e 6e 81 00 35 9a 8b c1 3f dc b7 00 5...>n..5...?... 00000030: 35 9a 8b c1 3f dc b7 00 00 00 00 00 00 3c 86 4f 5...?........<.O 00000040: 00 00 00 00 00 00 02 f3 00 00 00 00 00 00 00 00 ................ 00000050: 00 00 1f 01 00 00 00 00 00 00 00 02 b2 74 c9 0b .............t.. 00000060: ff ff ff ff d7 45 73 10 00 00 00 00 00 00 00 2d .....Es........- 00000070: 00 00 07 92 00 01 fe 30 00 00 00 00 00 00 00 1a .......0........ 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000090: 35 9a 8b c1 3b 55 0c 00 00 00 00 00 04 27 b2 d1 5...;U.......'.. 000000a0: 43 5f e3 9b 82 b6 46 ef be 56 81 94 99 58 51 30 C_....F..V...XQ0 XFS (sda4): Internal error Bad dinode after recovery at line 539 of file fs/xfs/xfs_inode_item_recover.c. Caller xlog_recover_items_pass2+0x4e/0xc0 [xfs] CPU: 0 PID: 2189311 Comm: mount Not tainted 6.9.0-rc4-djwx #rc4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20171121_152543-x86-ol7-builder-01.us.oracle.com-4.el7.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4f/0x60 xfs_corruption_error+0x90/0xa0 xlog_recover_inode_commit_pass2+0x5f1/0xb00 xlog_recover_items_pass2+0x4e/0xc0 xlog_recover_commit_trans+0x2db/0x350 xlog_recovery_process_trans+0xab/0xe0 xlog_recover_process_data+0xa7/0x130 xlog_do_recovery_pass+0x398/0x840 xlog_do_log_recovery+0x62/0xc0 xlog_do_recover+0x34/0x1d0 xlog_recover+0xe9/0x1a0 xfs_log_mount+0xff/0x260 xfs_mountfs+0x5d9/0xb60 xfs_fs_fill_super+0x76b/0xa30 get_tree_bdev+0x124/0x1d0 vfs_get_tree+0x17/0xa0 path_mount+0x72b/0xa90 __x64_sys_mount+0x112/0x150 do_syscall_64+0x49/0x100 entry_SYSCALL_64_after_hwframe+0x4b/0x53 </TASK> XFS (sda4): Corruption detected. Unmount and run xfs_repair XFS (sda4): Metadata corruption detected at xfs_dinode_verify.part.0+0x739/0x920 [xfs], inode 0x427b2d1 XFS (sda4): Filesystem has been shut down due to log error (0x2). XFS (sda4): Please unmount the filesystem and rectify the problem(s). XFS (sda4): log mount/recovery failed: error -117 XFS (sda4): log mount failed
This inode log item recovery failing the dinode verifier after replaying the contents of the inode log item into the ondisk inode. Looking back into what the kernel was doing at the time of the fs shutdown, a thread was in the middle of running a series of transactions, each of which committed changes to the inode.
At some point in the middle of that chain, an invalid (at least according to the verifier) change was committed. Had the filesystem not shut down in the middle of the chain, a subsequent transaction would have corrected the invalid state and nobody would have noticed. But that's not what happened here. Instead, the invalid inode state was committed to the ondisk log, so log recovery tripped over it.
The actual defect here was an overzealous inode verifier, which was fixed in a separate patch. This patch adds some transaction precommit functions for CONFIG_XFS_DEBUG=y mode so that we can detect these kinds of transient errors at transaction commit time, where it's much easier to find the root cause.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/Kconfig | 12 ++++++++++++ fs/xfs/xfs.h | 4 ++++ fs/xfs/xfs_buf_item.c | 32 ++++++++++++++++++++++++++++++++ fs/xfs/xfs_dquot_item.c | 31 +++++++++++++++++++++++++++++++ fs/xfs/xfs_inode_item.c | 32 ++++++++++++++++++++++++++++++++ 5 files changed, 111 insertions(+)
diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig index 567fb37274d3..ced0e6272aef 100644 --- a/fs/xfs/Kconfig +++ b/fs/xfs/Kconfig @@ -204,6 +204,18 @@ config XFS_DEBUG
Say N unless you are an XFS developer, or you play one on TV.
+config XFS_DEBUG_EXPENSIVE + bool "XFS expensive debugging checks" + depends on XFS_FS && XFS_DEBUG + help + Say Y here to get an XFS build with expensive debugging checks + enabled. These checks may affect performance significantly. + + Note that the resulting code will be HUGER and SLOWER, and probably + not useful unless you are debugging a particular problem. + + Say N unless you are an XFS developer, or you play one on TV. + config XFS_ASSERT_FATAL bool "XFS fatal asserts" default y diff --git a/fs/xfs/xfs.h b/fs/xfs/xfs.h index f6ffb4f248f7..9355ccad9503 100644 --- a/fs/xfs/xfs.h +++ b/fs/xfs/xfs.h @@ -10,6 +10,10 @@ #define DEBUG 1 #endif
+#ifdef CONFIG_XFS_DEBUG_EXPENSIVE +#define DEBUG_EXPENSIVE 1 +#endif + #ifdef CONFIG_XFS_ASSERT_FATAL #define XFS_ASSERT_FATAL 1 #endif diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index 023d4e0385dd..b02ce568de0c 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -22,6 +22,7 @@ #include "xfs_trace.h" #include "xfs_log.h" #include "xfs_log_priv.h" +#include "xfs_error.h"
struct kmem_cache *xfs_buf_item_cache; @@ -781,8 +782,39 @@ xfs_buf_item_committed( return lsn; }
+#ifdef DEBUG_EXPENSIVE +static int +xfs_buf_item_precommit( + struct xfs_trans *tp, + struct xfs_log_item *lip) +{ + struct xfs_buf_log_item *bip = BUF_ITEM(lip); + struct xfs_buf *bp = bip->bli_buf; + struct xfs_mount *mp = bp->b_mount; + xfs_failaddr_t fa; + + if (!bp->b_ops || !bp->b_ops->verify_struct) + return 0; + if (bip->bli_flags & XFS_BLI_STALE) + return 0; + + fa = bp->b_ops->verify_struct(bp); + if (fa) { + xfs_buf_verifier_error(bp, -EFSCORRUPTED, bp->b_ops->name, + bp->b_addr, BBTOB(bp->b_length), fa); + xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); + ASSERT(fa == NULL); + } + + return 0; +} +#else +# define xfs_buf_item_precommit NULL +#endif + static const struct xfs_item_ops xfs_buf_item_ops = { .iop_size = xfs_buf_item_size, + .iop_precommit = xfs_buf_item_precommit, .iop_format = xfs_buf_item_format, .iop_pin = xfs_buf_item_pin, .iop_unpin = xfs_buf_item_unpin, diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c index 6a1aae799cf1..7d19091215b0 100644 --- a/fs/xfs/xfs_dquot_item.c +++ b/fs/xfs/xfs_dquot_item.c @@ -17,6 +17,7 @@ #include "xfs_trans_priv.h" #include "xfs_qm.h" #include "xfs_log.h" +#include "xfs_error.h"
static inline struct xfs_dq_logitem *DQUOT_ITEM(struct xfs_log_item *lip) { @@ -193,8 +194,38 @@ xfs_qm_dquot_logitem_committing( return xfs_qm_dquot_logitem_release(lip); }
+#ifdef DEBUG_EXPENSIVE +static int +xfs_qm_dquot_logitem_precommit( + struct xfs_trans *tp, + struct xfs_log_item *lip) +{ + struct xfs_dquot *dqp = DQUOT_ITEM(lip)->qli_dquot; + struct xfs_mount *mp = dqp->q_mount; + struct xfs_disk_dquot ddq = { }; + xfs_failaddr_t fa; + + xfs_dquot_to_disk(&ddq, dqp); + fa = xfs_dquot_verify(mp, &ddq, dqp->q_id); + if (fa) { + XFS_CORRUPTION_ERROR("Bad dquot during logging", + XFS_ERRLEVEL_LOW, mp, &ddq, sizeof(ddq)); + xfs_alert(mp, + "Metadata corruption detected at %pS, dquot 0x%x", + fa, dqp->q_id); + xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); + ASSERT(fa == NULL); + } + + return 0; +} +#else +# define xfs_qm_dquot_logitem_precommit NULL +#endif + static const struct xfs_item_ops xfs_dquot_item_ops = { .iop_size = xfs_qm_dquot_logitem_size, + .iop_precommit = xfs_qm_dquot_logitem_precommit, .iop_format = xfs_qm_dquot_logitem_format, .iop_pin = xfs_qm_dquot_logitem_pin, .iop_unpin = xfs_qm_dquot_logitem_unpin, diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c index 155a8b312875..b55ad3b7b113 100644 --- a/fs/xfs/xfs_inode_item.c +++ b/fs/xfs/xfs_inode_item.c @@ -36,6 +36,36 @@ xfs_inode_item_sort( return INODE_ITEM(lip)->ili_inode->i_ino; }
+#ifdef DEBUG_EXPENSIVE +static void +xfs_inode_item_precommit_check( + struct xfs_inode *ip) +{ + struct xfs_mount *mp = ip->i_mount; + struct xfs_dinode *dip; + xfs_failaddr_t fa; + + dip = kzalloc(mp->m_sb.sb_inodesize, GFP_KERNEL | GFP_NOFS); + if (!dip) { + ASSERT(dip != NULL); + return; + } + + xfs_inode_to_disk(ip, dip, 0); + xfs_dinode_calc_crc(mp, dip); + fa = xfs_dinode_verify(mp, ip->i_ino, dip); + if (fa) { + xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip, + sizeof(*dip), fa); + xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); + ASSERT(fa == NULL); + } + kfree(dip); +} +#else +# define xfs_inode_item_precommit_check(ip) ((void)0) +#endif + /* * Prior to finally logging the inode, we have to ensure that all the * per-modification inode state changes are applied. This includes VFS inode @@ -168,6 +198,8 @@ xfs_inode_item_precommit( iip->ili_fields |= (flags | iip->ili_last_fields); spin_unlock(&iip->ili_lock);
+ xfs_inode_item_precommit_check(ip); + /* * We are done with the log item transaction dirty state, so clear it so * that it doesn't pollute future transactions.
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 24a4e1cb322e2bf0f3a1afd1978b610a23aa8f36 upstream.
I noticed that callers of xfs_qm_vop_dqalloc use the following code to compute the anticipated uid of the new file:
mapped_fsuid(idmap, &init_user_ns);
whereas the VFS uses a slightly different computation for actually assigning i_uid:
mapped_fsuid(idmap, i_user_ns(inode));
Technically, these are not the same things. According to Christian Brauner, the only time that inode->i_sb->s_user_ns != &init_user_ns is when the filesystem was mounted in a new mount namespace by an unpriviledged user. XFS does not allow this, which is why we've never seen bug reports about quotas being incorrect or the uid checks in xfs_qm_vop_create_dqattach tripping debug assertions.
However, this /is/ a logic bomb, so let's make the code consistent.
Link: https://lore.kernel.org/linux-fsdevel/20240617-weitblick-gefertigt-4a41f3711... Fixes: c14329d39f2d ("fs: port fs{g,u}id helpers to mnt_idmap") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_inode.c | 16 ++++++++++------ fs/xfs/xfs_symlink.c | 8 +++++--- 2 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 7aa73855fab6..1e50cc9a29db 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -982,10 +982,12 @@ xfs_create( prid = xfs_get_initial_prid(dp);
/* - * Make sure that we have allocated dquot(s) on disk. + * Make sure that we have allocated dquot(s) on disk. The uid/gid + * computation code must match what the VFS uses to assign i_[ug]id. + * INHERIT adjusts the gid computation for setgid/grpid systems. */ - error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(idmap, &init_user_ns), - mapped_fsgid(idmap, &init_user_ns), prid, + error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(idmap, i_user_ns(VFS_I(dp))), + mapped_fsgid(idmap, i_user_ns(VFS_I(dp))), prid, XFS_QMOPT_QUOTALL | XFS_QMOPT_INHERIT, &udqp, &gdqp, &pdqp); if (error) @@ -1131,10 +1133,12 @@ xfs_create_tmpfile( prid = xfs_get_initial_prid(dp);
/* - * Make sure that we have allocated dquot(s) on disk. + * Make sure that we have allocated dquot(s) on disk. The uid/gid + * computation code must match what the VFS uses to assign i_[ug]id. + * INHERIT adjusts the gid computation for setgid/grpid systems. */ - error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(idmap, &init_user_ns), - mapped_fsgid(idmap, &init_user_ns), prid, + error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(idmap, i_user_ns(VFS_I(dp))), + mapped_fsgid(idmap, i_user_ns(VFS_I(dp))), prid, XFS_QMOPT_QUOTALL | XFS_QMOPT_INHERIT, &udqp, &gdqp, &pdqp); if (error) diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c index 85e433df6a3f..b08be64dd10b 100644 --- a/fs/xfs/xfs_symlink.c +++ b/fs/xfs/xfs_symlink.c @@ -191,10 +191,12 @@ xfs_symlink( prid = xfs_get_initial_prid(dp);
/* - * Make sure that we have allocated dquot(s) on disk. + * Make sure that we have allocated dquot(s) on disk. The uid/gid + * computation code must match what the VFS uses to assign i_[ug]id. + * INHERIT adjusts the gid computation for setgid/grpid systems. */ - error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(idmap, &init_user_ns), - mapped_fsgid(idmap, &init_user_ns), prid, + error = xfs_qm_vop_dqalloc(dp, mapped_fsuid(idmap, i_user_ns(VFS_I(dp))), + mapped_fsgid(idmap, i_user_ns(VFS_I(dp))), prid, XFS_QMOPT_QUOTALL | XFS_QMOPT_INHERIT, &udqp, &gdqp, &pdqp); if (error)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 00acb28d96746f78389f23a7b5309a917b45c12f upstream.
[backport: dependency of d3b689d and f23660f]
Move the two public symbols in xfs_file.c to xfs_file.h. We're about to add more public symbols in that source file, so let's finally create the header file.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_file.c | 1 + fs/xfs/xfs_file.h | 12 ++++++++++++ fs/xfs/xfs_ioctl.c | 1 + fs/xfs/xfs_iops.c | 1 + fs/xfs/xfs_iops.h | 3 --- 5 files changed, 15 insertions(+), 3 deletions(-) create mode 100644 fs/xfs/xfs_file.h
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 16769c22c070..8dcbcf965b2c 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -24,6 +24,7 @@ #include "xfs_pnfs.h" #include "xfs_iomap.h" #include "xfs_reflink.h" +#include "xfs_file.h"
#include <linux/dax.h> #include <linux/falloc.h> diff --git a/fs/xfs/xfs_file.h b/fs/xfs/xfs_file.h new file mode 100644 index 000000000000..7d39e3eca56d --- /dev/null +++ b/fs/xfs/xfs_file.h @@ -0,0 +1,12 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2000-2005 Silicon Graphics, Inc. + * All Rights Reserved. + */ +#ifndef __XFS_FILE_H__ +#define __XFS_FILE_H__ + +extern const struct file_operations xfs_file_operations; +extern const struct file_operations xfs_dir_file_operations; + +#endif /* __XFS_FILE_H__ */ diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 535f6d38cdb5..df4bf0d56aad 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -38,6 +38,7 @@ #include "xfs_reflink.h" #include "xfs_ioctl.h" #include "xfs_xattr.h" +#include "xfs_file.h"
#include <linux/mount.h> #include <linux/namei.h> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index b8ec045708c3..f9466311dfea 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -25,6 +25,7 @@ #include "xfs_error.h" #include "xfs_ioctl.h" #include "xfs_xattr.h" +#include "xfs_file.h"
#include <linux/posix_acl.h> #include <linux/security.h> diff --git a/fs/xfs/xfs_iops.h b/fs/xfs/xfs_iops.h index 7f84a0843b24..52d6d510a21d 100644 --- a/fs/xfs/xfs_iops.h +++ b/fs/xfs/xfs_iops.h @@ -8,9 +8,6 @@
struct xfs_inode;
-extern const struct file_operations xfs_file_operations; -extern const struct file_operations xfs_dir_file_operations; - extern ssize_t xfs_vn_listxattr(struct dentry *, char *data, size_t size);
int xfs_vn_setattr_size(struct mnt_idmap *idmap,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit ee20808d848c87a51e176706d81b95a21747d6cf upstream.
[backport: dependency of d3b689d and f23660f]
Create a new helper function to calculate the fundamental allocation unit (i.e. the smallest unit of space we can allocate) of a file. Things are going to get hairy with range-exchange on the realtime device, so prepare for this now.
Remove the static attribute from xfs_is_falloc_aligned since the next patch will need it.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_file.c | 32 ++++++++++++-------------------- fs/xfs/xfs_file.h | 3 +++ fs/xfs/xfs_inode.c | 13 +++++++++++++ fs/xfs/xfs_inode.h | 2 ++ 4 files changed, 30 insertions(+), 20 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 8dcbcf965b2c..3b9d43d5c746 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -39,33 +39,25 @@ static const struct vm_operations_struct xfs_file_vm_ops; * Decide if the given file range is aligned to the size of the fundamental * allocation unit for the file. */ -static bool +bool xfs_is_falloc_aligned( struct xfs_inode *ip, loff_t pos, long long int len) { - struct xfs_mount *mp = ip->i_mount; - uint64_t mask; - - if (XFS_IS_REALTIME_INODE(ip)) { - if (!is_power_of_2(mp->m_sb.sb_rextsize)) { - u64 rextbytes; - u32 mod; - - rextbytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize); - div_u64_rem(pos, rextbytes, &mod); - if (mod) - return false; - div_u64_rem(len, rextbytes, &mod); - return mod == 0; - } - mask = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize) - 1; - } else { - mask = mp->m_sb.sb_blocksize - 1; + unsigned int alloc_unit = xfs_inode_alloc_unitsize(ip); + + if (!is_power_of_2(alloc_unit)) { + u32 mod; + + div_u64_rem(pos, alloc_unit, &mod); + if (mod) + return false; + div_u64_rem(len, alloc_unit, &mod); + return mod == 0; }
- return !((pos | len) & mask); + return !((pos | len) & (alloc_unit - 1)); }
/* diff --git a/fs/xfs/xfs_file.h b/fs/xfs/xfs_file.h index 7d39e3eca56d..2ad91f755caf 100644 --- a/fs/xfs/xfs_file.h +++ b/fs/xfs/xfs_file.h @@ -9,4 +9,7 @@ extern const struct file_operations xfs_file_operations; extern const struct file_operations xfs_dir_file_operations;
+bool xfs_is_falloc_aligned(struct xfs_inode *ip, loff_t pos, + long long int len); + #endif /* __XFS_FILE_H__ */ diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 1e50cc9a29db..6f7dca1c14c7 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -3782,3 +3782,16 @@ xfs_inode_reload_unlinked(
return error; } + +/* Returns the size of fundamental allocation unit for a file, in bytes. */ +unsigned int +xfs_inode_alloc_unitsize( + struct xfs_inode *ip) +{ + unsigned int blocks = 1; + + if (XFS_IS_REALTIME_INODE(ip)) + blocks = ip->i_mount->m_sb.sb_rextsize; + + return XFS_FSB_TO_B(ip->i_mount, blocks); +} diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 3beb470f1892..0f2999b84e7d 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -622,4 +622,6 @@ xfs_inode_unlinked_incomplete( int xfs_inode_reload_unlinked_bucket(struct xfs_trans *tp, struct xfs_inode *ip); int xfs_inode_reload_unlinked(struct xfs_inode *ip);
+unsigned int xfs_inode_alloc_unitsize(struct xfs_inode *ip); + #endif /* __XFS_INODE_H__ */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Garry john.g.garry@oracle.com
commit d3b689d7c711a9f36d3e48db9eaa75784a892f4c upstream.
Currently xfs_flush_unmap_range() does unmap for a full RT extent range, which we also want to ensure is clean and idle.
This code change is originally from Dave Chinner.
Reviewed-by: Christoph Hellwig hch@lst.de4 Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: John Garry john.g.garry@oracle.com Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_bmap_util.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index f9d72d8e3c35..7336402f1efa 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -963,14 +963,18 @@ xfs_flush_unmap_range( xfs_off_t offset, xfs_off_t len) { - struct xfs_mount *mp = ip->i_mount; struct inode *inode = VFS_I(ip); xfs_off_t rounding, start, end; int error;
- rounding = max_t(xfs_off_t, mp->m_sb.sb_blocksize, PAGE_SIZE); - start = round_down(offset, rounding); - end = round_up(offset + len, rounding) - 1; + /* + * Make sure we extend the flush out to extent alignment + * boundaries so any extent range overlapping the start/end + * of the modification we are about to do is clean and idle. + */ + rounding = max_t(xfs_off_t, xfs_inode_alloc_unitsize(ip), PAGE_SIZE); + start = rounddown_64(offset, rounding); + end = roundup_64(offset + len, rounding) - 1;
error = filemap_write_and_wait_range(inode->i_mapping, start, end); if (error)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Garry john.g.garry@oracle.com
commit f23660f059470ec7043748da7641e84183c23bc8 upstream.
The RT extent range must be considered in the xfs_flush_unmap_range() call to stabilize the boundary.
This code change is originally from Dave Chinner.
Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: John Garry john.g.garry@oracle.com Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_bmap_util.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index 7336402f1efa..1fa10a83da0b 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -1059,7 +1059,7 @@ xfs_prepare_shift( struct xfs_inode *ip, loff_t offset) { - struct xfs_mount *mp = ip->i_mount; + unsigned int rounding; int error;
/* @@ -1077,11 +1077,13 @@ xfs_prepare_shift( * with the full range of the operation. If we don't, a COW writeback * completion could race with an insert, front merge with the start * extent (after split) during the shift and corrupt the file. Start - * with the block just prior to the start to stabilize the boundary. + * with the allocation unit just prior to the start to stabilize the + * boundary. */ - offset = round_down(offset, mp->m_sb.sb_blocksize); + rounding = xfs_inode_alloc_unitsize(ip); + offset = rounddown_64(offset, rounding); if (offset) - offset -= mp->m_sb.sb_blocksize; + offset -= rounding;
/* * Writeback and invalidate cache for the remainder of the file as we're
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: lei lu llfamsec@gmail.com
commit 0c7fcdb6d06cdf8b19b57c17605215b06afa864a upstream.
This adds sanity checks for xfs_dir2_data_unused and xfs_dir2_data_entry to make sure don't stray beyond valid memory region. Before patching, the loop simply checks that the start offset of the dup and dep is within the range. So in a crafted image, if last entry is xfs_dir2_data_unused, we can change dup->length to dup->length-1 and leave 1 byte of space. In the next traversal, this space will be considered as dup or dep. We may encounter an out of bound read when accessing the fixed members.
In the patch, we make sure that the remaining bytes large enough to hold an unused entry before accessing xfs_dir2_data_unused and xfs_dir2_data_unused is XFS_DIR2_DATA_ALIGN byte aligned. We also make sure that the remaining bytes large enough to hold a dirent with a single-byte name before accessing xfs_dir2_data_entry.
Signed-off-by: lei lu llfamsec@gmail.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_dir2_data.c | 31 ++++++++++++++++++++++++++----- fs/xfs/libxfs/xfs_dir2_priv.h | 7 +++++++ 2 files changed, 33 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_dir2_data.c b/fs/xfs/libxfs/xfs_dir2_data.c index dbcf58979a59..e1d5da6d8d4a 100644 --- a/fs/xfs/libxfs/xfs_dir2_data.c +++ b/fs/xfs/libxfs/xfs_dir2_data.c @@ -177,6 +177,14 @@ __xfs_dir3_data_check( while (offset < end) { struct xfs_dir2_data_unused *dup = bp->b_addr + offset; struct xfs_dir2_data_entry *dep = bp->b_addr + offset; + unsigned int reclen; + + /* + * Are the remaining bytes large enough to hold an + * unused entry? + */ + if (offset > end - xfs_dir2_data_unusedsize(1)) + return __this_address;
/* * If it's unused, look for the space in the bestfree table. @@ -186,9 +194,13 @@ __xfs_dir3_data_check( if (be16_to_cpu(dup->freetag) == XFS_DIR2_DATA_FREE_TAG) { xfs_failaddr_t fa;
+ reclen = xfs_dir2_data_unusedsize( + be16_to_cpu(dup->length)); if (lastfree != 0) return __this_address; - if (offset + be16_to_cpu(dup->length) > end) + if (be16_to_cpu(dup->length) != reclen) + return __this_address; + if (offset + reclen > end) return __this_address; if (be16_to_cpu(*xfs_dir2_data_unused_tag_p(dup)) != offset) @@ -206,10 +218,18 @@ __xfs_dir3_data_check( be16_to_cpu(bf[2].length)) return __this_address; } - offset += be16_to_cpu(dup->length); + offset += reclen; lastfree = 1; continue; } + + /* + * This is not an unused entry. Are the remaining bytes + * large enough for a dirent with a single-byte name? + */ + if (offset > end - xfs_dir2_data_entsize(mp, 1)) + return __this_address; + /* * It's a real entry. Validate the fields. * If this is a block directory then make sure it's @@ -218,9 +238,10 @@ __xfs_dir3_data_check( */ if (dep->namelen == 0) return __this_address; - if (!xfs_verify_dir_ino(mp, be64_to_cpu(dep->inumber))) + reclen = xfs_dir2_data_entsize(mp, dep->namelen); + if (offset + reclen > end) return __this_address; - if (offset + xfs_dir2_data_entsize(mp, dep->namelen) > end) + if (!xfs_verify_dir_ino(mp, be64_to_cpu(dep->inumber))) return __this_address; if (be16_to_cpu(*xfs_dir2_data_entry_tag_p(mp, dep)) != offset) return __this_address; @@ -244,7 +265,7 @@ __xfs_dir3_data_check( if (i >= be32_to_cpu(btp->count)) return __this_address; } - offset += xfs_dir2_data_entsize(mp, dep->namelen); + offset += reclen; } /* * Need to have seen all the entries and all the bestfree slots. diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h index 7404a9ff1a92..9046d08554e9 100644 --- a/fs/xfs/libxfs/xfs_dir2_priv.h +++ b/fs/xfs/libxfs/xfs_dir2_priv.h @@ -187,6 +187,13 @@ void xfs_dir2_sf_put_ftype(struct xfs_mount *mp, extern int xfs_readdir(struct xfs_trans *tp, struct xfs_inode *dp, struct dir_context *ctx, size_t bufsize);
+static inline unsigned int +xfs_dir2_data_unusedsize( + unsigned int len) +{ + return round_up(len, XFS_DIR2_DATA_ALIGN); +} + static inline unsigned int xfs_dir2_data_entsize( struct xfs_mount *mp,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Ni nichen@iscas.ac.cn
commit 7bf888fa26e8f22bed4bc3965ab2a2953104ff96 upstream.
Replace a comma between expression statements by a semicolon.
Fixes: 178b48d588ea ("xfs: remove the for_each_xbitmap_ helpers") Signed-off-by: Chen Ni nichen@iscas.ac.cn Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/scrub/agheader_repair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c index 876a2f41b063..058b6c305224 100644 --- a/fs/xfs/scrub/agheader_repair.c +++ b/fs/xfs/scrub/agheader_repair.c @@ -705,7 +705,7 @@ xrep_agfl_init_header( * step. */ xagb_bitmap_init(&af.used_extents); - af.agfl_bno = xfs_buf_to_agfl_bno(agfl_bp), + af.agfl_bno = xfs_buf_to_agfl_bno(agfl_bp); xagb_bitmap_walk(agfl_extents, xrep_agfl_fill, &af); error = xagb_bitmap_disunion(agfl_extents, &af.used_extents); if (error)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 19ebc8f84ea12e18dd6c8d3ecaf87bcf4666eee1 upstream.
[backport: only apply fix for 3934e8ebb7cc6]
Since file_path() takes the output buffer as one of its arguments, we might as well have it format directly into the tracepoint's char array instead of wasting stack space.
Fixes: 3934e8ebb7cc6 ("xfs: create a big array data structure") Fixes: 5076a6040ca16 ("xfs: support in-memory buffer cache targets") Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202403290419.HPcyvqZu-lkp@intel.com/ Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/scrub/trace.h | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 6d86f4d56353..b1e6879a0731 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -784,18 +784,16 @@ TRACE_EVENT(xfile_create, TP_STRUCT__entry( __field(dev_t, dev) __field(unsigned long, ino) - __array(char, pathname, 256) + __array(char, pathname, MAXNAMELEN) ), TP_fast_assign( - char pathname[257]; char *path;
__entry->ino = file_inode(xf->file)->i_ino; - memset(pathname, 0, sizeof(pathname)); - path = file_path(xf->file, pathname, sizeof(pathname) - 1); + path = file_path(xf->file, __entry->pathname, MAXNAMELEN); if (IS_ERR(path)) - path = "(unknown)"; - strncpy(__entry->pathname, path, sizeof(__entry->pathname)); + strncpy(__entry->pathname, "(unknown)", + sizeof(__entry->pathname)); ), TP_printk("xfino 0x%lx path '%s'", __entry->ino,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Julian Sun sunjunchao2870@gmail.com
commit af5d92f2fad818663da2ce073b6fe15b9d56ffdc upstream.
In the macro definition of XFS_DQUOT_LOGRES, a parameter is accepted, but it is not used. Hence, it should be removed.
This patch has only passed compilation test, but it should be fine.
Signed-off-by: Julian Sun sunjunchao2870@gmail.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_quota_defs.h | 2 +- fs/xfs/libxfs/xfs_trans_resv.c | 28 ++++++++++++++-------------- 2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_quota_defs.h b/fs/xfs/libxfs/xfs_quota_defs.h index cb035da3f990..fb05f44f6c75 100644 --- a/fs/xfs/libxfs/xfs_quota_defs.h +++ b/fs/xfs/libxfs/xfs_quota_defs.h @@ -56,7 +56,7 @@ typedef uint8_t xfs_dqtype_t; * And, of course, we also need to take into account the dquot log format item * used to describe each dquot. */ -#define XFS_DQUOT_LOGRES(mp) \ +#define XFS_DQUOT_LOGRES \ ((sizeof(struct xfs_dq_logformat) + sizeof(struct xfs_disk_dquot)) * 6)
#define XFS_IS_QUOTA_ON(mp) ((mp)->m_qflags & XFS_ALL_QUOTA_ACCT) diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index 5b2f27cbdb80..1bb2891b26ff 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -334,11 +334,11 @@ xfs_calc_write_reservation( blksz); t1 += adj; t3 += adj; - return XFS_DQUOT_LOGRES(mp) + max3(t1, t2, t3); + return XFS_DQUOT_LOGRES + max3(t1, t2, t3); }
t4 = xfs_calc_refcountbt_reservation(mp, 1); - return XFS_DQUOT_LOGRES(mp) + max(t4, max3(t1, t2, t3)); + return XFS_DQUOT_LOGRES + max(t4, max3(t1, t2, t3)); }
unsigned int @@ -406,11 +406,11 @@ xfs_calc_itruncate_reservation( xfs_refcountbt_block_count(mp, 4), blksz);
- return XFS_DQUOT_LOGRES(mp) + max3(t1, t2, t3); + return XFS_DQUOT_LOGRES + max3(t1, t2, t3); }
t4 = xfs_calc_refcountbt_reservation(mp, 2); - return XFS_DQUOT_LOGRES(mp) + max(t4, max3(t1, t2, t3)); + return XFS_DQUOT_LOGRES + max(t4, max3(t1, t2, t3)); }
unsigned int @@ -436,7 +436,7 @@ STATIC uint xfs_calc_rename_reservation( struct xfs_mount *mp) { - return XFS_DQUOT_LOGRES(mp) + + return XFS_DQUOT_LOGRES + max((xfs_calc_inode_res(mp, 5) + xfs_calc_buf_res(2 * XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1))), @@ -475,7 +475,7 @@ STATIC uint xfs_calc_link_reservation( struct xfs_mount *mp) { - return XFS_DQUOT_LOGRES(mp) + + return XFS_DQUOT_LOGRES + xfs_calc_iunlink_remove_reservation(mp) + max((xfs_calc_inode_res(mp, 2) + xfs_calc_buf_res(XFS_DIROP_LOG_COUNT(mp), @@ -513,7 +513,7 @@ STATIC uint xfs_calc_remove_reservation( struct xfs_mount *mp) { - return XFS_DQUOT_LOGRES(mp) + + return XFS_DQUOT_LOGRES + xfs_calc_iunlink_add_reservation(mp) + max((xfs_calc_inode_res(mp, 2) + xfs_calc_buf_res(XFS_DIROP_LOG_COUNT(mp), @@ -572,7 +572,7 @@ xfs_calc_icreate_resv_alloc( STATIC uint xfs_calc_icreate_reservation(xfs_mount_t *mp) { - return XFS_DQUOT_LOGRES(mp) + + return XFS_DQUOT_LOGRES + max(xfs_calc_icreate_resv_alloc(mp), xfs_calc_create_resv_modify(mp)); } @@ -581,7 +581,7 @@ STATIC uint xfs_calc_create_tmpfile_reservation( struct xfs_mount *mp) { - uint res = XFS_DQUOT_LOGRES(mp); + uint res = XFS_DQUOT_LOGRES;
res += xfs_calc_icreate_resv_alloc(mp); return res + xfs_calc_iunlink_add_reservation(mp); @@ -630,7 +630,7 @@ STATIC uint xfs_calc_ifree_reservation( struct xfs_mount *mp) { - return XFS_DQUOT_LOGRES(mp) + + return XFS_DQUOT_LOGRES + xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) + xfs_calc_iunlink_remove_reservation(mp) + @@ -647,7 +647,7 @@ STATIC uint xfs_calc_ichange_reservation( struct xfs_mount *mp) { - return XFS_DQUOT_LOGRES(mp) + + return XFS_DQUOT_LOGRES + xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(1, mp->m_sb.sb_sectsize);
@@ -756,7 +756,7 @@ STATIC uint xfs_calc_addafork_reservation( struct xfs_mount *mp) { - return XFS_DQUOT_LOGRES(mp) + + return XFS_DQUOT_LOGRES + xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(2, mp->m_sb.sb_sectsize) + xfs_calc_buf_res(1, mp->m_dir_geo->blksize) + @@ -804,7 +804,7 @@ STATIC uint xfs_calc_attrsetm_reservation( struct xfs_mount *mp) { - return XFS_DQUOT_LOGRES(mp) + + return XFS_DQUOT_LOGRES + xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) + xfs_calc_buf_res(XFS_DA_NODE_MAXDEPTH, XFS_FSB_TO_B(mp, 1)); @@ -844,7 +844,7 @@ STATIC uint xfs_calc_attrrm_reservation( struct xfs_mount *mp) { - return XFS_DQUOT_LOGRES(mp) + + return XFS_DQUOT_LOGRES + max((xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(XFS_DA_NODE_MAXDEPTH, XFS_FSB_TO_B(mp, 1)) +
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 73c34b0b85d46bf9c2c0b367aeaffa1e2481b136 upstream.
It turns out that I misunderstood the difference between the attr and attr2 feature bits. "attr" means that at some point an attr fork was created somewhere in the filesystem. "attr2" means that inodes have variable-sized forks, but says nothing about whether or not there actually /are/ attr forks in the system.
If we have an attr fork, we only need to check that attr is set.
Fixes: 99d9d8d05da26 ("xfs: scrub inode block mappings") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/scrub/bmap.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index 75588915572e..9dfa310df311 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -857,7 +857,13 @@ xchk_bmap( } break; case XFS_ATTR_FORK: - if (!xfs_has_attr(mp) && !xfs_has_attr2(mp)) + /* + * "attr" means that an attr fork was created at some point in + * the life of this filesystem. "attr2" means that inodes have + * variable-sized data/attr fork areas. Hence we only check + * attr here. + */ + if (!xfs_has_attr(mp)) xchk_ino_set_corrupt(sc, sc->ip->i_ino); break; default:
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 8d16762047c627073955b7ed171a36addaf7b1ff upstream.
If a file has the S_DAX flag (aka fsdax access mode) set, we cannot allow users to change the realtime flag unless the datadev and rtdev both support fsdax access modes. Even if there are no extents allocated to the file, the setattr thread could be racing with another thread that has already started down the write code paths.
Fixes: ba23cba9b3bdc ("fs: allow per-device dax status checking for filesystems") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_ioctl.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index df4bf0d56aad..32e718043e0e 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -1128,6 +1128,17 @@ xfs_ioctl_setattr_xflags( /* Can't change realtime flag if any extents are allocated. */ if (ip->i_df.if_nextents || ip->i_delayed_blks) return -EINVAL; + + /* + * If S_DAX is enabled on this file, we can only switch the + * device if both support fsdax. We can't update S_DAX because + * there might be other threads walking down the access paths. + */ + if (IS_DAX(VFS_I(ip)) && + (mp->m_ddev_targp->bt_daxdev == NULL || + (mp->m_rtdev_targp && + mp->m_rtdev_targp->bt_daxdev == NULL))) + return -EINVAL; }
if (rtflag) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zizhi Wo wozizhi@huawei.com
commit 68415b349f3f16904f006275757f4fcb34b8ee43 upstream.
I notice a rmap query bug in xfs_io fsmap: [root@fedora ~]# xfs_io -c 'fsmap -vvvv' /mnt EXT: DEV BLOCK-RANGE OWNER FILE-OFFSET AG AG-OFFSET TOTAL 0: 253:16 [0..7]: static fs metadata 0 (0..7) 8 1: 253:16 [8..23]: per-AG metadata 0 (8..23) 16 2: 253:16 [24..39]: inode btree 0 (24..39) 16 3: 253:16 [40..47]: per-AG metadata 0 (40..47) 8 4: 253:16 [48..55]: refcount btree 0 (48..55) 8 5: 253:16 [56..103]: per-AG metadata 0 (56..103) 48 6: 253:16 [104..127]: free space 0 (104..127) 24 ......
Bug: [root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 0 3' /mnt [root@fedora ~]# Normally, we should be able to get one record, but we got nothing.
The root cause of this problem lies in the incorrect setting of rm_owner in the rmap query. In the case of the initial query where the owner is not set, __xfs_getfsmap_datadev() first sets info->high.rm_owner to ULLONG_MAX. This is done to prevent any omissions when comparing rmap items. However, if the current ag is detected to be the last one, the function sets info's high_irec based on the provided key. If high->rm_owner is not specified, it should continue to be set to ULLONG_MAX; otherwise, there will be issues with interval omissions. For example, consider "start" and "end" within the same block. If high->rm_owner == 0, it will be smaller than the founded record in rmapbt, resulting in a query with no records. The main call stack is as follows:
xfs_ioc_getfsmap xfs_getfsmap xfs_getfsmap_datadev_rmapbt __xfs_getfsmap_datadev info->high.rm_owner = ULLONG_MAX if (pag->pag_agno == end_ag) xfs_fsmap_owner_to_rmap // set info->high.rm_owner = 0 because fmr_owner == -1ULL dest->rm_owner = 0 // get nothing xfs_getfsmap_datadev_rmapbt_query
The problem can be resolved by simply modify the xfs_fsmap_owner_to_rmap function internal logic to achieve.
After applying this patch, the above problem have been solved: [root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 0 3' /mnt EXT: DEV BLOCK-RANGE OWNER FILE-OFFSET AG AG-OFFSET TOTAL 0: 253:16 [0..7]: static fs metadata 0 (0..7) 8
Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl") Signed-off-by: Zizhi Wo wozizhi@huawei.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_fsmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index 8982c5d6cbd0..85953dbd4283 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -71,7 +71,7 @@ xfs_fsmap_owner_to_rmap( switch (src->fmr_owner) { case 0: /* "lowest owner id possible" */ case -1ULL: /* "highest owner id possible" */ - dest->rm_owner = 0; + dest->rm_owner = src->fmr_owner; break; case XFS_FMR_OWN_FREE: dest->rm_owner = XFS_RMAP_OWN_NULL;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 6b35cc8d9239569700cc7cc737c8ed40b8b9cfdb upstream.
Use XFS_BUF_DADDR_NULL (instead of a magic sentinel value) to mean "this field is null" like the rest of xfs.
Cc: wozizhi@huawei.com Fixes: e89c041338ed6 ("xfs: implement the GETFSMAP ioctl") Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_fsmap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index 85953dbd4283..7754d51e1c27 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -252,7 +252,7 @@ xfs_getfsmap_rec_before_start( const struct xfs_rmap_irec *rec, xfs_daddr_t rec_daddr) { - if (info->low_daddr != -1ULL) + if (info->low_daddr != XFS_BUF_DADDR_NULL) return rec_daddr < info->low_daddr; if (info->low.rm_blockcount) return xfs_rmap_compare(rec, &info->low) < 0; @@ -986,7 +986,7 @@ xfs_getfsmap( info.dev = handlers[i].dev; info.last = false; info.pag = NULL; - info.low_daddr = -1ULL; + info.low_daddr = XFS_BUF_DADDR_NULL; info.low.rm_blockcount = 0; error = handlers[i].fn(tp, dkeys, &info); if (error)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 16e1fbdce9c8d084863fd63cdaff8fb2a54e2f88 upstream.
Take the grow lock when we're expanding the realtime volume, like we do for the other growfs calls.
Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_rtalloc.c | 38 +++++++++++++++++++++++++------------- 1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 608db1ab88a4..9268961d887c 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -953,34 +953,39 @@ xfs_growfs_rt( /* Needs to have been mounted with an rt device. */ if (!XFS_IS_REALTIME_MOUNT(mp)) return -EINVAL; + + if (!mutex_trylock(&mp->m_growlock)) + return -EWOULDBLOCK; /* * Mount should fail if the rt bitmap/summary files don't load, but * we'll check anyway. */ + error = -EINVAL; if (!mp->m_rbmip || !mp->m_rsumip) - return -EINVAL; + goto out_unlock;
/* Shrink not supported. */ if (in->newblocks <= sbp->sb_rblocks) - return -EINVAL; + goto out_unlock;
/* Can only change rt extent size when adding rt volume. */ if (sbp->sb_rblocks > 0 && in->extsize != sbp->sb_rextsize) - return -EINVAL; + goto out_unlock;
/* Range check the extent size. */ if (XFS_FSB_TO_B(mp, in->extsize) > XFS_MAX_RTEXTSIZE || XFS_FSB_TO_B(mp, in->extsize) < XFS_MIN_RTEXTSIZE) - return -EINVAL; + goto out_unlock;
/* Unsupported realtime features. */ + error = -EOPNOTSUPP; if (xfs_has_rmapbt(mp) || xfs_has_reflink(mp) || xfs_has_quota(mp)) - return -EOPNOTSUPP; + goto out_unlock;
nrblocks = in->newblocks; error = xfs_sb_validate_fsb_count(sbp, nrblocks); if (error) - return error; + goto out_unlock; /* * Read in the last block of the device, make sure it exists. */ @@ -988,7 +993,7 @@ xfs_growfs_rt( XFS_FSB_TO_BB(mp, nrblocks - 1), XFS_FSB_TO_BB(mp, 1), 0, &bp, NULL); if (error) - return error; + goto out_unlock; xfs_buf_relse(bp);
/* @@ -996,8 +1001,10 @@ xfs_growfs_rt( */ nrextents = nrblocks; do_div(nrextents, in->extsize); - if (!xfs_validate_rtextents(nrextents)) - return -EINVAL; + if (!xfs_validate_rtextents(nrextents)) { + error = -EINVAL; + goto out_unlock; + } nrbmblocks = howmany_64(nrextents, NBBY * sbp->sb_blocksize); nrextslog = xfs_compute_rextslog(nrextents); nrsumlevels = nrextslog + 1; @@ -1009,8 +1016,11 @@ xfs_growfs_rt( * the log. This prevents us from getting a log overflow, * since we'll log basically the whole summary file at once. */ - if (nrsumblocks > (mp->m_sb.sb_logblocks >> 1)) - return -EINVAL; + if (nrsumblocks > (mp->m_sb.sb_logblocks >> 1)) { + error = -EINVAL; + goto out_unlock; + } + /* * Get the old block counts for bitmap and summary inodes. * These can't change since other growfs callers are locked out. @@ -1022,10 +1032,10 @@ xfs_growfs_rt( */ error = xfs_growfs_rt_alloc(mp, rbmblocks, nrbmblocks, mp->m_rbmip); if (error) - return error; + goto out_unlock; error = xfs_growfs_rt_alloc(mp, rsumblocks, nrsumblocks, mp->m_rsumip); if (error) - return error; + goto out_unlock;
rsum_cache = mp->m_rsum_cache; if (nrbmblocks != sbp->sb_rbmblocks) @@ -1190,6 +1200,8 @@ xfs_growfs_rt( } }
+out_unlock: + mutex_unlock(&mp->m_growlock); return error; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit a24cae8fc1f13f6f6929351309f248fd2e9351ce upstream.
If growfsrt is run on a filesystem that doesn't have a rt volume, it's possible to change the rt extent size. If the root directory was previously set up with an inherited extent size hint and rtinherit, it's possible that the hint is no longer a multiple of the rt extent size. Although the verifiers don't complain about this, xfs_repair will, so if we detect this situation, log the root directory to clean it up. This is still racy, but it's better than nothing.
Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Catherine Hoang catherine.hoang@oracle.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_rtalloc.c | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 9268961d887c..ad828fbd5ce4 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -915,6 +915,39 @@ xfs_alloc_rsum_cache( xfs_warn(mp, "could not allocate realtime summary cache"); }
+/* + * If we changed the rt extent size (meaning there was no rt volume previously) + * and the root directory had EXTSZINHERIT and RTINHERIT set, it's possible + * that the extent size hint on the root directory is no longer congruent with + * the new rt extent size. Log the rootdir inode to fix this. + */ +static int +xfs_growfs_rt_fixup_extsize( + struct xfs_mount *mp) +{ + struct xfs_inode *ip = mp->m_rootip; + struct xfs_trans *tp; + int error = 0; + + xfs_ilock(ip, XFS_IOLOCK_EXCL); + if (!(ip->i_diflags & XFS_DIFLAG_RTINHERIT) || + !(ip->i_diflags & XFS_DIFLAG_EXTSZINHERIT)) + goto out_iolock; + + error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_ichange, 0, 0, false, + &tp); + if (error) + goto out_iolock; + + xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); + error = xfs_trans_commit(tp); + xfs_iunlock(ip, XFS_ILOCK_EXCL); + +out_iolock: + xfs_iunlock(ip, XFS_IOLOCK_EXCL); + return error; +} + /* * Visible (exported) functions. */ @@ -944,6 +977,7 @@ xfs_growfs_rt( xfs_sb_t *sbp; /* old superblock */ xfs_fsblock_t sumbno; /* summary block number */ uint8_t *rsum_cache; /* old summary cache */ + xfs_agblock_t old_rextsize = mp->m_sb.sb_rextsize;
sbp = &mp->m_sb;
@@ -1177,6 +1211,12 @@ xfs_growfs_rt( if (error) goto out_free;
+ if (old_rextsize != in->extsize) { + error = xfs_growfs_rt_fixup_extsize(mp); + if (error) + goto out_free; + } + /* Update secondary superblocks now the physical grow has completed */ error = xfs_update_secondary_sbs(mp);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olaf Hering olaf@aepfle.de
[ Upstream commit 91ae69c7ed9e262f24240c425ad1eef2cf6639b7 ]
Align permissions of the resulting .nmconnection file, instead of the input file from hv_kvp_daemon. To avoid the tiny time frame where the output file is world-readable, use umask instead of chmod.
Fixes: 42999c904612 ("hv/hv_kvp_daemon:Support for keyfile based connection profile") Signed-off-by: Olaf Hering olaf@aepfle.de Reviewed-by: Shradha Gupta shradhagupta@linux.microsoft.com Link: https://lore.kernel.org/r/20241016143521.3735-1-olaf@aepfle.de Signed-off-by: Wei Liu wei.liu@kernel.org Message-ID: 20241016143521.3735-1-olaf@aepfle.de Signed-off-by: Sasha Levin sashal@kernel.org --- tools/hv/hv_set_ifconfig.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/hv/hv_set_ifconfig.sh b/tools/hv/hv_set_ifconfig.sh index 440a91b35823..2f8baed2b8f7 100755 --- a/tools/hv/hv_set_ifconfig.sh +++ b/tools/hv/hv_set_ifconfig.sh @@ -81,7 +81,7 @@ echo "ONBOOT=yes" >> $1
cp $1 /etc/sysconfig/network-scripts/
-chmod 600 $2 +umask 0177 interface=$(echo $2 | awk -F - '{ print $2 }') filename="${2##*/}"
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Davidlohr Bueso dave@stgolabs.net
[ Upstream commit da4d8c83358163df9a4addaeba0ef8bcb03b22e8 ]
If cxl_pci_ras_unmask() returns non-zero, cxl_pci_probe() will end up returning that value, instead of zero.
Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL") Reviewed-by: Fan Ni fan.ni@samsung.com Signed-off-by: Davidlohr Bueso dave@stgolabs.net Reviewed-by: Ira Weiny ira.weiny@intel.com Link: https://patch.msgid.link/20241115170032.108445-1-dave@stgolabs.net Signed-off-by: Dave Jiang dave.jiang@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cxl/pci.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 8bece1e2e249..aacd93f9067d 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -911,8 +911,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) if (rc) return rc;
- rc = cxl_pci_ras_unmask(pdev); - if (rc) + if (cxl_pci_ras_unmask(pdev)) dev_dbg(&pdev->dev, "No RAS reporting unmasked\n");
pci_save_state(pdev);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huaisheng Ye huaisheng.ye@intel.com
[ Upstream commit 76467a94810c2aa4dd3096903291ac6df30c399e ]
The cxl_port_setup_targets() algorithm fails to identify valid target list ordering in the presence of 4-way and above switches resulting in 'cxl create-region' failures of the form:
$ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0 cxl region: create_region: region0: failed to set target7 to mem0 cxl region: cmd_create_region: created 0 regions
[kernel debug message] check_last_peer:1213: cxl region0: pci0000:0c:port1: cannot host mem6:decoder7.0 at 2 bus_remove_device:574: bus: 'cxl': remove device region0
QEMU can create this failing topology:
ACPI0017:00 [root0] | HB_0 [port1] / \ RP_0 RP_1 | | USP [port2] USP [port3] / / \ \ / / \ \ DSP DSP DSP DSP DSP DSP DSP DSP | | | | | | | | mem4 mem6 mem2 mem7 mem1 mem3 mem5 mem0 Pos: 0 2 4 6 1 3 5 7
HB: Host Bridge RP: Root Port USP: Upstream Port DSP: Downstream Port
...with the following command steps:
$ qemu-system-x86_64 -machine q35,cxl=on,accel=tcg \ -smp cpus=8 \ -m 8G \ -hda /home/work/vm-images/centos-stream8-02.qcow2 \ -object memory-backend-ram,size=4G,id=m0 \ -object memory-backend-ram,size=4G,id=m1 \ -object memory-backend-ram,size=2G,id=cxl-mem0 \ -object memory-backend-ram,size=2G,id=cxl-mem1 \ -object memory-backend-ram,size=2G,id=cxl-mem2 \ -object memory-backend-ram,size=2G,id=cxl-mem3 \ -object memory-backend-ram,size=2G,id=cxl-mem4 \ -object memory-backend-ram,size=2G,id=cxl-mem5 \ -object memory-backend-ram,size=2G,id=cxl-mem6 \ -object memory-backend-ram,size=2G,id=cxl-mem7 \ -numa node,memdev=m0,cpus=0-3,nodeid=0 \ -numa node,memdev=m1,cpus=4-7,nodeid=1 \ -netdev user,id=net0,hostfwd=tcp::2222-:22 \ -device virtio-net-pci,netdev=net0 \ -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \ -device cxl-upstream,bus=root_port0,id=us0 \ -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \ -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem0,id=cxl-vmem0 \ -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \ -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem1,id=cxl-vmem1 \ -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \ -device cxl-type3,bus=swport2,volatile-memdev=cxl-mem2,id=cxl-vmem2 \ -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \ -device cxl-type3,bus=swport3,volatile-memdev=cxl-mem3,id=cxl-vmem3 \ -device cxl-upstream,bus=root_port1,id=us1 \ -device cxl-downstream,port=4,bus=us1,id=swport4,chassis=0,slot=8 \ -device cxl-type3,bus=swport4,volatile-memdev=cxl-mem4,id=cxl-vmem4 \ -device cxl-downstream,port=5,bus=us1,id=swport5,chassis=0,slot=9 \ -device cxl-type3,bus=swport5,volatile-memdev=cxl-mem5,id=cxl-vmem5 \ -device cxl-downstream,port=6,bus=us1,id=swport6,chassis=0,slot=10 \ -device cxl-type3,bus=swport6,volatile-memdev=cxl-mem6,id=cxl-vmem6 \ -device cxl-downstream,port=7,bus=us1,id=swport7,chassis=0,slot=11 \ -device cxl-type3,bus=swport7,volatile-memdev=cxl-mem7,id=cxl-vmem7 \ -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=32G &
In Guest OS: $ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0
Fix the method to calculate @distance by iterativeley multiplying the number of targets per switch port. This also follows the algorithm recommended here [1].
Fixes: 27b3f8d13830 ("cxl/region: Program target lists") Link: http://lore.kernel.org/6538824b52349_7258329466@dwillia2-xfh.jf.intel.com.no... [1] Signed-off-by: Huaisheng Ye huaisheng.ye@intel.com Tested-by: Li Zhijian lizhijian@fujitsu.com [djbw: add a comment explaining 'distance'] Signed-off-by: Dan Williams dan.j.williams@intel.com Link: https://patch.msgid.link/173378716722.1270362.9546805175813426729.stgit@dwil... Signed-off-by: Dave Jiang dave.jiang@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cxl/core/region.c | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 7a9357f42dad..d7f7f88009d7 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1167,6 +1167,7 @@ static int cxl_port_setup_targets(struct cxl_port *port, struct cxl_region_params *p = &cxlr->params; struct cxl_decoder *cxld = cxl_rr->decoder; struct cxl_switch_decoder *cxlsd; + struct cxl_port *iter = port; u16 eig, peig; u8 eiw, peiw;
@@ -1183,16 +1184,26 @@ static int cxl_port_setup_targets(struct cxl_port *port,
cxlsd = to_cxl_switch_decoder(&cxld->dev); if (cxl_rr->nr_targets_set) { - int i, distance; + int i, distance = 1; + struct cxl_region_ref *cxl_rr_iter;
/* - * Passthrough decoders impose no distance requirements between - * peers + * The "distance" between peer downstream ports represents which + * endpoint positions in the region interleave a given port can + * host. + * + * For example, at the root of a hierarchy the distance is + * always 1 as every index targets a different host-bridge. At + * each subsequent switch level those ports map every Nth region + * position where N is the width of the switch == distance. */ - if (cxl_rr->nr_targets == 1) - distance = 0; - else - distance = p->nr_targets / cxl_rr->nr_targets; + do { + cxl_rr_iter = cxl_rr_load(iter, cxlr); + distance *= cxl_rr_iter->nr_targets; + iter = to_cxl_port(iter->dev.parent); + } while (!is_cxl_root(iter)); + distance *= cxlrd->cxlsd.cxld.interleave_ways; + for (i = 0; i < cxl_rr->nr_targets_set; i++) if (ep->dport == cxlsd->target[i]) { rc = check_last_peer(cxled, ep, cxl_rr,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit 2b33eb8f1b3e8c2f87cfdbc8cc117f6bdfabc6ec ]
link down work may be scheduled before lgr freed but execute after lgr freed, which may result in crash. So it is need to hold a reference before shedule link down work, and put the reference after work executed or canceled.
The relevant crash call stack as follows: list_del corruption. prev->next should be ffffb638c9c0fe20, but was 0000000000000000 ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:51! invalid opcode: 0000 [#1] SMP NOPTI CPU: 6 PID: 978112 Comm: kworker/6:119 Kdump: loaded Tainted: G #1 Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 2221b89 04/01/2014 Workqueue: events smc_link_down_work [smc] RIP: 0010:__list_del_entry_valid.cold+0x31/0x47 RSP: 0018:ffffb638c9c0fdd8 EFLAGS: 00010086 RAX: 0000000000000054 RBX: ffff942fb75e5128 RCX: 0000000000000000 RDX: ffff943520930aa0 RSI: ffff94352091fc80 RDI: ffff94352091fc80 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb638c9c0fc38 R10: ffffb638c9c0fc30 R11: ffffffffa015eb28 R12: 0000000000000002 R13: ffffb638c9c0fe20 R14: 0000000000000001 R15: ffff942f9cd051c0 FS: 0000000000000000(0000) GS:ffff943520900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4f25214000 CR3: 000000025fbae004 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: rwsem_down_write_slowpath+0x17e/0x470 smc_link_down_work+0x3c/0x60 [smc] process_one_work+0x1ac/0x350 worker_thread+0x49/0x2f0 ? rescuer_thread+0x360/0x360 kthread+0x118/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x1f/0x30
Fixes: 541afa10c126 ("net/smc: add smcr_port_err() and smcr_link_down() processing") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Reviewed-by: Tony Lu tonylu@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/smc_core.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index 3d5c542cd231..3c626c014d1d 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -1767,7 +1767,9 @@ void smcr_link_down_cond_sched(struct smc_link *lnk) { if (smc_link_downing(&lnk->state)) { trace_smcr_link_down(lnk, __builtin_return_address(0)); - schedule_work(&lnk->link_down_wrk); + smcr_link_hold(lnk); /* smcr_link_put in link_down_wrk */ + if (!schedule_work(&lnk->link_down_wrk)) + smcr_link_put(lnk); } }
@@ -1799,11 +1801,14 @@ static void smc_link_down_work(struct work_struct *work) struct smc_link_group *lgr = link->lgr;
if (list_empty(&lgr->list)) - return; + goto out; wake_up_all(&lgr->llc_msg_waiter); down_write(&lgr->llc_conf_mutex); smcr_link_down(link); up_write(&lgr->llc_conf_mutex); + +out: + smcr_link_put(link); /* smcr_link_hold by schedulers of link_down_work */ }
static int smc_vlan_by_tcpsk_walk(struct net_device *lower_dev,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit 679e9ddcf90dbdf98aaaa71a492454654b627bcb ]
When application sending data more than sndbuf_space, there have chances application will sleep in epoll_wait, and will never be wakeup again. This is caused by a race between smc_poll and smc_cdc_tx_handler.
application tasklet smc_tx_sendmsg(len > sndbuf_space) | epoll_wait for EPOLL_OUT,timeout=0 | smc_poll | if (!smc->conn.sndbuf_space) | | smc_cdc_tx_handler | atomic_add sndbuf_space | smc_tx_sndbuf_nonfull | if (!test_bit SOCK_NOSPACE) | do not sk_write_space; set_bit SOCK_NOSPACE; | return mask=0; |
Application will sleep in epoll_wait as smc_poll returns 0. And smc_cdc_tx_handler will not call sk_write_space because the SOCK_NOSPACE has not be set. If there is no inflight cdc msg, sk_write_space will not be called any more, and application will sleep in epoll_wait forever. So check sndbuf_space again after NOSPACE flag is set to break the race.
Fixes: 8dce2786a290 ("net/smc: smc_poll improvements") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Suggested-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 77c6c0dff069..06d607e676f6 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2888,6 +2888,13 @@ static __poll_t smc_poll(struct file *file, struct socket *sock, } else { sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk); set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); + + if (sk->sk_state != SMC_INIT) { + /* Race breaker the same way as tcp_poll(). */ + smp_mb__after_atomic(); + if (atomic_read(&smc->conn.sndbuf_space)) + mask |= EPOLLOUT | EPOLLWRNORM; + } } if (atomic_read(&smc->conn.bytes_to_rcv)) mask |= EPOLLIN | EPOLLRDNORM;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit a29e220d3c8edbf0e1beb0f028878a4a85966556 ]
When receiving proposal msg in server, the field iparea_offset and the field ipv6_prefixes_cnt in proposal msg are from the remote client and can not be fully trusted. Especially the field iparea_offset, once exceed the max value, there has the chance to access wrong address, and crash may happen.
This patch checks iparea_offset and ipv6_prefixes_cnt before using them.
Fixes: e7b7a64a8493 ("smc: support variable CLC proposal messages") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Reviewed-by: Wen Gu guwen@linux.alibaba.com Reviewed-by: D. Wythe alibuda@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 6 +++++- net/smc/smc_clc.c | 4 ++++ net/smc/smc_clc.h | 6 +++++- 3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 06d607e676f6..8551e097ad33 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2039,6 +2039,8 @@ static int smc_listen_prfx_check(struct smc_sock *new_smc, if (pclc->hdr.typev1 == SMC_TYPE_N) return 0; pclc_prfx = smc_clc_proposal_get_prefix(pclc); + if (!pclc_prfx) + return -EPROTO; if (smc_clc_prfx_match(newclcsock, pclc_prfx)) return SMC_CLC_DECL_DIFFPREFIX;
@@ -2228,7 +2230,9 @@ static void smc_find_ism_v1_device_serv(struct smc_sock *new_smc, int rc = 0;
/* check if ISM V1 is available */ - if (!(ini->smcd_version & SMC_V1) || !smcd_indicated(ini->smc_type_v1)) + if (!(ini->smcd_version & SMC_V1) || + !smcd_indicated(ini->smc_type_v1) || + !pclc_smcd) goto not_found; ini->is_smcd = true; /* prepare ISM check */ ini->ism_peer_gid[0].gid = ntohll(pclc_smcd->ism.gid); diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c index 0084960a203d..b8fd64392209 100644 --- a/net/smc/smc_clc.c +++ b/net/smc/smc_clc.c @@ -354,6 +354,10 @@ static bool smc_clc_msg_prop_valid(struct smc_clc_msg_proposal *pclc)
v2_ext = smc_get_clc_v2_ext(pclc); pclc_prfx = smc_clc_proposal_get_prefix(pclc); + if (!pclc_prfx || + pclc_prfx->ipv6_prefixes_cnt > SMC_CLC_MAX_V6_PREFIX) + return false; + if (hdr->version == SMC_V1) { if (hdr->typev1 == SMC_TYPE_N) return false; diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h index c8d6282ec9c0..eb843907c9d0 100644 --- a/net/smc/smc_clc.h +++ b/net/smc/smc_clc.h @@ -320,8 +320,12 @@ struct smc_clc_msg_decline_v2 { /* clc decline message */ static inline struct smc_clc_msg_proposal_prefix * smc_clc_proposal_get_prefix(struct smc_clc_msg_proposal *pclc) { + u16 offset = ntohs(pclc->iparea_offset); + + if (offset > sizeof(struct smc_clc_msg_smcd)) + return NULL; return (struct smc_clc_msg_proposal_prefix *) - ((u8 *)pclc + sizeof(*pclc) + ntohs(pclc->iparea_offset)); + ((u8 *)pclc + sizeof(*pclc) + offset); }
static inline bool smcr_indicated(int smc_type)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit 7863c9f3d24ba49dbead7e03dfbe40deb5888fdf ]
When receiving proposal msg in server, the fields v2_ext_offset/ eid_cnt/ism_gid_cnt in proposal msg are from the remote client and can not be fully trusted. Especially the field v2_ext_offset, once exceed the max value, there has the chance to access wrong address, and crash may happen.
This patch checks the fields v2_ext_offset/eid_cnt/ism_gid_cnt before using them.
Fixes: 8c3dca341aea ("net/smc: build and send V2 CLC proposal") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Reviewed-by: Wen Gu guwen@linux.alibaba.com Reviewed-by: D. Wythe alibuda@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 3 ++- net/smc/smc_clc.c | 8 +++++++- net/smc/smc_clc.h | 8 +++++++- 3 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 8551e097ad33..079ca4f1077d 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2283,7 +2283,8 @@ static void smc_find_rdma_v2_device_serv(struct smc_sock *new_smc, goto not_found;
smc_v2_ext = smc_get_clc_v2_ext(pclc); - if (!smc_clc_match_eid(ini->negotiated_eid, smc_v2_ext, NULL, NULL)) + if (!smc_v2_ext || + !smc_clc_match_eid(ini->negotiated_eid, smc_v2_ext, NULL, NULL)) goto not_found;
/* prepare RDMA check */ diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c index b8fd64392209..286d6b19a1f1 100644 --- a/net/smc/smc_clc.c +++ b/net/smc/smc_clc.c @@ -352,7 +352,6 @@ static bool smc_clc_msg_prop_valid(struct smc_clc_msg_proposal *pclc) struct smc_clc_msg_hdr *hdr = &pclc->hdr; struct smc_clc_v2_extension *v2_ext;
- v2_ext = smc_get_clc_v2_ext(pclc); pclc_prfx = smc_clc_proposal_get_prefix(pclc); if (!pclc_prfx || pclc_prfx->ipv6_prefixes_cnt > SMC_CLC_MAX_V6_PREFIX) @@ -369,6 +368,13 @@ static bool smc_clc_msg_prop_valid(struct smc_clc_msg_proposal *pclc) sizeof(struct smc_clc_msg_trail)) return false; } else { + v2_ext = smc_get_clc_v2_ext(pclc); + if ((hdr->typev2 != SMC_TYPE_N && + (!v2_ext || v2_ext->hdr.eid_cnt > SMC_CLC_MAX_UEID)) || + (smcd_indicated(hdr->typev2) && + v2_ext->hdr.ism_gid_cnt > SMCD_CLC_MAX_V2_GID_ENTRIES)) + return false; + if (ntohs(hdr->length) != sizeof(*pclc) + sizeof(struct smc_clc_msg_smcd) + diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h index eb843907c9d0..a3706300e04f 100644 --- a/net/smc/smc_clc.h +++ b/net/smc/smc_clc.h @@ -364,8 +364,14 @@ static inline struct smc_clc_v2_extension * smc_get_clc_v2_ext(struct smc_clc_msg_proposal *prop) { struct smc_clc_msg_smcd *prop_smcd = smc_get_clc_msg_smcd(prop); + u16 max_offset;
- if (!prop_smcd || !ntohs(prop_smcd->v2_ext_offset)) + max_offset = offsetof(struct smc_clc_msg_proposal_area, pclc_v2_ext) - + offsetof(struct smc_clc_msg_proposal_area, pclc_smcd) - + offsetofend(struct smc_clc_msg_smcd, v2_ext_offset); + + if (!prop_smcd || !ntohs(prop_smcd->v2_ext_offset) || + ntohs(prop_smcd->v2_ext_offset) > max_offset) return NULL;
return (struct smc_clc_v2_extension *)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit 9ab332deb671d8f7e66d82a2ff2b3f715bc3a4ad ]
When receiving proposal msg in server, the field smcd_v2_ext_offset in proposal msg is from the remote client and can not be fully trusted. Once the value of smcd_v2_ext_offset exceed the max value, there has the chance to access wrong address, and crash may happen.
This patch checks the value of smcd_v2_ext_offset before using it.
Fixes: 5c21c4ccafe8 ("net/smc: determine accepted ISM devices") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Reviewed-by: Wen Gu guwen@linux.alibaba.com Reviewed-by: D. Wythe alibuda@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 2 ++ net/smc/smc_clc.h | 8 +++++++- 2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 079ca4f1077d..0acf07538840 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2154,6 +2154,8 @@ static void smc_find_ism_v2_device_serv(struct smc_sock *new_smc, pclc_smcd = smc_get_clc_msg_smcd(pclc); smc_v2_ext = smc_get_clc_v2_ext(pclc); smcd_v2_ext = smc_get_clc_smcd_v2_ext(smc_v2_ext); + if (!pclc_smcd || !smc_v2_ext || !smcd_v2_ext) + goto not_found;
mutex_lock(&smcd_dev_list.mutex); if (pclc_smcd->ism.chid) { diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h index a3706300e04f..6e24d44de4a7 100644 --- a/net/smc/smc_clc.h +++ b/net/smc/smc_clc.h @@ -384,9 +384,15 @@ smc_get_clc_v2_ext(struct smc_clc_msg_proposal *prop) static inline struct smc_clc_smcd_v2_extension * smc_get_clc_smcd_v2_ext(struct smc_clc_v2_extension *prop_v2ext) { + u16 max_offset = offsetof(struct smc_clc_msg_proposal_area, pclc_smcd_v2_ext) - + offsetof(struct smc_clc_msg_proposal_area, pclc_v2_ext) - + offsetof(struct smc_clc_v2_extension, hdr) - + offsetofend(struct smc_clnt_opts_area_hdr, smcd_v2_ext_offset); + if (!prop_v2ext) return NULL; - if (!ntohs(prop_v2ext->hdr.smcd_v2_ext_offset)) + if (!ntohs(prop_v2ext->hdr.smcd_v2_ext_offset) || + ntohs(prop_v2ext->hdr.smcd_v2_ext_offset) > max_offset) return NULL;
return (struct smc_clc_smcd_v2_extension *)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guangguan Wang guangguan.wang@linux.alibaba.com
[ Upstream commit c5b8ee5022a19464783058dc6042e8eefa34e8cd ]
When receiving clc msg, the field length in smc_clc_msg_hdr indicates the length of msg should be received from network and the value should not be fully trusted as it is from the network. Once the value of length exceeds the value of buflen in function smc_clc_wait_msg it may run into deadloop when trying to drain the remaining data exceeding buflen.
This patch checks the return value of sock_recvmsg when draining data in case of deadloop in draining.
Fixes: fb4f79264c0f ("net/smc: tolerate future SMCD versions") Signed-off-by: Guangguan Wang guangguan.wang@linux.alibaba.com Reviewed-by: Wen Gu guwen@linux.alibaba.com Reviewed-by: D. Wythe alibuda@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/smc_clc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c index 286d6b19a1f1..dbce904c03cf 100644 --- a/net/smc/smc_clc.c +++ b/net/smc/smc_clc.c @@ -773,6 +773,11 @@ int smc_clc_wait_msg(struct smc_sock *smc, void *buf, int buflen, SMC_CLC_RECV_BUF_LEN : datlen; iov_iter_kvec(&msg.msg_iter, ITER_DEST, &vec, 1, recvlen); len = sock_recvmsg(smc->clcsock, &msg, krflags); + if (len < recvlen) { + smc->sk.sk_err = EPROTO; + reason_code = -EPROTO; + goto out; + } datlen -= len; } if (clcm->type == SMC_CLC_DECLINE) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 2d5df3a680ffdaf606baa10636bdb1daf757832e ]
Packets injected by the CPU should have a SRC_PORT field equal to the CPU port module index in the Analyzer block (ocelot->num_phys_ports).
The blamed commit copied the ocelot_ifh_set_basic() call incorrectly from ocelot_xmit_common() in net/dsa/tag_ocelot.c. Instead of calling with "x", it calls with BIT_ULL(x), but the field is not a port mask, but rather a single port index.
[ side note: this is the technical debt of code duplication :( ]
The error used to be silent and doesn't appear to have other user-visible manifestations, but with new changes in the packing library, it now fails loudly as follows:
------------[ cut here ]------------ Cannot store 0x40 inside bits 46-43 - will truncate sja1105 spi2.0: xmit timed out WARNING: CPU: 1 PID: 102 at lib/packing.c:98 __pack+0x90/0x198 sja1105 spi2.0: timed out polling for tstamp CPU: 1 UID: 0 PID: 102 Comm: felix_xmit Tainted: G W N 6.13.0-rc1-00372-gf706b85d972d-dirty #2605 Call trace: __pack+0x90/0x198 (P) __pack+0x90/0x198 (L) packing+0x78/0x98 ocelot_ifh_set_basic+0x260/0x368 ocelot_port_inject_frame+0xa8/0x250 felix_port_deferred_xmit+0x14c/0x258 kthread_worker_fn+0x134/0x350 kthread+0x114/0x138
The code path pertains to the ocelot switchdev driver and to the felix secondary DSA tag protocol, ocelot-8021q. Here seen with ocelot-8021q.
The messenger (packing) is not really to blame, so fix the original commit instead.
Fixes: e1b9e80236c5 ("net: mscc: ocelot: fix QoS class for injected packets with "ocelot-8021q"") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20241212165546.879567-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mscc/ocelot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index c2118bde908b..f6aa5d6b6597 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -1266,7 +1266,7 @@ void ocelot_ifh_set_basic(void *ifh, struct ocelot *ocelot, int port,
memset(ifh, 0, OCELOT_TAG_LEN); ocelot_ifh_set_bypass(ifh, 1); - ocelot_ifh_set_src(ifh, BIT_ULL(ocelot->num_phys_ports)); + ocelot_ifh_set_src(ifh, ocelot->num_phys_ports); ocelot_ifh_set_dest(ifh, BIT_ULL(port)); ocelot_ifh_set_qos_class(ifh, qos_class); ocelot_ifh_set_tag_type(ifh, tag_type);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit ee76746387f6233bdfa93d7406990f923641568f ]
If either a zero count or a large one is provided, kernel can crash.
Fixes: 82c93a87bf8b ("netdevsim: implement couple of testing devlink health reporters") Reported-by: syzbot+ea40e4294e58b0292f74@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/675c6862.050a0220.37aaf.00b1.GAE@google.com/T... Signed-off-by: Eric Dumazet edumazet@google.com Cc: Jiri Pirko jiri@nvidia.com Reviewed-by: Joe Damato jdamato@fastly.com Link: https://patch.msgid.link/20241213172518.2415666-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/netdevsim/health.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/netdevsim/health.c b/drivers/net/netdevsim/health.c index eb04ed715d2d..c63427b71898 100644 --- a/drivers/net/netdevsim/health.c +++ b/drivers/net/netdevsim/health.c @@ -203,6 +203,8 @@ static ssize_t nsim_dev_health_break_write(struct file *file, char *break_msg; int err;
+ if (count == 0 || count > PAGE_SIZE) + return -EINVAL; break_msg = memdup_user_nul(data, count); if (IS_ERR(break_msg)) return PTR_ERR(break_msg);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brett Creeley brett.creeley@amd.com
[ Upstream commit 9590d32e090ea2751e131ae5273859ca22f5ac14 ]
If register_netdev() fails, then the driver leaks the netdev notifier. Fix this by calling ionic_lif_unregister() on register_netdev() failure. This will also call ionic_lif_unregister_phc() if it has already been registered.
Fixes: 30b87ab4c0b3 ("ionic: remove lif list concept") Signed-off-by: Brett Creeley brett.creeley@amd.com Signed-off-by: Shannon Nelson shannon.nelson@amd.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20241212213157.12212-2-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c index 9d724d228b83..bc7c5cd38596 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c @@ -3736,8 +3736,8 @@ int ionic_lif_register(struct ionic_lif *lif) /* only register LIF0 for now */ err = register_netdev(lif->netdev); if (err) { - dev_err(lif->ionic->dev, "Cannot register net device, aborting\n"); - ionic_lif_unregister_phc(lif); + dev_err(lif->ionic->dev, "Cannot register net device: %d, aborting\n", err); + ionic_lif_unregister(lif); return err; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shannon Nelson shannon.nelson@amd.com
[ Upstream commit b096d62ba1323391b2db98b7704e2468cf3b1588 ]
Some calls into ionic_get_module_eeprom() don't use a single full buffer size, but instead multiple calls with an offset. Teach our driver to use the offset correctly so we can respond appropriately to the caller.
Fixes: 4d03e00a2140 ("ionic: Add initial ethtool support") Signed-off-by: Shannon Nelson shannon.nelson@amd.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20241212213157.12212-4-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_ethtool.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c index 35829a2851fa..d76e63f57ff1 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c @@ -945,8 +945,8 @@ static int ionic_get_module_eeprom(struct net_device *netdev, len = min_t(u32, sizeof(xcvr->sprom), ee->len);
do { - memcpy(data, xcvr->sprom, len); - memcpy(tbuf, xcvr->sprom, len); + memcpy(data, &xcvr->sprom[ee->offset], len); + memcpy(tbuf, &xcvr->sprom[ee->offset], len);
/* Let's make sure we got a consistent copy */ if (!memcmp(data, tbuf, len))
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikita Yushchenko nikita.yoush@cogentembedded.com
[ Upstream commit 922b4b955a03d19fea98938f33ef0e62d01f5159 ]
The existing linked list based implementation of how ts tags are assigned and managed is unsafe against concurrency and corner cases: - element addition in tx processing can race against element removal in ts queue completion, - element removal in ts queue completion can race against element removal in device close, - if a large number of frames gets added to tx queue without ts queue completions in between, elements with duplicate tag values can get added.
Use a different implementation, based on per-port used tags bitmaps and saved skb arrays.
Safety for addition in tx processing vs removal in ts completion is provided by:
tag = find_first_zero_bit(...); smp_mb(); <write rdev->ts_skb[tag]> set_bit(...);
vs
<read rdev->ts_skb[tag]> smp_mb(); clear_bit(...);
Safety for removal in ts completion vs removal in device close is provided by using atomic read-and-clear for rdev->ts_skb[tag]:
ts_skb = xchg(&rdev->ts_skb[tag], NULL); if (ts_skb) <handle it>
Fixes: 33f5d733b589 ("net: renesas: rswitch: Improve TX timestamp accuracy") Signed-off-by: Nikita Yushchenko nikita.yoush@cogentembedded.com Link: https://patch.msgid.link/20241212062558.436455-1-nikita.yoush@cogentembedded... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/renesas/rswitch.c | 74 ++++++++++++++------------ drivers/net/ethernet/renesas/rswitch.h | 13 ++--- 2 files changed, 42 insertions(+), 45 deletions(-)
diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c index 8abad9bb629e..54aa56c84133 100644 --- a/drivers/net/ethernet/renesas/rswitch.c +++ b/drivers/net/ethernet/renesas/rswitch.c @@ -546,7 +546,6 @@ static int rswitch_gwca_ts_queue_alloc(struct rswitch_private *priv) desc = &gq->ts_ring[gq->ring_size]; desc->desc.die_dt = DT_LINKFIX; rswitch_desc_set_dptr(&desc->desc, gq->ring_dma); - INIT_LIST_HEAD(&priv->gwca.ts_info_list);
return 0; } @@ -934,9 +933,10 @@ static int rswitch_gwca_request_irqs(struct rswitch_private *priv) static void rswitch_ts(struct rswitch_private *priv) { struct rswitch_gwca_queue *gq = &priv->gwca.ts_queue; - struct rswitch_gwca_ts_info *ts_info, *ts_info2; struct skb_shared_hwtstamps shhwtstamps; struct rswitch_ts_desc *desc; + struct rswitch_device *rdev; + struct sk_buff *ts_skb; struct timespec64 ts; unsigned int num; u32 tag, port; @@ -946,23 +946,28 @@ static void rswitch_ts(struct rswitch_private *priv) dma_rmb();
port = TS_DESC_DPN(__le32_to_cpu(desc->desc.dptrl)); - tag = TS_DESC_TSUN(__le32_to_cpu(desc->desc.dptrl)); - - list_for_each_entry_safe(ts_info, ts_info2, &priv->gwca.ts_info_list, list) { - if (!(ts_info->port == port && ts_info->tag == tag)) - continue; - - memset(&shhwtstamps, 0, sizeof(shhwtstamps)); - ts.tv_sec = __le32_to_cpu(desc->ts_sec); - ts.tv_nsec = __le32_to_cpu(desc->ts_nsec & cpu_to_le32(0x3fffffff)); - shhwtstamps.hwtstamp = timespec64_to_ktime(ts); - skb_tstamp_tx(ts_info->skb, &shhwtstamps); - dev_consume_skb_irq(ts_info->skb); - list_del(&ts_info->list); - kfree(ts_info); - break; - } + if (unlikely(port >= RSWITCH_NUM_PORTS)) + goto next; + rdev = priv->rdev[port];
+ tag = TS_DESC_TSUN(__le32_to_cpu(desc->desc.dptrl)); + if (unlikely(tag >= TS_TAGS_PER_PORT)) + goto next; + ts_skb = xchg(&rdev->ts_skb[tag], NULL); + smp_mb(); /* order rdev->ts_skb[] read before bitmap update */ + clear_bit(tag, rdev->ts_skb_used); + + if (unlikely(!ts_skb)) + goto next; + + memset(&shhwtstamps, 0, sizeof(shhwtstamps)); + ts.tv_sec = __le32_to_cpu(desc->ts_sec); + ts.tv_nsec = __le32_to_cpu(desc->ts_nsec & cpu_to_le32(0x3fffffff)); + shhwtstamps.hwtstamp = timespec64_to_ktime(ts); + skb_tstamp_tx(ts_skb, &shhwtstamps); + dev_consume_skb_irq(ts_skb); + +next: gq->cur = rswitch_next_queue_index(gq, true, 1); desc = &gq->ts_ring[gq->cur]; } @@ -1505,8 +1510,9 @@ static int rswitch_open(struct net_device *ndev) static int rswitch_stop(struct net_device *ndev) { struct rswitch_device *rdev = netdev_priv(ndev); - struct rswitch_gwca_ts_info *ts_info, *ts_info2; + struct sk_buff *ts_skb; unsigned long flags; + unsigned int tag;
netif_tx_stop_all_queues(ndev);
@@ -1523,12 +1529,13 @@ static int rswitch_stop(struct net_device *ndev) if (bitmap_empty(rdev->priv->opened_ports, RSWITCH_NUM_PORTS)) iowrite32(GWCA_TS_IRQ_BIT, rdev->priv->addr + GWTSDID);
- list_for_each_entry_safe(ts_info, ts_info2, &rdev->priv->gwca.ts_info_list, list) { - if (ts_info->port != rdev->port) - continue; - dev_kfree_skb_irq(ts_info->skb); - list_del(&ts_info->list); - kfree(ts_info); + for (tag = find_first_bit(rdev->ts_skb_used, TS_TAGS_PER_PORT); + tag < TS_TAGS_PER_PORT; + tag = find_next_bit(rdev->ts_skb_used, TS_TAGS_PER_PORT, tag + 1)) { + ts_skb = xchg(&rdev->ts_skb[tag], NULL); + clear_bit(tag, rdev->ts_skb_used); + if (ts_skb) + dev_kfree_skb(ts_skb); }
return 0; @@ -1541,20 +1548,17 @@ static bool rswitch_ext_desc_set_info1(struct rswitch_device *rdev, desc->info1 = cpu_to_le64(INFO1_DV(BIT(rdev->etha->index)) | INFO1_IPV(GWCA_IPV_NUM) | INFO1_FMT); if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) { - struct rswitch_gwca_ts_info *ts_info; + unsigned int tag;
- ts_info = kzalloc(sizeof(*ts_info), GFP_ATOMIC); - if (!ts_info) + tag = find_first_zero_bit(rdev->ts_skb_used, TS_TAGS_PER_PORT); + if (tag == TS_TAGS_PER_PORT) return false; + smp_mb(); /* order bitmap read before rdev->ts_skb[] write */ + rdev->ts_skb[tag] = skb_get(skb); + set_bit(tag, rdev->ts_skb_used);
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS; - rdev->ts_tag++; - desc->info1 |= cpu_to_le64(INFO1_TSUN(rdev->ts_tag) | INFO1_TXC); - - ts_info->skb = skb_get(skb); - ts_info->port = rdev->port; - ts_info->tag = rdev->ts_tag; - list_add_tail(&ts_info->list, &rdev->priv->gwca.ts_info_list); + desc->info1 |= cpu_to_le64(INFO1_TSUN(tag) | INFO1_TXC);
skb_tx_timestamp(skb); } diff --git a/drivers/net/ethernet/renesas/rswitch.h b/drivers/net/ethernet/renesas/rswitch.h index f2d1cd47187d..0c93ef16b43e 100644 --- a/drivers/net/ethernet/renesas/rswitch.h +++ b/drivers/net/ethernet/renesas/rswitch.h @@ -965,14 +965,6 @@ struct rswitch_gwca_queue { }; };
-struct rswitch_gwca_ts_info { - struct sk_buff *skb; - struct list_head list; - - int port; - u8 tag; -}; - #define RSWITCH_NUM_IRQ_REGS (RSWITCH_MAX_NUM_QUEUES / BITS_PER_TYPE(u32)) struct rswitch_gwca { unsigned int index; @@ -982,7 +974,6 @@ struct rswitch_gwca { struct rswitch_gwca_queue *queues; int num_queues; struct rswitch_gwca_queue ts_queue; - struct list_head ts_info_list; DECLARE_BITMAP(used, RSWITCH_MAX_NUM_QUEUES); u32 tx_irq_bits[RSWITCH_NUM_IRQ_REGS]; u32 rx_irq_bits[RSWITCH_NUM_IRQ_REGS]; @@ -990,6 +981,7 @@ struct rswitch_gwca { };
#define NUM_QUEUES_PER_NDEV 2 +#define TS_TAGS_PER_PORT 256 struct rswitch_device { struct rswitch_private *priv; struct net_device *ndev; @@ -997,7 +989,8 @@ struct rswitch_device { void __iomem *addr; struct rswitch_gwca_queue *tx_queue; struct rswitch_gwca_queue *rx_queue; - u8 ts_tag; + struct sk_buff *ts_skb[TS_TAGS_PER_PORT]; + DECLARE_BITMAP(ts_skb_used, TS_TAGS_PER_PORT); bool disabled;
int port;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marios Makassikis mmakassikis@freebox.fr
[ Upstream commit 83c47d9e0ce79b5d7c0b21b9f35402dbde0fa15c ]
This changes the semantics of req_running to count all in-flight requests on a given connection, rather than the number of elements in the conn->request list. The latter is used only in smb2_cancel, and the counter is not used
Signed-off-by: Marios Makassikis mmakassikis@freebox.fr Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: 43fb7bce8866 ("ksmbd: fix broken transfers when exceeding max simultaneous operations") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/connection.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/smb/server/connection.c b/fs/smb/server/connection.c index a751793c4512..5ded9372b2fb 100644 --- a/fs/smb/server/connection.c +++ b/fs/smb/server/connection.c @@ -120,8 +120,8 @@ void ksmbd_conn_enqueue_request(struct ksmbd_work *work) if (conn->ops->get_cmd_val(work) != SMB2_CANCEL_HE) requests_queue = &conn->requests;
+ atomic_inc(&conn->req_running); if (requests_queue) { - atomic_inc(&conn->req_running); spin_lock(&conn->request_lock); list_add_tail(&work->request_entry, requests_queue); spin_unlock(&conn->request_lock); @@ -132,11 +132,12 @@ void ksmbd_conn_try_dequeue_request(struct ksmbd_work *work) { struct ksmbd_conn *conn = work->conn;
+ atomic_dec(&conn->req_running); + if (list_empty(&work->request_entry) && list_empty(&work->async_request_entry)) return;
- atomic_dec(&conn->req_running); spin_lock(&conn->request_lock); list_del_init(&work->request_entry); spin_unlock(&conn->request_lock);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marios Makassikis mmakassikis@freebox.fr
[ Upstream commit 43fb7bce8866e793275c4f9f25af6a37745f3416 ]
Since commit 0a77d947f599 ("ksmbd: check outstanding simultaneous SMB operations"), ksmbd enforces a maximum number of simultaneous operations for a connection. The problem is that reaching the limit causes ksmbd to close the socket, and the client has no indication that it should have slowed down.
This behaviour can be reproduced by setting "smb2 max credits = 128" (or lower), and transferring a large file (25GB).
smbclient fails as below:
$ smbclient //192.168.1.254/testshare -U user%pass smb: > put file.bin cli_push returned NT_STATUS_USER_SESSION_DELETED putting file file.bin as \file.bin smb2cli_req_compound_submit: Insufficient credits. 0 available, 1 needed NT_STATUS_INTERNAL_ERROR closing remote file \file.bin smb: > smb2cli_req_compound_submit: Insufficient credits. 0 available, 1 needed
Windows clients fail with 0x8007003b (with smaller files even).
Fix this by delaying reading from the socket until there's room to allocate a request. This effectively applies backpressure on the client, so the transfer completes, albeit at a slower rate.
Fixes: 0a77d947f599 ("ksmbd: check outstanding simultaneous SMB operations") Signed-off-by: Marios Makassikis mmakassikis@freebox.fr Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/connection.c | 13 +++++++++++-- fs/smb/server/connection.h | 1 - fs/smb/server/server.c | 7 +------ fs/smb/server/server.h | 1 + fs/smb/server/transport_ipc.c | 5 ++++- 5 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/fs/smb/server/connection.c b/fs/smb/server/connection.c index 5ded9372b2fb..f3178570329a 100644 --- a/fs/smb/server/connection.c +++ b/fs/smb/server/connection.c @@ -70,7 +70,6 @@ struct ksmbd_conn *ksmbd_conn_alloc(void) atomic_set(&conn->req_running, 0); atomic_set(&conn->r_count, 0); atomic_set(&conn->refcnt, 1); - atomic_set(&conn->mux_smb_requests, 0); conn->total_credits = 1; conn->outstanding_credits = 0;
@@ -133,6 +132,8 @@ void ksmbd_conn_try_dequeue_request(struct ksmbd_work *work) struct ksmbd_conn *conn = work->conn;
atomic_dec(&conn->req_running); + if (waitqueue_active(&conn->req_running_q)) + wake_up(&conn->req_running_q);
if (list_empty(&work->request_entry) && list_empty(&work->async_request_entry)) @@ -309,7 +310,7 @@ int ksmbd_conn_handler_loop(void *p) { struct ksmbd_conn *conn = (struct ksmbd_conn *)p; struct ksmbd_transport *t = conn->transport; - unsigned int pdu_size, max_allowed_pdu_size; + unsigned int pdu_size, max_allowed_pdu_size, max_req; char hdr_buf[4] = {0,}; int size;
@@ -319,6 +320,7 @@ int ksmbd_conn_handler_loop(void *p) if (t->ops->prepare && t->ops->prepare(t)) goto out;
+ max_req = server_conf.max_inflight_req; conn->last_active = jiffies; set_freezable(); while (ksmbd_conn_alive(conn)) { @@ -328,6 +330,13 @@ int ksmbd_conn_handler_loop(void *p) kvfree(conn->request_buf); conn->request_buf = NULL;
+recheck: + if (atomic_read(&conn->req_running) + 1 > max_req) { + wait_event_interruptible(conn->req_running_q, + atomic_read(&conn->req_running) < max_req); + goto recheck; + } + size = t->ops->read(t, hdr_buf, sizeof(hdr_buf), -1); if (size != sizeof(hdr_buf)) break; diff --git a/fs/smb/server/connection.h b/fs/smb/server/connection.h index 368295fb18a7..82343afc8d04 100644 --- a/fs/smb/server/connection.h +++ b/fs/smb/server/connection.h @@ -107,7 +107,6 @@ struct ksmbd_conn { __le16 signing_algorithm; bool binding; atomic_t refcnt; - atomic_t mux_smb_requests; };
struct ksmbd_conn_ops { diff --git a/fs/smb/server/server.c b/fs/smb/server/server.c index 7f9aca4aa742..71e1c1db9dea 100644 --- a/fs/smb/server/server.c +++ b/fs/smb/server/server.c @@ -270,7 +270,6 @@ static void handle_ksmbd_work(struct work_struct *wk)
ksmbd_conn_try_dequeue_request(work); ksmbd_free_work_struct(work); - atomic_dec(&conn->mux_smb_requests); /* * Checking waitqueue to dropping pending requests on * disconnection. waitqueue_active is safe because it @@ -300,11 +299,6 @@ static int queue_ksmbd_work(struct ksmbd_conn *conn) if (err) return 0;
- if (atomic_inc_return(&conn->mux_smb_requests) >= conn->vals->max_credits) { - atomic_dec_return(&conn->mux_smb_requests); - return -ENOSPC; - } - work = ksmbd_alloc_work_struct(); if (!work) { pr_err("allocation for work failed\n"); @@ -367,6 +361,7 @@ static int server_conf_init(void) server_conf.auth_mechs |= KSMBD_AUTH_KRB5 | KSMBD_AUTH_MSKRB5; #endif + server_conf.max_inflight_req = SMB2_MAX_CREDITS; return 0; }
diff --git a/fs/smb/server/server.h b/fs/smb/server/server.h index db7278181760..4d06f2eb0d6a 100644 --- a/fs/smb/server/server.h +++ b/fs/smb/server/server.h @@ -42,6 +42,7 @@ struct ksmbd_server_config { struct smb_sid domain_sid; unsigned int auth_mechs; unsigned int max_connections; + unsigned int max_inflight_req;
char *conf[SERVER_CONF_WORK_GROUP + 1]; }; diff --git a/fs/smb/server/transport_ipc.c b/fs/smb/server/transport_ipc.c index 8752ac82c557..c12b70d01880 100644 --- a/fs/smb/server/transport_ipc.c +++ b/fs/smb/server/transport_ipc.c @@ -305,8 +305,11 @@ static int ipc_server_config_on_startup(struct ksmbd_startup_request *req) init_smb2_max_write_size(req->smb2_max_write); if (req->smb2_max_trans) init_smb2_max_trans_size(req->smb2_max_trans); - if (req->smb2_max_credits) + if (req->smb2_max_credits) { init_smb2_max_credits(req->smb2_max_credits); + server_conf.max_inflight_req = + req->smb2_max_credits; + } if (req->smbd_max_io_size) init_smbd_max_io_size(req->smbd_max_io_size);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 7203d10e93b6e6e1d19481ef7907de6a9133a467 ]
There is a check for NULL at the start of create_txqs() and create_rxqs() which tess if "nic_dev->txqs" is non-NULL. The intention is that if the device is already open and the queues are already created then we don't create them a second time.
However, the bug is that if we have an error in the create_txqs() then the pointer doesn't get set back to NULL. The NULL check at the start of the function will say that it's already open when it's not and the device can't be used.
Set ->txqs back to NULL on cleanup on error.
Fixes: c3e79baf1b03 ("net-next/hinic: Add logical Txq and Rxq") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/0cc98faf-a0ed-4565-a55b-0fa2734bc205@stanley.mounta... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/huawei/hinic/hinic_main.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c b/drivers/net/ethernet/huawei/hinic/hinic_main.c index 499c657d37a9..579c3dc0014c 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_main.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c @@ -172,6 +172,7 @@ static int create_txqs(struct hinic_dev *nic_dev) hinic_sq_dbgfs_uninit(nic_dev);
devm_kfree(&netdev->dev, nic_dev->txqs); + nic_dev->txqs = NULL; return err; }
@@ -268,6 +269,7 @@ static int create_rxqs(struct hinic_dev *nic_dev) hinic_rq_dbgfs_uninit(nic_dev);
devm_kfree(&netdev->dev, nic_dev->rxqs); + nic_dev->rxqs = NULL; return err; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
[ Upstream commit 0cb2c504d79e7caa3abade3f466750c82ad26f01 ]
The OF node obtained by of_parse_phandle() is not freed. Call of_node_put() to balance the refcount.
This bug was found by an experimental static analysis tool that I am developing.
Fixes: 1676aba5ef7e ("net: ethernet: bgmac: device tree phy enablement") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20241214014912.2810315-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bgmac-platform.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bgmac-platform.c b/drivers/net/ethernet/broadcom/bgmac-platform.c index b4381cd41979..3f4e8bac40c1 100644 --- a/drivers/net/ethernet/broadcom/bgmac-platform.c +++ b/drivers/net/ethernet/broadcom/bgmac-platform.c @@ -171,6 +171,7 @@ static int platform_phy_connect(struct bgmac *bgmac) static int bgmac_probe(struct platform_device *pdev) { struct device_node *np = pdev->dev.of_node; + struct device_node *phy_node; struct bgmac *bgmac; struct resource *regs; int ret; @@ -236,7 +237,9 @@ static int bgmac_probe(struct platform_device *pdev) bgmac->cco_ctl_maskset = platform_bgmac_cco_ctl_maskset; bgmac->get_bus_clock = platform_bgmac_get_bus_clock; bgmac->cmn_maskset32 = platform_bgmac_cmn_maskset32; - if (of_parse_phandle(np, "phy-handle", 0)) { + phy_node = of_parse_phandle(np, "phy-handle", 0); + if (phy_node) { + of_node_put(phy_node); bgmac->phy_connect = platform_phy_connect; } else { bgmac->phy_connect = bgmac_phy_connect_direct;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Laight David.Laight@ACULAB.COM
[ Upstream commit cf2c97423a4f89c8b798294d3f34ecfe7e7035c3 ]
The 'max_avail' value is calculated from the system memory size using order_base_2(). order_base_2(x) is defined as '(x) ? fn(x) : 0'. The compiler generates two copies of the code that follows and then expands clamp(max, min, PAGE_SHIFT - 12) (11 on 32bit). This triggers a compile-time assert since min is 5.
In reality a system would have to have less than 512MB memory for the bounds passed to clamp to be reversed.
Swap the order of the arguments to clamp() to avoid the warning.
Replace the clamp_val() on the line below with clamp(). clamp_val() is just 'an accident waiting to happen' and not needed here.
Detected by compile time checks added to clamp(), specifically: minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
Reported-by: Linux Kernel Functional Testing lkft@linaro.org Closes: https://lore.kernel.org/all/CA+G9fYsT34UkGFKxus63H6UVpYi5GRZkezT9MRLfAbM3f6k... Fixes: 4f325e26277b ("ipvs: dynamically limit the connection hash table") Tested-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Reviewed-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: David Laight david.laight@aculab.com Acked-by: Julian Anastasov ja@ssi.bg Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipvs/ip_vs_conn.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c index 9065da3cdd12..8182833a3582 100644 --- a/net/netfilter/ipvs/ip_vs_conn.c +++ b/net/netfilter/ipvs/ip_vs_conn.c @@ -1494,8 +1494,8 @@ int __init ip_vs_conn_init(void) max_avail -= 2; /* ~4 in hash row */ max_avail -= 1; /* IPVS up to 1/2 of mem */ max_avail -= order_base_2(sizeof(struct ip_vs_conn)); - max = clamp(max, min, max_avail); - ip_vs_conn_tab_bits = clamp_val(ip_vs_conn_tab_bits, min, max); + max = clamp(max_avail, min, max); + ip_vs_conn_tab_bits = clamp(ip_vs_conn_tab_bits, min, max); ip_vs_conn_tab_size = 1 << ip_vs_conn_tab_bits; ip_vs_conn_tab_mask = ip_vs_conn_tab_size - 1;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Phil Sutter phil@nwl.cc
[ Upstream commit 70b6f46a4ed8bd56c85ffff22df91e20e8c85e33 ]
With CONFIG_PROVE_LOCKING, when creating a set of type bitmap:ip, adding it to a set of type list:set and populating it from iptables SET target triggers a kernel warning:
| WARNING: possible recursive locking detected | 6.12.0-rc7-01692-g5e9a28f41134-dirty #594 Not tainted | -------------------------------------------- | ping/4018 is trying to acquire lock: | ffff8881094a6848 (&set->lock){+.-.}-{2:2}, at: ip_set_add+0x28c/0x360 [ip_set] | | but task is already holding lock: | ffff88811034c048 (&set->lock){+.-.}-{2:2}, at: ip_set_add+0x28c/0x360 [ip_set]
This is a false alarm: ipset does not allow nested list:set type, so the loop in list_set_kadd() can never encounter the outer set itself. No other set type supports embedded sets, so this is the only case to consider.
To avoid the false report, create a distinct lock class for list:set type ipset locks.
Fixes: f830837f0eed ("netfilter: ipset: list:set set type support") Signed-off-by: Phil Sutter phil@nwl.cc Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipset/ip_set_list_set.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c index bfae7066936b..db794fe1300e 100644 --- a/net/netfilter/ipset/ip_set_list_set.c +++ b/net/netfilter/ipset/ip_set_list_set.c @@ -611,6 +611,8 @@ init_list_set(struct net *net, struct ip_set *set, u32 size) return true; }
+static struct lock_class_key list_set_lockdep_key; + static int list_set_create(struct net *net, struct ip_set *set, struct nlattr *tb[], u32 flags) @@ -627,6 +629,7 @@ list_set_create(struct net *net, struct ip_set *set, struct nlattr *tb[], if (size < IP_SET_LIST_MIN_SIZE) size = IP_SET_LIST_MIN_SIZE;
+ lockdep_set_class(&set->lock, &list_set_lockdep_key); set->variant = &set_variant; set->dsize = ip_set_elem_len(set, tb, sizeof(struct set_elem), __alignof__(struct set_elem));
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Moreno amorenoz@redhat.com
[ Upstream commit a17975992cc11588767175247ccaae1213a8b582 ]
Fix the way tcpdump is executed by: - Using the right variable for the namespace. Currently the use of the empty "ns" makes the command fail. - Waiting until it starts to capture to ensure the interesting traffic is caught on slow systems. - Using line-buffered output to ensure logs are available when the test is paused with "-p". Otherwise the last chunk of data might only be written when tcpdump is killed.
Fixes: 74cc26f416b9 ("selftests: openvswitch: add interface support") Signed-off-by: Adrian Moreno amorenoz@redhat.com Acked-by: Eelco Chaudron echaudro@redhat.com Link: https://patch.msgid.link/20241217211652.483016-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/openvswitch/openvswitch.sh | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh index bab7436c6834..a0f4764ad0af 100755 --- a/tools/testing/selftests/net/openvswitch/openvswitch.sh +++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh @@ -128,8 +128,10 @@ ovs_add_netns_and_veths () { ovs_add_if "$1" "$2" "$4" -u || return 1 fi
- [ $TRACING -eq 1 ] && ovs_netns_spawn_daemon "$1" "$ns" \ - tcpdump -i any -s 65535 + if [ $TRACING -eq 1 ]; then + ovs_netns_spawn_daemon "$1" "$3" tcpdump -l -i any -s 6553 + ovs_wait grep -q "listening on any" ${ovs_dir}/stderr + fi
return 0 }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
[ Upstream commit 572af9f284669d31d9175122bbef9bc62cea8ded ]
fwnode_find_mii_timestamper() calls of_parse_phandle_with_fixed_args() but does not decrement the refcount of the obtained OF node. Add an of_node_put() call before returning from the function.
This bug was detected by an experimental static analysis tool that I am developing.
Fixes: bc1bee3b87ee ("net: mdiobus: Introduce fwnode_mdiobus_register_phy()") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20241218035106.1436405-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/mdio/fwnode_mdio.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/net/mdio/fwnode_mdio.c b/drivers/net/mdio/fwnode_mdio.c index 1183ef5e203e..c62f2e85414d 100644 --- a/drivers/net/mdio/fwnode_mdio.c +++ b/drivers/net/mdio/fwnode_mdio.c @@ -38,6 +38,7 @@ fwnode_find_pse_control(struct fwnode_handle *fwnode) static struct mii_timestamper * fwnode_find_mii_timestamper(struct fwnode_handle *fwnode) { + struct mii_timestamper *mii_ts; struct of_phandle_args arg; int err;
@@ -51,10 +52,16 @@ fwnode_find_mii_timestamper(struct fwnode_handle *fwnode) else if (err) return ERR_PTR(err);
- if (arg.args_count != 1) - return ERR_PTR(-EINVAL); + if (arg.args_count != 1) { + mii_ts = ERR_PTR(-EINVAL); + goto put_node; + } + + mii_ts = register_mii_timestamper(arg.np, arg.args[0]);
- return register_mii_timestamper(arg.np, arg.args[0]); +put_node: + of_node_put(arg.np); + return mii_ts; }
int fwnode_mdiobus_phy_device_register(struct mii_bus *mdio,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prathamesh Shete pshete@nvidia.com
commit a56335c85b592cb2833db0a71f7112b7d9f0d56b upstream.
Value 0 in ADMA length descriptor is interpreted as 65536 on new Tegra chips, remove SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk to make sure max ADMA2 length is 65536.
Fixes: 4346b7c7941d ("mmc: tegra: Add Tegra186 support") Cc: stable@vger.kernel.org Signed-off-by: Prathamesh Shete pshete@nvidia.com Acked-by: Thierry Reding treding@nvidia.com Acked-by: Adrian Hunter adrian.hunter@intel.com Message-ID: 20241209101009.22710-1-pshete@nvidia.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-tegra.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/mmc/host/sdhci-tegra.c +++ b/drivers/mmc/host/sdhci-tegra.c @@ -1525,7 +1525,6 @@ static const struct sdhci_pltfm_data sdh .quirks = SDHCI_QUIRK_BROKEN_TIMEOUT_VAL | SDHCI_QUIRK_SINGLE_POWER_WRITE | SDHCI_QUIRK_NO_HISPD_BIT | - SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC | SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN, .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN | SDHCI_QUIRK2_ISSUE_CMD_DAT_RESET_TOGETHER,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
commit f3d87abe11ed04d1b23a474a212f0e5deeb50892 upstream.
Current implementation leaves pdev->dev as a wakeup source. Add a device_init_wakeup(&pdev->dev, false) call in the .remove() function and in the error path of the .probe() function.
Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Fixes: 527f36f5efa4 ("mmc: mediatek: add support for SDIO eint wakup IRQ") Cc: stable@vger.kernel.org Message-ID: 20241203023442.2434018-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/mtk-sd.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/mmc/host/mtk-sd.c +++ b/drivers/mmc/host/mtk-sd.c @@ -2862,6 +2862,7 @@ release_clk: msdc_gate_clock(host); platform_set_drvdata(pdev, NULL); release_mem: + device_init_wakeup(&pdev->dev, false); if (host->dma.gpd) dma_free_coherent(&pdev->dev, 2 * sizeof(struct mt_gpdma_desc), @@ -2895,6 +2896,7 @@ static void msdc_drv_remove(struct platf host->dma.gpd, host->dma.gpd_addr); dma_free_coherent(&pdev->dev, MAX_BD_NUM * sizeof(struct mt_bdma_desc), host->dma.bd, host->dma.bd_addr); + device_init_wakeup(&pdev->dev, false); }
static void msdc_save_reg(struct msdc_host *host)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov (AMD) bp@alien8.de
commit 747367340ca6b5070728b86ae36ad6747f66b2fb upstream.
The intent of the check is to see whether at least one UMC has ECC enabled. So do that instead of tracking which ones are enabled in masks which are too small in size anyway and lead to not loading the driver on Zen4 machines with UMCs enabled over UMC8.
Fixes: e2be5955a886 ("EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh") Reported-by: Avadhut Naik avadhut.naik@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Tested-by: Avadhut Naik avadhut.naik@amd.com Reviewed-by: Avadhut Naik avadhut.naik@amd.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20241210212054.3895697-1-avadhut.naik@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/edac/amd64_edac.c | 34 +++++++++++----------------------- 1 file changed, 11 insertions(+), 23 deletions(-)
--- a/drivers/edac/amd64_edac.c +++ b/drivers/edac/amd64_edac.c @@ -3620,36 +3620,24 @@ static bool dct_ecc_enabled(struct amd64
static bool umc_ecc_enabled(struct amd64_pvt *pvt) { - u8 umc_en_mask = 0, ecc_en_mask = 0; - u16 nid = pvt->mc_node_id; struct amd64_umc *umc; - u8 ecc_en = 0, i; + bool ecc_en = false; + int i;
+ /* Check whether at least one UMC is enabled: */ for_each_umc(i) { umc = &pvt->umc[i];
- /* Only check enabled UMCs. */ - if (!(umc->sdp_ctrl & UMC_SDP_INIT)) - continue; - - umc_en_mask |= BIT(i); - - if (umc->umc_cap_hi & UMC_ECC_ENABLED) - ecc_en_mask |= BIT(i); + if (umc->sdp_ctrl & UMC_SDP_INIT && + umc->umc_cap_hi & UMC_ECC_ENABLED) { + ecc_en = true; + break; + } }
- /* Check whether at least one UMC is enabled: */ - if (umc_en_mask) - ecc_en = umc_en_mask == ecc_en_mask; - else - edac_dbg(0, "Node %d: No enabled UMCs.\n", nid); - - edac_dbg(3, "Node %d: DRAM ECC %s.\n", nid, (ecc_en ? "enabled" : "disabled")); - - if (!ecc_en) - return false; - else - return true; + edac_dbg(3, "Node %d: DRAM ECC %s.\n", pvt->mc_node_id, (ecc_en ? "enabled" : "disabled")); + + return ecc_en; }
static inline void
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 1201f226c863b7da739f7420ddba818cedf372fc upstream.
Snapshot the output of CPUID.0xD.[1..n] during kvm.ko initiliaization to avoid the overead of CPUID during runtime. The offset, size, and metadata for CPUID.0xD.[1..n] sub-leaves does not depend on XCR0 or XSS values, i.e. is constant for a given CPU, and thus can be cached during module load.
On Intel's Emerald Rapids, CPUID is *wildly* expensive, to the point where recomputing XSAVE offsets and sizes results in a 4x increase in latency of nested VM-Enter and VM-Exit (nested transitions can trigger xstate_required_size() multiple times per transition), relative to using cached values. The issue is easily visible by running `perf top` while triggering nested transitions: kvm_update_cpuid_runtime() shows up at a whopping 50%.
As measured via RDTSC from L2 (using KVM-Unit-Test's CPUID VM-Exit test and a slightly modified L1 KVM to handle CPUID in the fastpath), a nested roundtrip to emulate CPUID on Skylake (SKX), Icelake (ICX), and Emerald Rapids (EMR) takes:
SKX 11650 ICX 22350 EMR 28850
Using cached values, the latency drops to:
SKX 6850 ICX 9000 EMR 7900
The underlying issue is that CPUID itself is slow on ICX, and comically slow on EMR. The problem is exacerbated on CPUs which support XSAVES and/or XSAVEC, as KVM invokes xstate_required_size() twice on each runtime CPUID update, and because there are more supported XSAVE features (CPUID for supported XSAVE feature sub-leafs is significantly slower).
SKX: CPUID.0xD.2 = 348 cycles CPUID.0xD.3 = 400 cycles CPUID.0xD.4 = 276 cycles CPUID.0xD.5 = 236 cycles <other sub-leaves are similar>
EMR: CPUID.0xD.2 = 1138 cycles CPUID.0xD.3 = 1362 cycles CPUID.0xD.4 = 1068 cycles CPUID.0xD.5 = 910 cycles CPUID.0xD.6 = 914 cycles CPUID.0xD.7 = 1350 cycles CPUID.0xD.8 = 734 cycles CPUID.0xD.9 = 766 cycles CPUID.0xD.10 = 732 cycles CPUID.0xD.11 = 718 cycles CPUID.0xD.12 = 734 cycles CPUID.0xD.13 = 1700 cycles CPUID.0xD.14 = 1126 cycles CPUID.0xD.15 = 898 cycles CPUID.0xD.16 = 716 cycles CPUID.0xD.17 = 748 cycles CPUID.0xD.18 = 776 cycles
Note, updating runtime CPUID information multiple times per nested transition is itself a flaw, especially since CPUID is a mandotory intercept on both Intel and AMD. E.g. KVM doesn't need to ensure emulated CPUID state is up-to-date while running L2. That flaw will be fixed in a future patch, as deferring runtime CPUID updates is more subtle than it appears at first glance, the benefits aren't super critical to have once the XSAVE issue is resolved, and caching CPUID output is desirable even if KVM's updates are deferred.
Cc: Jim Mattson jmattson@google.com Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Message-ID: 20241211013302.1347853-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/cpuid.c | 31 ++++++++++++++++++++++++++----- arch/x86/kvm/cpuid.h | 1 + arch/x86/kvm/x86.c | 2 ++ 3 files changed, 29 insertions(+), 5 deletions(-)
--- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -36,6 +36,26 @@ u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly; EXPORT_SYMBOL_GPL(kvm_cpu_caps);
+struct cpuid_xstate_sizes { + u32 eax; + u32 ebx; + u32 ecx; +}; + +static struct cpuid_xstate_sizes xstate_sizes[XFEATURE_MAX] __ro_after_init; + +void __init kvm_init_xstate_sizes(void) +{ + u32 ign; + int i; + + for (i = XFEATURE_YMM; i < ARRAY_SIZE(xstate_sizes); i++) { + struct cpuid_xstate_sizes *xs = &xstate_sizes[i]; + + cpuid_count(0xD, i, &xs->eax, &xs->ebx, &xs->ecx, &ign); + } +} + u32 xstate_required_size(u64 xstate_bv, bool compacted) { int feature_bit = 0; @@ -44,14 +64,15 @@ u32 xstate_required_size(u64 xstate_bv, xstate_bv &= XFEATURE_MASK_EXTEND; while (xstate_bv) { if (xstate_bv & 0x1) { - u32 eax, ebx, ecx, edx, offset; - cpuid_count(0xD, feature_bit, &eax, &ebx, &ecx, &edx); + struct cpuid_xstate_sizes *xs = &xstate_sizes[feature_bit]; + u32 offset; + /* ECX[1]: 64B alignment in compacted form */ if (compacted) - offset = (ecx & 0x2) ? ALIGN(ret, 64) : ret; + offset = (xs->ecx & 0x2) ? ALIGN(ret, 64) : ret; else - offset = ebx; - ret = max(ret, offset + eax); + offset = xs->ebx; + ret = max(ret, offset + xs->eax); }
xstate_bv >>= 1; --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -32,6 +32,7 @@ int kvm_vcpu_ioctl_get_cpuid2(struct kvm bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx, bool exact_only);
+void __init kvm_init_xstate_sizes(void); u32 xstate_required_size(u64 xstate_bv, bool compacted);
int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu); --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -13694,6 +13694,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit
static int __init kvm_x86_init(void) { + kvm_init_xstate_sizes(); + kvm_mmu_x86_module_init(); mitigate_smt_rsb &= boot_cpu_has_bug(X86_BUG_SMT_RSB) && cpu_smt_possible(); return 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
commit 429fde2d81bcef0ebab002215358955704586457 upstream.
syzbot reported the following crash [1]
Issue came with the blamed commit. Instead of going through all the iov components, we keep using the first one and end up with a malformed skb.
[1]
kernel BUG at net/core/skbuff.c:2849 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 6230 Comm: syz-executor132 Not tainted 6.13.0-rc1-syzkaller-00407-g96b6fcc0ee41 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024 RIP: 0010:__pskb_pull_tail+0x1568/0x1570 net/core/skbuff.c:2848 Code: 38 c1 0f 8c 32 f1 ff ff 4c 89 f7 e8 92 96 74 f8 e9 25 f1 ff ff e8 e8 ae 09 f8 48 8b 5c 24 08 e9 eb fb ff ff e8 d9 ae 09 f8 90 <0f> 0b 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffc90004cbef30 EFLAGS: 00010293 RAX: ffffffff8995c347 RBX: 00000000fffffff2 RCX: ffff88802cf45a00 RDX: 0000000000000000 RSI: 00000000fffffff2 RDI: 0000000000000000 RBP: ffff88807df0c06a R08: ffffffff8995b084 R09: 1ffff1100fbe185c R10: dffffc0000000000 R11: ffffed100fbe185d R12: ffff888076e85d50 R13: ffff888076e85c80 R14: ffff888076e85cf4 R15: ffff888076e85c80 FS: 00007f0dca6ea6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0dca6ead58 CR3: 00000000119da000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_cow_data+0x2da/0xcb0 net/core/skbuff.c:5284 tipc_aead_decrypt net/tipc/crypto.c:894 [inline] tipc_crypto_rcv+0x402/0x24e0 net/tipc/crypto.c:1844 tipc_rcv+0x57e/0x12a0 net/tipc/node.c:2109 tipc_l2_rcv_msg+0x2bd/0x450 net/tipc/bearer.c:668 __netif_receive_skb_list_ptype net/core/dev.c:5720 [inline] __netif_receive_skb_list_core+0x8b7/0x980 net/core/dev.c:5762 __netif_receive_skb_list net/core/dev.c:5814 [inline] netif_receive_skb_list_internal+0xa51/0xe30 net/core/dev.c:5905 gro_normal_list include/net/gro.h:515 [inline] napi_complete_done+0x2b5/0x870 net/core/dev.c:6256 napi_complete include/linux/netdevice.h:567 [inline] tun_get_user+0x2ea0/0x4890 drivers/net/tun.c:1982 tun_chr_write_iter+0x10d/0x1f0 drivers/net/tun.c:2057 do_iter_readv_writev+0x600/0x880 vfs_writev+0x376/0xba0 fs/read_write.c:1050 do_writev+0x1b6/0x360 fs/read_write.c:1096 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: de4f5fed3f23 ("iov_iter: add iter_iovec() helper") Reported-by: syzbot+4f66250f6663c0c1d67e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/675b61aa.050a0220.599f4.00bb.GAE@google.com/T... Cc: stable@vger.kernel.org Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Joe Damato jdamato@fastly.com Reviewed-by: Jens Axboe axboe@kernel.dk Acked-by: Willem de Bruijn willemb@google.com Acked-by: Michael S. Tsirkin mst@redhat.com Link: https://patch.msgid.link/20241212222247.724674-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/tun.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1487,7 +1487,7 @@ static struct sk_buff *tun_napi_alloc_fr skb->truesize += skb->data_len;
for (i = 1; i < it->nr_segs; i++) { - const struct iovec *iov = iter_iov(it); + const struct iovec *iov = iter_iov(it) + i; size_t fragsz = iov->iov_len; struct page *page; void *frag;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit fbbd84af6ba70334335bdeba3ae536cf751c14c6 upstream.
The "gl->tot_len" variable is controlled by the user. It comes from process_responses(). On 32bit systems, the "gl->tot_len + sizeof(struct cpl_pass_accept_req) + sizeof(struct rss_header)" addition could have an integer wrapping bug. Use size_add() to prevent this.
Fixes: a08943947873 ("crypto: chtls - Register chtls with net tls") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/c6bfb23c-2db2-4e1b-b8ab-ba3925c82ef5@stanley.mounta... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c @@ -346,8 +346,9 @@ static struct sk_buff *copy_gl_to_skb_pk * driver. Once driver synthesizes cpl_pass_accpet_req the skb will go * through the regular cpl_pass_accept_req processing in TOM. */ - skb = alloc_skb(gl->tot_len + sizeof(struct cpl_pass_accept_req) - - pktshift, GFP_ATOMIC); + skb = alloc_skb(size_add(gl->tot_len, + sizeof(struct cpl_pass_accept_req)) - + pktshift, GFP_ATOMIC); if (unlikely(!skb)) return NULL; __skb_put(skb, gl->tot_len + sizeof(struct cpl_pass_accept_req)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geert Uytterhoeven geert+renesas@glider.be
commit de6b43798d9043a7c749a0428dbb02d5fff156e5 upstream.
Currently, the RIIC driver may run the I2C bus faster than requested, which may cause subtle failures. E.g. Biju reported a measured bus speed of 450 kHz instead of the expected maximum of 400 kHz on RZ/G2L.
The initial calculation of the bus period uses DIV_ROUND_UP(), to make sure the actual bus speed never becomes faster than the requested bus speed. However, the subsequent division-by-two steps do not use round-up, which may lead to a too-small period, hence a too-fast and possible out-of-spec bus speed. E.g. on RZ/Five, requesting a bus speed of 100 resp. 400 kHz will yield too-fast target bus speeds of 100806 resp. 403226 Hz instead of 97656 resp. 390625 Hz.
Fix this by using DIV_ROUND_UP() in the subsequent divisions, too.
Tested on RZ/A1H, RZ/A2M, and RZ/Five.
Fixes: d982d66514192cdb ("i2c: riic: remove clock and frequency restrictions") Reported-by: Biju Das biju.das.jz@bp.renesas.com Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Cc: stable@vger.kernel.org # v4.15+ Link: https://lore.kernel.org/r/c59aea77998dfea1b4456c4b33b55ab216fcbf5e.173228474... Signed-off-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-riic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i2c/busses/i2c-riic.c +++ b/drivers/i2c/busses/i2c-riic.c @@ -324,7 +324,7 @@ static int riic_init_hw(struct riic_dev if (brl <= (0x1F + 3)) break;
- total_ticks /= 2; + total_ticks = DIV_ROUND_UP(total_ticks, 2); rate /= 2; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Bottomley James.Bottomley@HansenPartnership.com
commit 2ab0837cb91b7de507daa145d17b3b6b2efb3abf upstream.
When looking up a non-existent file, efivarfs returns -EINVAL if the file does not conform to the NAME-GUID format and -ENOENT if it does. This is caused by efivars_d_hash() returning -EINVAL if the name is not formatted correctly. This error is returned before simple_lookup() returns a negative dentry, and is the error value that the user sees.
Fix by removing this check. If the file does not exist, simple_lookup() will return a negative dentry leading to -ENOENT and efivarfs_create() already has a validity check before it creates an entry (and will correctly return -EINVAL)
Signed-off-by: James Bottomley James.Bottomley@HansenPartnership.com Cc: stable@vger.kernel.org [ardb: make efivarfs_valid_name() static] Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/efivarfs/inode.c | 2 +- fs/efivarfs/internal.h | 1 - fs/efivarfs/super.c | 3 --- 3 files changed, 1 insertion(+), 5 deletions(-)
--- a/fs/efivarfs/inode.c +++ b/fs/efivarfs/inode.c @@ -47,7 +47,7 @@ struct inode *efivarfs_get_inode(struct * * VariableName-12345678-1234-1234-1234-1234567891bc */ -bool efivarfs_valid_name(const char *str, int len) +static bool efivarfs_valid_name(const char *str, int len) { const char *s = str + len - EFI_VARIABLE_GUID_LEN;
--- a/fs/efivarfs/internal.h +++ b/fs/efivarfs/internal.h @@ -50,7 +50,6 @@ bool efivar_variable_is_removable(efi_gu
extern const struct file_operations efivarfs_file_operations; extern const struct inode_operations efivarfs_dir_inode_operations; -extern bool efivarfs_valid_name(const char *str, int len); extern struct inode *efivarfs_get_inode(struct super_block *sb, const struct inode *dir, int mode, dev_t dev, bool is_removable); --- a/fs/efivarfs/super.c +++ b/fs/efivarfs/super.c @@ -107,9 +107,6 @@ static int efivarfs_d_hash(const struct const unsigned char *s = qstr->name; unsigned int len = qstr->len;
- if (!efivarfs_valid_name(s, len)) - return -EINVAL; - while (len-- > EFI_VARIABLE_GUID_LEN) hash = partial_name_hash(*s++, hash);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
commit aef25be35d23ec768eed08bfcf7ca3cf9685bc28 upstream.
The Hexagon-specific constant extender optimization in LLVM may crash on Linux kernel code [1], such as fs/bcache/btree_io.c after commit 32ed4a620c54 ("bcachefs: Btree path tracepoints") in 6.12:
clang: llvm/lib/Target/Hexagon/HexagonConstExtenders.cpp:745: bool (anonymous namespace)::HexagonConstExtenders::ExtRoot::operator<(const HCE::ExtRoot &) const: Assertion `ThisB->getParent() == OtherB->getParent()' failed. Stack dump: 0. Program arguments: clang --target=hexagon-linux-musl ... fs/bcachefs/btree_io.c 1. <eof> parser at end of file 2. Code generation 3. Running pass 'Function Pass Manager' on module 'fs/bcachefs/btree_io.c'. 4. Running pass 'Hexagon constant-extender optimization' on function '@__btree_node_lock_nopath'
Without assertions enabled, there is just a hang during compilation.
This has been resolved in LLVM main (20.0.0) [2] and backported to LLVM 19.1.0 but the kernel supports LLVM 13.0.1 and newer, so disable the constant expander optimization using the '-mllvm' option when using a toolchain that is not fixed.
Cc: stable@vger.kernel.org Link: https://github.com/llvm/llvm-project/issues/99714 [1] Link: https://github.com/llvm/llvm-project/commit/68df06a0b2998765cb0a41353fcf0919... [2] Link: https://github.com/llvm/llvm-project/commit/2ab8d93061581edad3501561722ebd56... [3] Reviewed-by: Brian Cain bcain@quicinc.com Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/hexagon/Makefile | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/arch/hexagon/Makefile +++ b/arch/hexagon/Makefile @@ -32,3 +32,9 @@ KBUILD_LDFLAGS += $(ldflags-y) TIR_NAME := r19 KBUILD_CFLAGS += -ffixed-$(TIR_NAME) -DTHREADINFO_REG=$(TIR_NAME) -D__linux__ KBUILD_AFLAGS += -DTHREADINFO_REG=$(TIR_NAME) + +# Disable HexagonConstExtenders pass for LLVM versions prior to 19.1.0 +# https://github.com/llvm/llvm-project/issues/99714 +ifneq ($(call clang-min-version, 190100),y) +KBUILD_CFLAGS += -mllvm -hexagon-cext=false +endif
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Swanemar d.swanemar@gmail.com
commit fdad4fb7c506bea8b419f70ff2163d99962e8ede upstream.
Add the following TCL IK512 compositions:
0x0530: Modem + Diag + AT + MBIM T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 3 Spd=10000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=1bbb ProdID=0530 Rev=05.04 S: Manufacturer=TCL S: Product=TCL 5G USB Dongle S: SerialNumber=3136b91a C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=896mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 4 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
0x0640: ECM + Modem + Diag + AT T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 4 Spd=10000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=1bbb ProdID=0640 Rev=05.04 S: Manufacturer=TCL S: Product=TCL 5G USB Dongle S: SerialNumber=3136b91a C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=896mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
Signed-off-by: Daniel Swanemar d.swanemar@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -2385,6 +2385,10 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x60) }, + { USB_DEVICE_INTERFACE_CLASS(0x1bbb, 0x0530, 0xff), /* TCL IK512 MBIM */ + .driver_info = NCTRL(1) }, + { USB_DEVICE_INTERFACE_CLASS(0x1bbb, 0x0640, 0xff), /* TCL IK512 ECM */ + .driver_info = NCTRL(3) }, { } /* Terminating entry */ }; MODULE_DEVICE_TABLE(usb, option_ids);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Hrusecky michal.hrusecky@turris.com
commit 724d461e44dfc0815624d2a9792f2f2beb7ee46d upstream.
Update the USB serial option driver to support MeiG Smart SLM770A.
ID 2dee:4d57 Marvell Mobile Composite Device Bus
T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=2dee ProdID=4d57 Rev= 1.00 S: Manufacturer=Marvell S: Product=Mobile Composite Device Bus C:* #Ifs= 6 Cfg#= 1 Atr=c0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=03 I:* If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=89(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0e(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Tested successfully connecting to the Internet via rndis interface after dialing via AT commands on If#=3 or If#=4. Not sure of the purpose of the other serial interfaces.
Signed-off-by: Michal Hrusecky michal.hrusecky@turris.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -625,6 +625,8 @@ static void option_instat_callback(struc #define MEIGSMART_PRODUCT_SRM825L 0x4d22 /* MeiG Smart SLM320 based on UNISOC UIS8910 */ #define MEIGSMART_PRODUCT_SLM320 0x4d41 +/* MeiG Smart SLM770A based on ASR1803 */ +#define MEIGSMART_PRODUCT_SLM770A 0x4d57
/* Device flags */
@@ -2382,6 +2384,7 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, TOZED_PRODUCT_LT70C, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, LUAT_PRODUCT_AIR720U, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SLM320, 0xff, 0, 0) }, + { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SLM770A, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x60) },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mank Wang mank.wang@netprisma.com
commit aa954ae08262bb5cd6ab18dd56a0b58c1315db8b upstream.
LCUK54-WRD's pid/vid 0x3731/0x010a 0x3731/0x010c
LCUK54-WWD's pid/vid 0x3731/0x010b 0x3731/0x010d
Above products use the exact same interface layout and option driver: MBIM + GNSS + DIAG + NMEA + AT + QDSS + DPL
T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 5 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=3731 ProdID=0101 Rev= 5.04 S: Manufacturer=NetPrisma S: Product=LCUK54-WRD S: SerialNumber=feeba631 C:* #Ifs= 8 Cfg#= 1 Atr=a0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00 I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=option E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none) E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=8f(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Mank Wang mank.wang@netprisma.com [ johan: use lower case hex notation ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -2377,6 +2377,18 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x0116, 0xff, 0xff, 0x30) }, /* NetPrisma LCUK54-WWD for Golbal EDU */ { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x0116, 0xff, 0x00, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x0116, 0xff, 0xff, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010a, 0xff, 0xff, 0x30) }, /* NetPrisma LCUK54-WRD for WWAN Ready */ + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010a, 0xff, 0x00, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010a, 0xff, 0xff, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010b, 0xff, 0xff, 0x30) }, /* NetPrisma LCUK54-WWD for WWAN Ready */ + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010b, 0xff, 0x00, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010b, 0xff, 0xff, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010c, 0xff, 0xff, 0x30) }, /* NetPrisma LCUK54-WRD for WWAN Ready */ + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010c, 0xff, 0x00, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010c, 0xff, 0xff, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010d, 0xff, 0xff, 0x30) }, /* NetPrisma LCUK54-WWD for WWAN Ready */ + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010d, 0xff, 0x00, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(0x3731, 0x010d, 0xff, 0xff, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(OPPO_VENDOR_ID, OPPO_PRODUCT_R11, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0xff, 0x40) },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jack Wu wojackbb@gmail.com
commit f07dfa6a1b65034a5c3ba3a555950d972f252757 upstream.
Add the MediaTek T7XX compositions:
T: Bus=03 Lev=01 Prnt=01 Port=05 Cnt=01 Dev#= 74 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0e8d ProdID=7129 Rev= 0.01 S: Manufacturer=MediaTek Inc. S: Product=USB DATA CARD S: SerialNumber=004402459035402 C:* #Ifs=10 Cfg#= 1 Atr=a0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00 I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=08(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=8a(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=09(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
------------------------------- | If Number | Function | ------------------------------- | 2 | USB AP Log Port | ------------------------------- | 3 | USB AP GNSS Port| ------------------------------- | 4 | USB AP META Port| ------------------------------- | 5 | ADB port | ------------------------------- | 6 | USB MD AT Port | ------------------------------ | 7 | USB MD META Port| ------------------------------- | 8 | USB NTZ Port | ------------------------------- | 9 | USB Debug port | -------------------------------
Signed-off-by: Jack Wu wojackbb@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -2249,6 +2249,8 @@ static const struct usb_device_id option .driver_info = NCTRL(2) }, { USB_DEVICE_AND_INTERFACE_INFO(MEDIATEK_VENDOR_ID, 0x7127, 0xff, 0x00, 0x00), .driver_info = NCTRL(2) | NCTRL(3) | NCTRL(4) }, + { USB_DEVICE_AND_INTERFACE_INFO(MEDIATEK_VENDOR_ID, 0x7129, 0xff, 0x00, 0x00), /* MediaTek T7XX */ + .driver_info = NCTRL(2) | NCTRL(3) | NCTRL(4) }, { USB_DEVICE(CELLIENT_VENDOR_ID, CELLIENT_PRODUCT_MEN200) }, { USB_DEVICE(CELLIENT_VENDOR_ID, CELLIENT_PRODUCT_MPL200), .driver_info = RSVD(1) | RSVD(4) },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniele Palmas dnlplm@gmail.com
commit 8366e64a4454481339e7c56a8ad280161f2e441d upstream.
Add the following Telit FE910C04 compositions:
0x10c0: rmnet + tty (AT/NMEA) + tty (AT) + tty (diag) T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 13 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c0 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x10c4: rmnet + tty (AT) + tty (AT) + tty (diag) T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 14 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c4 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x10c8: rmnet + tty (AT) + tty (diag) + DPL (data packet logging) + adb T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c8 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Daniele Palmas dnlplm@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -1397,6 +1397,12 @@ static const struct usb_device_id option .driver_info = RSVD(0) | NCTRL(2) | RSVD(3) | RSVD(4) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10aa, 0xff), /* Telit FN920C04 (MBIM) */ .driver_info = NCTRL(3) | RSVD(4) | RSVD(5) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10c0, 0xff), /* Telit FE910C04 (rmnet) */ + .driver_info = RSVD(0) | NCTRL(3) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10c4, 0xff), /* Telit FE910C04 (rmnet) */ + .driver_info = RSVD(0) | NCTRL(3) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10c8, 0xff), /* Telit FE910C04 (rmnet) */ + .driver_info = RSVD(0) | NCTRL(2) | RSVD(3) | RSVD(4) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910), .driver_info = NCTRL(0) | RSVD(1) | RSVD(3) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910_DUAL_MODEM),
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
commit 24740385cb0d6d22ab7fa7adf36546d5b3cdcf73 upstream.
When USB-C monitor is connected directly to Intel Barlow Ridge host, it goes into "redrive" mode that basically routes the DisplayPort signals directly from the GPU to the USB-C monitor without any tunneling needed. However, the host router must be powered on for this to work. Aaron reported that there are a couple of cases where this will not work with the current code:
- Booting with USB-C monitor plugged in. - Plugging in USB-C monitor when the host router is in sleep state (runtime suspended). - Plugging in USB-C device while the system is in system sleep state.
In all these cases once the host router is runtime suspended the picture on the connected USB-C display disappears too. This is certainly not what the user expected.
For this reason improve the redrive mode handling to keep the host router from runtime suspending when detect that any of the above cases is happening.
Fixes: a75e0684efe5 ("thunderbolt: Keep the domain powered when USB4 port is in redrive mode") Reported-by: Aaron Rainbolt arainbolt@kfocus.org Closes: https://lore.kernel.org/linux-usb/20241009220118.70bfedd0@kf-ir16/ Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thunderbolt/tb.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
--- a/drivers/thunderbolt/tb.c +++ b/drivers/thunderbolt/tb.c @@ -1930,6 +1930,37 @@ static void tb_exit_redrive(struct tb_po } }
+static void tb_switch_enter_redrive(struct tb_switch *sw) +{ + struct tb_port *port; + + tb_switch_for_each_port(sw, port) + tb_enter_redrive(port); +} + +/* + * Called during system and runtime suspend to forcefully exit redrive + * mode without querying whether the resource is available. + */ +static void tb_switch_exit_redrive(struct tb_switch *sw) +{ + struct tb_port *port; + + if (!(sw->quirks & QUIRK_KEEP_POWER_IN_DP_REDRIVE)) + return; + + tb_switch_for_each_port(sw, port) { + if (!tb_port_is_dpin(port)) + continue; + + if (port->redrive) { + port->redrive = false; + pm_runtime_put(&sw->dev); + tb_port_dbg(port, "exit redrive mode\n"); + } + } +} + static void tb_dp_resource_unavailable(struct tb *tb, struct tb_port *port) { struct tb_port *in, *out; @@ -2690,6 +2721,7 @@ static int tb_start(struct tb *tb, bool tb_create_usb3_tunnels(tb->root_switch); /* Add DP IN resources for the root switch */ tb_add_dp_resources(tb->root_switch); + tb_switch_enter_redrive(tb->root_switch); /* Make the discovered switches available to the userspace */ device_for_each_child(&tb->root_switch->dev, NULL, tb_scan_finalize_switch); @@ -2705,6 +2737,7 @@ static int tb_suspend_noirq(struct tb *t
tb_dbg(tb, "suspending...\n"); tb_disconnect_and_release_dp(tb); + tb_switch_exit_redrive(tb->root_switch); tb_switch_suspend(tb->root_switch, false); tcm->hotplug_active = false; /* signal tb_handle_hotplug to quit */ tb_dbg(tb, "suspend finished\n"); @@ -2797,6 +2830,7 @@ static int tb_resume_noirq(struct tb *tb tb_dbg(tb, "tunnels restarted, sleeping for 100ms\n"); msleep(100); } + tb_switch_enter_redrive(tb->root_switch); /* Allow tb_handle_hotplug to progress events */ tcm->hotplug_active = true; tb_dbg(tb, "resume finished\n"); @@ -2860,6 +2894,12 @@ static int tb_runtime_suspend(struct tb struct tb_cm *tcm = tb_priv(tb);
mutex_lock(&tb->lock); + /* + * The below call only releases DP resources to allow exiting and + * re-entering redrive mode. + */ + tb_disconnect_and_release_dp(tb); + tb_switch_exit_redrive(tb->root_switch); tb_switch_suspend(tb->root_switch, true); tcm->hotplug_active = false; mutex_unlock(&tb->lock); @@ -2891,6 +2931,7 @@ static int tb_runtime_resume(struct tb * tb_restore_children(tb->root_switch); list_for_each_entry_safe(tunnel, n, &tcm->tunnel_list, list) tb_tunnel_restart(tunnel); + tb_switch_enter_redrive(tb->root_switch); tcm->hotplug_active = true; mutex_unlock(&tb->lock);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 9398332f23fab10c5ec57c168b44e72997d6318e upstream.
drm_mode_vrefresh() is trying to avoid divide by zero by checking whether htotal or vtotal are zero. But we may still end up with a div-by-zero of vtotal*htotal*...
Cc: stable@vger.kernel.org Reported-by: syzbot+622bba18029bcde672e1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=622bba18029bcde672e1 Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241129042629.18280-2-ville.s... Reviewed-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_modes.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/drm_modes.c +++ b/drivers/gpu/drm/drm_modes.c @@ -1285,14 +1285,11 @@ EXPORT_SYMBOL(drm_mode_set_name); */ int drm_mode_vrefresh(const struct drm_display_mode *mode) { - unsigned int num, den; + unsigned int num = 1, den = 1;
if (mode->htotal == 0 || mode->vtotal == 0) return 0;
- num = mode->clock; - den = mode->htotal * mode->vtotal; - if (mode->flags & DRM_MODE_FLAG_INTERLACE) num *= 2; if (mode->flags & DRM_MODE_FLAG_DBLSCAN) @@ -1300,6 +1297,12 @@ int drm_mode_vrefresh(const struct drm_d if (mode->vscan > 1) den *= mode->vscan;
+ if (check_mul_overflow(mode->clock, num, &num)) + return 0; + + if (check_mul_overflow(mode->htotal * mode->vtotal, den, &den)) + return 0; + return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(num, 1000), den); } EXPORT_SYMBOL(drm_mode_vrefresh);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit f8fd0968eff52cf092c0d517d17507ea2f6e5ea5 ]
mipi_dsi_device_register_full() never returns NULL pointer, it will return ERR_PTR() when it fails, so replace the check with IS_ERR().
Fixes: 623a3531e9cf ("drm/panel: Add driver for Novatek NT35950 DSI DriverIC panels") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20241029123957.1588-1-yangyingliang@huaweicloud.co... Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20241029123957.1588-1-yangying... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-novatek-nt35950.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/panel/panel-novatek-nt35950.c b/drivers/gpu/drm/panel/panel-novatek-nt35950.c index 4be5013330ec..a7e1c2db5e24 100644 --- a/drivers/gpu/drm/panel/panel-novatek-nt35950.c +++ b/drivers/gpu/drm/panel/panel-novatek-nt35950.c @@ -569,9 +569,9 @@ static int nt35950_probe(struct mipi_dsi_device *dsi) return dev_err_probe(dev, -EPROBE_DEFER, "Cannot get secondary DSI host\n");
nt->dsi[1] = mipi_dsi_device_register_full(dsi_r_host, info); - if (!nt->dsi[1]) { + if (IS_ERR(nt->dsi[1])) { dev_err(dev, "Cannot get secondary DSI node\n"); - return -ENODEV; + return PTR_ERR(nt->dsi[1]); } num_dsis++; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com
[ Upstream commit abcc2ddae5f82aa6cfca162e3db643dd33f0a2e8 ]
On GT reset, we store total busyness counts for all engines and re-register the utilization buffer with GuC. At that time we should reset the buffer, so that we don't get spurious busyness counts on subsequent queries.
To repro this issue, run igt@perf_pmu@busy-hang followed by igt@perf_pmu@most-busy-idle-check-all for a couple iterations.
Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: John Harrison John.C.Harrison@Intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-2-umesh.... (cherry picked from commit abd318237fa6556c1e5225529af145ef15d5ff0d) Signed-off-by: Tvrtko Ursulin tursulin@ursulin.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +++++++++++++++++++ 1 file changed, 21 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 236dfff81fea..44610c739fe7 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -1229,6 +1229,21 @@ static void __get_engine_usage_record(struct intel_engine_cs *engine, } while (++i < 6); }
+static void __set_engine_usage_record(struct intel_engine_cs *engine, + u32 last_in, u32 id, u32 total) +{ + struct iosys_map rec_map = intel_guc_engine_usage_record_map(engine); + +#define record_write(map_, field_, val_) \ + iosys_map_wr_field(map_, 0, struct guc_engine_usage_record, field_, val_) + + record_write(&rec_map, last_switch_in_stamp, last_in); + record_write(&rec_map, current_context_index, id); + record_write(&rec_map, total_runtime, total); + +#undef record_write +} + static void guc_update_engine_gt_clks(struct intel_engine_cs *engine) { struct intel_engine_guc_stats *stats = &engine->stats.guc; @@ -1488,6 +1503,9 @@ static void guc_timestamp_ping(struct work_struct *wrk)
static int guc_action_enable_usage_stats(struct intel_guc *guc) { + struct intel_gt *gt = guc_to_gt(guc); + struct intel_engine_cs *engine; + enum intel_engine_id id; u32 offset = intel_guc_engine_usage_offset(guc); u32 action[] = { INTEL_GUC_ACTION_SET_ENG_UTIL_BUFF, @@ -1495,6 +1513,9 @@ static int guc_action_enable_usage_stats(struct intel_guc *guc) 0, };
+ for_each_engine(engine, gt, id) + __set_engine_usage_record(engine, 0, 0xffffffff, 0); + return intel_guc_send(guc, action, ARRAY_SIZE(action)); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com
[ Upstream commit 59a0b46788d58fdcee8d2f6b4e619d264a1799bf ]
Active busyness of an engine is calculated using gt timestamp and the context switch in time. While capturing the gt timestamp, it's possible that the context switches out. This race could result in an active busyness value that is greater than the actual context runtime value by a small amount. This leads to a negative delta and throws off busyness calculations for the user.
If a subsequent count is smaller than the previous one, just return the previous one, since we expect the busyness to catch up.
Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: John Harrison John.C.Harrison@Intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-3-umesh.... (cherry picked from commit cf907f6d294217985e9dafd9985dce874e04ca37) Signed-off-by: Tvrtko Ursulin tursulin@ursulin.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_engine_types.h | 5 +++++ drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 ++++- 2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index a7e677598004..5a7e464b4fb3 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -343,6 +343,11 @@ struct intel_engine_guc_stats { * @start_gt_clk: GT clock time of last idle to active transition. */ u64 start_gt_clk; + + /** + * @total: The last value of total returned + */ + u64 total; };
union intel_engine_tlb_inv_reg { diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 44610c739fe7..9028160b8448 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -1362,9 +1362,12 @@ static ktime_t guc_engine_busyness(struct intel_engine_cs *engine, ktime_t *now) total += intel_gt_clock_interval_to_ns(gt, clk); }
+ if (total > stats->total) + stats->total = total; + spin_unlock_irqrestore(&guc->timestamp.lock, flags);
- return ns_to_ktime(total); + return ns_to_ktime(stats->total); }
static void guc_enable_busyness_worker(struct intel_guc *guc)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com
[ Upstream commit 1622ed27d26ab4c234476be746aa55bcd39159dd ]
On gt reset, if a context is running, then accumulate it's active time into the busyness counter since there will be no chance for the context to switch out and update it's run time.
v2: Move comment right above the if (John)
Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: John Harrison John.C.Harrison@Intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-4-umesh.... (cherry picked from commit 7ed047da59cfa1acb558b95169d347acc8d85da1) Signed-off-by: Tvrtko Ursulin tursulin@ursulin.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 9028160b8448..ccaf9437cd0d 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -1394,8 +1394,21 @@ static void __reset_guc_busyness_stats(struct intel_guc *guc)
guc_update_pm_timestamp(guc, &unused); for_each_engine(engine, gt, id) { + struct intel_engine_guc_stats *stats = &engine->stats.guc; + guc_update_engine_gt_clks(engine); - engine->stats.guc.prev_total = 0; + + /* + * If resetting a running context, accumulate the active + * time as well since there will be no context switch. + */ + if (stats->running) { + u64 clk = guc->timestamp.gt_stamp - stats->start_gt_clk; + + stats->total_gt_clks += clk; + } + stats->prev_total = 0; + stats->running = 0; }
spin_unlock_irqrestore(&guc->timestamp.lock, flags);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pierre-Eric Pelloux-Prayer pierre-eric.pelloux-prayer@amd.com
[ Upstream commit a93b1020eb9386d7da11608477121b10079c076a ]
Since 2320c9e6a768 ("drm/sched: memset() 'job' in drm_sched_job_init()") accessing job->base.sched can produce unexpected results as the initialisation of (*job)->base.sched done in amdgpu_job_alloc is overwritten by the memset.
This commit fixes an issue when a CS would fail validation and would be rejected after job->num_ibs is incremented. In this case, amdgpu_ib_free(ring->adev, ...) will be called, which would crash the machine because the ring value is bogus.
To fix this, pass a NULL pointer to amdgpu_ib_free(): we can do this because the device is actually not used in this function.
The next commit will remove the ring argument completely.
Fixes: 2320c9e6a768 ("drm/sched: memset() 'job' in drm_sched_job_init()") Signed-off-by: Pierre-Eric Pelloux-Prayer pierre-eric.pelloux-prayer@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 2ae520cb12831d264ceb97c61f72c59d33c0dbd7) Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 99dd86337e84..49a6b6b88843 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -159,7 +159,6 @@ void amdgpu_job_set_resources(struct amdgpu_job *job, struct amdgpu_bo *gds,
void amdgpu_job_free_resources(struct amdgpu_job *job) { - struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched); struct dma_fence *f; unsigned i;
@@ -172,7 +171,7 @@ void amdgpu_job_free_resources(struct amdgpu_job *job) f = NULL;
for (i = 0; i < job->num_ibs; ++i) - amdgpu_ib_free(ring->adev, &job->ibs[i], f); + amdgpu_ib_free(NULL, &job->ibs[i], f); }
static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 5d9ad4e0fa7cc27199fdb94beb6ec5f655ba9489 ]
The driver uses math.h and not util_macros.h.
All the same for the kernel.h, replace it with what the driver is using.
Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20231128180654.395692-2-andriy.shevchenko@linux.in... Signed-off-by: Guenter Roeck linux@roeck-us.net Stable-dep-of: 74d7e038fd07 ("hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp513.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c index 9a180b1030c9..ca665033fe52 100644 --- a/drivers/hwmon/tmp513.c +++ b/drivers/hwmon/tmp513.c @@ -19,15 +19,19 @@ * the Free Software Foundation; version 2 of the License. */
+#include <linux/bitops.h> +#include <linux/bug.h> +#include <linux/device.h> #include <linux/err.h> #include <linux/hwmon.h> #include <linux/i2c.h> #include <linux/init.h> -#include <linux/kernel.h> +#include <linux/math.h> #include <linux/module.h> +#include <linux/property.h> #include <linux/regmap.h> #include <linux/slab.h> -#include <linux/util_macros.h> +#include <linux/types.h>
// Common register definition #define TMP51X_SHUNT_CONFIG 0x00
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit df989762bc4b71068e6adc0c2b5ef6f76b6acf74 ]
Common pattern of handling deferred probe can be simplified with dev_err_probe(). Less code and also it prints the error value.
Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20231128180654.395692-3-andriy.shevchenko@linux.in... [groeck: Fixed excessive line length] Signed-off-by: Guenter Roeck linux@roeck-us.net Stable-dep-of: 74d7e038fd07 ("hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp513.c | 35 +++++++++++++++-------------------- 1 file changed, 15 insertions(+), 20 deletions(-)
diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c index ca665033fe52..67b13ac5512c 100644 --- a/drivers/hwmon/tmp513.c +++ b/drivers/hwmon/tmp513.c @@ -632,9 +632,9 @@ static int tmp51x_vbus_range_to_reg(struct device *dev, } else if (data->vbus_range_uvolt == TMP51X_VBUS_RANGE_16V) { data->shunt_config &= ~TMP51X_BUS_VOLTAGE_MASK; } else { - dev_err(dev, "ti,bus-range-microvolt is invalid: %u\n", - data->vbus_range_uvolt); - return -EINVAL; + return dev_err_probe(dev, -EINVAL, + "ti,bus-range-microvolt is invalid: %u\n", + data->vbus_range_uvolt); } return 0; } @@ -650,8 +650,8 @@ static int tmp51x_pga_gain_to_reg(struct device *dev, struct tmp51x_data *data) } else if (data->pga_gain == 1) { data->shunt_config |= CURRENT_SENSE_VOLTAGE_40_MASK; } else { - dev_err(dev, "ti,pga-gain is invalid: %u\n", data->pga_gain); - return -EINVAL; + return dev_err_probe(dev, -EINVAL, + "ti,pga-gain is invalid: %u\n", data->pga_gain); } return 0; } @@ -684,9 +684,9 @@ static int tmp51x_read_properties(struct device *dev, struct tmp51x_data *data)
// Check if shunt value is compatible with pga-gain if (data->shunt_uohms > data->pga_gain * 40 * 1000 * 1000) { - dev_err(dev, "shunt-resistor: %u too big for pga_gain: %u\n", - data->shunt_uohms, data->pga_gain); - return -EINVAL; + return dev_err_probe(dev, -EINVAL, + "shunt-resistor: %u too big for pga_gain: %u\n", + data->shunt_uohms, data->pga_gain); }
return 0; @@ -727,22 +727,17 @@ static int tmp51x_probe(struct i2c_client *client) data->id = (uintptr_t)i2c_get_match_data(client);
ret = tmp51x_configure(dev, data); - if (ret < 0) { - dev_err(dev, "error configuring the device: %d\n", ret); - return ret; - } + if (ret < 0) + return dev_err_probe(dev, ret, "error configuring the device\n");
data->regmap = devm_regmap_init_i2c(client, &tmp51x_regmap_config); - if (IS_ERR(data->regmap)) { - dev_err(dev, "failed to allocate register map\n"); - return PTR_ERR(data->regmap); - } + if (IS_ERR(data->regmap)) + return dev_err_probe(dev, PTR_ERR(data->regmap), + "failed to allocate register map\n");
ret = tmp51x_init(data); - if (ret < 0) { - dev_err(dev, "error configuring the device: %d\n", ret); - return -ENODEV; - } + if (ret < 0) + return dev_err_probe(dev, ret, "error configuring the device\n");
hwmon_dev = devm_hwmon_device_register_with_info(dev, client->name, data,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit f07f9d2467f4a298d24e186ddee6f69724903067 ]
MILLI and MICRO may be used in the driver to make code more robust against possible miscalculations.
Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20231128180654.395692-4-andriy.shevchenko@linux.in... Signed-off-by: Guenter Roeck linux@roeck-us.net Stable-dep-of: 74d7e038fd07 ("hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp513.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c index 67b13ac5512c..1e50ac680378 100644 --- a/drivers/hwmon/tmp513.c +++ b/drivers/hwmon/tmp513.c @@ -32,6 +32,7 @@ #include <linux/regmap.h> #include <linux/slab.h> #include <linux/types.h> +#include <linux/units.h>
// Common register definition #define TMP51X_SHUNT_CONFIG 0x00 @@ -104,8 +105,8 @@ #define TMP51X_REMOTE_TEMP_LIMIT_2_POS 8 #define TMP513_REMOTE_TEMP_LIMIT_3_POS 7
-#define TMP51X_VBUS_RANGE_32V 32000000 -#define TMP51X_VBUS_RANGE_16V 16000000 +#define TMP51X_VBUS_RANGE_32V (32 * MICRO) +#define TMP51X_VBUS_RANGE_16V (16 * MICRO)
// Max and Min value #define MAX_BUS_VOLTAGE_32_LIMIT 32764 @@ -200,7 +201,7 @@ static int tmp51x_get_value(struct tmp51x_data *data, u8 reg, u8 pos, * on the pga gain setting. 1lsb = 10uV */ *val = sign_extend32(regval, 17 - tmp51x_get_pga_shift(data)); - *val = DIV_ROUND_CLOSEST(*val * 10000, data->shunt_uohms); + *val = DIV_ROUND_CLOSEST(*val * 10 * MILLI, data->shunt_uohms); break; case TMP51X_BUS_VOLTAGE_RESULT: case TMP51X_BUS_VOLTAGE_H_LIMIT: @@ -216,7 +217,7 @@ static int tmp51x_get_value(struct tmp51x_data *data, u8 reg, u8 pos, case TMP51X_BUS_CURRENT_RESULT: // Current = (ShuntVoltage * CalibrationRegister) / 4096 *val = sign_extend32(regval, 16) * data->curr_lsb_ua; - *val = DIV_ROUND_CLOSEST(*val, 1000); + *val = DIV_ROUND_CLOSEST(*val, MILLI); break; case TMP51X_LOCAL_TEMP_RESULT: case TMP51X_REMOTE_TEMP_RESULT_1: @@ -256,7 +257,7 @@ static int tmp51x_set_value(struct tmp51x_data *data, u8 reg, long val) * The user enter current value and we convert it to * voltage. 1lsb = 10uV */ - val = DIV_ROUND_CLOSEST(val * data->shunt_uohms, 10000); + val = DIV_ROUND_CLOSEST(val * data->shunt_uohms, 10 * MILLI); max_val = U16_MAX >> tmp51x_get_pga_shift(data); regval = clamp_val(val, -max_val, max_val); break; @@ -546,18 +547,16 @@ static int tmp51x_calibrate(struct tmp51x_data *data) if (data->shunt_uohms == 0) return regmap_write(data->regmap, TMP51X_SHUNT_CALIBRATION, 0);
- max_curr_ma = DIV_ROUND_CLOSEST_ULL(vshunt_max * 1000 * 1000, - data->shunt_uohms); + max_curr_ma = DIV_ROUND_CLOSEST_ULL(vshunt_max * MICRO, data->shunt_uohms);
/* * Calculate the minimal bit resolution for the current and the power. * Those values will be used during register interpretation. */ - data->curr_lsb_ua = DIV_ROUND_CLOSEST_ULL(max_curr_ma * 1000, 32767); + data->curr_lsb_ua = DIV_ROUND_CLOSEST_ULL(max_curr_ma * MILLI, 32767); data->pwr_lsb_uw = 20 * data->curr_lsb_ua;
- div = DIV_ROUND_CLOSEST_ULL(data->curr_lsb_ua * data->shunt_uohms, - 1000 * 1000); + div = DIV_ROUND_CLOSEST_ULL(data->curr_lsb_ua * data->shunt_uohms, MICRO);
return regmap_write(data->regmap, TMP51X_SHUNT_CALIBRATION, DIV_ROUND_CLOSEST(40960, div)); @@ -683,7 +682,7 @@ static int tmp51x_read_properties(struct device *dev, struct tmp51x_data *data) memcpy(data->nfactor, nfactor, (data->id == tmp513) ? 3 : 2);
// Check if shunt value is compatible with pga-gain - if (data->shunt_uohms > data->pga_gain * 40 * 1000 * 1000) { + if (data->shunt_uohms > data->pga_gain * 40 * MICRO) { return dev_err_probe(dev, -EINVAL, "shunt-resistor: %u too big for pga_gain: %u\n", data->shunt_uohms, data->pga_gain);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Murad Masimov m.masimov@maxima.ru
[ Upstream commit 74d7e038fd072635d21e4734e3223378e09168d3 ]
The values returned by the driver after processing the contents of the Shunt Voltage Register and the Shunt Limit Registers do not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect. Moreover, the PGA shift calculated with the tmp51x_get_pga_shift function is relevant only to the Shunt Voltage Register, but is also applied to the Shunt Limit Registers.
According to the TMP512 and TMP513 datasheets, the Shunt Voltage Register (04h) is 13 to 16 bit two's complement integer value, depending on the PGA setting. The Shunt Positive (0Ch) and Negative (0Dh) Limit Registers are 16-bit two's complement integer values. Below are some examples:
* Shunt Voltage Register If PGA = 8, and regval = 1000 0011 0000 0000, then the decimal value must be -32000, but the value calculated by the driver will be 33536.
* Shunt Limit Register If regval = 1000 0011 0000 0000, then the decimal value must be -32000, but the value calculated by the driver will be 768, if PGA = 1.
Fix sign bit index, and also correct misleading comment describing the tmp51x_get_pga_shift function.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov m.masimov@maxima.ru Link: https://lore.kernel.org/r/20241216173648.526-2-m.masimov@maxima.ru [groeck: Fixed description and multi-line alignments] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp513.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c index 1e50ac680378..549503a56ed0 100644 --- a/drivers/hwmon/tmp513.c +++ b/drivers/hwmon/tmp513.c @@ -178,7 +178,7 @@ struct tmp51x_data { struct regmap *regmap; };
-// Set the shift based on the gain 8=4, 4=3, 2=2, 1=1 +// Set the shift based on the gain: 8 -> 1, 4 -> 2, 2 -> 3, 1 -> 4 static inline u8 tmp51x_get_pga_shift(struct tmp51x_data *data) { return 5 - ffs(data->pga_gain); @@ -200,7 +200,9 @@ static int tmp51x_get_value(struct tmp51x_data *data, u8 reg, u8 pos, * 2's complement number shifted by one to four depending * on the pga gain setting. 1lsb = 10uV */ - *val = sign_extend32(regval, 17 - tmp51x_get_pga_shift(data)); + *val = sign_extend32(regval, + reg == TMP51X_SHUNT_CURRENT_RESULT ? + 16 - tmp51x_get_pga_shift(data) : 15); *val = DIV_ROUND_CLOSEST(*val * 10 * MILLI, data->shunt_uohms); break; case TMP51X_BUS_VOLTAGE_RESULT:
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Murad Masimov m.masimov@maxima.ru
[ Upstream commit da1d0e6ba211baf6747db74c07700caddfd8a179 ]
The value returned by the driver after processing the contents of the Current Register does not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect. Moreover, negative values will be reported as large positive due to missing sign extension from u32 to long.
According to the TMP512 and TMP513 datasheets, the Current Register (07h) is a 16-bit two's complement integer value. E.g., if regval = 1000 0011 0000 0000, then the value must be (-32000 * lsb), but the driver will return (33536 * lsb).
Fix off-by-one bug, and also cast data->curr_lsb_ua (which is of type u32) to long to prevent incorrect cast for negative values.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov m.masimov@maxima.ru Link: https://lore.kernel.org/r/20241216173648.526-3-m.masimov@maxima.ru [groeck: Fixed description line length] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp513.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c index 549503a56ed0..77dffbee8ebe 100644 --- a/drivers/hwmon/tmp513.c +++ b/drivers/hwmon/tmp513.c @@ -218,7 +218,7 @@ static int tmp51x_get_value(struct tmp51x_data *data, u8 reg, u8 pos, break; case TMP51X_BUS_CURRENT_RESULT: // Current = (ShuntVoltage * CalibrationRegister) / 4096 - *val = sign_extend32(regval, 16) * data->curr_lsb_ua; + *val = sign_extend32(regval, 15) * (long)data->curr_lsb_ua; *val = DIV_ROUND_CLOSEST(*val, MILLI); break; case TMP51X_LOCAL_TEMP_RESULT:
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Murad Masimov m.masimov@maxima.ru
[ Upstream commit dd471e25770e7e632f736b90db1e2080b2171668 ]
The values returned by the driver after processing the contents of the Temperature Result and the Temperature Limit Registers do not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect.
According to the TMP512 and TMP513 datasheets, the Temperature Result (08h to 0Bh) and Limit (11h to 14h) Registers are 13-bit two's complement integer values, shifted left by 3 bits. The value is scaled by 0.0625 degrees Celsius per bit. E.g., if regval = 1 1110 0111 0000 000, the output should be -25 degrees, but the driver will return +487 degrees.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov m.masimov@maxima.ru Link: https://lore.kernel.org/r/20241216173648.526-4-m.masimov@maxima.ru [groeck: fixed description line length] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp513.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c index 77dffbee8ebe..070f93226ed6 100644 --- a/drivers/hwmon/tmp513.c +++ b/drivers/hwmon/tmp513.c @@ -230,7 +230,7 @@ static int tmp51x_get_value(struct tmp51x_data *data, u8 reg, u8 pos, case TMP51X_REMOTE_TEMP_LIMIT_2: case TMP513_REMOTE_TEMP_LIMIT_3: // 1lsb = 0.0625 degrees centigrade - *val = sign_extend32(regval, 16) >> TMP51X_TEMP_SHIFT; + *val = sign_extend32(regval, 15) >> TMP51X_TEMP_SHIFT; *val = DIV_ROUND_CLOSEST(*val * 625, 10); break; case TMP51X_N_FACTOR_AND_HYST_1:
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kairui Song kasong@tencent.com
commit be48c412f6ebf38849213c19547bc6d5b692b5e5 upstream.
Patch series "zram: fix backing device setup issue", v2.
This series fixes two bugs of backing device setting:
- ZRAM should reject using a zero sized (or the uninitialized ZRAM device itself) as the backing device. - Fix backing device leaking when removing a uninitialized ZRAM device.
This patch (of 2):
Setting a zero sized block device as backing device is pointless, and one can easily create a recursive loop by setting the uninitialized ZRAM device itself as its own backing device by (zram0 is uninitialized):
echo /dev/zram0 > /sys/block/zram0/backing_dev
It's definitely a wrong config, and the module will pin itself, kernel should refuse doing so in the first place.
By refusing to use zero sized device we avoided misuse cases including this one above.
Link: https://lkml.kernel.org/r/20241209165717.94215-1-ryncsn@gmail.com Link: https://lkml.kernel.org/r/20241209165717.94215-2-ryncsn@gmail.com Fixes: 013bf95a83ec ("zram: add interface to specif backing device") Signed-off-by: Kairui Song kasong@tencent.com Reported-by: Desheng Wu deshengwu@tencent.com Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/zram/zram_drv.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -538,6 +538,12 @@ static ssize_t backing_dev_store(struct }
nr_pages = i_size_read(inode) >> PAGE_SHIFT; + /* Refuse to use zero sized device (also prevents self reference) */ + if (!nr_pages) { + err = -EINVAL; + goto out; + } + bitmap_sz = BITS_TO_LONGS(nr_pages) * sizeof(long); bitmap = kvzalloc(bitmap_sz, GFP_KERNEL); if (!bitmap) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kairui Song kasong@tencent.com
commit 74363ec674cb172d8856de25776c8f3103f05e2f upstream.
Setting backing device is done before ZRAM initialization. If we set the backing device, then remove the ZRAM module without initializing the device, the backing device reference will be leaked and the device will be hold forever.
Fix this by always reset the ZRAM fully on rmmod or reset store.
Link: https://lkml.kernel.org/r/20241209165717.94215-3-ryncsn@gmail.com Fixes: 013bf95a83ec ("zram: add interface to specif backing device") Signed-off-by: Kairui Song kasong@tencent.com Reported-by: Desheng Wu deshengwu@tencent.com Suggested-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/zram/zram_drv.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
--- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -1238,12 +1238,16 @@ static void zram_meta_free(struct zram * size_t num_pages = disksize >> PAGE_SHIFT; size_t index;
+ if (!zram->table) + return; + /* Free all pages that are still in this zram device */ for (index = 0; index < num_pages; index++) zram_free_page(zram, index);
zs_destroy_pool(zram->mem_pool); vfree(zram->table); + zram->table = NULL; }
static bool zram_meta_alloc(struct zram *zram, u64 disksize) @@ -2023,11 +2027,6 @@ static void zram_reset_device(struct zra
zram->limit_pages = 0;
- if (!init_done(zram)) { - up_write(&zram->init_lock); - return; - } - set_capacity_and_notify(zram->disk, 0); part_stat_set_all(zram->disk->part0, 0);
On (24/12/23 16:59), Greg Kroah-Hartman wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Kairui Song kasong@tencent.com
commit 74363ec674cb172d8856de25776c8f3103f05e2f upstream.
Setting backing device is done before ZRAM initialization. If we set the backing device, then remove the ZRAM module without initializing the device, the backing device reference will be leaked and the device will be hold forever.
Fix this by always reset the ZRAM fully on rmmod or reset store.
Link: https://lkml.kernel.org/r/20241209165717.94215-3-ryncsn@gmail.com Fixes: 013bf95a83ec ("zram: add interface to specif backing device") Signed-off-by: Kairui Song kasong@tencent.com Reported-by: Desheng Wu deshengwu@tencent.com Suggested-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Can we please drop this patch?
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox (Oracle) willy@infradead.org
commit a2e740e216f5bf49ccb83b6d490c72a340558a43 upstream.
If the caller of vmap() specifies VM_MAP_PUT_PAGES (currently only the i915 driver), we will decrement nr_vmalloc_pages and MEMCG_VMALLOC in vfree(). These counters are incremented by vmalloc() but not by vmap() so this will cause an underflow. Check the VM_MAP_PUT_PAGES flag before decrementing either counter.
Link: https://lkml.kernel.org/r/20241211202538.168311-1-willy@infradead.org Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap") Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Acked-by: Johannes Weiner hannes@cmpxchg.org Reviewed-by: Shakeel Butt shakeel.butt@linux.dev Reviewed-by: Balbir Singh balbirs@nvidia.com Acked-by: Michal Hocko mhocko@suse.com Cc: Christoph Hellwig hch@lst.de Cc: Muchun Song muchun.song@linux.dev Cc: Roman Gushchin roman.gushchin@linux.dev Cc: "Uladzislau Rezki (Sony)" urezki@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/vmalloc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2851,7 +2851,8 @@ void vfree(const void *addr) struct page *page = vm->pages[i];
BUG_ON(!page); - mod_memcg_page_state(page, MEMCG_VMALLOC, -1); + if (!(vm->flags & VM_MAP_PUT_PAGES)) + mod_memcg_page_state(page, MEMCG_VMALLOC, -1); /* * High-order allocs for huge vmallocs are split, so * can be freed as an array of order-0 allocations @@ -2859,7 +2860,8 @@ void vfree(const void *addr) __free_page(page); cond_resched(); } - atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages); + if (!(vm->flags & VM_MAP_PUT_PAGES)) + atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages); kvfree(vm->pages); kfree(vm); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit dfb92681a19e1d5172420baa242806414b3eff6f upstream.
[BUG] There is a bug report in the mailing list where btrfs_run_delayed_refs() failed to drop the ref count for logical 25870311358464 num_bytes 2113536.
The involved leaf dump looks like this:
item 166 key (25870311358464 168 2113536) itemoff 10091 itemsize 50 extent refs 1 gen 84178 flags 1 ref#0: shared data backref parent 32399126528000 count 0 <<< ref#1: shared data backref parent 31808973717504 count 1
Notice the count number is 0.
[CAUSE] There is no concrete evidence yet, but considering 0 -> 1 is also a single bit flipped, it's possible that hardware memory bitflip is involved, causing the on-disk extent tree to be corrupted.
[FIX] To prevent us reading such corrupted extent item, or writing such damaged extent item back to disk, enhance the handling of BTRFS_EXTENT_DATA_REF_KEY and BTRFS_SHARED_DATA_REF_KEY keys for both inlined and key items, to detect such 0 ref count and reject them.
CC: stable@vger.kernel.org # 5.4+ Link: https://lore.kernel.org/linux-btrfs/7c69dd49-c346-4806-86e7-e6f863a66f48@app... Reported-by: Frankie Fisher frankie@terrorise.me.uk Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-checker.c | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-)
--- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -1503,6 +1503,11 @@ static int check_extent_item(struct exte dref_offset, fs_info->sectorsize); return -EUCLEAN; } + if (unlikely(btrfs_extent_data_ref_count(leaf, dref) == 0)) { + extent_err(leaf, slot, + "invalid data ref count, should have non-zero value"); + return -EUCLEAN; + } inline_refs += btrfs_extent_data_ref_count(leaf, dref); break; /* Contains parent bytenr and ref count */ @@ -1515,6 +1520,11 @@ static int check_extent_item(struct exte inline_offset, fs_info->sectorsize); return -EUCLEAN; } + if (unlikely(btrfs_shared_data_ref_count(leaf, sref) == 0)) { + extent_err(leaf, slot, + "invalid shared data ref count, should have non-zero value"); + return -EUCLEAN; + } inline_refs += btrfs_shared_data_ref_count(leaf, sref); break; default: @@ -1584,8 +1594,18 @@ static int check_simple_keyed_refs(struc { u32 expect_item_size = 0;
- if (key->type == BTRFS_SHARED_DATA_REF_KEY) + if (key->type == BTRFS_SHARED_DATA_REF_KEY) { + struct btrfs_shared_data_ref *sref; + + sref = btrfs_item_ptr(leaf, slot, struct btrfs_shared_data_ref); + if (unlikely(btrfs_shared_data_ref_count(leaf, sref) == 0)) { + extent_err(leaf, slot, + "invalid shared data backref count, should have non-zero value"); + return -EUCLEAN; + } + expect_item_size = sizeof(struct btrfs_shared_data_ref); + }
if (unlikely(btrfs_item_size(leaf, slot) != expect_item_size)) { generic_err(leaf, slot, @@ -1662,6 +1682,11 @@ static int check_extent_data_ref(struct offset, leaf->fs_info->sectorsize); return -EUCLEAN; } + if (unlikely(btrfs_extent_data_ref_count(leaf, dref) == 0)) { + extent_err(leaf, slot, + "invalid extent data backref count, should have non-zero value"); + return -EUCLEAN; + } } return 0; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Kelley mhklinux@outlook.com
commit 07a756a49f4b4290b49ea46e089cbe6f79ff8d26 upstream.
If the KVP (or VSS) daemon starts before the VMBus channel's ringbuffer is fully initialized, we can hit the panic below:
hv_utils: Registering HyperV Utility Driver hv_vmbus: registering driver hv_utils ... BUG: kernel NULL pointer dereference, address: 0000000000000000 CPU: 44 UID: 0 PID: 2552 Comm: hv_kvp_daemon Tainted: G E 6.11.0-rc3+ #1 RIP: 0010:hv_pkt_iter_first+0x12/0xd0 Call Trace: ... vmbus_recvpacket hv_kvp_onchannelcallback vmbus_on_event tasklet_action_common tasklet_action handle_softirqs irq_exit_rcu sysvec_hyperv_stimer0 </IRQ> <TASK> asm_sysvec_hyperv_stimer0 ... kvp_register_done hvt_op_read vfs_read ksys_read __x64_sys_read
This can happen because the KVP/VSS channel callback can be invoked even before the channel is fully opened: 1) as soon as hv_kvp_init() -> hvutil_transport_init() creates /dev/vmbus/hv_kvp, the kvp daemon can open the device file immediately and register itself to the driver by writing a message KVP_OP_REGISTER1 to the file (which is handled by kvp_on_msg() ->kvp_handle_handshake()) and reading the file for the driver's response, which is handled by hvt_op_read(), which calls hvt->on_read(), i.e. kvp_register_done().
2) the problem with kvp_register_done() is that it can cause the channel callback to be called even before the channel is fully opened, and when the channel callback is starting to run, util_probe()-> vmbus_open() may have not initialized the ringbuffer yet, so the callback can hit the panic of NULL pointer dereference.
To reproduce the panic consistently, we can add a "ssleep(10)" for KVP in __vmbus_open(), just before the first hv_ringbuffer_init(), and then we unload and reload the driver hv_utils, and run the daemon manually within the 10 seconds.
Fix the panic by reordering the steps in util_probe() so the char dev entry used by the KVP or VSS daemon is not created until after vmbus_open() has completed. This reordering prevents the race condition from happening.
Reported-by: Dexuan Cui decui@microsoft.com Fixes: e0fa3e5e7df6 ("Drivers: hv: utils: fix a race on userspace daemons registration") Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley mhklinux@outlook.com Acked-by: Wei Liu wei.liu@kernel.org Link: https://lore.kernel.org/r/20241106154247.2271-3-mhklinux@outlook.com Signed-off-by: Wei Liu wei.liu@kernel.org Message-ID: 20241106154247.2271-3-mhklinux@outlook.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hv/hv_kvp.c | 6 ++++++ drivers/hv/hv_snapshot.c | 6 ++++++ drivers/hv/hv_util.c | 9 +++++++++ drivers/hv/hyperv_vmbus.h | 2 ++ include/linux/hyperv.h | 1 + 5 files changed, 24 insertions(+)
--- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -767,6 +767,12 @@ hv_kvp_init(struct hv_util_service *srv) */ kvp_transaction.state = HVUTIL_DEVICE_INIT;
+ return 0; +} + +int +hv_kvp_init_transport(void) +{ hvt = hvutil_transport_init(kvp_devname, CN_KVP_IDX, CN_KVP_VAL, kvp_on_msg, kvp_on_reset); if (!hvt) --- a/drivers/hv/hv_snapshot.c +++ b/drivers/hv/hv_snapshot.c @@ -388,6 +388,12 @@ hv_vss_init(struct hv_util_service *srv) */ vss_transaction.state = HVUTIL_DEVICE_INIT;
+ return 0; +} + +int +hv_vss_init_transport(void) +{ hvt = hvutil_transport_init(vss_devname, CN_VSS_IDX, CN_VSS_VAL, vss_on_msg, vss_on_reset); if (!hvt) { --- a/drivers/hv/hv_util.c +++ b/drivers/hv/hv_util.c @@ -141,6 +141,7 @@ static struct hv_util_service util_heart static struct hv_util_service util_kvp = { .util_cb = hv_kvp_onchannelcallback, .util_init = hv_kvp_init, + .util_init_transport = hv_kvp_init_transport, .util_pre_suspend = hv_kvp_pre_suspend, .util_pre_resume = hv_kvp_pre_resume, .util_deinit = hv_kvp_deinit, @@ -149,6 +150,7 @@ static struct hv_util_service util_kvp = static struct hv_util_service util_vss = { .util_cb = hv_vss_onchannelcallback, .util_init = hv_vss_init, + .util_init_transport = hv_vss_init_transport, .util_pre_suspend = hv_vss_pre_suspend, .util_pre_resume = hv_vss_pre_resume, .util_deinit = hv_vss_deinit, @@ -592,6 +594,13 @@ static int util_probe(struct hv_device * if (ret) goto error;
+ if (srv->util_init_transport) { + ret = srv->util_init_transport(); + if (ret) { + vmbus_close(dev->channel); + goto error; + } + } return 0;
error: --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -370,12 +370,14 @@ void vmbus_on_event(unsigned long data); void vmbus_on_msg_dpc(unsigned long data);
int hv_kvp_init(struct hv_util_service *srv); +int hv_kvp_init_transport(void); void hv_kvp_deinit(void); int hv_kvp_pre_suspend(void); int hv_kvp_pre_resume(void); void hv_kvp_onchannelcallback(void *context);
int hv_vss_init(struct hv_util_service *srv); +int hv_vss_init_transport(void); void hv_vss_deinit(void); int hv_vss_pre_suspend(void); int hv_vss_pre_resume(void); --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1561,6 +1561,7 @@ struct hv_util_service { void *channel; void (*util_cb)(void *); int (*util_init)(struct hv_util_service *); + int (*util_init_transport)(void); void (*util_deinit)(void); int (*util_pre_suspend)(void); int (*util_pre_resume)(void);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 9b42d1e8e4fe9dc631162c04caa69b0d1860b0f0 upstream.
Use is_64_bit_hypercall() instead of is_64_bit_mode() to detect a 64-bit hypercall when completing said hypercall. For guests with protected state, e.g. SEV-ES and SEV-SNP, KVM must assume the hypercall was made in 64-bit mode as the vCPU state needed to detect 64-bit mode is unavailable.
Hacking the sev_smoke_test selftest to generate a KVM_HC_MAP_GPA_RANGE hypercall via VMGEXIT trips the WARN:
------------[ cut here ]------------ WARNING: CPU: 273 PID: 326626 at arch/x86/kvm/x86.h:180 complete_hypercall_exit+0x44/0xe0 [kvm] Modules linked in: kvm_amd kvm ... [last unloaded: kvm] CPU: 273 UID: 0 PID: 326626 Comm: sev_smoke_test Not tainted 6.12.0-smp--392e932fa0f3-feat #470 Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024 RIP: 0010:complete_hypercall_exit+0x44/0xe0 [kvm] Call Trace: <TASK> kvm_arch_vcpu_ioctl_run+0x2400/0x2720 [kvm] kvm_vcpu_ioctl+0x54f/0x630 [kvm] __se_sys_ioctl+0x6b/0xc0 do_syscall_64+0x83/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK> ---[ end trace 0000000000000000 ]---
Fixes: b5aead0064f3 ("KVM: x86: Assume a 64-bit hypercall for guests with protected state") Cc: stable@vger.kernel.org Cc: Tom Lendacky thomas.lendacky@amd.com Reviewed-by: Xiaoyao Li xiaoyao.li@intel.com Reviewed-by: Nikunj A Dadhania nikunj@amd.com Reviewed-by: Tom Lendacky thomas.lendacky@amd.com Reviewed-by: Binbin Wu binbin.wu@linux.intel.com Reviewed-by: Kai Huang kai.huang@intel.com Link: https://lore.kernel.org/r/20241128004344.4072099-2-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/x86.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9825,7 +9825,7 @@ static int complete_hypercall_exit(struc { u64 ret = vcpu->run->hypercall.ret;
- if (!is_64_bit_mode(vcpu)) + if (!is_64_bit_hypercall(vcpu)) ret = (u32)ret; kvm_rax_write(vcpu, ret); ++vcpu->stat.hypercalls;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Enzo Matsumiya ematsumiya@suse.de
commit e9f2517a3e18a54a3943c098d2226b245d488801 upstream.
Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") fixed a netns UAF by manually enabled socket refcounting (sk->sk_net_refcnt=1 and sock_inuse_add(net, 1)).
The reason the patch worked for that bug was because we now hold references to the netns (get_net_track() gets a ref internally) and they're properly released (internally, on __sk_destruct()), but only because sk->sk_net_refcnt was set.
Problem: (this happens regardless of CONFIG_NET_NS_REFCNT_TRACKER and regardless if init_net or other)
Setting sk->sk_net_refcnt=1 *manually* and *after* socket creation is not only out of cifs scope, but also technically wrong -- it's set conditionally based on user (=1) vs kernel (=0) sockets. And net/ implementations seem to base their user vs kernel space operations on it.
e.g. upon TCP socket close, the TCP timers are not cleared because sk->sk_net_refcnt=1: (cf. commit 151c9c724d05 ("tcp: properly terminate timers for kernel sockets"))
net/ipv4/tcp.c: void tcp_close(struct sock *sk, long timeout) { lock_sock(sk); __tcp_close(sk, timeout); release_sock(sk); if (!sk->sk_net_refcnt) inet_csk_clear_xmit_timers_sync(sk); sock_put(sk); }
Which will throw a lockdep warning and then, as expected, deadlock on tcp_write_timer().
A way to reproduce this is by running the reproducer from ef7134c7fc48 and then 'rmmod cifs'. A few seconds later, the deadlock/lockdep warning shows up.
Fix: We shouldn't mess with socket internals ourselves, so do not set sk_net_refcnt manually.
Also change __sock_create() to sock_create_kern() for explicitness.
As for non-init_net network namespaces, we deal with it the best way we can -- hold an extra netns reference for server->ssocket and drop it when it's released. This ensures that the netns still exists whenever we need to create/destroy server->ssocket, but is not directly tied to it.
Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") Cc: stable@vger.kernel.org Signed-off-by: Enzo Matsumiya ematsumiya@suse.de Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/connect.c | 36 ++++++++++++++++++++++++++---------- 1 file changed, 26 insertions(+), 10 deletions(-)
--- a/fs/smb/client/connect.c +++ b/fs/smb/client/connect.c @@ -1003,9 +1003,13 @@ clean_demultiplex_info(struct TCP_Server msleep(125); if (cifs_rdma_enabled(server)) smbd_destroy(server); + if (server->ssocket) { sock_release(server->ssocket); server->ssocket = NULL; + + /* Release netns reference for the socket. */ + put_net(cifs_net_ns(server)); }
if (!list_empty(&server->pending_mid_q)) { @@ -1054,6 +1058,7 @@ clean_demultiplex_info(struct TCP_Server */ }
+ /* Release netns reference for this server. */ put_net(cifs_net_ns(server)); kfree(server->leaf_fullpath); kfree(server); @@ -1726,6 +1731,8 @@ cifs_get_tcp_session(struct smb3_fs_cont
tcp_ses->ops = ctx->ops; tcp_ses->vals = ctx->vals; + + /* Grab netns reference for this server. */ cifs_set_net_ns(tcp_ses, get_net(current->nsproxy->net_ns));
tcp_ses->conn_id = atomic_inc_return(&tcpSesNextId); @@ -1857,6 +1864,7 @@ smbd_connected: out_err_crypto_release: cifs_crypto_secmech_release(tcp_ses);
+ /* Release netns reference for this server. */ put_net(cifs_net_ns(tcp_ses));
out_err: @@ -1865,8 +1873,10 @@ out_err: cifs_put_tcp_session(tcp_ses->primary_server, false); kfree(tcp_ses->hostname); kfree(tcp_ses->leaf_fullpath); - if (tcp_ses->ssocket) + if (tcp_ses->ssocket) { sock_release(tcp_ses->ssocket); + put_net(cifs_net_ns(tcp_ses)); + } kfree(tcp_ses); } return ERR_PTR(rc); @@ -3120,20 +3130,20 @@ generic_ip_connect(struct TCP_Server_Inf socket = server->ssocket; } else { struct net *net = cifs_net_ns(server); - struct sock *sk;
- rc = __sock_create(net, sfamily, SOCK_STREAM, - IPPROTO_TCP, &server->ssocket, 1); + rc = sock_create_kern(net, sfamily, SOCK_STREAM, IPPROTO_TCP, &server->ssocket); if (rc < 0) { cifs_server_dbg(VFS, "Error %d creating socket\n", rc); return rc; }
- sk = server->ssocket->sk; - __netns_tracker_free(net, &sk->ns_tracker, false); - sk->sk_net_refcnt = 1; - get_net_track(net, &sk->ns_tracker, GFP_KERNEL); - sock_inuse_add(net, 1); + /* + * Grab netns reference for the socket. + * + * It'll be released here, on error, or in clean_demultiplex_info() upon server + * teardown. + */ + get_net(net);
/* BB other socket options to set KEEPALIVE, NODELAY? */ cifs_dbg(FYI, "Socket created\n"); @@ -3147,8 +3157,10 @@ generic_ip_connect(struct TCP_Server_Inf }
rc = bind_socket(server); - if (rc < 0) + if (rc < 0) { + put_net(cifs_net_ns(server)); return rc; + }
/* * Eventually check for other socket options to change from @@ -3185,6 +3197,7 @@ generic_ip_connect(struct TCP_Server_Inf if (rc < 0) { cifs_dbg(FYI, "Error %d connecting to server\n", rc); trace_smb3_connect_err(server->hostname, server->conn_id, &server->dstaddr, rc); + put_net(cifs_net_ns(server)); sock_release(socket); server->ssocket = NULL; return rc; @@ -3193,6 +3206,9 @@ generic_ip_connect(struct TCP_Server_Inf if (sport == htons(RFC1001_PORT)) rc = ip_rfc1001_connect(server);
+ if (rc < 0) + put_net(cifs_net_ns(server)); + return rc; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit a6629626c584200daf495cc9a740048b455addcd upstream.
The test_event_printk() analyzes print formats of trace events looking for cases where it may dereference a pointer that is not in the ring buffer which can possibly be a bug when the trace event is read from the ring buffer and the content of that pointer no longer exists.
The function needs to accurately go from one print format argument to the next. It handles quotes and parenthesis that may be included in an argument. When it finds the start of the next argument, it uses a simple "c = strstr(fmt + i, ',')" to find the end of that argument!
In order to include "%s" dereferencing, it needs to process the entire content of the print format argument and not just the content of the first ',' it finds. As there may be content like:
({ const char *saved_ptr = trace_seq_buffer_ptr(p); static const char *access_str[] = { "---", "--x", "w--", "w-x", "-u-", "-ux", "wu-", "wux" }; union kvm_mmu_page_role role; role.word = REC->role; trace_seq_printf(p, "sp gen %u gfn %llx l%u %u-byte q%u%s %s%s" " %snxe %sad root %u %s%c", REC->mmu_valid_gen, REC->gfn, role.level, role.has_4_byte_gpte ? 4 : 8, role.quadrant, role.direct ? " direct" : "", access_str[role.access], role.invalid ? " invalid" : "", role.efer_nx ? "" : "!", role.ad_disabled ? "!" : "", REC->root_count, REC->unsync ? "unsync" : "sync", 0); saved_ptr; })
Which is an example of a full argument of an existing event. As the code already handles finding the next print format argument, process the argument at the end of it and not the start of it. This way it has both the start of the argument as well as the end of it.
Add a helper function "process_pointer()" that will do the processing during the loop as well as at the end. It also makes the code cleaner and easier to read.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Al Viro viro@ZenIV.linux.org.uk Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/20241217024720.362271189@goodmis.org Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 82 ++++++++++++++++++++++++++++---------------- 1 file changed, 53 insertions(+), 29 deletions(-)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -265,8 +265,7 @@ static bool test_field(const char *fmt, len = p - fmt;
for (; field->type; field++) { - if (strncmp(field->name, fmt, len) || - field->name[len]) + if (strncmp(field->name, fmt, len) || field->name[len]) continue; array_descriptor = strchr(field->type, '['); /* This is an array and is OK to dereference. */ @@ -275,6 +274,32 @@ static bool test_field(const char *fmt, return false; }
+/* Return true if the argument pointer is safe */ +static bool process_pointer(const char *fmt, int len, struct trace_event_call *call) +{ + const char *r, *e, *a; + + e = fmt + len; + + /* Find the REC-> in the argument */ + r = strstr(fmt, "REC->"); + if (r && r < e) { + /* + * Addresses of events on the buffer, or an array on the buffer is + * OK to dereference. There's ways to fool this, but + * this is to catch common mistakes, not malicious code. + */ + a = strchr(fmt, '&'); + if ((a && (a < r)) || test_field(r, call)) + return true; + } else if ((r = strstr(fmt, "__get_dynamic_array(")) && r < e) { + return true; + } else if ((r = strstr(fmt, "__get_sockaddr(")) && r < e) { + return true; + } + return false; +} + /* * Examine the print fmt of the event looking for unsafe dereference * pointers using %p* that could be recorded in the trace event and @@ -285,12 +310,12 @@ static void test_event_printk(struct tra { u64 dereference_flags = 0; bool first = true; - const char *fmt, *c, *r, *a; + const char *fmt; int parens = 0; char in_quote = 0; int start_arg = 0; int arg = 0; - int i; + int i, e;
fmt = call->print_fmt;
@@ -403,42 +428,41 @@ static void test_event_printk(struct tra case ',': if (in_quote || parens) continue; + e = i; i++; while (isspace(fmt[i])) i++; - start_arg = i; - if (!(dereference_flags & (1ULL << arg))) - goto next_arg;
- /* Find the REC-> in the argument */ - c = strchr(fmt + i, ','); - r = strstr(fmt + i, "REC->"); - if (r && (!c || r < c)) { - /* - * Addresses of events on the buffer, - * or an array on the buffer is - * OK to dereference. - * There's ways to fool this, but - * this is to catch common mistakes, - * not malicious code. - */ - a = strchr(fmt + i, '&'); - if ((a && (a < r)) || test_field(r, call)) + /* + * If start_arg is zero, then this is the start of the + * first argument. The processing of the argument happens + * when the end of the argument is found, as it needs to + * handle paranthesis and such. + */ + if (!start_arg) { + start_arg = i; + /* Balance out the i++ in the for loop */ + i--; + continue; + } + + if (dereference_flags & (1ULL << arg)) { + if (process_pointer(fmt + start_arg, e - start_arg, call)) dereference_flags &= ~(1ULL << arg); - } else if ((r = strstr(fmt + i, "__get_dynamic_array(")) && - (!c || r < c)) { - dereference_flags &= ~(1ULL << arg); - } else if ((r = strstr(fmt + i, "__get_sockaddr(")) && - (!c || r < c)) { - dereference_flags &= ~(1ULL << arg); }
- next_arg: - i--; + start_arg = i; arg++; + /* Balance out the i++ in the for loop */ + i--; } }
+ if (dereference_flags & (1ULL << arg)) { + if (process_pointer(fmt + start_arg, i - start_arg, call)) + dereference_flags &= ~(1ULL << arg); + } + /* * If you triggered the below warning, the trace event reported * uses an unsafe dereference pointer %p*. As the data stored
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit 917110481f6bc1c96b1e54b62bb114137fbc6d17 upstream.
The process_pointer() helper function looks to see if various trace event macros are used. These macros are for storing data in the event. This makes it safe to dereference as the dereference will then point into the event on the ring buffer where the content of the data stays with the event itself.
A few helper functions were missing. Those were:
__get_rel_dynamic_array() __get_dynamic_array_len() __get_rel_dynamic_array_len() __get_rel_sockaddr()
Also add a helper function find_print_string() to not need to use a middle man variable to test if the string exists.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Al Viro viro@ZenIV.linux.org.uk Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/20241217024720.521836792@goodmis.org Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -274,6 +274,15 @@ static bool test_field(const char *fmt, return false; }
+/* Look for a string within an argument */ +static bool find_print_string(const char *arg, const char *str, const char *end) +{ + const char *r; + + r = strstr(arg, str); + return r && r < end; +} + /* Return true if the argument pointer is safe */ static bool process_pointer(const char *fmt, int len, struct trace_event_call *call) { @@ -292,9 +301,17 @@ static bool process_pointer(const char * a = strchr(fmt, '&'); if ((a && (a < r)) || test_field(r, call)) return true; - } else if ((r = strstr(fmt, "__get_dynamic_array(")) && r < e) { + } else if (find_print_string(fmt, "__get_dynamic_array(", e)) { + return true; + } else if (find_print_string(fmt, "__get_rel_dynamic_array(", e)) { + return true; + } else if (find_print_string(fmt, "__get_dynamic_array_len(", e)) { + return true; + } else if (find_print_string(fmt, "__get_rel_dynamic_array_len(", e)) { + return true; + } else if (find_print_string(fmt, "__get_sockaddr(", e)) { return true; - } else if ((r = strstr(fmt, "__get_sockaddr(")) && r < e) { + } else if (find_print_string(fmt, "__get_rel_sockaddr(", e)) { return true; } return false;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit 65a25d9f7ac02e0cf361356e834d1c71d36acca9 upstream.
The test_event_printk() code makes sure that when a trace event is registered, any dereferenced pointers in from the event's TP_printk() are pointing to content in the ring buffer. But currently it does not handle "%s", as there's cases where the string pointer saved in the ring buffer points to a static string in the kernel that will never be freed. As that is a valid case, the pointer needs to be checked at runtime.
Currently the runtime check is done via trace_check_vprintf(), but to not have to replicate everything in vsnprintf() it does some logic with the va_list that may not be reliable across architectures. In order to get rid of that logic, more work in the test_event_printk() needs to be done. Some of the strings can be validated at this time when it is obvious the string is valid because the string will be saved in the ring buffer content.
Do all the validation of strings in the ring buffer at boot in test_event_printk(), and make sure that the field of the strings that point into the kernel are accessible. This will allow adding checks at runtime that will validate the fields themselves and not rely on paring the TP_printk() format at runtime.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Al Viro viro@ZenIV.linux.org.uk Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/20241217024720.685917008@goodmis.org Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 104 +++++++++++++++++++++++++++++++++++++------- 1 file changed, 89 insertions(+), 15 deletions(-)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -244,19 +244,16 @@ int trace_event_get_offsets(struct trace return tail->offset + tail->size; }
-/* - * Check if the referenced field is an array and return true, - * as arrays are OK to dereference. - */ -static bool test_field(const char *fmt, struct trace_event_call *call) + +static struct trace_event_fields *find_event_field(const char *fmt, + struct trace_event_call *call) { struct trace_event_fields *field = call->class->fields_array; - const char *array_descriptor; const char *p = fmt; int len;
if (!(len = str_has_prefix(fmt, "REC->"))) - return false; + return NULL; fmt += len; for (p = fmt; *p; p++) { if (!isalnum(*p) && *p != '_') @@ -267,11 +264,26 @@ static bool test_field(const char *fmt, for (; field->type; field++) { if (strncmp(field->name, fmt, len) || field->name[len]) continue; - array_descriptor = strchr(field->type, '['); - /* This is an array and is OK to dereference. */ - return array_descriptor != NULL; + + return field; } - return false; + return NULL; +} + +/* + * Check if the referenced field is an array and return true, + * as arrays are OK to dereference. + */ +static bool test_field(const char *fmt, struct trace_event_call *call) +{ + struct trace_event_fields *field; + + field = find_event_field(fmt, call); + if (!field) + return false; + + /* This is an array and is OK to dereference. */ + return strchr(field->type, '[') != NULL; }
/* Look for a string within an argument */ @@ -317,6 +329,53 @@ static bool process_pointer(const char * return false; }
+/* Return true if the string is safe */ +static bool process_string(const char *fmt, int len, struct trace_event_call *call) +{ + const char *r, *e, *s; + + e = fmt + len; + + /* + * There are several helper functions that return strings. + * If the argument contains a function, then assume its field is valid. + * It is considered that the argument has a function if it has: + * alphanumeric or '_' before a parenthesis. + */ + s = fmt; + do { + r = strstr(s, "("); + if (!r || r >= e) + break; + for (int i = 1; r - i >= s; i++) { + char ch = *(r - i); + if (isspace(ch)) + continue; + if (isalnum(ch) || ch == '_') + return true; + /* Anything else, this isn't a function */ + break; + } + /* A function could be wrapped in parethesis, try the next one */ + s = r + 1; + } while (s < e); + + /* + * If there's any strings in the argument consider this arg OK as it + * could be: REC->field ? "foo" : "bar" and we don't want to get into + * verifying that logic here. + */ + if (find_print_string(fmt, """, e)) + return true; + + /* Dereferenced strings are also valid like any other pointer */ + if (process_pointer(fmt, len, call)) + return true; + + /* Make sure the field is found, and consider it OK for now if it is */ + return find_event_field(fmt, call) != NULL; +} + /* * Examine the print fmt of the event looking for unsafe dereference * pointers using %p* that could be recorded in the trace event and @@ -326,6 +385,7 @@ static bool process_pointer(const char * static void test_event_printk(struct trace_event_call *call) { u64 dereference_flags = 0; + u64 string_flags = 0; bool first = true; const char *fmt; int parens = 0; @@ -416,8 +476,16 @@ static void test_event_printk(struct tra star = true; continue; } - if ((fmt[i + j] == 's') && star) - arg++; + if ((fmt[i + j] == 's')) { + if (star) + arg++; + if (WARN_ONCE(arg == 63, + "Too many args for event: %s", + trace_event_name(call))) + return; + dereference_flags |= 1ULL << arg; + string_flags |= 1ULL << arg; + } break; } break; @@ -464,7 +532,10 @@ static void test_event_printk(struct tra }
if (dereference_flags & (1ULL << arg)) { - if (process_pointer(fmt + start_arg, e - start_arg, call)) + if (string_flags & (1ULL << arg)) { + if (process_string(fmt + start_arg, e - start_arg, call)) + dereference_flags &= ~(1ULL << arg); + } else if (process_pointer(fmt + start_arg, e - start_arg, call)) dereference_flags &= ~(1ULL << arg); }
@@ -476,7 +547,10 @@ static void test_event_printk(struct tra }
if (dereference_flags & (1ULL << arg)) { - if (process_pointer(fmt + start_arg, i - start_arg, call)) + if (string_flags & (1ULL << arg)) { + if (process_string(fmt + start_arg, i - start_arg, call)) + dereference_flags &= ~(1ULL << arg); + } else if (process_pointer(fmt + start_arg, i - start_arg, call)) dereference_flags &= ~(1ULL << arg); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Isaac J. Manjarres isaacmanjarres@google.com
commit 6a75f19af16ff482cfd6085c77123aa0f464f8dd upstream.
The sysctl tests for vm.memfd_noexec rely on the kernel to support PID namespaces (i.e. the kernel is built with CONFIG_PID_NS=y). If the kernel the test runs on does not support PID namespaces, the first sysctl test will fail when attempting to spawn a new thread in a new PID namespace, abort the test, preventing the remaining tests from being run.
This is not desirable, as not all kernels need PID namespaces, but can still use the other features provided by memfd. Therefore, only run the sysctl tests if the kernel supports PID namespaces. Otherwise, skip those tests and emit an informative message to let the user know why the sysctl tests are not being run.
Link: https://lkml.kernel.org/r/20241205192943.3228757-1-isaacmanjarres@google.com Fixes: 11f75a01448f ("selftests/memfd: add tests for MFD_NOEXEC_SEAL MFD_EXEC") Signed-off-by: Isaac J. Manjarres isaacmanjarres@google.com Reviewed-by: Jeff Xu jeffxu@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Kalesh Singh kaleshsingh@google.com Cc: stable@vger.kernel.org [6.6+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/memfd/memfd_test.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/memfd/memfd_test.c +++ b/tools/testing/selftests/memfd/memfd_test.c @@ -9,6 +9,7 @@ #include <fcntl.h> #include <linux/memfd.h> #include <sched.h> +#include <stdbool.h> #include <stdio.h> #include <stdlib.h> #include <signal.h> @@ -1567,6 +1568,11 @@ static void test_share_fork(char *banner close(fd); }
+static bool pid_ns_supported(void) +{ + return access("/proc/self/ns/pid", F_OK) == 0; +} + int main(int argc, char **argv) { pid_t pid; @@ -1601,8 +1607,12 @@ int main(int argc, char **argv) test_seal_grow(); test_seal_resize();
- test_sysctl_simple(); - test_sysctl_nested(); + if (pid_ns_supported()) { + test_sysctl_simple(); + test_sysctl_nested(); + } else { + printf("PID namespaces are not supported; skipping sysctl tests\n"); + }
test_share_dup("SHARE-DUP", ""); test_share_mmap("SHARE-MMAP", "");
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tiezhu Yang yangtiezhu@loongson.cn
commit 29d44cce324dab2bd86c447071a596262e7109b6 upstream.
Currently, LoongArch LLVM does not support the constraint "o" and no plan to support it, it only supports the similar constraint "m", so change the constraints from "nor" in the "else" case to arch-specific "nmr" to avoid the build error such as "unexpected asm memory constraint" for LoongArch.
Fixes: 630301b0d59d ("selftests/bpf: Add basic USDT selftests") Suggested-by: Weining Lu luweining@loongson.cn Suggested-by: Li Chen chenli@loongson.cn Signed-off-by: Tiezhu Yang yangtiezhu@loongson.cn Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Huacai Chen chenhuacai@loongson.cn Cc: stable@vger.kernel.org Link: https://llvm.org/docs/LangRef.html#supported-constraint-code-list Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/LoongArch/Loo... Link: https://lore.kernel.org/bpf/20241219111506.20643-1-yangtiezhu@loongson.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/bpf/sdt.h | 2 ++ 1 file changed, 2 insertions(+)
--- a/tools/testing/selftests/bpf/sdt.h +++ b/tools/testing/selftests/bpf/sdt.h @@ -102,6 +102,8 @@ # define STAP_SDT_ARG_CONSTRAINT nZr # elif defined __arm__ # define STAP_SDT_ARG_CONSTRAINT g +# elif defined __loongarch__ +# define STAP_SDT_ARG_CONSTRAINT nmr # else # define STAP_SDT_ARG_CONSTRAINT nor # endif
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit 12d908116f7efd34f255a482b9afc729d7a5fb78 upstream.
Currently, io_uring_unreg_ringfd() (which cleans up registered rings) is only called on exit, but __io_uring_free (which frees the tctx in which the registered ring pointers are stored) is also called on execve (via begin_new_exec -> io_uring_task_cancel -> __io_uring_cancel -> io_uring_cancel_generic -> __io_uring_free).
This means: A process going through execve while having registered rings will leak references to the rings' `struct file`.
Fix it by zapping registered rings on execve(). This is implemented by moving the io_uring_unreg_ringfd() from io_uring_files_cancel() into its callee __io_uring_cancel(), which is called from io_uring_task_cancel() on execve.
This could probably be exploited *on 32-bit kernels* by leaking 2^32 references to the same ring, because the file refcount is stored in a pointer-sized field and get_file() doesn't have protection against refcount overflow, just a WARN_ONCE(); but on 64-bit it should have no impact beyond a memory leak.
Cc: stable@vger.kernel.org Fixes: e7a6c00dc77a ("io_uring: add support for registering ring file descriptors") Signed-off-by: Jann Horn jannh@google.com Link: https://lore.kernel.org/r/20241218-uring-reg-ring-cleanup-v1-1-8f63e999045b@... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/io_uring.h | 4 +--- io_uring/io_uring.c | 1 + 2 files changed, 2 insertions(+), 3 deletions(-)
--- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -65,10 +65,8 @@ static inline void io_uring_cmd_complete
static inline void io_uring_files_cancel(void) { - if (current->io_uring) { - io_uring_unreg_ringfd(); + if (current->io_uring) __io_uring_cancel(false); - } } static inline void io_uring_task_cancel(void) { --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3431,6 +3431,7 @@ end_wait:
void __io_uring_cancel(bool cancel_all) { + io_uring_unreg_ringfd(); io_uring_cancel_generic(cancel_all, NULL); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Begunkov asml.silence@gmail.com
commit dbd2ca9367eb19bc5e269b8c58b0b1514ada9156 upstream.
task work can be executed after the task has gone through io_uring termination, whether it's the final task_work run or the fallback path. In this case, task work will find ->io_wq being already killed and null'ed, which is a problem if it then tries to forward the request to io_queue_iowq(). Make io_queue_iowq() fail requests in this case.
Note that it also checks PF_KTHREAD, because the user can first close a DEFER_TASKRUN ring and shortly after kill the task, in which case ->iowq check would race.
Cc: stable@vger.kernel.org Fixes: 50c52250e2d74 ("block: implement async io_uring discard cmd") Fixes: 773af69121ecc ("io_uring: always reissue from task_work context") Reported-by: Will willsroot@protonmail.com Signed-off-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/63312b4a2c2bb67ad67b857d17a300e1d3b078e8.173463790... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -498,7 +498,11 @@ void io_queue_iowq(struct io_kiocb *req, struct io_uring_task *tctx = req->task->io_uring;
BUG_ON(!tctx); - BUG_ON(!tctx->io_wq); + + if ((current->flags & PF_KTHREAD) || !tctx->io_wq) { + io_req_task_queue_fail(req, -ECANCELED); + return; + }
/* init ->work of the whole link before punting */ io_prep_async_link(req);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 62e2a47ceab8f3f7d2e3f0e03fdd1c5e0059fd8b upstream.
When the server is recalling a layout, we should ignore the count of outstanding layoutget calls, since the server is expected to return either NFS4ERR_RECALLCONFLICT or NFS4ERR_RETURNCONFLICT for as long as the recall is outstanding. Currently, we may end up livelocking, causing the layout to eventually be forcibly revoked.
Fixes: bf0291dd2267 ("pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfs/pnfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -1196,7 +1196,7 @@ pnfs_prepare_layoutreturn(struct pnfs_la enum pnfs_iomode *iomode) { /* Serialise LAYOUTGET/LAYOUTRETURN */ - if (atomic_read(&lo->plh_outstanding) != 0) + if (atomic_read(&lo->plh_outstanding) != 0 && lo->plh_return_seq == 0) return false; if (test_and_set_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags)) return false;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zijun Hu quic_zijuhu@quicinc.com
commit fec3edc47d5cfc2dd296a5141df887bf567944db upstream.
On a malformed interrupt-map property which is shorter than expected by 1 cell, we may read bogus data past the end of the property instead of returning an error in of_irq_parse_imap_parent().
Decrement the remaining length when skipping over the interrupt parent phandle cell.
Fixes: 935df1bd40d4 ("of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw()") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com Link: https://lore.kernel.org/r/20241209-of_irq_fix-v1-1-782f1419c8a1@quicinc.com [rh: reword commit msg] Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/irq.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/of/irq.c +++ b/drivers/of/irq.c @@ -111,6 +111,7 @@ const __be32 *of_irq_parse_imap_parent(c else np = of_find_node_by_phandle(be32_to_cpup(imap)); imap++; + len--;
/* Check if not found */ if (!np) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zijun Hu quic_zijuhu@quicinc.com
commit 0f7ca6f69354e0c3923bbc28c92d0ecab4d50a3e upstream.
of_irq_parse_one() may use uninitialized variable @addr_len as shown below:
// @addr_len is uninitialized int addr_len;
// This operation does not touch @addr_len if it fails. addr = of_get_property(device, "reg", &addr_len);
// Use uninitialized @addr_len if the operation fails. if (addr_len > sizeof(addr_buf)) addr_len = sizeof(addr_buf);
// Check the operation result here. if (addr) memcpy(addr_buf, addr, addr_len);
Fix by initializing @addr_len before the operation.
Fixes: b739dffa5d57 ("of/irq: Prevent device address out-of-bounds read in interrupt map walk") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com Link: https://lore.kernel.org/r/20241209-of_irq_fix-v1-4-782f1419c8a1@quicinc.com Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/irq.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/of/irq.c +++ b/drivers/of/irq.c @@ -355,6 +355,7 @@ int of_irq_parse_one(struct device_node return of_irq_parse_oldworld(device, index, out_irq);
/* Get the reg property (if any) */ + addr_len = 0; addr = of_get_property(device, "reg", &addr_len);
/* Prevent out-of-bounds read in case of longer interrupt parent address size */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi konishi.ryusuke@gmail.com
commit 6309b8ce98e9a18390b9fd8f03fc412f3c17aee9 upstream.
When block_invalidatepage was converted to block_invalidate_folio, the fallback to block_invalidatepage in folio_invalidate() if the address_space_operations method invalidatepage (currently invalidate_folio) was not set, was removed.
Unfortunately, some pseudo-inodes in nilfs2 use empty_aops set by inode_init_always_gfp() as is, or explicitly set it to address_space_operations. Therefore, with this change, block_invalidatepage() is no longer called from folio_invalidate(), and as a result, the buffer_head structures attached to these pages/folios are no longer freed via try_to_free_buffers().
Thus, these buffer heads are now leaked by truncate_inode_pages(), which cleans up the page cache from inode evict(), etc.
Three types of caches use empty_aops: gc inode caches and the DAT shadow inode used by GC, and b-tree node caches. Of these, b-tree node caches explicitly call invalidate_mapping_pages() during cleanup, which involves calling try_to_free_buffers(), so the leak was not visible during normal operation but worsened when GC was performed.
Fix this issue by using address_space_operations with invalidate_folio set to block_invalidate_folio instead of empty_aops, which will ensure the same behavior as before.
Link: https://lkml.kernel.org/r/20241212164556.21338-1-konishi.ryusuke@gmail.com Fixes: 7ba13abbd31e ("fs: Turn block_invalidatepage into block_invalidate_folio") Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org [5.18+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/btnode.c | 1 + fs/nilfs2/gcinode.c | 2 +- fs/nilfs2/inode.c | 5 +++++ fs/nilfs2/nilfs.h | 1 + 4 files changed, 8 insertions(+), 1 deletion(-)
--- a/fs/nilfs2/btnode.c +++ b/fs/nilfs2/btnode.c @@ -35,6 +35,7 @@ void nilfs_init_btnc_inode(struct inode ii->i_flags = 0; memset(&ii->i_bmap_data, 0, sizeof(struct nilfs_bmap)); mapping_set_gfp_mask(btnc_inode->i_mapping, GFP_NOFS); + btnc_inode->i_mapping->a_ops = &nilfs_buffer_cache_aops; }
void nilfs_btnode_cache_clear(struct address_space *btnc) --- a/fs/nilfs2/gcinode.c +++ b/fs/nilfs2/gcinode.c @@ -163,7 +163,7 @@ int nilfs_init_gcinode(struct inode *ino
inode->i_mode = S_IFREG; mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS); - inode->i_mapping->a_ops = &empty_aops; + inode->i_mapping->a_ops = &nilfs_buffer_cache_aops;
ii->i_flags = 0; nilfs_bmap_init_gc(ii->i_bmap); --- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -309,6 +309,10 @@ const struct address_space_operations ni .is_partially_uptodate = block_is_partially_uptodate, };
+const struct address_space_operations nilfs_buffer_cache_aops = { + .invalidate_folio = block_invalidate_folio, +}; + static int nilfs_insert_inode_locked(struct inode *inode, struct nilfs_root *root, unsigned long ino) @@ -748,6 +752,7 @@ struct inode *nilfs_iget_for_shadow(stru NILFS_I(s_inode)->i_flags = 0; memset(NILFS_I(s_inode)->i_bmap, 0, sizeof(struct nilfs_bmap)); mapping_set_gfp_mask(s_inode->i_mapping, GFP_NOFS); + s_inode->i_mapping->a_ops = &nilfs_buffer_cache_aops;
err = nilfs_attach_btree_node_cache(s_inode); if (unlikely(err)) { --- a/fs/nilfs2/nilfs.h +++ b/fs/nilfs2/nilfs.h @@ -379,6 +379,7 @@ extern const struct file_operations nilf extern const struct inode_operations nilfs_file_inode_operations; extern const struct file_operations nilfs_file_operations; extern const struct address_space_operations nilfs_aops; +extern const struct address_space_operations nilfs_buffer_cache_aops; extern const struct inode_operations nilfs_dir_inode_operations; extern const struct inode_operations nilfs_special_inode_operations; extern const struct inode_operations nilfs_symlink_inode_operations;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Edward Adam Davis eadavis@qq.com
commit 901ce9705fbb9f330ff1f19600e5daf9770b0175 upstream.
syzbot reported a WARNING in nilfs_rmdir. [1]
Because the inode bitmap is corrupted, an inode with an inode number that should exist as a ".nilfs" file was reassigned by nilfs_mkdir for "file0", causing an inode duplication during execution. And this causes an underflow of i_nlink in rmdir operations.
The inode is used twice by the same task to unmount and remove directories ".nilfs" and "file0", it trigger warning in nilfs_rmdir.
Avoid to this issue, check i_nlink in nilfs_iget(), if it is 0, it means that this inode has been deleted, and iput is executed to reclaim it.
[1] WARNING: CPU: 1 PID: 5824 at fs/inode.c:407 drop_nlink+0xc4/0x110 fs/inode.c:407 ... Call Trace: <TASK> nilfs_rmdir+0x1b0/0x250 fs/nilfs2/namei.c:342 vfs_rmdir+0x3a3/0x510 fs/namei.c:4394 do_rmdir+0x3b5/0x580 fs/namei.c:4453 __do_sys_rmdir fs/namei.c:4472 [inline] __se_sys_rmdir fs/namei.c:4470 [inline] __x64_sys_rmdir+0x47/0x50 fs/namei.c:4470 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Link: https://lkml.kernel.org/r/20241209065759.6781-1-konishi.ryusuke@gmail.com Fixes: d25006523d0b ("nilfs2: pathname operations") Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Reported-by: syzbot+9260555647a5132edd48@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9260555647a5132edd48 Tested-by: syzbot+9260555647a5132edd48@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis eadavis@qq.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/inode.c | 8 +++++++- fs/nilfs2/namei.c | 5 +++++ 2 files changed, 12 insertions(+), 1 deletion(-)
--- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -618,8 +618,14 @@ struct inode *nilfs_iget(struct super_bl inode = nilfs_iget_locked(sb, root, ino); if (unlikely(!inode)) return ERR_PTR(-ENOMEM); - if (!(inode->i_state & I_NEW)) + + if (!(inode->i_state & I_NEW)) { + if (!inode->i_nlink) { + iput(inode); + return ERR_PTR(-ESTALE); + } return inode; + }
err = __nilfs_read_inode(sb, root, ino, inode); if (unlikely(err)) { --- a/fs/nilfs2/namei.c +++ b/fs/nilfs2/namei.c @@ -67,6 +67,11 @@ nilfs_lookup(struct inode *dir, struct d inode = NULL; } else { inode = nilfs_iget(dir->i_sb, NILFS_I(dir)->i_root, ino); + if (inode == ERR_PTR(-ESTALE)) { + nilfs_error(dir->i_sb, + "deleted inode referenced: %lu", ino); + return ERR_PTR(-EIO); + } }
return d_splice_alias(inode, dentry);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit 0a16e24e34f28210f68195259456c73462518597 upstream.
When F_SEAL_FUTURE_WRITE was introduced, it was overlooked that udmabuf must reject memfds with this flag, just like ones with F_SEAL_WRITE. Fix it by adding F_SEAL_FUTURE_WRITE to SEALS_DENIED.
Fixes: ab3948f58ff8 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd") Cc: stable@vger.kernel.org Acked-by: Vivek Kasireddy vivek.kasireddy@intel.com Signed-off-by: Jann Horn jannh@google.com Reviewed-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241204-udmabuf-fixes-v2-2-23... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma-buf/udmabuf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/dma-buf/udmabuf.c +++ b/drivers/dma-buf/udmabuf.c @@ -194,7 +194,7 @@ static const struct dma_buf_ops udmabuf_ };
#define SEALS_WANTED (F_SEAL_SHRINK) -#define SEALS_DENIED (F_SEAL_WRITE) +#define SEALS_DENIED (F_SEAL_WRITE|F_SEAL_FUTURE_WRITE)
static long udmabuf_create(struct miscdevice *device, struct udmabuf_create_list *head,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herve Codina herve.codina@bootlin.com
commit d7dfa7fde63dde4d2ec0083133efe2c6686c03ff upstream.
The current code uses some 'goto put;' to cancel the parsing operation and can lead to a return code value of 0 even on error cases.
Indeed, some goto calls are done from a loop without setting the ret value explicitly before the goto call and so the ret value can be set to 0 due to operation done in previous loop iteration. For instance match can be set to 0 in the previous loop iteration (leading to a new iteration) but ret can also be set to 0 it the of_property_read_u32() call succeed. In that case if no match are found or if an error is detected the new iteration, the return value can be wrongly 0.
Avoid those cases setting the ret value explicitly before the goto calls.
Fixes: bd6f2fd5a1d5 ("of: Support parsing phandle argument lists through a nexus node") Cc: stable@vger.kernel.org Signed-off-by: Herve Codina herve.codina@bootlin.com Link: https://lore.kernel.org/r/20241202165819.158681-1-herve.codina@bootlin.com Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/base.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
--- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -1415,8 +1415,10 @@ int of_parse_phandle_with_args_map(const map_len--;
/* Check if not found */ - if (!new) + if (!new) { + ret = -EINVAL; goto put; + }
if (!of_device_is_available(new)) match = 0; @@ -1426,17 +1428,20 @@ int of_parse_phandle_with_args_map(const goto put;
/* Check for malformed properties */ - if (WARN_ON(new_size > MAX_PHANDLE_ARGS)) - goto put; - if (map_len < new_size) + if (WARN_ON(new_size > MAX_PHANDLE_ARGS) || + map_len < new_size) { + ret = -EINVAL; goto put; + }
/* Move forward by new node's #<list>-cells amount */ map += new_size; map_len -= new_size; } - if (!match) + if (!match) { + ret = -ENOENT; goto put; + }
/* Get the <list>-map-pass-thru property (optional) */ pass = of_get_property(cur, pass_name, NULL);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zijun Hu quic_zijuhu@quicinc.com
commit 5d009e024056ded20c5bb1583146b833b23bbd5a upstream.
__of_get_dma_parent() returns OF device node @args.np, but the node's refcount is increased twice, by both of_parse_phandle_with_args() and of_node_get(), so causes refcount leakage for the node.
Fix by directly returning the node got by of_parse_phandle_with_args().
Fixes: f83a6e5dea6c ("of: address: Add support for the parent DMA bus") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com Link: https://lore.kernel.org/r/20241206-of_core_fix-v1-4-dc28ed56bec3@quicinc.com Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/address.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -653,7 +653,7 @@ struct device_node *__of_get_dma_parent( if (ret < 0) return of_get_parent(np);
- return of_node_get(args.np); + return args.np; } #endif
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Dryomov idryomov@gmail.com
commit 12eb22a5a609421b380c3c6ca887474fb2089b2c upstream.
It becomes a path component, so it shouldn't exceed NAME_MAX characters. This was hardened in commit c152737be22b ("ceph: Use strscpy() instead of strcpy() in __get_snap_name()"), but no actual check was put in place.
Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Alex Markuze amarkuze@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/super.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -427,6 +427,8 @@ static int ceph_parse_mount_param(struct
switch (token) { case Opt_snapdirname: + if (strlen(param->string) > NAME_MAX) + return invalfc(fc, "snapdirname too long"); kfree(fsopt->snapdir_name); fsopt->snapdir_name = param->string; param->string = NULL;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Markuze amarkuze@redhat.com
commit 9abee475803fab6ad59d4f4fc59c6a75374a7d9d upstream.
This patch refines the read logic in __ceph_sync_read() to ensure more predictable and efficient behavior in various edge cases.
- Return early if the requested read length is zero or if the file size (`i_size`) is zero. - Initialize the index variable (`idx`) where needed and reorder some code to ensure it is always set before use. - Improve error handling by checking for negative return values earlier. - Remove redundant encrypted file checks after failures. Only attempt filesystem-level decryption if the read succeeded. - Simplify leftover calculations to correctly handle cases where the read extends beyond the end of the file or stops short. This can be hit by continuously reading a file while, on another client, we keep truncating and writing new data into it. - This resolves multiple issues caused by integer and consequent buffer overflow (`pages` array being accessed beyond `num_pages`): - https://tracker.ceph.com/issues/67524 - https://tracker.ceph.com/issues/68980 - https://tracker.ceph.com/issues/68981
Cc: stable@vger.kernel.org Fixes: 1065da21e5df ("ceph: stop copying to iter at EOF on sync reads") Reported-by: Luis Henriques (SUSE) luis.henriques@linux.dev Signed-off-by: Alex Markuze amarkuze@redhat.com Reviewed-by: Viacheslav Dubeyko Slava.Dubeyko@ibm.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/file.c | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-)
--- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -976,7 +976,7 @@ ssize_t __ceph_sync_read(struct inode *i if (ceph_inode_is_shutdown(inode)) return -EIO;
- if (!len) + if (!len || !i_size) return 0; /* * flush any page cache pages in this range. this @@ -996,7 +996,7 @@ ssize_t __ceph_sync_read(struct inode *i int num_pages; size_t page_off; bool more; - int idx; + int idx = 0; size_t left; struct ceph_osd_req_op *op; u64 read_off = off; @@ -1068,7 +1068,14 @@ ssize_t __ceph_sync_read(struct inode *i else if (ret == -ENOENT) ret = 0;
- if (ret > 0 && IS_ENCRYPTED(inode)) { + if (ret < 0) { + ceph_osdc_put_request(req); + if (ret == -EBLOCKLISTED) + fsc->blocklisted = true; + break; + } + + if (IS_ENCRYPTED(inode)) { int fret;
fret = ceph_fscrypt_decrypt_extents(inode, pages, @@ -1097,7 +1104,7 @@ ssize_t __ceph_sync_read(struct inode *i ceph_osdc_put_request(req);
/* Short read but not EOF? Zero out the remainder. */ - if (ret >= 0 && ret < len && (off + ret < i_size)) { + if (ret < len && (off + ret < i_size)) { int zlen = min(len - ret, i_size - off - ret); int zoff = page_off + ret;
@@ -1107,13 +1114,11 @@ ssize_t __ceph_sync_read(struct inode *i ret += zlen; }
- idx = 0; - if (ret <= 0) - left = 0; - else if (off + ret > i_size) - left = i_size - off; + if (off + ret > i_size) + left = (i_size > off) ? i_size - off : 0; else left = ret; + while (left > 0) { size_t plen, copied;
@@ -1131,12 +1136,6 @@ ssize_t __ceph_sync_read(struct inode *i } ceph_release_page_vector(pages, num_pages);
- if (ret < 0) { - if (ret == -EBLOCKLISTED) - fsc->blocklisted = true; - break; - } - if (off >= i_size || !more) break; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Kellermann max.kellermann@ionos.com
commit d6fd6f8280f0257ba93f16900a0d3d3912f32c79 upstream.
In two `break` statements, the call to ceph_release_page_vector() was missing, leaking the allocation from ceph_alloc_page_vector().
Instead of adding the missing ceph_release_page_vector() calls, the Ceph maintainers preferred to transfer page ownership to the `ceph_osd_request` by passing `own_pages=true` to osd_req_op_extent_osd_data_pages(). This requires postponing the ceph_osdc_put_request() call until after the block that accesses the `pages`.
Cc: stable@vger.kernel.org Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads") Fixes: f0fe1e54cfcf ("ceph: plumb in decryption during reads") Signed-off-by: Max Kellermann max.kellermann@ionos.com Reviewed-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/file.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1036,7 +1036,7 @@ ssize_t __ceph_sync_read(struct inode *i
osd_req_op_extent_osd_data_pages(req, 0, pages, read_len, offset_in_page(read_off), - false, false); + false, true);
op = &req->r_ops[0]; if (sparse) { @@ -1101,8 +1101,6 @@ ssize_t __ceph_sync_read(struct inode *i ret = min_t(ssize_t, fret, len); }
- ceph_osdc_put_request(req); - /* Short read but not EOF? Zero out the remainder. */ if (ret < len && (off + ret < i_size)) { int zlen = min(len - ret, i_size - off - ret); @@ -1134,7 +1132,8 @@ ssize_t __ceph_sync_read(struct inode *i break; } } - ceph_release_page_vector(pages, num_pages); + + ceph_osdc_put_request(req);
if (off >= i_size || !more) break;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xuewen Yan xuewen.yan@unisoc.com
commit 900bbaae67e980945dec74d36f8afe0de7556d5a upstream.
Now, the epoll only use wake_up() interface to wake up task. However, sometimes, there are epoll users which want to use the synchronous wakeup flag to hint the scheduler, such as Android binder driver. So add a wake_up_sync() define, and use the wake_up_sync() when the sync is true in ep_poll_callback().
Co-developed-by: Jing Xia jing.xia@unisoc.com Signed-off-by: Jing Xia jing.xia@unisoc.com Signed-off-by: Xuewen Yan xuewen.yan@unisoc.com Link: https://lore.kernel.org/r/20240426080548.8203-1-xuewen.yan@unisoc.com Tested-by: Brian Geffon bgeffon@google.com Reviewed-by: Brian Geffon bgeffon@google.com Reported-by: Benoit Lize lizeb@google.com Signed-off-by: Christian Brauner brauner@kernel.org Cc: Brian Geffon bgeffon@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/eventpoll.c | 5 ++++- include/linux/wait.h | 1 + 2 files changed, 5 insertions(+), 1 deletion(-)
--- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -1268,7 +1268,10 @@ static int ep_poll_callback(wait_queue_e break; } } - wake_up(&ep->wq); + if (sync) + wake_up_sync(&ep->wq); + else + wake_up(&ep->wq); } if (waitqueue_active(&ep->poll_wait)) pwake++; --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -225,6 +225,7 @@ void __wake_up_pollfree(struct wait_queu #define wake_up_all(x) __wake_up(x, TASK_NORMAL, 0, NULL) #define wake_up_locked(x) __wake_up_locked((x), TASK_NORMAL, 1) #define wake_up_all_locked(x) __wake_up_locked((x), TASK_NORMAL, 0) +#define wake_up_sync(x) __wake_up_sync(x, TASK_NORMAL)
#define wake_up_interruptible(x) __wake_up(x, TASK_INTERRUPTIBLE, 1, NULL) #define wake_up_interruptible_nr(x, nr) __wake_up(x, TASK_INTERRUPTIBLE, nr, NULL)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
Commit a08d195b586a217d76b42062f88f375a3eedda4d upstream.
Add __io_read() which does the grunt of the work, leaving the completion side to the new io_read(). No functional changes in this patch.
Reviewed-by: Gabriel Krisman Bertazi krisman@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk (cherry picked from commit a08d195b586a217d76b42062f88f375a3eedda4d) Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/rw.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
--- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -712,7 +712,7 @@ static int io_rw_init_file(struct io_kio return 0; }
-int io_read(struct io_kiocb *req, unsigned int issue_flags) +static int __io_read(struct io_kiocb *req, unsigned int issue_flags) { struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw); struct io_rw_state __s, *s = &__s; @@ -857,7 +857,18 @@ done: /* it's faster to check here then delegate to kfree */ if (iovec) kfree(iovec); - return kiocb_done(req, ret, issue_flags); + return ret; +} + +int io_read(struct io_kiocb *req, unsigned int issue_flags) +{ + int ret; + + ret = __io_read(req, issue_flags); + if (ret >= 0) + return kiocb_done(req, ret, issue_flags); + + return ret; }
static bool io_kiocb_start_write(struct io_kiocb *req, struct kiocb *kiocb)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
Commit c0a9d496e0fece67db777bd48550376cf2960c47 upstream.
Some file systems, ocfs2 in this case, will return -EOPNOTSUPP for an IOCB_NOWAIT read/write attempt. While this can be argued to be correct, the usual return value for something that requires blocking issue is -EAGAIN.
A refactoring io_uring commit dropped calling kiocb_done() for negative return values, which is otherwise where we already do that transformation. To ensure we catch it in both spots, check it in __io_read() itself as well.
Reported-by: Robert Sander r.sander@heinlein-support.de Link: https://fosstodon.org/@gurubert@mastodon.gurubert.de/113112431889638440 Cc: stable@vger.kernel.org Fixes: a08d195b586a ("io_uring/rw: split io_read() into a helper") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/rw.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -778,6 +778,14 @@ static int __io_read(struct io_kiocb *re
ret = io_iter_do_read(rw, &s->iter);
+ /* + * Some file systems like to return -EOPNOTSUPP for an IOCB_NOWAIT + * issue, even though they should be returning -EAGAIN. To be safe, + * retry from blocking context for either. + */ + if (ret == -EOPNOTSUPP && force_nonblock) + ret = -EAGAIN; + if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) { req->flags &= ~REQ_F_REISSUE; /* if we can poll, just do that */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Begunkov asml.silence@gmail.com
Commit 6e6b8c62120a22acd8cb759304e4cd2e3215d488 upstream.
kiocb_done() should care to specifically redirecting requests to io-wq. Remove the hopping to tw to then queue an io-wq, return -EAGAIN and let the core code io_uring handle offloading.
Signed-off-by: Pavel Begunkov asml.silence@gmail.com Tested-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/413564e550fe23744a970e1783dfa566291b0e6f.171079918... Signed-off-by: Jens Axboe axboe@kernel.dk (cherry picked from commit 6e6b8c62120a22acd8cb759304e4cd2e3215d488) Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 8 ++++---- io_uring/io_uring.h | 1 - io_uring/rw.c | 8 +------- 3 files changed, 5 insertions(+), 12 deletions(-)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -492,7 +492,7 @@ static void io_prep_async_link(struct io } }
-void io_queue_iowq(struct io_kiocb *req, struct io_tw_state *ts_dont_use) +static void io_queue_iowq(struct io_kiocb *req) { struct io_kiocb *link = io_prep_linked_timeout(req); struct io_uring_task *tctx = req->task->io_uring; @@ -1479,7 +1479,7 @@ void io_req_task_submit(struct io_kiocb if (unlikely(req->task->flags & PF_EXITING)) io_req_defer_failed(req, -EFAULT); else if (req->flags & REQ_F_FORCE_ASYNC) - io_queue_iowq(req, ts); + io_queue_iowq(req); else io_queue_sqe(req); } @@ -2044,7 +2044,7 @@ static void io_queue_async(struct io_kio break; case IO_APOLL_ABORTED: io_kbuf_recycle(req, 0); - io_queue_iowq(req, NULL); + io_queue_iowq(req); break; case IO_APOLL_OK: break; @@ -2093,7 +2093,7 @@ static void io_queue_sqe_fallback(struct if (unlikely(req->ctx->drain_active)) io_drain_req(req); else - io_queue_iowq(req, NULL); + io_queue_iowq(req); } }
--- a/io_uring/io_uring.h +++ b/io_uring/io_uring.h @@ -63,7 +63,6 @@ struct file *io_file_get_fixed(struct io void __io_req_task_work_add(struct io_kiocb *req, unsigned flags); bool io_alloc_async_data(struct io_kiocb *req); void io_req_task_queue(struct io_kiocb *req); -void io_queue_iowq(struct io_kiocb *req, struct io_tw_state *ts_dont_use); void io_req_task_complete(struct io_kiocb *req, struct io_tw_state *ts); void io_req_task_queue_fail(struct io_kiocb *req, int ret); void io_req_task_submit(struct io_kiocb *req, struct io_tw_state *ts); --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -168,12 +168,6 @@ static inline loff_t *io_kiocb_update_po return NULL; }
-static void io_req_task_queue_reissue(struct io_kiocb *req) -{ - req->io_task_work.func = io_queue_iowq; - io_req_task_work_add(req); -} - #ifdef CONFIG_BLOCK static bool io_resubmit_prep(struct io_kiocb *req) { @@ -359,7 +353,7 @@ static int kiocb_done(struct io_kiocb *r if (req->flags & REQ_F_REISSUE) { req->flags &= ~REQ_F_REISSUE; if (io_resubmit_prep(req)) - io_req_task_queue_reissue(req); + return -EAGAIN; else io_req_task_queue_fail(req, final_ret); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Francesco Dolcini francesco.dolcini@toradex.com
commit bf8ca67e21671e7a56e31da45360480b28f185f1 upstream.
Preparation patch to allow for PPS channel configuration, no functional change intended.
Signed-off-by: Francesco Dolcini francesco.dolcini@toradex.com Reviewed-by: Frank Li Frank.Li@nxp.com Reviewed-by: Csókás, Bence csokas.bence@prolan.hu Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Csókás, Bence csokas.bence@prolan.hu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/fec_ptp.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
--- a/drivers/net/ethernet/freescale/fec_ptp.c +++ b/drivers/net/ethernet/freescale/fec_ptp.c @@ -84,8 +84,7 @@ #define FEC_CC_MULT (1 << 31) #define FEC_COUNTER_PERIOD (1 << 31) #define PPS_OUPUT_RELOAD_PERIOD NSEC_PER_SEC -#define FEC_CHANNLE_0 0 -#define DEFAULT_PPS_CHANNEL FEC_CHANNLE_0 +#define DEFAULT_PPS_CHANNEL 0
#define FEC_PTP_MAX_NSEC_PERIOD 4000000000ULL #define FEC_PTP_MAX_NSEC_COUNTER 0x80000000ULL @@ -524,8 +523,9 @@ static int fec_ptp_enable(struct ptp_clo unsigned long flags; int ret = 0;
+ fep->pps_channel = DEFAULT_PPS_CHANNEL; + if (rq->type == PTP_CLK_REQ_PPS) { - fep->pps_channel = DEFAULT_PPS_CHANNEL; fep->reload_period = PPS_OUPUT_RELOAD_PERIOD;
ret = fec_ptp_enable_pps(fep, on); @@ -536,10 +536,9 @@ static int fec_ptp_enable(struct ptp_clo if (rq->perout.flags) return -EOPNOTSUPP;
- if (rq->perout.index != DEFAULT_PPS_CHANNEL) + if (rq->perout.index != fep->pps_channel) return -EOPNOTSUPP;
- fep->pps_channel = DEFAULT_PPS_CHANNEL; period.tv_sec = rq->perout.period.sec; period.tv_nsec = rq->perout.period.nsec; period_ns = timespec64_to_ns(&period);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Francesco Dolcini francesco.dolcini@toradex.com
commit 566c2d83887f0570056833102adc5b88e681b0c7 upstream.
Depending on the SoC where the FEC is integrated into the PPS channel might be routed to different timer instances. Make this configurable from the devicetree.
When the related DT property is not present fallback to the previous default and use channel 0.
Reviewed-by: Frank Li Frank.Li@nxp.com Tested-by: Rafael Beims rafael.beims@toradex.com Signed-off-by: Francesco Dolcini francesco.dolcini@toradex.com Reviewed-by: Csókás, Bence csokas.bence@prolan.hu Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Csókás, Bence csokas.bence@prolan.hu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/fec_ptp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/freescale/fec_ptp.c +++ b/drivers/net/ethernet/freescale/fec_ptp.c @@ -523,8 +523,6 @@ static int fec_ptp_enable(struct ptp_clo unsigned long flags; int ret = 0;
- fep->pps_channel = DEFAULT_PPS_CHANNEL; - if (rq->type == PTP_CLK_REQ_PPS) { fep->reload_period = PPS_OUPUT_RELOAD_PERIOD;
@@ -706,12 +704,16 @@ void fec_ptp_init(struct platform_device { struct net_device *ndev = platform_get_drvdata(pdev); struct fec_enet_private *fep = netdev_priv(ndev); + struct device_node *np = fep->pdev->dev.of_node; int irq; int ret;
fep->ptp_caps.owner = THIS_MODULE; strscpy(fep->ptp_caps.name, "fec ptp", sizeof(fep->ptp_caps.name));
+ fep->pps_channel = DEFAULT_PPS_CHANNEL; + of_property_read_u32(np, "fsl,pps-channel", &fep->pps_channel); + fep->ptp_caps.max_adj = 250000000; fep->ptp_caps.n_alarm = 0; fep->ptp_caps.n_ext_ts = 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michel Dänzer mdaenzer@redhat.com
commit 85230ee36d88e7a09fb062d43203035659dd10a5 upstream.
Third time's the charm, I hope?
Fixes: d3116756a710 ("drm/ttm: rename bo->mem and make it a pointer") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3837 Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Michel Dänzer mdaenzer@redhat.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 695c2c745e5dff201b75da8a1d237ce403600d04) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -1161,10 +1161,9 @@ int amdgpu_vm_bo_update(struct amdgpu_de * next command submission. */ if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv) { - uint32_t mem_type = bo->tbo.resource->mem_type; - - if (!(bo->preferred_domains & - amdgpu_mem_type_to_domain(mem_type))) + if (bo->tbo.resource && + !(bo->preferred_domains & + amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type))) amdgpu_vm_bo_evicted(&bo_va->base); else amdgpu_vm_bo_idle(&bo_va->base);
Hello,
On Mon, 23 Dec 2024 16:57:50 +0100 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.6.68 release. There are 116 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
This rc kernel passes DAMON functionality test[1] on my test machine. Attaching the test results summary below. Please note that I retrieved the kernel from linux-stable-rc tree[2].
Tested-by: SeongJae Park sj@kernel.org
[1] https://github.com/damonitor/damon-tests/tree/next/corr [2] 6a86252ba24f ("Linux 6.6.68-rc1")
Thanks, SJ
[...]
---
ok 1 selftests: damon: debugfs_attrs.sh ok 2 selftests: damon: debugfs_schemes.sh ok 3 selftests: damon: debugfs_target_ids.sh ok 4 selftests: damon: debugfs_empty_targets.sh ok 5 selftests: damon: debugfs_huge_count_read_write.sh ok 6 selftests: damon: debugfs_duplicate_context_creation.sh ok 7 selftests: damon: debugfs_rm_non_contexts.sh ok 8 selftests: damon: sysfs.sh ok 9 selftests: damon: sysfs_update_removed_scheme_dir.sh ok 10 selftests: damon: reclaim.sh ok 11 selftests: damon: lru_sort.sh ok 1 selftests: damon-tests: kunit.sh ok 2 selftests: damon-tests: huge_count_read_write.sh ok 3 selftests: damon-tests: buffer_overflow.sh ok 4 selftests: damon-tests: rm_contexts.sh ok 5 selftests: damon-tests: record_null_deref.sh ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh ok 8 selftests: damon-tests: damo_tests.sh ok 9 selftests: damon-tests: masim-record.sh ok 10 selftests: damon-tests: build_i386.sh ok 11 selftests: damon-tests: build_arm64.sh # SKIP ok 12 selftests: damon-tests: build_m68k.sh # SKIP ok 13 selftests: damon-tests: build_i386_idle_flag.sh ok 14 selftests: damon-tests: build_i386_highpte.sh ok 15 selftests: damon-tests: build_nomemcg.sh [33m [92mPASS [39m
On 12/23/24 08:57, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.68 release. There are 116 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.68-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
Hi Greg,
On 23/12/24 21:27, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.68 release. There are 116 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
No problems seen on x86_64 and aarch64 with our testing.
Tested-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
Thanks, Harshit
On 12/23/24 07:57, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.68 release. There are 116 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.68-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
Am 23.12.2024 um 16:57 schrieb Greg Kroah-Hartman:
This is the start of the stable review cycle for the 6.6.68 release. There are 116 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Builds, boots and works on my 2-socket Ivy Bridge Xeon E5-2697 v2 server. No dmesg oddities or regressions found.
Tested-by: Peter Schneider pschneider1968@googlemail.com
Happy holiday season! Peter Schneider
On Mon, 23 Dec 2024 16:57:50 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.68 release. There are 116 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.68-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v6.6: 10 builds: 10 pass, 0 fail 26 boots: 26 pass, 0 fail 116 tests: 116 pass, 0 fail
Linux version: 6.6.68-rc1-g6a86252ba24f Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Mon, 23 Dec 2024 at 21:41, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.6.68 release. There are 116 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.68-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 6.6.68-rc1 * git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git * git commit: 6a86252ba24f89c8deb21b44cb5ffc867d9ab96f * git describe: v6.6.67-117-g6a86252ba24f * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.6.y/build/v6.6.67...
## Test Regressions (compared to v6.6.66-110-g584b6d5f2ac7)
## Metric Regressions (compared to v6.6.66-110-g584b6d5f2ac7)
## Test Fixes (compared to v6.6.66-110-g584b6d5f2ac7)
## Metric Fixes (compared to v6.6.66-110-g584b6d5f2ac7)
## Test result summary total: 158707, pass: 130517, fail: 3971, skip: 24142, xfail: 77
## Build Summary * arc: 6 total, 5 passed, 1 failed * arm: 132 total, 132 passed, 0 failed * arm64: 44 total, 42 passed, 2 failed * i386: 31 total, 28 passed, 3 failed * mips: 30 total, 25 passed, 5 failed * parisc: 5 total, 5 passed, 0 failed * powerpc: 36 total, 32 passed, 4 failed * riscv: 23 total, 22 passed, 1 failed * s390: 18 total, 14 passed, 4 failed * sh: 12 total, 10 passed, 2 failed * sparc: 9 total, 8 passed, 1 failed * x86_64: 36 total, 35 passed, 1 failed
## Test suites summary * boot * commands * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-efivarfs * kselftest-exec * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-filesystems-epoll * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-kcmp * kselftest-kvm * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-mincore * kselftest-mqueue * kselftest-net * kselftest-net-mptcp * kselftest-openat2 * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-tc-testing * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user_events * kselftest-vDSO * kselftest-x86 * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-build-clang * log-parser-build-gcc * log-parser-test * ltp-capability * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-hugetlb * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-sm[ * ltp-smoke * ltp-syscalls * ltp-tracing * perf * rcutorture
-- Linaro LKFT https://lkft.linaro.org
On 12/23/24 8:57 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.68 release. There are 116 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.68-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
OVERVIEW
Builds: 42 passed, 0 failed
Boot tests: 528 passed, 0 failed
CI systems: broonie, maestro
REVISION
Commit name: v6.6.67-117-g6a86252ba24f hash: 6a86252ba24f89c8deb21b44cb5ffc867d9ab96f Checked out from https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
BUILDS
No build failures found
BOOT TESTS
No boot failure found
See complete and up-to-date report at:
https://kcidb.kernelci.org/d/revision/revision?orgId=1&var-git_commit_ha...
Tested-by: kernelci.org bot bot@kernelci.org
Thanks, KernelCI team
On 12/23/2024 7:57 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.68 release. There are 116 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 27 Dec 2024 15:53:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.68-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
The kernel, bpf tool, amd kselftest tool builds fine for v6.6.68-rc1 on x86 and arm64 Azure VM.
Tested-by: Hardik Garg hargar@linux.microsoft.com
Thanks, Hardik
linux-stable-mirror@lists.linaro.org